
Strands OmniVoice

Give your agent a voice — in 600+ languages, with zero training data.

600+ languages · 13 tools · 3 voice modes · RTF 0.025

Hear it

All sample audio was generated by strands-omnivoice itself. The code that produced each sample is shown below.

Auto voice — the simplest path

from strands_omnivoice import omnivoice_tts

omnivoice_tts(
    text="Hello world, welcome to Strands OmniVoice.",
    output="/tmp/hello.wav",
)

Voice design — describe the speaker

from strands_omnivoice import omnivoice_design

omnivoice_design(
    text="Once upon a time, in a land far far away.",
    instruct="female, elderly, low pitch, british accent",
    output="/tmp/story.wav",
)

Voice cloning — clone any speaker

from strands_omnivoice import omnivoice_clone

omnivoice_clone(
    text="This is the cloned voice speaking different words.",
    ref_audio="/tmp/hello.wav",        # reference clip (here, the output of the first example)
    output="/tmp/cloned.wav",
)

Multilingual — same code, any language

omnivoice_tts(text="...", language="Japanese", output="ja.wav")

Inline tags — [laughter] [sigh] and more

omnivoice_tts(text="[laughter] You really got me!", output="/tmp/laugh.wav")
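
Inline tags are plain bracketed tokens inside the text, so it can help to list them before synthesis and catch typos. The sketch below is a standalone illustration: find_inline_tags is a hypothetical helper, not part of strands-omnivoice, and it assumes tags are a single bracketed word as in the examples on this page. It does not know which tags the model actually supports.

```python
import re

# Assumption: a tag is one bracketed word such as [laughter] or [sigh].
_TAG_PATTERN = re.compile(r"\[([A-Za-z][A-Za-z_-]*)\]")


def find_inline_tags(text: str) -> list[str]:
    """Return the inline tag names found in text, in order of appearance."""
    return _TAG_PATTERN.findall(text)


tags = find_inline_tags("[laughter] You really got me!")
print(tags)  # ['laughter']
```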

What you get

600+ languages

Zero-shot TTS for 600+ languages through the same omnivoice_tts call. Pass language to select one.

Voice cloning

Clone any speaker from a 3–10 s reference clip, with Whisper-powered auto-transcription of the reference audio.
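
Because cloning expects a short reference clip, checking the clip length before calling omnivoice_clone can save a wasted run. The sketch below uses only the Python standard library and assumes the reference is an uncompressed PCM WAV file. Both helper names are hypothetical and not part of strands-omnivoice, and whether the library itself enforces the 3–10 s window is not stated here.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check that a clip falls inside the 3-10 s window quoted above."""
    return min_s <= clip_duration_seconds(path) <= max_s
```

A caller might run is_usable_reference("/tmp/hello.wav") first and trim or re-record the clip if it returns False.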

Voice design

Describe the speaker via attributes — gender, age, pitch, accent, dialect, whisper.
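
In the voice design example, the instruct argument is a comma-separated list of attributes. A small helper can assemble that string from separate values. This is a minimal sketch: build_instruct is a hypothetical name, not part of strands-omnivoice, and the set of attribute values the model accepts is not specified on this page.

```python
def build_instruct(*attributes: str) -> str:
    """Join speaker attributes into a comma-separated instruct string."""
    # Drop empty entries and stray whitespace so the result stays tidy.
    cleaned = [a.strip() for a in attributes if a and a.strip()]
    return ", ".join(cleaned)


instruct = build_instruct("female", "elderly", "low pitch", "british accent")
print(instruct)  # female, elderly, low pitch, british accent
```

The resulting string matches the instruct value used in the voice design example above.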

Fast on-device

RTF as low as 0.025. Apple Silicon (MPS), CUDA, and CPU all auto-detected.
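
RTF (real-time factor) is synthesis time divided by the duration of the audio produced, so lower is faster. The sketch below shows the arithmetic using the standard definition. The function names are illustrative, and 0.025 is the best-case figure quoted above, not a guarantee for every device.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def estimated_synthesis_seconds(audio_seconds: float, rtf: float = 0.025) -> float:
    """Estimate how long it takes to generate audio_seconds of speech."""
    return audio_seconds * rtf


# At RTF 0.025, ten seconds of speech takes about a quarter of a second.
print(estimated_synthesis_seconds(10.0))
```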

Tools at a glance

Family      Tools
Synthesis   omnivoice_tts · omnivoice_clone · omnivoice_design · omnivoice_batch
ASR         omnivoice_transcribe
Lifecycle   omnivoice_load_model · unload · download
Info        omnivoice_sysinfo · omnivoice_list_languages
Audio       audio_probe · audio_play
Web UI      omnivoice_demo_serve

→ Continue with Installation · Quickstart · API Reference