Quickstart¶
A two-minute tour of every voice mode — auto, design, and clone.
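The following minimal end-to-end example registers the tools with an agent, runs a sanity check, and synthesizes speech with the auto voice: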
from strands import Agent
from strands_omnivoice import (
    omnivoice_tts, omnivoice_clone, omnivoice_design,
    omnivoice_sysinfo, audio_play,
)
agent = Agent(tools=[
    omnivoice_tts, omnivoice_clone, omnivoice_design,
    omnivoice_sysinfo, audio_play,
])
# 1. Sanity check
agent("omnivoice_sysinfo")
# 2. Auto voice
agent("omnivoice_tts text='Hello world' output=/tmp/hello.wav, then audio_play it")
Direct Tool Calls (No Agent)¶
Each @tool is just a function — call it directly:
from strands_omnivoice import omnivoice_tts
result = omnivoice_tts(text="Hello", output="/tmp/h.wav")
print(result["content"][0]["text"])
# → "🔊 wrote /tmp/h.wav (1.23s @ 24000 Hz)"
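When several tools are called directly, a small helper for pulling the message out of a result can keep call sites tidy. The sketch below assumes only the result shape shown above: a dict whose "content" list holds entries with a "text" key. The helper name first_text and the sample dict are hypothetical and are not part of strands_omnivoice.

```python
from typing import Optional


def first_text(result: dict) -> Optional[str]:
    """Return the first "text" entry in a tool result, or None if there is none."""
    for block in result.get("content", []):
        # Skip entries that are not dicts or that carry no text.
        if isinstance(block, dict) and "text" in block:
            return block["text"]
    return None


# Hypothetical stand-in for the dict returned by omnivoice_tts(...) above.
sample = {"content": [{"text": "wrote /tmp/h.wav (1.23s @ 24000 Hz)"}]}
print(first_text(sample))
```

Returning None instead of raising lets a caller distinguish "no text in the result" from a failed lookup without wrapping every call in try/except.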
Three Generation Modes — All Real Samples¶
Model Pre-warming¶
To avoid model-load latency on the first synthesis call, pre-warm the model:
from strands_omnivoice import omnivoice_load_model
omnivoice_load_model(device="mps") # or "cuda", or leave empty for auto
Subsequent calls reuse the cached weights, so the model is never loaded twice, even when chaining
omnivoice_clone → omnivoice_design → omnivoice_tts.
Running the Smoke-test Agent¶
The agent.py script in the repo loads all 13 tools and gives the LLM full
creative freedom.