API Reference

Every public tool, signature, and parameter at a glance.

All tools live in strands_omnivoice and are decorated with @tool from strands.

Synthesis

`omnivoice_tts(text, output, language="", duration=0.0, speed=1.0, num_step=32, guidance_scale=2.0, model_id="", device="")`

Auto-voice synthesis. The model picks a voice. Returns the WAV path and metadata.
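As a minimal sketch, the arguments for omnivoice_tts can be assembled from the defaults listed in the signature above. The helper tts_kwargs is hypothetical and not part of strands_omnivoice, and the commented-out call assumes the package is installed and that the tool is importable from its top level.

```python
# Defaults copied from the omnivoice_tts signature documented above.
TTS_DEFAULTS = {
    "language": "",
    "duration": 0.0,
    "speed": 1.0,
    "num_step": 32,
    "guidance_scale": 2.0,
    "model_id": "",
    "device": "",
}


def tts_kwargs(text, output, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(TTS_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown omnivoice_tts parameters: {sorted(unknown)}")
    return {"text": text, "output": output, **TTS_DEFAULTS, **overrides}


kwargs = tts_kwargs("Hello there.", "hello.wav", speed=1.2)

# Assumption: with strands_omnivoice installed, the call would look like this.
# from strands_omnivoice import omnivoice_tts
# result = omnivoice_tts(**kwargs)
```

Rejecting unknown keys early catches misspelled parameter names before any model is loaded.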

`omnivoice_clone(text, output, ref_audio, ref_text="", language="", duration=0.0, speed=1.0, num_step=32, guidance_scale=2.0, model_id="", device="")`

Clone a speaker from ref_audio. If ref_text is empty, OmniVoice auto-transcribes via Whisper.
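The ref_text fallback described above can be illustrated with a small sketch. This is not the library's implementation: resolve_ref_text and fake_transcribe are hypothetical names, and treating a whitespace-only ref_text as empty is an assumption.

```python
from typing import Callable


def resolve_ref_text(ref_audio: str, ref_text: str,
                     transcribe: Callable[[str], str]) -> str:
    """Return ref_text when provided; otherwise transcribe ref_audio."""
    # Assumption: whitespace-only text counts as "empty".
    if ref_text.strip():
        return ref_text
    return transcribe(ref_audio)


def fake_transcribe(path: str) -> str:
    """Stand-in for the Whisper step; a real transcriber would read the file."""
    return f"<transcript of {path}>"


explicit = resolve_ref_text("speaker.wav", "Hello, world.", fake_transcribe)
fallback = resolve_ref_text("speaker.wav", "", fake_transcribe)
```

Supplying ref_text yourself skips the transcription step, which may be worthwhile when an accurate transcript of the reference clip is already available.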

`omnivoice_design(text, output, instruct, language="", duration=0.0, speed=1.0, num_step=32, guidance_scale=2.0, model_id="", device="")`

Voice design. instruct is a comma-separated attribute string ("female, british accent").
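One way to build the instruct string from a list of attributes is sketched below. build_instruct is a hypothetical helper, not part of strands_omnivoice, and the only attribute values used are the ones from the example above.

```python
def build_instruct(attributes):
    """Join voice attributes into a comma-separated instruct string."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [a.strip() for a in attributes if a and a.strip()]
    return ", ".join(cleaned)


instruct = build_instruct(["female", "british accent"])
```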

`omnivoice_batch(items, output_dir, num_step=32, guidance_scale=2.0, model_id="", device="")`

Batch synthesis sharing a single loaded model. Each item: {"id":..., "text":..., "ref_audio"?:..., "instruct"?:..., ...}.
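A minimal sketch of building the items list follows. It uses only the keys shown above (id, text, ref_audio, instruct); make_item is a hypothetical helper, the file names are placeholders, and the commented-out call assumes the package is installed.

```python
def make_item(item_id, text, ref_audio=None, instruct=None):
    """Build one batch item; optional keys are omitted when unset."""
    if not text:
        raise ValueError(f"item {item_id!r} has empty text")
    item = {"id": item_id, "text": text}
    if ref_audio is not None:
        item["ref_audio"] = ref_audio
    if instruct is not None:
        item["instruct"] = instruct
    return item


items = [
    make_item("intro", "Welcome to the show."),
    make_item("guest", "Thanks for having me.", ref_audio="guest.wav"),
    make_item("narrator", "Meanwhile, elsewhere.",
              instruct="female, british accent"),
]

# Assumption: with strands_omnivoice installed, the call would look like this.
# from strands_omnivoice import omnivoice_batch
# result = omnivoice_batch(items, "out/")
```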

ASR

`omnivoice_transcribe(audio_path, model_id="", device="")`

Transcribe an audio file using OmniVoice's bundled Whisper ASR.

Model Lifecycle

`omnivoice_load_model(model_id="", device="", dtype="", force=False)`

Pre-load (or reload) the model. Use as a warmup tool.

`omnivoice_unload_model()`

Drop the cached model and free GPU memory.

`omnivoice_download_model(model_id="k2-fsa/OmniVoice")`

Snapshot-download the checkpoint from the Hugging Face Hub without loading it.

Info

`omnivoice_sysinfo()`

Reports device, dtype, omnivoice version, torch info, and currently-loaded model state.

`omnivoice_list_languages(filter="")`

List the supported language names; optionally filter by substring.
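Substring filtering of this kind can be sketched as below. Case-insensitive matching is an assumption (the source only says "substring"), filter_languages is a hypothetical helper, and SAMPLE is a placeholder list rather than the real set of supported languages.

```python
def filter_languages(names, substring=""):
    """Return the names containing substring; an empty substring keeps all."""
    # Assumption: matching ignores case.
    needle = substring.casefold()
    return [name for name in names if needle in name.casefold()]


# Placeholder names for illustration only.
SAMPLE = ["English", "French", "German", "Chinese"]

matches = filter_languages(SAMPLE, "ger")
```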

Audio Utilities

`audio_probe(audio_path)`

Inspect an audio file (duration, sample rate, channels, format).
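To show what such a probe reports, here is a sketch that reads the same fields from a PCM WAV file with Python's standard wave module. It is not how audio_probe is implemented, it handles WAV only, and probe_wav and the dictionary keys are hypothetical.

```python
import os
import tempfile
import wave


def probe_wav(path):
    """Read basic properties from a PCM WAV file using the stdlib."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "duration": frames / rate,  # seconds
            "sample_rate": rate,        # Hz
            "channels": wav.getnchannels(),
            "format": "wav",
        }


# Write one second of 16-bit mono silence so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)       # 2 bytes per sample = 16-bit
    wav.setframerate(24000)
    wav.writeframes(b"\x00\x00" * 24000)

info = probe_wav(demo_path)
```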

`audio_play(audio_path, blocking=False)`

Play an audio file using the host's default player. Supports macOS (afplay) and Linux (aplay, paplay, or ffplay).
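Player discovery along these lines can be sketched with shutil.which. This is an illustration rather than the tool's actual logic: find_player is a hypothetical helper, and trying the Linux players in the order they are listed above is an assumption.

```python
import shutil
import sys

# Candidate players per platform, taken from the description above.
PLAYERS = {
    "darwin": ["afplay"],
    "linux": ["aplay", "paplay", "ffplay"],
}


def find_player(platform=None, which=shutil.which):
    """Return the first available player name, or None if none is found."""
    platform = sys.platform if platform is None else platform
    for key, candidates in PLAYERS.items():
        if platform.startswith(key):
            for name in candidates:
                if which(name):
                    return name
    return None


player = find_player()  # None on unsupported hosts or when nothing is installed
```

Passing which as a parameter keeps the lookup testable without depending on what is installed on the host.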

Web UI

`omnivoice_demo_serve(action="start", ip="0.0.0.0", port=8001, model_id="")`

Manage the upstream omnivoice-demo Gradio web UI as a background process. Actions: start / stop / status / logs.

Loader API (advanced)

`get_model(model_id=None, device=None, dtype=None, force=False)`

Return the cached OmniVoice instance. Loads on first call.
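The load-on-first-call behaviour can be pictured with a generic caching sketch. It is not the loader's actual code: get_cached and fake_loader are hypothetical, and reloading when model_id or device changes is an assumption.

```python
_cache = {"model": None, "key": None}


def get_cached(loader, model_id="k2-fsa/OmniVoice", device="cpu", force=False):
    """Return the cached object, calling loader only when needed."""
    key = (model_id, device)
    # Assumption: a different model_id or device triggers a reload.
    if force or _cache["model"] is None or _cache["key"] != key:
        _cache["model"] = loader(model_id, device)
        _cache["key"] = key
    return _cache["model"]


calls = []


def fake_loader(model_id, device):
    """Stand-in for an expensive model load; records each invocation."""
    calls.append((model_id, device))
    return object()


first = get_cached(fake_loader)               # loads
second = get_cached(fake_loader)              # served from cache
third = get_cached(fake_loader, force=True)   # reloads
```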

`unload_model()`

Drop the cached weights and free GPU memory.

`get_loaded_info()`

Inspect the cache state.

`best_device()`

Auto-detect the best available backend (cuda → mps → cpu).
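The cuda → mps → cpu preference order can be written as a small pure function. pick_device is a hypothetical name and only illustrates the ordering; it takes availability flags as arguments so it runs without PyTorch installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# With PyTorch installed, the flags could come from:
#   torch.cuda.is_available() and torch.backends.mps.is_available()
device = pick_device(cuda_available=False, mps_available=True)
```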
