Voice Design

Describe the speaker via attributes such as gender, age, pitch, accent, dialect, and whisper. No reference audio is needed.

from strands_omnivoice import omnivoice_design

omnivoice_design(
    text="Welcome to the future.",
    output="/tmp/v.wav",
    instruct="female, young adult, high pitch, british accent",
)

Hear the differences

[Audio samples: the same text in four different designed voices.]
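A comparison of this kind could be reproduced locally with a small loop. The sketch below is an illustration under stated assumptions, not part of the library: `render_voices` and the `VOICES` presets are hypothetical names, and the synthesis function is passed in as a parameter so the loop depends only on the `text`, `output`, and `instruct` keyword arguments shown in the example above.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical presets; each value is an `instruct` string built from
# attribute values documented on this page.
VOICES: Dict[str, str] = {
    "bright_british": "female, young adult, high pitch, british accent",
    "quiet_elder": "male, elderly, low pitch, whisper",
    "indian_midlife": "middle-aged, indian accent",
    "deep_american": "male, middle-aged, very low pitch, american accent",
}


def render_voices(
    synthesize: Callable[..., object],
    text: str,
    out_dir: str,
    voices: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Call `synthesize` once per preset and return {preset name: output path}."""
    presets = VOICES if voices is None else voices
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths: Dict[str, str] = {}
    for name, instruct in presets.items():
        target = str(out / f"{name}.wav")
        # Assumed call shape: the same keyword arguments as the example above.
        synthesize(text=text, output=target, instruct=instruct)
        paths[name] = target
    return paths
```

With the package installed, `render_voices(omnivoice_design, "Welcome to the future.", "/tmp/voices")` would be expected to write one WAV file per preset, assuming `omnivoice_design` accepts those keyword arguments as in the example above.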

Attribute Categories

`instruct` is a comma-separated list of attribute values. Use at most one value per category; values from different categories can be combined freely.

Category	Values
Gender	`male`, `female`
Age	`child`, `teenager`, `young adult`, `middle-aged`, `elderly`
Pitch	`very low pitch`, `low pitch`, `moderate pitch`, `high pitch`, `very high pitch`
Style	`whisper`
English accent (EN text only)	`american accent`, `british accent`, `australian accent`, `canadian accent`, `indian accent`, `chinese accent`, `korean accent`, `portuguese accent`, `russian accent`, `japanese accent`
Chinese dialect (ZH text only)	`四川话`, `陕西话`, `东北话`, `云南话`, `河南话`, `贵州话`, `桂林话`, `济南话`, `石家庄话`, `甘肃话`, `宁夏话`, `青岛话`
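The one-value-per-category rule can be checked before calling the model. The following is a minimal sketch, not the library's own validation: `CATEGORIES` and `check_instruct` are hypothetical names, and the lookup only knows the values listed in the table above, so it is stricter than the library (for example, it would reject Chinese aliases such as `女`).

```python
from typing import Dict, List

# Attribute values copied from the table above, grouped by category.
CATEGORIES: Dict[str, List[str]] = {
    "gender": ["male", "female"],
    "age": ["child", "teenager", "young adult", "middle-aged", "elderly"],
    "pitch": ["very low pitch", "low pitch", "moderate pitch",
              "high pitch", "very high pitch"],
    "style": ["whisper"],
    "accent": ["american accent", "british accent", "australian accent",
               "canadian accent", "indian accent", "chinese accent",
               "korean accent", "portuguese accent", "russian accent",
               "japanese accent"],
    "dialect": ["四川话", "陕西话", "东北话", "云南话", "河南话", "贵州话",
                "桂林话", "济南话", "石家庄话", "甘肃话", "宁夏话", "青岛话"],
}

# Reverse lookup: attribute value -> category name.
VALUE_TO_CATEGORY: Dict[str, str] = {
    value: category
    for category, values in CATEGORIES.items()
    for value in values
}


def check_instruct(instruct: str) -> Dict[str, str]:
    """Return {category: value}; raise ValueError on unknown or conflicting values."""
    chosen: Dict[str, str] = {}
    for raw in instruct.split(","):
        value = raw.strip().lower()
        if not value:
            continue  # tolerate stray commas
        category = VALUE_TO_CATEGORY.get(value)
        if category is None:
            raise ValueError(f"unknown attribute: {value!r}")
        if category in chosen:
            raise ValueError(
                f"two values for {category}: {chosen[category]!r} and {value!r}"
            )
        chosen[category] = value
    return chosen
```

Calling `check_instruct("male, female")` raises `ValueError` because both values belong to the gender category, while `check_instruct("middle-aged, indian accent")` passes with gender left unset.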

Examples

"female, young adult, high pitch, british accent"
"male, elderly, low pitch, whisper"
"女, 青年, 四川话"
"middle-aged, indian accent"   # gender omitted — model fills in

Tips

Case-insensitive: `Male`, `MALE`, and `male` are equivalent.
Mix English and Chinese: attribute values in either language can be combined and are normalised internally.
Less is more: if a combination behaves oddly, simplify it.
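One way to apply the "less is more" tip systematically is to generate progressively shorter `instruct` strings and try them in order. This is only a sketch: `simplifications` is a hypothetical helper, and dropping the last attribute first is an arbitrary choice made here, not library behaviour. Lowercasing relies on the case-insensitivity noted above.

```python
from typing import List


def simplifications(instruct: str) -> List[str]:
    """Return the cleaned instruct string, then variants with trailing attributes removed."""
    parts = [p.strip().lower() for p in instruct.split(",") if p.strip()]
    # Longest variant first, then drop one trailing attribute at a time.
    return [", ".join(parts[:n]) for n in range(len(parts), 0, -1)]


for candidate in simplifications("Female, Young Adult, High Pitch, British Accent"):
    print(candidate)
# female, young adult, high pitch, british accent
# female, young adult, high pitch
# female, young adult
# female
```

Ordering the attributes in the original string from most to least important keeps the ones that matter most in every shortened variant.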

See the upstream voice-design reference for the canonical attribute table.