Voice Creation Guidelines

When you give your bot a voice, you're shaping how it sounds to everyone who hears it. These guidelines explain what to keep in mind when writing a voice prompt or cloning a voice, and what tends to cause a voice submission to be rejected.

Our bot creation policies are covered in these articles: Generative AI Image Policy and Creating & Using Bots.

Writing a voice prompt

The best voice prompts describe what the voice sounds like, not who it is. The model can build almost any vocal quality you ask for, as long as the description focuses on sound rather than identity.

Helpful things to describe:

Pitch and timbre. Deep, light, gravelly, smooth, breathy, nasal, warm, bright.
Pace and rhythm. Quick, deliberate, lilting, halting, measured, casual.
Tone and mood. Confident, playful, weary, dry, earnest, theatrical, deadpan.
Accent or region (general). A general regional class works (a Southern US drawl, a soft Irish lilt). Avoid pinning the voice to a specific person.

What to avoid in a voice prompt:

Named real people. Voice prompts that reference real people, famous or not, are rejected. Cantina doesn't allow voice impersonation of real people, regardless of intent.
Named characters. Voice prompts that reference characters from movies, TV, books, or video games are rejected.

What can cause a voice submission to be rejected

Most rejections fall into one of five patterns. Knowing the patterns helps you write a prompt that's likely to produce great results and be accepted.

1. Voices that suggest a minor

Voices that sound like children are not allowed. This includes specific age callouts (any age under 18) and descriptors like baby voice, little, young, infantilized, or childlike. This is a safety guardrail.

2. Voices of real people

Prompts that name or clearly evoke a specific real person will be rejected. That covers politicians, musicians, athletes, influencers, and other public figures, as well as your friend or your neighbor. Real-person impersonation runs into right-of-publicity and impersonation policies regardless of intent.

3. Voices of existing characters

Named characters from movies, TV, books, video games, or other franchises are protected IP. Asking for a specific character or a thinly veiled version (changed name, same setup) violates Cantina's IP policy. Build an original character voice instead.

4. Sexualized voices

Cantina has zero-tolerance policies against the sexualization of minors and the sexualization of individuals without their consent.

Sexually suggestive voices may be rejected. Vocal qualities like breathy or whispered are fine in neutral contexts (a sleepy character, an intimate narrator). The issue arises when they're paired with sexually suggestive intent.

Youth-coded language with sexual descriptors is strictly prohibited and may lead to enforcement against accounts.

5. Voices that rely on slurs or offensive caricatures

Prompts that rely on slurs, offensive caricatures of specific ethnic groups, or explicitly racist language are rejected. Describing a general accent class for a character is fine.

Voice cloning guidelines

Voice cloning lets you make a digital copy of someone's voice for your bot. The how-to (upload, recording requirements, save) lives in How to Give Your Bot a Voice. The rules:

Use your own voice, or one you have direct, affirmative permission to use.
Don't clone a minor's voice. Child-sounding voices aren't allowed on Cantina.
Don't clone real people without their direct, affirmative permission. If the voice belongs to someone else, you need consent from that person specifically for voice cloning or synthetic voice use. Public figures, creators, podcasters, influencers, friends, coworkers, and private individuals are all covered by the same rule.
Don't use a cloned voice for sexual content. Cloning someone's voice and using it for sexual content is strictly prohibited and may lead to enforcement against accounts.
When in doubt, don't clone the voice. Create an original voice prompt instead.

Before any voice clone is created, Cantina shows the creator this notice:

Voice Clone Notice

By uploading a voice file, Cantina will create a digital clone you can apply to your AI characters.

To use this feature, please confirm the voice is yours or that you have permission to provide it, and consent to its collection, use, and storage for this purpose. Cantina will not sell your voice data, and may share it only with service providers acting on its behalf. Your voice may be used to operate and improve our products and services and will be stored securely until you delete it from your voice library.

This consent applies each time you use the feature.

If your submission is rejected

A rejection means the model flagged something in your prompt. To get a working voice:

Reread your prompt. Look for:
- References to a real person (names, public figures, anyone identifiable)
- References to a famous character (named, or a thinly veiled stand-in)
- Anything age-coded under 18 (specific ages, baby, little, young, childlike)
- Sexual descriptors, or vocal qualities like breathy or whisper paired with sexually suggestive intent
- Slurs, ethnic caricatures, or stereotype-based mockery
These cover the most common triggers.
Replace identity with sound. Swap named people or characters for vocal qualities. "A confident, gravelly voice with a slow Southern drawl" works. "Sounds like [a specific actor]" doesn't.
Resubmit with the rewrite. If it still gets rejected, look for any remaining references to a real person, a famous character, a minor, sexual themes, or offensive caricatures in your prompt.

If you're stuck, the How to Give Your Bot a Voice walkthrough has examples of prompts that work.