Alicloud Ai Audio Cosyvoice Voice Design
v1.0.0
Description
name: alicloud-ai-audio-cosyvoice-voice-design
description: Use when designing custom voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from a voice prompt plus preview text before using the returned voice_id in TTS.
version: 1.0.0
Category: provider
Model Studio CosyVoice Voice Design
Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.
Critical model names
Use model="voice-enrollment" and one of these target_model values:
- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
Recommended default in this repo:
target_model="cosyvoice-v3.5-plus"
Region and compatibility
- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice clone/design.
- The target_model must match the model used later for speech synthesis.
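The compatibility notes above can be checked before any request is sent. The following is a minimal sketch, not part of the skill's scripts: the function names and the "cn" / "intl" region labels are hypothetical, and the rules are only those stated in this section.

```python
# Model groups taken from the compatibility notes in this document.
MAINLAND_ONLY = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
NO_DESIGN_INTL = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def check_region(target_model: str, region: str) -> None:
    """Raise ValueError if the model/region pair is ruled out by the notes above.

    `region` is a hypothetical label: "cn" (Beijing) or "intl" (Singapore).
    """
    if region not in {"cn", "intl"}:
        raise ValueError(f"unknown region: {region!r}")
    if target_model not in MAINLAND_ONLY | NO_DESIGN_INTL:
        raise ValueError(f"unknown target_model: {target_model!r}")
    if region == "intl" and target_model in MAINLAND_ONLY:
        raise ValueError(f"{target_model} is available only in China mainland mode")
    if region == "intl" and target_model in NO_DESIGN_INTL:
        raise ValueError(
            f"{target_model} does not support voice design in international mode"
        )


def check_synthesis_match(target_model: str, synthesis_model: str) -> None:
    """Raise ValueError if the later TTS model differs from the design target."""
    if target_model != synthesis_model:
        raise ValueError(
            f"voice designed for {target_model} cannot be used with {synthesis_model}"
        )
```

Calling check_region("cosyvoice-v3.5-plus", "cn") returns silently, while the same model with "intl" raises ValueError.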
Endpoint
- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
Prerequisites
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
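The lookup order described above (environment variable first, then the credentials file) might be sketched as follows. This document does not specify the file format, so the INI layout assumed here, and the resolve_api_key helper itself, are illustrative assumptions rather than the skill's actual loader.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None

    # ASSUMPTION: the file is INI-style, e.g. a [default] section holding
    # "dashscope_api_key = ...". Interpolation is off so "%" in keys is safe.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None  # not parseable as INI; the caller decides what to do

    value = parser.defaults().get("dashscope_api_key")
    if value:
        return value
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```

Returning None instead of raising lets the caller decide whether a missing key is fatal.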
Normalized interface (cosyvoice.voice_design)
Request
- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_prompt (string, required): max 500 chars, Chinese or English only
- preview_text (string, required): max 200 chars, Chinese or English
- language_hints (array[string], optional): zh or en, and should match preview_text
- sample_rate (int, optional): e.g. 24000
- response_format (string, optional): e.g. wav
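The field constraints above can be enforced locally before calling the service. This is a minimal sketch of the normalized request only, not the wire format of the API and not the repo's helper script. The build_design_request name is hypothetical, and two points are assumptions: "letters/digits only" is read as ASCII alphanumerics, and lengths are counted in Unicode characters.

```python
from typing import List, Optional

ALLOWED_TARGET_MODELS = {
    "cosyvoice-v3.5-plus",
    "cosyvoice-v3.5-flash",
    "cosyvoice-v3-plus",
    "cosyvoice-v3-flash",
}
ALLOWED_LANGUAGE_HINTS = {"zh", "en"}


def build_design_request(
    prefix: str,
    voice_prompt: str,
    preview_text: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[List[str]] = None,
    sample_rate: Optional[int] = None,
    response_format: Optional[str] = None,
) -> dict:
    """Validate the fields listed above and return a normalized request dict."""
    if target_model not in ALLOWED_TARGET_MODELS:
        raise ValueError(f"unsupported target_model: {target_model!r}")
    # ASSUMPTION: "letters/digits only" means ASCII alphanumerics.
    if not (prefix.isascii() and prefix.isalnum() and len(prefix) <= 10):
        raise ValueError("prefix must be 1-10 letters/digits")
    # ASSUMPTION: limits are counted in Unicode characters via len().
    if not 0 < len(voice_prompt) <= 500:
        raise ValueError("voice_prompt must be 1-500 chars")
    if not 0 < len(preview_text) <= 200:
        raise ValueError("preview_text must be 1-200 chars")
    if language_hints and not set(language_hints) <= ALLOWED_LANGUAGE_HINTS:
        raise ValueError("language_hints may only contain 'zh' or 'en'")

    request = {
        "model": "voice-enrollment",  # fixed value per the field list
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    # Optional fields are included only when the caller supplies them.
    if language_hints:
        request["language_hints"] = list(language_hints)
    if sample_rate is not None:
        request["sample_rate"] = sample_rate
    if response_format is not None:
        request["response_format"] = response_format
    return request
```

Omitting unset optional fields keeps the request minimal and leaves defaults to the service.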
Response
- voice_id (string)
- request_id (string)
- status (string, optional)
Operational guidance
- Keep voice_prompt concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If language_hints is used, it should match the language of preview_text.
- Designed voice names include a -vd- marker in the generated backend naming convention.
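The guidance that language_hints should match preview_text can be approximated with a rough local check. This is a heuristic sketch, not anything the service does: both function names are hypothetical, and the assumption is that any character in the CJK Unified Ideographs block means the text is Chinese.

```python
from typing import List, Optional


def guess_language_hint(text: str) -> str:
    """Return "zh" if the text contains a CJK ideograph, otherwise "en"."""
    # ASSUMPTION: U+4E00..U+9FFF is a good-enough signal for Chinese text.
    return "zh" if any("\u4e00" <= ch <= "\u9fff" for ch in text) else "en"


def hints_match_preview(preview_text: str, language_hints: Optional[List[str]]) -> bool:
    """Return True when no hints are given or the guessed language is among them."""
    if not language_hints:
        return True
    return guess_language_hint(preview_text) in language_hints
```

Mixed-language preview text is classified as zh by this heuristic, so treat a mismatch as a warning rather than a hard failure.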
Local helper script
Prepare a normalized request JSON:
python skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
--target-model cosyvoice-v3.5-plus \
--prefix announcer \
--voice-prompt "沉稳的中年男性播音员,低沉有磁性,语速平稳,吐字清晰。" \
--preview-text "各位听众朋友,大家好,欢迎收听晚间新闻。" \
--language-hint zh
Validation
mkdir -p output/alicloud-ai-audio-cosyvoice-voice-design
for f in skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/*.py; do
python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt is generated.
Output And Evidence
- Save artifacts, command outputs, and API response summaries under output/alicloud-ai-audio-cosyvoice-voice-design/.
- Include target_model, prefix, voice_prompt, and preview_text in the evidence file.
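An evidence file meeting these requirements could be written as below. This is a sketch under assumptions: the write_evidence helper, the evidence.json filename, and the JSON layout are illustrative and not defined by the skill. Only the output directory and the four required fields come from this document.

```python
import json
from pathlib import Path

OUTPUT_DIR = Path("output/alicloud-ai-audio-cosyvoice-voice-design")
REQUIRED_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(
    request: dict,
    response: dict,
    output_dir: Path = OUTPUT_DIR,
    filename: str = "evidence.json",  # hypothetical filename
) -> Path:
    """Write the required request fields plus a response summary as JSON."""
    missing = [name for name in REQUIRED_FIELDS if name not in request]
    if missing:
        raise ValueError(f"request is missing required fields: {missing}")

    evidence = {name: request[name] for name in REQUIRED_FIELDS}
    # Keep only the response fields listed in the normalized interface.
    evidence["response"] = {
        name: response.get(name) for name in ("voice_id", "request_id", "status")
    }

    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / filename
    # ensure_ascii=False keeps Chinese prompts readable in the file.
    path.write_text(
        json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

The function returns the written path so it can be logged alongside other artifacts.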
References
- references/api_reference.md
- references/sources.md