tencent-tts-podcast
v1.0.0
Description
name: Tencent TTS Podcast Generator
description: Convert text to podcast audio using Tencent Cloud TTS. Supports both short and long text processing, generates up to 30-minute long audio with automatic chunking and parallel processing. Supports 26 Chinese voices including basic, featured, customer service, and Tencent featured voices.
license: MIT
Tencent TTS Podcast Generator
Convert text content to podcast audio files using Tencent Cloud TTS service.
Capabilities
What This Skill Can Do
- Short & Long Text Compatible: Detects text length automatically, sends short text in a single request, and auto-chunks long text
- Long Text to Speech: Supports generating podcasts up to 30 minutes long (~7200 characters)
- Concurrent Processing: Long texts are automatically split and processed in parallel for faster generation
- 26 Voices: Supports basic, featured, customer service, and Tencent featured voices
- Smart Chunking: Splits text at semantic boundaries (paragraph/sentence) for natural audio flow
- Duration Estimation: Automatically estimates generated audio duration
- Auto Retry: Automatically retries failed requests to improve success rate
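The 30-minute / ~7200-character figure above implies a speaking rate of roughly 240 characters per minute. A minimal sketch of a duration estimate built on that ratio follows; the function name and the choice to ignore whitespace are assumptions for illustration, not the Skill's actual implementation.

```python
def estimate_duration_seconds(text: str, chars_per_minute: float = 240.0) -> float:
    """Rough audio length estimate.

    240 chars/min is derived from the stated 7200 chars ~ 30 min;
    real speed depends on the voice and the text (assumption).
    """
    # Count only non-whitespace characters, since whitespace is not spoken.
    spoken = sum(1 for ch in text if not ch.isspace())
    return spoken / chars_per_minute * 60.0
```

For example, a 7200-character script comes out at 1800 seconds, matching the 30-minute ceiling.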
Short & Long Text Processing Strategy
Note: A single Tencent Cloud TTS request accepts roughly 150 characters. Longer text is chunked automatically.
| Text Type | Length Range | Processing Method | Concurrency | Timeout |
|---|---|---|---|---|
| Ultra Short | ≤50 chars | Direct request | 1 | 30s |
| Short | 50-150 chars | Direct request | 1 | 30s |
| Medium | 150-500 chars | Auto-chunk (2-4 chunks) | 2-3 | 60s |
| Long | 500-2000 chars | Auto-chunk (4-14 chunks) | 3-5 | 60s |
| Extra Long | 2000-7200 chars | Auto-chunk (14-50 chunks) | 3-5 | 60s |
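The table can be read as a small decision function. The sketch below is one possible encoding: `Strategy` and `pick_strategy` are hypothetical names, and where the table gives a concurrency range the upper bound is picked arbitrarily.

```python
from dataclasses import dataclass

SINGLE_REQUEST_LIMIT = 150  # approximate per-request limit from the note above


@dataclass(frozen=True)
class Strategy:
    chunked: bool      # whether the text is split before synthesis
    max_workers: int   # concurrent requests
    timeout: int       # per-request timeout in seconds


def pick_strategy(text: str) -> Strategy:
    """Map text length to the processing settings listed in the table."""
    n = len(text)
    if n <= SINGLE_REQUEST_LIMIT:
        # Ultra short and short text: one direct request.
        return Strategy(chunked=False, max_workers=1, timeout=30)
    if n <= 500:
        # Medium text: table gives 2-3 workers; 3 is an arbitrary pick.
        return Strategy(chunked=True, max_workers=3, timeout=60)
    # Long and extra long text: table gives 3-5 workers; 5 is an arbitrary pick.
    return Strategy(chunked=True, max_workers=5, timeout=60)
```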
What This Skill Does NOT Do
- Does not generate mp3 format (wav only)
- Does not support background music or sound effects
- Does not auto-generate podcast scripts (user must provide)
- Does not support dual-speaker dialogue mode (single voice only)
File Structure
This Skill consists of the following files:
- tts_podcast.py: Main entry script
  - Tencent Cloud TTS signature generation
  - Audio file generation
  - COS upload functionality
- tts_tool.py: AgentScope tool interface wrapper
- SKILL.md: This file, describing Skill capabilities, boundaries, and usage conventions
- requirements.txt: Python dependency configuration
Input & Output Specifications
Input Parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| Text | Text content to convert | Yes | - |
| VoiceType | Voice ID (see voice table below, either this or VoiceName) | No | 502006 |
| VoiceName | Voice name (see voice table below, either this or VoiceType) | No | - |
| secret_id | Tencent Cloud SecretId | Yes | - |
| secret_key | Tencent Cloud SecretKey | Yes | - |
| max_workers | Concurrent threads (3-5 for long text, 1 for short) | No | 3 |
| chunk_size | Chunk size in characters (long text optimization) | No | 140 |
| timeout | Request timeout in seconds | No | 30/60 |
| enable_retry | Enable automatic retry | No | true |
| max_retries | Max retry attempts | No | 2 |
| preserve_paragraphs | Preserve paragraph boundaries when chunking | No | true |
| cos_secret_id | Tencent Cloud COS SecretId (optional, defaults to TTS credentials) | No | - |
| cos_secret_key | Tencent Cloud COS SecretKey (optional, defaults to TTS credentials) | No | - |
| upload_cos | Whether to upload to COS, true/false (default false, local only) | No | false |
| bucket_name | COS Bucket name (default: ti-aoi) | No | ti-aoi |
| app_id | COS App ID (default: 1257195185) | No | 1257195185 |
| region | COS region (default: ap-guangzhou) | No | ap-guangzhou |
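The COS credentials are described as defaulting to the TTS credentials. A minimal sketch of that fallback is shown here; `resolve_cos_credentials` is a hypothetical helper, and the real script may resolve credentials differently.

```python
from typing import Dict, Tuple


def resolve_cos_credentials(params: Dict[str, object]) -> Tuple[str, str]:
    """Return (secret_id, secret_key) for COS, falling back to the TTS pair."""
    # Prefer the COS-specific values; otherwise reuse the TTS credentials.
    secret_id = params.get("cos_secret_id") or params.get("secret_id")
    secret_key = params.get("cos_secret_key") or params.get("secret_key")
    if not secret_id or not secret_key:
        raise ValueError("missing credentials: provide secret_id and secret_key")
    return str(secret_id), str(secret_key)
```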
Output
{
"Code": 0,
"Msg": "success",
"AudioUrl": "https://xxx.cos.ap-guangzhou.myqcloud.com/xxx.wav"
}
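Callers will usually want to check `Code` before using `AudioUrl`. The sketch below assumes that any non-zero `Code` signals failure and that `Msg` carries the error text; only the success shape is documented above, so treat both as assumptions.

```python
def audio_url_or_raise(result: dict) -> str:
    """Return AudioUrl from a result dict, raising if the call did not succeed."""
    # Assumption: Code == 0 means success, anything else is an error.
    if result.get("Code") != 0:
        raise RuntimeError(f"TTS failed: {result.get('Msg', 'unknown error')}")
    return result.get("AudioUrl", "")
```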
Usage
Environment Requirements
- Python 3.8+
- tencentcloud-sdk-python
- cos-python-sdk-v5
- requests
Install Dependencies
pip install -r requirements.txt
Basic Usage
from tts_podcast import main
result = main({
"Text": "Hello, welcome to today's podcast.",
"VoiceType": 502006,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY"
})
print(result)
# {'Code': 0, 'Msg': 'success', 'AudioUrl': 'https://...'}
Short Text Optimized Usage
# Short text (≤150 chars) - Use single thread for fast response
result = main({
"Text": "Hello, this is a short message.",
"VoiceType": 502006,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY",
"max_workers": 1, # Single thread is sufficient
"timeout": 30, # 30 second timeout
"enable_retry": True # Enable retry
})
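The `enable_retry` and `max_retries` options suggest a retry loop around each request. One common way to write such a loop is sketched below; `call_with_retry`, the exponential backoff, and the `base_delay` parameter are illustrative assumptions rather than the Skill's own code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], max_retries: int = 2, base_delay: float = 0.5) -> T:
    """Call fn(), retrying up to max_retries times after a failure."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= max_retries:
                # Out of retries: surface the last error to the caller.
                raise
            # Back off exponentially: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With `max_retries=2`, a request is attempted at most three times in total.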
Long Text Optimized Usage
# Long text (>150 chars) - Use concurrency for speed
long_text = """Chapter 1: The Origin of AI
The concept of artificial intelligence can be traced back to ancient Greek mythology..."""
result = main({
"Text": long_text,
"VoiceType": 502007,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY",
"max_workers": 5, # Concurrent processing
"chunk_size": 140, # 140 chars per chunk
"timeout": 60, # 60 second timeout
"preserve_paragraphs": True # Preserve paragraph boundaries
})
Voice Reference
| VoiceType | Voice Name | Characteristics |
|---|---|---|
| 0 | 普通女声 | Standard female |
| 1 | 普通男声 | Standard male |
| 5 | 情感女声 | Emotional female |
| 6 | 情感男声 | Emotional male |
| 1000 | 智障少女 | Lively cute |
| 1001 | 阳光少年 | Bright youthful |
| 1002 | 温柔淑女 | Gentle female |
| 1003 | 成熟青年 | Mature male |
| 1004 | 严厉管事 | Stern female |
| 1005 | 亲和女声 | Friendly female |
| 1006 | 甜美女声 | Sweet female |
| 1007 | 磁性男声 | Magnetic male |
| 1008 | 播音主播 | Broadcast anchor |
| 101001 | 客服女声 | Customer service |
| 101005 | 售前客服 | Pre-sales service |
| 101007 | 售后客服 | After-sales service |
| 101008 | 亲和客服 | Friendly service |
| 502006 | 小旭 | Tencent voice |
| 502007 | 小巴 | Tencent voice |
| 502008 | 思驰 | Tencent voice |
| 502009 | 思佳 | Tencent voice |
| 502010 | 思悦 | Tencent voice |
| 502011 | 小宁 | Tencent voice |
| 502012 | 小杨 | Tencent voice |
| 502013 | 云扬 | Tencent voice |
| 502014 | 云飞 | Tencent voice |
Technical Architecture
tts_podcast.py
- TTS: Signs Tencent Cloud TTS API requests with signature v3 (TC3-HMAC-SHA256)
- Upload: Uses Tencent Cloud COS SDK for audio file upload
- Auth: Supports credentials from parameters or environment variables
- Short & Long Text Compatible:
- Short text (≤150 chars): Direct single request, fast response
- Long text (>150 chars): Smart chunking + concurrent processing + auto-merge
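The long-text path (concurrent requests followed by a merge) can be sketched with the standard library alone. In the sketch below, `synth` stands in for whatever function performs one TTS request and returns WAV bytes; it and both helper names are hypothetical, and the merge assumes every chunk shares the same sample rate, sample width, and channel count.

```python
import io
import wave
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def synthesize_all(chunks: List[str], synth: Callable[[str], bytes],
                   max_workers: int = 3) -> List[bytes]:
    """Run synth over all chunks in parallel, keeping the original order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even if requests finish out of order.
        return list(pool.map(synth, chunks))


def merge_wavs(wav_blobs: List[bytes]) -> bytes:
    """Concatenate WAV files by appending their audio frames."""
    out = io.BytesIO()
    writer = None
    for blob in wav_blobs:
        with wave.open(io.BytesIO(blob), "rb") as reader:
            if writer is None:
                # Copy the audio format from the first chunk.
                writer = wave.open(out, "wb")
                writer.setparams(reader.getparams())
            writer.writeframes(reader.readframes(reader.getnframes()))
    if writer is None:
        raise ValueError("no audio to merge")
    writer.close()  # patches the frame count in the WAV header
    return out.getvalue()
```

Preserving chunk order matters more than raw speed here, which is why the sketch uses `map()` rather than collecting futures as they complete.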
Text Chunking Strategy
- Paragraph Priority: Try to preserve paragraph integrity, split at paragraph boundaries
- Sentence Boundaries: When paragraphs are too long, split at sentence ending punctuation (。!?;)
- Semantic Protection: Avoid truncating in the middle of words, ensure semantic coherence
- Length Control: Each chunk does not exceed 150 characters (Tencent Cloud API limit)
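The strategy above can be approximated in a few lines. This is a simplified sketch, not the Skill's actual chunker: `chunk_text` is a hypothetical name, it treats each line break as a paragraph boundary, accepts both full-width and ASCII sentence punctuation, and hard-splits any single sentence longer than the limit.

```python
import re
from typing import List

# Zero-width split point right after sentence-ending punctuation.
# Note: the ASCII "." also matches decimals such as "3.14" (known limitation).
SENTENCE_END = re.compile(r"(?<=[。！？；.!?;])")


def chunk_text(text: str, chunk_size: int = 140) -> List[str]:
    """Split text into chunks of at most chunk_size characters."""
    chunks: List[str] = []
    for paragraph in (p.strip() for p in text.split("\n")):
        if not paragraph:
            continue
        if len(paragraph) <= chunk_size:
            # Paragraph priority: keep short paragraphs whole.
            chunks.append(paragraph)
            continue
        current = ""
        for sentence in (s for s in SENTENCE_END.split(paragraph) if s):
            # A single over-long sentence is cut at the limit as a last resort.
            while len(sentence) > chunk_size:
                if current:
                    chunks.append(current)
                    current = ""
                chunks.append(sentence[:chunk_size])
                sentence = sentence[chunk_size:]
            if len(current) + len(sentence) > chunk_size:
                chunks.append(current)
                current = sentence
            else:
                current += sentence
        if current:
            chunks.append(current)
    return chunks
```

The default of 140 mirrors the `chunk_size` parameter and leaves a small margin under the ~150-character request limit.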
License
MIT