Video Narrator

v1.0.0

name: senseaudio-video-narrator
description: Generate SenseAudio TTS narration tracks for videos, including timestamped segments, style variants, and editor-ready voiceover exports. Use when users need voiceovers, video narration, timed commentary, or accessibility narration.
metadata:
  openclaw:
    requires:
      env:
        - SENSEAUDIO_API_KEY
      bins:
        - python3
        - ffmpeg
    primaryEnv: SENSEAUDIO_API_KEY
    homepage: https://senseaudio.cn
    install:
      - kind: uv
        package: requests
      - kind: uv
        package: pydub
compatibility:
  required_credentials:
    - name: SENSEAUDIO_API_KEY
      description: API key from https://senseaudio.cn/platform/api-key
      env_var: SENSEAUDIO_API_KEY
homepage: https://senseaudio.cn

SenseAudio Video Narrator

Create professional narration audio for videos with timing-aware segmentation, natural delivery, and editor-friendly exports.

What This Skill Does

  • Generate narration audio synchronized to script timestamps
  • Match narration style to video genre such as documentary or tutorial
  • Control pacing with official TTS parameters and text break markers
  • Create multiple narration takes with different voices or styles
  • Export audio segments and merged narration tracks for editing workflows

Credential and Dependency Rules

  • Read the API key from SENSEAUDIO_API_KEY.
  • Send auth only as Authorization: Bearer <API_KEY>.
  • Do not place API keys in query parameters, logs, or saved examples.
  • If Python helpers are used, this skill expects python3, requests, and pydub.
  • pydub is used only for optional local audio assembly and mixing; it relies on ffmpeg to decode and encode mp3 files.
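
The credential rules above can be sketched as a small helper. This is an illustration only, not part of any official SenseAudio tooling: the function names `build_auth_headers` and `redact` are hypothetical, and the `env` parameter exists purely to make the helper easy to test.

```python
import os


def build_auth_headers(env=None):
    """Build request headers with the key read from SENSEAUDIO_API_KEY."""
    # Hypothetical helper: `env` defaults to the process environment.
    env = os.environ if env is None else env
    api_key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def redact(headers):
    """Return a copy of the headers that is safe to print or log."""
    safe = dict(headers)
    if "Authorization" in safe:
        safe["Authorization"] = "Bearer ***"
    return safe
```

Logging `redact(headers)` instead of `headers` keeps the key out of logs and saved examples while still showing which headers were sent.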

Official TTS Constraints

Use the official SenseAudio TTS rules summarized below:

  • HTTP endpoint: POST https://api.senseaudio.cn/v1/t2a_v2
  • Model: SenseAudio-TTS-1.0
  • Max text length per request: 10000 characters
  • voice_setting.voice_id is required
  • voice_setting.speed range: 0.5-2.0
  • voice_setting.pitch range: -12 to 12
  • Optional audio formats: mp3, wav, pcm, flac
  • Optional sample rates: 8000, 16000, 22050, 24000, 32000, 44100
  • Optional MP3 bitrates: 32000, 64000, 128000, 256000
  • Optional channels: 1 or 2
  • extra_info.audio_length returns segment duration in milliseconds
  • Inline break markup such as <break time=500> is supported in text
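
One way to apply these constraints is to check a request locally before sending it. The sketch below is an assumption-laden illustration: the function name `validate_request` is hypothetical, the ranges are treated as inclusive, and the text limit is measured with Python's `len()`, which may not match how the service counts characters.

```python
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "flac"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
ALLOWED_MP3_BITRATES = {32000, 64000, 128000, 256000}
MAX_TEXT_CHARS = 10000


def validate_request(text, voice_id, speed=1.0, pitch=0, fmt="mp3",
                     sample_rate=32000, bitrate=128000, channel=2):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not text:
        problems.append("text is empty")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters, limit is {MAX_TEXT_CHARS}")
    if not voice_id:
        problems.append("voice_id is required")
    if not 0.5 <= speed <= 2.0:
        problems.append(f"speed {speed} is outside 0.5-2.0")
    if not -12 <= pitch <= 12:
        problems.append(f"pitch {pitch} is outside -12 to 12")
    if fmt not in ALLOWED_FORMATS:
        problems.append(f"format {fmt!r} is not supported")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"sample rate {sample_rate} is not supported")
    # The bitrate list applies to mp3 output only.
    if fmt == "mp3" and bitrate not in ALLOWED_MP3_BITRATES:
        problems.append(f"mp3 bitrate {bitrate} is not supported")
    if channel not in (1, 2):
        problems.append(f"channel count {channel} must be 1 or 2")
    return problems
```

Returning every problem at once, rather than raising on the first, makes it easier to fix a whole script in a single pass.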

Recommended Workflow

  1. Prepare the script:
       • Split narration into timestamped segments.
       • Keep each segment comfortably below the 10000 character limit.
  2. Choose a voice and pacing profile:
       • Pick a voice_id and tune speed, pitch, and optional vol.
       • Use shorter segments when timing precision matters.
  3. Generate audio segments:
       • Call the TTS API for each segment.
       • Decode data.audio from hex before saving.
       • Capture extra_info.audio_length for timeline metadata.
  4. Assemble the narration track locally:
       • Use pydub to position clips on a silent master track.
       • Keep per-segment files for easier editor import and retiming.
  5. Validate timing against the video:
       • Leave small gaps when natural pacing is needed.
       • Adjust segment boundaries instead of overusing extreme speed values.
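
The timing validation step can be sketched as a check for clips that run into the next segment. This is a minimal illustration, assuming each segment is a dict with a `timestamp` start time and a `duration_ms` taken from `extra_info.audio_length`; the function name `find_timing_conflicts` and the `min_gap_ms` parameter are hypothetical.

```python
def find_timing_conflicts(segments, min_gap_ms=0):
    """Report segments whose audio ends too close to, or after, the next start.

    Each segment needs "timestamp" (start, ms) and "duration_ms" (clip length, ms).
    """
    ordered = sorted(segments, key=lambda s: s["timestamp"])
    conflicts = []
    for current, following in zip(ordered, ordered[1:]):
        end_ms = current["timestamp"] + current["duration_ms"]
        # Positive overrun means the clip eats into the required gap.
        overrun_ms = end_ms + min_gap_ms - following["timestamp"]
        if overrun_ms > 0:
            conflicts.append({
                "timestamp": current["timestamp"],
                "next_timestamp": following["timestamp"],
                "overrun_ms": overrun_ms,
            })
    return conflicts
```

A reported overrun is a cue to shorten the text or move the segment boundary first, and to raise `speed` only as a last resort.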

Minimal Timed Narration Helper

import binascii
import os
import re

import requests

API_KEY = os.environ["SENSEAUDIO_API_KEY"]
API_URL = "https://api.senseaudio.cn/v1/t2a_v2"


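def parse_timed_script(script):
    """Split a script of "[HH:MM:SS] text" entries into timestamped segments."""
    # A segment runs from its timestamp to the next line that starts with "["
    # or to the end of the script, so its text may span several lines.
    pattern = r"\[(\d{2}):(\d{2}):(\d{2})\]\s*(.+?)(?=\n\[|\Z)"
    segments = []
    for match in re.finditer(pattern, script, re.DOTALL):
        hours, minutes, seconds, text = match.groups()
        # Store the start time in milliseconds, the unit pydub uses.
        timestamp_ms = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000
        segments.append({"timestamp": timestamp_ms, "text": text.strip()})
    return segments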


def synthesize_segment(text, voice_id, speed=1.0, pitch=0, vol=1.0):
    """Request one non-streaming mp3 segment; return decoded bytes, duration, and trace ID."""
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "model": "SenseAudio-TTS-1.0",
            "text": text,
            "stream": False,
            "voice_setting": {
                "voice_id": voice_id,
                "speed": speed,
                "pitch": pitch,
                "vol": vol,
            },
            "audio_setting": {
                "format": "mp3",
                "sample_rate": 32000,
                "bitrate": 128000,
                "channel": 2,
            },
        },
        timeout=60,
    )
    response.raise_for_status()
    data = response.json()
    return {
        "audio_bytes": binascii.unhexlify(data["data"]["audio"]),
        "duration_ms": data["extra_info"]["audio_length"],
        "trace_id": data.get("trace_id"),
    }
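
The helper above returns audio bytes, while the assembly step works from files on disk. A minimal sketch of the step in between, assuming the segment shape produced by `parse_timed_script` and a synthesizer with the same return shape as `synthesize_segment`; the function name `render_segments` and the `segment_001.mp3` naming scheme are hypothetical.

```python
import os


def render_segments(segments, synthesize, out_dir, voice_id):
    """Synthesize each segment and write one mp3 file per segment.

    `synthesize` is any callable taking (text, voice_id) and returning a dict
    with "audio_bytes" and "duration_ms".
    """
    os.makedirs(out_dir, exist_ok=True)
    rendered = []
    for index, segment in enumerate(segments, start=1):
        result = synthesize(segment["text"], voice_id)
        path = os.path.join(out_dir, f"segment_{index:03d}.mp3")
        with open(path, "wb") as handle:
            handle.write(result["audio_bytes"])
        # Keep the fields later steps need: file path, start time, clip length.
        rendered.append({
            "file": path,
            "timestamp": segment["timestamp"],
            "duration_ms": result["duration_ms"],
        })
    return rendered
```

Passing the synthesizer in as an argument lets the file handling be exercised with a stub, without calling the API.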

Local Assembly Pattern

from pydub import AudioSegment


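def create_synced_narration(audio_segments, video_duration_ms):
    """Place each clip at its timestamp on a silent track as long as the video.

    Each item needs a "file" path and a "timestamp" in milliseconds.
    """
    narration_track = AudioSegment.silent(duration=video_duration_ms)
    for segment in audio_segments:
        clip = AudioSegment.from_file(segment["file"])
        # overlay() keeps the base track's length, so a clip that runs past
        # video_duration_ms is cut off at the end of the track.
        narration_track = narration_track.overlay(clip, position=segment["timestamp"])
    return narration_track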

Style Presets

  • Documentary: slower speed such as 0.95, neutral pitch
  • Tutorial: speed near 1.0, slightly warmer pitch
  • Commercial: modestly faster speed, slightly higher pitch

Prefer conservative tuning and script editing over extreme voice parameter changes.
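
The presets can be kept as data and turned into a `voice_setting` payload. In the sketch below only the documentary speed of 0.95 comes from the list above; every other number is an illustrative assumption (including reading "warmer" as a slightly lower pitch), and `STYLE_PRESETS` and `voice_setting_for` are hypothetical names.

```python
# Illustrative values; only the documentary speed is taken from the presets above.
STYLE_PRESETS = {
    "documentary": {"speed": 0.95, "pitch": 0},
    "tutorial": {"speed": 1.0, "pitch": -1},
    "commercial": {"speed": 1.05, "pitch": 1},
}


def voice_setting_for(style, voice_id, **overrides):
    """Build a voice_setting dict from a preset, clamped to the documented ranges."""
    preset = STYLE_PRESETS[style.lower()]
    setting = {"voice_id": voice_id, **preset, **overrides}
    # Clamp to the documented ranges: speed 0.5-2.0, pitch -12 to 12.
    setting["speed"] = min(2.0, max(0.5, setting["speed"]))
    setting["pitch"] = min(12, max(-12, setting["pitch"]))
    return setting
```

Generating alternate takes then amounts to calling `voice_setting_for` once per style with the same `voice_id` and script.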

Output Options

  • Per-segment narration clips in mp3 or wav
  • Timing metadata in json
  • Merged narration track for video editors
  • Optional alternate takes with different styles
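
The timing metadata can be written as a small JSON manifest alongside the clips. This is a sketch under assumptions: the field names, the `HH:MM:SS.mmm` timecode format, and the function names `format_timecode` and `build_manifest` are all hypothetical, and each segment is assumed to carry `file`, `timestamp`, and `duration_ms`.

```python
import json


def format_timecode(ms):
    """Convert milliseconds to an HH:MM:SS.mmm timecode string."""
    hours, rest = divmod(int(ms), 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_manifest(rendered_segments, style=None):
    """Return a JSON string describing where each clip sits on the timeline."""
    entries = []
    for segment in sorted(rendered_segments, key=lambda s: s["timestamp"]):
        end_ms = segment["timestamp"] + segment["duration_ms"]
        entries.append({
            "file": segment["file"],
            "start_ms": segment["timestamp"],
            "end_ms": end_ms,
            "start": format_timecode(segment["timestamp"]),
            "end": format_timecode(end_ms),
        })
    return json.dumps({"style": style, "segments": entries}, indent=2)
```

The manifest deliberately leaves out `trace_id`, so it can be shared with editors without exposing request identifiers.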

Safety Notes

  • Do not hardcode credentials.
  • Do not assume local media tooling exists beyond what is declared here.
  • Treat returned trace_id and generated narration assets as potentially sensitive production data.
