🧪 Skills

SOTA Zero-shot Voice Cloning TTS

Voice-first OpenClaw skill powered by MOSS APIs. Use when a user wants spoken replies in a preferred timbre, either from an existing voice_id or from a refer...

v1.0.0
❤️ 1
⬇️ 169
👁 1
Share

Description


name: echoforge-moss-voice description: Voice-first OpenClaw skill powered by MOSS APIs. Use when a user wants spoken replies in a preferred timbre, either from an existing voice_id or from a reference audio clip.

EchoForge Moss Voice

Use this skill to run voice interaction with user-preferred timbre.

Required runtime config

  • MOSI_API_KEY (required)
  • MOSI_BASE_URL (optional, default https://studio.mosi.cn)

Always send:

  • Authorization: Bearer <MOSI_API_KEY>

Inputs

Collect:

  • text (required, what to speak)
  • Voice source (one of):
    • voice_id (preferred when available), or
    • reference_audio (public URL), or
    • local audio path (upload first, then clone voice)

Optional:

  • expected_duration_sec
  • sampling_params:
    • max_new_tokens (default 512)
    • temperature (default 1.7)
    • top_p (default 0.8)
    • top_k (default 25)
  • meta_info (default false)

Workflow

  1. Resolve voice source.
    • If voice_id is available, use it directly.
    • If only local audio path is available:
      • Upload file: POST /api/v1/files/upload with multipart field file.
      • Clone voice: POST /api/v1/voice/clone with file_id (or url).
      • If returned voice status is not active, poll GET /api/v1/voices/{voice_id} until ACTIVE or timeout.
    • If reference_audio URL is available, use it directly in TTS.
  2. Run TTS: POST /v1/audio/tts.
    • Required payload:
      • model: "moss-tts"
      • text
      • one of voice_id or reference_audio
  3. Parse response:
    • Decode audio_data (base64) to WAV.
    • Read duration_s and usage when present.
  4. Return a concise result:
    • voice_id used
    • output file path
    • duration
    • brief status message

Error handling

  • If 4010 or 4011: API key missing/invalid, ask user to fix MOSI_API_KEY.
  • If 4020: insufficient credits, ask user to recharge.
  • If 4029: rate limited, retry with exponential backoff.
  • If 5002: invalid audio URL or decode failed, ask user for another clip.
  • If 5004: timeout, shorten text and retry.

Operational constraints

  • Keep request rate <= 5 RPM.
  • Keep single request text short enough to avoid timeout.
  • Never print or log raw API keys.
  • Prefer reusing stable voice_id for multi-turn voice chat to reduce latency.

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs