🧪 Skills

Elevenlabs Toolkit

ElevenLabs voice API integration — TTS, sound effects, music generation, speech-to-text, voice isolation, and streaming. Use when building voice-enabled apps...

v1.0.0
❤️ 0
⬇️ 245
👁 1
Share

Description


name: elevenlabs-toolkit description: ElevenLabs voice API integration — TTS, sound effects, music generation, speech-to-text, voice isolation, and streaming. Use when building voice-enabled apps, generating narration, creating audio content, or transcribing speech. Requires ELEVENLABS_API_KEY. version: 1.0.0 metadata: { "openclaw": { "emoji": "\ud83c\udf99\ufe0f", "requires": { "bins": [], "env": [ "ELEVENLABS_API_KEY" ] }, "primaryEnv": "ELEVENLABS_API_KEY", "network": { "outbound": true, "reason": "Calls ElevenLabs API (api.elevenlabs.io) for TTS, SFX, music generation, STT, and voice operations." } } }

ElevenLabs Toolkit

Programmatic access to all 7 ElevenLabs API capabilities via FastAPI endpoints or standalone Python functions.

Capabilities

Tool Endpoint What It Does
Voices GET /api/voices Browse available voices with metadata
TTS POST /api/voice/tts Batch text-to-speech (any voice, any language)
TTS Stream WS /api/voice/stream Real-time WebSocket TTS streaming
Sound Effects POST /api/voice/sfx Generate ambient audio from text prompts
Music POST /api/voice/music Generate background music from descriptions
STT (Scribe) POST /api/voice/stt Transcribe audio with language detection
Voice Isolation POST /api/voice/isolate Extract clean voice from noisy audio

Quick Start

import httpx

BASE = "http://localhost:8000"  # Your FastAPI app
KEY = os.environ["ELEVENLABS_API_KEY"]

# Get voices
voices = httpx.get(f"{BASE}/api/voices").json()

# Generate speech
audio = httpx.post(f"{BASE}/api/voice/tts", json={
    "text": "Hello world",
    "voice_id": voices[0]["voice_id"],
    "model_id": "eleven_multilingual_v2"
}).content  # Returns raw audio bytes

# Generate sound effects
sfx = httpx.post(f"{BASE}/api/voice/sfx", json={
    "prompt": "ocean waves on a quiet beach at night"
}).content

Voice Selection Guide

  • English only: Use eleven_turbo_v2_5 — faster, no accent bleed
  • Multilingual: Use eleven_multilingual_v2 — supports 29 languages
  • Accent warning: Multilingual model can bleed accents across languages. If an English voice sounds Japanese, switch to turbo.

Quota Management

ElevenLabs charges per character for TTS. Key patterns:

  • Cache aggressively — identical text + voice = identical audio
  • Use prompt-cache skill for SHA-256 dedup before calling TTS
  • A 6-scene children's story ≈ 2,000 characters
  • Free tier: 10k chars/month. Starter: 30k. Creator: 100k.

Integration

Copy scripts/elevenlabs_api.py into your FastAPI app and mount the router:

from elevenlabs_api import router
app.include_router(router)

Set ELEVENLABS_API_KEY in your environment. All endpoints handle errors gracefully with proper HTTP status codes.

Files

  • scripts/elevenlabs_api.py — FastAPI router with all 7 endpoints

Security Notes

This skill uses patterns that may trigger automated security scanners:

  • base64: Used for encoding audio/binary data in API responses (standard practice for media APIs)
  • UploadFile: FastAPI's built-in file upload parameter for STT/voice isolation endpoints
  • "system prompt": Refers to configuring agent instructions, not prompt injection

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs