🧪 Skills

Feishu Voice (ElevenLabs)

Send and receive voice messages on Feishu (Lark) using ElevenLabs TTS and STT. Activate when user asks to send a voice message on Feishu, or when receiving a...

v1.1.0
❤️ 0
⬇️ 179
👁 1
Share

Description


name: feishu-voice description: Send and receive voice messages on Feishu (Lark) using ElevenLabs TTS and STT. Activate when user asks to send a voice message on Feishu, or when receiving a Feishu audio message (media attachment with .ogg/.opus file) that needs transcription. Supports smart reply mode — auto voice-reply to voice messages, text-reply to text messages.

Feishu Voice (TTS + STT)

Send voice messages and transcribe received voice messages on Feishu using ElevenLabs.

Prerequisites

  • sag CLI (ElevenLabs TTS): npm i -g sag or go install
  • ffmpeg / ffprobe: brew install ffmpeg
  • ElevenLabs paid plan (required for library voices)
  • Feishu app with im:message:send_as_bot and im:file permissions

Environment Variables

Variable Required Description
ELEVENLABS_API_KEY ElevenLabs API key
FEISHU_APP_ID ✅ (TTS) Feishu app ID
FEISHU_APP_SECRET ✅ (TTS) Feishu app secret
ELEVENLABS_VOICE_ID ✅ (TTS) Voice ID (browse at elevenlabs.io/voice-library)
ELEVENLABS_MODEL_ID ✅ (TTS) Model ID (e.g. eleven_multilingual_v2, eleven_v3)
ELEVENLABS_SPEED Speech speed 0.5-2.0 (default: 1.0)

If FEISHU_APP_ID / FEISHU_APP_SECRET are not in env, extract from openclaw config:

export FEISHU_APP_ID=$(python3 -c "import json; print(json.load(open('$HOME/.openclaw/openclaw.json'))['channels']['feishu']['appId'])")
export FEISHU_APP_SECRET=$(python3 -c "import json; print(json.load(open('$HOME/.openclaw/openclaw.json'))['channels']['feishu']['appSecret'])")

Voice Selection

See config/voice-config.example.json for a curated voice list. Browse all voices at https://elevenlabs.io/voice-library or run sag voices.

Recommended models:

  • eleven_multilingual_v2 — best for Chinese and multilingual content
  • eleven_v3 — latest English-optimized model

Sending Voice Messages (TTS)

scripts/feishu-voice-send.sh <text> <receive_id> [receive_id_type] [speed]
  • receive_id: target user open_id or chat_id
  • receive_id_type: open_id (default) or chat_id
  • speed: speech speed multiplier, 0.5-2.0 (default: 1.0)

Receiving Voice Messages (STT)

When OpenClaw delivers a Feishu voice message, it arrives as a media attachment (.ogg file). Transcribe with:

scripts/feishu-voice-stt.sh /path/to/audio.ogg

Returns recognized text to stdout. Uses ElevenLabs scribe_v1 model with automatic language detection.

Fallback: Download via Feishu API

If the audio file is not delivered as an attachment (only file_key available):

  1. List recent messages: GET /im/v1/messages?container_id_type=chat&container_id=CHAT_ID&page_size=5&sort_type=ByCreateTimeDesc
  2. Download audio: GET /im/v1/messages/{message_id}/resources/{file_key}?type=file
  3. Run STT script on the downloaded file

Smart Reply Mode

When receiving messages, follow this pattern for natural conversation:

  • Voice message received → transcribe with STT → understand → reply with voice (TTS)
  • Text message received → understand → reply with text
  • Override: user can request voice/text reply explicitly

Important Notes

  • Feishu msg_type must be "audio" — not "media" or "file"
  • OpenClaw's message tool asVoice does not work correctly for Feishu — use this script instead
  • STT uses ElevenLabs scribe_v1 model, supports Chinese, English, and 90+ languages
  • For free ElevenLabs accounts, only premade voices work; library voices require a paid plan

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs