Feishu Voice (ElevenLabs)
Send and receive voice messages on Feishu (Lark) using ElevenLabs TTS and STT. Activate when user asks to send a voice message on Feishu, or when receiving a...
Description
name: feishu-voice description: Send and receive voice messages on Feishu (Lark) using ElevenLabs TTS and STT. Activate when user asks to send a voice message on Feishu, or when receiving a Feishu audio message (media attachment with .ogg/.opus file) that needs transcription. Supports smart reply mode — auto voice-reply to voice messages, text-reply to text messages.
Feishu Voice (TTS + STT)
Send voice messages and transcribe received voice messages on Feishu using ElevenLabs.
Prerequisites
sagCLI (ElevenLabs TTS):npm i -g sagorgo installffmpeg/ffprobe:brew install ffmpeg- ElevenLabs paid plan (required for library voices)
- Feishu app with
im:message:send_as_botandim:filepermissions
Environment Variables
| Variable | Required | Description |
|---|---|---|
ELEVENLABS_API_KEY |
✅ | ElevenLabs API key |
FEISHU_APP_ID |
✅ (TTS) | Feishu app ID |
FEISHU_APP_SECRET |
✅ (TTS) | Feishu app secret |
ELEVENLABS_VOICE_ID |
✅ (TTS) | Voice ID (browse at elevenlabs.io/voice-library) |
ELEVENLABS_MODEL_ID |
✅ (TTS) | Model ID (e.g. eleven_multilingual_v2, eleven_v3) |
ELEVENLABS_SPEED |
❌ | Speech speed 0.5-2.0 (default: 1.0) |
If FEISHU_APP_ID / FEISHU_APP_SECRET are not in env, extract from openclaw config:
export FEISHU_APP_ID=$(python3 -c "import json; print(json.load(open('$HOME/.openclaw/openclaw.json'))['channels']['feishu']['appId'])")
export FEISHU_APP_SECRET=$(python3 -c "import json; print(json.load(open('$HOME/.openclaw/openclaw.json'))['channels']['feishu']['appSecret'])")
Voice Selection
See config/voice-config.example.json for a curated voice list. Browse all voices at https://elevenlabs.io/voice-library or run sag voices.
Recommended models:
eleven_multilingual_v2— best for Chinese and multilingual contenteleven_v3— latest English-optimized model
Sending Voice Messages (TTS)
scripts/feishu-voice-send.sh <text> <receive_id> [receive_id_type] [speed]
receive_id: target useropen_idorchat_idreceive_id_type:open_id(default) orchat_idspeed: speech speed multiplier, 0.5-2.0 (default: 1.0)
Receiving Voice Messages (STT)
When OpenClaw delivers a Feishu voice message, it arrives as a media attachment (.ogg file). Transcribe with:
scripts/feishu-voice-stt.sh /path/to/audio.ogg
Returns recognized text to stdout. Uses ElevenLabs scribe_v1 model with automatic language detection.
Fallback: Download via Feishu API
If the audio file is not delivered as an attachment (only file_key available):
- List recent messages:
GET /im/v1/messages?container_id_type=chat&container_id=CHAT_ID&page_size=5&sort_type=ByCreateTimeDesc - Download audio:
GET /im/v1/messages/{message_id}/resources/{file_key}?type=file - Run STT script on the downloaded file
Smart Reply Mode
When receiving messages, follow this pattern for natural conversation:
- Voice message received → transcribe with STT → understand → reply with voice (TTS)
- Text message received → understand → reply with text
- Override: user can request voice/text reply explicitly
Important Notes
- Feishu
msg_typemust be"audio"— not"media"or"file" - OpenClaw's
messagetoolasVoicedoes not work correctly for Feishu — use this script instead - STT uses ElevenLabs
scribe_v1model, supports Chinese, English, and 90+ languages - For free ElevenLabs accounts, only premade voices work; library voices require a paid plan
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!