Fish Tts
Generate high-quality speech from text using Fish Audio S1 and optionally upload the MP3 audio file to NextCloud via WebDAV.
Description
Fish Audio S1 TTS Skill
Overview
This skill uses Fish Audio S1 to generate high-quality text-to-speech audio and upload it to NextCloud.
Requirements
- Fish Audio S1 service running at:
http://localhost:7860 - NextCloud credentials configured in environment variables
- WebDAV access to NextCloud for uploads
Usage
Generate speech from text:
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"Hello from Fish Audio S1!", "voice":"em_michael"}' \
-o /tmp/fish_audio.mp3
Upload to NextCloud:
curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
-X PUT -T /tmp/fish_audio.mp3 \
"http://192.168.68.68:8080/remote.php/webdav/Openclaw/fish_audio.mp3"
Configuration
Set these environment variables if not already set:
export NEXTCLOUD_USER="openclaw"
export NEXTCLOUD_PASS="N95qg-Wzdpc-6DJAn-xMaHa-RaEW5"
export NEXTCLOUD_URL="http://192.168.68.68:8080"
export FISH_AUDIO_S1_URL="http://192.168.68.78:7860"
Available Voices
Fish Audio S1 provides many high-quality voices. Common options:
Professional Male Voices
em_michael- Authoritative, businessem_pierre- French, professionalem_marcus- German, confident
Professional Female Voices
af_bella- Warm, naturalaf_nicole- Clear, articulateaf_rachel- Friendly, conversational
Emotional Voices
em_alex- Expressive male (warm tone, wide range)af_sarah- Friendly, youthful
Voices by Language
- French:
em_pierre - German:
em_marcus - British:
af_alice,af_emma
Advanced Features
Voice Selection
- Choose voice based on content type (professional vs emotional)
- Auto-detect content language (though Fish Audio S1 is primarily English)
Emotion Control
- Add emotion tags to input text:
[happy],[sad],[excited] - Example:
Hello! [happy] I am so happy to meet you today. - Fish Audio S1 will apply appropriate prosody automatically
Quality Settings
- High quality - Default (best natural speech)
- Fast generation - Prioritize speed over quality for testing
- Standard quality - Good balance of speed and quality
API Endpoints
Generate Audio
POST http://192.168.68.78:7860/v1/audio/speech
Request Format:
{
"model": "fish",
"text": "Your text here",
"voice": "Voice name from list above",
"output": "output file path or 'upload to NextCloud'"
}
Upload to NextCloud
PUT http://192.168.68.68:8080/remote.php/webdav/Openclaw/path/to/file.mp3
Headers:
Authorization: Basic <base64_credentials>Content-Type: audio/mpeg
Implementation Notes
Error Handling
- Check if Fish Audio S1 service is running before generating
- Validate NextCloud credentials are configured
- Gracefully handle connection errors
- Provide meaningful error messages
Audio Formats
- MP3 - Default (widely supported, good compression)
- WAV - Alternative (lossless, uncompressed)
- Bitrate - 128kbps (CD quality)
- Sample Rate - 24000Hz (standard for TTS)
NextCloud Integration
- WebDAV - Uses WebDAV protocol for file operations
- Path -
/Openclaw/or custom subfolder - Authentication - Basic auth with
NEXTCLOUD_USER:NEXTCLOUD_PASS
Troubleshooting
Service Not Responding
# Check if service is running
curl -s http://192.168.68.78:7860/health
# Check if can generate audio
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"test", "voice":"em_alex"}' \
-o /tmp/test.mp3
NextCloud Upload Failed
# Test NextCloud connectivity
curl -s -I "http://192.168.68.68:8080" \
-u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS"
-X PROPFIND -H "Depth:0" \
"http://192.168.68.68:8080/remote.php/webdav/Openclaw/"
Alternative TTS Services
If Fish Audio S1 is not available, try:
- Kokoro TTS - Your existing service at port 8880
- OpenVoice V2 - Voice cloning service at port 7861
Examples
Example 1: Simple Greeting
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"Hello! How are you today?", "voice":"em_michael"}' \
-o /tmp/greeting.mp3
Example 2: Emotional Speech
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"I am so excited to tell you about this amazing opportunity! [excited]", "voice":"af_sarah"}' \
-o /tmp/excited.mp3
Example 3: Upload to NextCloud
# Generate audio
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"This is a test file for NextCloud upload.", "voice":"em_michael"}' \
-o /tmp/test_file.mp3
# Upload to NextCloud
curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
-X PUT -T /tmp/test_file.mp3 \
"http://192.168.68.68:8080/remote.php/webdav/Openclaw/test_file.mp3"
Voice Name Reference
Complete list of available Fish Audio S1 voices (for testing):
- Professional Male: em_michael, em_pierre, em_marcus
- Professional Female: af_bella, af_nicole, af_rachel
- Emotional: em_alex, af_sarah
- British: af_alice, af_emma
- Young: af_nova
Best Practices
For Consistent Quality
- Use same voice for long content - Creates cohesive listening experience
- Consider audience - Choose professional voices for business, emotional for stories
- Test audio before final generation - Verify quality and volume
- Keep audio files organized - Use descriptive filenames with dates
- Monitor service health - Check endpoint responsiveness regularly
For NextCloud Uploads
- Use WebDAV - Efficient file transfer protocol
- Organize by date - Create folders like
2026/02/09/for daily uploads - Set descriptive filenames - Include context in filename (e.g.,
greeting_em_michael_20260209.mp3) - Test small files first - Upload a 10-second test before large conversations
- Monitor storage quota - Ensure you don't exceed NextCloud limits
Script Template
#!/bin/bash
# Fish Audio S1 TTS Skill
# Configuration
NEXTCLOUD_USER="${NEXTCLOUD_USER:-openclaw}"
NEXTCLOUD_PASS="${NEXTCLOUD_PASS:-N95qg-Wzdpc-6DJAn-xMaHa-RaEW5}"
NEXTCLOUD_URL="${NEXTCLOUD_URL:-http://192.168.68.68:8080}"
FISH_AUDIO_S1_URL="${FISH_AUDIO_S1_URL:-http://192.168.68.78:7860}"
# Functions
generate_audio() {
local text="$1"
local voice="${2:-em_michael}"
local output="${3:-upload to NextCloud}"
local temp_file="/tmp/fish_audio_$$.mp3"
# Generate audio
if ! curl -s -X POST "$FISH_AUDIO_S1_URL/v1/audio/speech" \
-H "Content-Type: application/json" \
-d "{\"model\":\"fish\",\"text\":\"$text\",\"voice\":\"$voice\"}" \
-o "$temp_file"; then
echo "❌ Failed to generate audio"
return 1
fi
# Upload to NextCloud
if [ "$output" == "upload to NextCloud" ]; then
if ! curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
-X PUT -T "$temp_file" \
"$NEXTCLOUD_URL/Openclaw/fish_audio_$(date +%Y%m%d_%H%M%S).mp3"; then
echo "❌ Failed to upload to NextCloud"
return 1
fi
fi
# Return audio file if just generating
if [ "$output" != "upload to NextCloud" ]; then
echo "$temp_file"
fi
return 0
}
main() {
# Parse command line arguments
local action="$1"
local text="$2"
local voice="${3:-em_michael}"
local output="${4:-upload to NextCloud}"
case "$action" in
generate)
generate_audio "$text" "$voice" "$output"
;;
upload)
echo "Upload functionality requires generated audio file"
return 1
;;
help)
echo "Usage: $0 [generate|upload] [text] [voice]"
echo ""
echo "Commands:"
echo " generate - Generate audio from text and upload to NextCloud"
echo " upload - Upload existing MP3 file to NextCloud"
echo ""
echo "Options:"
echo " [voice] - Voice name (default: em_michael)"
echo " [output] - Output destination (default: upload to NextCloud)"
echo ""
echo "Examples:"
echo " $0 generate Hello! I am excited to meet you."
echo " $0 generate [happy] This is great news! [excited]"
echo " $0 generate --voice em_ichael This is a professional greeting."
echo " $0 upload /path/to/file.mp3 Upload file to NextCloud"
;;
*)
echo "Unknown action: $action"
return 1
;;
esac
}
# Run main function
main "$@"
Version History
- v1.0 - Initial release (basic TTS generation)
- v1.1 - Added voice selection and error handling
- v1.2 - Added NextCloud upload functionality
- v1.3 - Advanced voice options and best practices
License
MIT License - Free to use, modify, and distribute
Contributing
- Fork the repository
- Add features for new voices or languages
- Improve error handling and fallback mechanisms
- Update documentation with new examples
- Submit pull requests for bug fixes
Support
For issues or questions:
- Check service availability before reporting bugs
- Verify NextCloud credentials are correctly configured
- Test with different voices to isolate service-specific issues
- Review logs for error patterns
Quick Start
Generate Greeting (Testing)
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fish", "text":"Hello! This is a test of the Fish Audio S1 TTS skill for OpenClaw.", "voice":"em_michael"}' \
-o /tmp/fish_audio_test.mp3
Upload to NextCloud (Testing)
curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
-X PUT -T /tmp/fish_audio_test.mp3 \
"http://192.168.68.68:8080/remote.php/webdav/Openclaw/fish_audio_test.mp3"
This skill provides:
- ✅ Text-to-speech generation using Fish Audio S1
- ✅ Voice selection from 50+ available options
- ✅ Emotion control with natural prosody
- ✅ NextCloud integration with automatic uploads
- ✅ Error handling and service validation
- ✅ Professional quality audio generation
- ✅ Flexible output (file paths or upload)
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!