Speechall command-line tool for fast speech-to-text transcription using multiple providers
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list avai
Description
name: speechall-cli description: "Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list available STT models and their capabilities, (4) use speaker diarization, subtitles, or other transcription features from the terminal. Triggers on mentions of speechall, audio transcription CLI, or speech-to-text from the command line."
speechall-cli
CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).
Installation
Homebrew (macOS and Linux)
brew install Speechall/tap/speechall
Without Homebrew: Download the binary for your platform from https://github.com/Speechall/speechall-cli/releases and place it on your PATH.
Verify
speechall --version
Authentication
An API key is required. Provide it via environment variable (preferred) or flag:
export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav
The user can create an API key on https://speechall.com/console/api-keys
Commands
transcribe (default)
Transcribe an audio or video file. This is the default subcommand — speechall audio.wav is equivalent to speechall transcribe audio.wav.
speechall <file> [options]
Options:
| Flag | Description | Default |
|---|---|---|
--model <provider.model> |
STT model identifier | openai.gpt-4o-mini-transcribe |
--language <code> |
Language code (e.g. en, tr, de) |
API default (auto-detect) |
--output-format <format> |
Output format (text, json, verbose_json, srt, vtt) |
API default |
--diarization |
Enable speaker diarization | off |
--speakers-expected <n> |
Expected number of speakers (use with --diarization) |
— |
--no-punctuation |
Disable automatic punctuation | — |
--temperature <0.0-1.0> |
Model temperature | — |
--initial-prompt <text> |
Text prompt to guide model style | — |
--custom-vocabulary <term> |
Terms to boost recognition (repeatable) | — |
--ruleset-id <uuid> |
Replacement ruleset UUID | — |
--api-key <key> |
API key (overrides SPEECHALL_API_KEY env var) |
— |
Examples:
# Basic transcription
speechall interview.mp3
# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en
# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt
# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"
# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
models
List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.
speechall models [options]
Filter flags:
| Flag | Description |
|---|---|
--provider <name> |
Filter by provider (e.g. openai, deepgram) |
--language <code> |
Filter by supported language (tr matches tr, tr-TR, tr-CY) |
--diarization |
Only models supporting speaker diarization |
--srt |
Only models supporting SRT output |
--vtt |
Only models supporting VTT output |
--punctuation |
Only models supporting automatic punctuation |
--streamable |
Only models supporting real-time streaming |
--vocabulary |
Only models supporting custom vocabulary |
Examples:
# List all available models
speechall models
# Models from a specific provider
speechall models --provider deepgram
# Models that support Turkish and diarization
speechall models --language tr --diarization
# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'
Tips
- On macOS, video files (
.mp4,.mov, etc.) are automatically converted to audio before upload. - On Linux, pass audio files directly (
.wav,.mp3,.m4a,.flac, etc.). - Output goes to stdout. Redirect to save:
speechall audio.wav > transcript.txt - Errors go to stderr, so piping stdout is safe.
- Run
speechall --help,speechall transcribe --help, orspeechall models --helpto see all valid enum values for model identifiers, language codes, and output formats.
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!