Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing.
Detect local hardware (RAM, CPU, GPU/VRAM) and recommend the best-fit local LLM models with optimal quantization, speed estimates, and fit scoring.
Save 30% on GPU cost with an architecture-aware AI advisor. Powered by the world's first RTX 5090 Energy Paradox study. 93+ empirical measurements, real-time dolla...
Local image generation using Apple MLX via mflux — FLUX.2 Klein 4B (fast, Apache 2.0) and Z-Image Turbo (quality) models
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Fast...
Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...
Graph-based reasoning with thought combination and feedback loops. Explores multiple solution paths simultaneously, combines insights, and synthesizes optima...
Generate high-quality technical HTML presentations (Reveal.js) and Markdown technical deep-dive articles from projects or papers. Covers architecture diagram...
Build and route Qwen chat, coding, reasoning, and vision workflows across hosted and self-hosted endpoints with safer debugging.
Generate high-quality music on Apple Silicon Macs using ACE-Step 1.5 with MLX backend, supporting custom prompts, durations, and output formats.
Run, tune, and troubleshoot local Ollama models with reliable API patterns, Modelfiles, embeddings, and hardware-aware deployment workflows.
Write and generate statically typed, tensor-oriented MIND language source files with full autodiff support and Rust-like syntax for ML and scientific computing.
local-stt: Local STT with selectable backends, Parakeet (best accuracy) or Whisper (fastest, multilingual). Requires ffmpeg.
Helps users discover local LLMs by hardware and use case, then sends them to localllm.run for final compatibility checks and model comparison.
AI and Computer Vision Specialist Coach. Educational background: graduating December 2026 with a B.S. in Computer Engineering, minor in Robotics and Mandarin Chinese.
rag-architect: RAG Architect - POWERFUL. The RAG (Retrieval-Augmented Generation) Architect skill provides comprehensive tools an...
Run and integrate LM Studio with local model lifecycle control, OpenAI-compatible APIs, embeddings, and MCP-aware workflows.
On-device speech-to-text (Whisper) + text-to-speech (Qwen3-TTS) CLI. Runs on the Apple Neural Engine (ANE), Apple's low-power, dedicated ML inference chip. M...
HaS (Hide and Seek) on-device text and image anonymization. Text: 8 languages (zh/en/fr/de/es/pt/ja/ko), open-set entity types. Image: 21 privacy categories...
Automatically interprets GitHub repositories to generate structured reports with project stats, core features, architecture highlights, and quick links.
Track and report OpenClaw API usage, model costs, token consumption, and forecast spending with optimization recommendations.
Internalize a document into a small language model (Gemma 2 2B) using Doc-to-LoRA so it can answer questions WITHOUT the document in the prompt. Use when the...