Automates setup of GPU-accelerated Bittensor Subnet 85 video upscaling and compression miners with storage, monitoring, and performance optimizations.
Extract and parse content from web pages, PDFs, documents (docx, pptx), and images using the docling CLI with GPU acceleration. Use INSTEAD of web_fetch when extracting content from specific URLs.
Control remote Windows machines via SSH. Use when executing commands on Windows, checking GPU status (nvidia-smi), running scripts, or managing remote Windows systems. Triggers on "run on Windows".
Provision and manage on-demand GPUs on VAST.ai, including search by GPU and price, renting containers, retrieving SSH, and checking account balance.
Real-time OpenClaw system monitoring with beautiful terminal UI. CPU, memory, disk, GPU, Gateway, cron jobs, model quota, and multi-machine support. Works on...
Monitor F5-TTS distributed training on the 9-GPU mining rig (Local-LLM) without interfering with the process.
Route AI agent compute tasks to the cheapest viable backend. Supports local inference (Ollama), cloud GPU (Vast.ai), and quantum hardware (Wukong 72Q). Use w...
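A cheapest-viable-backend router like the one described can be sketched as follows. The backend names, prices, and VRAM figures here are illustrative assumptions, not the skill's actual configuration:

```python
# Hypothetical cost-based router: pick the cheapest backend that meets
# the task's VRAM requirement. All entries below are illustrative.

BACKENDS = {
    "ollama-local": {"usd_per_hour": 0.00, "vram_gb": 24},
    "vastai-a100":  {"usd_per_hour": 0.80, "vram_gb": 80},
    "wukong-72q":   {"usd_per_hour": 5.00, "vram_gb": 0},  # quantum, no GPU VRAM
}

def route(required_vram_gb: float) -> str:
    """Return the cheapest backend with enough VRAM for the task."""
    viable = {name: b for name, b in BACKENDS.items()
              if b["vram_gb"] >= required_vram_gb}
    if not viable:
        raise ValueError("no viable backend for this task")
    return min(viable, key=lambda name: viable[name]["usd_per_hour"])

print(route(16))   # fits locally -> "ollama-local"
print(route(48))   # needs a rented cloud GPU -> "vastai-a100"
```

The real skill presumably also weighs latency and queue depth; price-plus-capability is the minimal version of the idea.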
Ollama Updater installs or updates Ollama with resumable curl downloads, auto-retry, progress display, old-version cleanup, and GPU driver detection.
Convert CUDA code to MUSA (Moore Threads GPU) using the musify tool. Use when migrating CUDA codebases to MUSA platform, converting CUDA kernels/APIs to MUSA...
Local speech synthesis with Qwen3-TTS: convert text to audio files and optionally send them as Feishu voice messages (voice-bubble format). Supports Apple Silicon (MPS) and CUDA GPUs; synthesizes fully locally with no API key required.
Trades Polymarket prediction markets on AI model releases, tech IPOs, product launches, GPU infrastructure milestones, and AI regulation events. Use when you...
Fully offline, CUDA-accelerated local voice assistant pipeline for NVIDIA Jetson. Wake word (openWakeWord) → real-time VAD → whisper.cpp GPU STT → LLM → Pipe...
Read-only query of real-time RDK X5 hardware status: CPU usage and frequency, BPU compute utilization, memory/disk usage, chip temperature, GPU frequency, and network IP address. Use when the user wants to READ or CHECK current hardware status.
Generate images, faceswap, edit photos, animate expressions, and do style transfer via a self-hosted ComfyUI instance on your LAN. Your GPU, your models.
Avoid common TensorFlow mistakes — tf.function retracing, GPU memory, data pipeline bottlenecks, and gradient traps.
Manage multi-tier AI inference clusters for homelabs. Health monitoring, expert MoE routing, automatic node recovery, and model deployment across Ollama and llama.cpp nodes. Covers GPU memory planning
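The GPU memory planning mentioned above can be approximated with back-of-envelope arithmetic. The formula and the 2 GiB runtime-overhead figure below are rough assumptions for illustration, not the skill's actual planner:

```python
# Back-of-envelope VRAM planning for serving a quantized LLM.
# Overhead for KV cache and runtime buffers is an assumed constant.

def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1024**3

def fits(params_billion: float, bits: int, vram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus assumed runtime overhead fit the card."""
    return weights_gb(params_billion, bits) + overhead_gb <= vram_gb

# A 7B model at 4-bit needs roughly 3.3 GiB of weights:
print(round(weights_gb(7, 4), 1))
print(fits(7, 4, vram_gb=8))     # fits an 8 GiB card with headroom
print(fits(70, 4, vram_gb=24))   # a 70B model does not fit 24 GiB
```

Real planners also account for context length (KV cache grows with it), so treat this as a lower bound when assigning models to nodes.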
Analyze model training or inference resource behavior from profiler artifacts, with focus on GPU memory (VRAM) and CPU hotspots. Uses JSON/JSON.GZ artifacts...
Retrieve real-time hardware metrics from Apple Silicon Macs using mactop's TOON format. Provides CPU, RAM, GPU, power, thermal, network, disk I/O, and Thunderbolt bus information.
Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, or when you need to train <1% of parameters.
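The "<1% of parameters" figure follows directly from LoRA's construction: a frozen d_out x d_in weight matrix gains only two small adapter matrices, A (r x d_in) and B (d_out x r). The layer shape below is illustrative, not tied to a specific model:

```python
# Why LoRA trains <1% of parameters: rank-r adapters on a d_out x d_in
# weight matrix add only r * (d_in + d_out) trainable parameters.

def lora_fraction(d_in: int, d_out: int, r: int) -> float:
    full = d_in * d_out            # frozen base weights
    adapter = r * (d_in + d_out)   # trainable A (r x d_in) + B (d_out x r)
    return adapter / full

frac = lora_fraction(4096, 4096, r=8)
print(f"{frac:.2%}")  # ~0.39% of the layer's parameters are trained
```

At rank 8 on a 4096-wide layer this is about 0.39%, which is why even 70B models become trainable on a single GPU when combined with 4-bit quantization (QLoRA).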