Safely run local `gpu` commands via a guarded wrapper (`runner.sh`) with preflight checks and budget/time caps.
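The guard pattern described above (preflight checks plus budget/time caps before launching) can be sketched in Python; the function names, cost model, and check set here are illustrative, not the actual `runner.sh` logic:

```python
import shutil
import subprocess


def preflight(cmd: list[str], max_seconds: int, budget_remaining: float,
              est_cost_per_hour: float) -> None:
    """Refuse to launch when a basic guard fails (illustrative checks only)."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} not found on PATH")
    est_cost = est_cost_per_hour * max_seconds / 3600
    if est_cost > budget_remaining:
        raise RuntimeError(
            f"estimated cost {est_cost:.2f} exceeds budget {budget_remaining:.2f}")


def run_guarded(cmd: list[str], max_seconds: int, budget_remaining: float,
                est_cost_per_hour: float) -> subprocess.CompletedProcess:
    """Run cmd only after preflight passes, with a hard wall-clock cap."""
    preflight(cmd, max_seconds, budget_remaining, est_cost_per_hour)
    return subprocess.run(cmd, timeout=max_seconds)  # timeout enforces the cap
```

The hard `timeout` means a runaway job is killed rather than billed past its cap.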
OpenClaw plugin (`openclaw-gpu-bridge`, published as `@elvatis_com/openclaw-gpu-bridge`) to offload GPU-intensive ML tasks (BERTScore, embeddings) to one or multiple remote GPU machines.
Install and operate KeepGPU for GPU keep-alive with both blocking CLI and non-blocking service workflows. Use when users ask for keep-gpu command construction...
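The two workflows named above (blocking CLI and non-blocking service) follow a common keep-alive shape: issue a small periodic "touch" so the GPU or session is not reclaimed for idleness. This is a hedged sketch of that pattern, not KeepGPU's actual API; the touch payload is a placeholder where a real tool would run a tiny GPU op:

```python
import threading
import time


class KeepAlive:
    """Periodically invoke `touch` so a GPU/session stays warm.

    `touch` is a placeholder callable; a real keep-alive would issue a
    small GPU operation (e.g. a tiny tensor allocation) instead.
    """

    def __init__(self, interval: float, touch):
        self.interval = interval
        self.touch = touch
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        # Event.wait returns False on timeout, so this touches every interval
        # until stop() sets the event.
        while not self._stop.wait(self.interval):
            self.touch()

    def start(self):
        """Non-blocking 'service' mode: keep-alive runs in the background."""
        self._thread.start()

    def run_blocking(self, duration: float):
        """Blocking 'CLI' mode: hold the terminal and touch until duration elapses."""
        end = time.monotonic() + duration
        while time.monotonic() < end:
            self.touch()
            time.sleep(self.interval)

    def stop(self):
        self._stop.set()
```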
Deploy vLLM model services on GPU servers. Supports multi-server configuration, automatic GPU and port-occupancy checks, and one-click deployment of popular open-source models.
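The port-occupancy check mentioned above matters because a vLLM server fails to start if its serving port is taken. A minimal stdlib sketch of such a check (the function name is illustrative):

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connect, i.e. a live listener.
        return s.connect_ex((host, port)) == 0
```

A deploy script would run this against the intended serving port (vLLM defaults to 8000) and either abort or pick another port before launching.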
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio files locally without sending data to...
Offload GPU-intensive ML tasks (BERTScore, embeddings) to one or multiple remote GPU machines
Monitors GPU cluster health and usage, providing real-time status, performance metrics, and alerts for efficient resource management.
Automatically polls and displays memory usage and online status of RTX 3090 and 4090 AI compute nodes in the local network.
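Pollers like the one above typically shell out to `nvidia-smi` on each node and parse its CSV output. A sketch of the parsing step, assuming the query `nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv,noheader,nounits` (values in MiB); the function name is illustrative:

```python
def parse_gpu_memory(csv_text: str) -> list[dict]:
    """Parse nvidia-smi CSV (name, memory.used, memory.total; nounits) rows."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, used, total = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "used_mib": int(used), "total_mib": int(total)})
    return gpus
```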
Deploy and serve LLM models on GPU. Compare GPU pricing. Launch vLLM on Modal, RunPod, Cerebrium, Cloud Run, Baseten, or Azure with spot instance fallback. O...
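Comparing GPU pricing across providers usually reduces to normalizing each plan to a price per GPU-hour and ranking. A hedged sketch under that assumption; the plan fields (`price_hour`, `gpu_count`) are illustrative, not any provider's API:

```python
def cheapest_plans(plans: list[dict], n: int = 3) -> list[dict]:
    """Rank plans by effective price per GPU-hour (hourly price / GPU count)."""
    return sorted(plans, key=lambda p: p["price_hour"] / p["gpu_count"])[:n]
```

A 4-GPU plan at $6.00/hr ($1.50/GPU-hr) ranks ahead of a single-GPU plan at $2.00/hr under this metric.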
Manage RunPod GPU cloud instances: create, start, stop, and connect to pods via SSH and API. Use when working with RunPod infrastructure, GPU instances, or when SSH access to remote GPU machines is needed. Handles...
Manage and monitor remote GPU servers via SSH with GPU, disk, process status, alerts, log tailing, file sync, and health diagnostics.
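The alerting half of a monitor like this is usually a simple threshold sweep over collected metrics. A minimal sketch (metric names and the dict-based shape are illustrative):

```python
def check_health(metrics: dict, limits: dict) -> list[str]:
    """Return one alert string per metric that exceeds its configured limit."""
    return [
        f"{name}={metrics[name]} exceeds limit {limit}"
        for name, limit in limits.items()
        if metrics.get(name, 0) > limit
    ]
```

An empty return means the node is healthy; a non-empty list is what gets surfaced as alerts or pushed to a notifier.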
Deploy LLM model services (vLLM) on GPU servers. Supports multi-server configuration, automatic GPU and port-occupancy checks, and one-click deployment of popular open-source large language models.
This skill should be used when the user asks to "provision GPU instance", "spin up a cloud server", "list compute plans", "browse GPU pricing", "extend compu...
Windows SAPI5 text-to-speech with Neural voices. Lightweight alternative to GPU-heavy TTS - zero GPU usage, instant generation. Auto-detects best available voice for your language. Works on Windows 10
Save 30% on GPU costs with an architecture-aware AI advisor. Powered by the world's first RTX 5090 Energy Paradox study: 93+ empirical measurements, real-time dolla...
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
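The "~20x realtime" figure above is a realtime factor: seconds of audio processed per second of wall-clock compute. The arithmetic is trivial but worth pinning down, since it is how transcription speed claims are compared:

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute.

    A factor of 20 means a 60 s clip finishes in about 3 s.
    """
    return audio_seconds / wall_seconds
```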
Manage multi-tier AI inference clusters for homelabs. Health monitoring, expert MoE routing, automatic node recovery, and model deployment across Ollama and llama.cpp nodes. Covers GPU memory planning
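GPU memory planning for model deployment commonly starts from a back-of-envelope estimate: weight bytes (parameter count times bytes per parameter) plus a fractional overhead for KV cache and activations. This is a rough sketch with assumed defaults (fp16 weights, 20% overhead), not the skill's actual planner:

```python
def vram_needed_gib(params_b: float, bytes_per_param: float = 2.0,
                    overhead_frac: float = 0.2) -> float:
    """Rough VRAM estimate in GiB for serving a model.

    params_b is the parameter count in billions; bytes_per_param is 2.0 for
    fp16/bf16 weights; overhead_frac covers KV cache and activations.
    """
    return params_b * 1e9 * bytes_per_param * (1 + overhead_frac) / 2**30
```

Under these assumptions a 7B fp16 model needs roughly 15-16 GiB, which is why such models fit on a 24 GiB RTX 3090/4090 but not a 12 GiB card.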
Use this skill when users request to deploy LLMs (Qwen, DeepSeek, etc.) on specified GPU servers and start the model service. This skill can download models...
Generate photorealistic images, videos, talking heads, and natural TTS audio using GPU-accelerated AI models and scripts on a remote server.
Automates setup of GPU-accelerated Bittensor Subnet 85 video upscaling and compression miners with storage, monitoring, and performance optimizations.