Search

406 results for "vision"

NVIDIA Kimi Vision

Analyze images using NVIDIA Kimi K2.5 vision model via NVIDIA NIM API. Perfect for adding vision to non-vision models like MiniMax M2.5, GLM-5, or any model...

❤️ 0 ⬇️ 481

🧪 Skill

Senior Computer Vision

Free

Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Fast...

❤️ 1 ⬇️ 1.4k

🧪 Skill

MoltShell Vision Engine

Free

Give your text-based OpenClaw agent the ability to see and describe images. Standard OpenClaw agents are blind

❤️ 1 ⬇️ 218

🧪 Skill

Computer Vision Expert

Free

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

❤️ 1 ⬇️ 3.2k

🧪 Skill

Vision Sandbox

Free

Agentic Vision via Gemini's native Code Execution sandbox. Use for spatial grounding, visual math, and UI auditing.

❤️ 1 ⬇️ 4.2k

🧪 Skill

Vision Bot

Free

Analyze images via URL or base64. Auto-detects mode: OCR, object counting, or full description.

❤️ 0 ⬇️ 146

🧪 Skill

universal-pdf-vision-parser

Free

Extract multilingual document content and language learning notes (French, German, Japanese, Spanish, etc.) from PDFs using multimodal vision (Qwen-VL-Max)....

❤️ 0 ⬇️ 197

🧪 Skill

Agent Vision Scraper

Free

Dockerized AI-powered web scraper using Playwright with virtual display and vision-based captcha solving, no third-party captcha services needed.

❤️ 0 ⬇️ 210

🧪 Skill

universal-pdf-vision-parser

Free

Extract multilingual document content and language learning notes (French, German, Japanese, Spanish, etc.) from PDFs using multimodal vision (Qwen-VL-Max)....

❤️ 0 ⬇️ 188

🧪 Skill

Vision Tagger

Free

Tag and annotate images using Apple Vision framework (macOS only). Detects faces, bodies, hands, text (OCR), barcodes, objects, scene labels, and saliency re...

❤️ 0 ⬇️ 866

🧪 Skill

Vision

Free

Provides local image analysis, OCR text extraction, object detection descriptions, image comparison, metadata reading, and format conversion.

❤️ 0 ⬇️ 0

🧪 Skill

Peripheral Vision

Free

Monitors adjacent systems, upstream dependencies, and downstream consumers for changes that could affect your current work — before they break it. Like biolo...

❤️ 0 ⬇️ 135

🧪 Skill

MoltShell Vision Engine

Free

Give your text-based OpenClaw agent the ability to see and describe images

❤️ 1 ⬇️ 185

🧪 Skill

Trio Stream Vision

Free

Analyze any YouTube livestream or RTSP camera feed using natural language — ask what's happening, detect specific events, or get periodic summaries. Powered...

❤️ 0 ⬇️ 48

🧪 Skill

Trio Vision

Free

Turn any live camera into a smart camera — describe what to watch for in plain English, get alerts in your chat when it happens. Ask questions about any live...

❤️ 0 ⬇️ 61

🧪 Skill

MiniMax Vision Captcha

Free

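Uses the MiniMax vision model to recognize captchas, slider positions, text content, and more in images. Suited to scenarios that need AI visual analysis, such as WeChat captcha recognition, web page screenshot analysis, and image text extraction.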

❤️ 0 ⬇️ 258

🧪 Skill

uni-vision-engine

Free

Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service. Features native OpenClaw image interception, a...

❤️ 0 ⬇️ 104

🧪 Skill

SiliconFlow Qwen Vision

Free

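Image understanding and analysis. Use this skill when the user needs to analyze image content, identify objects in an image, describe an image's scene, or understand an image's meaning. Supports image Q&A, object recognition, scene description, and more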

❤️ 0 ⬇️ 56

🧪 Skill

Baidu Yijian Vision

Free

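Baidu Yijian professional-grade vision AI agent: supports analysis of images, video, and real-time video streams. Compared with general-purpose foundation models, it cuts inference cost by more than 50% while maintaining 95%+ professional-grade accuracy, and is used for handling visual inspection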

❤️ 2 ⬇️ 35

🧪 Skill

ollama-vision

Free

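Calls the Ollama qwen3-vl:4b model locally to automatically compress and analyze images; supports description, OCR text extraction, and custom information extraction.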

❤️ 0 ⬇️ 94

🧪 Skill

yolo-vision-tools

Free

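YOLO vision task helper skill - provides best practices for installing, using, and configuring YOLO models to help users complete image processing tasks.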

❤️ 0 ⬇️ 82

🧪 Skill

Cpo Advisor

Free

Product leadership for scaling companies. Product vision, portfolio strategy, product-market fit, and product org design. Use when setting product vision, ma...

❤️ 0 ⬇️ 127

🧪 Skill

4To1 Planner - AI Planning Coach

Free

AI planning coach using the 4To1 Method™ — turn 4-year vision into daily action. Connects to Notion, Todoist, Google Calendar, or local Markdown. Use when user wants to plan goals, do weekly revie

❤️ 0 ⬇️ 921

🧪 Skill

ClawTV

Free

AI-powered Apple TV remote that uses vision to autonomously navigate apps, play content, control playback, and manage settings.

❤️ 0 ⬇️ 477