Search

406 results for "vision"

🧪 Skill

MenuVision

Free

Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation

❤️ 0 ⬇️ 285

🧪 Skill

Repo PR Triage

Free

Triage GitHub PRs and issues using vision-based scoring. Use when a user wants to prioritize, score, review, de-duplicate, or batch-process open pull request...

❤️ 0 ⬇️ 400

🧪 Skill

xAI

Free

Chat with Grok models via xAI API. Supports Grok-4, Grok-4.20, Grok-3, Grok-3-mini, vision, and real-time X search.

❤️ 5 ⬇️ 2.4k

🧪 Skill

xAI / Grok

Free

Chat with Grok models via xAI API. Supports Grok-3, Grok-3-mini, vision, and more. Homepage: https://docs.x.ai

❤️ 15 ⬇️ 9.8k

🧪 Skill

Qwen

Free

Build and route Qwen chat, coding, reasoning, and vision workflows across hosted and self-hosted endpoints with safer debugging.

❤️ 0 ⬇️ 53

🧪 Skill

android-agent

Free

Control a real Android phone via USB or network using GPT-4o vision to run tasks like opening apps, typing, tapping, and automation scripts.

❤️ 4 ⬇️ 560

🧪 Skill

Meta Video Ad Analyzer

Free

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative

❤️ 1 ⬇️ 1.4k

🧪 Skill

Baoyu Danger Gemini Web

Free

Generates images and text via reverse-engineered Gemini Web API. Supports text generation, image generation from prompts, reference images for vision input,...

❤️ 0 ⬇️ 186

🧪 Skill

Glasses to Social

Free

Turn smart glasses photos into social media posts. Monitors a Google Drive folder for new images from Meta Ray-Ban glasses (or any smart glasses), analyzes them with vision AI, drafts tweets/posts in

❤️ 1 ⬇️ 1.7k

🧪 Skill

Luna Calorie Tracker

Free

Track daily caloric intake by sending food photos. Luna analyzes images using vision AI, estimates calories and macros, and stores everything in memory for d...

❤️ 0 ⬇️ 129

🧪 Skill

Anthropic

Free

Anthropic Claude API integration — chat completions, streaming, vision, tool use, and batch processing via the Anthropic Messages API. Generate text with Cla...

❤️ 0 ⬇️ 845

🧪 Skill

Perceptron

Free

Image and video analysis powered by Isaac vision models. Capabilities include visual Q&A, object detection, OCR, captioning, counting, and grounded spatial r...

❤️ 2 ⬇️ 99

🧪 Skill

Screen Monitor

Free

Dual-mode screen sharing and analysis. Model-agnostic (Gemini/Claude/Qwen3-VL). Requires a model with vision support.

❤️ 3 ⬇️ 3.6k

🧪 Skill

Minimax Image Understanding

Free

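Understand image content with multimodal large models and generate descriptions of its business meaning. Supports multiple models: (1) MiniMax VLM, (2) OpenAI GPT-4V, (3) Claude Vision. Used for understanding screenshots, charts, document photos, and more...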

❤️ 0 ⬇️ 148

🧪 Skill

Zerox

Free

Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.

❤️ 0 ⬇️ 566

🧪 Skill

Image To Data

Free

Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.

❤️ 0 ⬇️ 1.2k

🧪 Skill

Organise videos

Free

Organize a video folder by cleaning non-video files, removing short/bad videos, and classifying videos into numbered subfolders using AI vision analysis.

❤️ 0 ⬇️ 58

🧪 Skill

Product Strategist

Free

Strategic product leadership toolkit for Head of Product covering OKR cascade generation, quarterly planning, competitive landscape analysis, product vision...

❤️ 1 ⬇️ 1.4k

🧪 Skill

PPT Translator

Free

Translate PowerPoint files to any language while preserving layout. Uses a render-and-verify agent loop (LibreOffice + Vision) to guarantee no text overflow....

❤️ 0 ⬇️ 367

🧪 Skill

Recipe to List

Free

Turn recipes into a Todoist Shopping list. Extract ingredients from recipe photos (Gemini Flash vision) or recipe web pages (search + fetch), then compare against the existing Shopping project with co

❤️ 0 ⬇️ 2.0k

🧪 Skill

Hinge Auto-Liker

Free

Automated Hinge dating profile liker using Android emulator + Gemini vision AI. Scrolls through full profiles, analyzes attractiveness with AI, likes the bes...

❤️ 0 ⬇️ 251

🧪 Skill

Midscene Automations Skills for Computer

Free

Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screensh...

❤️ 2 ⬇️ 1.3k

🧪 Skill

Midscene Automations Skills for Browser

Free

Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible...

❤️ 0 ⬇️ 290

🧪 Skill

4To1 Planner - AI Planning Coach

Free

AI planning coach using the 4To1 Method™ — turn 4-year vision into daily action. Connects to Notion, Todoist, Google Calendar, or local Markdown. Use when user wants to plan goals, do weekly reviews,

❤️ 0 ⬇️ 897