Screen Monitor

Dual-mode screen sharing and analysis. Model-agnostic (Gemini/Claude/Qwen3-VL). Requires a model with vision support.

❤️ 3 ⬇️ 3.6k

PPT Translator

Translate PowerPoint files to any language while preserving layout. Uses a render-and-verify agent loop (LibreOffice + Vision) to guarantee no text overflow....

❤️ 0 ⬇️ 367

Minimax Image Understanding

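Use multimodal large models to understand image content and generate descriptions of its business meaning. Supports multiple models: (1) MiniMax VLM (2) OpenAI GPT-4V (3) Claude Vision. For understanding screenshots, charts, document photos, and similar images...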

❤️ 0 ⬇️ 148

Recipe to List

Turn recipes into a Todoist Shopping list. Extract ingredients from recipe photos (Gemini Flash vision) or recipe web pages (search + fetch), then compare against the existing Shopping project with co

❤️ 0 ⬇️ 2.0k

💬 Prompt

Algorithm Analysis and Improvement Advisor

Act as an Algorithm Analysis and Improvement Advisor. You are an expert in artificial intelligence and computer vision algorithms with extensive experience in evaluating and enhancing complex systems.

Zerox

Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.

❤️ 0 ⬇️ 566

Midscene Automations Skills for Browser

Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible...

❤️ 0 ⬇️ 290

Geepers Llm

Send requests to the dr.eamer.dev LLM API for chat completions, vision analysis, image generation, text-to-speech, and video generation across 12 model provi...

❤️ 0 ⬇️ 355

Clawshier

Scan receipt or invoice photos sent via chat, extract expense data using OpenAI Vision, validate and deduplicate, then log to a Google Spreadsheet. Responds...

❤️ 0 ⬇️ 212

OKR & Strategy Execution Engine

Complete OKR & Strategy Execution system — from company vision to weekly execution. Covers goal hierarchy, OKR writing methodology, scoring rubrics, alignmen...

❤️ 0 ⬇️ 601

Midscene Automations Skills for Android

Vision-driven Android device automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...

❤️ 0 ⬇️ 1.1k

Smart Router

Intelligent multi-model router — automatically selects the best AI model based on task type (vision, image generation, video generation, audio, reasoning, co...

❤️ 0 ⬇️ 204

💬 Prompt

Step 5: Final Review

Perform a comprehensive final review merging all work streams. Review checklist: - Technical feasibility confirmed - Creative vision aligned - All requirements met - Quality standards achieved - Cons

I Love You Mom

Automated monthly photo-to-Mixtiles pipeline. Collects photos from a WhatsApp group, curates the best ones using vision, builds a multi-photo Mixtiles cart l...

❤️ 0 ⬇️ 340

ReftrixMCP

Web design analysis MCP server with 26 tools for layout extraction, motion detection, quality scoring, and semantic search. Uses Playwright, pgvector HNSW, and Ollama Vision to turn web pages into sea

Midscene Automations Skills for iOS

Vision-driven iOS device automation using Midscene CLI. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...

❤️ 0 ⬇️ 836

mcp-hfspace

Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.

AI Frens Ambassador Program

AI Frens Ambassador Program - how to promote the vision of autonomous AI agents with their own economies. Install this skill to become an AI Frens ambassador and learn how to authentically promote the

❤️ 0 ⬇️ 736

Instagram Photo Find

Find high-quality Instagram photos for any destination or place. Searches for Instagram posts via web search, downloads candidate images, vision-scores them...

❤️ 2 ⬇️ 445

screenmonitormcp

Real-time screen analysis, context-aware recording, and UI monitoring MCP server. Supports AI vision, event hooks, and multimodal agent workflows.

Midscene Automations Skills for Browser

Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible...

❤️ 0 ⬇️ 281

Grok-MCP

MCP server for xAI's [Grok API](https://docs.x.ai/docs/overview) with agentic tool calling, image generation, vision, and file support.

ClawCoach Food