Strategic product leadership toolkit for Head of Product covering OKR cascade generation, quarterly planning, competitive landscape analysis, product vision...
--- name: screen-monitor description: Dual-mode screen sharing and analysis. Model-agnostic (Gemini/Claude/Qwen3-VL). metadata: {"clawdbot":{"emoji":"🖥️","requires":{"model_features":["vision"]}}
Translate PowerPoint files to any language while preserving layout. Uses a render-and-verify agent loop (LibreOffice + Vision) to guarantee no text overflow....
使用多模态大模型理解图片内容,生成业务含义描述。支持多种模型:(1) MiniMax VLM (2) OpenAI GPT-4V (3) Claude Vision。用于理解截图、图表、文档照片等,生
Turn recipes into a Todoist Shopping list. Extract ingredients from recipe photos (Gemini Flash vision) or recipe web pages (search + fetch), then compare against the existing Shopping project with co
Act as an Algorithm Analysis and Improvement Advisor. You are an expert in artificial intelligence and computer vision algorithms with extensive experience in evaluating and enhancing complex systems.
Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.
Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible...
Send requests to the dr.eamer.dev LLM API for chat completions, vision analysis, image generation, text-to-speech, and video generation across 12 model provi...
Scan receipt or invoice photos sent via chat, extract expense data using OpenAI Vision, validate and deduplicate, then log to a Google Spreadsheet. Responds...
Complete OKR & Strategy Execution system — from company vision to weekly execution. Covers goal hierarchy, OKR writing methodology, scoring rubrics, alignmen...
Vision-driven Android device automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...
Intelligent multi-model router — automatically selects the best AI model based on task type (vision, image generation, video generation, audio, reasoning, co...
Perform a comprehensive final review merging all work streams. Review checklist: - Technical feasibility confirmed - Creative vision aligned - All requirements met - Quality standards achieved - Cons
Automated monthly photo-to-Mixtiles pipeline. Collects photos from a WhatsApp group, curates the best ones using vision, builds a multi-photo Mixtiles cart l...
Web design analysis MCP server with 26 tools for layout extraction, motion detection, quality scoring, and semantic search. Uses Playwright, pgvector HNSW, and Ollama Vision to turn web pages into sea
Vision-driven iOS device automation using Midscene CLI. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...
Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.
AI Frens Ambassador Program - how to promote the vision of autonomous AI agents with their own economies. Install this skill to become an AI Frens ambassador and learn how to authentically promote the
Find high-quality Instagram photos for any destination or place. Searches for Instagram posts via web search, downloads candidate images, vision-scores them...
Real-time screen analysis, context-aware recording, and UI monitoring MCP server. Supports AI vision, event hooks, and multimodal agent workflows.
MCP server for xAI's [Grok API](https://docs.x.ai/docs/overview) with agentic tool calling, image generation, vision, and file support.
Food photo analysis and meal logging for ClawCoach. Send a photo of your meal and get instant macro breakdown via Claude Vision.