Manage Weibo posts via Puppeteer with a secure request-approve-execute workflow for drafting, reviewing, and publishing text and images.
Extract sales data from report images using OCR with cnocr, parse JSON via MiniMax API, and convert results to Excel spreadsheets.
Hybrid document intelligence pipeline ingesting PDFs, images, and spreadsheets with OCR, visual and text search, and field fix capture for fast retrieval.
Mask and redact sensitive information (PII) in screenshots and images — phone numbers, emails, IDs, API keys, crypto wallets, credit cards, passwords, and mo...
AI agent self-portrait generator. Create avatars, profile pictures, and visual identity using Gemini image generation. Supports mood-based generation, season...
OCR text recognition using DeepSeek-OCR model. Use when user asks for OCR, text recognition, image text extraction, screenshot recognition, or converting ima...
Convert UI screenshots/images into fully functional HTML/CSS copies. This skill is used when a user provides images of a website, application interface, dash...
Master prompt engineering for AI models: LLMs, image generators, video models. Techniques: chain-of-thought, few-shot, system prompts, negative prompts. Mode...
Provision Postgres databases, deploy static sites, generate images, and build full-stack webapps on Run402 using x402 micropayments. Use when the user asks t...
Generate videos using local AI models (ComfyUI/Stable Video Diffusion) and auto-publish to social media platforms. Supports text-to-video, image-to-video, ba...
AI prompt generator for image and text. AI绘画提示词、AI画图、Midjourney提示词、MJ提示词、Stable Diffusion提示词、SD提示词、DALL-E提示词、AI绘图、AI生图、AI画画、
Automatically creates unique prediction markets on betbud.live by analyzing trending crypto Twitter topics with Claude AI and fetching professional images.
Search the web using Bocha AI Search API (博查AI搜索) - a Chinese search engine optimized for Chinese content. Requires BOCHA_API_KEY. Supports web pages, images, and news with high-quality summar
Use Serpshot Google Search API to perform web searches and image searches. Use when user needs to search Google for information, research topics, or get sear...
Generate videos using Flyworks (a.k.a HiFly) Digital Humans. Create talking photo videos from images, use public avatars with TTS, or clone voices for custom audio.
Parse, extract, and analyze documents using the LlamaParse API (LlamaCloud). Use when the user asks to parse PDFs, images, spreadsheets, or other documents i...
Logo design principles and AI image generation best practices for creating logos. Covers logo types, prompting techniques, scalability rules, and iteration w...
Use for Pixelhub API direct calls when users need image generation/editing, video generation/post-processing, or audio/music generation.
Access ATXP paid API tools for web search, AI image generation, music creation, video generation, and X/Twitter search. Use when users need real-time web sea...
Generate short product videos from images using Runway Gen4 Turbo. Use for TikTok ads, UGC-style product demos, Reels, and YouTube Shorts.
Give your agent eyes on the web — screenshot any URL as an image file. Supports device emulation (iPhone, iPad, Pixel, MacBook), dark mode, full-page scroll,...
Extract Chinese and English text from images and scanned PDFs, including documents like invoices and contracts, using PaddleOCR in Python.
Send build123d CAD commands via HTTP to render images, allowing visual iteration on 3D models entirely within a containerized CAD environment.
Visualize bounding boxes and class labels on images with support for COCO, YOLO, VOC, and LabelMe annotation formats.