Use PoYo AI Seedance 1.5 Pro for higher-end image-to-video generation through the `https://api.poyo.ai/api/generate/submit` endpoint. Use when a user wants l...
Generate and decode QR codes locally. Use when the user wants to create a QR code from text/URL, decode/read QR code content from an image, or asks about QR...
AI image, video, and music generation and editing via the VAP API. Supports Flux, Veo 3.1, and Suno V5.
Capture screenshots on Windows using mss and Pillow. Provides full-screen, region, and multi-monitor capture with output as PIL Image, PNG file, or base64 st...
3D visualization toolkit wrapping Pangolin viewer for real-time display of point clouds, trajectories, cameras, planes, chessboards, and images. Use when vis...
Access AIKEK APIs for crypto/DeFi research and image generation. Authenticate with a Solana wallet, query the knowledge engine for real-time market data and...
Manage SEO and GEO content updates in Webflow by prioritizing with GSC, drafting content, creating patch JSONs, updating CMS via API, optimizing images and S...
Generate hand-drawn style diagrams, flowcharts, and architecture diagrams as PNG images from Excalidraw JSON.
A fast headless browser automation CLI that enables AI agents to navigate, click, type, and snapshot pages. And also 50+ models for image generation, video g...
AI-powered presentation generation using 2slides API. Create slides from text content, match reference image styles, or summarize documents into presentations. Use when users request to "create a pres
Captures learnings, errors, and corrections to enable continuous improvement. And also 50+ models for image generation, video generation, text-to-speech, spe...
Google Workspace CLI for Gmail, Calendar, Drive, Contacts, Sheets, and Docs. And also 50+ models for image generation, video generation, text-to-speech, spee...
Solve CAPTCHAs (reCAPTCHA v2/v3, hCaptcha, Cloudflare Turnstile, image CAPTCHAs) using CapMonster Cloud API. Use when browser automation encounters CAPTCHA c...
Generate 3D models from text or images. Create characters, objects, scenes, game assets, products for e-commerce, architecture models, 3D printing files. Aut...
Send text, image, or file messages to specified users via WeCom applications using configured corporate credentials.
Extract and analyze text, tables, images, and metadata from Korean HWP and HWPX documents, supporting both legacy and modern formats.
Parse PDF, DOC, DOCX, and image files to Markdown or JSON using UniDoc API with sync or async mode and automatic status polling.
End-to-end dropship product lifecycle pipeline. CJ Dropshipping sourcing → margin check → Flux Kontext AI hero image → WooCommerce publish → CJ supplier mapp...
Remove signs of AI-generated writing from text to make it sound more natural and human-written. And also 50+ models for image generation, video generation, t...
Parse PDFs, Word docs, PPTs, and images into clean Markdown using MinerU's VLM engine. Use when: (1) Converting PDF/Word/PPT/image to Markdown, (2) Extractin...
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), H
Interact with GitHub using the gh CLI for issues, PRs, CI runs, and advanced queries. And also 50+ models for image generation, video generation, text-to-spe...
Generate images via ComfyUI API (localhost:8188) using Flux2 workflow. Supports structured JSON prompts sent directly as positive prompt parameter, seed/steps customization. Async watcher via sub-agen
Extract text and structured data from documents using Azure Document Intelligence (formerly Form Recognizer). Supports OCR for PDFs, images, scanned document...