Search

1446 results for "evaluation"

Senior Prompt Engineer

This skill should be used when the user asks to "optimize prompts", "design prompt templates", "evaluate LLM outputs", "build agentic systems", "implement RA...

❤️ 3 ⬇️ 1.1k

🧪 Skill

skillnet

Free

Search, download, create, evaluate, and analyze reusable agent skills via SkillNet — the open skill supply chain for AI agents. Use when: (1) Before any mult...

❤️ 9 ⬇️ 479

🧪 Skill

B3ehive

Free

Runs three AI agents in parallel to implement, cross-evaluate, score, and select the best code solution for a given coding task objectively.

❤️ 2 ⬇️ 738

🧪 Skill

Risk Management Specialist

Free

Medical device risk management specialist implementing ISO 14971 throughout product lifecycle. Provides risk analysis, risk evaluation, risk control, and pos...

❤️ 2 ⬇️ 2.9k

🧪 Skill

Risk Management Expert (ISO 14971)

Free

Medical device risk management specialist implementing ISO 14971 throughout product lifecycle. Provides risk analysis, risk evaluation, risk control, and pos...

❤️ 0 ⬇️ 0

🧪 Skill

Molt

Free

Browse and advocate for crowdfunding campaigns on MoltFundMe. Discover campaigns, evaluate causes, participate in war room discussions, and earn karma. Use w...

❤️ 0 ⬇️ 476

🧪 Skill

Privacy Solution Vendor Scorecard

Free

Evaluate and compare privacy solution vendors with a weighted scorecard across 12 criteria. Use when selecting privacy management software, comparing data pr...

❤️ 0 ⬇️ 25

🧪 Skill

rag-eval

Free

Evaluate your RAG pipeline quality using Ragas metrics (faithfulness, answer relevancy, context precision).

❤️ 2 ⬇️ 348

🧪 Skill

Tool

Free

A comprehensive AI agent skill for finding, evaluating, and getting the most from the tools that run your work and life. Helps you cut through the noise of a...

❤️ 0 ⬇️ 116

🧪 Skill

Self Improvement

Free

Generic agent self-improvement skill built on OpenClaw-RL research (arxiv.org/abs/2603.10165). Captures evaluative signals (+1/-1) and directive hints from a...

❤️ 1 ⬇️ 4.5k

🧪 Skill

DeFi

Free

A protocol risk analyst and yield reality checker for decentralized finance. Evaluates protocol safety before deposit. Calculates real yield after gas, emiss...

❤️ 0 ⬇️ 112

🧪 Skill

Firm Suppliers Pack

Free

Procurement and supplier management pack. Supplier sourcing, multi-criteria evaluation, TCO analysis, contract management, and supply chain risk monitoring....

❤️ 1 ⬇️ 134

🧪 Skill

RealEstate

Free

Real estate transaction support with affordability analysis, property evaluation, and offer strategy. Use when user mentions buying a home, selling property,...

❤️ 0 ⬇️ 145

🧪 Skill

AgentGuard Tech

Free

Installs AgentGuard to secure your AI agent by wrapping tools with evaluate() to block prompt injections, tool abuse, and malicious commands.

❤️ 0 ⬇️ 124

🧪 Skill

Raon OS

Free

AI-powered startup companion for Korean founders. Evaluate business plans, match government funding programs (TIPS/DeepTech/Global TIPS), connect with 3,972+...

❤️ 0 ⬇️ 458

🧪 Skill

Semantic Shield

Free

AI skill safety validation — real human experts vet skills, plugins, and MCP tools for security risks. Query trust scores, submit evaluation inquiries, and g...

❤️ 1 ⬇️ 157

🧪 Skill

modelshow

Free

Blind multi-model comparison with architecturally guaranteed de-anonymization. Trigger with "mdls" or "modelshow" for double-blind evaluation of AI model res...

❤️ 1 ⬇️ 191

🧪 Skill

best-skill-recommendations

Free

Based on user goals, comprehensively evaluate candidate skill capabilities and conflict risks with installed skills, then deliver the best install recommenda...

❤️ 0 ⬇️ 72

🧪 Skill

Uniswap Cross Chain Arbitrage

Free

Find and execute cross-chain arbitrage opportunities. Scans prices across all chains, evaluates profitability after all costs (gas, bridge fees, slippage), assesses risk, and executes if profitable. U...

❤️ 0 ⬇️ 729

🧪 Skill

Critic Agent

Free

Evaluates agent outputs for correctness, clarity, completeness, and safety, providing numeric scores and detailed feedback for quality control.

❤️ 0 ⬇️ 16

🧪 Skill

Improve Skill Bespoke To CodeBase

Free

Meta-skill: evaluate any Factory Droid skill against the current project codebase and suggest concrete improvements. Use when: a skill feels incomplete, prod...

❤️ 0 ⬇️ 200

🧪 Skill

Smalltalk

Free

Interact with live Smalltalk image (Cuis or Squeak). Use for evaluating Smalltalk code, browsing classes, viewing method source, defining classes/methods, querying hierarchy and categories.

❤️ 0 ⬇️ 2.0k

🧪 Skill

Uniswap Assess Risk

Free

Get an independent risk assessment for any proposed Uniswap operation: swap, LP position, bridge, or token interaction. Evaluates slippage, impermanent loss, liquidity, smart contract, and bridge r...

❤️ 0 ⬇️ 575

🧪 Skill

Uniswap Research And Trade

Free

Research a token and execute a trade if it passes due diligence. Autonomous research-to-trade pipeline: researches the token, evaluates risk, and only trades if the risk assessment approves. Stops and...

❤️ 0 ⬇️ 619