Search

1446 results for "evaluation"

🧪 Skill

Deep Research

Free

Conduct exhaustive multi-source investigation with methodology tracking, source evaluation, and iterative depth.

❤️ 5 ⬇️ 2.3k

🧪 Skill

PubMed

Free

Search and evaluate biomedical literature with effective queries, filters, and critical appraisal.

❤️ 3 ⬇️ 1.3k

🧪 Skill

Self-Improving + Proactive Agent

Free

Self-reflection + Self-criticism + Self-learning + Self-organizing memory. Agent evaluates its own work, catches mistakes, and improves permanently. Use when...

❤️ 306 ⬇️ 63k

🧪 Skill

Json Repair Kit

Free

Repair malformed JSON files by normalizing them through Node.js evaluation. Use this to fix trailing commas, single quotes, unquoted keys, or other common sy...

❤️ 0 ⬇️ 466

🧪 Skill

Procurement Manager

Free

Assist with vendor evaluation, purchase order creation, contract negotiation prep, spend analysis, and adherence to procurement policies and approval thresho...

❤️ 0 ⬇️ 515

🧪 Skill

llm-judge-ensemble

Free

Build a cost-efficient LLM evaluation ensemble with sampling, tiebreakers, and deterministic validators. Learned from 600+ production runs judging local Olla...

❤️ 0 ⬇️ 165

🧪 Skill

On-Chain Skill Audit

Free

On-chain skill provenance registry. Check, register, audit, and vouch for agent skills on Solana. Use when evaluating skill safety, registering new skills, or looking up provenance before installation...

❤️ 0 ⬇️ 964

🧪 Skill

Open Sentinel - Agent Reliability Layer

Free

Transparent LLM proxy that monitors and enforces policies on AI agent behavior — evaluates responses against configurable rules for hallucinations, PII leaks...

❤️ 2 ⬇️ 322

🧪 Skill

Ml Pipeline Starter

Free

Build and deploy production ML pipelines with data processing, model training, evaluation, and deployment using TensorFlow, PyTorch, or Scikit-learn.

❤️ 0 ⬇️ 115

🧪 Skill

Sharedintellect Quorum

Free

Multi-agent validation framework — 6 independent AI critics evaluate artifacts against rubrics with evidence-grounded findings.

❤️ 0 ⬇️ 309

🧪 Skill

fundraising from top tier vc

Free

Assist startups in securing venture capital from top-tier VCs by evaluating potential, crafting narratives, identifying and ranking investors, and managing o...

❤️ 0 ⬇️ 36

🧪 Skill

Skill Test

Free

Test skills before using or publishing. Trial, compare, evaluate in isolation without affecting your environment.

❤️ 2 ⬇️ 999

🧪 Skill

AI Researcher

Free

Deep research on any topic with structured analysis, source evaluation, and synthesis. Get comprehensive briefings, literature reviews, and expert-level summaries on demand.

❤️ 5 ⬇️ 1.3k

🧪 Skill

PinchBench

Free

Run PinchBench benchmarks to evaluate OpenClaw agent performance across real-world tasks. Use when testing model capabilities, comparing models, submitting b...

❤️ 0 ⬇️ 400

🧪 Skill

momentspost

Free

Persuasive copy analysis for WeChat Moments. Use when users need to: (1) Evaluate the persuasiveness of WeChat Moments posts, (2) Improve conversion or engagement of social media copy, (3) Get actiona...

❤️ 0 ⬇️ 633

🧪 Skill

Tax Planning Framework

Free

Guide business owners through tax optimization by evaluating entity structure, maximizing deductions, planning compensation, and scheduling key tax deadlines.

❤️ 0 ⬇️ 403

🧪 Skill

Inventory Supply Chain

Free

Manage inventory, forecast demand, evaluate suppliers, optimize reorder points, and improve supply chain for businesses of all sizes.

❤️ 0 ⬇️ 507

🧪 Skill

Watcha Finder

Free

Find, evaluate, and recommend AI products using the watcha.cn platform API. Use this skill whenever the user asks about AI tools, AI products, AI apps, or wa...

❤️ 0 ⬇️ 152

🧪 Skill

Interview Architect

Free

Design and manage structured evidence-based interviews, including scorecards, question banks, rubrics, panel coordination, evaluation, and offer decision sup...

❤️ 0 ⬇️ 853

🧪 Skill

Self Review

Free

Automatically evaluates and approves agent outputs based on clarity, conciseness, actionability, and structure using a rule-based system.

❤️ 0 ⬇️ 366

🧪 Skill

Improve Skill Bespoke To CodeBase

Free

Meta-skill: evaluate any Factory Droid skill against the current project codebase and suggest concrete improvements. Use when: a skill feels incomplete, prod...

❤️ 0 ⬇️ 183

🧪 Skill

Memory Bench Pioneer

Free

Be one of the first to benchmark your agent's memory — and help shape how AI remembers. Runs a peer-review-grade evaluation suite (LLM-as-judge, nDCG/MAP/MRR...

❤️ 0 ⬇️ 338

🧪 Skill

CrewHaus Startup Validation

Free

Quick startup idea evaluation from your terminal. Score ideas on 3 dimensions, run deeper scans with real competitor data and risk assessment. A structured t...

❤️ 0 ⬇️ 52

🧪 Skill

SWOT Analyzer

Free

Conduct detailed SWOT analyses for businesses or products by evaluating strengths, weaknesses, opportunities, threats, and strategic recommendations based on...

❤️ 2 ⬇️ 544