Project Evaluation for Production Decision

--- name: project-evaluation-for-production-decision description: A skill for evaluating projects to determine if they are ready for production, considering technical, formal, and practical aspects. -

Tech Stack Evaluator

Technology stack evaluation and comparison with TCO analysis, security assessment, and ecosystem health scoring. Use when comparing frameworks, evaluating te...

❤️ 0 ⬇️ 1.2k

Math Evaluate

--- name: math-evaluate description: Evaluate math expressions, compute statistics, and calculate percentages. version: 1.0.0 metadata: openclaw: emoji: "🧮" homepage: https://math.agentut

❤️ 0 ⬇️ 138

Act as a Senior Research Paper Evaluator

Act as a Senior Research Paper Evaluator. You are an experienced academic reviewer with expertise in evaluating scholarly work across multiple disciplines. Your task is to critically assess academic

Evaluate and Suggest Improvements for Computer Science PhD Thesis

Act as a PhD Thesis Evaluator for Computer Science. You are an expert in computer science with significant experience in reviewing doctoral dissertations. Your task is to evaluate the provided PhD th

Dataset Evaluation

Evaluate a submission by scoring content consistency of texts and quality of structured data based on completeness, accuracy, type correctness, and informati...

❤️ 0 ⬇️ 34

LLM Evaluator Pro

LLM-as-a-Judge evaluator via Langfuse. Scores traces on relevance, accuracy, hallucination, and helpfulness using GPT-5-nano as judge. Supports single trace...

❤️ 1 ⬇️ 448

LLM Evaluator

LLM-as-a-Judge evaluation system using Langfuse. Score AI outputs on relevance, accuracy, hallucination, and helpfulness. Backfill scoring on historical trac...

❤️ 0 ⬇️ 108

Evaluate Agent-Native

Evaluate whether a service qualifies as "agent-native" using the five hard criteria from the awesome-agent-native-services standard. Use this when the user a...

cognitive-behavior-evaluator

Evaluate AI agents by injecting diagnostic tests to detect cognitive biases, scoring responses on authority resistance, fact grounding, and neutrality, and g...

❤️ 0 ⬇️ 23

AI Agent Security Evaluation Checklist

Act as an AI Security and Compliance Expert. You specialize in evaluating the security of AI agents, focusing on privacy compliance, workflow security, and knowledge base management. Your task is to

Trigger Evaluator

Evaluate real OpenClaw trigger rules against the current database state. Use for heartbeat-style trigger checks, especially stale mission detection backed by...

❤️ 0 ⬇️ 33

agent-architecture-evaluator

Use when evaluating, testing, and optimizing an agent architecture or multi-agent system. Best for reviewing planning, routing, memory, tool use, reliability...

❤️ 0 ⬇️ 17

Universal Job Fit Evaluation Prompt

# Universal Job Fit Evaluation Prompt – Fully Generic & Shareable # Author: Scott M # Version: 1.6 # Last Modified: 2026-03-06 ## Changelog - **v1.6 (2026-03-06):** Integrated "Read Between the Lin

Skill Evaluator

Evaluate Clawdbot skills for quality, reliability, and publish-readiness using a multi-framework rubric (ISO 25010, OpenSSF, Shneiderman, agent-specific heuristics). Use when asked to review, audit, e

❤️ 3 ⬇️ 2.0k

Stock Evaluator

Comprehensive evaluation of potential stock investments combining valuation analysis, fundamental research, technical assessment, and clear buy/hold/sell recommendations. Use when the user asks about

❤️ 15 ⬇️ 4.2k

Arxiv Gamedevbench Evaluating Agentic Capabili

Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development. Use this skill to scaffold Node.js experiments based on the...

❤️ 0 ⬇️ 418

Vendor Evaluation & Due Diligence

Conducts a comprehensive, weighted assessment of software vendors and partners across financials, technical fit, security, pricing, support, lock-in, and roa...

❤️ 0 ⬇️ 507

Polymarket Risk Evaluator

Assess trade and portfolio risk with scores and drawdown analysis to understand exposure and potential losses.

❤️ 0 ⬇️ 115

📏 Rules

Response Quality Evaluator

You are a model that critiques and reflects on the quality of responses, providing a score and indicating whether the response has fully solved the question or task. # Fields ## reflections The criti

Agent Evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world benc

❤️ 5 ⬇️ 2.4k

Vendor Evaluation & Due Diligence

Conducts a comprehensive, weighted assessment of software vendors and partners across financials, technical fit, security, pricing, support, lock-in, and roa...

❤️ 0 ⬇️ 521

Preventive Health Report Clinical Evaluation Prompt

You are a senior physician with 20+ years of clinical experience in preventive medicine and laboratory interpretation. Analyze the attached health report comprehensively and clinically. Provide outp