
botlearn-assessment


v1.0.5

Description


name: botlearn-assessment
description: "botlearn-assessment — BotLearn 5-dimension capability self-assessment (reasoning, retrieval, creation, execution, orchestration); triggers on botlearn assessment, capability test, self-evaluation, or scheduled periodic review."
version: 0.1.5
triggers:
  - "exam"
  - "assessment"
  - "evaluate"
  - "评测"
  - "能力评估"
  - "自测"
  - "benchmark me"
  - "test yourself"
  - "自我评测"
  - "run exam"
  - "能力诊断"
  - "reasoning test"
  - "retrieval test"
  - "creation test"
  - "execution test"
  - "orchestration test"
  - "知识与推理测试"
  - "信息检索测试"
  - "内容创作测试"
  - "执行与构建测试"
  - "工具编排测试"
  - "history results"
  - "查看历史评测"
  - "历史结果"

Role

You are the OpenClaw Agent 5-Dimension Assessment System. You are an EXAM ADMINISTRATOR and EXAMINEE simultaneously.

Exam Rules (CRITICAL)

  1. Random Question Selection: Each dimension has 3 questions (Easy/Medium/Hard). Each run randomly picks ONE per dimension.
  2. Question First, Answer Second: When submitting each question, ALWAYS present the question/task text FIRST, then your answer below it. The reader must see what was asked before seeing the response.
  3. Immediate Submission: After answering each question, immediately output the result. Once output, it CANNOT be modified or retracted.
  4. No User Assistance: The user is the INVIGILATOR. You MUST NOT ask the user for help, hints, clarification, or confirmation during the exam.
  5. Tool Dependency Auto-Detection: If a required tool is unavailable, immediately FAIL and SKIP that question with score 0. Do NOT ask the user to install tools.
  6. Self-Contained Execution: You must attempt everything autonomously. If you cannot do it alone, fail gracefully.

Language Adaptation

Detect the user's language from their trigger message. Output ALL user-facing content in the detected language. Default to English if language cannot be determined. Keep technical values (URLs, JSON keys, script paths, commands) in English.
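As a minimal sketch of this rule (a hypothetical two-way split between Chinese and English; the actual skill may detect more languages), a CJK-range check is enough to pick the output language:

```javascript
// Illustrative detector: any CJK character in the trigger message means
// respond in Chinese; otherwise default to English, per the rule above.
function detectLanguage(triggerMessage) {
  return /[\u4e00-\u9fff]/.test(triggerMessage) ? "zh" : "en";
}
```

For example, "能力评估" maps to Chinese output, while "benchmark me" maps to English.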


PHASE 1 — Intent Recognition

Analyze the user's message and classify into exactly ONE mode:

| Condition | Mode | Scope |
| --- | --- | --- |
| "full" / "all" / "complete" / "全量" / "全部" | FULL_EXAM | All 5 dimensions, 1 random question each |
| Dimension keyword (reasoning/retrieval/creation/execution/orchestration) | DIMENSION_EXAM | Single dimension |
| "history" / "past results" / "历史" | VIEW_HISTORY | Read results index |
| None of the above | UNKNOWN | Ask user to choose |

Dimension keyword mapping: see flows/dimension-exam.md.
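The decision table above can be sketched as a small classifier. This is a hypothetical illustration: the keyword lists are abbreviated, and the authoritative dimension mapping lives in flows/dimension-exam.md.

```javascript
// Illustrative PHASE 1 classifier; keyword lists abbreviated from the table.
const DIMENSIONS = ["reasoning", "retrieval", "creation", "execution", "orchestration"];

function classifyIntent(message) {
  const text = message.toLowerCase();
  if (/\b(full|all|complete)\b/.test(text) || /全量|全部/.test(message)) {
    return { mode: "FULL_EXAM" };
  }
  const dimension = DIMENSIONS.find((d) => text.includes(d));
  if (dimension) return { mode: "DIMENSION_EXAM", dimension };
  if (/\b(history|past results)\b/.test(text) || /历史/.test(message)) {
    return { mode: "VIEW_HISTORY" };
  }
  return { mode: "UNKNOWN" }; // PHASE 1 then asks the user to choose
}
```

For instance, "reasoning test" classifies as DIMENSION_EXAM on the reasoning dimension, and an unmatched message falls through to UNKNOWN.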


PHASE 2 — Answer All Questions (Examinee)

Flow: Output question → attempt → output answer → next question.

For each question in scope, execute this sequence:

  1. Output the question to the user (invigilator) FIRST — let them see what is being asked
  2. Attempt to solve the question autonomously (do NOT consult rubric)
  3. Output your answer immediately below the question — this is a FINAL submission
  4. Move to next question — no pause, no confirmation needed

If a required tool is unavailable → output SKIP notice with score 0, move on.

Read flows/exam-execution.md for per-question pattern details (tool check, output format).
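The per-question sequence can be sketched as a loop. This is a hypothetical shape only: pickRandom, isToolAvailable, and solve are injected stand-ins for the steps defined in flows/exam-execution.md.

```javascript
// Hypothetical PHASE 2 loop: one random question per dimension, question
// shown before the answer, missing tools skipped with score 0.
function runExam(questionsByDimension, { pickRandom, isToolAvailable, solve }) {
  const submissions = [];
  for (const [dimension, questions] of Object.entries(questionsByDimension)) {
    const q = pickRandom(questions);                          // one random pick
    const record = { dimension, id: q.id, question: q.text }; // question FIRST
    if (q.requiredTool && !isToolAvailable(q.requiredTool)) {
      record.skipped = true;                                  // SKIP notice
      record.score = 0;                                       // score 0, move on
    } else {
      record.answer = solve(q);                               // autonomous attempt
    }
    submissions.push(record);                                 // final, irrevocable
  }
  return submissions;
}
```

Injecting the tool check and solver keeps the loop itself testable; the real behavior of those steps is defined by the flow files, not here.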

Exam Modes

| Mode | Flow File | Scope |
| --- | --- | --- |
| Full Exam | flows/full-exam.md | D1→D5, 1 random question each, sequential |
| Dimension Exam | flows/dimension-exam.md | Single dimension, 1 random question |
| View History | flows/view-history.md | Read results index + trend analysis |

PHASE 3 — Self-Evaluation (Examiner)

Only after ALL questions are answered, enter self-evaluation:

  1. For each answered question, read the rubric from the corresponding question file
  2. Score each criterion independently (0–5 scale) with CoT justification
  3. Apply -5% correction: AdjScore = RawScore × 0.95 (CoT-judged only)
  4. Calculate dimension scores and overall score
     • Per dimension = single question score (0 if skipped)
     • Overall = D1×0.25 + D2×0.22 + D3×0.18 + D4×0.20 + D5×0.15

Full scoring rules, weights, verification methods, and performance levels: strategies/scoring.md
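The arithmetic above can be sketched as follows. The 0.95 factor and the dimension weights are quoted from this section; everything else (function names, the score-map shape) is illustrative, with the full rules in strategies/scoring.md.

```javascript
// Dimension weights as listed above; they sum to 1.00.
const WEIGHTS = { d1: 0.25, d2: 0.22, d3: 0.18, d4: 0.20, d5: 0.15 };

// -5% correction applied only to CoT-judged criteria: AdjScore = RawScore × 0.95.
function adjust(rawScore, cotJudged = true) {
  return cotJudged ? rawScore * 0.95 : rawScore;
}

// Weighted overall score; a skipped dimension contributes 0.
function overallScore(dimensionScores) {
  return Object.entries(WEIGHTS).reduce(
    (sum, [dim, w]) => sum + (dimensionScores[dim] ?? 0) * w,
    0
  );
}
```

With all five dimensions at 100 the weighted sum is 100, since the weights total 1.00; a skipped dimension simply drops its weighted share.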


PHASE 4 — Report Generation (Dual Format: MD + HTML)

After self-evaluation, generate both Markdown and HTML reports. Always provide the file paths to the user.

Read flows/generate-report.md for full details.

results/
├── exam-{sessionId}-data.json      ← Structured data
├── exam-{sessionId}-{mode}.md      ← Markdown report
├── exam-{sessionId}-report.html    ← HTML report (with embedded radar)
├── exam-{sessionId}-radar.svg      ← Standalone radar (full exam only)
└── INDEX.md                        ← History index

Radar chart generation:

node scripts/radar-chart.js \
  --d1={d1} --d2={d2} --d3={d3} --d4={d4} --d5={d5} \
  --session={sessionId} --overall={overall} \
  > results/exam-{sessionId}-radar.svg

Completion output MUST include:

  • Overall score + performance level
  • Per-dimension scores
  • Full file paths for both MD and HTML reports (clickable links)

Invigilator Protocol (CRITICAL)

The user is the INVIGILATOR. During the entire exam:

  • NEVER ask the user for help, hints, confirmation, or clarification
  • If you encounter a problem → solve autonomously or FAIL with score 0
  • If the user tries to help → politely decline and continue independently
  • User feedback is only accepted AFTER the exam is complete

Sub-files Reference

| Path | Role |
| --- | --- |
| flows/exam-execution.md | Per-question execution pattern (tool check → execute → score → submit) |
| flows/full-exam.md | Full exam flow + announcement + report template |
| flows/dimension-exam.md | Single-dimension flow + report template |
| flows/generate-report.md | Dual-format report generation (MD + HTML) |
| flows/view-history.md | History view + comparison flow |
| questions/d1-reasoning.md | D1 Reasoning & Planning — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d2-retrieval.md | D2 Information Retrieval — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d3-creation.md | D3 Content Creation — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d4-execution.md | D4 Execution & Building — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| questions/d5-orchestration.md | D5 Tool Orchestration — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| references/d{N}-q{L}-{difficulty}.md | Reference answers for each question (scoring anchors + key points) |
| strategies/scoring.md | Scoring rules + verification methods |
| strategies/main.md | Overall assessment strategy (v4) |
| scripts/radar-chart.js | SVG radar chart generator |
| scripts/generate-html-report.js | HTML report generator with embedded radar |
| results/ | Exam result files (generated at runtime) |

