Restart Task Recovery
Description
name: restart-task-recovery
description: Preserve and resume in-progress multi-agent work across OpenClaw config patch/apply restarts. Use when a restart is required during active tasks, when users ask to minimize interruption, or when agent runs/tool calls were interrupted by gateway restart/timeouts.
Use this workflow to maximize the chance of a successful recovery after an OpenClaw restart.
1) Pre-restart checkpoint (required)
Before any gateway.config.patch, gateway.config.apply, gateway.update.run, or gateway.restart:
- List active sessions that may be impacted (sessions_list).
- For each active work session, capture the latest context (sessions_history, limit 20-50).
- Write a compact checkpoint file at:
memory/restart-checkpoints/<YYYY-MM-DD>/<HHmmss>.md
- Include per session:
  - sessionKey / label / agent
  - goal
  - last completed step
  - next exact step
  - blocked dependencies (if any)
  - a ready-to-send resume message (1-2 lines)
Keep the checkpoint concise and executable.
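The checkpoint step above can be sketched in Python as follows. This is a minimal illustration, not the skill's own implementation: the markdown layout is an assumption, the field names are borrowed from the JSON example used with scripts/build_checkpoint.py, and "label" and "resumeMessage" are hypothetical keys added to cover the per-session items listed above.

```python
from datetime import datetime
from pathlib import Path

# Field names follow the JSON example for scripts/build_checkpoint.py;
# "label" and "resumeMessage" are hypothetical additions for illustration.
FIELDS = ("label", "agentId", "goal", "lastDone", "nextStep", "blockers", "resumeMessage")


def checkpoint_path(root: Path, now: datetime) -> Path:
    """Return <root>/memory/restart-checkpoints/<YYYY-MM-DD>/<HHmmss>.md."""
    day = now.strftime("%Y-%m-%d")
    stamp = now.strftime("%H%M%S")
    return root / "memory" / "restart-checkpoints" / day / f"{stamp}.md"


def render_checkpoint(sessions: list[dict]) -> str:
    """Render one short markdown section per session (layout is an assumption)."""
    lines = ["# Restart checkpoint"]
    for s in sessions:
        lines.append("")
        lines.append(f"## {s['sessionKey']}")
        for field in FIELDS:
            lines.append(f"- {field}: {s.get(field, 'n/a')}")
    return "\n".join(lines) + "\n"


def write_checkpoint(root: Path, now: datetime, sessions: list[dict]) -> Path:
    """Write the checkpoint file and return its path."""
    path = checkpoint_path(root, now)
    path.parent.mkdir(parents=True, exist_ok=True)  # the date directory may not exist yet
    path.write_text(render_checkpoint(sessions), encoding="utf-8")
    return path
```

Passing the timestamp in explicitly keeps the path deterministic, which makes the helper easy to test and lets one restart reuse a single timestamp for every file it writes.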
2) Restart with explicit recovery intent
When calling gateway restart/config change, set note to include recovery intent, e.g.:
- “Configuration updated and restarted; interrupted tasks will be resumed from the checkpoint.”
3) Post-restart recovery sweep
After restart:
- Re-list sessions (sessions_list) and compare against the checkpoint.
- For each interrupted/idle target session, send a resume message via sessions_send:
  - “Continue where you left off. Last completed: <last completed step>. Next: <next exact step>. If previous tool call failed, retry from <...>.”
- Do not poll in tight loops. Check on-demand only.
- Summarize recovery status to user:
  - recovered sessions
  - still blocked sessions
  - manual follow-up needed
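As a rough sketch of the sweep's two outputs, the helpers below fill the resume template and group sessions into the three summary buckets. The parameter names, the status strings ("recovered", "blocked", "manual"), and the summary layout are assumptions made for illustration; the retry point is passed in explicitly because it depends on the task.

```python
from collections import defaultdict


def resume_message(last_done: str, next_step: str, retry_from: str) -> str:
    """Fill the resume template with values taken from the checkpoint."""
    return (
        f"Continue where you left off. Last completed: {last_done}. "
        f"Next: {next_step}. If previous tool call failed, retry from {retry_from}."
    )


def summarize(statuses: dict[str, str]) -> str:
    """Group session keys by status and render one line per summary bucket."""
    groups = defaultdict(list)
    for session_key, status in statuses.items():
        groups[status].append(session_key)
    buckets = [
        ("recovered", "recovered sessions"),
        ("blocked", "still blocked sessions"),
        ("manual", "manual follow-up needed"),
    ]
    return "\n".join(
        f"{title}: {', '.join(sorted(groups[name])) or 'none'}"
        for name, title in buckets
    )
```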
4) Idempotent task design rules
When resuming tasks, enforce:
- Re-run-safe steps (idempotency key / upsert / duplicate-safe writes).
- Small step boundaries with explicit “done markers”.
- External writes batched rather than issued one by one in a loop.
- On uncertainty, verify state first, then continue.
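One way to picture "done markers" is a small store that records finished step IDs and skips them on a re-run. The class below is a hypothetical sketch, not part of the skill's scripts; the JSON marker file and the `run_once` name are assumptions.

```python
import json
from pathlib import Path
from typing import Callable


class DoneMarkers:
    """Persist the IDs of completed steps so a resumed task can skip them."""

    def __init__(self, path: Path):
        self.path = path
        # Load existing markers if a previous run left any behind.
        self.done = set(json.loads(path.read_text())) if path.exists() else set()

    def run_once(self, step_id: str, action: Callable[[], None]) -> bool:
        """Run `action` only if `step_id` has no marker; return True if it ran."""
        if step_id in self.done:
            return False
        action()
        # The marker is written only after the action succeeded.
        self.done.add(step_id)
        self.path.write_text(json.dumps(sorted(self.done)))
        return True
```

If the process dies after the action but before the marker is written, the step runs again on resume, so the action itself still needs to be re-run-safe (for example an upsert or a write guarded by an idempotency key). The marker only narrows how much work is repeated.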
5) V2 automation helper
Use script: scripts/build_checkpoint.py to generate checkpoint markdown from structured JSON.
Example:
cat session-snapshot.json | python3 scripts/build_checkpoint.py memory/restart-checkpoints/$(date +%F)/$(date +%H%M%S).md
Expected stdin JSON shape:
{
  "sessions": [
    {
      "sessionKey": "agent:engineer:main",
      "agentId": "engineer",
      "goal": "Finish regression verification",
      "lastDone": "401/idempotency/timezone/retention cases passed",
      "nextStep": "Publish final acceptance summary",
      "blockers": "none"
    }
  ]
}
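Before piping a snapshot into the script, it can help to check that it matches the shape above. The validator below is a hedged sketch: it assumes every key shown in the example is required, which the script itself may not enforce, and `validate_snapshot` is a hypothetical helper name.

```python
import json

# Assumption: every key shown in the example JSON is required.
REQUIRED_KEYS = ("sessionKey", "agentId", "goal", "lastDone", "nextStep", "blockers")


def validate_snapshot(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("sessions"), list):
        return ["top-level object must contain a 'sessions' list"]
    problems = []
    for i, session in enumerate(data["sessions"]):
        if not isinstance(session, dict):
            problems.append(f"sessions[{i}] must be an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in session:
                problems.append(f"sessions[{i}] is missing '{key}'")
    return problems
```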
6) V3 resume-plan automation
Use script: scripts/generate_resume_plan.py to parse the latest checkpoint and produce a structured resume plan.
Example:
python3 scripts/generate_resume_plan.py memory/restart-checkpoints/2026-03-09/162200.md /tmp/resume-plan.json
Then send each items[].resumeMessage to items[].sessionKey via sessions_send.
Rules:
- Send once per session (no loop polling).
- If a session is already active and progressing, skip resend.
- After sends, post one concise recovery summary to user.
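The send rules above can be sketched as a single pass over the plan. In this illustration `send` is a stand-in callable for the sessions_send tool (its real signature is not shown here and is assumed to take a session key and a message), and `active_keys` is a hypothetical set of sessions already known to be progressing.

```python
from typing import Callable, Iterable


def send_resume_plan(
    items: Iterable[dict],
    active_keys: set[str],
    send: Callable[[str, str], None],
) -> dict:
    """Send each resume message at most once and skip sessions already active."""
    sent, skipped = [], []
    seen = set()
    for item in items:
        key = item["sessionKey"]
        if key in seen or key in active_keys:
            skipped.append(key)  # duplicate entry or session already progressing
            continue
        seen.add(key)
        send(key, item["resumeMessage"])
        sent.append(key)
    return {"sent": sent, "skipped": skipped}
```

The returned dictionary carries what is needed for the single recovery summary, so no follow-up polling is required to build it.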
7) V4 one-click recovery payload generator
Use script: scripts/recover_from_latest_checkpoint.py.
It auto-selects the latest checkpoint file and emits a ready JSON payload list for sessions_send calls.
Examples:
# Use latest checkpoint automatically
python3 scripts/recover_from_latest_checkpoint.py > /tmp/recover-actions.json
# Use a specific checkpoint
python3 scripts/recover_from_latest_checkpoint.py memory/restart-checkpoints/2026-03-09/162200.md > /tmp/recover-actions.json
Execution guidance:
- Read /tmp/recover-actions.json
- Execute each actions[] item with sessions_send
- Post one concise summary to user
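How the script picks the latest checkpoint is not documented here. One plausible approach, sketched below under that caveat, relies on the <YYYY-MM-DD>/<HHmmss>.md naming: because both parts are zero-padded, sorting by directory name and then file stem gives chronological order. `latest_checkpoint` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(root: Path) -> Optional[Path]:
    """Return the newest <YYYY-MM-DD>/<HHmmss>.md under `root`, or None if empty."""
    # Zero-padded date and time names sort chronologically as plain strings.
    files = sorted(root.glob("*/*.md"), key=lambda p: (p.parent.name, p.stem))
    return files[-1] if files else None
```

For example, `latest_checkpoint(Path("memory/restart-checkpoints"))` would return the most recent file without reading file modification times, which can change when files are copied or synced.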
8) V5 pre-resume verifier + manual confirmation gate
Use script: scripts/pre_resume_verify.py to score resume actions before sending.
Examples:
python3 scripts/pre_resume_verify.py /tmp/recover-actions.json /tmp/recover-verified.json
Behavior:
- Marks each action as risk=normal|high
- high risk actions are set to decision=hold and requiresManualConfirm=true
- Only send decision=send automatically
- Ask user confirmation before executing held actions
Recommended execution flow:
- Generate actions with V4
- Verify with V5
- Send all decision=send
- Present decision=hold list to user for explicit confirmation
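The gate itself can be illustrated with a few lines of Python. This sketch does not reproduce how the verifier scores risk, which is not described here; it only shows how a risk value could map to the decision and requiresManualConfirm fields. Treating a missing or unknown risk value as a hold is a conservative assumption, and `apply_gate` is a hypothetical name.

```python
def apply_gate(action: dict) -> dict:
    """Map an action's risk to decision/requiresManualConfirm without mutating it."""
    gated = dict(action)  # shallow copy so the caller's action is left untouched
    # Assumption: anything other than an explicit "normal" is held for review.
    safe = gated.get("risk") == "normal"
    gated["decision"] = "send" if safe else "hold"
    gated["requiresManualConfirm"] = not safe
    return gated
```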
9) V6 execution-plan generator (auto-send safe items)
Use script: scripts/execute_verified_recovery.py with V5 output.
Example:
python3 scripts/execute_verified_recovery.py /tmp/recover-verified.json > /tmp/recover-exec.json
Behavior:
- Emits sendActions[] for auto-safe resumes (decision=send)
- Emits holdForManualConfirm[] for risky resumes (decision=hold)
Execution:
- Execute all sendActions[] with sessions_send
- Ask user to confirm holdForManualConfirm[]
- Execute confirmed held items
- Post concise summary
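The split into the two output lists, and the later release of confirmed holds, might look like the sketch below. It assumes the verified input is a plain list of action dictionaries that each carry sessionKey and decision; the actual file layout produced by the verifier may differ, and both function names are hypothetical.

```python
def build_exec_plan(verified: list[dict]) -> dict:
    """Partition verified actions into auto-send and manual-confirm lists."""
    plan = {"sendActions": [], "holdForManualConfirm": []}
    for action in verified:
        # Only an explicit decision=send is auto-sent; everything else is held.
        bucket = "sendActions" if action.get("decision") == "send" else "holdForManualConfirm"
        plan[bucket].append(action)
    return plan


def confirmed_holds(plan: dict, confirmed_keys: set[str]) -> list[dict]:
    """Return the held actions whose session keys the user explicitly confirmed."""
    return [a for a in plan["holdForManualConfirm"] if a["sessionKey"] in confirmed_keys]
```

Held actions that the user does not confirm simply stay in holdForManualConfirm[] and can be reported in the final summary as needing manual follow-up.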
10) Message templates
Read and use: references/templates.md