🧪 Skills

Smart Wake

--- name: smart-wake version: 1.0.0 owner: platform-team description: Prevent subagent timeout using checkpoint + cron wake + resume via session spawn mechanism. --- # SKILL: smart-wake ## Objective

v1.0.0
❤️ 1
⬇️ 154
👁 1
Share

Description


name: smart-wake version: 1.0.0 owner: platform-team description: Prevent subagent timeout using checkpoint + cron wake + resume via session spawn mechanism.

SKILL: smart-wake

Objective

Ensure long-running tasks do not lose progress when subagent timeout occurs.

Standard mechanism:

  1. Auto-save state before timeout / reaching timeout threshold.
  2. Cron wake to re-invoke agent at scheduled time.
  3. Resume task from latest checkpoint using sessions_spawn (or session_spawn depending on runtime) with appropriate wakeMode.

Activation Triggers

Use smart-wake when any of the following conditions are met:

  • Task duration exceeds default subagent timeout.
  • Requires periodic polling (by minute/hour) to continue processing.
  • Multi-step workflow that may pause mid-execution but must resume accurately.
  • Running overnight / off-hours but still needs to auto-continue.

Available Tools (Built-in)

  • sessions_spawn / session_spawn (spawn new work session)
  • Gateway cron jobs (schedule wake events)
  • Memory/slot for state persistence (state checkpoint)

Do not create new tools. Only use existing mechanisms.


State Contract (MANDATORY)

Each task must implement checkpointing with minimum schema:

{
  "task_id": "smartwake_<slug>_<timestamp>",
  "goal": "Final objective",
  "status": "running|waiting|blocked|done|failed",
  "current_step": "step_name",
  "progress_pct": 0,
  "last_completed": ["step_a", "step_b"],
  "next_actions": ["action_1", "action_2"],
  "artifacts": ["path/file1", "path/file2"],
  "errors": [],
  "retry_count": 0,
  "updated_at": "ISO_TIMESTAMP"
}

Requirements:

  • Update checkpoint after each completed step.
  • Final checkpoint before timeout.
  • Resume always reads latest checkpoint; never re-run blindly.

Standard Workflow

Step 1 — Preflight (Estimate Timeout)

  • Determine subagent timeout and total task duration.
  • If estimated_duration > 70% timeout: enable smart-wake immediately.
  • Create task_id and initial checkpoint.

Step 2 — Execute by Small Chunks

  • Divide task into 5–15 minute chunks.
  • After each chunk:
    • Write progress to checkpoint,
    • Record artifact/path,
    • Update next_actions.

Step 3 — Pre-timeout Auto-save

  • When ~10–20% timeout remains:
    • Flush final checkpoint,
    • Set status = waiting,
    • Prepare resume payload (task_id + next_actions).

Step 4 — Schedule Cron Wake

  • Register Gateway cron job to wake after appropriate interval (e.g., 2–10 minutes depending on load).
  • Cron payload MUST contain:
    • task_id
    • resume_from=latest_checkpoint
    • reason=timeout_recovery

Step 5 — Spawn Resume Session

  • At wake time, call sessions_spawn (or session_spawn) with wakeMode enabling auto-resume via cron.
  • Pass concise context:
    • objective,
    • latest checkpoint,
    • next step,
    • completion criteria.

Step 6 — Resume + Idempotency Guard

  • New session must:
    1. Read checkpoint,
    2. Verify artifacts exist,
    3. Skip completed steps,
    4. Continue exact next_actions.
  • When entire task completes: set status = done, cancel remaining cron wakes.

Sample Resume Packet (Reference)

{
  "task": "Resume long-running task",
  "task_id": "smartwake_repo_scan_20260301T120000Z",
  "wakeMode": "cron",
  "resume": {
    "from": "latest_checkpoint",
    "current_step": "collect_phase_2",
    "next_actions": ["fetch page 6-10", "dedupe", "export report"]
  },
  "done_criteria": [
    "output file generated",
    "validation passed",
    "status marked done"
  ]
}

Note: Field names may vary by Gateway/OpenClaw version, but checkpoint + wakeMode + resume context principles are mandatory.


Operational Rules

  1. Never resume without valid checkpoint.
  2. Never overwrite artifacts without checksum/timestamp verification.
  3. No infinite wake loops: limit retry_count (e.g., max 5).
  4. Every wake must have clear reason (timeout_recovery, dependency_ready, scheduled_progress).
  5. Clean up on done: completed task must clear related cron jobs.

Output Format After Each Wake Cycle

{
  "task_id": "smartwake_<...>",
  "status": "running|waiting|done|failed",
  "progress_pct": 65,
  "current_step": "...",
  "resumed_from_checkpoint": true,
  "next_wake_scheduled": true,
  "next_wake_at": "ISO_TIMESTAMP|null",
  "notes": "concise, auditable"
}

Anti-patterns (FORBIDDEN)

  • Running long tasks without periodic checkpointing.
  • Waking without passing task_id/resume context.
  • Re-running entire pipeline from start when only 1 step is missing.
  • Not canceling cron after done.

Expected Outcomes

  • No state loss on subagent timeout.
  • Long tasks progress steadily via cron wake.
  • Full system auditability: checkpoint → wake → resume → done.

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs