Upgrade Guardian
Description
name: upgrade-guardian
description: A cognitive protocol for safely managing and auditing OpenClaw application upgrades. Analyzes configuration-level risks (schema, defaults) and runtime-level behavioral shifts (routing, sessions, streaming) using semantic changelog analysis to prevent silent breaking changes.
Cognitive Protocol: The Upgrade Guardian
This skill defines a formal, multi-phase cognitive protocol for an agent to execute when tasked with managing an application upgrade. Its purpose is to go beyond simple static checks and analyze what each change means for the running deployment, so that "silent breaking change" incidents are caught before they happen.
This is not a script. It is a directive for higher-order reasoning.
Core Principle
An application upgrade is a high-stakes event. The agent must not trust that the upgrade is safe. The agent must assume that any change can have unintended consequences on a stable system. The goal is to make implicit environmental assumptions explicit and resilient before they break.
Protocol Activation
This protocol is activated when a human operator declares their intent to upgrade the application (e.g., "We are planning to upgrade OpenClaw from vA to vB").
Analysis Scope
Upgrade Guardian covers two categories of risks:
- Configuration-level risks: Changes that affect openclaw.json or static config files
  - Breaking changes in schema or validation
  - Deprecated config fields
  - New required config options
  - Default value changes
- Runtime-level risks: Changes that affect behavior without config modifications
  - Behavioral shifts in session handling, routing, or delivery
  - Logic changes in compaction, memory, or agents
  - Protocol-level changes (streaming, API compatibility)
  - CLI UX changes (e.g., /new behavior)
See references/RISK_CATEGORIES.md for detailed taxonomy.
Phase 1: Information Gathering & Semantic Analysis
- Ingest Release Notes: Fetch the CHANGELOG or release notes for the target version range.
- Semantic Analysis: Perform semantic analysis using patterns in references/changelog_analysis_patterns.md.
  - Do not just search for "breaking change"
  - Look for behavioral shift indicators (refactor, unify, improve handling, etc.)
  - Identify both config-affecting and runtime-only changes
- Cross-Reference with Environment:
  - For config risks: Load openclaw.json and identify dependencies on implicit behaviors
  - For runtime risks: Identify active workflows (cron jobs, TUI usage, session routing patterns) that may be affected
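The semantic analysis step can be approximated with a first-pass filter before deeper reasoning. The sketch below is only an illustration: the indicator words and watched areas are taken from this protocol's own text, while the function name `flag_changelog_lines` and the matching rule (an indicator word plus a watched area on the same line) are assumptions, not part of OpenClaw or of references/changelog_analysis_patterns.md.

```python
import re

# Indicator words mentioned in this protocol; extend them from
# references/changelog_analysis_patterns.md for real use.
SHIFT_INDICATORS = ("refactor", "unify", "improve", "fix")
# Areas of active use named in this protocol's notes (assumed list).
WATCHED_AREAS = ("session", "routing", "streaming")


def flag_changelog_lines(changelog: str) -> list[dict]:
    """Return lines that say 'breaking' or pair an indicator with a watched area."""
    flagged = []
    for line in changelog.splitlines():
        text = line.strip().lower()
        if not text:
            continue
        # \b keeps "fix" from matching inside words such as "prefix".
        indicators = [w for w in SHIFT_INDICATORS if re.search(rf"\b{w}", text)]
        areas = [a for a in WATCHED_AREAS if a in text]
        if "breaking" in text or (indicators and areas):
            flagged.append(
                {"line": line.strip(), "indicators": indicators, "areas": areas}
            )
    return flagged
```

A keyword filter like this only narrows the reading list. Each flagged line still needs the scenario analysis described in Phase 2, and unflagged lines are not thereby proven safe.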
Phase 2: Risk Assessment & Scenario Planning
2.1 Formulate "What-If" Scenarios
For each identified change, generate concrete, testable failure scenarios:
Config-level examples:
- Scenario A: "What if 'improved session handling' means a new, destructive default for unconfigured session types? → Data loss."
- Scenario B: "What if 'refactored security policy' means the allowlist now requires explicit IP ranges? → Plugin executions fail."
Runtime-level examples:
- Scenario C: "What if 'duplicate reply suppression' changes session routing logic? → Bot stops responding in some groups."
- Scenario D: "What if /new now creates independent sessions instead of resetting the shared session? → User workflow disrupted."
- Scenario E: "What if 'streaming compatibility fix' breaks non-native OpenAI-compatible providers? → Long responses fail mid-stream."
2.2 Quantify Risk
Assign a risk score based on:
- Impact: data loss > service outage > UX friction > cosmetic
- Likelihood: direct config/workflow overlap > tangential relation > theoretical
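One way to turn the two orderings above into a comparable score is to map each level to an ordinal weight and multiply. This is a minimal sketch: the numeric weights, the key names, and the High/Medium/Low thresholds are assumptions chosen for illustration, since the protocol only fixes the ordering.

```python
# Ordinal weights follow the ordering given above; the numbers are assumed.
IMPACT = {"cosmetic": 1, "ux_friction": 2, "service_outage": 3, "data_loss": 4}
LIKELIHOOD = {"theoretical": 1, "tangential": 2, "direct_overlap": 3}


def risk_score(impact: str, likelihood: str) -> int:
    """Multiply the impact weight by the likelihood weight."""
    return IMPACT[impact] * LIKELIHOOD[likelihood]


def risk_priority(score: int) -> str:
    """Bucket a score into High/Medium/Low (thresholds are assumptions)."""
    if score >= 9:
        return "High"
    if score >= 4:
        return "Medium"
    return "Low"
```

With these weights, a theoretical data-loss scenario still lands in Medium rather than Low, which matches the protocol's conservative stance.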
2.3 Generate Audit Report
Present findings to the operator using the template in references/AUDIT_REPORT_TEMPLATE.md.
Key sections:
- Configuration risks (with jq paths and explicit mitigations)
- Runtime risks (with behavioral descriptions and verification tests)
- Risk prioritization (High/Medium/Low)
Phase 3: Mitigation & Verification
3.1 Proactive Mitigation
For config risks: Propose specific openclaw.json changes to make implicit assumptions explicit. Do not execute without operator approval.
For runtime risks: Document expected behavioral changes and suggest workflow adjustments if needed.
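Making implicit assumptions explicit can be sketched as a comparison between the loaded config and the defaults the environment currently relies on. The helper names and the dotted key paths in the usage note are hypothetical and do not describe the real openclaw.json schema. The function only produces proposals and writes nothing, in line with the approval requirement above.

```python
def get_path(config: dict, dotted: str):
    """Return (found, value) for a dotted key path such as 'a.b.c'."""
    node = config
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node


def propose_explicit_settings(config: dict, relied_on_defaults: dict) -> list[dict]:
    """List settings the environment relies on that the config never states."""
    proposals = []
    for path, current_default in relied_on_defaults.items():
        found, _ = get_path(config, path)
        if not found:
            # Proposal only: pin today's default so a new default cannot change it.
            proposals.append({"path": path, "pin_to": current_default})
    return proposals
```

For example, with a config that sets a hypothetical `gateway.auth.mode` but not a hypothetical `session.scope`, only `session.scope` would be proposed for pinning. The operator then decides whether to apply it.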
3.2 Verification Plan
Define clear, simple tests for each risk:
Config verification examples:
- "Run openclaw doctor and confirm no validation errors"
- "Check gateway.err.log for auth mode complaints"
Runtime verification examples:
- "Send a test message in a group chat, verify the bot responds"
- "Open TUI, run /new, confirm it creates an independent session"
- "Trigger a long completion from a streaming provider, verify no mid-stream failure"
3.3 Post-Upgrade Audit
After the operator confirms the upgrade is complete:
- Execute verification plan
- Report results systematically
- Recommend rollback if critical failures are detected
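The reporting step above can be sketched as a small aggregation over check results. The `CheckResult` type, the `critical` flag, and the rule that any critical failure triggers a rollback recommendation are assumptions for illustration. The protocol itself leaves the definition of "critical" to the risk assessment in Phase 2.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str       # short description of the verification test
    passed: bool    # outcome observed after the upgrade
    critical: bool  # assumed flag: failure implies data loss or service outage


def summarize(results: list[CheckResult]) -> dict:
    """Collect failures and recommend rollback if any critical check failed."""
    failed = [r for r in results if not r.passed]
    critical_failed = [r.name for r in failed if r.critical]
    return {
        "total": len(results),
        "failed": [r.name for r in failed],
        "critical_failed": critical_failed,
        "recommend_rollback": bool(critical_failed),
    }
```

The result is a recommendation for the operator, not an action; the rollback decision stays with the human.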
3.4 Archive Upgrade Artifacts (relative to workspace)
Save the upgrade write-ups and check results inside the agent workspace so they remain discoverable and portable.
Write locations (relative paths):
- Pre-upgrade analysis report → kb/logs/upgrade-reports/YYYY-MM-DD_<from>-to-<to>_upgrade-analysis.md
- Post-upgrade verification report → kb/logs/upgrade-verifications/YYYY-MM-DD_post-upgrade-verification.txt
Notes:
- Prefer workspace-relative paths in reports (avoid hard-coded absolute home paths).
- If kb/ is a symlink in a particular deployment, still refer to it as kb/... in the protocol/report; the filesystem mapping is an implementation detail.
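The naming scheme above can be generated rather than typed by hand. This is a minimal sketch assuming the date is the day the report is written and the version strings are used verbatim. The function name `artifact_paths` is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def artifact_paths(from_version: str, to_version: str, on: date) -> dict:
    """Build workspace-relative paths following the naming scheme above."""
    stamp = on.isoformat()  # YYYY-MM-DD
    return {
        "analysis": PurePosixPath("kb/logs/upgrade-reports")
        / f"{stamp}_{from_version}-to-{to_version}_upgrade-analysis.md",
        "verification": PurePosixPath("kb/logs/upgrade-verifications")
        / f"{stamp}_post-upgrade-verification.txt",
    }
```

The paths stay relative, so resolving them against the workspace root (including any kb/ symlink) is left to the caller.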
References
- references/changelog_analysis_patterns.md - Semantic analysis patterns
- references/RISK_CATEGORIES.md - Detailed risk taxonomy
- references/AUDIT_REPORT_TEMPLATE.md - Report structure
- references/VERIFICATION_CHECKLIST.md - Common verification tests
Notes
- This protocol is designed to be conservative. It's better to flag a false positive than miss a silent breaking change.
- Runtime risks are often harder to detect than config risks. Pay extra attention to behavioral keywords like "improve", "fix", "refactor" in areas you actively use (sessions, routing, streaming).
- When in doubt, ask the operator about their workflow patterns before deeming a risk "Low" priority.