Little Steve Agent Guard
Self-evolving security system for agent skills enforcing risk assessment, audit logging, tiered approvals, and continuous rule updates on all skill commands.
Description
name: little-steve-agent-guard version: 0.1.4 description: Little Steve Agent Guard: a self-evolving security system for agent skills. Wraps all skill commands with risk assessment, audit logging, approval levels, and continuous rule evolution. / 小史安全卫士:面向 Agent Skill 的自进化安全系统。为所有技能命令提供风险评估、审计日志、分级审批和持续规则进化。 homepage: https://github.com/EchoOfZion/little-steve-agent-guard requires: bins: - jq
Little Steve Agent Guard
A self-evolving security system for agent skills. Wraps all skill command execution with risk assessment, audit logging, tiered approval, and continuous rule learning.
Dependencies
- jq (required) — install via
brew install jqorapt install jq
Filesystem Scope
This is a cross-skill security guard. By design, it needs read access to other skills' directories to:
guard-exec.sh: read target scripts for static risk analysis before executioncapability-diff.sh: compare a skill's SKILL.md declarations against its actual scripts
The guard does not write to other skills' directories. All writes (audit logs, rules) stay within its own reports/ and rules/ directories.
Bypass & Emergency Procedures
The runbook (docs/runbook.md) documents emergency bypass procedures (circuit-break, manual script execution, log reset). These are human-operator-only actions for when the guard itself malfunctions. The agent must never execute bypass procedures autonomously.
CRITICAL: Execution Rule
ALL skill script executions MUST go through guard-exec.sh. Never call skill scripts directly. Always use:
bash {baseDir}/scripts/guard-exec.sh exec <script-path> [args...]
Example:
bash {baseDir}/scripts/guard-exec.sh exec {workspaceDir}/skills/<other-skill>/scripts/<script>.sh <command> [args...]
Approval Levels
- L1 (low/medium risk): Auto-execute, audit logged
- L2 (dry-run): Preview without executing
- L3 (high risk): Block and prompt user — output warning, wait for user to reply "确认" or "confirm"
- BLOCK (critical): Reject entirely, no execution possible
When guard-exec.sh returns exit code 10 (prompt), present the warning to the user and wait for confirmation. On "确认"/"confirm", re-run with confirm instead of exec.
Agent Command Conventions
- Execute a skill command (with guard)
bash {baseDir}/scripts/guard-exec.sh exec <script> [args...]
- Confirm a prompted action (after user approval)
bash {baseDir}/scripts/guard-exec.sh confirm <script> [args...]
- Preview without executing
bash {baseDir}/scripts/guard-exec.sh dry-run <script> [args...]
- Quick risk check
bash {baseDir}/scripts/guard-exec.sh check <script> [args...]
- Run capability consistency check on a skill
bash {baseDir}/scripts/capability-diff.sh check --skill-dir <skill-path>
- View audit stats
bash {baseDir}/scripts/audit.sh stats
- Generate weekly security report
bash {baseDir}/scripts/weekly-report.sh generate [days]
- Manage rules
bash {baseDir}/scripts/promote-rule.sh list
bash {baseDir}/scripts/promote-rule.sh add --rule <name> --pattern <regex> --level <low|medium|high|critical>
bash {baseDir}/scripts/promote-rule.sh promote --rule <name>
bash {baseDir}/scripts/promote-rule.sh demote --rule <name>
- Test candidate rules against history
bash {baseDir}/scripts/replay-verify.sh test --rule <name>
bash {baseDir}/scripts/replay-verify.sh test-all
Five Core Security Policies (Immutable)
- Least Privilege — scripts only access their own data directory
- Credential Protection — no secrets in args, output, or logs
- Capability Consistency — runtime must match SKILL.md declarations
- Outbound Control — no undeclared network access
- High-Risk Confirmation — destructive/critical actions need human approval
Risk Classification
| Level | Examples |
|---|---|
| low | read-only: list, view, status check |
| medium | single-item mutation: add, update status |
| high | delete, bulk mutation, file write outside data/ |
| critical | network access, secret exposure, system commands |
Data Files
reports/audit-events.jsonl— audit log (auto-created)reports/failure-dataset.json— failure samples for evolutionrules/active/*.rule— active custom rulesrules/candidates/*.rule— candidate rules pending promotion
小史安全卫士
面向 Agent Skill 的自进化安全系统。为所有技能命令提供风险评估、审计日志、分级审批和持续规则进化。
依赖
- jq(必须)— 通过
brew install jq或apt install jq安装
文件系统范围
这是一个跨技能安全卫士。按设计,它需要读取其他技能目录的权限:
guard-exec.sh:执行前读取目标脚本做静态风险分析capability-diff.sh:对比技能的 SKILL.md 声明与实际脚本行为
卫士不会写入其他技能的目录。所有写入(审计日志、规则)都在自身的 reports/ 和 rules/ 目录内。
绕过与紧急操作
运行手册(docs/runbook.md)记录了紧急绕过操作(熔断、直接执行脚本、日志重置)。这些是仅限人工操作员的紧急措施,用于卫士本身出故障的情况。Agent 绝对不可以自主执行绕过操作。
关键规则:执行约束
所有技能脚本执行必须通过 guard-exec.sh。 不要直接调用技能脚本,始终使用:
bash {baseDir}/scripts/guard-exec.sh exec <脚本路径> [参数...]
审批分级
- L1(低/中风险):自动执行,记录审计日志
- L2(预览):只预览不执行
- L3(高风险):阻断并提示用户——显示警告,等待用户回复"确认"
- 阻断(严重):直接拒绝,无法执行
当 guard-exec.sh 返回退出码 10(提示)时,向用户展示警告并等待确认。用户回复"确认"后,用 confirm 替代 exec 重新执行。
Agent 执行约定
- 执行技能命令(带防护)
bash {baseDir}/scripts/guard-exec.sh exec <脚本> [参数...]
- 确认被提示的操作(用户批准后)
bash {baseDir}/scripts/guard-exec.sh confirm <脚本> [参数...]
- 预览不执行
bash {baseDir}/scripts/guard-exec.sh dry-run <脚本> [参数...]
- 快速风险检查
bash {baseDir}/scripts/guard-exec.sh check <脚本> [参数...]
- 对技能做声明-行为一致性检查
bash {baseDir}/scripts/capability-diff.sh check --skill-dir <技能路径>
- 查看审计统计
bash {baseDir}/scripts/audit.sh stats
- 生成周报
bash {baseDir}/scripts/weekly-report.sh generate [天数]
- 管理规则
bash {baseDir}/scripts/promote-rule.sh list
bash {baseDir}/scripts/promote-rule.sh add --rule <名称> --pattern <正则> --level <low|medium|high|critical>
bash {baseDir}/scripts/promote-rule.sh promote --rule <名称>
bash {baseDir}/scripts/promote-rule.sh demote --rule <名称>
- 测试候选规则
bash {baseDir}/scripts/replay-verify.sh test --rule <名称>
bash {baseDir}/scripts/replay-verify.sh test-all
五条核心安全策略(不可变)
- 最小权限 — 脚本只能访问自身数据目录
- 凭证保护 — 参数、输出、日志中不得出现密钥
- 能力一致性 — 运行时行为必须与 SKILL.md 声明一致
- 外发控制 — 不得有未声明的网络访问
- 高风险确认 — 破坏性/严重操作需人工审批
风险分级
| 级别 | 示例 |
|---|---|
| low | 只读操作:列表、查看、状态检查 |
| medium | 单项变更:新增、更新状态 |
| high | 删除、批量变更、数据目录外写文件 |
| critical | 网络访问、密钥暴露、系统命令 |
数据文件
reports/audit-events.jsonl— 审计日志(自动创建)reports/failure-dataset.json— 失败样本(用于进化)rules/active/*.rule— 活跃自定义规则rules/candidates/*.rule— 候选规则(待晋升)
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!