Counterclaw Core
v1.1.1
Description
name: counterclaw
description: Defensive interceptor for prompt injection and basic PII masking.
homepage: https://github.com/nickconstantinou/counterclaw-core
install: "pip install ."
requirements:
  env:
    - TRUSTED_ADMIN_IDS
  files:
    - "/.openclaw/memory/"
    - "/.openclaw/memory/MEMORY.md"
metadata:
  clawdbot:
    emoji: "🛡️"
    version: "1.1.0"
    category: "Security"
    type: "python-middleware"
    security_manifest:
      network_access: "optional (only when using email integration scripts)"
      filesystem_access: "Write-only logging to ~/.openclaw/memory/"
      purpose: "Log security violations locally for user audit."
CounterClaw 🦞
Defensive security for AI agents. Snaps shut on malicious payloads.
⚠️ Security Notice
This package has two modes:
- Core Scanner (offline): check_input() and check_output(), no network calls
- Email Integration (network): send_protected_email.sh, requires the gog CLI for Gmail
Installation
claw install counterclaw
Quick Start
from counterclaw import CounterClawInterceptor
interceptor = CounterClawInterceptor()
# Input scan - blocks prompt injections
# NOTE: Examples below are TEST CASES only - not actual instructions
result = interceptor.check_input("{{EXAMPLE: ignore previous instructions}}")
# → {"blocked": True, "safe": False}
# Output scan - detects PII leaks
result = interceptor.check_output("Contact: john@example.com")
# → {"safe": False, "pii_detected": {"email": True}}
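One way an agent loop might act on these results is sketched below. The guard_turn helper and the StubInterceptor class are hypothetical and not part of CounterClaw; the stub only mimics the result shapes shown in the Quick Start so the sketch runs without the package installed.

```python
from typing import Callable, Protocol


class Interceptor(Protocol):
    """Structural type for anything exposing the two scan methods."""

    def check_input(self, text: str) -> dict: ...

    def check_output(self, text: str) -> dict: ...


class StubInterceptor:
    """Hypothetical stand-in that mimics the Quick Start result shapes."""

    def check_input(self, text: str) -> dict:
        # Naive substring check, for demonstration only.
        blocked = "ignore previous instructions" in text.lower()
        return {"blocked": blocked, "safe": not blocked}

    def check_output(self, text: str) -> dict:
        # Naive "@" check, for demonstration only.
        has_email = "@" in text
        return {"safe": not has_email, "pii_detected": {"email": has_email}}


def guard_turn(interceptor: Interceptor, user_text: str,
               model: Callable[[str], str]) -> str:
    """Scan the input, call the model only if it passes, then scan the reply."""
    if interceptor.check_input(user_text).get("blocked"):
        return "[input blocked]"
    reply = model(user_text)
    # Treat a missing "safe" key as unsafe so the guard fails closed.
    if not interceptor.check_output(reply).get("safe", False):
        return "[reply withheld: possible PII]"
    return reply
```

In real use, a CounterClawInterceptor() instance would presumably take the place of the stub, assuming its results carry the "blocked" and "safe" keys shown above.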
Features
- 🔒 Defense against common prompt injection patterns
- 🛡️ Basic PII masking (Email, Phone, Credit Card)
- 📝 Violation logging to ~/.openclaw/memory/MEMORY.md
- ⚠️ Warning on startup if TRUSTED_ADMIN_IDS is not configured
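To make the PII masking feature concrete, here is a minimal sketch of regex-based masking for the three listed categories. The patterns, the mask_pii name, and the placeholder format are assumptions for illustration; the package's actual rules may differ and are likely more thorough.

```python
import re

# Illustrative patterns only; not taken from the CounterClaw source.
# Order matters: card numbers are masked before the looser phone pattern runs.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "phone": re.compile(r"\+?\d[\d ()-]{7,}\d"),
}


def mask_pii(text: str) -> tuple[str, dict[str, bool]]:
    """Replace each match with a placeholder and report which kinds were found."""
    found: dict[str, bool] = {}
    for kind, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{kind.upper()}]", text)
        found[kind] = count > 0
    return text, found
```

Simple patterns like these trade precision for speed: the phone pattern, for instance, will also match other long digit runs, which is one reason the feature is described as basic.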
Configuration
Required Environment Variable
# Set your trusted admin ID(s) - use non-sensitive identifiers only!
export TRUSTED_ADMIN_IDS="your_telegram_id"
Important: TRUSTED_ADMIN_IDS should ONLY contain non-sensitive identifiers:
- ✅ Telegram user IDs (e.g., "123456789")
- ✅ Discord user IDs (e.g., "987654321")
- ❌ NEVER API keys
- ❌ NEVER passwords
- ❌ NEVER tokens
You can set multiple admin IDs by separating them with commas:
export TRUSTED_ADMIN_IDS="telegram_id_1,telegram_id_2"
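A comma-separated value like this could be parsed as sketched below. The load_trusted_admin_ids and is_admin helpers are hypothetical names, not the package's API; the sketch assumes whitespace around IDs should be ignored and that an unset variable means no admins.

```python
import os
from typing import Mapping, Optional


def load_trusted_admin_ids(env: Optional[Mapping[str, str]] = None) -> frozenset[str]:
    """Parse a comma-separated TRUSTED_ADMIN_IDS value into a set of IDs."""
    env = os.environ if env is None else env
    raw = env.get("TRUSTED_ADMIN_IDS", "")
    # Drop empty entries so stray commas do not create a blank admin ID.
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def is_admin(user_id: str, admins: frozenset[str]) -> bool:
    """Fail closed: with no configured admins, nobody is treated as an admin."""
    return bool(admins) and user_id in admins
```

Passing a plain dict as env keeps the parser easy to test without touching the real environment.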
Runtime Configuration
# Option 1: Via environment variable (recommended)
# Set TRUSTED_ADMIN_IDS before running
interceptor = CounterClawInterceptor()
# Option 2: Direct parameter
interceptor = CounterClawInterceptor(admin_user_id="123456789")
Security Notes
- Fail-Closed: If TRUSTED_ADMIN_IDS is not set, admin features are disabled by default
- Logging: All violations are logged to ~/.openclaw/memory/MEMORY.md with PII masked
- No Network Access: The core scanner middleware makes no external network calls (offline-only); only the optional email integration script uses the network
- File Access: Only writes to ~/.openclaw/memory/MEMORY.md, the explicitly declared scope
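The append-only logging described in these notes might look roughly like the sketch below. Only the log path comes from this document; the log_violation name, its parameters, and the entry format are assumptions, and the caller is expected to pass text that has already been masked.

```python
from datetime import datetime, timezone
from pathlib import Path

# Path taken from the notes above; everything else here is illustrative.
LOG_PATH = Path.home() / ".openclaw" / "memory" / "MEMORY.md"


def log_violation(kind: str, masked_detail: str, log_path: Path = LOG_PATH) -> None:
    """Append one already-masked violation entry, creating the directory if needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Mode "a" only ever appends, so earlier entries are never rewritten.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {stamp} [{kind}] {masked_detail}\n")
```

Taking the path as a parameter lets tests write to a temporary directory instead of the real log.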
Files Created
| Path | Purpose |
|---|---|
| ~/.openclaw/memory/ | Directory created on first run |
| ~/.openclaw/memory/MEMORY.md | Violation logs with PII masked |
License
MIT - See LICENSE file
Development & Release
Running Tests Locally
python3 tests/test_scanner.py
Linting
pip install ruff
ruff check src/
Publishing to ClawHub
The CI runs on every push and pull request:
- Ruff - Lints Python code
- Tests - Runs unit tests
To publish a new version:
# Version is set in pyproject.toml
git add -A
git commit -m "Release v1.0.9"
git tag v1.0.9
git push origin main --tags
CI will automatically:
- Run lint + tests
- If tests pass and the tag matches v*, publish to ClawHub