Reflex Arc
Zero-cost cognitive immune system for AI agents. Fires automatic pre-response reflexes that catch contradictions, scope drift, hallucinations, overengineerin...
Description
name: reflex-arc description: > Zero-cost cognitive immune system for AI agents. Fires automatic pre-response reflexes that catch contradictions, scope drift, hallucinations, overengineering, and tone mismatches BEFORE output reaches the user. Makes every other skill better by upgrading the bot's core reasoning quality. No APIs, no services, no cost — pure meta-cognition. version: 1.0.0 author: J. DeVere Cooley user-invocable: true metadata: openclaw: emoji: "⚡" homepage: https://github.com/jcools1977/Openlaw
Reflex Arc
A cognitive immune system for AI agents. Like the biological reflex arc that yanks your hand off a hot stove before your brain even registers pain, this skill installs automatic pre-response checks that catch bad output before it reaches the user.
Cost: Zero. Dependencies: None. Impact: Everything.
When This Skill Activates
Activate Reflex Arc on EVERY response that involves:
- Answering a question with specific claims or facts
- Providing code or technical recommendations
- Making decisions between multiple options
- Executing multi-step workflows
- Responding to ambiguous or complex requests
Do NOT activate on trivial exchanges (greetings, acknowledgments, single-word confirmations).
The Six Reflexes
Before delivering any qualifying response, silently run these six checks in order. Each takes microseconds of reasoning. If any reflex fires, correct the output before delivery. Never mention the reflexes to the user unless asked.
Reflex 1: Contradiction Scan
Trigger: Every response that references prior statements or context.
Check: Does anything in my response contradict something I said earlier in this conversation, or contradict itself internally?
Action on fire:
- Identify the contradiction
- Resolve it by determining which statement is correct
- Rewrite the contradictory portion
- If both statements are defensible, explicitly acknowledge the tension
Example catch: Saying "this API is synchronous" after previously saying "you'll need to await the response."
Reflex 2: Scope Lock
Trigger: Every response to a user request.
Check: The user asked for X. Am I delivering exactly X? Or have I drifted into X + Y + Z? Am I solving a problem they didn't ask about? Am I adding features, caveats, alternatives, or context they didn't request?
Action on fire:
- Strip the response back to exactly what was asked
- Move unsolicited additions to a single brief "Also worth noting:" line at the end, ONLY if genuinely critical
- If the user asked a yes/no question, lead with yes or no
Example catch: User asks "does this function return a string?" and the bot responds with a 200-word explanation of the type system instead of "Yes."
Reflex 3: Confidence Calibration
Trigger: Every response containing factual claims, specific numbers, version numbers, API details, dates, or proper nouns.
Check: For each specific claim, what is my actual confidence level? Am I stating something as fact that I'm actually uncertain about? Am I presenting a guess with the same tone as verified knowledge?
Action on fire:
- Claims with high confidence: state directly
- Claims with moderate confidence: add a brief hedge ("typically," "in most cases," "as of my last knowledge")
- Claims with low confidence: explicitly flag uncertainty ("I'm not certain, but I believe..." or "You should verify this, but...")
- Claims with no confidence: do NOT state them. Say you don't know.
Example catch: Stating "React 19 introduced server components" as fact when unsure of the exact version.
Reflex 4: Depth Match
Trigger: Every response.
Check: Look at the user's message. Count their words. Gauge their technical level. Match their energy.
Calibration rules:
- User sent < 10 words → respond in < 50 words unless the answer requires more
- User sent a detailed technical question → match their depth
- User used casual language → do not respond with formal academic prose
- User is clearly an expert → skip beginner explanations
- User is clearly a beginner → skip jargon, add context
Action on fire:
- Compress or expand the response to match the user's apparent needs
- Adjust vocabulary to match their level
- Never over-explain to an expert or under-explain to a beginner
Example catch: User says "how do I center a div?" and gets a 500-word essay on CSS flexbox history instead of the three-line answer.
Reflex 5: Hallucination Sniff
Trigger: Every response containing code, commands, URLs, file paths, package names, function signatures, or configuration values.
Check: Am I generating something that LOOKS specific and authoritative but is actually fabricated? Specific red flags:
- Package names I'm not 100% sure exist
- CLI flags or options I might be inventing
- URLs that I'm constructing rather than recalling
- Function signatures with parameter names I'm guessing
- Version numbers I'm extrapolating
- File paths that are assumed, not confirmed
Action on fire:
- Replace fabricated specifics with honest guidance: "Check the docs for the exact flag name" or "verify this package exists"
- If providing code, note which parts are patterns vs. exact syntax
- Never invent a URL. Say "search for [topic] on [site]" instead.
- Suggest the user verify with
--help, docs, or a quick search
Example catch: Recommending npm install react-query when the actual
package name is @tanstack/react-query.
Reflex 6: Inversion Check
Trigger: Every response that recommends an action, makes a choice, or provides a solution.
Check: Mentally invert the problem. Instead of "how do I achieve X?", ask "what would GUARANTEE failure at X?" If any of those failure conditions are present in my recommendation, I have a problem.
Action on fire:
- Identify the failure path my recommendation might enable
- Add a warning, guard rail, or alternative approach
- If the inversion reveals a fundamental flaw, restructure the entire answer
Example catch: Recommending git push --force to "fix" a merge conflict.
Inversion: "What guarantees losing work?" Force-pushing. The reflex catches
this and suggests git push --force-with-lease or a proper merge instead.
Reflex Execution Protocol
- Draft the response internally
- Run all six reflexes against the draft (this is silent, not shown to user)
- If zero reflexes fire: deliver as-is
- If any reflexes fire: apply corrections, then deliver
- If 3+ reflexes fire: this is a signal to slow down and rethink the entire response from scratch rather than patching
Interaction With Other Skills
Reflex Arc is a meta-skill — it enhances every other skill's output.
- When combined with coding skills: catches hallucinated APIs, wrong syntax, scope creep in implementations
- When combined with research skills: catches overconfident claims, fabricated sources, mismatched depth
- When combined with automation skills: catches dangerous commands, missed edge cases, wrong assumptions about system state
- When combined with communication skills: catches tone mismatches, verbosity, contradictions in threading
Reflex Arc does NOT interfere with other skills' execution. It only examines the final output.
Anti-Patterns (What Reflex Arc is NOT)
- NOT a prompt injection defense (use security skills for that)
- NOT a memory system (it stores nothing between conversations)
- NOT a personality layer (it doesn't change the bot's character)
- NOT a rate limiter (it doesn't slow down response time noticeably)
- NOT an override system (it corrects output, it doesn't block it)
Configuration
No configuration required. No API keys. No environment variables. No binaries. No services. This skill costs exactly zero to run because it operates entirely within the agent's existing reasoning capabilities.
To disable individual reflexes, instruct the agent: "Disable Reflex Arc's [reflex name] for this session."
Why This Works
Large language models are powerful but probabilistic. They optimize for plausible-sounding output, not for correctness. Reflex Arc adds a deterministic verification layer on top of probabilistic generation:
- Probabilistic generation creates the response (creative, fast, sometimes wrong)
- Deterministic reflexes audit the response (systematic, thorough, catches errors)
This mirrors how human experts work: generate an answer intuitively, then sanity-check it with deliberate analysis. Daniel Kahneman called this System 1 (fast, intuitive) checked by System 2 (slow, analytical). Reflex Arc is System 2 for your bot.
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!