🧪 Skills
Code Review Engine
Enterprise-grade code review agent. Reviews PRs, diffs, or code files for security vulnerabilities, performance issues, error handling gaps, architecture smells, and test coverage. Works with any lang
v1.0.0
Description
name: afrexai-code-reviewer description: Enterprise-grade code review agent. Reviews PRs, diffs, or code files for security vulnerabilities, performance issues, error handling gaps, architecture smells, and test coverage. Works with any language, any repo, no dependencies required. auto_trigger: false
Code Review Engine
Enterprise-grade automated code review. Works on GitHub PRs, local diffs, pasted code, or entire files. No dependencies — pure agent intelligence.
Quick Start
Review a GitHub PR
Review PR #42 in owner/repo
Review a local diff
Review the staged changes in this repo
Review a file
Review src/auth/login.ts for security issues
Review pasted code
Just paste code and say "review this"
Review Framework: SPEAR
Every review follows the SPEAR framework — 5 dimensions, each scored 1-10:
🔴 S — Security (Weight: 3x)
| Check | Severity | Example |
|---|---|---|
| Hardcoded secrets | CRITICAL | API keys, passwords, tokens in source |
| SQL injection | CRITICAL | String concatenation in queries |
| XSS vectors | HIGH | Unsanitized user input in HTML/DOM |
| Path traversal | HIGH | User input in file paths without validation |
| Insecure deserialization | HIGH | eval(), pickle.loads(), JSON.parse on untrusted input |
| Auth bypass | CRITICAL | Missing auth checks on endpoints |
| SSRF | HIGH | User-controlled URLs in server requests |
| Timing attacks | MEDIUM | Non-constant-time string comparison for secrets |
| Dependency vulnerabilities | MEDIUM | Known CVEs in imported packages |
| Sensitive data logging | MEDIUM | PII, tokens, passwords in log output |
| Insecure randomness | MEDIUM | Math.random() for security-sensitive values |
| Missing rate limiting | MEDIUM | Auth endpoints without throttling |
🟡 P — Performance (Weight: 2x)
| Check | Severity | Example |
|---|---|---|
| N+1 queries | HIGH | DB call inside a loop |
| Unbounded queries | HIGH | SELECT * without LIMIT on user-facing endpoints |
| Missing indexes (implied) | MEDIUM | Frequent WHERE/ORDER on unindexed columns |
| Memory leaks | HIGH | Event listeners never removed, growing caches |
| Blocking main thread | HIGH | Sync I/O in async context, CPU-heavy in event loop |
| Unnecessary re-renders | MEDIUM | React: missing memo, unstable refs in deps |
| Large bundle imports | MEDIUM | import _ from 'lodash' vs import get from 'lodash/get' |
| Missing pagination | MEDIUM | Returning all records to client |
| Redundant computation | LOW | Same expensive calc repeated without caching |
| Connection pool exhaustion | HIGH | Not releasing DB/HTTP connections |
🟠 E — Error Handling (Weight: 2x)
| Check | Severity | Example |
|---|---|---|
| Swallowed errors | HIGH | Empty catch blocks, Go _ := on error |
| Missing error boundaries | MEDIUM | React components without error boundaries |
| Unchecked null/undefined | HIGH | No null checks before property access |
| Missing finally/cleanup | MEDIUM | Resources opened but not guaranteed closed |
| Generic error messages | LOW | catch(e) { throw new Error("something went wrong") } |
| Missing retry logic | MEDIUM | Network calls without retry on transient failures |
| Panic/exit in library code | HIGH | panic(), os.Exit(), process.exit() in non-main |
| Unhandled promise rejections | HIGH | Async calls without .catch() or try/catch |
| Error type conflation | MEDIUM | All errors treated the same (4xx vs 5xx, retriable vs fatal) |
🔵 A — Architecture (Weight: 1.5x)
| Check | Severity | Example |
|---|---|---|
| God functions (>50 lines) | MEDIUM | Single function doing too many things |
| God files (>300 lines) | MEDIUM | Monolithic module |
| Tight coupling | MEDIUM | Direct DB calls in request handlers |
| Missing abstraction | LOW | Repeated patterns that should be extracted |
| Circular dependencies | HIGH | A imports B imports A |
| Wrong layer | MEDIUM | Business logic in controllers, SQL in UI |
| Magic numbers/strings | LOW | Hardcoded values without named constants |
| Missing types | MEDIUM | any in TypeScript, missing type hints in Python |
| Dead code | LOW | Unreachable branches, unused imports/variables |
| Inconsistent patterns | LOW | Different error handling styles in same codebase |
📊 R — Reliability (Weight: 1.5x)
| Check | Severity | Example |
|---|---|---|
| Missing tests for changes | HIGH | New logic without corresponding test |
| Test quality | MEDIUM | Tests that only check happy path |
| Missing edge cases | MEDIUM | No handling for empty arrays, null, boundary values |
| Race conditions | HIGH | Shared mutable state without synchronization |
| Non-idempotent operations | MEDIUM | Retrying could cause duplicates |
| Missing validation | HIGH | User input accepted without schema validation |
| Brittle tests | LOW | Tests depending on execution order or timing |
| Missing logging | MEDIUM | Error paths with no observability |
| Configuration drift | MEDIUM | Hardcoded env-specific values |
| Missing migrations | HIGH | Schema changes without migration files |
Scoring System
Per-Finding Severity
CRITICAL → -3 points from dimension score
HIGH → -2 points
MEDIUM → -1 point
LOW → -0.5 points
INFO → 0 (suggestion only)
Overall SPEAR Score Calculation
Raw Score = (S×3 + P×2 + E×2 + A×1.5 + R×1.5) / 10
Final Score = Raw Score × 10 (scale 0-100)
Verdict Thresholds
| Score | Verdict | Action |
|---|---|---|
| 90-100 | ✅ EXCELLENT | Ship it |
| 75-89 | 🟢 GOOD | Minor suggestions, approve |
| 60-74 | 🟡 NEEDS WORK | Address findings before merge |
| 40-59 | 🟠 SIGNIFICANT ISSUES | Major rework needed |
| 0-39 | 🔴 BLOCK | Critical issues, do not merge |
Review Output Template
Use this structure for every review:
# Code Review: [PR title or file name]
## Summary
[1-2 sentence overview of what this code does and overall quality]
## SPEAR Score: [X]/100 — [VERDICT]
| Dimension | Score | Key Finding |
|-----------|-------|-------------|
| 🔴 Security | X/10 | [worst finding or "Clean"] |
| 🟡 Performance | X/10 | [worst finding or "Clean"] |
| 🟠 Error Handling | X/10 | [worst finding or "Clean"] |
| 🔵 Architecture | X/10 | [worst finding or "Clean"] |
| 📊 Reliability | X/10 | [worst finding or "Clean"] |
## Findings
### [CRITICAL/HIGH] 🔴 [Title]
**File:** `path/to/file.ts:42`
**Category:** Security
**Issue:** [What's wrong]
**Impact:** [What could happen]
**Fix:**
```[lang]
// suggested fix
[MEDIUM] 🟡 [Title]
...
What's Done Well
- [Genuinely good patterns worth calling out]
Recommendations
- [Prioritized action items]
---
## Language-Specific Patterns
### TypeScript / JavaScript
- `any` type usage → Architecture finding
- `as` type assertions → potential runtime error
- `console.log` in production code → Style
- `==` instead of `===` → Reliability
- Missing `async/await` error handling
- `useEffect` missing cleanup return
- Index signatures without validation
### Python
- Bare `except:` or `except Exception:` → Error Handling
- `eval()` / `exec()` → Security CRITICAL
- Mutable default arguments → Reliability
- `import *` → Architecture
- Missing `__init__.py` type hints
- f-strings with user input → potential injection
### Go
- `_ :=` discarding errors → Error Handling HIGH
- `panic()` in library code → Reliability HIGH
- Missing `defer` for resource cleanup
- Exported functions without doc comments
- `interface{}` / `any` overuse
### Java
- Catching `Exception` or `Throwable` → Error Handling
- Missing `@Override` annotations
- Mutable static fields → thread safety
- `System.out.println` in production
- Missing null checks (pre-Optional code)
### SQL
- String concatenation in queries → Security CRITICAL
- `SELECT *` → Performance
- Missing WHERE on UPDATE/DELETE → Security CRITICAL
- No LIMIT on user-facing queries → Performance
- Missing indexes for JOIN columns
---
## Advanced Techniques
### Reviewing for Business Logic
Beyond code quality, check:
- Does the code match the PR description / ticket requirements?
- Are there edge cases the spec didn't mention?
- Could this break existing functionality?
- Is there a simpler way to achieve the same result?
### Reviewing for Operability
- Can this be debugged in production? (logging, error messages)
- Can this be rolled back safely?
- Are feature flags needed?
- What monitoring should accompany this change?
### Reviewing Database Changes
- Is the migration reversible?
- Will it lock tables during migration?
- Are there indexes for new query patterns?
- Is there a data backfill needed?
### Security Review Depth Levels
| Level | When | What |
|-------|------|------|
| Quick | Internal tool, trusted input | OWASP Top 10 patterns only |
| Standard | User-facing feature | + auth, input validation, output encoding |
| Deep | Payment, auth, PII handling | + crypto review, session management, audit logging |
| Threat Model | New service/API surface | + attack surface mapping, trust boundaries |
---
## Integration Patterns
### GitHub PR Review
```bash
# Get PR diff
gh pr diff 42 --repo owner/repo
# Get PR details
gh pr view 42 --repo owner/repo --json title,body,files,commits
# Post review comment
gh pr review 42 --repo owner/repo --comment --body "review content"
Local Git Review
# Review staged changes
git diff --cached
# Review branch vs main
git diff main..HEAD
# Review last N commits
git log -5 --oneline && git diff HEAD~5..HEAD
Heartbeat / Cron Integration
Check for open PRs in [repo] that I haven't reviewed yet.
For each, run a SPEAR review and post the results as a PR comment.
Edge Cases & Gotchas
- Large PRs (>500 lines): Break into logical chunks. Review file-by-file. Flag the PR size itself as a finding (Architecture: "PR too large — consider splitting").
- Generated code: Skip generated files (proto, swagger, migrations from ORMs). Note that you skipped them.
- Dependency updates: Focus on breaking changes in changelogs, not the lockfile diff.
- Merge conflicts markers: Flag immediately as CRITICAL —
<<<<<<<in code means broken merge. - Binary files: Note presence, can't review content.
- Config changes: Extra scrutiny — wrong env var = production outage.
- Refactors: Verify behavior preservation. Check if tests still pass conceptually.
Review Checklist (Quick Mode)
For fast reviews when full SPEAR isn't needed:
- No hardcoded secrets or credentials
- No SQL injection / XSS / path traversal
- All errors handled (no empty catch, no discarded errors)
- No N+1 queries or unbounded operations
- Tests exist for new/changed logic
- No
console.log/print/fmt.Printleft in - Functions under 50 lines, files under 300 lines
- Types are specific (no
any/interface{}) - PR description matches the actual changes
- No TODOs without linked issues
Reviews (0)
Sign in to write a review.
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!