Code Review

mcpammer_code_review is an AST-based code quality verification tool designed for AI-to-AI loops, not human review. It returns structured, actionable feedback that agents consume and act on directly—no buttons to click, no comments to read.

The Problem with Human-in-the-Loop Review

Traditional code review tools (including AI-powered ones like CodeRabbit) assume:

AI writes code → Human reads review → Human clicks buttons → Human decides

This model made sense when implementation was done by humans who needed oversight. But when AI agents write 20,000 lines overnight, test comprehensively, and iterate autonomously—what is the human clicking buttons adding?

Nothing.

The human isn't catching bugs—automated tests do that. The human isn't understanding the code—they can't deeply understand 20,000 lines in a review session. The human is performing review theater.

The MCPammer Model

We believe humans belong around the loop, not in it:

Human defines:
├── Initiative (what to build)
├── Success criteria (how we know it works)
├── Constraints (what AI can/can't do)
└── Requirements (what must be true)

AI executes:
├── Writes code
├── Runs tests
├── Self-reviews via mcpammer_code_review
├── Gets second_opinion if needed
├── Iterates until criteria pass
└── Ships

Human verifies:
└── Did the outcome match the intent? Yes/no.

The human judges results, not process. No line-by-line review.

Why This Matters

The Historical Pattern

Every abstraction layer has removed humans from implementation:

| Era | Human Role |
|-----|------------|
| Assembly | Write every instruction |
| C | Write algorithms, compiler handles instructions |
| Python | Write logic, runtime handles memory |
| Frameworks | Write business logic, framework handles plumbing |
| AI Agents | Write specifications, AI handles implementation |

This isn't a break from pattern—it's the next step.

The Brooks Insight (1986)

Fred Brooks identified in "No Silver Bullet":

"The hard part of building software is the specification, design, and testing of the conceptual construct, not the labor of representing it."

The bottleneck was never typing code. It was knowing what to build and why. AI just makes this explicit.

The Economic Forcing Function

Consider two teams:

Team A (Human-in-Loop):

  • AI writes → Human reviews (2-4 hours) → Fixes → Re-review (1-2 hours)
  • Cycle time: 1-2 days

Team B (Autonomous):

  • Human defines constraints → AI writes, tests, self-reviews, iterates → Ship
  • Cycle time: 2-4 hours

Team B ships 4-8x faster. Market pressure is unidirectional.

Tool Design

mcpammer_code_review

AST-based static analysis returning structured, actionable output.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file_paths | string[] | Yes | Files to review |
| repo_path | string | No | Repository context for cross-file analysis |
| checks | string[] | No | Specific check categories to run |
| severity_threshold | string | No | Minimum severity to report (error, warning, info) |

Response:

```json
{
  "passed": false,
  "summary": {
    "files_analyzed": 12,
    "errors": 2,
    "warnings": 5,
    "info": 8
  },
  "issues": [
    {
      "file": "src/api/users.ts",
      "line": 42,
      "column": 15,
      "severity": "error",
      "category": "security",
      "code": "SQL_INJECTION",
      "message": "Unsanitized user input in SQL query",
      "suggestion": "Use parameterized query: db.query('SELECT * FROM users WHERE id = $1', [userId])",
      "context": "const result = db.query(`SELECT * FROM users WHERE id = ${userId}`)"
    }
  ],
  "metrics": {
    "avg_complexity": 8.3,
    "max_complexity": 24,
    "test_coverage": 0.72,
    "duplication_ratio": 0.03
  }
}
```

Key Design Decisions:

  1. Structured output - JSON, not prose. Agents parse and act on it.
  2. Actionable suggestions - Include the fix, not just the problem.
  3. Metrics included - Quantitative data for automated gates.
  4. Pass/fail boolean - Enables simple gate checks in hooks.
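
Because the response is plain JSON, an agent can consume it with ordinary types instead of parsing prose. Below is a minimal TypeScript sketch: the field names mirror the Response example above, and `blockingIssues` is a hypothetical helper for illustration, not part of the tool.

```typescript
// Types mirroring the Response example above (illustrative, not a published SDK).
interface ReviewIssue {
  file: string;
  line: number;
  column: number;
  severity: "error" | "warning" | "info";
  category: string;
  code: string;
  message: string;
  suggestion?: string;
  context?: string;
}

interface ReviewResponse {
  passed: boolean;
  summary: { files_analyzed: number; errors: number; warnings: number; info: number };
  issues: ReviewIssue[];
  metrics: {
    avg_complexity: number;
    max_complexity: number;
    test_coverage: number;
    duplication_ratio: number;
  };
}

// Hypothetical helper: the blocking issues an agent must resolve before the gate
// passes. Each one already carries the suggested fix.
function blockingIssues(review: ReviewResponse): ReviewIssue[] {
  return review.issues.filter((issue) => issue.severity === "error");
}
```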

Check Categories

Security

OWASP Top 10 and common vulnerabilities:

| Code | Description |
|------|-------------|
| SQL_INJECTION | Unsanitized input in SQL |
| XSS | Cross-site scripting vectors |
| CMD_INJECTION | Shell command injection |
| PATH_TRAVERSAL | Directory traversal vulnerabilities |
| HARDCODED_SECRET | Credentials in source code |
| INSECURE_RANDOM | Weak random number generation |

Correctness

Logic and type errors:

| Code | Description |
|------|-------------|
| NULL_DEREF | Potential null/undefined dereference |
| UNREACHABLE_CODE | Dead code after return/throw |
| UNUSED_VAR | Declared but never used |
| TYPE_MISMATCH | Incompatible type assignment |
| INFINITE_LOOP | Loop without exit condition |
| RACE_CONDITION | Concurrent access without synchronization |

Performance

Efficiency issues:

| Code | Description |
|------|-------------|
| N_PLUS_ONE | Database query in loop |
| UNNECESSARY_ALLOC | Allocation inside hot loop |
| MISSING_INDEX_HINT | Query pattern suggests missing index |
| SYNC_IN_ASYNC | Blocking call in async context |
| LARGE_BUNDLE | Import increases bundle size significantly |

Maintainability

Code health metrics:

| Code | Description |
|------|-------------|
| HIGH_COMPLEXITY | Cyclomatic complexity > threshold |
| DEEP_NESTING | Nesting depth > 4 levels |
| LONG_FUNCTION | Function exceeds line limit |
| CODE_DUPLICATION | Substantial duplicate blocks |
| MAGIC_NUMBER | Unexplained numeric literal |

Integration with Autonomous Loops

As an Initiative Gate

Add code review as a criterion for initiative completion:

Initiative: Build user authentication
├── Criterion: All tests pass
├── Criterion: mcpammer_code_review passes with no errors
├── Criterion: Coverage > 80%
└── Criterion: No security issues

The agent iterates until all criteria pass. No human review needed.
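
A hedged sketch of what that gate can look like on the agent side: the thresholds come from the criteria above, the fields come from the Response schema, and the function itself is hypothetical glue, not an MCPammer API.

```typescript
// Hypothetical criteria check for the initiative above; field names follow the
// mcpammer_code_review Response example, thresholds follow the listed criteria.
interface Review {
  passed: boolean;
  issues: { severity: string; category: string }[];
  metrics: { test_coverage: number };
}

function initiativeCriteriaMet(allTestsPass: boolean, review: Review): boolean {
  return (
    allTestsPass &&                                                  // all tests pass
    review.passed &&                                                 // review passes with no errors
    review.metrics.test_coverage > 0.8 &&                            // coverage > 80%
    !review.issues.some((issue) => issue.category === "security")    // no security issues
  );
}
```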

As an Epic Hook

Configure code review as a gate hook on epic completion:

```json
{
  "hook_type": "gate",
  "event": "epic_complete",
  "action": "code_review",
  "action_config": {
    "severity_threshold": "error",
    "required_checks": ["security", "correctness"]
  }
}
```

Epic cannot complete until code review passes.

In the Iteration Loop

Typical agent workflow:

1. Agent claims ticket
2. Agent implements feature
3. Agent runs tests
4. Agent calls mcpammer_code_review
5. If issues found:
   a. Agent reads structured issues
   b. Agent applies suggested fixes
   c. Goto step 3
6. If passed: Agent marks ticket complete

The loop runs in seconds, not days.
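
Sketched in TypeScript below. `implementTicket`, `runTests`, `applyFix`, and `callTool` are hypothetical stand-ins for whatever the agent harness provides; only the mcpammer_code_review call and its response shape come from this page.

```typescript
// Hypothetical harness helpers; only the review call and its response shape
// are defined on this page.
declare function implementTicket(ticketId: string): Promise<string[]>; // returns changed file paths
declare function runTests(): Promise<boolean>;
declare function applyFix(issue: { file: string; line: number; suggestion?: string }): Promise<void>;
declare function callTool(name: string, args: object): Promise<any>;

async function workTicket(ticketId: string, maxIterations = 10): Promise<boolean> {
  const changedFiles = await implementTicket(ticketId); // steps 1-2

  for (let i = 0; i < maxIterations; i++) {
    await runTests(); // step 3 (handling test failures is omitted here)

    // Step 4: self-review via mcpammer_code_review.
    const review = await callTool("mcpammer_code_review", {
      file_paths: changedFiles,
      severity_threshold: "error",
    });

    // Step 6: passed, so the agent can mark the ticket complete.
    if (review.passed) return true;

    // Step 5: read the structured issues, apply the suggested fixes, loop back.
    for (const issue of review.issues) {
      await applyFix(issue);
    }
  }
  return false; // escalate instead of looping forever
}
```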

Comparison with Traditional Tools

| Aspect | Traditional (CodeRabbit, etc.) | mcpammer_code_review |
|--------|--------------------------------|----------------------|
| Output format | Prose comments for humans | Structured JSON for agents |
| Human required | Yes, to read and decide | No |
| Iteration speed | Hours (waiting for human) | Seconds (agent loops) |
| Integration | PR-based, GitHub UI | MCP tool, programmatic |
| Actionability | "Consider using..." | Exact fix provided |
| Gate capability | Manual approval | Boolean pass/fail |

Architecture

┌─────────────────┐
│   Claude Code   │
│ (or any agent)  │
└────────┬────────┘
         │ MCP call
         ▼
┌─────────────────┐      ┌─────────────────┐
│    MCPammer     │─────▶│   Tree-sitter   │
│   code_review   │      │   AST Parser    │
└────────┬────────┘      └─────────────────┘
         │
         ├─── Rule Engine (pattern matching)
         │
         ├─── Metrics Calculator
         │
         └─── Optional: second_opinion for deeper analysis

Why AST-Based

  1. Language agnostic - Tree-sitter supports 100+ languages
  2. Fast - Parse once, query many patterns
  3. Precise - Line/column accuracy for issues
  4. Semantic - Understands code structure, not just text
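
For a concrete feel, here is a sketch of the kind of check an AST makes cheap, using the Node.js tree-sitter bindings. This is an assumption for illustration, not MCPammer's actual rule engine; it flags SQL built from template strings, roughly what the SQL_INJECTION rule targets.

```typescript
import Parser from "tree-sitter";
import TypeScript from "tree-sitter-typescript";

const parser = new Parser();
parser.setLanguage(TypeScript.typescript);

const source =
  "const result = db.query(`SELECT * FROM users WHERE id = ${userId}`)";
const tree = parser.parse(source);

// Parse once, then query structurally: every template string in the file,
// with exact line/column positions.
for (const node of tree.rootNode.descendantsOfType("template_string")) {
  if (node.text.toUpperCase().includes("SELECT")) {
    console.log(
      `${node.startPosition.row + 1}:${node.startPosition.column + 1} ` +
        "possible SQL built from a template string"
    );
  }
}
```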

Optional Second Opinion Integration

For complex issues, code_review can internally call second_opinion:

```json
{
  "use_second_opinion": true,
  "second_opinion_threshold": "high_complexity"
}
```

This adds LLM-based analysis for issues that pattern matching can't catch.

The One-Year Prediction

By January 2027:

  1. Per-PR human review will be optional in most organizations
  2. Autonomous coding agents will handle 80%+ of implementation
  3. Tools designed for human-in-loop (CodeRabbit, etc.) will have pivoted or become irrelevant
  4. The senior engineer role will be explicitly about specification, not code review
  5. Constraint-based autonomous execution will be the standard model

This isn't speculation—it's extrapolation from the last 12 months of AI development.

When Human Review Still Makes Sense

We're not absolutists. Human review remains valuable for:

| Scenario | Why |
|----------|-----|
| Novel domains | Can't specify what you don't understand |
| Architectural decisions | Trade-offs require business context |
| Adversarial security | Humans may catch what AI systematically misses |
| Regulatory compliance | Some industries require human sign-off |

But notice: these are specification and judgment activities, not line-by-line code reading.

Getting Started

Prerequisites

  • MCPammer configured in your MCP client
  • Repository with supported languages (TypeScript, Python, Go, Rust, etc.)

Basic Usage

```json
{
  "tool": "mcpammer_code_review",
  "parameters": {
    "file_paths": ["src/api/users.ts", "src/api/auth.ts"],
    "severity_threshold": "warning"
  }
}
```
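
If you are driving the tool from a script rather than an agent, one hedged way to issue that same call is through the MCP TypeScript SDK. The server launch command below is an assumption; substitute whatever your MCP client configuration already uses for MCPammer.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const client = new Client({ name: "review-driver", version: "0.1.0" });

// Assumed launch command for the MCPammer server; adjust to your setup.
await client.connect(new StdioClientTransport({ command: "mcpammer", args: ["serve"] }));

const result = await client.callTool({
  name: "mcpammer_code_review",
  arguments: {
    file_paths: ["src/api/users.ts", "src/api/auth.ts"],
    severity_threshold: "warning",
  },
});

console.log(JSON.stringify(result, null, 2));
```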

With Initiative Integration

  1. Create initiative with success criteria including code review
  2. Let agent iterate autonomously
  3. Review shipped outcome, not implementation process

Philosophy

The hard part of software was never typing code. It was:

  • Understanding what to build
  • Defining constraints and boundaries
  • Specifying success criteria
  • Identifying failure modes

Implementation was always commoditized labor. AI just makes this explicit.

Level 9 engineers—the ones who write design docs, not code—are the template for what all valuable engineers become. The rest is automation.


Artifact Reference

This page is based on the artifact: "The End of Human-in-the-Loop Implementation Review" (f2c3c8cc-5c55-4b14-9da6-b1ae873e84bb)

Created during the initiative to integrate CodeRabbit, which pivoted to building native autonomous code review instead.