Case Study: Second Opinion Reviews Itself

When we shipped the Second Opinion tools, we ran them against their own implementation. This is a real example of the kinds of issues the tool catches that you might miss.

The Setup

{
  "context": "Built a 'Second Opinion' feature for MCPammer that lets AI agents get challenger feedback from alternative models via Artemis LLM Gateway.",
  "proposal": "Implementation includes 5 tools with file reading security checks (500KB limit, blocked patterns, path containment).",
  "mode": "challenge",
  "file_paths": ["mcpammer_api/clients/second_opinion.py"]
}
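For readers unfamiliar with the checks named in the proposal, the size limit and blocked-pattern checks might look roughly like the sketch below. This is not the code in second_opinion.py: the function name check_readable, the pattern list, and the reading of "500KB" as 500 * 1024 bytes are all assumptions made for illustration. Path containment is left out here.

```python
import fnmatch
from pathlib import Path

# Assumption: "500KB" is read as 500 * 1024 bytes.
MAX_BYTES = 500 * 1024

# Hypothetical patterns; the real blocked list is not shown in this case study.
BLOCKED_PATTERNS = ("*.pem", "*.key", ".env", ".env.*")


def check_readable(path: str) -> Path:
    """Raise if the file name matches a blocked pattern or the file is too large."""
    p = Path(path)
    # Match on the file name only, so the directory part cannot hide a match.
    if any(fnmatch.fnmatch(p.name, pattern) for pattern in BLOCKED_PATTERNS):
        raise PermissionError(f"blocked file pattern: {p.name}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes, limit is {MAX_BYTES}")
    return p
```

Checking the size with stat() before reading avoids loading an oversized file just to reject it.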

What It Found

The tool identified issues in four categories:

| Category | Issues Found |
| --- | --- |
| Security | Symlink traversal possible after path resolution, no rate limiting, no MIME type validation |
| Reliability | No retry logic for API calls, 120s timeout too long, no circuit breaker |
| Performance | Files loaded into memory (no streaming), sequential API calls in dialogue mode |
| Architecture | Singleton pattern limits testing, tight coupling to providers, no fallback models |
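To make the "no retry logic" finding concrete, here is a minimal sketch of bounded retries with exponential backoff. It is a generic pattern, not the fix that went into the hardening epic: call_with_retry, its parameters, and the choice of TimeoutError and ConnectionError as retryable errors are assumptions for illustration, and no real gateway client is called.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying the listed errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing sleep as a parameter keeps the helper testable without real waiting. A shorter per-call timeout than the 120s flagged above would be set on the underlying client, which this sketch does not model.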

Key Finding: Symlink Traversal

The most critical issue was subtle. Our security check called resolve() to get the canonical path, but the tool found that a carefully crafted symlink inside an allowed directory could still reach a file outside it:

~/allowed/evil-link -> /etc/passwd

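As written, ~/allowed/evil-link sits inside the allowed directory, so it passes any containment check applied to the unresolved path, yet it resolves to /etc/passwd. The check is only sound if it compares the resolved path against the allowed root and the file that is then opened is that same resolved path.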
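The difference between checking the path as written and checking its resolved target can be sketched as follows. Both function names are hypothetical and neither is taken from second_opinion.py; the sketch only illustrates the general technique of resolving symlinks before testing containment.

```python
from pathlib import Path


def naive_is_contained(candidate: str, allowed_root: str) -> bool:
    """Lexical check: compares the path as written, without following symlinks."""
    root = Path(allowed_root).absolute()
    return root in Path(candidate).absolute().parents


def is_contained(candidate: str, allowed_root: str) -> bool:
    """Resolved check: follows symlinks on both sides before comparing."""
    root = Path(allowed_root).resolve()
    # resolve() follows symlinks, so a link that points outside the root
    # produces a target path that is not under the root.
    target = Path(candidate).resolve()
    return target == root or root in target.parents
```

For a link like the one above, naive_is_contained accepts the path while is_contained rejects it. To avoid a gap between check and use, the caller would open the resolved path that was checked, not the original string.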

The Outcome

Instead of blocking the release, we:

1. Shipped the feature - It's an internal tool with limited exposure
2. Created a hardening epic - Tracked all issues for follow-up
3. Used second_opinion_quick to validate this approach:

{
  "likely_to_work": true,
  "confidence": "high",
  "brief_reasoning": "Since it's an internal tool with limited exposure, fixing these issues in a follow-up sprint is reasonable."
}

Lessons Learned

1. Use the tool on your own code - You'll find things you missed
2. Pass actual file paths - The tool gives better feedback with real code
3. Don't let perfect be the enemy of shipped - Use quick mode to validate your prioritization
4. Create tickets from findings - Don't lose the insights

The hardening epic (5023653e) now tracks: symlink fix, rate limiting, retry logic, circuit breaker, and observability improvements.

Related

Second Opinion Tools - Full tool documentation
Cronus Initiative Case Study - Pre-implementation review example
