Case Study: Second Opinion Reviews Itself
When we shipped the Second Opinion tools, we ran Second Opinion in challenge mode against its own implementation. This is a real example of how the tool catches issues you might miss.
The Setup
{
  "context": "Built a 'Second Opinion' feature for MCPammer that lets AI agents get challenger feedback from alternative models via Artemis LLM Gateway.",
  "proposal": "Implementation includes 5 tools with file reading security checks (500KB limit, blocked patterns, path containment).",
  "mode": "challenge",
  "file_paths": ["mcpammer_api/clients/second_opinion.py"]
}
What It Found
The tool identified several categories of issues:
| Category | Issues Found |
|---|---|
| Security | Symlink traversal possible after path resolution, no rate limiting, no MIME type validation |
| Reliability | No retry logic for API calls, 120s timeout too long, no circuit breaker |
| Performance | Files loaded into memory (no streaming), sequential API calls in dialogue mode |
| Architecture | Singleton pattern limits testing, tight coupling to providers, no fallback models |
Key Finding: Symlink Traversal
The most critical issue was subtle. Our security check used resolve() to get the canonical path, but a carefully crafted symlink inside an allowed directory could point outside:
~/allowed/evil-link -> /etc/passwd
The path ~/allowed/evil-link passes the containment check, but it resolves to /etc/passwd, which is outside the allowed directory.
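One way to guard against this class of problem is to resolve both the allowed root and the candidate path, then compare the resolved paths. The following is a minimal sketch under that assumption, not the actual code in second_opinion.py: the function names is_contained and read_contained are hypothetical, and the 500KB limit is taken from the proposal above. It does not address races where a link is swapped between the check and the read.

```python
import os
import tempfile
from pathlib import Path

MAX_BYTES = 500 * 1024  # 500KB limit from the proposal


def is_contained(candidate: str, allowed_root: str) -> bool:
    """Return True only if the fully resolved candidate lies under allowed_root.

    Hypothetical helper for illustration. resolve() follows symlinks, so the
    comparison is made against the real target, not the path as written.
    """
    root = Path(allowed_root).expanduser().resolve()
    target = Path(candidate).expanduser().resolve()
    try:
        target.relative_to(root)  # raises ValueError when target is outside root
    except ValueError:
        return False
    return True


def read_contained(candidate: str, allowed_root: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Read a file only if it resolves inside allowed_root and fits the size limit."""
    if not is_contained(candidate, allowed_root):
        raise PermissionError(f"{candidate} resolves outside {allowed_root}")
    target = Path(candidate).expanduser().resolve()
    if target.stat().st_size > max_bytes:
        raise ValueError(f"{candidate} exceeds {max_bytes} bytes")
    return target.read_bytes()
```

With this check, a link such as ~/allowed/evil-link that points at /etc/passwd is rejected, because the comparison uses the link's resolved target rather than its location.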
The Outcome
Instead of blocking the release, we:
- Shipped the feature - It's an internal tool with limited exposure
- Created a hardening epic - Tracked all issues for follow-up
- Used second_opinion_quick to validate this approach:
{
  "likely_to_work": true,
  "confidence": "high",
  "brief_reasoning": "Since it's an internal tool with limited exposure, fixing these issues in a follow-up sprint is reasonable."
}
Lessons Learned
- Use the tool on your own code - You'll find things you missed
- Pass actual file paths - The tool gives better feedback with real code
- Don't let perfect be the enemy of shipped - Use quick mode to validate your prioritization
- Create tickets from findings - Don't lose the insights
The hardening epic (5023653e) now tracks: symlink fix, rate limiting, retry logic, circuit breaker, and observability improvements.
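As an illustration of what the retry item might involve, here is a minimal sketch of retrying a call with exponential backoff and jitter. It is an assumption about one reasonable shape for the fix, not the implementation tracked in the epic: call_with_retry and its parameters are hypothetical names, and the set of retryable exceptions would depend on the client library actually in use.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            # Jitter spreads out retries from callers that failed together.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

A caller would wrap the gateway request in a zero-argument function and pass it as fn. Keeping the attempt count small matters here, since each attempt can also wait for the full request timeout.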
Related
- Second Opinion Tools - Full tool documentation
- Cronus Initiative Case Study - Pre-implementation review example