Question
A teammate built a CI step in which an autonomous agent attempts to auto-fix failing tests on every PR, looping up to 20 times and running the full suite on each iteration. It works, but builds now take 25 minutes and the monthly token bill has tripled. Leadership loves the 'self-healing CI' demo. As the senior engineer on the team, how do you judge whether this agent loop is worth its cost and latency, and what would you change?
Treat the AI’s output as a draft to verify, not an answer to trust. Name the specific flaw and the input that triggers it, say how you would catch it (tests, edge cases, critical reading), and explain how you would re-prompt or decompose the task to get it right.
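One way to ground the "is it worth it" judgment is to log each agent run and compute cost per accepted fix, then check how many iterations the accepted fixes actually needed. The sketch below is illustrative only: the `AgentRun` record, its field names, the per-token price, and the 90% coverage target are all assumptions, not part of the setup described in the question.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentRun:
    """One CI invocation of the auto-fix agent (hypothetical record shape)."""
    iterations: int       # loop iterations used, up to the cap of 20
    tokens: int           # total tokens consumed by this run
    added_minutes: float  # wall-clock minutes the agent added to the build
    accepted: bool        # suite went green AND a human accepted the fix


def summarize(runs, usd_per_1k_tokens):
    """Spread the cost of ALL runs, including failures, over accepted fixes."""
    accepted = [r for r in runs if r.accepted]
    total_usd = sum(r.tokens for r in runs) / 1000 * usd_per_1k_tokens
    total_minutes = sum(r.added_minutes for r in runs)
    n = len(accepted)
    return {
        "success_rate": n / len(runs) if runs else 0.0,
        "usd_per_accepted_fix": total_usd / n if n else math.inf,
        "minutes_per_accepted_fix": total_minutes / n if n else math.inf,
    }


def cap_for_coverage(runs, coverage=0.9):
    """Smallest iteration cap that still captures `coverage` of accepted fixes."""
    used = sorted(r.iterations for r in runs if r.accepted)
    if not used:
        return None  # no accepted fixes: the data gives no basis for a cap
    needed = max(1, math.ceil(coverage * len(used)))
    return used[needed - 1]
```

If most accepted fixes land within the first few iterations, `cap_for_coverage` gives a data-backed argument for lowering the limit of 20, and `usd_per_accepted_fix` can be compared against what an engineer's time for the same fix would cost. Counting a run as accepted only after human review is a deliberate choice here: a green suite alone can hide fixes that weaken or delete the failing test.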