Question
Your team merges a lot of AI-agent-authored PRs. A subtle pricing-calculation bug ships, and during the postmortem someone asks, 'Why was this written this way?' The author re-runs the agent and gets a different implementation; the original prompt and reasoning are gone. There's no record of what was asked, what the agent changed, or what the human reviewer actually checked. What process would you institute to make AI-written changes auditable and reproducible, and what are the limits of 'reproducibility' with a model?
Treat the AI’s output as a draft to verify, not an answer to trust. Name the specific flaw and the input that triggers it, say how you’d catch it — tests, edge cases, reading critically — and how you’d re-prompt or decompose to get it right.
Vibe coding: describe the solution in plain language (or narrate it) and the coach grades your approach.