Question
You're shipping an in-editor autocomplete feature: as the user types, your service must return a code suggestion. Product wants 'the smartest model so suggestions are great.' You measure the premium model at ~1.8s median latency; the smaller one at ~250ms with noticeably lower acceptance on hard completions. Walk through how you'd choose the model tier here.
Treat the AI’s output as a draft to verify, not an answer to trust. Name the specific flaw and the input that triggers it, say how you’d catch it — tests, edge cases, reading critically — and how you’d re-prompt or decompose to get it right.
Vibe coding: describe the solution in plain language (or narrate it) and the coach grades your approach. Generating runnable code from your description is coming next.