Code Room
Vibe coding | Hard | vc-g267
Subject: AI code review | Level: Senior–Staff | ~20 min | Common in: Algorithms & data structures interviews | Industries: Software development

Question

An AI wrote this feature-selection-then-CV pipeline for a high-dimensional genomics classifier. CV accuracy is 0.97 on 80 samples and 20,000 features. Explain why that number is a mirage.

python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

selector = SelectKBest(f_classif, k=50)
X_sel = selector.fit_transform(X, y)        # select using ALL data
scores = cross_val_score(SVC(), X_sel, y, cv=5)
print('CV acc:', scores.mean())            # 0.97

What a strong answer looks like

Treat the AI’s output as a draft to verify, not an answer to trust. Name the specific flaw and the input that triggers it, say how you’d catch it — tests, edge cases, reading critically — and how you’d re-prompt or decompose to get it right.
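
One way to catch the flaw is a noise test, sketched here under stated assumptions. The data below is synthetic (random features with labels unrelated to them) and stands in for the real `X` and `y`, which are not shown in the question. The names `leaky` and `honest` are illustrative. If the pipeline scores well above chance on pure noise, the score comes from the procedure, not the biology. The sketch runs the original select-then-CV order next to a version where `SelectKBest` sits inside a `Pipeline`, so selection is refit on each training fold only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Assumption: synthetic stand-in with the same shape as the question's data.
rng = np.random.default_rng(0)
X = rng.standard_normal((80, 20_000))   # pure noise, no real signal
y = np.repeat([0, 1], 40)               # balanced labels, independent of X

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Leaky order (as in the question): features are chosen using every label,
# including the labels of samples that later serve as held-out folds.
X_sel = SelectKBest(f_classif, k=50).fit_transform(X, y)
leaky = cross_val_score(SVC(), X_sel, y, cv=cv).mean()

# Leak-free order: selection is a pipeline step, so cross_val_score refits it
# on the training portion of each split and the held-out fold stays unseen.
pipe = make_pipeline(SelectKBest(f_classif, k=50), SVC())
honest = cross_val_score(pipe, X, y, cv=cv).mean()

print(f"leaky CV acc:  {leaky:.2f}")
print(f"honest CV acc: {honest:.2f}")
```

On noise like this, the leaky estimate should land far above chance while the pipeline estimate stays near 0.5. With 20,000 candidate features and only 80 samples, some features correlate with the labels by luck, and selecting them on the full dataset lets that luck carry into every test fold.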

Describe your solution

Vibe coding: describe the solution in plain language (or narrate it) and the coach grades your approach. Generating runnable code from your description is coming next.
