Code Room
Code review (Hard)
Question
Review this Python validation setup for a demand-forecasting model.
The CV MAE is great but the model misses badly in production. Find the leakage.
What a strong answer looks like
Separate real bugs from style. Rank issues by severity, point at the root cause rather than the symptom, and suggest a concrete fix. Keep the feedback specific and kind.
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

def cv_forecast(df):
    # df sorted by date; features include lag_1, lag_7, rolling_mean_30
    X = df.drop(columns=["date", "demand"]).values
    y = df["demand"].values
    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    maes = []
    for tr, te in kf.split(X):
        m = GradientBoostingRegressor().fit(X[tr], y[tr])
        maes.append(mean_absolute_error(y[te], m.predict(X[te])))
    print("CV MAE:", sum(maes) / len(maes))

Run or narrate your approach, then ask the coach.
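Once you have formed your own answer, one possible direction for a fix can be sketched as follows. The split below keeps every training row earlier than every test row, which a shuffled K-fold does not guarantee. `expanding_window_splits` is a hypothetical, stdlib-only helper written for illustration, and it assumes the rows are already sorted by date; scikit-learn's `TimeSeriesSplit` offers a similar expanding-window scheme. It does not address how lag_1, lag_7, or rolling_mean_30 were computed, so a separate check that each feature uses only past values would still be needed.

```python
def expanding_window_splits(n_samples, n_splits=5):
    """Yield (train_idx, test_idx) lists where training rows always precede test rows.

    Hypothetical helper for illustration only. Assumes row i is earlier in
    time than row i + 1 (i.e. the data is sorted by date).
    """
    test_size = n_samples // (n_splits + 1)
    if n_splits < 1 or test_size < 1:
        raise ValueError("need at least n_splits + 1 samples and n_splits >= 1")
    # The last n_splits * test_size rows are carved into consecutive test folds.
    first_test_start = n_samples - n_splits * test_size
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = list(range(0, test_start))  # everything before the fold
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


if __name__ == "__main__":
    for train_idx, test_idx in expanding_window_splits(12, n_splits=5):
        # No training index is later than any test index, so the model
        # never sees rows from the future of the fold it is scored on.
        assert max(train_idx) < min(test_idx)
        print("train up to", train_idx[-1], "| test", test_idx)
```

In `cv_forecast`, the index pairs from a splitter like this would take the place of `kf.split(X)`. Expect the resulting MAE to look worse than the shuffled figure; that is the point, since it should sit closer to what production sees.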