Code Room
Code reviewMediumcr-g625
Subject Model training determinismLevel Mid–Senior~15 minCommon in Code quality & review interviewsIndustries Software development

Question

Review this Python experiment-comparison script.

Results flip between runs — sometimes A wins, sometimes B. What's the problem and how do you make this trustworthy?

What a strong answer looks like

Separate real bugs from style. Rank issues by severity, point at the root cause rather than the symptom, and suggest a concrete fix — specific and kind.

Talk through your review
Code to reviewpython
import numpy as npfrom sklearn.ensemble import RandomForestClassifierfrom sklearn.model_selection import train_test_splitfrom sklearn.metrics import f1_score def compare(X, y, feature_set_a, feature_set_b):    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25)    def run(cols):        m = RandomForestClassifier(n_estimators=300)        m.fit(Xtr[:, cols], ytr)        return f1_score(yte, m.predict(Xte[:, cols]))    fa = run(feature_set_a)    fb = run(feature_set_b)    print("A:", fa, "B:", fb, "-> pick", "A" if fa > fb else "B")
Run or narrate your approach, then ask the coach.