Code Room
Coding · Hard · cod-g1065
Subject: Machine learning · Level: Senior · ~35 min
Common in: ML systems · Algorithms & data structures interviews
Industries: Software development, Technology

Question

Find the best split for a decision stump on a single numeric feature using Gini impurity. You are given X (a list of numeric feature values) and y (binary labels, 0 or 1) of equal length; there is at least one sample.

Consider every threshold equal to a value present in X. A sample goes left when X[i] <= threshold, otherwise right. The Gini impurity of a group is 1 - sum over classes of (class fraction)^2, and an empty group has Gini 0. The weighted Gini of a split is (n_left/n)*gini(left) + (n_right/n)*gini(right).

Return [best_threshold, best_weighted_gini], where best_threshold minimizes the weighted Gini and the Gini is rounded to 6 decimals. Break ties by choosing the smallest threshold.
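
To make the definitions concrete, here is the same scoring rule written out as a formula, followed by the arithmetic for X = [1, 2, 3, 4], y = [0, 0, 1, 1]. The symbols p_0, p_1 (class fractions within a group), L_t, R_t (the left and right groups at threshold t), and n_L, n_R (their sizes) are notation assumed here for illustration; they do not appear in the question.

```latex
\begin{aligned}
G(S) &= 1 - p_0^2 - p_1^2, \qquad G(\varnothing) = 0,\\
G_{\text{split}}(t) &= \frac{n_L}{n}\,G(L_t) + \frac{n_R}{n}\,G(R_t).\\[6pt]
t = 1:&\quad \tfrac{1}{4}\cdot 0 + \tfrac{3}{4}\left(1 - \tfrac{1}{9} - \tfrac{4}{9}\right) = \tfrac{3}{4}\cdot\tfrac{4}{9} = \tfrac{1}{3}\\
t = 2:&\quad \tfrac{2}{4}\cdot 0 + \tfrac{2}{4}\cdot 0 = 0\\
t = 3:&\quad \tfrac{3}{4}\cdot\tfrac{4}{9} + \tfrac{1}{4}\cdot 0 = \tfrac{1}{3}\\
t = 4:&\quad \tfrac{4}{4}\left(1 - \tfrac{1}{4} - \tfrac{1}{4}\right) + 0 = \tfrac{1}{2}
\end{aligned}
```

The minimum is 0 at t = 2, so the expected return value for this input is [2, 0].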

Implement
best_split(X: list[float], y: list[int]) → list[float]
Examples
in: [[1, 2, 3, 4], [0, 0, 1, 1]]
out: [2, 0]
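
One possible approach, sketched here under stated assumptions rather than as a reference solution: sort the samples by feature value, then sweep the distinct values once while keeping running counts of how many samples (and how many label-1 samples) have moved to the left side. That gives O(n log n) time and O(n) extra space. The helper name `_scaled_gini` and the use of `fractions.Fraction` for exact tie comparison are choices made for this sketch, not part of the question. It relies on the question's guarantee that there is at least one sample.

```python
from fractions import Fraction


def _scaled_gini(pos: int, size: int) -> Fraction:
    """Return size * gini(group) exactly; an empty group contributes 0.

    Hypothetical helper for this sketch. `pos` is the count of label-1
    samples in the group and `size` is the group size.
    """
    if size == 0:
        return Fraction(0)
    neg = size - pos
    # size * (1 - (pos/size)^2 - (neg/size)^2) = size - (pos^2 + neg^2) / size
    return size - Fraction(pos * pos + neg * neg, size)


def best_split(X: list[float], y: list[int]) -> list[float]:
    n = len(X)  # assumed >= 1, as the question guarantees
    pairs = sorted(zip(X, y))
    total_pos = sum(y)

    best_threshold = None
    best_score = None
    left_n = 0    # samples with X[i] <= threshold
    left_pos = 0  # label-1 samples among them
    i = 0
    while i < n:
        threshold = pairs[i][0]
        # Move every sample whose value equals this threshold to the left,
        # so duplicate feature values are evaluated as one candidate.
        while i < n and pairs[i][0] == threshold:
            left_pos += pairs[i][1]
            left_n += 1
            i += 1
        right_n = n - left_n
        right_pos = total_pos - left_pos
        score = (_scaled_gini(left_pos, left_n)
                 + _scaled_gini(right_pos, right_n)) / n
        # Thresholds are visited in ascending order, so a strict "<"
        # keeps the smallest threshold on ties.
        if best_score is None or score < best_score:
            best_threshold, best_score = threshold, score

    return [best_threshold, round(float(best_score), 6)]
```

Exact fractions are used so that two thresholds with mathematically equal scores compare as equal; with plain floats, rounding error could break such ties in the wrong direction.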
What a strong answer looks like

State your approach and its time/space complexity out loud before you optimize. Handle the edge cases (empty input, duplicates, overflow), and say why you chose this over the brute force. Green tests are the floor, not the grade.
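
For comparison, the brute force mentioned above can be written directly from the definitions: for each distinct value in X, rebuild the left and right groups and score them. This is O(n * k) for k distinct values, so O(n^2) in the worst case, which is the cost a sorted sweep with running counts avoids. The names `gini` and `best_split_bruteforce` are hypothetical, chosen for this sketch, and at least one sample is assumed.

```python
from fractions import Fraction


def gini(labels: list[int]) -> Fraction:
    """Gini impurity of a group of binary labels; an empty group scores 0."""
    if not labels:
        return Fraction(0)
    n = len(labels)
    ones = sum(labels)
    return 1 - Fraction(ones, n) ** 2 - Fraction(n - ones, n) ** 2


def best_split_bruteforce(X: list[float], y: list[int]) -> list[float]:
    n = len(X)  # assumed >= 1
    best = None  # (threshold, exact weighted Gini)
    # Ascending order plus a strict "<" keeps the smallest threshold on ties.
    for t in sorted(set(X)):
        left = [label for x, label in zip(X, y) if x <= t]
        right = [label for x, label in zip(X, y) if x > t]
        score = (Fraction(len(left), n) * gini(left)
                 + Fraction(len(right), n) * gini(right))
        if best is None or score < best[1]:
            best = (t, score)
    return [best[0], round(float(best[1]), 6)]
```

A version like this is mainly useful as an oracle: run it against a faster implementation on small random inputs with many duplicate values and single-class labels, and check that both return the same pair.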
