Coding | Medium | cod-g1067

Subject: Machine learning | Level: Mid–Senior | ~30 min | Common in ML systems interviews | Industries: Software development

Question

Preprocess a tiny tabular dataset. You are given rows, a list of records where each record is [value, category] with value a number and category a string, and categories, the ordered list of all possible category strings. Min-max normalize the numeric column to [0,1] using v' = (v - min) / (max - min); if max == min, every normalized value is 0.0. One-hot encode the category against the categories list (a 1.0 in the position of the matching category, 0.0 elsewhere). Return a list of feature rows, where each row is [normalized_value] followed by the one-hot vector. Round every output number to 6 decimals. There is at least one row.

Implement

preprocess(rows: list[list], categories: list[str]) → list[list[float]]

Examples

in: [[[10,"a"],[20,"b"],[30,"a"]],["a","b"]]
out: [[0,1,0],[0.5,0,1],[1,1,0]]
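One possible way to implement the steps in the question is sketched below. It is a minimal illustration, not a reference solution. Two points are assumptions, because the question does not specify them: the entries of categories are taken to be unique, and a row whose category is missing from categories raises a KeyError rather than producing an all-zero vector.

```python
def preprocess(rows, categories):
    """Min-max normalize the numeric column and one-hot encode the category."""
    values = [value for value, _ in rows]
    lo, hi = min(values), max(values)
    span = hi - lo

    # Assumption: categories are unique, so each string maps to one position.
    index = {cat: i for i, cat in enumerate(categories)}

    features = []
    for value, cat in rows:
        # Degenerate case from the question: max == min gives 0.0 everywhere.
        norm = (value - lo) / span if span != 0 else 0.0

        one_hot = [0.0] * len(categories)
        # Assumption: an unknown category raises KeyError here.
        one_hot[index[cat]] = 1.0

        features.append([round(norm, 6)] + one_hot)
    return features


if __name__ == "__main__":
    print(preprocess([[10, "a"], [20, "b"], [30, "a"]], ["a", "b"]))
    # Constant numeric column: every normalized value is 0.0.
    print(preprocess([[5, "b"], [5, "b"]], ["a", "b"]))
```

With n rows and k categories, this sketch makes two passes over the rows and builds an output of n * (k + 1) numbers, so both time and space are O(n * k). The dictionary lookup avoids scanning categories for every row.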

What a strong answer looks like

State your approach and its time/space complexity out loud before you optimize. Handle the edge cases (empty input, duplicates, overflow), and say why you chose this over the brute force. Green tests are the floor, not the grade.
