Code Room
Coding | Medium | cod-g454
Subject: Tokenization
Level: Mid–Senior
Time: ~25 min
Common in: Algorithms & data structures interviews
Industries: Software development

Question

Tokenize a line of plain English into word tokens. A word is a maximal run of letters (A-Z, a-z) and apostrophes ('), where an apostrophe only counts as part of a word if it sits BETWEEN two letters (so contractions like don't and possessives like cat's stay whole, but leading/trailing quotes are dropped). All other characters (digits, spaces, punctuation) are separators. Lowercase every token. Return the list of word tokens in order.

Implement
tokenize_words(text: str) → list[str]
Examples
in: ["Don't stop, it's fine."]
out: ["don't", "stop", "it's", "fine"]
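
One possible approach is a single left-to-right scan that builds the current word character by character. The sketch below is a minimal illustration, not a reference solution. It assumes "letters" means ASCII A-Z and a-z only, and that only the straight apostrophe (') is treated as an apostrophe, so a typographic ’ acts as a separator.

```python
import string

# Assumption: only ASCII letters count as letters, per the problem statement.
LETTERS = set(string.ascii_letters)


def tokenize_words(text: str) -> list[str]:
    tokens: list[str] = []
    current: list[str] = []  # characters of the word being built
    n = len(text)

    for i, ch in enumerate(text):
        if ch in LETTERS:
            current.append(ch.lower())
        elif (
            ch == "'"
            and 0 < i < n - 1
            and text[i - 1] in LETTERS
            and text[i + 1] in LETTERS
        ):
            # Apostrophe sits between two letters, so it stays in the word.
            current.append(ch)
        else:
            # Any other character is a separator: flush the word, if any.
            if current:
                tokens.append("".join(current))
                current = []

    # Flush a word that runs to the end of the input.
    if current:
        tokens.append("".join(current))
    return tokens


print(tokenize_words("Don't stop, it's fine."))
```

The scan visits each character once, so it runs in O(n) time and uses O(n) extra space for the output list.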
What a strong answer looks like

State your approach and its time/space complexity out loud before you optimize. Handle the edge cases (empty input, leading or trailing apostrophes, consecutive apostrophes, digits inside a word), and say why you chose this over the brute force. Green tests are the floor, not the grade.
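
To make the edge cases concrete, here is a hedged sketch of an alternative that uses a regular expression in place of a manual scan, followed by a few checks. The name tokenize_words_regex is hypothetical, and the pattern assumes ASCII letters and the straight apostrophe only. Each match is lowercased after matching rather than lowercasing the whole input first, so non-ASCII characters cannot turn into ASCII letters.

```python
import re

# One or more letters, optionally followed by groups of (apostrophe + letters).
# This keeps an apostrophe only when letters sit on both sides of it.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def tokenize_words_regex(text: str) -> list[str]:
    # Hypothetical helper name; lowercases each matched word.
    return [match.lower() for match in WORD_PATTERN.findall(text)]


# Edge cases worth saying out loud in the interview:
assert tokenize_words_regex("") == []                           # empty input
assert tokenize_words_regex("'quoted'") == ["quoted"]           # leading/trailing quotes dropped
assert tokenize_words_regex("a''b") == ["a", "b"]               # consecutive apostrophes split
assert tokenize_words_regex("abc123def") == ["abc", "def"]      # digits are separators
assert tokenize_words_regex("Rock'n'Roll") == ["rock'n'roll"]   # multiple inner apostrophes
```

Both the scan and the regex are linear in the input length; the regex is shorter to write, while the explicit scan makes the apostrophe rule easier to narrate step by step.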

Vibe coding: describe the solution in plain language (or narrate it) and the coach grades your approach. Generating runnable code from your description is coming next.
