String matching (KMP)

Search for a pattern in a text without ever stepping backward.

The idea

Naive string matching is O(N×M). If a mismatch occurs, it shifts the pattern by just 1 character and starts comparing from the beginning again.

Knuth-Morris-Pratt (KMP) uses an LPS (Longest Prefix Suffix) array. If we mismatch at pattern index j, the LPS tells us exactly how much of the prefix we've already matched, allowing us to safely jump j backward without moving the text pointer i backward. It runs in O(N+M).

LPS Array: [0, 0, 1, 2, 0] (for "ababd")
Click Step to compare characters.

How it works (KMP Search)

def kmp_search(text, pat):
    lps = compute_lps(pat)
    i, j = 0, 0
    res = []
    
    while i < len(text):
        if pat[j] == text[i]:
            i += 1
            j += 1
            if j == len(pat):
                res.append(i - j)
                j = lps[j - 1]
        else:
            if j != 0:
                j = lps[j - 1] # The magic jump!
            else:
                i += 1
    return res