Fundamental algorithms like K-Means are just math loops on vectors.
Many classic machine learning algorithms need no heavy framework; at their core they are short loops over vectors. K-Means clustering, for example, partitions a set of points into K clusters.
The standard procedure (Lloyd's algorithm) is remarkably simple. Starting from K initial centroids: 1) Assign every point to the closest centroid using Euclidean distance. 2) Move each centroid to the mean of all points assigned to it. 3) Repeat until the centroids stop moving or an iteration limit is reached.
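import numpy as np

def kmeans(X, k, max_iters=10):
    X = np.asarray(X, dtype=float)
    # Initialize randomly: pick k distinct points of X as starting centroids
    idx = np.random.choice(len(X), size=k, replace=False)
    centroids = X[idx]
    for _ in range(max_iters):
        # 1. Assign clusters: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for x in X:
            dists = [np.linalg.norm(x - c) for c in centroids]  # Euclidean distance
            best_k = dists.index(min(dists))
            clusters[best_k].append(x)
        # 2. Update centroids: mean of each cluster's points
        #    (an empty cluster keeps its previous centroid to avoid NaN)
        new_centroids = np.array([
            np.mean(cl, axis=0) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ])
        # 3. Stop once the centroids no longer move
        if np.allclose(centroids, new_centroids):
            break  # Stable!
        centroids = new_centroids
    return clusters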
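The same assign-and-update steps can also be expressed with NumPy broadcasting instead of an inner Python loop over points. The following is a minimal sketch, not a reference implementation: the name kmeans_vectorized, the seed parameter, the (labels, centroids) return value, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def kmeans_vectorized(X, k, max_iters=10, seed=0):
    """Hypothetical vectorized variant; returns (labels, centroids)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)  # seed is an assumed parameter, for repeatability
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Pairwise Euclidean distances, shape (n_points, k), via broadcasting
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # 1. Assign each point the index of its nearest centroid
        labels = dists.argmin(axis=1)
        # 2. Move each centroid to the mean of its points;
        #    an empty cluster keeps its previous centroid
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # 3. Stop once the centroids no longer move
        if np.allclose(centroids, new_centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy data (assumed for illustration): two well-separated groups of points
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = kmeans_vectorized(points, k=2)
print(labels)
print(centroids)
```

Returning an integer label per point rather than lists of points is a design choice: it keeps everything in arrays, which makes the update step a masked mean. Which group receives label 0 depends on the random initialization.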