ML Fraud Detection (Imbalanced Classes)

How to train an AI to find needles in a haystack of legitimate transactions.

The idea

Banks use machine learning to block stolen credit cards in real time. But training these models is notoriously difficult because of class imbalance. In a typical dataset, something like 99.9% of transactions are perfectly legitimate, and only 0.1% are fraud. If you train a naive model on this data, it will quickly learn a "genius" trick: just guess "Legitimate" every single time! It will score 99.9% accuracy and be completely useless, because it catches zero fraud. To actually catch fraud, we must artificially balance the dataset or heavily penalize the model for missing the rare fraudulent events.
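The accuracy trap described above can be checked with a few lines of arithmetic. This is a minimal sketch on a made-up dataset of 10,000 transactions (the size and the label encoding 0 = Legitimate, 1 = Fraud are assumptions for illustration), with a "model" that always answers Legitimate.

```python
# Hypothetical dataset: 10,000 transactions, 0.1% of them fraud.
# Labels: 0 = Legitimate, 1 = Fraud.
y_true = [1] * 10 + [0] * 9_990

# The "naive" model: always guess Legitimate.
y_pred = [0] * len(y_true)

# Accuracy: share of all transactions labelled correctly.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall: share of actual fraud that was caught.
fraud_caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
recall = fraud_caught / sum(y_true)

print(f"Accuracy: {accuracy:.1%}")
print(f"Fraud caught (recall): {recall:.1%}")
```

The accuracy looks excellent while the recall is zero, which is why accuracy alone says almost nothing on data this lopsided.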

Step 1: The Raw Data. In this simplified picture, out of 100 transactions, 99 are Legitimate (Green) and 1 is Fraud (Orange).

How it works (SMOTE & Class Weights)

We cannot deploy a model that just guesses "Legitimate". We have two main strategies to force the model to care about the minority class:

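Synthetic Minority Over-sampling Technique (SMOTE): We generate synthetic but realistic fraud examples by interpolating between real fraud cases and their nearest fraud neighbors, until the training data has, for example, a 50/50 split of Fraud and Legitimate. This is applied to the training set only, never the test set, so that evaluation still reflects the real imbalance. With plenty of fraud examples to learn from, the model can no longer ignore the minority class.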
Class Weights: We tell the algorithm that making a mistake on a Legitimate transaction costs $1, but missing a Fraud transaction costs $100. The model will become hyper-sensitive to fraud.
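To make the SMOTE idea concrete, here is a minimal, standard-library sketch of the interpolation step. It is not a full SMOTE implementation: the function name `smote_like_oversample`, the two features (amount, hour of day), and all data values are hypothetical. In practice you would more likely use a library version, such as the `SMOTE` class in the imbalanced-learn package.

```python
import math
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Random point on the line segment between base and neighbor
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Hypothetical training data: (amount, hour_of_day) for known fraud cases
fraud = [(900.0, 3.0), (950.0, 2.0), (870.0, 4.0), (990.0, 1.0)]
legit_count = 20  # number of legitimate training examples (assumed)

# Generate enough synthetic fraud to reach a 50/50 split
new_fraud = smote_like_oversample(fraud, n_new=legit_count - len(fraud))
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), "fraud vs", legit_count, "legitimate")
```

Because every synthetic point lies between two real fraud cases, the new examples stay inside the region where fraud was actually observed, rather than being exact copies of existing rows.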

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

# Assumes X_train, y_train, X_test, y_test already exist,
# with labels 0 = Legitimate and 1 = Fraud.

# BAD: Naive training on imbalanced data
model = RandomForestClassifier()
model.fit(X_train, y_train)  # Tends to predict 'Legitimate' almost every time

# GOOD: Using class weights to penalize missing fraud
# '0' is Legitimate (weight 1), '1' is Fraud (weight 100)
model = RandomForestClassifier(class_weight={0: 1, 1: 100})
model.fit(X_train, y_train)

# Better metrics than 'Accuracy'
# Recall tells us: "Out of all actual fraud, how much did we catch?"
y_pred = model.predict(X_test)
print("Recall:", recall_score(y_test, y_pred))

Cost

By forcing the model to be hyper-sensitive to fraud, you increase the False Positive Rate. The model will start flagging legitimate transactions (like buying a coffee in a new city) as fraud, declining the user's card. This creates massive friction and angry customer support calls. Fraud detection is a constant tug-of-war between catching bad guys and annoying good guys.
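The tug-of-war can be seen on a toy example. The sketch below uses invented fraud scores and labels, and it makes the model "more sensitive" by lowering the score threshold at which a transaction is flagged. That is a stand-in chosen for illustration; raising the fraud class weight pushes in the same direction. The helper `outcomes_at` is hypothetical.

```python
# Hypothetical (fraud_score, true_label) pairs; 1 = Fraud, 0 = Legitimate.
scored = [
    (0.95, 1), (0.80, 1), (0.40, 1),
    (0.70, 0), (0.45, 0), (0.30, 0), (0.20, 0),
    (0.10, 0), (0.05, 0), (0.02, 0),
]

def outcomes_at(threshold, scored):
    """Return (fraud_caught, legit_declined) when flagging scores >= threshold."""
    caught = sum(1 for s, y in scored if s >= threshold and y == 1)
    declined = sum(1 for s, y in scored if s >= threshold and y == 0)
    return caught, declined

# Lower thresholds mean a more sensitive (more trigger-happy) model.
for threshold in (0.9, 0.5, 0.25):
    caught, declined = outcomes_at(threshold, scored)
    print(f"threshold={threshold}: caught {caught}/3 fraud, "
          f"declined {declined}/7 legitimate customers")
```

In this toy data, catching all three frauds also means declining three legitimate customers, which is exactly the friction described above.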

Watch out for

Accuracy is a lie: Never use "Accuracy" as a metric for fraud. Use Recall (how many frauds did we catch) and Precision (when we yelled "Fraud!", how often were we right?). You usually optimize a combined metric like the F1-Score or the Area Under the Precision-Recall Curve (PR-AUC).
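As a worked example of these metrics, the sketch below computes precision, recall, and F1 by hand on invented predictions (1,000 transactions, 10 of them fraud; the model flags 20 and is right on 8). The function name and the data are assumptions for illustration. scikit-learn offers equivalents such as `precision_score`, `recall_score`, `f1_score`, and `average_precision_score`.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class (1 = Fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical results: 10 fraud cases, 990 legitimate ones.
# The model catches 8 frauds, misses 2, and wrongly flags 12 legitimate cards.
y_true = [1] * 10 + [0] * 990
y_pred = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1:        {f1:.3f}")
```

Accuracy stays near 99% even though more than half of the fraud alerts are false alarms, while precision, recall, and F1 expose both kinds of mistakes.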