5 min read
Credit Card Fraud Detection: EDA & Rules Engine

Credit card fraud is a massive problem for financial institutions and consumers alike. In this analysis, I’ll explore a dataset of 1.3 million credit card transactions to understand fraud patterns and build a simple rules-based detection system.

Data Overview

The dataset contains 1,296,675 credit card transactions from January 2019 to June 2020. Each transaction includes details like merchant, category, amount, and whether it was flagged as fraud.

from datasets import load_dataset
import polars as pl

dataset = load_dataset("pointe77/credit-card-transaction", split="train")
df = pl.from_arrow(dataset.data.table)

# Parse transaction time
df = df.with_columns(
    pl.col('trans_date_trans_time').str.strptime(pl.Datetime, format="%Y-%m-%d %H:%M:%S").alias('transaction_time')
)

print(f"Total transactions: {len(df):,}")
print(f"Columns: {df.columns}")
print(f"Fraud rate: {df['is_fraud'].mean() * 100:.2f}%")

Key Dataset Stats:

  • Total transactions: 1,296,675
  • Time range: 2019-01-01 to 2020-06-21
  • Overall fraud rate: 0.58%

Exploratory Data Analysis

Amount Distribution

One of the most striking patterns: fraud transactions are significantly higher in value.

fraud_df = df.filter(pl.col('is_fraud') == 1)
legit_df = df.filter(pl.col('is_fraud') == 0)

avg_fraud_amount = fraud_df['amt'].mean()
avg_legit_amount = legit_df['amt'].mean()

print(f"Average fraud transaction: ${avg_fraud_amount:.2f}")
print(f"Average legitimate transaction: ${avg_legit_amount:.2f}")
Transaction TypeAverage Amount
Fraud$531.32
Legitimate$67.67

Finding: Fraudulent transactions are nearly 8x higher on average than legitimate ones.

Category Analysis

category_fraud = (
    df.group_by('category')
    .agg(
        pl.len().alias('total_transactions'),
        pl.col('is_fraud').sum().alias('fraud_count'),
    )
    .with_columns(
        (pl.col('fraud_count') / pl.col('total_transactions') * 100).alias('fraud_rate')
    )
    .sort('fraud_rate', descending=True)
)

print(category_fraud)

Fraud Rate by Category:

CategoryTotal TransactionsFraud CountFraud Rate
shopping_net97,5431,7131.76%
misc_net63,2879151.45%
grocery_pos123,6381,7431.41%
shopping_pos116,6728430.72%
gas_transport131,6596180.47%
misc_pos79,6552500.31%
grocery_net45,4521340.29%
travel40,5071160.29%
entertainment94,0142330.25%
personal_care90,7582200.24%
kids_pets113,0352390.21%
food_dining91,4611510.17%
home123,1151980.16%
health_fitness85,8791330.15%

Finding: Online shopping (shopping_net) has the highest fraud rate at 1.76%, nearly 12x higher than the lowest category (health_fitness at 0.15%).

Time-Based Patterns

# Fraud by Hour of Day
fraud_by_hour = (
    df.with_columns(
        pl.col('transaction_time').dt.hour().alias('hour_of_day')
    )
    .group_by('hour_of_day')
    .agg(
        pl.len().alias('total_transactions'),
        pl.col('is_fraud').sum().alias('fraud_count')
    )
    .with_columns(
        (pl.col('fraud_count') / pl.col('total_transactions') * 100).alias('fraud_rate')
    )
    .sort('hour_of_day')
)

print("Fraud by Hour of Day:")
print(fraud_by_hour)

Key Insight - Late Night Fraud:

Time PeriodFraud Rate
10 PM - 11 PM2.88%
11 PM - Midnight2.84%
Midnight - 1 AM1.53%
1 AM - 2 AM1.49%
2 AM - 3 AM1.47%
Daytime (6 AM - 9 PM)< 0.15%

Finding: Fraud peaks dramatically between 10 PM and 3 AM, with rates 15-20x higher than daytime hours. This is when fraudsters operate, likely because:

  • Victims are asleep and can’t detect unauthorized charges
  • Less monitoring during off-hours
  • International transactions cross time zones

Day of Week

# Fraud by Day of Week
fraud_by_day = (
    df.with_columns(
        pl.col('transaction_time').dt.weekday().alias('day_of_week')
    )
    .group_by('day_of_week')
    .agg(
        pl.len().alias('total_transactions'),
        pl.col('is_fraud').sum().alias('fraud_count')
    )
    .with_columns(
        (pl.col('fraud_count') / pl.col('total_transactions') * 100).alias('fraud_rate')
    )
    .sort('day_of_week')
)

day_names = {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday', 4: 'Thursday', 5: 'Friday', 6: 'Saturday', 7: 'Sunday'}
fraud_by_day = fraud_by_day.with_columns(
    pl.col('day_of_week').map_elements(lambda x: day_names.get(x, str(x)), return_dtype=pl.String).alias('day_of_week')
)

print(fraud_by_day)
DayFraud Rate
Friday0.71% (highest)
Thursday0.68%
Wednesday0.66%
Saturday0.61%
Tuesday0.58%
Sunday0.49%
Monday0.46% (lowest)

Finding: Friday has the highest fraud rate, while Monday has the lowest. The latter half of the work week sees more fraud.

Building a Fraud Rules Engine

Now let’s build some detection rules and measure their effectiveness.

# Create detection rules
def high_amount_rule(df, threshold=500):
    return df.filter(pl.col('amt') > threshold)

high_risk_categories = ['shopping_net', 'misc_net', 'grocery_pos']

def category_risk_rule(df, high_risk_categories=high_risk_categories):
    return df.filter(pl.col('category').is_in(high_risk_categories))

def outlier_rule(df, percentile=99):
    threshold = df['amt'].quantile(percentile / 100)
    return df.filter(pl.col('amt') > threshold)

Rule 1: High Amount (>$500)

high_amount_flagged = high_amount_rule(df, threshold=500)
fraud_caught = high_amount_flagged.filter(pl.col('is_fraud') == 1).height
false_positives = high_amount_flagged.height - fraud_caught

precision = fraud_caught / high_amount_flagged.height * 100
recall = fraud_caught / fraud_count * 100

print(f"High Amount Rule (>$500):")
print(f"  Precision: {precision:.1f}%")
print(f"  Recall: {recall:.1f}%")
print(f"  False positives: {false_positives:,}")
MetricValue
Precision23.3%
Recall48.6%
False Positives11,983

Rule 2: High-Risk Categories

high_risk_category_flagged = category_risk_rule(df, high_risk_categories=high_risk_categories)
fraud_caught = high_risk_category_flagged.filter(pl.col('is_fraud') == 1).height
false_positives = high_risk_category_flagged.height - fraud_caught

precision = fraud_caught / high_risk_category_flagged.height * 100
recall = fraud_caught / fraud_count * 100

print(f"High Risk Category Rule:")
print(f"  Precision: {precision:.1f}%")
print(f"  Recall: {recall:.1f}%")
print(f"  False positives: {false_positives:,}")
MetricValue
Precision1.5%
Recall58.2%
False Positives280,097

Rule 3: Outlier Detection (Top 1%)

outlier_flagged = outlier_rule(df, percentile=99)
fraud_caught = outlier_flagged.filter(pl.col('is_fraud') == 1).height
false_positives = outlier_flagged.height - fraud_caught

precision = fraud_caught / outlier_flagged.height * 100
recall = fraud_caught / fraud_count * 100

print(f"Outlier Rule (top 1% amount):")
print(f"  Precision: {precision:.1f}%")
print(f"  Recall: {recall:.1f}%")
print(f"  False positives: {false_positives:,}")
MetricValue
Precision27.8%
Recall48.0%
False Positives9,367

Key Takeaways

  1. Amount is the strongest signal: The outlier rule achieved the best precision (27.8%), confirming that fraudulent transactions are disproportionately high-value.

  2. Time matters: Late night hours (10 PM - 3 AM) have 15-20x higher fraud rates than daytime. Any real fraud system should weight time heavily.

  3. Category helps but creates noise: High-risk categories catch more fraud (58% recall) but with massive false positives (280K). Better as a feature weight than a standalone rule.

  4. Simple rules have limits: Even the best rule only catches ~50% of fraud with ~25% precision. A production system would need ML models to improve these numbers.

The data tells a clear story: Fraudsters prefer high-value transactions during off-hours when victims are likely asleep. A smart fraud system should flag high amounts in late-night transactions from high-risk categories.


Analysis performed using Polars on a dataset of 1.3M credit card transactions.