Credit Card Fraud Detection System

Portfolio Note: This is a portfolio recreation of a production fraud detection system built at Omfys Technologies.

🎯 Overview

Real-time fraud detection that scores 500K+ transactions/day with 92% accuracy and <100ms p95 latency, built on Spark (Scala), XGBoost, and ONNX Runtime.

📊 Key Metrics

  • Volume: 500K+ transactions/day
  • Accuracy: 92%
  • False Positive Rate: <2%
  • Latency: <100ms p95
  • Fraud Prevention: ₹10L+ (over ₹1,000,000) annually

🛠️ Tech Stack

  • Processing: Spark (Scala), Kafka Streams
  • Storage: HBase, HDFS (Parquet), Hive
  • ML: XGBoost, Random Forest, ONNX Runtime
  • ETL: Apache Sqoop
  • Monitoring: Prometheus, Grafana

⚡ Key Features

1. High-Performance Stream Processing

  • Spark jobs written in Scala to run natively on the JVM
  • Exactly-once processing with checkpointing
  • Write-ahead logs for fault tolerance
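
The exactly-once guarantee listed above usually comes from combining a replayable source, a durably checkpointed offset, and an idempotent sink. The toy model below illustrates that idea only; it is plain Python, not Spark's API, and `CheckpointedConsumer`, `process_batch`, and `fail_before_commit` are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointedConsumer:
    """Toy model of offset checkpointing with an idempotent sink (not Spark's API)."""
    committed_offset: int = 0                 # last durably checkpointed position
    sink: dict = field(default_factory=dict)  # keyed by transaction id, so rewrites are harmless

    def process_batch(self, log, batch_size, fail_before_commit=False):
        start = self.committed_offset
        batch = log[start:start + batch_size]
        for txn_id, amount in batch:
            self.sink[txn_id] = amount        # idempotent upsert: replaying changes nothing
        if fail_before_commit:
            return                            # simulated crash: the offset is not advanced
        self.committed_offset = start + len(batch)


log = [("t1", 10.0), ("t2", 20.0), ("t3", 30.0)]
consumer = CheckpointedConsumer()
consumer.process_batch(log, 2, fail_before_commit=True)  # crash after writing, before commit
consumer.process_batch(log, 2)                           # replay the same batch after restart
consumer.process_batch(log, 2)                           # continue with the remaining record
```

After the simulated crash the first batch is replayed, but because the sink is keyed by transaction id, each transaction still appears exactly once in the output.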

2. Advanced Feature Engineering (50+ features)

  • Velocity checks: Transaction frequency patterns
  • Geographical anomalies: Location-based risk
  • Amount deviation: Statistical outlier detection
  • Merchant patterns: Historical analysis
  • Device fingerprinting: Behavioral analysis
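
As a rough illustration of two of the feature families above, the sketch below computes a velocity count and an amount deviation score in plain Python. The function names, the one-hour window, and the use of a z-score are assumptions made for this example; the repository's `FeatureEngineering.scala` may define these features differently.

```python
from statistics import mean, pstdev


def velocity_count(timestamps, now, window_seconds=3600):
    """Count transactions whose timestamp falls in (now - window_seconds, now]."""
    return sum(1 for t in timestamps if now - window_seconds < t <= now)


def amount_zscore(history, amount):
    """Standard deviations between `amount` and the card's historical mean.

    Returns 0.0 when there is too little history or no variance to compare against.
    """
    if len(history) < 2:
        return 0.0
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return (amount - mean(history)) / sigma


recent = velocity_count([0, 100, 3500, 3700], now=3700)  # two transactions in the last hour
deviation = amount_zscore([90.0, 110.0], 200.0)          # 10 standard deviations above the mean
```

A high velocity count or a large absolute z-score would each be one input among the 50+ features, not a fraud decision on its own.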

3. Ensemble ML Models

  • Random Forest + XGBoost stacking
  • ONNX format for 3x faster inference
  • A/B testing framework
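
One common way to build the traffic split behind an A/B testing framework is deterministic hashing, so the same key always reaches the same model. The sketch below assumes a split keyed on a transaction id between a "champion" and a "challenger" model; `assign_variant`, the 10% share, and the salt are hypothetical and not taken from the repository.

```python
import hashlib


def assign_variant(transaction_id: str, treatment_share: float = 0.1, salt: str = "exp-1") -> str:
    """Deterministically map an id to 'challenger' or 'champion'.

    The salt lets separate experiments draw independent splits from the same ids.
    """
    digest = hashlib.sha256(f"{salt}:{transaction_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1)
    return "challenger" if bucket < treatment_share else "champion"


variant = assign_variant("txn-0001")
```

Hashing avoids storing assignments: any worker can recompute the variant from the id alone. Keying on a card or customer id instead would keep each cardholder on one model for the whole experiment.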

4. Low-Latency Profile Store

  • HBase for <10ms p95 lookups
  • 10M+ transaction profiles
  • Bloom filters for efficiency
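
In HBase, Bloom filters are enabled per column family and let a read skip store files that cannot contain the requested row. To show why that helps, here is a minimal Bloom filter in plain Python. It illustrates the data structure only, not HBase's implementation, and the class name, bit count, and hash count are assumptions for this sketch.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: no false negatives, a tunable rate of false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


profiles = BloomFilter()
profiles.add("card-123")
```

A `False` answer from `might_contain` is definitive, so the expensive lookup can be skipped; a `True` answer still requires reading the underlying data.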

📁 Project Structure

fraud-detection-system/
├── src/
│   ├── main/scala/          # Spark Scala code
│   │   ├── FraudDetector.scala
│   │   ├── FeatureEngineering.scala
│   │   └── KafkaStreaming.scala
│   └── python/              # ML training
│       ├── model_training.py
│       └── onnx_conversion.py
├── config/
│   ├── hbase-site.xml
│   └── application.conf
├── build.sbt
├── requirements.txt
└── README.md

🚀 Getting Started

git clone https://github.com/Amanroy666/fraud-detection-system.git
cd fraud-detection-system

# Build Scala project
sbt clean compile assembly

# Install Python dependencies  
pip install -r requirements.txt

📈 Performance Results

| Metric              | Value |
|---------------------|-------|
| Accuracy            | 92%   |
| Precision           | 89%   |
| Recall              | 94%   |
| F1-Score            | 91.5% |
| False Positive Rate | 1.8%  |
| Latency (p95)       | 98ms  |
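
For reference, the F1-score is the harmonic mean of precision and recall. The short check below recomputes it from the precision and recall values shown above; `f1_score` is a helper written for this example, not a function from the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.89, 0.94)  # about 0.914
```

The result is about 0.914, which is within rounding of the reported 91.5% if the precision and recall figures above are themselves rounded.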

👤 Author

Aman Roy - Data Engineer at Omfys Technologies
📧 contactaman000@gmail.com | 💼 LinkedIn | 🐙 @Amanroy666