Project information
- Category: Machine Learning
- Tools: Flask, Python, XGBoost, MLOps
- F1 Score: 86.7% (XGBoost)
- Project URL: https://github.com/Mazenasag/End-to-end-Detecting-Card-Fraud
Card Fraud Detection System 🕵️‍♂️💳
This project implements a robust end-to-end machine learning pipeline for detecting fraudulent credit card transactions.
🔍 Overview
- ⚙️ Modular pipeline for preprocessing, training, and evaluation
- 🗂️ YAML-based configuration for cleaner code and experiments
- 📦 DVC for versioning datasets and pipeline stages
- 🚀 GitHub Actions for CI/CD automation
🗃️ Dataset Overview
Dataset: Credit Card Fraud Detection on Kaggle
Transactions: 284,807
Fraudulent: 492 (0.172%)
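All features are anonymized using PCA except 'Time' and 'Amount'; 'Class' is the target label. Because fraud makes up only 0.172% of transactions, accuracy says little about how well fraud is detected, so the area under the precision-recall curve (AUPRC) is prioritized over accuracy.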
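To make the imbalance concrete, the sketch below uses the counts above to compute the accuracy of a trivial classifier that labels every transaction as legitimate. It also includes a small, simplified average-precision function, a common estimate of AUPRC. The function name and the toy labels and scores are illustrative assumptions, not taken from the project's code.

```python
# Counts from the dataset overview above.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492

fraud_rate = FRAUD_TRANSACTIONS / TOTAL_TRANSACTIONS
# A classifier that always predicts "legitimate" is right on every non-fraud row.
baseline_accuracy = 1 - fraud_rate


def average_precision(y_true, scores):
    """Average of precision@k over the ranks k where a positive is found.

    Simplified illustration: tied scores are not treated specially.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy example (made-up labels and scores, for illustration only).
toy_labels = [0, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.90]
toy_ap = average_precision(toy_labels, toy_scores)

print(f"Fraud rate:            {fraud_rate:.4%}")
print(f"Baseline accuracy:     {baseline_accuracy:.4%}")
print(f"Toy average precision: {toy_ap:.2f}")
```

The always-legitimate baseline scores above 99.8% accuracy while catching no fraud at all, which is why a precision-recall metric is the more informative choice here.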
🔁 Handling Class Imbalance
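SMOTE is applied to the training set only, so the test set keeps the original class distribution:
- Original training set: 227,845 × 31 (rows × columns)
- Post-SMOTE training set: 442,012 × 30 (rows × columns)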
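SMOTE (Synthetic Minority Over-sampling Technique) creates new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The project's actual SMOTE call is not shown here, so the following is only a minimal pure-Python sketch of that interpolation idea. The function `smote_sketch`, its parameters, and the toy points are hypothetical and are not the library implementation.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic minority samples by interpolating toward neighbours.

    Simplified illustration of the SMOTE idea, not a library implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two features (made-up values).
fraud_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_sketch(fraud_points, n_synthetic=6)

for point in new_points:
    print(tuple(round(v, 3) for v in point))
```

Because every synthetic point lies on a segment between two real minority samples, oversampling must happen after the train/test split; otherwise information from test samples could leak into the training data.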
📁 Project Structure
CardFraud/
├── config/ # YAML configs
├── src/CardFraud/ # All modules
├── research/ # EDA Notebooks
└── .github/workflows/ # CI/CD
📦 DVC Versioning
- Version control for large files
- Tracks training outputs so pipeline runs are reproducible
⚙️ CI/CD Pipeline
Uses GitHub Actions to trigger ML workflows on push/merge events.
📊 Model Results Summary
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
Logistic Regression | 0.9803 | 0.0742 | 0.9072 | 0.1371 |
Random Forest | 0.9995 | 0.8804 | 0.8350 | 0.8571 |
Decision Tree | 0.9975 | 0.3878 | 0.7835 | 0.5187 |
XGBoost | 0.9995 | 0.8586 | 0.8763 | 0.8673 |
🏆 Best Model: XGBoost
XGBoost achieves the highest F1 score (0.8673), narrowly ahead of Random Forest (0.8571); both reach 0.9995 accuracy.
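As a sanity check on the results table, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below copies the precision, recall, and reported F1 values from the table; the helper name `f1_score` and the dictionary layout are illustrative choices, not the project's evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table.
results = {
    "Logistic Regression": (0.0742, 0.9072, 0.1371),
    "Random Forest": (0.8804, 0.8350, 0.8571),
    "Decision Tree": (0.3878, 0.7835, 0.5187),
    "XGBoost": (0.8586, 0.8763, 0.8673),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in results.items()}
best_model = max(recomputed, key=recomputed.get)

for name, (_, _, reported) in results.items():
    print(f"{name:<20} reported={reported:.4f} recomputed={recomputed[name]:.4f}")
print("Best by F1:", best_model)
```

The recomputed values agree with the table up to rounding of the inputs. Logistic Regression shows why F1 matters here: its recall is the highest of the four models, but its very low precision pulls the harmonic mean down sharply.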
🚀 How to Run
git clone https://github.com/Mazenasag/End-to-end-Detecting-Card-Fraud
cd End-to-end-Detecting-Card-Fraud
pip install -r requirements.txt
python src/CardFraud/pipeline/main.py
✍️ Author
Developed by Mazen Asag