Diamond Prediction Overview

Project Information

  • Category: Regression / ML
  • Tools: Python, Flask, CatBoost, XGBoost
  • Best R² Score: 0.9792 (CatBoost Regressor)
  • Project URL: https://github.com/Mazenasag/Gemstone_Price_Prediction_

Diamond Price Prediction 💎📈

This project builds a regression model to accurately predict diamond prices based on various features such as carat, cut, color, and clarity.

📌 Problem Statement

Predict the price of diamonds using multiple features for improved valuation and pricing decisions.

📊 Data Collection

Dataset: Kaggle - Playground Series Season 3 Episode 8

🧬 Features

  • carat: Weight of the diamond
  • cut: Quality of the diamond cut
  • color: Diamond color grade (D–J)
  • clarity: Purity level based on internal flaws
  • depth, table: Proportion measurements, expressed as percentages
  • x, y, z: Physical dimensions (length, width, and height in mm)
  • price: Target variable
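
The three categorical features (cut, color, clarity) are graded on ordered scales, so one common way to feed them to a regressor is to map each grade to an integer rank. The sketch below illustrates that idea with the standard library only. The grade orderings, the `encode_row` helper, and the sample row are assumptions for illustration; the project's own preprocessing may differ (CatBoost, for instance, can also consume categorical columns directly).

```python
# Assumed grade orderings, worst to best (standard grading scales;
# the project's actual encoding may differ).
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
COLOR_ORDER = ["J", "I", "H", "G", "F", "E", "D"]
CLARITY_ORDER = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]

ORDERINGS = {"cut": CUT_ORDER, "color": COLOR_ORDER, "clarity": CLARITY_ORDER}


def encode_row(row):
    """Return a copy of `row` with categorical grades replaced by integer ranks."""
    encoded = dict(row)
    for column, order in ORDERINGS.items():
        # list.index raises ValueError if the grade is not in the assumed scale
        encoded[column] = order.index(row[column])
    return encoded


# Hypothetical sample row, not taken from the dataset
sample = {"carat": 1.52, "cut": "Premium", "color": "F", "clarity": "VS2",
          "depth": 62.2, "table": 58.0, "x": 7.27, "y": 7.33, "z": 4.55}
encoded_sample = encode_row(sample)
print(encoded_sample)
```

Numeric columns pass through unchanged, and an unknown grade fails loudly instead of being silently mapped.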

📈 Model Performance Comparison

Model                 MSE        RMSE      MAE      Train R²   Test R²   Adj. R²
CatBoost Regressor    3.36e+05   579.96    295.48   0.9827     0.9792    0.9792
XGBRegressor          3.42e+05   585.44    296.96   0.9838     0.9788    0.9788
Random Forest         3.71e+05   609.35    310.42   0.9968     0.9770    0.9770
KNN Regressor         4.49e+05   670.78    350.55   0.9815     0.9722    0.9721
Decision Tree         7.00e+05   837.25    422.66   1.0000     0.9566    0.9566
Linear Regression     1.01e+06   1006.60   671.58   0.9366     0.9373    0.9373
Ridge                 1.01e+06   1006.60   671.61   0.9366     0.9373    0.9373
Lasso                 1.01e+06   1006.87   672.99   0.9366     0.9373    0.9372
AdaBoost              1.94e+06   1393.41   971.55   0.8827     0.8798    0.8798
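
For reference, the metrics in the comparison above follow standard definitions. The sketch below computes them with the standard library only; the `regression_metrics` helper and the toy prices are illustrative assumptions, not the project's evaluation code or data. Adjusted R² uses the usual correction 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of samples and p the number of features.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Compute MSE, RMSE, MAE, R2 and adjusted R2 for one set of predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "R2": r2, "Adj_R2": adj_r2}


# Toy prices for illustration only (not from the project's data)
y_true = [500.0, 1500.0, 3000.0, 4500.0, 8000.0]
y_pred = [600.0, 1400.0, 3100.0, 4300.0, 8200.0]
metrics = regression_metrics(y_true, y_pred, n_features=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RMSE is the square root of MSE, which is why the two columns rank the models identically. On a large test set with only a handful of features, adjusted R² is almost equal to test R², as the table shows.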

🏆 Best Model: CatBoost Regressor

Achieved the highest test R² (0.9792) and the lowest MSE, RMSE, and MAE of all models compared.

🔍 Insights & Findings

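  • The tree ensembles (CatBoost, XGBoost, Random Forest) clearly outperform the linear models: test R² of about 0.977 to 0.979 versus 0.937. AdaBoost is the exception, with the lowest test R² (0.8798).
  • CatBoost and XGBoost showed the best test-set performance.
  • Decision Tree overfit (train R² 1.0000 vs. test R² 0.9566); Linear Regression, Ridge, and Lasso all plateau at test R² ≈ 0.937, which suggests they fail to capture nonlinearities.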
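
One simple way to quantify the overfitting noted above is the gap between train and test R². The sketch below computes that gap from the scores reported in the model comparison; the `generalization_gaps` helper is a hypothetical name introduced here, not part of the project.

```python
# (train R2, test R2) copied from the model comparison table
R2_SCORES = {
    "CatBoost Regressor": (0.9827, 0.9792),
    "XGBRegressor": (0.9838, 0.9788),
    "Random Forest": (0.9968, 0.9770),
    "KNN Regressor": (0.9815, 0.9722),
    "Decision Tree": (1.0000, 0.9566),
    "Linear Regression": (0.9366, 0.9373),
    "Ridge": (0.9366, 0.9373),
    "Lasso": (0.9366, 0.9373),
    "AdaBoost": (0.8827, 0.8798),
}


def generalization_gaps(scores):
    """Return (model, train-minus-test R2) pairs, largest gap first."""
    gaps = {name: round(train - test, 4) for name, (train, test) in scores.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


gaps = generalization_gaps(R2_SCORES)
for name, gap in gaps:
    print(f"{name:20s} {gap:+.4f}")
```

A large positive gap points to a model that memorized the training data (Decision Tree, then Random Forest), while a gap near zero combined with a low score, as for the linear models, points to underfitting.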

🛠️ How to Run


# Clone the repository (git creates a folder named after the repo)
git clone https://github.com/Mazenasag/Gemstone_Price_Prediction_.git
cd Gemstone_Price_Prediction_
# Run the Flask app
python app.py

Then open your browser and visit: http://127.0.0.1:5000/

✍️ Author

Developed by Mazen Asag