grigorassorin/car-price-prediction

Car Market Value Estimator and Deal Recommender

This project estimates the value of used cars in Turkey and recommends good car deals based on ML model predictions and depreciation data. It includes a full data pipeline: scraping, cleaning, modeling, and interactive user notebooks.

The main notebooks are the interactive ones: 06 - ..., which estimates the value of the user's current car, and 07 - ..., which helps them find a new (used) vehicle. The rest (01 - 05) were used to collect and clean the data, train the ML models, and gather depreciation data for the cars.



Project Summary


Repository Structure

car_prediction/
│
├── data/
│   ├── raw/
│   │   ├── 2020_turkey_car_market.csv             <- Public dataset (not used in final pipeline)
│   │   ├── 2024_turkey_car_market.csv             <- Dataset generated by "01 - Scrape - 2024 Turkey Used Cars.ipynb", which scrapes Turkish used car websites
│   │   └── 2024_US_craigslist.csv                 <- Not used in the final model. It only generated sample screenshots for comparison in the Capstone PDF
│   ├── processed/
│   │   ├── 2024_turkey_car_market_clean.csv       <- Intermediary dataset, created by "03 - Cleaning.ipynb" after cleaning "2024_turkey_car_market.csv"
│   │   └── 2024_turkey_car_market_ML.csv          <- Main dataset, used by the interactive Notebooks (06.. and 07..); created by "04 - Model Training.ipynb"
│   └── models/
│       ├── depreciation_data/                     <- Depreciation data for Makes and Models generated by "05 - Depreciation Data.ipynb"
│       └── ML_MakeModel/                          <- Trained ML models generated by "04 - Model Training.ipynb" (model + scaler + encoder files)
│
├── notebooks/
│   ├── 01 - Scrape - 2024 Turkey Used Cars.ipynb            <- Output "2024_turkey_car_market.csv"
│   ├── 02 - Scrape - Craigslist US.ipynb                    <- Used only for PDF comparisons
│   ├── 03 - Cleaning.ipynb                                  <- Input "2024_turkey_car_market.csv" scraped data; output "2024_turkey_car_market_clean.csv"
│   ├── 04 - Model Training.ipynb                            <- Input "2024_turkey_car_market_clean.csv"; output "2024_turkey_car_market_ML.csv" and the models in folder "models/ML_MakeModel"
│   ├── 05 - Depreciation Data.ipynb                         <- Input "2024_turkey_car_market_clean.csv"; output the depreciation data in folder "models/depreciation_data"
│   ├── 06 - Input - Car Value Estimation.ipynb ✅           <- Standalone; requires only "2024_turkey_car_market_ML.csv" and the data in the folder "models": "depreciation_data" and "ML_MakeModel"
│   └── 07 - Input - Used Car Deal Recommender.ipynb ✅      <- Standalone; requires only "2024_turkey_car_market_ML.csv" and the data in the folder "models": "depreciation_data" and "ML_MakeModel"
│
├── demos/                                                   <- Screen recordings showing the interactive notebooks (06.. and 07..) in use
│
├── docs/
│   └── Sorin Grigoras Capstone Presentation - [...].pdf    <- PDF presentation used to present my Capstone Project
│
├── .gitignore
├── README.md
├── requirements.txt
└── LICENSE


How to Use

1. Clone the Repo

git clone https://github.com/grigorassorin/car-price-prediction.git
cd car-price-prediction

2. Install Dependencies

pip install -r requirements.txt

3. Run Estimate Your Car's Price Notebook

This notebook will:

  • Ask for user inputs (car specs)
  • Combine ML model predictions with a depreciation lookup
  • Estimate the value of the user's current car based on the available dataset

Demo:

  • A screen recording of this notebook in use is available in the "demos" folder ("Demo - Car Value.mov") and at https://youtu.be/PRyM5Z6U8_A
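
The depreciation lookup mentioned above can be illustrated with a minimal sketch. It is not the notebook's actual code: the function names, the constant compound-rate assumption, and the 50/50 blend with the ML prediction are all hypothetical choices made only for illustration.

```python
from statistics import median


def yearly_depreciation_rate(listings):
    """Estimate a yearly depreciation rate from (age_in_years, price) pairs.

    Assumption: a constant compound rate between the median price of the
    youngest cars and the median price of the oldest cars in the sample.
    """
    prices_by_age = {}
    for age, price in listings:
        prices_by_age.setdefault(age, []).append(price)
    if len(prices_by_age) < 2:
        raise ValueError("need listings for at least two different ages")

    youngest, oldest = min(prices_by_age), max(prices_by_age)
    price_young = median(prices_by_age[youngest])
    price_old = median(prices_by_age[oldest])
    return 1 - (price_old / price_young) ** (1 / (oldest - youngest))


def blend_estimate(ml_price, reference_price, reference_age, car_age, rate,
                   ml_weight=0.5):
    """Blend an ML prediction with a depreciation-based estimate.

    The depreciation-based estimate ages a reference price by the given
    yearly rate; ml_weight is a hypothetical tuning knob.
    """
    depreciated = reference_price * (1 - rate) ** (car_age - reference_age)
    return ml_weight * ml_price + (1 - ml_weight) * depreciated
```

For example, listings priced around 1000 at one year old and 810 at three years old imply a rate of roughly 10% per year under this assumption.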

4. Run Find Best Used Car Notebook

This notebook will:

  • Ask for user inputs describing the desired car: any combination of make / model / transmission / body type / year / kilometers / engine size / price / color etc., used to subset the main dataset
  • Ask how many years the user expects to keep the new car; this is used to estimate a resale price based on the current dataset
  • Look for deals: for each car in the subset, run the ML model to estimate its price, compare that estimate with similar cars and average depreciation estimates to arrive at what the car is worth, then compare this value with the listed price to highlight "deals"
  • Look at cars of the same make and model to estimate the car's resale value at the end of the ownership period
  • List cars in order of Deal + Depreciation
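
The ranking steps above can be sketched as follows. This is a hedged illustration rather than the notebook's implementation: the `Listing` fields, the compound yearly depreciation rate, and the scoring formula (discount minus expected depreciation loss) are assumptions, since the README does not spell out how "Deal + Depreciation" is computed.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    listed_price: float         # asking price in the listing
    estimated_value: float      # what the model thinks the car is worth
    yearly_depreciation: float  # hypothetical rate, e.g. 0.10 for 10% a year


def score(listing, years_owned):
    """Deal size minus the value expected to be lost over the ownership period."""
    deal = listing.estimated_value - listing.listed_price
    resale = listing.estimated_value * (1 - listing.yearly_depreciation) ** years_owned
    depreciation_loss = listing.estimated_value - resale
    return deal - depreciation_loss


def rank_deals(listings, years_owned):
    """Return listings sorted from best to worst combined score."""
    return sorted(listings, key=lambda item: score(item, years_owned), reverse=True)


if __name__ == "__main__":
    cars = [
        Listing("B", listed_price=800, estimated_value=1000, yearly_depreciation=0.30),
        Listing("A", listed_price=900, estimated_value=1000, yearly_depreciation=0.10),
    ]
    for car in rank_deals(cars, years_owned=2):
        print(car.name, round(score(car, 2), 2))
```

In this made-up example, car B has the larger discount but depreciates faster, so car A ranks first over a two-year ownership period.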


ML Modeling

  • One model is trained per Make-Model pair
  • Each model uses RandomForestRegressor, with StandardScaler and one-hot encoding for the features
  • The ML model, encoder, and scaler are saved in data/models/ML_MakeModel/
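
A minimal sketch of per Make-Model training with scikit-learn is shown below. It is an assumption-laden illustration, not the code from "04 - Model Training.ipynb": the column names (`make`, `model`, `price`) and the function name are hypothetical, and the scaler, encoder, and regressor are bundled into a single `Pipeline` here, whereas the project saves them as separate files.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def train_per_make_model(df, numeric_cols, categorical_cols, target="price"):
    """Fit one pipeline per (make, model) group and return them in a dict."""
    models = {}
    for (make, model), group in df.groupby(["make", "model"]):
        preprocess = ColumnTransformer([
            # Scale numeric features such as year or kilometers.
            ("num", StandardScaler(), numeric_cols),
            # One-hot encode categorical features; ignore unseen categories.
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ])
        pipeline = Pipeline([
            ("preprocess", preprocess),
            ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
        ])
        pipeline.fit(group[numeric_cols + categorical_cols], group[target])
        models[(make, model)] = pipeline
    return models
```

Each fitted pipeline could then be written to disk (for example with `joblib.dump`) under a file name derived from the make and model.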

Notebooks 01 - 05

  • There is no need to run notebooks '01 - ...' to '05 - ...' unless you wish to scrape new data.
  • If you do, update the scraping locations in Notebook '01..' and run it to generate new data; then run '03..' to clean the new data, '04..' to retrain the ML models, and '05..' to update the depreciation data.
  • You can then run '06..' and '07..', which will use your newly scraped vehicles.

Demos

The folder "demos" includes screen recordings of me demoing the interactive Notebooks: '06 - ..' and '07 - ..'


Educational Context

This project was originally developed for educational purposes as a Capstone project for the UCLA Extension Certificate in Data Science in June 2024. The PDF presentation I used to present this project can be found in docs/.

The interactive notebooks (06 - ... and 07 - ...) were developed after presenting the project, to provide real-world use of the analysis.


License

This project is licensed under the GNU General Public License v3.0.
