This project analyzes sales and product data with Python and the pandas library to demonstrate practical data handling and exploratory analysis skills.
The analysis explores datasets containing sales transactions and product information, highlighting techniques such as data loading, inspection, cleaning, and summarization.
The goal is to extract insights from raw data and showcase essential Python data analysis capabilities.
- Project Overview
- Files in Repository
- Journey of the Data
- Script Functionality
- Key Learnings
- Recommendations
- Tools and Technologies
- Contact
- Acknowledgement
| File Name | Description |
|---|---|
| script.ipynb | Jupyter Notebook containing the analysis code |
| sales.xlsx | Dataset containing sales records |
| products.xlsx | Dataset containing product information |
| Project_Documentation - Sales Data Analysis.docx | Full project documentation |
- sales.xlsx – Contains individual sales transactions, including product IDs, quantities, and totals.
- products.xlsx – Contains product information such as product ID, name, and category.
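
To illustrate how the two datasets relate, here is a minimal sketch that joins toy stand-ins for the two files on a shared product ID and summarizes by category. The column names (`product_id`, `quantity`, `total`, `name`, `category`) and all values are assumptions made for illustration; the actual workbooks may use different headers.

```python
import pandas as pd

# Toy stand-ins for sales.xlsx and products.xlsx.
# Column names and values are assumed, not taken from the real files.
sales = pd.DataFrame({
    "product_id": [1, 2, 1, 3],
    "quantity": [2, 1, 5, 3],
    "total": [20.0, 15.0, 50.0, 45.0],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "name": ["Pen", "Notebook", "Stapler"],
    "category": ["Stationery", "Stationery", "Office"],
})

# Attach product details to each transaction through the shared ID.
# validate="many_to_one" raises if a product ID is repeated in products.
merged = sales.merge(products, on="product_id", how="left", validate="many_to_one")

# Total quantity and revenue per product category.
summary = merged.groupby("category")[["quantity", "total"]].sum()
print(summary)
```

A left join keeps every sales row even when a product ID has no match, so unmatched transactions show up as missing `name` and `category` values rather than disappearing.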
- Verified data consistency across both datasets.
- Checked for null values, duplicates, and data type mismatches.
- Conducted exploratory analysis to understand data structure and relationships.
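
The checks listed above could look roughly like the following. This is a sketch on small in-memory frames, not the notebook's actual code; the column names and the deliberately flawed sample rows are assumptions chosen so that each check has something to find.

```python
import pandas as pd

# Assumed sample data with one null, one duplicate row, and one unknown product ID.
sales = pd.DataFrame({
    "product_id": [1, 2, 2, 4],
    "quantity": [2, 1, 1, None],
    "total": [20.0, 15.0, 15.0, 30.0],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "name": ["Pen", "Notebook", "Stapler"],
    "category": ["Stationery", "Stationery", "Office"],
})

# Null values per column.
null_counts = sales.isnull().sum()

# Fully duplicated rows (every column identical to an earlier row).
duplicate_rows = int(sales.duplicated().sum())

# Data type check: the join key should have the same dtype in both frames.
ids_same_dtype = sales["product_id"].dtype == products["product_id"].dtype

# Consistency across datasets: product IDs sold but missing from products.
orphan_mask = ~sales["product_id"].isin(products["product_id"])
orphan_ids = sales.loc[orphan_mask, "product_id"].unique().tolist()

print(null_counts)
print("duplicate rows:", duplicate_rows)
print("key dtypes match:", ids_same_dtype)
print("product IDs missing from products:", orphan_ids)
```

Note that a missing value in an integer column makes pandas store that column as floating point, which is one common source of the data type mismatches mentioned above.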
- pandas – For data analysis and manipulation.
- os – For managing file paths and directories.
import pandas as pd
import os
sales_data = pd.read_excel('sales.xlsx')
products_data = pd.read_excel('products.xlsx')
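
The loading code above relies on the notebook's working directory containing both files. One possible way to put `os` to work here is to build the paths explicitly and check that each file exists before reading it. The helper names `dataset_path` and `load_if_present` and the `DATA_DIR` variable are hypothetical, introduced only for this sketch.

```python
import os

import pandas as pd

# Assumed location of the data files; adjust to the repository layout.
DATA_DIR = os.getcwd()


def dataset_path(filename, data_dir=DATA_DIR):
    """Build a platform-independent path to a data file."""
    return os.path.join(data_dir, filename)


def load_if_present(filename, data_dir=DATA_DIR):
    """Read an Excel file into a DataFrame, or return None if it is missing."""
    path = dataset_path(filename, data_dir)
    if not os.path.isfile(path):
        print(f"File not found: {path}")
        return None
    # Reading .xlsx files requires an Excel engine such as openpyxl.
    return pd.read_excel(path)


sales_data = load_if_present("sales.xlsx")
products_data = load_if_present("products.xlsx")
```

Returning `None` for a missing file keeps the notebook from failing with a long traceback, at the cost of requiring a check before the frames are used.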