PROJECT DESCRIPTION:
This project builds an ETL pipeline that extracts data from the YouTube Data API, transforms it with PySpark, and loads the result into AWS S3.
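As a rough illustration of the extract step, here is a minimal sketch and not the project's actual code. It assumes the pipeline reads the "most popular" chart through the YouTube Data API v3 `videos.list` method. The helper names (`build_trending_url`, `flatten_item`, `s3_key`), the output column names, and the S3 key layout are all hypothetical and chosen only for this example. It uses only the standard library and makes no network calls.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 `videos.list` method.
VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def build_trending_url(api_key, region_code="US", max_results=50):
    """Build a videos.list request URL for the most popular videos chart."""
    params = {
        "part": "snippet,statistics",
        "chart": "mostPopular",
        "regionCode": region_code,
        "maxResults": max_results,
        "key": api_key,
    }
    return f"{VIDEOS_ENDPOINT}?{urlencode(params)}"


def flatten_item(item):
    """Flatten one response item into a flat row (column names are assumptions)."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    return {
        "video_id": item.get("id"),
        "title": snippet.get("title"),
        "channel_title": snippet.get("channelTitle"),
        "published_at": snippet.get("publishedAt"),
        # The API returns counts as strings; cast them to integers here.
        "view_count": int(stats.get("viewCount", 0)),
        "like_count": int(stats.get("likeCount", 0)),
    }


def s3_key(region_code, run_date):
    """Build a date-partitioned S3 object key (layout is an assumption)."""
    return f"youtube/trending/region={region_code}/date={run_date}/videos.csv"
```

Rows shaped like the output of `flatten_item` could then be handed to the PySpark transform step, and the result written under a key such as `s3_key("US", "2024-01-01")`. An API key from the Google Cloud console is needed to actually send the request.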
USED TECHNOLOGIES, LIBRARIES AND APIs:
- Apache Airflow
- YouTube Data API
- PySpark
- AWS S3
- Docker
 
PROJECT ARCHITECTURE:
The following Medium article explains in detail how this ETL pipeline works. Happy coding! https://medium.com/@swathireddythokala16/youtube-trend-analysis-pipeline-etl-with-airflow-spark-s3-and-docker-85a7d76992eb
