Swathi-Reddy1408/Etl_Pipeline_With_Airflow

PROJECT DESCRIPTION:

This project builds an ETL pipeline that extracts data from the YouTube Data API, transforms it using PySpark, and loads the result into AWS S3.
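A minimal sketch of the extract step is shown below: pulling the trending-videos chart from the YouTube Data API v3 and flattening the response. The API call (`videos().list` with `chart="mostPopular"`) follows the public API, but the specific fields kept and the function names are assumptions for illustration, not taken from the repository.

```python
def parse_video_items(response: dict) -> list[dict]:
    """Flatten a videos.list API response into plain rows (fields kept are assumed)."""
    rows = []
    for item in response.get("items", []):
        snippet = item.get("snippet", {})
        stats = item.get("statistics", {})
        rows.append({
            "video_id": item.get("id"),
            "title": snippet.get("title"),
            "channel": snippet.get("channelTitle"),
            "views": int(stats.get("viewCount", 0)),
            "likes": int(stats.get("likeCount", 0)),
        })
    return rows

def fetch_trending(api_key: str, region: str = "US", max_results: int = 50) -> list[dict]:
    # Requires google-api-python-client and a YouTube Data API key; imported
    # lazily so parse_video_items stays usable without it.
    from googleapiclient.discovery import build
    youtube = build("youtube", "v3", developerKey=api_key)
    response = youtube.videos().list(
        part="snippet,statistics",
        chart="mostPopular",
        regionCode=region,
        maxResults=max_results,
    ).execute()
    return parse_video_items(response)
```

In an Airflow setting, a task would call `fetch_trending` and write the rows to a staging location for the PySpark transform to pick up.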

TECHNOLOGIES, LIBRARIES, AND APIs USED:

  1. Apache Airflow

  2. YouTube Data API

  3. PySpark

  4. AWS S3

  5. Docker
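Apache Airflow ties the stages above together as a DAG. The sketch below shows one plausible daily layout; the DAG id, task ids, S3 key scheme, and schedule are placeholders assumed for illustration (the repository's actual DAG may differ).

```python
from datetime import datetime

def s3_key_for(ds: str) -> str:
    """Partition the daily raw dump by execution date (hypothetical layout)."""
    return f"raw/youtube_trending/{ds}/videos.json"

try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="youtube_trending_etl",      # assumed DAG id
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                  # Airflow 2.4+; older versions use schedule_interval
        catchup=False,
    ) as dag:
        # Callables are stubs here; in the real pipeline they would call the
        # API extractor, submit the PySpark job, and upload to S3.
        extract = PythonOperator(task_id="extract", python_callable=lambda **c: None)
        transform = PythonOperator(task_id="transform", python_callable=lambda **c: None)
        load = PythonOperator(task_id="load", python_callable=lambda **c: None)
        extract >> transform >> load
except ImportError:
    # Airflow not installed in this environment; the DAG above is illustrative.
    pass
```

Running Airflow inside Docker (as the technology list suggests) keeps the scheduler, webserver, and workers reproducible across machines.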

PROJECT ARCHITECTURE:

(Architecture diagram: the original README embeds an image here showing the Airflow-orchestrated flow from the YouTube Data API through PySpark into AWS S3, with the stack running in Docker.)
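The transform-and-load legs can be sketched as a PySpark job that reads the raw API dump, derives a simple engagement metric, and writes the result to S3. Column names mirror the YouTube API response; the metric, output bucket, and paths are assumptions for illustration.

```python
def engagement_ratio(views: int, likes: int) -> float:
    """Likes per view; a simple illustrative metric (assumed, not from the repo)."""
    return likes / views if views else 0.0

def run_spark_job(input_path: str, output_bucket: str) -> None:
    # Requires pyspark plus AWS credentials and the s3a connector; imported
    # lazily so engagement_ratio stays usable without Spark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("youtube_transform").getOrCreate()
    df = spark.read.json(input_path)
    out = (
        df
        .withColumn("view_count", F.col("statistics.viewCount").cast("long"))
        .withColumn("like_count", F.col("statistics.likeCount").cast("long"))
        .withColumn("engagement", F.col("like_count") / F.col("view_count"))
        .select("id", "snippet.title", "view_count", "like_count", "engagement")
    )
    out.write.mode("overwrite").json(f"s3a://{output_bucket}/transformed/")
    spark.stop()
```

Writing through the `s3a://` scheme assumes the Hadoop AWS connector is on Spark's classpath, which the project's Docker image would typically provide.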

The Medium article below explains in detail how this ETL pipeline works. Happy Coding!! https://medium.com/@swathireddythokala16/youtube-trend-analysis-pipeline-etl-with-airflow-spark-s3-and-docker-85a7d76992eb
