Spotify Data Retrieval is a Python ETL pipeline that extracts song data from Spotify's API using OAuth 2.0, focusing on the user's last 24 hours. It transforms and loads the data into a PostgreSQL database, with daily automation via CRON. The project features a Streamlit interface for visualization and a FastAPI server for data access.

DishenMakwana/spotify-data-retrieval

Spotify ETL Pipeline: Extract, Transform, and Load Data into PostgreSQL

Overview

This repository implements an ETL (Extract, Transform, Load) pipeline that retrieves song data from Spotify's API based on the user's last 24 hours of activity. The data is transformed and loaded into a PostgreSQL database for storage and further analysis. The process is automated to run daily using CRON.
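The transform step in the middle of this pipeline can be sketched as follows. This is an illustrative helper, not the repository's actual code: the field names follow Spotify's "Get Recently Played Tracks" response shape, but the function and column names are assumptions.

```python
def transform_recently_played(response: dict) -> list[dict]:
    """Flatten Spotify's 'recently played' JSON into rows for loading.

    Field names follow Spotify's Get Recently Played Tracks response;
    this helper itself is illustrative, not part of the repository.
    """
    rows = []
    for item in response.get("items", []):
        track = item["track"]
        rows.append({
            "song_name": track["name"],
            "artist_name": track["artists"][0]["name"],
            "played_at": item["played_at"],
            "played_date": item["played_at"][:10],  # YYYY-MM-DD, for daily grouping
        })
    return rows

# Minimal example payload in the shape Spotify returns:
sample = {
    "items": [{
        "track": {"name": "Song A", "artists": [{"name": "Artist A"}]},
        "played_at": "2024-01-01T12:34:56.000Z",
    }]
}
```

Keeping the transform as a pure function of the API response makes it easy to unit-test without network or database access.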

Technologies Used:

  • Spotify API (OAuth 2.0 Authentication)
  • Python (for data extraction, transformation, and loading)
  • PostgreSQL (for database storage)
  • CRON (for scheduling automation)

Requirements

1. Developer Setup on Spotify API

To use the Spotify API, you must first set up a Spotify Developer account and create an application to get your client_id and client_secret.

  1. Go to the Spotify Developer Dashboard (https://developer.spotify.com/dashboard).
  2. Log in with your Spotify account and click on "Create an App".
  3. Provide necessary information (App Name, Description, etc.).
  4. Once the app is created, you’ll have access to:
    • Client ID
    • Client Secret
    • Redirect URI (You’ll need to set up one if it’s not automatically provided)
  5. Save these credentials for later use in the .env file.
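These credentials are used to obtain access tokens from Spotify's token endpoint. As a sketch of how the refresh-token exchange works (the endpoint URL and Basic-auth scheme are Spotify's; the helper function itself is hypothetical):

```python
import base64

TOKEN_URL = "https://accounts.spotify.com/api/token"  # Spotify's token endpoint

def build_refresh_request(client_id: str, client_secret: str, refresh_token: str):
    """Build the pieces of a token-refresh POST (illustrative helper).

    Spotify expects HTTP Basic auth with base64(client_id:client_secret)
    and a form body with grant_type=refresh_token.
    """
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {creds}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    data = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    return TOKEN_URL, headers, data

# POSTing these pieces (e.g. with requests.post(url, headers=headers, data=data))
# returns a JSON body containing a short-lived "access_token".
```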

2. Installing Requirements

Before running the ETL pipeline, you need to install the necessary dependencies. Follow these steps:

Clone the repository:

git clone https://github.com/DishenMakwana/spotify-data-retrieval.git
cd spotify-data-retrieval

Create a virtual environment:

python3 -m venv env

source env/bin/activate

Install the dependencies:

pip install -r requirements.txt

3. Setup Environment Variables

To securely store your credentials and other configuration variables, create a .env file based on the .env.example template.

  1. Copy the example .env file:
cp .env.example .env
  2. Update the .env file with your Spotify credentials and PostgreSQL connection details.
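The exact variable names come from .env.example in the repository; as an illustrative layout only, a filled-in .env typically looks something like:

```shell
# Hypothetical .env layout -- check .env.example for the actual variable names
SPOTIFY_CLIENT_ID=your_client_id
SPOTIFY_CLIENT_SECRET=your_client_secret
SPOTIFY_REDIRECT_URI=http://localhost:8888/callback
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=spotify
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
```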

4. Running the ETL Pipeline

To run the ETL pipeline, execute the following command:

python main.py
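The load step writes the transformed rows into PostgreSQL. A minimal sketch of that step, assuming a psycopg2-style driver; the table and column names here are illustrative, not the repository's actual schema:

```python
# Illustrative load step: a parameterized INSERT for a psycopg2-style driver.
# Table and column names are assumptions, not the repository's schema.
INSERT_SQL = """
    INSERT INTO played_tracks (song_name, artist_name, played_at)
    VALUES (%s, %s, %s)
    ON CONFLICT (played_at) DO NOTHING
"""

def row_to_params(row: dict) -> tuple:
    """Order a transformed row's values to match INSERT_SQL's placeholders."""
    return (row["song_name"], row["artist_name"], row["played_at"])

# With a live connection this would run as:
#   with conn.cursor() as cur:
#       cur.executemany(INSERT_SQL, [row_to_params(r) for r in rows])
#   conn.commit()
```

The ON CONFLICT clause makes the daily load idempotent: re-running the pipeline over an overlapping 24-hour window will not insert duplicate plays.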

5. Automating the ETL Pipeline

To automate the ETL pipeline to run daily, you can use CRON jobs. To set up a CRON job, follow these steps:

  1. Open your crontab for editing:
crontab -e
  2. Add the following line to run the ETL pipeline every day at 12:00 AM:
0 0 * * * /path/to/your/virtualenv/bin/python /path/to/your/repository/main.py
  3. Save and exit.

6. Viewing the Data in UI using Streamlit

To view the data in a user-friendly interface, you can use Streamlit. To run the Streamlit app, execute the following command:

streamlit run frontend.py

To fetch data from the database via the FastAPI server, run:

uvicorn server:app --reload

Note: If you want to run the Streamlit app and FastAPI server simultaneously, you can use the following commands:

chmod +x run.sh
./run.sh
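The contents of run.sh are not shown here, but a minimal script that launches both processes might look like this (a sketch, not the repository's actual script):

```shell
#!/bin/bash
# Hypothetical run.sh: start the FastAPI server in the background,
# then run the Streamlit app in the foreground.
uvicorn server:app --reload &
streamlit run frontend.py
```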
