This repository implements an ETL (Extract, Transform, Load) pipeline that retrieves song data from Spotify's API based on the user's last 24 hours of activity. The data is transformed and loaded into a PostgreSQL database for storage and further analysis. The process is automated to run daily using CRON.
- Spotify API (OAuth 2.0 Authentication)
- Python (for data extraction, transformation, and loading)
- PostgreSQL (for database storage)
- CRON (for scheduling automation)
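The extract and transform steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes the pipeline reads Spotify's "recently played" response (a JSON object with an `items` list, each entry holding a `track` and a `played_at` timestamp), and the function names and output column names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cutoff_ms(now: datetime) -> int:
    """Return the Unix timestamp (milliseconds) for 24 hours before `now`.

    Spotify's recently-played endpoint accepts an `after` cursor in this unit.
    """
    return int((now - timedelta(hours=24)).timestamp() * 1000)


def flatten_items(payload: dict) -> list[dict]:
    """Flatten a recently-played response into one flat row per play.

    The row keys (song_name, artist_name, played_at) are assumed names,
    chosen for illustration only.
    """
    rows = []
    for item in payload.get("items", []):
        track = item["track"]
        rows.append(
            {
                "song_name": track["name"],
                # Keep only the first listed artist to stay one row per play.
                "artist_name": track["artists"][0]["name"],
                "played_at": item["played_at"],
            }
        )
    return rows
```

Calling `cutoff_ms(datetime.now(timezone.utc))` gives the value to pass as the `after` parameter, and `flatten_items` turns the JSON response into rows that are ready to load.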
To use the Spotify API, you must first set up a Spotify Developer account and create an application to get your client_id and client_secret.
- Go to the Spotify Developer Dashboard.
- Log in with your Spotify account and click on "Create an App".
- Provide the necessary information (App Name, Description, etc.).
- Once the app is created, you’ll have access to:
  - Client ID
  - Client Secret
  - Redirect URI (you’ll need to set one up if it’s not automatically provided)
- Save these credentials for later use in the .env file.
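To show how these credentials are used, here is a sketch of building the URL that sends the user to Spotify's consent page. It assumes the OAuth 2.0 authorization code flow and the `user-read-recently-played` scope, which the README does not state explicitly; the function name is hypothetical.

```python
from urllib.parse import urlencode

# Spotify's OAuth 2.0 authorization endpoint.
AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id: str, redirect_uri: str) -> str:
    """Build the consent-page URL for the authorization code flow.

    Assumption: the pipeline only needs the user's listening history,
    so a single scope is requested.
    """
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": "user-read-recently-played",
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"
```

The `redirect_uri` passed here must match the one registered for the app in the dashboard, otherwise Spotify rejects the request.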
Before running the ETL pipeline, you need to install the necessary dependencies. Follow these steps:
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
To securely store your credentials and other configuration variables, create a .env file based on the .env.example template.
- Copy the example .env file:
cp .env.example .env
- Update the .env file with your Spotify credentials and PostgreSQL connection details:
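As a rough illustration of how such a file can be read and checked before the pipeline starts, the sketch below parses `KEY=VALUE` lines and reports any required settings that are missing. The variable names in `REQUIRED_KEYS` are hypothetical placeholders; use whatever names .env.example actually defines.

```python
# Hypothetical variable names; the real ones are defined in .env.example.
REQUIRED_KEYS = (
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
    "SPOTIFY_REDIRECT_URI",
    "DATABASE_URL",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when every required setting is present.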
To run the ETL pipeline, execute the following command:
python main.py
To automate the ETL pipeline to run daily, you can use CRON jobs. To set up a CRON job, follow these steps:
- Open the crontab for editing:
crontab -e
- Add the following line to the crontab to run the ETL pipeline every day at 12:00 AM:
0 0 * * * /path/to/your/virtualenv/bin/python /path/to/your/repository/main.py
- Save and exit the crontab.
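Because the job runs every day over a 24-hour window, consecutive runs can see the same play twice if their windows overlap. One way to keep the load step safe to repeat is to key each row on its play timestamp and skip conflicts. The sketch below demonstrates the idea with the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere; the table and column names are hypothetical, and with a PostgreSQL driver such as psycopg2 the placeholders would be `%s` instead of `?`.

```python
import sqlite3

# Hypothetical schema: one row per play, keyed on the play timestamp.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS played_tracks (
    played_at   TEXT PRIMARY KEY,
    song_name   TEXT NOT NULL,
    artist_name TEXT NOT NULL
)
"""

# ON CONFLICT ... DO NOTHING is accepted by PostgreSQL and by SQLite 3.24+.
INSERT_SQL = """
INSERT INTO played_tracks (played_at, song_name, artist_name)
VALUES (?, ?, ?)
ON CONFLICT (played_at) DO NOTHING
"""


def load_rows(conn: sqlite3.Connection, rows: list[dict]) -> int:
    """Insert rows, skipping plays already stored; return how many were new."""
    conn.execute(CREATE_SQL)
    before = conn.execute("SELECT COUNT(*) FROM played_tracks").fetchone()[0]
    conn.executemany(
        INSERT_SQL,
        [(r["played_at"], r["song_name"], r["artist_name"]) for r in rows],
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM played_tracks").fetchone()[0]
    return after - before
```

Running `load_rows` twice with the same rows inserts them once and adds nothing the second time, so a rerun of the CRON job does not duplicate data.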
To view the data in a user-friendly interface, you can use Streamlit. To run the Streamlit app, execute the following command:
streamlit run frontend.py
To fetch data from the database, start the FastAPI server with the following command:
uvicorn server:app --reload
Note: If you want to run the Streamlit app and FastAPI server simultaneously, you can use the following commands:
chmod +x run.sh
./run.sh