This project implements an ETL pipeline that extracts data from MySQL, transforms it with Pandas, and loads it into Google BigQuery. It automates data ingestion, cleaning, and structured loading so the data is ready for analytics.
✔ Automated Data Extraction – Connects to MySQL to retrieve structured datasets.
✔ Data Transformation – Uses Pandas for filtering, type conversion, and formatting.
✔ BigQuery Integration – Loads processed data into Google BigQuery for analytics.
✔ Modular Design – Independent scripts for MySQL connection, transformation, and BigQuery loading.
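As a rough illustration of the transformation feature, here is a minimal sketch of filtering, type conversion, and formatting with Pandas. The column names (`city`, `population`) and the sample rows are assumptions made up for this example; the actual logic in pandas-trans.py may differ.

```python
import pandas as pd

# Hypothetical raw rows; the column names are assumptions for illustration.
raw = pd.DataFrame(
    {
        "city": [" berlin", "PARIS ", "rome", None],
        "population": ["3645000", "2161000", "not available", "500000"],
    }
)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Filter, convert types, and format a copy of the input frame."""
    # Filtering: drop rows that have no city name.
    out = df.dropna(subset=["city"]).copy()
    # Type conversion: unparseable strings become NaN instead of raising.
    out["population"] = pd.to_numeric(out["population"], errors="coerce")
    out = out.dropna(subset=["population"])
    out["population"] = out["population"].astype("int64")
    # Formatting: trim whitespace and normalise capitalisation.
    out["city"] = out["city"].str.strip().str.title()
    return out.reset_index(drop=True)


clean = transform(raw)
print(clean)
```

Working on a copy keeps the raw frame untouched, which makes it easier to compare input and output while debugging a transformation step.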
Clone the repository and install dependencies:
git clone https://github.com/your-repo/etl-using-python-mysql-to-bigquery.git
cd etl-using-python-mysql-to-bigquery
pip install -r requirements.txt
Run mysqlconn.py to establish a database connection, then execute queries using mysqlquery.py:
python mysqlconn.py
python mysqlquery.py
Use the Pandas scripts to clean and format the extracted data:
python pandas-read-csv.py
python pandas-trans.py
Execute bq_conn.py and bq_load_cities.py to send structured data to BigQuery:
python bq_conn.py
python bq_load_cities.py
1️⃣ Extract – Connect to MySQL and retrieve raw data.
2️⃣ Transform – Process and clean data using Pandas.
3️⃣ Load – Transfer formatted data into BigQuery tables.
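The three steps above can be sketched as small functions chained together. This is only an illustration, not the project's actual code: an in-memory SQLite table stands in for MySQL, a plain Python list stands in for BigQuery, and the table and column names (`cities`, `name`, `population`) are assumptions. In a real run, `conn` would presumably be a MySQL connection or SQLAlchemy engine, and `writer` would wrap a BigQuery client call such as `load_table_from_dataframe`.

```python
import sqlite3
from typing import Callable

import pandas as pd


def extract(conn, query: str) -> pd.DataFrame:
    """Extract: run a SQL query and return the rows as a DataFrame."""
    return pd.read_sql_query(query, conn)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Transform: drop incomplete rows, fix types, and tidy the text column."""
    out = df.dropna().copy()
    out["population"] = out["population"].astype("int64")
    out["name"] = out["name"].str.strip().str.title()
    return out.reset_index(drop=True)


def load(df: pd.DataFrame, writer: Callable[[pd.DataFrame], None]) -> int:
    """Load: hand the frame to a writer and report how many rows were sent."""
    writer(df)
    return len(df)


# Stand-in source: an in-memory SQLite table instead of a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [(" lagos", 15000000), ("LIMA ", None), ("oslo", 700000)],
)

# Stand-in sink: collect frames in a list instead of calling BigQuery.
loaded = []
raw = extract(conn, "SELECT name, population FROM cities")
rows_sent = load(transform(raw), loaded.append)
conn.close()
print(rows_sent)
```

Passing the writer in as a callable keeps the load step independent of any particular destination, so the same flow can be exercised locally before pointing it at a BigQuery table.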
etl-using-python-mysql-to-bigquery
│── .git/
│── .venv/
│── data_files/
│── mysqlconn.py
│── mysqlquery.py
│── pandas-read-csv.py
│── pandas-trans.py
│── bq_conn.py
│── bq_load_cities.py
│── requirements.txt
│── archive.zip
│── Course Files.zip
- Python 3.x
- MySQL Server
- Google Cloud SDK & BigQuery API
- Pandas, SQLAlchemy, and Google Cloud Client Libraries
This project is licensed under the MIT License.
For inquiries, reach out via email:
📧 [email protected]