JMasr/raw2database

Raw2DataBase: a Metabase + PostgreSQL + Docker Solution to Showcase Your Data 📊

About ✨

Raw2DataBase loads raw CSV data into a PostgreSQL database, using Docker for deployment and Metabase for data visualization. The project provides a framework for managing database connections, processing CSV data, and integrating with Metabase for data analysis and reporting.

Project Requirements 👨‍🔧

  1. Database Connection Handler 🛰️

    • Generic and extensible to support multiple database types (PostgreSQL, MySQL, MongoDB, etc.).
    • Handles CSV processing using pandas, converting files into dataframes for database insertion.
  2. Main Application Logic ⚙️

    • Script to receive configuration and raw data paths.
    • Manages database connection setup, data processing, and data insertion.
  3. Tests 🩹

    • Tests for each feature to ensure correct functionality and reliability.
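
One way to read the "generic and extensible" requirement is a small base class plus a registry keyed by `db_type`. The sketch below is illustrative only: `DatabaseHandler`, `PostgresHandler`, `register`, and `get_handler` are hypothetical names, not taken from `src/database/`, and only the configuration keys (`DB_NAME`, `DB_HOST`, and so on) come from this README.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type
from urllib.parse import quote_plus


class DatabaseHandler(ABC):
    """Hypothetical base class: one subclass per supported database type."""

    def __init__(self, config: dict) -> None:
        self.config = config

    @abstractmethod
    def connection_url(self) -> str:
        """Build a connection URL from the configuration dictionary."""


# Registry mapping a db_type string (e.g. "postgres") to its handler class.
_HANDLERS: Dict[str, Type[DatabaseHandler]] = {}


def register(db_type: str) -> Callable[[Type[DatabaseHandler]], Type[DatabaseHandler]]:
    """Class decorator that adds a handler to the registry."""
    def decorator(cls: Type[DatabaseHandler]) -> Type[DatabaseHandler]:
        _HANDLERS[db_type] = cls
        return cls
    return decorator


@register("postgres")
class PostgresHandler(DatabaseHandler):
    def connection_url(self) -> str:
        c = self.config
        # quote_plus keeps special characters in credentials from breaking the URL.
        return (
            f"postgresql://{quote_plus(c['DB_USER'])}:{quote_plus(c['DB_PASSWORD'])}"
            f"@{c['DB_HOST']}:{c['DB_PORT']}/{c['DB_NAME']}"
        )


def get_handler(db_type: str, config: dict) -> DatabaseHandler:
    """Look up the handler for db_type, failing loudly on unknown types."""
    handler_cls = _HANDLERS.get(db_type)
    if handler_cls is None:
        raise ValueError(f"Unsupported db_type: {db_type!r}")
    return handler_cls(config)
```

With this layout, supporting another database would mean adding one decorated subclass; the calling code keeps using `get_handler`.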

Features (✅=DONE, ❌=TODO)

  • ✅ Database Connection Handler: A flexible, extensible handler for connecting to various databases.
  • ✅ CSV Processing: Efficient CSV data processing using pandas, converting raw data into SQL-like objects for database insertion.
  • ✅ Dockerized Environment: Easy setup and deployment using Docker and Docker Compose.
  • ✅ Data Visualization: Integration with Metabase for creating and sharing interactive dashboards and reports.
  • ❌ Test Coverage: Comprehensive tests using Pytest to ensure the reliability and correctness of each component.
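
The CSV processing path can be sketched roughly as follows: read the file into a pandas DataFrame, then append it to a table. This is an assumption-laden illustration, not the code in `src/data/data_processor.py`. The function `load_csv_to_table`, the table name, and the sample rows are made up, and an in-memory SQLite connection stands in for PostgreSQL so the snippet runs without Docker.

```python
import io
import sqlite3

import pandas as pd

# Inline CSV standing in for a file under data/raw/ (contents are invented).
RAW_CSV = "id,city,temperature\n1,Vigo,18.5\n2,Lugo,15.0\n"


def load_csv_to_table(csv_source, table_name, connection):
    """Read a CSV into a DataFrame and append its rows to table_name."""
    df = pd.read_csv(csv_source)
    # if_exists="append" creates the table on first use and adds rows afterwards.
    df.to_sql(table_name, connection, if_exists="append", index=False)
    return len(df)


# SQLite is a stand-in here; the project itself targets PostgreSQL.
conn = sqlite3.connect(":memory:")
inserted = load_csv_to_table(io.StringIO(RAW_CSV), "weather", conn)
rows = conn.execute("SELECT city, temperature FROM weather ORDER BY id").fetchall()
```

Against PostgreSQL, `DataFrame.to_sql` expects a SQLAlchemy engine or connection rather than a raw driver connection, so the `connection` argument would change while the rest of the function stays the same.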

General Project Structure 🧱

raw2database/
├── docker/
│   ├── .env
│   └── docker-compose.yml
├── src/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   └── data_processor.py
│   └── database/
│       ├── __init__.py
│       ├── database.py
│       └── database_loader.py
├── tests/
│   ├── __init__.py
│   ├── test_database.py
│   ├── test_data_loader.py
│   └── test_data_processor.py
├── config/
│   └── your_db_config.json
├── data/
│   ├── raw/
│   ├── processed/
│   ├── interim/
│   └── external/
├── requirements.txt
├── README.md
└── .gitignore

Getting Started 🤓

⚠️Recommended Python version: 3.9⚠️

  1. Clone the Repository
git clone https://github.com/JMasr/raw2database.git
  2. Navigate to the Project Directory
cd raw2database
  3. Create and Activate the Conda Environment
conda create -n raw2database python=3.9
conda activate raw2database
  4. Install Requirements
pip install -r requirements.txt
  5. Navigate to the Docker Folder
cd docker
  6. Configure the .env File with Your Credentials
cat <<EOL > .env
POSTGRES_USER=<user_postgres>
POSTGRES_PASSWORD=<pass_postgres>
POSTGRES_DB=<ps_db_name>
PGADMIN_DEFAULT_EMAIL=<[email protected]>
PGADMIN_DEFAULT_PASSWORD=<pass_ui-admin_tool>
EOL
  7. Run Docker Compose
docker-compose up -d
  8. Return to the Project Root and Create a Configuration Folder
cd ..
mkdir -p config
  9. Configure the Database: create config/postgres_config.json with the database connection details, replacing each <...> placeholder with your own values:
cd config
cat <<EOL > postgres_config.json
{
  "db_type": "postgres",
  "DB_NAME": "<ps_db_name>",
  "DB_HOST": "<host>",
  "DB_PORT": <port>,
  "DB_USER": "<user_postgres>",
  "DB_PASSWORD": "<pass_postgres>"
}
EOL
cd ..
  10. Run the Application: to load data from a CSV file into the database, run the following from the project root:
python src/main.py --raw_files_path <path/to/your_data.csv> --config_file config/postgres_config.json --db_type postgres
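
For orientation, here is a minimal sketch of how a script could accept the three flags shown above and validate the JSON configuration before connecting. The flag names and configuration keys come from this README; `build_parser`, `load_config`, and `REQUIRED_KEYS` are hypothetical and may not match what `src/main.py` actually does.

```python
import argparse
import json
from pathlib import Path

# Keys taken from the postgres_config.json template in this README.
REQUIRED_KEYS = {"DB_NAME", "DB_HOST", "DB_PORT", "DB_USER", "DB_PASSWORD"}


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags used in the run command."""
    parser = argparse.ArgumentParser(description="Load raw CSV data into a database.")
    parser.add_argument("--raw_files_path", required=True, type=Path,
                        help="Path to the raw CSV data.")
    parser.add_argument("--config_file", required=True, type=Path,
                        help="Path to the JSON database configuration.")
    # Defaulting to "postgres" is an assumption made for this sketch.
    parser.add_argument("--db_type", default="postgres",
                        help="Database type, e.g. postgres.")
    return parser


def load_config(path) -> dict:
    """Parse the JSON config and check that every required key is present."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"Missing config keys: {sorted(missing)}")
    return config


if __name__ == "__main__":
    args = build_parser().parse_args()
    config = load_config(args.config_file)
    print(f"Would load {args.raw_files_path} into {config['DB_NAME']} ({args.db_type})")
```

Note that `json.loads` raises `json.JSONDecodeError` if a placeholder such as `<port>` is left unquoted in the file, so a parse error at startup usually means the template was not fully filled in.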

Contributing 🤗

Contributions are welcome! Please fork the repository and create a pull request with your improvements.

License 📜

This project is licensed under the MIT License - see the LICENSE file for details.

Contact 📧

For any questions or issues, please open an issue in the repository or contact the maintainer at [email protected]
