Dumps all datasets found in the dataset catalog at https://data.pr.gov to disk. There were 148 datasets at the time of the initial commit (2017-07-25). Mind your disk space!
PRs are welcome!
All created files are saved to the data_files directory. The script works in the following steps:
- Fetches the catalog of datasets from https://data.pr.gov/data.json
- Saves the dataset catalog to disk with a timestamp.
- Consumes the dataset catalog and downloads all distributions for each dataset.
- All downloaded files will be named
data.{file_type}
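The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of data_pr_downloader.py. It assumes the catalog follows the common data.json layout (a top-level "dataset" list whose entries carry a "distribution" list with "downloadURL" and "mediaType" fields). The helper names, the catalog file name, and the one-sub-directory-per-dataset layout are all hypothetical.

```python
import json
import os
import re
import urllib.request
from datetime import datetime, timezone

CATALOG_URL = "https://data.pr.gov/data.json"  # from the README
OUT_DIR = "data_files"                          # from the README


def catalog_filename(now):
    """Timestamped name for the saved catalog (this naming scheme is an assumption)."""
    return "catalog_{}.json".format(now.strftime("%Y%m%dT%H%M%SZ"))


def file_type(distribution):
    """Derive {file_type} from the media type, e.g. 'text/csv' -> 'csv' (assumption)."""
    media_type = distribution.get("mediaType", "application/octet-stream")
    return media_type.split("/")[-1]


def plan_downloads(catalog, out_dir=OUT_DIR):
    """Return (url, path) pairs, one per distribution that has a downloadURL."""
    plan = []
    for dataset in catalog.get("dataset", []):
        # Assumed layout: one sub-directory per dataset, named after its identifier.
        identifier = dataset.get("identifier", "unknown")
        slug = re.sub(r"[^A-Za-z0-9_-]+", "_", identifier).strip("_")
        for dist in dataset.get("distribution", []):
            url = dist.get("downloadURL")
            if url:
                path = os.path.join(out_dir, slug, "data." + file_type(dist))
                plan.append((url, path))
    return plan


def run():
    """Fetch the catalog, save it with a timestamp, then download every distribution."""
    os.makedirs(OUT_DIR, exist_ok=True)
    with urllib.request.urlopen(CATALOG_URL) as resp:
        raw = resp.read()
    stamp = datetime.now(timezone.utc)
    with open(os.path.join(OUT_DIR, catalog_filename(stamp)), "wb") as fh:
        fh.write(raw)
    for url, path in plan_downloads(json.loads(raw)):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        urllib.request.urlretrieve(url, path)
```

Calling run() performs the network fetches, while plan_downloads() can be tried on a small in-memory catalog first. Note that in this sketch two distributions of the same dataset with the same media type would map to the same path, so a real implementation would need to disambiguate them.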
To run locally:
- Install pipenv 'cause we fancy.
- Initialize a Python 3 virtual environment
pipenv --three
- Install dependencies
pipenv install
- Activate the virtual environment
pipenv shell
- Execute
python data_pr_downloader.py
To run with Docker:
- Run ./build.sh to build the Docker image.
- Run ./run.sh to fetch data. Files will be downloaded to the data_files directory.