Add pycsw 3 #71

Merged
mjanez merged 9 commits into main from latest on Oct 31, 2025

mjanez (Owner) commented Oct 31, 2025

This pull request updates the project to support pycsw 3.0 and modernizes the configuration and documentation throughout the codebase. It migrates from the legacy .conf configuration format to the new YAML-based .yml format required by pycsw 3.0, updates Docker images and environment variables accordingly, and enhances documentation for both users and developers. The changes also include improved testing and debugging instructions, as well as updates to the CI workflow for better version handling.

Migration to pycsw 3.0 and configuration overhaul:

  • Migrated configuration from .conf to .yml format for pycsw 3.0, updating all references, environment variables (PYCSW_CONFIG, PYCSW_SERVER_URL, PYCSW_URL), and Dockerfiles to use the new YAML configuration and endpoint conventions.
  • Updated Dockerfiles to use new base images, install additional dependencies, and include migration scripts for pycsw 3.0. Added support for both legacy and new configuration templates for backward compatibility.
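
As a rough illustration of how the three variables relate, here is a minimal Python sketch. It assumes the relationship stated in the commit notes (`PYCSW_URL` is `${PYCSW_SERVER_URL}/csw`) and the entrypoint default `${PYCSW_CONFIG:-pycsw.yml}`. The function name `resolve_pycsw_settings` and the fallback `http://localhost:8000` are hypothetical and not taken from the repository.

```python
import os


def resolve_pycsw_settings(env=None):
    """Derive the pycsw config path and URLs from environment variables."""
    env = os.environ if env is None else env
    # Mirrors the shell form ${PYCSW_CONFIG:-pycsw.yml}: unset OR empty falls back.
    config_path = env.get("PYCSW_CONFIG") or "pycsw.yml"
    # Hypothetical fallback for local runs; the project default may differ.
    server_url = (env.get("PYCSW_SERVER_URL") or "http://localhost:8000").rstrip("/")
    # The CSW endpoint is derived from the server URL, as in ${PYCSW_SERVER_URL}/csw.
    csw_url = f"{server_url}/csw"
    return {"config": config_path, "server_url": server_url, "csw_url": csw_url}


settings = resolve_pycsw_settings({"PYCSW_SERVER_URL": "http://example.org/"})
print(settings["config"], settings["csw_url"])
```

Stripping the trailing slash before appending `/csw` avoids a double slash when the server URL is configured with one.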

Documentation improvements:

  • Revised the README.md to document the new configuration variables and workflow for pycsw 3.0, including clear setup instructions, endpoint explanations, and migration notes.
  • Added detailed sections on automated testing (pytest, Docker, PDM), pycsw 3.0 endpoints, and expanded debugging instructions with VS Code and debugpy, replacing deprecated ptvsd.

CI/CD and versioning updates:

  • Improved GitHub Actions workflow to support semantic versioning for pre-releases (e.g., 3.0-dev) and handle new version tag formats.
  • Updated Docker image tags and descriptions in README.md to reflect pycsw 3.0 and legacy support, with accurate size and usage notes.
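
The tag handling described above could be approximated as follows. This is only a sketch: the actual regex in the workflow is not shown in this PR description, so the pattern below is an assumption. It is deliberately looser than strict semantic versioning, because a tag such as `3.0-dev` has no patch component.

```python
import re

# Hypothetical pattern: MAJOR.MINOR, optional .PATCH, optional -prerelease suffix.
VERSION_TAG = re.compile(
    r"^v?(\d+)\.(\d+)(?:\.(\d+))?(?:-([0-9A-Za-z][0-9A-Za-z.-]*))?$"
)


def parse_version_tag(tag):
    """Return (major, minor, patch, prerelease), or None if the tag does not match."""
    match = VERSION_TAG.match(tag)
    if match is None:
        return None
    major, minor, patch, prerelease = match.groups()
    # patch and prerelease stay None when the tag omits them.
    patch_number = int(patch) if patch is not None else None
    return int(major), int(minor), patch_number, prerelease


for tag in ("3.0-dev", "2.6.1", "latest"):
    print(tag, parse_version_tag(tag))
```

Non-version tags such as `latest` return `None`, so a workflow could fall back to a default tag in that case.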

General fixes and environment variable corrections:

  • Fixed environment variable typos and improved default values, including contact information and temporal extent variables for INSPIRE metadata.
  • Cleaned up unused or obsolete Docker build options and environment variable documentation.

…ration

- Introduce PYCSW_SERVER_URL and set PYCSW_URL to ${PYCSW_SERVER_URL}/csw; update .env.example
- Add pycsw.yml.template (pycsw 3.0 YAML config) and update pycsw.conf.template to use PYCSW_SERVER_URL
- Update README to document PYCSW_SERVER_URL, endpoints and YAML usage
- Update Dockerfiles (base, dev, ghcr): bump CKAN_PYCSW_VERSION, switch to pycsw.yml config, expose CSW endpoint, add helper files and dev tools
- Add migrate_db_pycsw3.sh to migrate pycsw 2.x DBs (sqlite/postgres/mysql) adding new columns required by pycsw 3.0
- Update entrypoint to generate YAML config and run the migration script before starting the app
- Update ckan2pycsw to read pycsw YAML config, use PYCSW_SERVER_URL, adapt pycsw API calls (repository.setup/export_records/admin, wsgi_flask) and update process restart logic
- Update pyproject deps to install pycsw from GitHub and add pycsw 3.0 runtime requirements (flask, pygeofilter, pygeoif) and SQLAlchemy < 2.0
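
The migration step (adding columns that an older database lacks) can be made safe to run on every container start by checking the existing schema first. The sketch below shows that idea in Python for SQLite only. It is not the content of `migrate_db_pycsw3.sh`; the table name `records` and the column names are placeholders chosen for illustration.

```python
import sqlite3


def add_missing_columns(conn, table, columns):
    """Add each (name, sql_type) column to `table` unless it already exists."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    # Table and column names are assumed to be trusted constants, not user input.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns:
        if name not in existing:
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {sql_type}')
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, title TEXT)")

# Placeholder column names, not the real pycsw 3.0 schema additions.
new_columns = [("example_col_a", "TEXT"), ("example_col_b", "TEXT")]
first_run = add_missing_columns(conn, "records", new_columns)
second_run = add_missing_columns(conn, "records", new_columns)
print(first_run, second_run)
```

Because the second call finds both columns already present, it changes nothing, which is what makes it reasonable to run the migration before every start.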
… support, and mapping guard

- Implement paginated get_datasets with batch fetching, DCAT type filtering and robust error handling/logging.
- Add transform_dataset_to_xml to convert CKAN -> MCF -> ISO19139 XML with schema selection and detailed error traces.
- Add save_xml_files to write XML files, optionally clean metadata dir, collect statistics and per-DCAT summaries.
- Add load_xml_records to import XML files into pycsw using admin.load_records (force_update) and report load stats.
- Update initialize_pycsw_database to handle dev_mode properly, remove existing DB file, call repository.setup and return (database, table, context).
- Rework main() to load YAML pycsw config, orchestrate fetch -> transform -> load steps, emit execution summary and improved logging.
- General: add typing hints, imports (shutil, defaultdict), better logging module prefixes and warnings for insecure SSL mode.
- Fix template mapping bug: guard against missing/None input_field before membership check in get_mapping_values_dict_from_yaml_list.
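
The mapping guard in the last item can be pictured with a small sketch. The real signature of `get_mapping_values_dict_from_yaml_list` is not shown here, so the function below is a hypothetical stand-in. It assumes the membership check tests each mapping key against the input field, which fails with a `TypeError` when the field is `None`.

```python
def match_mapping(input_field, yaml_list):
    """Return the first entry whose 'id' occurs in input_field, else None."""
    # Guard first: `"x" in None` raises TypeError, and an empty field cannot match.
    if not input_field:
        return None
    for entry in yaml_list:
        if entry["id"] in input_field:
            return entry
    return None


# Illustrative mapping entries, not taken from the project's YAML files.
mappings = [
    {"id": "dataset", "label": "Dataset"},
    {"id": "series", "label": "Series"},
]
print(match_mapping("http://example.org/type/series", mappings))
print(match_mapping(None, mappings))
```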

These changes modularize the pipeline, improve error resilience, and complete migration to pycsw 3.x behaviour.
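
The batch fetching mentioned for `get_datasets` might follow a loop like the one below. This is a sketch, not the project's implementation: `iter_datasets` and `fake_fetch` are hypothetical names, and the page shape (`count` plus `results`) is assumed to match what CKAN's `package_search` returns under `result`. The DCAT type filtering and error handling from the list above are left out.

```python
def iter_datasets(fetch_page, batch_size=100):
    """Yield datasets batch by batch until the catalog is exhausted."""
    start = 0
    while True:
        page = fetch_page(start=start, rows=batch_size)
        results = page["results"]
        if not results:
            break  # an empty page means there is nothing left to fetch
        yield from results
        start += len(results)
        if start >= page["count"]:
            break  # every dataset reported by the server has been seen


# Stand-in for a remote catalog so the sketch runs without network access.
FAKE_CATALOG = [{"name": f"dataset-{i}"} for i in range(250)]


def fake_fetch(start, rows):
    """Return one page in the assumed package_search result shape."""
    return {"count": len(FAKE_CATALOG), "results": FAKE_CATALOG[start:start + rows]}


datasets = list(iter_datasets(fake_fetch, batch_size=100))
print(len(datasets))
```

Passing the fetch function in as an argument keeps the paging logic testable without a live CKAN instance.
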
- Switch configuration format from pycsw.conf to pycsw.yml across:
  .env.example, Dockerfiles (base/dev/ghcr), docker-entrypoint scripts, README and run commands.
  Use env default ${PYCSW_CONFIG:-pycsw.yml} in entrypoints.
- Upgrade container base to python:3.11-slim-bookworm and bump pdm to 2.26.1.
- Replace deprecated ptvsd with debugpy and update VS Code/Docker remote debug instructions.
- Update GH Actions docker tag extraction regex to support semantic versions and pre-release tags (e.g. 3.0-dev).
- Refresh dependencies and packaging:
  - Update pyproject.toml to target pycsw 3.0 (git master) and pin/upgrade required libs.
  - Update pdm.lock to reflect updated dependency graph.
- Dockerfile updates: simplify template copies (use pycsw.yml.template), fix env vars and contact email/address.
- Docs & scripts: simplify install command, add testing instructions, expand debugging docs; add doc/csw to .gitignore.
- Update README built images table to mark 3.0-dev as the latest development image and adjust tags/sizes.
- Use ghcr.io/mjanez/ckan-pycsw:3.0-dev as base in Dockerfile.dev and Dockerfile.ghcr.
- Set CKAN_PYCSW_VERSION=3.0-dev in ckan-pycsw Dockerfile, Dockerfile.dev and Dockerfile.ghcr.
- Bump default CKAN_PYCSW_VERSION in ckan2pycsw to 3.0-dev.

mjanez self-assigned this on Oct 31, 2025
mjanez added the documentation and enhancement labels on Oct 31, 2025
mjanez linked an issue on Oct 31, 2025 that may be closed by this pull request
3 tasks
mjanez merged commit ced62de into main on Oct 31, 2025
1 check passed
Development

Successfully merging this pull request may close these issues.

Support pycsw 3

1 participant