Fix op.translate HCOP URL after EBI FTP drop (#303) #304

Merged
PauBadiaM merged 2 commits into main from fix/hcop-gcs-migration
Apr 10, 2026

Conversation

@PauBadiaM
Collaborator

Summary

  • Fixes #303 (404 Error from Mouse in dc.op.resource Function): HGNC has removed the entire /pub/databases/genenames/ subtree from the EBI FTP mirror, so every op.translate call, and every op.resource(..., organism=...) call for a non-human organism, was failing with HTTPError: 404. The HCOP fifteen-column files are now fetched from HGNC's public Google Cloud Storage bucket (https://storage.googleapis.com/public-download-files/hcop/...), which is where HGNC currently publishes them (confirmed via bucket listing; the anole_lizard object was last updated 2026-03-27). The move is documented only on https://www.genenames.org/help/hcop/; there is no dedicated announcement on the HGNC news page, which is likely why it went unnoticed until the FTP directory was actually taken down.
  • Drops a stray pd.read_csv(url, ...) at the end of op.translate's download block. It silently re-downloaded the HCOP file immediately after _download + _bytes_to_pandas had already produced map_df, doubling the network cost per call and bypassing the retry logic in _download.
  • Bumps version to 2.1.6 and adds a CHANGELOG entry.
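
For reviewers who want the second bullet spelled out, here is a minimal sketch of the "download once with retries, then parse the bytes already in memory" pattern. It is not decoupler's code: download_with_retry and parse_table are hypothetical stand-ins for _download and _bytes_to_pandas (whose real signatures are not shown in this PR), the fetcher is injected so the example runs offline, the stdlib csv module is used in place of pandas, and the tab-separated payload and its column names are made up for illustration.

```python
import csv
import io
import time
from typing import Callable


def download_with_retry(
    fetch: Callable[[], bytes], retries: int = 3, delay: float = 0.0
) -> bytes:
    """Call fetch() until it succeeds or `retries` attempts have failed."""
    last_error = None
    for attempt in range(retries):
        try:
            return fetch()
        except OSError as err:  # urllib's URLError/HTTPError subclass OSError
            last_error = err
            if attempt < retries - 1:
                time.sleep(delay)
    raise RuntimeError(f"download failed after {retries} attempts") from last_error


def parse_table(payload: bytes) -> list[dict[str, str]]:
    """Parse tab-separated bytes that are already in memory (no second request)."""
    text = io.StringIO(payload.decode("utf-8"))
    return list(csv.DictReader(text, delimiter="\t"))


# Hypothetical fetcher: fails once, then returns a tiny made-up table.
calls = {"n": 0}


def flaky_fetch() -> bytes:
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError("simulated transient failure")
    return b"human_symbol\tmouse_symbol\nTP53\tTrp53\n"


payload = download_with_retry(flaky_fetch)  # retried once, fetched exactly once
rows = parse_table(payload)                 # parsed from bytes, not from the URL
```

Parsing from the URL again at this point would issue a second request that skips the retry loop entirely, which is the failure mode the removed pd.read_csv(url, ...) line had.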

Test plan

  • PYTHONPATH=src pytest tests/op/test_translate.py — 7 passed (including test_translate parametrized over mouse, anole_lizard, and fruitfly, which hits the live bucket)
  • CI green

🤖 Generated with Claude Code

PauBadiaM and others added 2 commits April 10, 2026 16:13
HGNC removed the entire /pub/databases/genenames/ subtree from the
EBI FTP mirror, so every op.translate call (and every op.resource
call for a non-human organism) was failing with HTTP 404. Point the
download at the new HGNC Google Cloud Storage bucket, which is where
HGNC now publishes the HCOP fifteen-column files.

Also drop a stray pd.read_csv(url, ...) that was re-downloading the
file a second time right after _download + _bytes_to_pandas had
already produced map_df, silently doubling the network cost and
bypassing the retry logic in _download.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.31%. Comparing base (6e18630) to head (2a54478).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #304      +/-   ##
==========================================
- Coverage   94.31%   94.31%   -0.01%     
==========================================
  Files          78       78              
  Lines        4079     4078       -1     
==========================================
- Hits         3847     3846       -1     
  Misses        232      232              
Files with missing lines Coverage Δ
src/decoupler/op/_translate.py 95.12% <100.00%> (-0.06%) ⬇️

@PauBadiaM PauBadiaM merged commit 6118a32 into main Apr 10, 2026
17 checks passed

Development

Successfully merging this pull request may close these issues.

404 Error from Mouse in dc.op.resource Function

2 participants