Commit ba1430a

gitosaurus and Copilot authored
Design for integrating LanceDB into Hyrax (#683)
* Design for integrating LanceDB into Hyrax
* Address PR comments, add CLAUDE.md
* Remove storage_format config option from LanceDB design (#697)
* Simplify writer factory to always use ResultDatasetWriter without config option
* Clarify tensor metadata storage in LanceDB design document (#698)

  Address feedback on lines 104-106: clarify that tensor metadata (shape, dtype) is stored in Arrow table schema's custom metadata dictionary, show proper JSON serialization using json.dumps(), and explain the serialization/deserialization process with concrete code examples.
* Move spec to dedicated directory
* Manual proofreading

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 19fb5f9 commit ba1430a

2 files changed: +494 -0 lines changed

CLAUDE.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Hyrax is a Python-based machine learning framework for hunting rare and anomalous sources in large astronomical imaging surveys (Rubin-LSST, HSC, Euclid, NGRST). Built on PyTorch with PyTorch Ignite for training. Requires Python 3.11+.

## Essential Commands

```bash
# Development setup
bash .setup_dev.sh # Full setup (5-15 min)
pip install -e .'[dev]' # Manual alternative

# Testing
python -m pytest -m "not slow" # Fast tests (2-5 min)
python -m pytest -m "slow" # Slow/integration tests
python -m pytest tests/hyrax/test_file.py::test_name # Single test

# Linting/formatting (line length: 110)
ruff check src/ tests/
ruff format src/ tests/

# Pre-commit (run before committing)
pre-commit run --all-files

# Documentation
sphinx-build -M html ./docs ./_readthedocs
```

## CLI Usage

Main entry point: `hyrax`

```bash
hyrax --help
hyrax <verb> --help
hyrax train --runtime-config config.toml # or -c
```

**Verbs:** train, infer, download, prepare, umap, visualize, lookup, save_to_database, database_connection, test, to_onnx, model, engine, rebuild_manifest

## Architecture

```
src/
├── hyrax/
│ ├── verbs/ # CLI command implementations (inherit from Verb base class)
│ ├── models/ # PyTorch models with registry system
│ ├── data_sets/ # Dataset loaders (HSC, LSST, CIFAR, FITS, etc.)
│ ├── vector_dbs/ # ChromaDB, Qdrant implementations
│ ├── config_schemas/ # Pydantic v2 configuration validation
│ ├── hyrax.py # Main Hyrax class (programmatic interface to verbs)
│ ├── hyrax_default_config.toml # Default configuration
│ └── pytorch_ignite.py # Training wrapper
└── hyrax_cli/
    └── main.py # CLI dispatcher
```

## Key Patterns

**Verbs:** Each verb in `src/hyrax/verbs/` is a class with `setup_parser()` for CLI args and inherits from `Verb`. Registry system auto-discovers verbs.
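
The verb pattern can be pictured with a small, self-contained sketch. This is not Hyrax's actual implementation: the `VERB_REGISTRY` dict, the `cli_name` attribute, the `setup_parser` signature, the `build_parser` helper, and the `--epochs` flag are all assumptions made for illustration. It only shows one common way a base class can auto-register subclasses and let each one contribute its own CLI arguments.

```python
from argparse import ArgumentParser

# Hypothetical registry mapping a CLI verb name to its implementing class.
VERB_REGISTRY: dict[str, type] = {}


class Verb:
    """Illustrative base class; the real Hyrax Verb may look different."""

    cli_name: str = ""

    def __init_subclass__(cls, **kwargs):
        # Register every subclass that declares a non-empty cli_name.
        super().__init_subclass__(**kwargs)
        if cls.cli_name:
            VERB_REGISTRY[cls.cli_name] = cls

    @staticmethod
    def setup_parser(parser: ArgumentParser) -> None:
        """Add verb-specific CLI arguments; the default adds none."""


class Train(Verb):
    cli_name = "train"

    @staticmethod
    def setup_parser(parser: ArgumentParser) -> None:
        # Hypothetical flag, used only to demonstrate the hook.
        parser.add_argument("--epochs", type=int, default=1)


def build_parser() -> ArgumentParser:
    """Create one subcommand per registered verb."""
    parser = ArgumentParser(prog="hyrax")
    subparsers = parser.add_subparsers(dest="verb", required=True)
    for name, verb_cls in VERB_REGISTRY.items():
        verb_cls.setup_parser(subparsers.add_parser(name))
    return parser


args = build_parser().parse_args(["train", "--epochs", "3"])
```

With this shape, adding a new verb means defining one more subclass; the dispatcher never needs to be edited.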

**Models:** Defined in `src/hyrax/models/` with registry for auto-discovery. Support ONNX export. Configure via `[model]` and `[model.<ModelName>]` sections.

**Datasets:** Loaders in `src/hyrax/data_sets/` support train/validation/test splits. Configure via `[data_set]` section.

**Configuration:** TOML-based with Pydantic validation. Default config merged with runtime config passed via `-c`.

## Test Fixtures

- `loopback_hyrax` - Pre-configured Hyrax instance with random dataset
- `RandomDataset` / `RandomIterableDataset` - Test data generators

## Validation Workflow

After changes:
1. `ruff check src/ tests/ && ruff format src/ tests/`
2. `python -m pytest -m "not slow"`
3. `hyrax --help` (verify CLI works)
