Description
The PSF dataset generation script (data_generation_script.py) is built around a single large, monolithic function and would benefit from refactoring to improve maintainability, testability, and code reuse.
Current Issues
Code Structure
- Monolithic design: Single 800+ line main() function handling multiple responsibilities
- Poor separation of concerns: Configuration, data processing, visualization, and I/O all mixed together
- Complex control flow: Deeply nested conditionals make the code difficult to follow and debug
Testing & Reliability
- No unit tests: Critical data generation logic lacks test coverage
- Missing error handling: File operations and array manipulations lack proper exception handling
- No input validation: Configuration parameters aren't validated before use
- Sign convention mismatch: The data generation script appears to use different sign conventions than the main codebase, which prevents the saved ground-truth information from being used correctly for verification
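The validation and error-handling gaps above could be closed along the following lines. This is only a sketch: GenerationConfig, its fields (n_train_stars, n_test_stars, output_dir) and save_dataset are hypothetical names chosen for illustration, not identifiers taken from data_generation_script.py.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class GenerationConfig:
    """Hypothetical configuration object; field names are illustrative."""

    n_train_stars: int
    n_test_stars: int
    output_dir: Path

    def __post_init__(self):
        # Validate parameters once, at construction, before any data is generated.
        for name in ("n_train_stars", "n_test_stars"):
            value = getattr(self, name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


def save_dataset(data: bytes, config: GenerationConfig, filename: str) -> Path:
    """Write an already-serialized dataset, adding context to I/O failures."""
    target = Path(config.output_dir) / filename
    try:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    except OSError as exc:
        raise RuntimeError(f"Could not write dataset to {target}") from exc
    return target
```

Failing at construction time means a bad configuration is reported before any expensive simulation runs, and the chained exception keeps the original OSError available for debugging.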
Code Quality
- Inconsistent naming: Mix of mixed-case names (sim_PSF_toolkit) and snake_case (train_positions)
- Magic numbers: Hard-coded values scattered throughout (e.g., selected_id_SED = np.random.randint(low=0, high=13))
- Duplicate code: Similar operations repeated for train/test datasets
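As a rough illustration of the last two points, the hard-coded 13 can become a named constant and the train/test duplication can collapse into one function called twice. The sketch assumes that 13 is the number of available SED templates; N_SED_TEMPLATES, draw_sed_ids and build_split are hypothetical names, and the uniform positions are a placeholder for the real position logic.

```python
import numpy as np

# Assumption: the hard-coded 13 is the number of available SED templates.
N_SED_TEMPLATES = 13


def draw_sed_ids(n_stars, rng, n_templates=N_SED_TEMPLATES):
    """Draw one SED template index per star, in the range [0, n_templates)."""
    return rng.integers(low=0, high=n_templates, size=n_stars)


def build_split(n_stars, rng):
    """Build one dataset split; the same code path serves train and test."""
    return {
        # Placeholder: uniform positions in the unit square.
        "positions": rng.uniform(0.0, 1.0, size=(n_stars, 2)),
        "sed_ids": draw_sed_ids(n_stars, rng),
    }


rng = np.random.default_rng(seed=0)
train = build_split(n_stars=100, rng=rng)
test = build_split(n_stars=20, rng=rng)
```

Passing the random generator in explicitly also makes each split reproducible from a seed, which helps when comparing against saved ground truths.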
Proposed Refactoring
- Create a Class-Based Architecture
- Separate Concerns into Modules
- Add Comprehensive Testing
- Improve Error Handling
- Code Quality Improvements
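One possible shape for the class-based architecture is sketched below. It is not a description of the existing script: PositionSampler, DatasetGenerator and the simulate_star and writer callables are hypothetical names, and the position sampling is a placeholder. The idea is that sampling, simulation and I/O are injected as separate components, so each can be replaced by a stub in a unit test.

```python
import random
from typing import Callable, Dict, List, Tuple

Position = Tuple[float, float]


class PositionSampler:
    """Draws star positions in the unit square (placeholder geometry)."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def sample(self, n_stars: int) -> List[Position]:
        return [(self._rng.random(), self._rng.random()) for _ in range(n_stars)]


class DatasetGenerator:
    """Coordinates sampling, simulation and writing without implementing them."""

    def __init__(
        self,
        sampler: PositionSampler,
        simulate_star: Callable[[Position], dict],
        writer: Callable[[str, dict], None],
    ) -> None:
        self.sampler = sampler
        self.simulate_star = simulate_star
        self.writer = writer

    def generate(self, n_stars: int) -> dict:
        positions = self.sampler.sample(n_stars)
        stars = [self.simulate_star(pos) for pos in positions]
        return {"positions": positions, "stars": stars}

    def run(self, splits: Dict[str, int]) -> None:
        # The same code path handles every split (train, test, ...).
        for name, n_stars in splits.items():
            self.writer(name, self.generate(n_stars))


# Usage with stubs, as a unit test might do: no PSF simulation, no disk I/O.
written = {}


def collect(name: str, dataset: dict) -> None:
    written[name] = dataset


generator = DatasetGenerator(
    sampler=PositionSampler(seed=0),
    simulate_star=lambda pos: {"position": pos},  # stub for the real simulator
    writer=collect,
)
generator.run({"train": 4, "test": 2})
```

Because the simulator and the writer are plain callables, the orchestration logic can be tested without touching the PSF toolkit or the file system.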
Benefits
- Maintainability: Smaller, focused functions are easier to understand and modify
- Testability: Individual components can be unit tested in isolation
- Reusability: Modular design allows components to be reused in other contexts
- Debugging: Easier to isolate and fix issues in specific components
- Documentation: Clear separation makes the codebase more approachable for new contributors
This refactoring should be completed before the next release cycle. If that is not possible, the script will be moved out of develop into a dedicated feature branch, to be refactored when time allows, so that it does not block the develop→main merge.