@druvus
Contributor

@druvus druvus commented Aug 23, 2025

Release v0.2.0 - Production Ready

This release includes comprehensive enhancements, validation, and documentation improvements for PrePrimer.

🎯 Highlights

  • ✨ Alignment functionality with 4 providers (BLAST, Exonerate, merPCR, me-PCR)
  • ✨ Enhanced STS format with 3/4-column auto-detection and header/headerless support
  • ✨ Comprehensive validation framework (23 real-data tests, 100% pass rate)
  • 📚 Reorganized documentation (user-focused README, technical CLAUDE.md, validation reports)
  • ⚡ Optimized CI/CD (fast feedback job, sequential testing, fixed security checks)
  • 🐛 Fixed 4 test failures - all 636 tests now passing
  • ✅ 96.90% test coverage (611 core tests; 636 tests in total)

📦 What's New

Alignment Functionality

  • High-level API: align_primers() function for primer-to-reference alignment
  • 4 Alignment Providers:
    • BLAST: NCBI BLAST integration for fast primer alignment
    • Exonerate: Exonerate integration for sensitive alignment
    • merPCR: Modern Python reimplementation (recommended, 2.65x faster)
    • me-PCR: Legacy in silico PCR simulation support
  • CLI command: preprimer align --primers primers.bed --reference genome.fasta --aligner merpcr
  • 36 comprehensive tests with 100% pass rate
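
For scripted use, the CLI command above can be wrapped from Python. The following is a minimal sketch that only assembles the flags shown in that command; the helper names `build_align_command` and `run_align` are hypothetical and not part of PrePrimer, and it assumes the `preprimer` executable is on `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def build_align_command(primers, reference, aligner="merpcr"):
    """Build the argv list for the `preprimer align` command.

    Hypothetical helper: it only uses the flags that appear in this PR
    description and does not call any PrePrimer Python API.
    """
    return [
        "preprimer", "align",
        "--primers", str(Path(primers)),
        "--reference", str(Path(reference)),
        "--aligner", aligner,
    ]


def run_align(primers, reference, aligner="merpcr"):
    """Run the command, failing early if the CLI is not installed."""
    if shutil.which("preprimer") is None:
        raise FileNotFoundError("preprimer executable not found on PATH")
    cmd = build_align_command(primers, reference, aligner)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Print the command line instead of running it.
    print(" ".join(build_align_command("primers.bed", "genome.fasta")))
```

Keeping command construction separate from execution makes the argument list easy to inspect or test without any aligner installed.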

Enhanced STS Format Support

  • Auto-detection: Supports both 3-column and 4-column formats
  • Header flexibility: Auto-detects header presence (header/headerless files)
  • Extended format: Includes SIZE column when amplicon length available
  • Tool compatibility: Compatible with me-PCR and merPCR output files
  • Backward compatible: Maintains full compatibility with 3-column format
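
The auto-detection described above could work roughly as follows. This is an illustrative sketch, not PrePrimer's parser: the function name `parse_sts`, the assumed column order (NAME, FORWARD, REVERSE, optional SIZE), and the header heuristic are all assumptions made for the example.

```python
import re

# IUPAC nucleotide codes, so degenerate primers still count as sequences.
_SEQ = re.compile(r"^[ACGTURYSWKMBDHVN]+$", re.IGNORECASE)


def parse_sts(text):
    """Parse tab-separated STS text into a list of dicts.

    Assumed layout (not confirmed by this PR): NAME, FORWARD, REVERSE,
    plus an optional fourth SIZE column.
    """
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    # Heuristic: the first row is a header if its primer columns do not
    # look like nucleotide sequences.
    first = rows[0]
    has_header = len(first) >= 3 and not (
        _SEQ.match(first[1]) and _SEQ.match(first[2])
    )
    records = []
    for fields in (rows[1:] if has_header else rows):
        if len(fields) not in (3, 4):
            raise ValueError(f"expected 3 or 4 columns, got {len(fields)}")
        record = {"name": fields[0], "forward": fields[1], "reverse": fields[2]}
        if len(fields) == 4:
            # The SIZE column is only present in the extended format.
            record["size"] = int(fields[3])
        records.append(record)
    return records
```

Because the column count is checked per row, 3-column files parse exactly as before and simply produce records without a `size` key.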

Validation Framework

  • tests/validation/ module with 430-line validator
  • Format-specific validators for all 5 output formats
  • Multi-format reporting: Markdown, JSON, and console output
  • 23 real-data tests with real-world datasets:
    • Small dataset (5 amplicons) - all 5 formats
    • Medium dataset (80 amplicons) - all 5 formats
    • Edge cases (circular genomes, degenerate primers)
    • Real alignment tool testing (BLAST, Exonerate, merPCR)
  • Performance benchmarking: 2-8x faster than targets
  • Validation reports in docs/technical/validation/
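
As an illustration of the multi-format reporting idea, one set of check results can be rendered as JSON, Markdown, or console text. This is a hypothetical sketch; `CheckResult` and the three renderer functions are invented for the example and do not reflect the actual code in tests/validation/.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CheckResult:
    """One validation check outcome (hypothetical structure)."""
    name: str
    passed: bool
    detail: str = ""


def to_json(results):
    """Machine-readable report."""
    return json.dumps([asdict(r) for r in results], indent=2)


def to_markdown(results):
    """Report as a Markdown table, suitable for a docs page."""
    lines = ["| Check | Status | Detail |", "| --- | --- | --- |"]
    for r in results:
        status = "pass" if r.passed else "fail"
        lines.append(f"| {r.name} | {status} | {r.detail} |")
    return "\n".join(lines)


def to_console(results):
    """Short plain-text summary for terminal output."""
    lines = [f"[{'PASS' if r.passed else 'FAIL'}] {r.name}" for r in results]
    passed = sum(r.passed for r in results)
    lines.append(f"{passed}/{len(results)} checks passed")
    return "\n".join(lines)
```

Rendering from a single list of results keeps the three outputs consistent with each other.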

Documentation Improvements

  • Simplified README.md: 262 → 170 lines (35% reduction), user-focused
  • Updated CLAUDE.md: v0.2.0 features, metrics, alignment documentation
  • New validation documentation:
    • docs/technical/validation/README.md - Validation hub
    • docs/technical/validation/real-data-testing.md - Comprehensive test report
    • docs/technical/validation/v0.2.0-validation.md - Release validation
  • Enhanced docs/README.md: Added validation section with cross-links
  • "What's New in v0.2.0": Highlights section in README
  • "For Developers": Clear links to CLAUDE.md and technical docs

CI/CD Optimization

  • Fast feedback: quick-test job (ubuntu+3.11) runs first (~2-3 min)
  • Sequential testing: Full matrix (5 jobs) only runs if quick-test passes
  • No duplication: Excluded ubuntu+3.11 from matrix (already tested in quick-test)
  • Fixed security checks: Removed || true that was hiding failures
  • Parallel execution: lint job runs in parallel with quick-test
  • Faster feedback: Get results in 2-3 minutes instead of waiting for full matrix

🐛 Bug Fixes

Test Fixes (4 failing tests → all passing)

  1. STS Format Validation (3 tests fixed)

    • Updated test_all_parsers.py to accept both 3-column and 4-column STS formats
    • Added validation for SIZE column when present
    • Maintains backward compatibility
  2. Header Validation (1 test fixed)

    • Enhanced STS parser to validate header column names
    • Properly rejects invalid headers (e.g., "NAME\tSEQ1\tSEQ2")
    • Validates NAME/ID field and FORWARD/REVERSE keywords
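
The header rule described in item 2 can be sketched as below. This is an assumption-laden illustration rather than the parser's actual code: `validate_sts_header` is a hypothetical name, and the exact set of column names PrePrimer accepts is not stated in this PR.

```python
def validate_sts_header(line):
    """Return True if a tab-separated STS header line looks valid.

    Assumed rule, taken from the bullets above: the first column must be
    NAME or ID, and the remaining columns must mention both FORWARD and
    REVERSE.
    """
    columns = [c.strip().upper() for c in line.rstrip("\r\n").split("\t")]
    if len(columns) < 3 or columns[0] not in {"NAME", "ID"}:
        return False
    rest = columns[1:]
    has_forward = any("FORWARD" in c for c in rest)
    has_reverse = any("REVERSE" in c for c in rest)
    return has_forward and has_reverse
```

Under this rule the rejected example from the list, `"NAME\tSEQ1\tSEQ2"`, fails because neither primer column contains a FORWARD or REVERSE keyword.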

📊 Test Results

Before: 632 passed, 4 failed
After: 636 passed, 0 failed ✅

  • 611 core tests (parsers, writers, converters, etc.)
  • 23 validation tests (real data, alignment tools)
  • 2 new tests (from bug fixes)
  • 96.90% coverage maintained

🧹 Cleanup

  • Removed 3 obsolete .old workflow files
  • Added logo.png to .gitignore (989KB unused file)
  • Deleted SESSION_SUMMARY.md (temporary session notes)

⚠️ Breaking Changes

None - All changes are backward compatible.

The legacy configuration removal (PrePrimerConfig → EnhancedConfig) landed in previous commits. See CHANGELOG.md for the migration guide.

📝 Complete Details

See CHANGELOG.md for complete version history and upgrade guide.

✅ Validation Status

  • ✅ All 636 tests passing
  • ✅ 96.90% test coverage
  • ✅ 23 real-data validation tests (100% pass)
  • ✅ Real tool integration verified (BLAST, Exonerate, merPCR)
  • ✅ Performance targets exceeded (2-8x faster)
  • ✅ Documentation complete and organized
  • ✅ CI/CD optimized and tested

v0.2.0 is production-ready! πŸš€


🤖 Generated with Claude Code

druvus and others added 24 commits August 23, 2025 12:48
…nd unified test data structure

- Add security module with path validation and input sanitization
- Implement comprehensive testing framework (property-based, benchmarks, integration, mutation testing)
- Create unified test data structure with cross-format consistency
- Add enhanced configuration system with multi-format support
- Reorganize test data into datasets/ with legacy/ preservation
- Update all parsers and writers for improved performance and validation
- Add extensive documentation including CLAUDE.md ultrathink guide
- Implement 226 total tests with 225 passing for robust code coverage

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Comprehensive release preparation including documentation reorganization,
CI optimization, validation framework, and critical bug fixes.

## Documentation Improvements

### Reorganized Structure
- Created docs/technical/validation/ with comprehensive validation reports
- Moved validation documentation from root to organized location
- Simplified README.md (262 → 170 lines, user-focused)
- Updated CLAUDE.md with v0.2.0 features and metrics
- Enhanced docs/README.md with validation section

### Version Updates
- Updated all version references to 0.2.0
- Finalized CHANGELOG.md with complete v0.2.0 feature list
- Updated CITATION.cff metadata

## CI/CD Optimization

- Added fast feedback job (quick-test) running ubuntu+3.11 first
- Full test matrix (5 jobs) now runs only if quick-test passes
- Removed duplicate testing (excluded ubuntu+3.11 from matrix)
- Fixed security check that was hiding failures (removed || true)
- Lint job runs in parallel for faster feedback

## Test Fixes (4 failing tests → all passing)

### Fixed STS Format Validation (3 tests)
- Updated test_all_parsers.py to accept both 3-column and 4-column STS formats
- v0.2.0 enhancement: auto-detection of SIZE column when available
- Maintains backward compatibility with 3-column format

### Fixed Header Validation (1 test)
- Enhanced STS parser to validate header column names
- Now properly rejects invalid headers (e.g., "NAME\tSEQ1\tSEQ2")
- Validates presence of NAME/ID field and FORWARD/REVERSE keywords

## File Cleanup

- Removed 3 obsolete .old workflow files
- Added logo.png to .gitignore
- Deleted SESSION_SUMMARY.md (temporary session notes)

## New Features (from previous work, now committed)

### Alignment Functionality
- Added align.py with high-level alignment API
- Implemented 4 alignment providers:
  - BLAST (fast alignment)
  - Exonerate (sensitive alignment)
  - merPCR (modern Python, recommended)
  - me-PCR (legacy C tool)
- 36 comprehensive alignment tests

### Validation Framework
- Created tests/validation/ with 430-line validator
- Added report_generator.py for multi-format reports
- 23 real-data tests with 100% pass rate
- Real tool integration testing (BLAST, Exonerate, merPCR)

### Enhanced STS Format Support
- Auto-detection of 3 vs 4-column formats
- Header/headerless file support
- Compatible with me-PCR and merPCR output

## Test Results

Before: 632 passed, 4 failed
After:  636 passed, 0 failed ✅

All 611 core tests + 23 validation tests + 2 new tests passing.

## Breaking Changes

None in this commit - all changes are backward compatible.
Legacy config removal was in previous commits.

🤖 Generated with Claude Code
https://claude.com/claude-code

Co-Authored-By: Claude <[email protected]>
@druvus druvus changed the title from "Add support for olivar" to "Release v0.2.0 - Production Ready" Oct 21, 2025
@druvus druvus merged commit 8d1007a into main Oct 21, 2025
6 of 10 checks passed
@druvus druvus deleted the v0.2.0 branch October 21, 2025 17:49
@druvus druvus restored the v0.2.0 branch October 21, 2025 18:04