One of the best parts of the LLM revolution is that it can scratch niche itches. We've used Geneious in our lab to visualize sequence data for about a decade. One of its most convenient features is the ability to search NCBI databases and retrieve sequence files from GenBank and related sources. However, Geneious has never offered a search interface for the Sequence Read Archive (SRA). This Geneious Prime plugin fills that gap: it searches and downloads sequencing data from NCBI's SRA without leaving the Geneious workspace.
Tested on an Apple Silicon M4 MacBook Pro and an x64 iMac Pro. It should also work on Windows and Linux.
- Key Features
- System Requirements
- Installation
- Quick Start
- Usage Guide
- Technical Details
- Troubleshooting
- Contributing
- Support
- License
- Integrated SRA Search: Search NCBI SRA directly from Geneious using accession numbers, organism names, or other search terms
- Performance Tuned: Optimized with 8 threads and 1GB memory limit for faster downloads
- Two-Phase Download: Uses prefetch + fasterq-dump for 2-3x faster downloads
- Paired-End Support: Automatically detects and properly configures paired-end reads
- Quality Score Preservation: Downloads include quality scores when available
- Metadata Display: Shows comprehensive SRA metadata including organism, platform, library strategy, and more
- Enhanced Progress Tracking: Real-time progress updates with detailed phase information (download → conversion)
- Cross-Platform: Includes binaries for Windows, macOS, and Linux

Minimum requirements:
- Geneious Prime: Version 2024.0 or later
- Operating System:
  - Windows 10 or later (64-bit)
  - macOS 10.14 (Mojave) or later
  - Linux (64-bit distributions)
- RAM: 4 GB minimum
- Storage: Sufficient space for temporary files during download
- Internet: Stable connection for NCBI SRA database access

Recommended:
- RAM: 8 GB or more for large datasets
- Storage: SSD with ample free space (downloads can be large)
- Network: Broadband connection for faster downloads

Method 1 (install a released plugin):
- Download the latest `NcbiSraSearch.gplugin` from the Releases page
- In Geneious Prime, go to Tools → Plugins
- Click Install plugin from a gplugin file
- Select the downloaded `NcbiSraSearch.gplugin` file
- Restart Geneious Prime when prompted

Method 2 (build from source):

    # Clone the repository
    git clone https://github.com/dholab/geneious-plugin-sra-search.git
    cd geneious-plugin-sra-search

    # Build the plugin
    ant clean
    ant create-plugin

    # The plugin will be created as dist/NcbiSraSearch.gplugin

After building, follow the installation steps from Method 1 using your locally built `.gplugin` file.

- Access the Plugin: In the Sources panel, locate "NCBI SRA Search" with a database icon
- Search for Data:
  - Click NCBI SRA Search to activate it
  - Enter search terms (e.g., `SRR11192680`, `Homo sapiens`, or any NCBI query)
  - Press Enter or click Search
- Download Results:
  - Select the desired SRA records from the results
  - Click "Download FASTQ Data"
  - Monitor progress in the Operations panel

Example searches:
- Specific Accession: `SRR11192680`
- Organism: `Escherichia coli`
- Study: `PRJNA613958`
- Technology: `Illumina AND RNA-seq`
- Date Range: `Homo sapiens AND 2024[PDAT]`

The plugin supports all standard NCBI SRA search syntax:
- Simple Searches: Enter organism names, accession numbers, or keywords
- Boolean Operators: Use AND, OR, NOT for complex queries
- Field Tags: Use `[ORGN]`, `[ACCN]`, `[TITL]` for specific field searches
- Wildcards: Use `*` for partial matches
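
As a rough illustration of how such queries reach NCBI, the sketch below builds an ESearch request URL for the SRA database. It is written in Python for brevity (the plugin itself is built with Java and Ant), and the helper name `build_esearch_url` and the `retmax` default are assumptions made for this example, not taken from the plugin source.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20):
    """Return an ESearch URL that runs `term` against the SRA database.

    The function name and the retmax default are illustrative only.
    """
    params = {"db": "sra", "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-escapes brackets and encodes spaces in the query.
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    for query in (
        "SRR11192680[ACCN]",
        "Escherichia coli[ORGN] AND Illumina",
        "Homo sapiens AND 2024[PDAT]",
    ):
        print(build_esearch_url(query))
```

Field tags and Boolean operators pass through unchanged inside the `term` parameter; only the URL encoding differs from what you type into the search box.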
Results display includes:
- Accession: SRA run accession number
- Title: Experiment title
- Organism: Source organism
- Platform: Sequencing platform used
- Layout: Single or paired-end
- Spots: Number of reads
- Bases: Total base count
- Published: Release date
When downloading SRA data:
- Automatic Paired-End Detection: Properly imports paired reads
- Quality Scores: Preserves original quality information
- Metadata Integration: Attaches all available metadata to sequences
- Progress Monitoring: Real-time download status with ability to cancel
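
How the plugin detects pairs internally is not documented here. One plausible approach, sketched below in Python for illustration, relies on the file names that fasterq-dump produces in its default split-3 mode: `<run>_1.fastq` and `<run>_2.fastq` for mates, and `<run>.fastq` for single-end or unpaired reads. The function name `group_fastq_files` and the returned dictionary layout are hypothetical.

```python
import re
from collections import defaultdict

# Matches mate files such as SRR11192680_1.fastq or SRR11192680_2.fastq.
_MATE = re.compile(r"^(?P<run>.+)_(?P<mate>[12])\.fastq$")


def group_fastq_files(filenames):
    """Group FASTQ file names by run and label each run paired or single."""
    mates = defaultdict(dict)
    singles = defaultdict(list)
    for name in filenames:
        match = _MATE.match(name)
        if match:
            mates[match["run"]][match["mate"]] = name
        elif name.endswith(".fastq"):
            singles[name[: -len(".fastq")]].append(name)

    result = {}
    for run in sorted(set(mates) | set(singles)):
        pair = mates.get(run, {})
        is_paired = "1" in pair and "2" in pair
        # A lone _1 or _2 file without its partner is treated as unpaired.
        leftovers = [] if is_paired else sorted(pair.values())
        result[run] = {
            "layout": "paired" if is_paired else "single",
            "mates": [pair["1"], pair["2"]] if is_paired else [],
            "unpaired": leftovers + singles.get(run, []),
        }
    return result
```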
The plugin includes pre-compiled binaries from NCBI SRA Toolkit v3.1.1:
| Platform | Binaries | Version | Architecture |
|---|---|---|---|
| Windows | fasterq-dump.exe, prefetch.exe | 3.1.1 | x86_64 |
| macOS | fasterq-dump, prefetch | 3.1.1 | Universal (x86_64 + arm64) |
| Linux | fasterq-dump, prefetch | 3.1.1 | x86_64 |
The plugin includes several performance optimizations:
- 8 threads by default (vs 6 in standard fasterq-dump) for faster processing
- 1GB memory limit for efficient sorting operations
- Persistent binary caching (~90% faster plugin startup after first run)
Two-Phase Download Strategy (enabled by default): Splitting each download into two phases gives 2-3x faster performance:
- Phase 1 - Prefetch: Downloads SRA file to temp directory (network I/O)
- Phase 2 - Conversion: Converts SRA file to FASTQ (CPU/disk I/O)
This separation of network I/O from CPU processing significantly improves overall download performance and provides better progress tracking with distinct phases.
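
The two phases map onto two SRA Toolkit commands. The sketch below, in Python for illustration, shows one way to assemble them with the 8-thread and `--mem 1024M` settings described above. The function names, the directory layout, and the `--split-3` choice are assumptions; the plugin's actual invocation may differ.

```python
import subprocess
from pathlib import Path


def prefetch_command(accession, workdir):
    """Phase 1: download the run into workdir/<accession>/ (network I/O)."""
    return ["prefetch", accession, "--output-directory", str(workdir)]


def fasterq_dump_command(accession, workdir, outdir, threads=8, mem="1024M"):
    """Phase 2: convert the prefetched run to FASTQ (CPU and disk I/O)."""
    return [
        "fasterq-dump", str(Path(workdir) / accession),
        "--outdir", str(outdir),
        "--temp", str(workdir),
        "--threads", str(threads),
        "--mem", mem,
        "--split-3",
    ]


def download_run(accession, workdir, outdir):
    """Run both phases in order; raises CalledProcessError if either fails."""
    subprocess.run(prefetch_command(accession, workdir), check=True)
    subprocess.run(fasterq_dump_command(accession, workdir, outdir), check=True)
```

Keeping the command builders separate from `download_run` makes it easy to report progress per phase and to inspect the exact arguments without touching the network.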
The plugin uses the NCBI E-utilities API:
- ESearch: Query execution against SRA database
- ESummary: Metadata retrieval for search results
- Rate Limiting: Respects NCBI's 3 requests/second limit
- Error Handling: Automatic retry with exponential backoff
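
The rate limiting and retry behavior can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the plugin's Java code: the names `RateLimiter` and `call_with_backoff`, the retry count, and the 0.5-second base delay are all hypothetical.

```python
import time


class RateLimiter:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


def call_with_backoff(request, limiter, retries=4, base_delay=0.5, sleep=time.sleep):
    """Call request(); on OSError wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        limiter.wait()
        try:
            return request()
        except OSError:  # urllib's URLError is a subclass of OSError
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

With these defaults the waits between retries are 0.5, 1, 2, and 4 seconds, and every attempt still passes through the limiter so retries cannot exceed 3 requests per second.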
To build from source:
- Java 8 or later
- Apache Ant
- Geneious Plugin Development Kit (included in `lib/`)

If the plugin does not appear:
- Ensure Geneious Prime 2024.0 or later is installed
- Verify plugin installation in Tools → Plugins
- Restart Geneious Prime after installation

If downloads fail:
- Check internet connection stability
- Verify sufficient disk space for temporary files
- Some older SRA entries may be unavailable
- Try downloading fewer files simultaneously

If performance is slow:
- Large datasets require more RAM (increase in Geneious preferences)
- SSD storage recommended for better performance
- Download speed limited by NCBI servers

- Check existing GitHub Issues
- Create a new issue with:
  - Geneious Prime version
  - Plugin version
  - Operating system
  - Error messages or logs
  - Steps to reproduce
This project is licensed under the MIT License - see the LICENSE file for details.
The included fasterq-dump and prefetch binaries are from the NCBI SRA Toolkit, which is in the public domain.
- NCBI for the SRA database and toolkit
- Biomatters/Geneious for the plugin API
- Contributors and the scientific community for making sequence data publicly available
- Streaming Import Architecture: Complete redesign for constant memory usage regardless of file size
- Incremental Document Loading: Sequences appear in Geneious immediately as they're read from disk
- Memory Optimization: Eliminates out-of-memory errors even with 100M+ read files
- Improved User Experience: Can browse and work with sequences while import continues
- Uses callback-based `OperationCallback.addDocument()` API for real-time document availability
- Implements forwarding `ImportCallback` to stream directly: File → Importer → Geneious (no accumulation)
- Fixes progress message overflow in Geneious UI
- Removes invalid import options warnings
- Successfully handles large datasets that previously caused JVM heap exhaustion
- Two-Phase Download: Enabled prefetch + fasterq-dump strategy for 2-3x faster downloads
- Performance Improvements: Increased default thread count to 8 (from 6) for faster processing
- Added 1GB memory limit (`--mem 1024M`) for improved sorting operations
- Enhanced progress tracking with real-time spot count display and distinct download phases
- Bundled standalone `prefetch` binaries for all platforms (Windows, macOS, Linux)
- Updated all binaries to NCBI SRA Toolkit v3.1.1 standalone versions
- Performance optimizations result in significantly faster downloads compared to v1.0.x
- Fixed macOS compatibility issue with universal binary
- Now supports both Intel (x86_64) and Apple Silicon (arm64) Macs
- Updated macOS fasterq-dump to version 3.1.1
- Initial release
- Basic SRA search functionality
- FASTQ download with quality scores
- Paired-end read support
- Cross-platform compatibility (Windows, macOS, Linux)
- Bundled fasterq-dump binaries
- Real-time progress tracking
- Comprehensive metadata display
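
The streaming import mentioned in the changelog above (File → Importer → Geneious, with no accumulation) comes down to handing each record to a callback as soon as it is parsed. The sketch below shows that pattern in Python for illustration only. It does not use the Geneious API: `stream_fastq` and `on_record` are hypothetical names, with `on_record` standing in for a call such as `OperationCallback.addDocument()`.

```python
import io


def stream_fastq(handle, on_record):
    """Parse 4-line FASTQ records from `handle`, passing each to `on_record`.

    Only one record is held in memory at a time. Returns the record count.
    """
    count = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        on_record(header[1:].strip(), sequence, quality)
        count += 1
    return count


if __name__ == "__main__":
    demo = io.StringIO("@read1\nACGT\n+\nIIII\n")
    stream_fastq(demo, lambda name, seq, qual: print(name, seq, qual))
```

Because nothing is collected into a list, memory use stays flat regardless of how many reads the file contains, which is the property the streaming redesign is after.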
For the latest updates and releases, visit our GitHub repository: https://github.com/dholab/geneious-plugin-sra-search