NCBI SRA Search Plugin for Geneious Prime

One of the best parts of the LLM revolution is that it can scratch niche itches. We've used Geneious in our lab to visualize sequence data for about a decade. One of its most convenient features is the ability to search GenBank and other NCBI databases and retrieve sequence files directly. However, Geneious has never offered a search interface to the Sequence Read Archive (SRA). This Geneious Prime plugin fills that gap: it searches NCBI's SRA and downloads sequencing data without leaving the Geneious workspace.

Tested on an Apple Silicon M4 MacBook Pro and an x64 iMac Pro. It should also work on Windows and Linux.

Key Features

🔍 Integrated SRA Search: Search NCBI SRA directly from Geneious using accession numbers, organism names, or other search terms
🚀 Performance Tuned: Runs fasterq-dump with 8 threads and a 1 GB memory limit for faster downloads
⚡ Two-Phase Download: Uses prefetch + fasterq-dump for 2-3x faster downloads
🧬 Paired-End Support: Automatically detects and properly configures paired-end reads
📊 Quality Score Preservation: Downloads include quality scores when available
📋 Metadata Display: Shows comprehensive SRA metadata including organism, platform, library strategy, and more
📈 Enhanced Progress Tracking: Real-time progress updates with detailed phase information (download → conversion)
🌍 Cross-Platform: Includes binaries for Windows, macOS, and Linux

System Requirements

Minimum Requirements

Geneious Prime: Version 2024.0 or later
Operating System:
- Windows 10 or later (64-bit)
- macOS 10.14 (Mojave) or later
- Linux (64-bit distributions)
RAM: 4 GB minimum
Storage: Sufficient space for temporary files during download
Internet: Stable connection for NCBI SRA database access

Recommended Specifications

RAM: 8 GB or more for large datasets
Storage: SSD with ample free space (downloads can be large)
Network: Broadband connection for faster downloads

Installation

Method 1: Download from Releases (Recommended)

1. Download the latest NcbiSraSearch.gplugin from the Releases page
2. In Geneious Prime, go to Tools → Plugins
3. Click Install plugin from a gplugin file
4. Select the downloaded NcbiSraSearch.gplugin file
5. Restart Geneious Prime when prompted

Method 2: Build from Source

# Clone the repository
git clone https://github.com/dholab/geneious-plugin-sra-search.git
cd geneious-plugin-sra-search

# Build the plugin
ant clean
ant create-plugin

# The plugin will be created as dist/NcbiSraSearch.gplugin

After building, follow the installation steps from Method 1 using your locally built .gplugin file.

Quick Start

Basic SRA Search and Download

1. Access the Plugin: In the Sources panel, locate "NCBI SRA Search" with a database icon
2. Search for Data:
- Click "NCBI SRA Search" to activate it
- Enter search terms (e.g., SRR11192680, Homo sapiens, or any NCBI query)
- Press Enter or click Search
3. Download Results:
- Select desired SRA records from results
- Click "Download FASTQ Data"
- Monitor progress in the Operations panel

Example Search Queries

Specific Accession: SRR11192680
Organism: Escherichia coli
Study: PRJNA613958
Technology: Illumina AND RNA-seq
Date Range: Homo sapiens AND 2024[PDAT]

Usage Guide

Searching SRA

The plugin supports all standard NCBI SRA search syntax:

Simple Searches: Enter organism names, accession numbers, or keywords
Boolean Operators: Use AND, OR, NOT for complex queries
Field Tags: Use [ORGN], [ACCN], [TITL] for specific field searches
Wildcards: Use * for partial matches
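
As a rough illustration of how a query like the ones above reaches NCBI, the sketch below builds an ESearch request URL for the SRA database. It is not the plugin's own code (the plugin is written in Java); the function name `build_esearch_url` and the `retmax` default are hypothetical choices made for this example.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint for ESearch.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that runs `term` against the SRA database."""
    params = {
        "db": "sra",        # search the Sequence Read Archive
        "term": term,       # query text; field tags and Boolean operators pass through
        "retmax": retmax,   # hypothetical page size, not the plugin's actual setting
        "retmode": "json",
    }
    # urlencode percent-encodes brackets and turns spaces into '+'.
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("Escherichia coli[ORGN] AND Illumina"))
```

The same function accepts any of the example queries, such as `Homo sapiens AND 2024[PDAT]`, because the query text is only URL-encoded and never parsed locally.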

Understanding Search Results

Results display includes:

Accession: SRA run accession number
Title: Experiment title
Organism: Source organism
Platform: Sequencing platform used
Layout: Single or paired-end
Spots: Number of spots (a paired-end spot holds two reads)
Bases: Total base count
Published: Release date

Download Options

When downloading SRA data:

Automatic Paired-End Detection: Properly imports paired reads
Quality Scores: Preserves original quality information
Metadata Integration: Attaches all available metadata to sequences
Progress Monitoring: Real-time download status with ability to cancel
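
To make the paired-end detection concrete, here is a minimal sketch of one way to classify converted files by name. It assumes fasterq-dump's default output naming, where paired runs produce `<accession>_1.fastq` and `<accession>_2.fastq` and single-end runs produce `<accession>.fastq`. The function `classify_fastq_outputs` is hypothetical and may differ from what the plugin does internally.

```python
from pathlib import Path
from typing import Dict, List


def classify_fastq_outputs(accession: str, files: List[Path]) -> Dict[str, object]:
    """Guess the library layout from the FASTQ file names of one run."""
    by_name = {f.name: f for f in files}
    forward = by_name.get(f"{accession}_1.fastq")
    reverse = by_name.get(f"{accession}_2.fastq")
    unpaired = by_name.get(f"{accession}.fastq")

    if forward is not None and reverse is not None:
        # Both mates present: treat as paired-end; keep any leftover unpaired reads.
        return {"layout": "paired", "reads": [forward, reverse], "unpaired": unpaired}

    # Otherwise fall back to single-end with whatever file exists.
    reads = [f for f in (unpaired, forward) if f is not None]
    return {"layout": "single", "reads": reads, "unpaired": None}
```

Checking file names keeps the example simple. The Layout field in the search results is another signal that could be used for the same decision.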

Technical Details

Bundled Binaries

The plugin includes pre-compiled binaries from NCBI SRA Toolkit v3.1.1:

Platform	Binaries	Version	Architecture
Windows	`fasterq-dump.exe`, `prefetch.exe`	3.1.1	x86_64
macOS	`fasterq-dump`, `prefetch`	3.1.1	Universal (x86_64 + arm64)
Linux	`fasterq-dump`, `prefetch`	3.1.1	x86_64

Performance Optimization

The plugin includes several performance optimizations:

8 threads by default (vs 6 in standard fasterq-dump) for faster processing
1 GB memory limit (--mem 1024M) for efficient sorting operations
Persistent binary caching (~90% faster plugin startup after first run)

Two-Phase Download Strategy (enabled by default): The plugin splits each download into two phases, which makes downloads 2-3x faster than running fasterq-dump alone, as in v1.0.x:

Phase 1 - Prefetch: Downloads SRA file to temp directory (network I/O)
Phase 2 - Conversion: Converts SRA file to FASTQ (CPU/disk I/O)

This separation of network I/O from CPU processing significantly improves overall download performance and provides better progress tracking with distinct phases.
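
The two phases can be sketched as a pair of command lines. This is an illustrative outline rather than the plugin's implementation: it assumes `prefetch` and `fasterq-dump` are on the PATH (the plugin uses its bundled copies), and the helper names and directory layout are hypothetical. The thread count and memory limit are the values listed above.

```python
import subprocess
from pathlib import Path
from typing import List


def prefetch_command(accession: str, work_dir: Path) -> List[str]:
    """Phase 1: download the SRA archive into work_dir/<accession>/ (network I/O)."""
    return ["prefetch", accession, "--output-directory", str(work_dir)]


def fasterq_dump_command(accession: str, work_dir: Path, out_dir: Path,
                         threads: int = 8, mem: str = "1024M") -> List[str]:
    """Phase 2: convert the prefetched archive to FASTQ (CPU and disk I/O)."""
    return [
        "fasterq-dump", str(work_dir / accession),  # read the local copy, not the network
        "--outdir", str(out_dir),
        "--temp", str(work_dir),
        "--threads", str(threads),
        "--mem", mem,
    ]


def download(accession: str, work_dir: Path, out_dir: Path) -> None:
    """Run both phases in order, stopping if either command fails."""
    for cmd in (prefetch_command(accession, work_dir),
                fasterq_dump_command(accession, work_dir, out_dir)):
        subprocess.run(cmd, check=True)
```

Calling `download("SRR11192680", Path("work"), Path("fastq"))` would run the two commands back to back. Keeping them separate is also what allows progress to be reported per phase.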

API Integration

The plugin uses NCBI E-utilities API:

ESearch: Query execution against SRA database
ESummary: Metadata retrieval for search results
Rate Limiting: Respects NCBI's limit of 3 requests/second for clients without an API key
Error Handling: Automatic retry with exponential backoff
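
The rate limiting and retry behavior listed above might look roughly like the following. This is a sketch under stated assumptions, not the plugin's Java code: the class and function names, the retry count, and the 0.5 second base delay are all hypothetical. Only the 3 requests/second ceiling comes from the list above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimiter:
    """Space calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second: float = 3.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self._interval


def with_retries(call: Callable[[], T], limiter: RateLimiter,
                 max_attempts: int = 4, base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on OSError with delays of base_delay * 2**attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        limiter.wait()  # every attempt counts against the rate limit
        try:
            return call()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Catching `OSError` covers `urllib.error.URLError` and socket timeouts, since both are subclasses of it. The injectable `clock` and `sleep` parameters exist only so the logic can be exercised without real waiting.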

Build Requirements

To build from source:

Java 8 or later
Apache Ant
Geneious Plugin Development Kit (included in lib/)

Troubleshooting

Common Issues

Plugin Not Appearing

Ensure Geneious Prime 2024.0 or later is installed
Verify plugin installation in Tools → Plugins
Restart Geneious Prime after installation

Download Failures

Check internet connection stability
Verify sufficient disk space for temporary files
Some older SRA entries may be unavailable
Try downloading fewer files simultaneously
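
For the disk space check, a quick way to see how much room is left is shown below. It assumes the temporary files land in the system temp directory, which may not match where the plugin actually writes them, and `has_free_space` is a hypothetical helper name.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def has_free_space(required_bytes: int, directory: Optional[Path] = None) -> bool:
    """Return True if `directory` (default: system temp dir) has enough free bytes."""
    target = directory or Path(tempfile.gettempdir())
    return shutil.disk_usage(target).free >= required_bytes


if __name__ == "__main__":
    # Example threshold of 50 GB; pick a value that suits the run being downloaded.
    print(has_free_space(50 * 1024 ** 3))
```

Both the prefetched SRA file and the converted FASTQ files need space at the same time, so it is worth leaving generous headroom.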

Performance Issues

Large datasets require more RAM (increase in Geneious preferences)
SSD storage recommended for better performance
Download speed limited by NCBI servers

Getting Help

Check existing GitHub Issues. If your problem is not listed, create a new issue with:
- Geneious Prime version
- Plugin version
- Operating system
- Error messages or logs
- Steps to reproduce

License

This project is licensed under the MIT License - see the LICENSE file for details.

The included fasterq-dump and prefetch binaries are from the NCBI SRA Toolkit, which is in the public domain.

Acknowledgments

NCBI for the SRA database and toolkit
Biomatters/Geneious for the plugin API
Contributors and the scientific community for making sequence data publicly available

Version History

Version 1.3.0 (2024-11-19)

Streaming Import Architecture: Complete redesign for constant memory usage regardless of file size
Incremental Document Loading: Sequences appear in Geneious immediately as they're read from disk
Memory Optimization: Eliminates out-of-memory errors even with 100M+ read files
Improved User Experience: Can browse and work with sequences while import continues
Uses callback-based OperationCallback.addDocument() API for real-time document availability
Implements forwarding ImportCallback to stream directly: File → Importer → Geneious (no accumulation)
Fixes progress message overflow in Geneious UI
Removes invalid import options warnings
Successfully handles large datasets that previously caused JVM heap exhaustion

Version 1.2.0 (2024-11-18)

Two-Phase Download: Enabled prefetch + fasterq-dump strategy for 2-3x faster downloads
Performance Improvements: Increased default thread count to 8 (from 6) for faster processing
Added 1GB memory limit (--mem 1024M) for improved sorting operations
Enhanced progress tracking with real-time spot count display and distinct download phases
Bundled standalone prefetch binaries for all platforms (Windows, macOS, Linux)
Updated all binaries to NCBI SRA Toolkit v3.1.1 standalone versions
Performance optimizations result in significantly faster downloads compared to v1.0.x

Version 1.0.1 (2024-08-12)

Fixed macOS compatibility issue with universal binary
Now supports both Intel (x86_64) and Apple Silicon (arm64) Macs
Updated macOS fasterq-dump to version 3.1.1

Version 1.0.0 (2024-08-12)

Initial release
Basic SRA search functionality
FASTQ download with quality scores
Paired-end read support
Cross-platform compatibility (Windows, macOS, Linux)
Bundled fasterq-dump binaries
Real-time progress tracking
Comprehensive metadata display

For the latest updates and releases, visit our GitHub repository: https://github.com/dholab/geneious-plugin-sra-search
