Commit 39f1cd0

Merge pull request #217 from ctmrbio/develop
StaG v6.1
2 parents ec8a5f9 + 57f98bf commit 39f1cd0

17 files changed: +120 -67 lines changed

CHANGELOG.md

Lines changed: 24 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -14,16 +14,39 @@ committed to the master branch that does not trigger any of the aforementioned
1414
situations.
1515

1616

17-
## [0.6.1] Unreleased
17+
## [0.6.1] 2023-06-01
1818
### Added
19+
- BBMap: now outputs sorted BAM file, added options `keep_sam` and `keep_bam`.
20+
- Bowtie2: added option `keep_bam`.
1921

2022
### Fixed
23+
- KrakenUniq: environment variable `LC_ALL` has been added to Singularity image
24+
to prevent unnecessary warning messages related to it being undefined.
25+
- KrakenUniq: now able to run when host removal is skipped, solved by adding
26+
`krakenuniq_merge_reads` rule to create a temporary merged fasta file with
27+
input data for KrakenUniq to avoid giving KrakenUniq symlinks as input.
2128

2229
### Changed
30+
- MetaPhlAn: Updated to v4.0.6
31+
- HUMAnN3: Updated to v3.7
32+
- HUMAnN3: Changed the way the temporary directory is resolved, now using
33+
Snakemake's built-in `resources.tmpdir`. This should prevent HUMAnN from
34+
creating large temporary directories outside of Slurm job folders so that
35+
they cannot be automatically cleaned up if the Slurm job times out or fails
36+
before HUMAnN can clean up after itself.
37+
- KrakenUniq: Concatenate reads with BBMap's `fuse.sh` with a padding of one
38+
`N` instead of interleaving the paired inputs into a single FASTA to avoid
39+
KrakenUniq treating paired reads independently.
2340

2441
### Deprecated
42+
- Kaiju, Kraken2, MetaPhlAn: area plot removed due to repeatedly leading to
43+
failed runs in cached Singularity containers. The script still works as
44+
intended in newer matplotlib versions and will remain in the scripts folder
45+
for potential manual use if desired.
2546

2647
### Removed
48+
- Groot: Removed settings related to read length window as that feature was
49+
removed in a previous StaG release.
2750

2851

2952
## [0.6.0] 2023-04-17
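
Note on the KrakenUniq entry under "Changed" in the CHANGELOG.md diff above: the following is a minimal Python sketch of what joining a read pair with a single `N` of padding means conceptually. The function name `fuse_pair` is hypothetical, and this is not how BBMap's `fuse.sh` is implemented. In particular, the sketch ignores read orientation and quality scores, which the real tool may handle differently.

```python
def fuse_pair(read1: str, read2: str, pad: int = 1) -> str:
    """Join two mate sequences into one, separated by `pad` N characters.

    Illustrative only: orientation handling (for example reverse
    complementing the second mate) is deliberately left out.
    """
    if pad < 0:
        raise ValueError("pad must be non-negative")
    return read1 + "N" * pad + read2


# Hypothetical mates of one read pair.
fused = fuse_pair("ACGT", "TTGA")
print(fused)  # ACGTNTTGA
```

Because both mates end up in one record, a classifier that reads the file sees them as a single sequence rather than as two unrelated reads, which is the motivation stated in the changelog entry.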

config/config.yaml

Lines changed: 4 additions & 4 deletions
@@ -17,8 +17,7 @@ input_fn_pattern: "{sample}_{readpair}.fq.gz"
1717
samplesheet: "" # Three-column samplesheet with sample_id,fastq_1,fastq_2 columns. Used instead of inputdir
1818
outdir: "output_dir"
1919
logdir: "output_dir/logs"
20-
tmpdir: "/tmp" # Specifying a tmpdir is not normally necessary but some tools like HUMAnN requires it. If required please enter path to a temp directory e.g. /scratch
21-
dbdir: "databases" # Databases will be downloaded to this dir, if requested
20+
dbdir: "databases" # Databases will be downloaded to this dir
2221
report: "StaG_report-" # Filename prefix for report file ("-{datetime}.html" automatically appended)
2322
email: "" # Email to send status message after completed/failed run.
2423

@@ -161,8 +160,6 @@ humann:
161160
#########################
162161
groot:
163162
index_dir: "" # [Required] Path to groot indexDir
164-
minlength: 110 # Minlength for groot index
165-
maxlength: 125 # Maxlength for groot index
166163
covcutoff: 0.97 # Coverage cutoff for groot report
167164
lowcov: False # Report ARGs with no 5' or 3' coverage. Overrides covcutoff.
168165

@@ -185,6 +182,8 @@ bbmap:
185182
- db_name: "" # [Required] Custom name for BBMap database
186183
db_path: "" # [Required] Path to BBMap database (folder should contain a 'ref' folder)
187184
min_id: 0.76 # Minimum id for read alignment, BBMap default is 0.76
185+
keep_sam: False # Set to True to keep intermediary SAM file
186+
keep_bam: True # Set to False to remove bam files after counting annotations
188187
extra: "" # Extra BBMap command line parameters
189188
counts_table:
190189
annotations: "" # Tab-separated annotation file with headers, first column is full FASTA header of reference sequences
@@ -196,6 +195,7 @@ bbmap:
196195
extra: "" # Extra featureCount command line parameters
197196
bowtie2:
198197
- db_prefix: "" # [Required] Full path to Bowtie2 index (not including file extension)
198+
keep_bam: True # Set to False to remove bam files after counting annotations
199199
extra: "" # Extra bowtie2 commandline parameters
200200
counts_table:
201201
annotations: "" # Tab-separated annotation file with headers, first column is full FASTA header of reference sequences
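
Note on the new `keep_sam` and `keep_bam` options in the config/config.yaml diff above: a small Python sketch of how such flags could decide which alignment files survive a run. The helper name `retained_alignment_files` and the file naming scheme are assumptions made for illustration; the actual rule logic in StaG is not part of the diff shown here. The defaults mirror the values in `config.yaml` (`keep_sam: False`, `keep_bam: True`).

```python
from pathlib import Path
from typing import List


def retained_alignment_files(sample: str, db_config: dict, outdir: Path) -> List[Path]:
    """Return the alignment files that would be kept for one sample.

    `db_config` is one entry of the `bbmap` (or `bowtie2`) list in the
    config file. File names are hypothetical.
    """
    keep_sam = db_config.get("keep_sam", False)  # default as in config.yaml
    keep_bam = db_config.get("keep_bam", True)   # default as in config.yaml

    kept = []
    if keep_sam:
        kept.append(outdir / f"{sample}.sam")
    if keep_bam:
        kept.append(outdir / f"{sample}.bam")
    return kept


# With the defaults only the BAM file is retained.
print(retained_alignment_files("sample1", {}, Path("output_dir/bbmap/db1")))
```

Setting `keep_bam: False` in such a scheme would leave only the counts tables, which can save considerable disk space for large cohorts.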

docs/source/conf.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -56,9 +56,9 @@
5656
# built documents.
5757
#
5858
# The short X.Y version.
59-
version = '0.6.0'
59+
version = '0.6.1'
6060
# The full version, including alpha/beta/rc tags.
61-
release = '0.6.0-dev'
61+
release = '0.6.1'
6262

6363
# reStructuredText prolog contains a string of reStructuredText that will be
6464
# included at the beginning of every source file that is read.

docs/source/modules.rst

Lines changed: 20 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@
1212
.. _featureCounts: https://subread.sourceforge.net/featureCounts.html
1313
.. _HUMAnN: https://github.com/biobakery/biobakery/wiki/humann3
1414
.. _GTF format: https://genome.ucsc.edu/FAQ/FAQformat.html#format4
15-
.. _SAF format: http://bioinf.wehi.edu.au/featureCounts/
15+
.. _SAF format: https://subread.sourceforge.net/featureCounts.html
1616
.. _MEGAHIT: https://github.com/voutcn/megahit
1717
.. _MultiQC: https://multiqc.info/
1818
.. _amrplusplus: https://megares.meglab.org/amrplusplus/latest/html/what_AMR++_produces.html
@@ -259,13 +259,13 @@ the default unit is counts per million (cpm).
259259

260260
HUMAnN uses the taxonomic profiles produced by MetaPhlAn as input,
261261
so all MetaPhlAn-associated steps are run regardless of whether it is actually
262-
enabled in ``config.yaml`` or not. It is important to use a MetaPhlAn database
263-
compatible with HUMAnN3, e.g. mpa_v30_CHOCOPhlAn_201901 (run the metaphlan
264-
step with the extra ``--mpa3`` flag in the StaG config file).
262+
enabled in ``config.yaml`` or not.
265263

266-
Due to temporary disk space issues with running HUMAnN it is now a requirement
267-
to specify a $TMPDIR in ``config.yaml``, e.g. ``/scratch`` or ``/tmp`` depending
268-
on your system's configuration.
264+
HUMAnN requires large amounts of temporary disk space when processing a sample
265+
and will automatically use a suitable temporary directory from system
266+
environment variable ``$TMPDIR``, using Snakemake's resources feature to
267+
evaluate the variable at runtime (which means it can utilize node-local
268+
temporary disk if executing on a compute cluster).
269269

270270

271271
Antibiotic resistance
@@ -296,9 +296,8 @@ mapped graphs of all detected antibiotic resistance genes.
296296
The feature may reintroduced in future versions of GROOT but is not
297297
available in StaG now.
298298

299-
The read lengths input to `groot`_ must conform to the settings used during
300-
`groot`_ database construction. The length window can be configured in the
301-
config file.
299+
The read lengths used with `groot`_ should preferably conform to the settings
300+
used during `groot`_ database construction.
302301

303302
AMRPlusPlus_v2
304303
-------
@@ -353,10 +352,11 @@ BBMap
353352
:Tool: `BBMap`_
354353
:Output folder: ``bbmap/<database_name>``
355354

356-
This module maps read using `BBMap`_. The output is in gzipped SAM format. It
357-
is possible to configure the mapping settings almost entirely according to
358-
preference, with the exception of changing the output format from gzipped SAM.
359-
Use the configuration parameter ``bbmap:extra`` to add any standard BBMap
355+
This module maps reads using `BBMap`_. The output is in sorted and indexed BAM
356+
format (with an option to keep the intermediary SAM file used to create the
357+
BAM). It is possible to configure the mapping settings almost entirely
358+
according to preference, with the exception of changing the output format. Use
359+
the configuration parameter ``bbmap:extra`` to add any standard BBMap
360360
commandline parameter you want.
361361

362362
Bowtie2
@@ -381,6 +381,8 @@ signifies a list)::
381381
- db_name: ""
382382
db_path: ""
383383
min_id: 0.76
384+
keep_sam: False
385+
keep_bam: True
384386
extra: ""
385387
counts_table:
386388
annotations: ""
@@ -400,6 +402,8 @@ configuration options, but with different settings. For example, to map against
400402
- db_name: "db1"
401403
db_path: "/path/to/db1"
402404
min_id: 0.76
405+
keep_sam: False
406+
keep_bam: True
403407
extra: ""
404408
counts_table:
405409
annotations: ""
@@ -412,6 +416,8 @@ configuration options, but with different settings. For example, to map against
412416
- db_name: "db2"
413417
db_path: "/path/to/db2"
414418
min_id: 0.76
419+
keep_sam: False
420+
keep_bam: True
415421
extra: ""
416422
counts_table:
417423
annotations: "/path/to/db2/annotations.txt"

workflow/Snakefile

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ from scripts.common import UserMessages, SampleSheet
2121

2222
user_messages = UserMessages()
2323

24-
stag_version = "0.6.0"
24+
stag_version = "0.6.1"
2525
singularity_branch_tag = "-master" # Replace with "-master" before publishing new version
2626

2727
configfile: "config/config.yaml"
@@ -31,7 +31,6 @@ citations = {publications["StaG"], publications["Snakemake"]}
3131
INPUTDIR = Path(config["inputdir"])
3232
OUTDIR = Path(config["outdir"])
3333
LOGDIR = Path(config["logdir"])
34-
TMPDIR = Path(config["tmpdir"])
3534
DBDIR = Path(config["dbdir"])
3635
all_outputs = []
3736
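
Note on the removal of `TMPDIR = Path(config["tmpdir"])` in the workflow/Snakefile diff above: the temporary directory is no longer fixed at configuration time. The sketch below shows the general idea of resolving it at runtime from the environment, using Python's standard `tempfile` module. The function name `resolve_tmpdir` is hypothetical, and this is not StaG's or Snakemake's actual code; it only demonstrates that a value read when the job starts can follow a per-job `$TMPDIR`, such as node-local scratch on a cluster.

```python
import os
import tempfile


def resolve_tmpdir() -> str:
    """Return the temporary directory as seen by the current process."""
    # tempfile caches its first answer; clear the cache so the current
    # environment (e.g. a per-job $TMPDIR) is consulted again.
    tempfile.tempdir = None
    return tempfile.gettempdir()


if __name__ == "__main__":
    # Simulate a scheduler that points $TMPDIR at a job-specific folder.
    job_dir = tempfile.mkdtemp()
    os.environ["TMPDIR"] = job_dir
    print(resolve_tmpdir())
```

A path hard-coded in a config file cannot adapt in this way, which is why evaluating the variable when the job runs allows temporary files to be cleaned up together with the job folder.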

workflow/envs/Singularity.biobakery

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ From: mambaorg/micromamba:0.17.0
1111
-c bioconda \
1212
-c conda-forge \
1313
-c biobakery \
14-
python=3.7 metaphlan=4.0.3 humann=3.6 krona=2.8.1
14+
python=3.7 metaphlan=4.0.6 humann=3.7 krona=2.8.1
1515
micromamba clean --yes --all
1616

1717
humann_databases --download utility_mapping full /opt/humann --update-config yes

workflow/envs/Singularity.krakenuniq

Lines changed: 8 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,17 +8,20 @@ From: debian:11
88
AUTHOR boulund
99
VERSION 1.0
1010

11+
%environment
12+
export LC_ALL="C"
13+
1114
%post
1215
apt-get update && \
13-
apt-get install -y \
16+
apt-get install -y \
1417
bash \
1518
perl \
1619
make \
1720
g++ \
1821
libbz2-dev \
1922
zlib1g-dev \
20-
file \
21-
wget
23+
file \
24+
wget
2225

2326
mkdir -pv /opt/krakenuniq /opt/krakenuniq-src
2427

@@ -28,9 +31,9 @@ From: debian:11
2831
tar -xf v1.0.3.tar.gz
2932
cd krakenuniq-1.0.3
3033

31-
./install_krakenuniq.sh -l /usr/local/bin /opt/krakenuniq
34+
./install_krakenuniq.sh -l /usr/local/bin /opt/krakenuniq
3235

33-
rm -fv /opt/krakenuniq-src/v1.0.3.tar.gz
36+
rm -fv /opt/krakenuniq-src/v1.0.3.tar.gz
3437

3538

3639
%runscript

workflow/envs/humann.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,4 +5,4 @@ channels:
55
- bioconda
66
- defaults
77
dependencies:
8-
- humann=3.6
8+
- humann=3.7

workflow/envs/metaphlan.yaml

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,6 @@ channels:
33
- conda-forge
44
- bioconda
55
- defaults
6-
- biobakery
76
dependencies:
8-
- metaphlan =4.0.3
7+
- metaphlan =4.0.6
98
- krona =2.8.1

workflow/rules/antibiotic_resistance/groot.smk

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -45,8 +45,6 @@ rule groot_align:
4545
threads: 8
4646
params:
4747
index_dir=groot_config["index_dir"],
48-
minlength=groot_config["minlength"],
49-
maxlength=groot_config["maxlength"],
5048
shell:
5149
"""
5250
groot align \
