diff --git a/sites/docs/src/content/docs/contributing/documentation.md b/sites/docs/src/content/docs/contributing/documentation.md
index c8c09be455..0cf169e710 100644
--- a/sites/docs/src/content/docs/contributing/documentation.md
+++ b/sites/docs/src/content/docs/contributing/documentation.md
@@ -38,7 +38,7 @@ Before you start writing, familiarize yourself with these essential resources:
### Style guide
-The [style guide](../developers/documentation/style_guide) covers all the essential styling rules for nf-core documentation, including:
+The [style guide](../developers/documentation/style_guide.md) covers all the essential styling rules for nf-core documentation, including:
- Voice and tone guidelines for conversational, concise writing
- Grammar and punctuation rules (British English, active voice, Oxford comma)
diff --git a/sites/docs/src/content/docs/running/configuration/nextflow-for-your-system.md b/sites/docs/src/content/docs/running/configuration/nextflow-for-your-system.md
index 1313d60a28..27acabc617 100644
--- a/sites/docs/src/content/docs/running/configuration/nextflow-for-your-system.md
+++ b/sites/docs/src/content/docs/running/configuration/nextflow-for-your-system.md
@@ -9,17 +9,19 @@ This page shows you how to configure pipelines to match your system's capabiliti
## Workflow resources
-The base configuration of nf-core pipelines defines default resource allocations for each workflow step (e.g., in the [`base.config`](https://github.com/nf-core/rnaseq/blob/master/conf/base.config) file).
+The base configuration of nf-core pipelines defines default resource allocations for each workflow step (for example, in the [`base.config`](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config) file).
These default values are generous to accommodate diverse workloads across different users.
-Your jobs might receive more resources than needed, which can reduce system efficiency.
-You might also want to increase resources for specific tasks to maximise speed.
+However, your jobs might receive more resources than needed, which can reduce system efficiency.
+Conversely, you might want to increase resources for specific tasks to maximise speed.
Consider increasing resources if a pipeline step fails with a `Command exit status` of `137`.
-Pipelines configure tools to use available resources when possible (e.g., with `-p ${task.cpus}`), where `${task.cpus}` is dynamically set from the pipeline configuration.
+:::note
+Pipelines are written so that tools use the resources they are allocated when possible (e.g., with `-p ${task.cpus}`), where `${task.cpus}` is set dynamically from the pipeline configuration.
Not all tools support dynamic resource configuration.
+:::
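+
+For illustration, a simplified sketch of how a module might pass the allocated CPUs to a tool (the process and tool names here are hypothetical):
+
+```groovy
+process EXAMPLE_TOOL {
+    label 'process_medium'
+
+    input:
+    path reads
+
+    script:
+    """
+    example_tool --threads $task.cpus $reads
+    """
+}
+```
+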
-Most process resources use process labels, as shown in this base configuration example:
+Most nf-core pipelines use process labels to define resource requirements for each module, as shown in this base configuration example:
```groovy
process {
@@ -48,7 +50,7 @@ process {
The `resourceLimits` list sets the absolute maximum resources any pipeline job can request (typically matching your machine's maximum available resources).
The label blocks define the initial default resources each pipeline job requests.
-When a job runs out of memory, most nf-core pipelines retry the job and increase the resource request up to the `resourceLimits` maximum.
+When a job runs out of memory, most nf-core pipelines will retry the job with an increased resource request, up to the `resourceLimits` maximum.
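+
+As an illustration, the retry logic in a pipeline's `conf/base.config` looks roughly like this (exact values and exit codes vary between pipelines):
+
+```groovy
+process {
+    // Retry when a task fails with a resource-related exit status (e.g., 137)
+    errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }
+    maxRetries    = 1
+
+    withLabel: 'process_high' {
+        // Requests scale with each attempt, capped by resourceLimits
+        cpus   = { 12    * task.attempt }
+        memory = { 72.GB * task.attempt }
+    }
+}
+```
+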
### Customize process resources
@@ -56,7 +58,7 @@ When a job runs out of memory, most nf-core pipelines retry the job and increase
Copy only the labels you want to change into your custom configuration file, not all labels.
:::
-To set a fixed memory allocation for all large tasks across most nf-core pipelines (without increases during retries), add this to your custom configuration file:
+To set a fixed memory allocation for all large tasks across most nf-core pipelines (without increases during retries), add this to a custom Nextflow configuration file:
```groovy
process {
@@ -66,6 +68,10 @@ process {
}
```
+:::tip
+To find the default labels and resources of the pipeline you want to optimise, go to its GitHub repository and look at the file `conf/base.config`.
+:::
+
You can target a specific process (job) name instead of a label using `withName`.
Find process names in your console log when the pipeline runs.
For example:
@@ -88,13 +94,9 @@ process {
}
```
-:::info
-If you receive a warning about an unrecognised process selector, check that you specified the process name correctly.
-:::
-
For more information, see the [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html#process-selectors).
-After writing your [configuration file](#custom-configuration-files), supply it to your pipeline command with `-c`.
+After writing your [configuration file](#custom-configuration-files), supply it to your pipeline command with `-c <path/to/custom.conf>`.
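+
+For example, a run with a hypothetical custom file named `my_resources.conf` might look like:
+
+```bash
+nextflow run nf-core/rnaseq \
+    -profile docker \
+    -c my_resources.conf \
+    --input samplesheet.csv \
+    --outdir results
+```
+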
:::warning
Check your syntax carefully.
@@ -105,11 +107,16 @@ Use quotes with a space or no quotes with a dot: `"200 GB"` or `200.GB`.
See the Nextflow documentation for [memory](https://www.nextflow.io/docs/latest/process.html#memory), [cpus](https://www.nextflow.io/docs/latest/process.html#cpus), and [time](https://www.nextflow.io/docs/latest/process.html#time).
:::
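+
+For example, both notations below request the same memory (the process name is a placeholder):
+
+```groovy
+process {
+    withName: 'EXAMPLE_PROCESS' {
+        memory = '200 GB' // quoted string with a space
+        // memory = 200.GB // unquoted Groovy notation with a dot
+    }
+}
+```
+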
+:::info
+If you receive a warning about an unrecognised process selector when running a pipeline, check that you specified the process name correctly.
+:::
+
If the pipeline defaults need adjustment, contact the pipeline developers on Slack in the pipeline channel or submit a GitHub issue on the pipeline repository.
## Change your executor
-Nextflow pipelines run in local mode by default, executing jobs on the same system where Nextflow runs.
+Nextflow pipelines run in 'local' mode by default, executing jobs on the same system where Nextflow runs and assuming all tools the pipeline needs are already available on the machine's `$PATH`.
+
Most users need to specify an executor to tell Nextflow how to submit jobs to a job scheduler (e.g., SGE, LSF, Slurm, PBS, or AWS Batch).
You can configure the executor in shared configuration profiles or in custom configuration files.
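+
+As a minimal sketch, assuming a Slurm cluster (the queue name is site-specific):
+
+```groovy
+process {
+    executor = 'slurm'
+    queue    = 'compute'
+}
+```
+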
@@ -135,12 +142,12 @@ process {
```
When a job exceeds the default memory request, Nextflow retries the job with increased memory.
-The memory increases with each retry until the job completes or reaches the `256.GB` limit.
-
-These parameters cap resource requests to prevent Nextflow from submitting jobs that exceed your system's capabilities.
+The memory request increases with each retry until the job completes or reaches one of the configured limits (here, `256.GB` for memory).
+:::warning
Specifying resource limits does not increase the resources available to pipeline tasks.
See [Tuning workflow resources](#tuning-workflow-resources) for more information.
+:::
:::note{collapse title="Note on older nf-core pipelines"}
@@ -166,8 +173,10 @@ The `--max_` parameters represent the maximum for a single pipeline jo
## Customize Docker registries
-Most pipelines use `quay.io` as the default Docker registry for Docker and Podman images.
-When you specify a Docker container without a full URI, Nextflow pulls the image from `quay.io`.
+Most nf-core pipelines use `quay.io` as the default Docker registry for Docker and Podman images.
+In some cases, you may want to customise where a pipeline sources its images.
+
+By default, when you specify a Docker container without a full URI, Nextflow pulls the image from `quay.io`.
For example, this container specification:
@@ -177,15 +186,11 @@ Pulls from `quay.io`, resulting in the full URI:
- `quay.io/biocontainers/fastqc:0.11.7--4`
-If you specify a different `docker.registry` value, Nextflow uses that registry instead.
+If you specify a different `docker.registry` value in a configuration file, Nextflow uses that registry instead.
For example, if you set `docker.registry = 'myregistry.com'`, the image pulls from:
- `myregistry.com/biocontainers/fastqc:0.11.7--4`
-When you specify a full URI in the container specification, Nextflow ignores the `docker.registry` setting and pulls exactly as specified:
-
-- `docker.io/biocontainers/fastqc:v0.11.9_cv8`
-
## Update tool versions
The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of nf-core pipelines uses one container or Conda environment per process, which simplifies software dependency maintenance and updates.
@@ -232,13 +237,12 @@ You can override the default container by creating a custom configuration file a
```
:::warning
-Pipeline developers provide no warranty when you update containers.
-Major changes in the container tool may break the pipeline.
+When you specify a full URI in the container specification, Nextflow ignores the `docker.registry` setting and pulls exactly as specified.
:::
:::warning
-Tool developers sometimes change version reporting between updates.
-Container updates may break version reporting within the pipeline and create missing values in MultiQC version tables.
+Pipeline developers provide no warranty when you update containers.
+Major changes in the container tool may break the pipeline or degrade its output (e.g., missing versions in MultiQC reports when the tool changes how it reports versions).
:::
## Modifying tool arguments
diff --git a/sites/docs/src/content/docs/running/configuration/overview.md b/sites/docs/src/content/docs/running/configuration/overview.md
index a902f7b334..4546718895 100644
--- a/sites/docs/src/content/docs/running/configuration/overview.md
+++ b/sites/docs/src/content/docs/running/configuration/overview.md
@@ -21,11 +21,14 @@ For pipeline-specific parameters, see the pipeline documentation.
## Configuration options
-You can configure pipelines using three approaches:
+You can configure pipelines for your infrastructure using three approaches:
-1. [Default pipeline configuration profiles](#default-configuration-profiles)
-2. [Shared nf-core/configs configuration profiles](#shared-nf-coreconfigs)
-3. [Custom configuration files](#custom-configuration-files)
+1. [Default configuration profiles](#default-configuration-profiles)
+2. [Shared nf-core/configs configuration profiles](#shared-nf-coreconfigs)
+3. [Custom configuration files](#custom-configuration-files)
:::warning{title="Do not edit the pipeline code to configure nf-core pipelines"}
Editing pipeline defaults prevents you from updating to newer versions without overwriting your changes.
@@ -48,7 +51,7 @@ Use shared nf-core/configs when:
Use custom configuration files when:
-- You need specific resource limits
+- You need pipeline-specific resource limits
- Running on unique infrastructure
- You are the only user of the pipeline
@@ -65,6 +68,7 @@ Order matters.
Profiles load in sequence.
Later profiles overwrite earlier ones.
:::
+
nf-core provides these basic profiles for container engines:
- `docker`: Uses [Docker](http://docker.com/) and pulls software from quay.io
@@ -78,7 +82,7 @@ nf-core provides these basic profiles for container engines:
Use Conda only as a last resort (that is, when you cannot run the pipeline with Docker or Singularity).
:::
-Without a specified profile, the pipeline runs locally and expects all software to be installed and available on the `PATH`.
+Without a specified profile, the pipeline runs locally and expects all software to be installed and available on the `$PATH`.
This approach is not recommended.
Each pipeline includes `test` and `test_full` profiles.
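+
+For example, to check your setup you can combine the small `test` profile with a container profile (later profiles overwrite earlier ones):
+
+```bash
+nextflow run nf-core/rnaseq -profile test,docker --outdir results
+```
+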
@@ -97,9 +101,11 @@ If not, follow the repository instructions or the tutorial to add your cluster.
### Custom configuration files
-If you run the pipeline alone, create a local configuration file.
+If you run the pipeline alone on a local machine, create a local configuration file.
Nextflow searches for configuration files in three locations:
+
1. User's home directory: `~/.nextflow/config`
2. Analysis working directory: `nextflow.config`
3. Custom path on the command line: `-c path/to/config` (you can specify multiple files)
@@ -113,11 +119,15 @@ The loading order is:
4. Each `-c` file in the order you specify
5. Command line parameters (`--`)
-:::warning
+
+:::warning
Parameters in `custom.config` files will not override defaults in `nextflow.config`.
Use `-params-file` with YAML or JSON format instead.
:::
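+
+For example, a minimal hypothetical parameters file and how to supply it:
+
+```yaml title="params.yaml"
+input: 'samplesheet.csv'
+outdir: 'results'
+```
+
+Pass it to the pipeline with `-params-file params.yaml`.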
+
:::tip
Generate a parameters file using the **Launch** button on the [nf-co.re website](https://nf-co.re/launch).
:::
diff --git a/sites/docs/src/content/docs/running/reference-genomes.md b/sites/docs/src/content/docs/running/reference-genomes.md
index dcbbeb23c1..058167181d 100644
--- a/sites/docs/src/content/docs/running/reference-genomes.md
+++ b/sites/docs/src/content/docs/running/reference-genomes.md
@@ -7,9 +7,60 @@ shortTitle: Reference genomes
Many nf-core pipelines use reference genomes for alignment, annotation, and similar tasks.
This page describes available approaches for managing reference genomes.
+There are three main ways to use reference genomes with nf-core pipelines:
+
+- [Local copies of genomes](#local-copies-of-genomes): user-downloaded and self-managed
+- [AWS iGenomes](#aws-igenomes): Illumina-hosted pre-built reference genomes and indices
+- [Refgenie](#refgenie): programmatic genome asset management tool
+
+## Local copies of genomes
+
+Most genomics nf-core pipelines can start from just a FASTA and GTF file and create downstream reference assets (genome indices, interval files, etc.) as part of pipeline execution.
+
+Using GRCh38 as an example:
+
+1. Download the latest files:
+
+ ```bash
+ #!/bin/bash
+
+ VERSION=108
+ wget -L ftp://ftp.ensembl.org/pub/release-$VERSION/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
+ wget -L ftp://ftp.ensembl.org/pub/release-$VERSION/gtf/homo_sapiens/Homo_sapiens.GRCh38.$VERSION.gtf.gz
+ ```
+
+2. Run the pipeline with `--save_reference` to generate indices:
+
+ ```bash
+ nextflow run \
+ nf-core/rnaseq \
+ --input samplesheet.csv \
+ --fasta Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
+ --gtf Homo_sapiens.GRCh38.108.gtf.gz \
+ --save_reference
+ ```
+
+ :::note
+ The pipeline will generate and save reference assets. For example, the STAR index will be stored in `/genome/index/star`.
+ :::
+
+3. Move generated assets to a central, persistent storage location for re-use in future runs.
+4. Use the pre-generated indices in future runs:
+
+ ```bash
+ nextflow run \
+ nf-core/rnaseq \
+ --input samplesheet.csv \
+ --fasta Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
+ --gtf Homo_sapiens.GRCh38.108.gtf.gz \
+    --star_index <path/to/star_index> \
+    --gene_bed <path/to/genes.bed>
+ ```
+
## AWS iGenomes
-AWS iGenomes is Illumina's centralized resource that organizes commonly used reference genome files in a consistent structure for multiple genomes:
+AWS iGenomes is Illumina's centralized resource that organizes commonly used reference genome and pre-built index files in a consistent structure for multiple genomes.
+It provides the following benefits:
- Hosted on AWS S3 through the [Registry of Open Data](https://registry.opendata.aws/aws-igenomes/)
- Free to access and download
@@ -25,39 +76,37 @@ Consider using custom genomes for current annotations.
:::warning{title="GRCh38 assembly issues"}
GRCh38 in iGenomes comes from NCBI instead of Ensembl, not the masked Ensembl assembly.
-This can cause pipeline issues in some cases.
-See [nf-core/rnaseq issue #460](https://github.com/nf-core/rnaseq/issues/460) for details.
+This can cause pipeline issues in some cases. See [nf-core/rnaseq issue #460](https://github.com/nf-core/rnaseq/issues/460) for details.
For GRCh38 with masked Ensembl assembly, use [Custom genomes](#custom-genomes).
:::
### Use remote AWS iGenomes
-To use remote AWS iGenomes:
+To use remote AWS iGenomes in supported nf-core pipelines, supply the `--genome` flag to your pipeline (e.g., `--genome GRCh37`; see the example after this list).
+On execution, the pipeline will then:
-1. Supply the `--genome` flag to your pipeline (e.g., `--genome GRCh37`).
-1. Pipeline automatically downloads required reference files.
-1. Reference genome parameters are auto-populated from `conf/igenomes.config`.
+1. Automatically download required reference files.
+2. Auto-populate reference genome parameters from `conf/igenomes.config`.
- Parameters like FASTA, GTF, and index paths are set automatically.
-1. Pipeline downloads only what it requires for that specific workflow.
+3. Download only what it requires for that specific workflow.
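+
+For example (the pipeline name is illustrative):
+
+```bash
+nextflow run nf-core/rnaseq \
+    --input samplesheet.csv \
+    --genome GRCh37 \
+    --outdir results \
+    -profile docker
+```
+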
:::tip
Downloading reference genome files takes time and bandwidth.
-We recommend using a local copy when possible.
-See [Use local AWS iGenomes](#use-local-aws-igenomes) for more information.
+We recommend using a local copy when possible. See [Use local AWS iGenomes](#use-local-aws-igenomes) for more information.
:::
### Use local AWS iGenomes
To use local AWS iGenomes:
-1. Download the iGenomes reference files you need to a local directory.
-1. Set `params.igenomes_base` to your local iGenomes directory path.
+1. [Download](https://github.com/ewels/AWS-iGenomes?tab=readme-ov-file#download-script) the iGenomes reference files you need to a local directory.
+2. Set `--igenomes_base` to your local iGenomes directory path.
:::warning
- This path must reflect the structure defined in `conf/igenomes.config`.
+ This directory structure must reflect the structure defined in [`conf/igenomes.config`](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/igenomes.config).
:::
-1. Pipeline will use local files instead of downloading from AWS.
+3. Run the pipeline; it will use the local files instead of downloading from AWS (see the example below).
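+
+For example, with a local copy under a hypothetical `/data/igenomes` directory:
+
+```bash
+nextflow run nf-core/rnaseq \
+    --input samplesheet.csv \
+    --genome GRCh37 \
+    --igenomes_base /data/igenomes \
+    --outdir results
+```
+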
### Check annotation versions
@@ -69,7 +118,7 @@ To check the version of annotations used by AWS iGenomes:
aws s3 cp --no-sign-request s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt
```
-1. View the README to see annotation details:
+2. View the README to see annotation details:
```bash
cat README.txt
@@ -85,62 +134,6 @@ To check the version of annotations used by AWS iGenomes:
This confirms the annotations are from Ensembl release 75 (July 2015), which is significantly outdated.
-## Custom genomes
-
-Use custom genomes when AWS iGenomes doesn't meet your requirements.
-
-Custom genomes allow you to:
-
-- Use current genome annotations
-- Avoid repetitive index generation
-- Maintain full control over reference files
-- Achieve faster pipeline execution when indices are pre-generated
-
-### Use custom genomes
-
-Most genomics nf-core pipelines can start from just a FASTA and GTF file and create downstream reference assets (genome indices, interval files, etc.) as part of pipeline execution.
-
-Using GRCh38 as an example:
-
-1. Download the latest files:
-
- ```bash
- #!/bin/bash
-
- VERSION=108
- wget -L ftp://ftp.ensembl.org/pub/release-$VERSION/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
- wget -L ftp://ftp.ensembl.org/pub/release-$VERSION/gtf/homo_sapiens/Homo_sapiens.GRCh38.$VERSION.gtf.gz
- ```
-
-1. Run pipeline with `--save_reference` to generate indices:
-
- ```bash
- nextflow run \
- nf-core/rnaseq \
- --input samplesheet.csv \
- --fasta Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
- --gtf Homo_sapiens.GRCh38.108.gtf.gz \
- --save_reference
- ```
-
- :::note
- The pipeline will generate and save reference assets.
-For example, the STAR index will be stored in `/genome/index/star`.
- :::
-
-1. Move generated assets to a central, persistent storage location for re-use in future runs.
-1. Use pre-generated indices in future runs.
-
- ```bash
- nextflow run \
- nf-core/rnaseq \
- --input samplesheet.csv \
- --fasta Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
- --gtf Homo_sapiens.GRCh38.108.gtf.gz \
- --star_index \
- --gene_bed
- ```
-
## Refgenie
Refgenie provides programmatic genome asset management as an alternative to manual file handling.
@@ -157,13 +150,15 @@ Refgenie allows you to:
To use Refgenie:
1. Install Refgenie following the [official documentation](http://refgenie.databio.org/).
-1. Initialize Refgenie.
+2. Initialize Refgenie.
:::note
Refgenie creates `~/.nextflow/nf-core/refgenie_genomes.config` and appends an `includeConfig` statement to `~/.nextflow/config` that references this file.
:::
-1. Pull required genome assets. For example:
+
+3. Pull required genome assets. For example:
```bash
refgenie pull t7/fasta
@@ -171,7 +166,7 @@ To use Refgenie:
```
Asset paths are automatically added to `~/.nextflow/nf-core/refgenie_genomes.config`.
-For example:
+ For example:
```groovy title="refgenie_genomes.config"
// This is a read-only config file managed by refgenie. Manual changes to this file will be overwritten.
@@ -186,7 +181,7 @@ For example:
}
```
-1. Run your pipeline with the required genome. For example:
+4. Run your pipeline with the required genome. For example:
:::bash
nextflow run nf-core/ --genome t7
diff --git a/sites/docs/src/content/docs/running/run-pipelines-offline.md b/sites/docs/src/content/docs/running/run-pipelines-offline.md
index 45b19ff9f0..6de990ceb4 100644
--- a/sites/docs/src/content/docs/running/run-pipelines-offline.md
+++ b/sites/docs/src/content/docs/running/run-pipelines-offline.md
@@ -21,21 +21,21 @@ Running pipelines offline requires three main components:
To transfer Nextflow to an offline system:
1. [Install Nextflow](https://nextflow.io/docs/latest/getstarted.html#installation) in an online environment.
-1. Run your pipeline locally.
+2. Run your pipeline locally.
:::note
Nextflow fetches the required plugins.
-It does not need to run to completion.
+ It does not need to run to completion.
:::
-1. Copy the Nextflow binary and `$HOME/.nextflow` folder to your offline environment.
-1. In your Nextflow configuration file, specify each plugin (both name and version), including default plugins.
+3. Copy the Nextflow binary and `$HOME/.nextflow` folder to your offline environment.
+4. In your Nextflow configuration file, specify each plugin (both name and version), including default plugins (see the sketch after the note below).
:::note
This prevents Nextflow from trying to download newer versions of plugins.
:::
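+
+For example, a plugins block might look like the following sketch (the plugin name and version are illustrative; pin the versions Nextflow fetched in your online environment):
+
+```groovy
+plugins {
+    id 'nf-amazon@2.1.4'
+}
+```
+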
-1. Add the following environment variable in your `~/.bashrc` file:
+5. Add the following environment variable in your `~/.bashrc` file:
```bash title=".bashrc"
export NXF_OFFLINE='true'
@@ -55,24 +55,27 @@ To transfer pipeline code to an offline system:
Add the argument `--container singularity` to fetch the singularity container(s).
:::
-1. Transfer the `.tar.gz` file to your offline system and unpack it.
+2. Transfer the `.tar.gz` file to your offline system and unpack it.
:::note
The archive contains directories called:
- - `workflow`: The pipeline files
- - `config`: [nf-core/configs](https://github.com/nf-core/configs) files
- - `singularity`: Singularity images (if you used `--container singularity`)
+ - `workflow/`: The pipeline files
+ - `config/`: [nf-core/configs](https://github.com/nf-core/configs) files
+ - `singularity/`: Singularity images (if you used `--container singularity`)
:::
:::tip
If you are downloading _directly_ to the offline storage (e.g., a head node with internet access whilst compute nodes are offline), use the `--singularity-cache-only` option for `nf-core pipelines download` and set the `$NXF_SINGULARITY_CACHEDIR` environment variable.
-This reduces total disk space by downloading singularity images to the `$NXF_SINGULARITY_CACHEDIR` folder without copying them into the target downloaded pipeline folder.
+
+ This reduces total disk space by downloading singularity images to the `$NXF_SINGULARITY_CACHEDIR` folder without copying them into the target downloaded pipeline folder.
:::
### Transfer reference genomes offline
To use nf-core reference genomes offline, download and transfer them to your offline cluster.
-See [Reference genomes](./reference_genomes.md) for more information.
+See [Reference genomes](./reference-genomes.md) for more information.
+
## Additional resources