Why is the FORMATCLEANUP step running out of memory and failing with an error?
- This is usually caused by very large fastq.gz files. Files with more than 2–3 billion reads and/or sequence lengths much longer than 55 bp can cause the Seeker pipeline to run out of memory on hardware with limited RAM (256 GB or less).
- Reduce the barcode_chunk_size parameter in the Seeker Primary Analysis Pipeline. The default barcode_chunk_size is 100,000. First, try lowering the value to 80,000. If that does not work, you can reduce it further (e.g., to 60,000). You may lower it permanently by modifying the value in the datasets_standard.config file. Alternatively, you may change the value temporarily for each run by adding a flag to your Nextflow command, as shown below.
nextflow run /path/to/curioseeker-v3.0.0/main.nf \
    --input /absolute/path/to/samplesheet.csv \
    --outdir /absolute/path/to/results/ \
    -work-dir /absolute/path/to/work/ \
    --igenomes_base /absolute/path/to/reference/dir/ \
    -profile singularity \
    --barcode_chunk_size 80000
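For the permanent option mentioned above, the value can be edited in datasets_standard.config. As a sketch, assuming that file follows standard Nextflow configuration syntax (its exact surrounding structure may differ), the entry would look like:

```groovy
// datasets_standard.config -- sketch assuming standard Nextflow params syntax
params {
    // Lowered from the default of 100000 to reduce peak memory use
    barcode_chunk_size = 80000
}
```

A value set this way applies to every run; the command-line flag overrides it for a single run.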