Is the Seeker Primary Analysis Pipeline compatible with MGI/BGI sequencers?
- Not directly, but the FASTQ files can be edited to work with it. As generated, the fastq.gz files from MGI/BGI sequencers are incompatible with the Seeker pipeline because the Read 1/Read 2 designator (/1 or /2) is attached directly to the end of the sequence name, as in the following format.
Example R1.fastq.gz:
@HWI-ST12345:1:1:1001:1000/1
ACGTACGTACGTACGTACGTACGTACGTACGT
+
!''*((((***)))))))))***)))))))))
@HWI-ST12345:1:1:1002:1000/1
CGTACGTACGTACGTACGTACGTACGTACGT
+
*''*((((***)))))))))***)))))))))
Example R2.fastq.gz:
@HWI-ST12345:1:1:1001:1000/2
TACGTACGTACGTACGTACGTACGTACGTACG
+
!''*((((***)))))))))***)))))))))
@HWI-ST12345:1:1:1002:1000/2
GTACGTACGTACGTCGTACGTACGTACGTAC
+
*''*((((***)))))))))***)))))))))
To ensure the Seeker pipeline recognizes the read pairs, add a space before the Read 1/Read 2 designator (/1 or /2) in every header line of both FASTQ files, as shown below. Because the Seeker pipeline parses the sequence name only up to the first space, the names of paired reads in R1 and R2 will then match.
Example R1.fastq.gz:
@HWI-ST12345:1:1:1001:1000 /1
ACGTACGTACGTACGTACGTACGTACGTACGT
+
!''*((((***)))))))))***)))))))))
@HWI-ST12345:1:1:1002:1000 /1
CGTACGTACGTACGTACGTACGTACGTACGT
+
*''*((((***)))))))))***)))))))))
Example R2.fastq.gz:
@HWI-ST12345:1:1:1001:1000 /2
TACGTACGTACGTACGTACGTACGTACGTACG
+
!''*((((***)))))))))***)))))))))
@HWI-ST12345:1:1:1002:1000 /2
GTACGTACGTACGTCGTACGTACGTACGTAC
+
*''*((((***)))))))))***)))))))))
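The edit described above can be scripted rather than done by hand. The following is a minimal sketch in Python, not an official Seeker tool. It assumes strict four-line FASTQ records (no wrapped sequences) and that /1 or /2 is the last thing on each header line. The function names fix_header and fix_fastq are hypothetical, chosen only for this illustration. Header lines are picked by position rather than by a leading "@", since a quality string can also begin with "@".

```python
import gzip
import re

# Matches a read designator (/1 or /2) at the end of a header line that is
# not already preceded by a space.
# Assumption: the designator is the final token on the header line.
_DESIGNATOR = re.compile(r"(?<! )(/[12])$")


def fix_header(header: str) -> str:
    """Insert a space before a trailing /1 or /2 unless one is already there."""
    return _DESIGNATOR.sub(r" \1", header)


def fix_fastq(in_path: str, out_path: str) -> int:
    """Copy a gzipped FASTQ file, rewriting only the header lines.

    Assumption: every record is exactly four lines, so the header is the
    line whose zero-based index is a multiple of four.
    Returns the number of header lines processed.
    """
    records = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line_number, line in enumerate(src):
            if line_number % 4 == 0:
                line = fix_header(line.rstrip("\n")) + "\n"
                records += 1
            dst.write(line)
    return records
```

With hypothetical file names, this would be called once per read file, for example fix_fastq("sample_R1.fastq.gz", "sample_R1.fixed.fastq.gz") and again for the R2 file. Headers that already contain the space are left unchanged, so running it twice on the same file should be harmless.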