File Format | Comments |
---|---|
SEQ/PRB | The file names do not need to be identical, but they must be appended with the phrases “_seq” and “_prb” respectively. For example, SRR01842a_seq.txt and SRR01842c_prb.txt. |
FASTQ (merged pairs) | Select this option for paired end files in FASTQ format that contain both reads in a pair in the same line in opposite orientation (Read 1 -> <- Read2). NextGENe converts these files by splitting each read in two. Two new files are created titled *_1.fasta and *_2.fasta with read names >*/1 and >*/2. The second half of the original read and the quality scores are reverse complemented. The file is then converted to .fasta format and quality filtering is implemented as with other FASTQ files. |
SCARF Numeric, SCARF ASCII | Caution: Make sure to choose the correct quality score format, either Numeric or ASCII. |
CSFASTA | The SOLiD System instrument produces color space sequence reads in a .fasta format labeled as CSFASTA. If you select CSFASTA as the input format type and FASTA as the output format type, then NextGENe converts the reads from color space to base space. Caution: An error in color space propagates downstream within the read when the read is converted to base space, so SoftGenetics recommends that you leave the reads in color space. Note: You can select CSFASTA as the output format type to quality filter and trim the CSFASTA files without conversion; the output file remains in color space. This is the preferred conversion option for SOLiD System data. Note: You can quality trim reads using the .csfasta and .qual files only if the two files share the same base name, for example, SRR01842.csfasta and SRR01842_QV.qual. |
FASTA | Select this option and choose CSFASTA as the output format type to convert .fasta files in base space into .csfasta files in color space. |
Mate Pair SFF | Select this option for mate-pair files in SFF format that contain both reads in a pair in the same line. NextGENe converts these files by splitting each read in two. Two new files are created titled *_1.fna and *_2.fna with read names >*/1 and >*/2. The file is then converted to .fasta format and quality filtering is implemented as with other SFF files. |
Mate Pair FASTQ | Select this option for mate-pair files in FASTQ format that contain both reads in a pair in the same line. NextGENe converts these files by splitting each read in two. Two new files are created titled *_1.fna and *_2.fna with read names >*/1 and >*/2. The file is then converted to .fasta format and quality filtering is implemented as with other FASTQ files. |
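
The split described for FASTQ (merged pairs) can be sketched in a few lines of Python. This is only an illustration, not NextGENe's implementation. The function names are hypothetical, and the sketch assumes that the read is split at its midpoint (an odd-length read gives the extra base to the second half) and that "reverse complementing" the quality scores means reversing their order.

```python
# Hypothetical helper names; this is not NextGENe code.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a base-space sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_merged_pair(name, seq, qual):
    """Split one merged read (Read 1 -> <- Read 2) into two records.

    Assumption: the split point is the midpoint of the read.
    Each record is a (read name, sequence, quality string) tuple.
    """
    half = len(seq) // 2
    read1 = (f"{name}/1", seq[:half], qual[:half])
    # The second half is stored in the opposite orientation, so the bases
    # are reverse complemented and the quality scores are reversed.
    read2 = (f"{name}/2", reverse_complement(seq[half:]), qual[half:][::-1])
    return read1, read2
```

For example, `split_merged_pair("r", "ACGTAAGG", "IIIIABCD")` yields one record named `r/1` holding the first four bases unchanged and one record named `r/2` holding the reverse complement of the last four.
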

Option | Description |
---|---|
Median score threshold >= [20] | Selected by default. Removes entire reads from the sample file when the median quality score is below the specified threshold. |
Max # of uncalled bases <= [3] | Selected by default. Removes entire reads from the sample file when the read contains more N calls than specified. |
Called base number of each read >= [25] | Selected by default. Removes entire reads from the sample file when the number of called bases in the read is below the specified threshold. Note: If Trimming is also selected, then the called base number that is used for this function is the number of bases that remain after trimming. |
Trim or reject read when >= [3] base(s) with score <= [16] | Selected by default. Trims low quality bases from reads when a consecutive number of bases (“x”) falls below the specified quality score threshold (“y”). Note: For additional information about how this option works, see Trim or Reject Read While >= [x] Bases with Score <= [y]. |
Paired read data | Available only if paired read data is being analyzed. Select this option if you are converting mate-paired or paired-end files. NextGENe uses a placeholder “N” for reads that are removed because of low quality, which is necessary to maintain mate-paired or paired-end read information. |
Remove 5’ [2] bases and 3’ [4] bases | Trims the specified number of bases from the 5’ and/or 3’ ends. |
Keep only bases [0] to [0] | Trims the reads to keep only the specified portion of the read. |
Trim by sequences | Select this option to trim reads where the specified sequence occurs. See Trim by Sequences. |
Trim by sequences in file | Select this option and then click the Browse button to load a tab-delimited text file that contains the sequences by which the reads are to be trimmed. See Trim by Sequences in the File. |
Custom linker | Applicable only for mate-paired Roche data or mate-paired Ion Torrent data where both pairs are located in the same read. Select this option if you used a custom linker. NextGENe automatically detects the standard linker sequences. |
Note | At any time, even after you select the options by which to filter and trim low quality reads, you can click Default Settings to clear your selections and replace them with the preset values from SoftGenetics. |
Note | You can always load this file at a later date and process other data files according to the saved settings in the file. |
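
How the default filtering options interact can be sketched as follows. This is a minimal illustration, not the NextGENe algorithm, and it rests on several assumptions: the function names are hypothetical, the median score and N count are evaluated on the untrimmed read, trimming cuts the read at the start of the first run of at least x consecutive bases with score <= y, and "called bases" means bases other than N. The one detail taken directly from the table is that the called base number is counted after trimming.

```python
from statistics import median


def trim_at_low_quality_run(seq, quals, min_run=3, max_score=16):
    """Cut the read at the start of the first run of at least `min_run`
    consecutive bases scoring <= `max_score` (assumed trimming rule)."""
    run_start = None
    for i, q in enumerate(quals):
        if q <= max_score:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return seq[:run_start], quals[:run_start]
        else:
            run_start = None
    return seq, quals


def filter_read(seq, quals, median_min=20, max_n=3, min_called=25,
                min_run=3, max_score=16):
    """Return the (possibly trimmed) read, or None if the read is rejected."""
    # Median score threshold >= [20]
    if median(quals) < median_min:
        return None
    # Max # of uncalled bases <= [3]
    if seq.upper().count("N") > max_n:
        return None
    # Trim or reject read when >= [3] base(s) with score <= [16]
    seq, quals = trim_at_low_quality_run(seq, quals, min_run, max_score)
    # Called base number of each read >= [25], counted after trimming
    called = sum(1 for base in seq.upper() if base != "N")
    if called < min_called:
        return None
    return seq, quals
```

With these defaults, a 40-base read whose scores are all 30 except for three consecutive scores of 10 starting at position 30 is trimmed to its first 30 bases and kept. The same low quality run starting at position 20 leaves only 20 called bases, so the read is rejected.
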