Alignment settings – Preloaded reference


The following settings are available for .fasta sample files and BAM files with the Realignment option selected. If you load aligned BAM sample files without the Realignment option selected, then see BAM Sample Files settings.

Setting	Description
Reads: • Allowable Mismatched Bases [ ]	• If a read does not align exactly to the reference, then the entire read can still be aligned to the reference if the number of mismatched bases does not exceed the indicated threshold. If the read cannot be aligned with this number of mismatches, it might still be possible to align the read using seed sequences.
• Allowable Ambiguous Alignments	• Applies to reads that match the reference sequence perfectly and to reads whose number of mismatches does not exceed the threshold for Allowable Mismatched Bases. If such a read matches at multiple locations, it is aligned to the reference sequence at up to the specified number of locations. If this option is set to “1,” the read is aligned to the first matching position from the start of the reference. If this option is set to “0,” then a read that matches at multiple locations is not aligned to the reference.
Seed [x] Bases, Move Step [y] Bases	“x” is the length of the seed that is used to determine the matching positions in the reference genome. “y” is the number of bases between seed start positions.
Inspect Input Files	Click this option to have NextGENe automatically set the values for Allowable Mismatched Bases, Seed Bases/Move Step Bases, and Allowable Alignments. Note: If multiple data files are being analyzed, each value is the total for all files.
Allowable Alignments [ ]	If a seed matches more than this number of positions in the reference genome, then the seed is ignored.
Overall Matching Base Percentage >= [85]	The percentage of the read that must match the reference genome for the read to be aligned to the reference. The default value is 85.
Detect Large Indels	After an initial alignment is carried out, a consensus sequence is created. If an indel is found that occurs in at least 5% of the reads, this indel is reflected in the consensus sequence. The reads are then aligned again to this consensus sequence. Note: This option helps to align reads that include indels towards the end of the read, which in turn allows the variant to be called correctly in the Mutation report. Processing time increases if this option is selected.
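
The way several of these thresholds interact can be illustrated with a small sketch. This is not NextGENe's implementation: every function name below is hypothetical, the reference is scanned by brute force, and it is assumed that a read with exactly one matching location is always placed. It only shows how Allowable Mismatched Bases, Allowable Ambiguous Alignments, Seed/Move Step, Allowable Alignments, and Overall Matching Base Percentage could each act as a filter.

```python
# Illustrative sketch only; all names are hypothetical, not NextGENe APIs.

def mismatches(read, ref, pos):
    """Count mismatched bases when the read is placed at ref[pos]."""
    return sum(a != b for a, b in zip(read, ref[pos:pos + len(read)]))

def whole_read_hits(read, ref, max_mismatches):
    """Positions where the entire read aligns with at most max_mismatches."""
    last = len(ref) - len(read)
    return [p for p in range(last + 1)
            if mismatches(read, ref, p) <= max_mismatches]

def place_read(read, ref, max_mismatches, allowable_ambiguous):
    """Apply the Allowable Ambiguous Alignments rule to whole-read hits.

    Assumption: a read with a single hit is always placed; a read with
    several hits is placed at the first allowable_ambiguous of them,
    counted from the start of the reference (0 means not placed).
    """
    hits = whole_read_hits(read, ref, max_mismatches)
    if len(hits) <= 1:
        return hits
    return hits[:allowable_ambiguous]

def seed_starts(read_len, seed_len, move_step):
    """Seed start offsets: seeds of seed_len bases, move_step bases apart."""
    return list(range(0, read_len - seed_len + 1, move_step))

def usable_seeds(read, ref, seed_len, move_step, allowable_alignments):
    """Map seed offset -> reference positions, ignoring seeds that match
    more than allowable_alignments positions (or none at all)."""
    kept = {}
    for start in seed_starts(len(read), seed_len, move_step):
        seed = read[start:start + seed_len]
        positions = [p for p in range(len(ref) - seed_len + 1)
                     if ref[p:p + seed_len] == seed]
        if 0 < len(positions) <= allowable_alignments:
            kept[start] = positions
    return kept

def passes_matching_percentage(read, ref, pos, min_percent=85):
    """True if the share of matching bases meets the percentage threshold."""
    matched = len(read) - mismatches(read, ref, pos)
    return 100.0 * matched / len(read) >= min_percent
```

For example, with the toy reference "ACGTACGTTTGA", the read "ACGT" matches at two locations, so place_read("ACGT", "ACGTACGTTTGA", 0, 1) keeps only the first location and a setting of 0 leaves the read unplaced. Likewise, usable_seeds with an allowable_alignments value of 1 drops the "ACGT" seed because it occurs twice in the reference.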