Glossary
BED file
Also known as a Region of Interest file (*.bed file). A BED file is a tab-delimited text file. You can upload a BED file only if the reference sequence contains chromosome information. Each row in the file describes one region of the reference that is to be used for the analysis. At a minimum, each row must contain the first three of the following fields; the fourth field is optional:
• Field #1 - Chromosome number or name for the region (for example, chrM)
• Field #2 - Chromosome start position (for example, 300)
• Field #3 - Chromosome end position (for example, 305)
• Field #4 - Optional description column (for example, Region 102)
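The positions are 0-based and half-open: the start position is included and the end position is excluded. For example, a start of 10 and an end of 15 covers 0-based positions 10 through 14, which correspond to 1-based reference positions 11, 12, 13, 14, and 15.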
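The row layout and coordinate convention described above can be illustrated with a short Python sketch. The function names parse_bed_line and one_based_positions are hypothetical and are not part of the software; the sketch only assumes a tab-delimited row with at least three fields.

```python
def parse_bed_line(line):
    """Parse one tab-delimited BED row into (chrom, start, end, description)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED row needs at least chromosome, start, and end")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Field #4 is optional, so fall back to None when it is absent.
    description = fields[3] if len(fields) > 3 else None
    return chrom, start, end, description


def one_based_positions(start, end):
    """Return the 1-based reference positions covered by a 0-based interval
    whose end position is excluded."""
    return list(range(start + 1, end + 1))


chrom, start, end, description = parse_bed_line("chrM\t10\t15\tRegion 102")
print(chrom, description, one_based_positions(start, end))
# prints: chrM Region 102 [11, 12, 13, 14, 15]
```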
Consensus alignment
An optional alignment step. Because each read is considered on its own, indels that are near the end of reads are sometimes not correctly aligned. If this alignment option is selected, then an extra step is added after the initial alignment. In this extra step, indel positions are re-examined to possibly improve the alignment of these reads based on the alignment of other reads that have the indel.
Motif alignment
An optional alignment step. When equivalent sequences can be aligned in multiple ways (motifs), an analyst might prefer a different alignment from the one that the software selects. This step ensures that the selected motif is the motif that is defined by the motif file.
PCR duplicates
PCR duplicates are a set of pairs (paired-end data) or reads (non-paired-end data) that have been generated from the same original fragment. The “Remove PCR Duplicates” optional alignment step attempts to identify these reads and ignores all but one pair (or read) in each set.
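As a rough illustration of the idea, the Python sketch below keeps one read per duplicate set. It assumes, purely for illustration, that reads sharing the same chromosome, start position, and strand came from the same original fragment; this glossary does not state the criteria that the software actually uses, and the function name remove_duplicates and the dictionary keys are hypothetical.

```python
def remove_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand) key.

    Assumption: reads with an identical key originate from the same fragment.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    {"name": "r1", "chrom": "chrM", "start": 300, "strand": "+"},
    {"name": "r2", "chrom": "chrM", "start": 300, "strand": "+"},  # same key as r1
    {"name": "r3", "chrom": "chrM", "start": 300, "strand": "-"},
]
print([read["name"] for read in remove_duplicates(reads)])
# prints: ['r1', 'r3']
```

For paired-end data, the key would presumably also need to include the position of the mate, so that whole pairs rather than single reads are compared.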
Personal Health Information (PHI)
Personal health information can be inferred from some regions of whole mtDNA sequence. To maintain privacy during whole mtDNA analysis, a BED file can be used to specify the positions that must be hidden from the analyst.
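One way to picture this is as masking the positions listed in the BED file. The Python sketch below replaces the bases inside each region with a placeholder character. It is only an illustration: the function name mask_regions, the use of "N" as the placeholder, and the assumption that regions are given as 0-based (start, end) pairs with the end excluded are all hypothetical, and the software may hide positions in a different way.

```python
def mask_regions(sequence, regions, mask_char="N"):
    """Return a copy of sequence with every position inside a region masked.

    regions is a list of (start, end) pairs, 0-based with the end excluded.
    """
    masked = list(sequence)
    for start, end in regions:
        # Clamp each region to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = mask_char
    return "".join(masked)


print(mask_regions("ACGTACGTAC", [(2, 5)]))
# prints: ACNNNCGTAC
```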
Proper Pairs
A pair of aligned reads is referred to as a “proper pair” if the reads are aligned on opposite strands in the correct orientation. Non-proper pairs can result from misalignment, sequencing error, or off-target amplification.
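The definition above can be sketched as a simple check in Python. The sketch assumes that "correct orientation" means the forward-strand read starts at or before the reverse-strand read, so that the two reads face each other; this is an assumption and not a statement of how the software decides. The function name is_proper_pair and the dictionary keys are hypothetical.

```python
def is_proper_pair(read1, read2):
    """Return True if the two reads align to opposite strands and face each other."""
    if read1["chrom"] != read2["chrom"]:
        return False
    if read1["strand"] == read2["strand"]:
        return False
    # Identify which read is on the forward strand and which is on the reverse strand.
    forward, reverse = (read1, read2) if read1["strand"] == "+" else (read2, read1)
    # Assumption: the forward read must start at or before the reverse read.
    return forward["start"] <= reverse["start"]


left = {"chrom": "chrM", "start": 100, "strand": "+"}
right = {"chrom": "chrM", "start": 250, "strand": "-"}
same_strand = {"chrom": "chrM", "start": 250, "strand": "+"}
print(is_proper_pair(left, right), is_proper_pair(left, same_strand))
# prints: True False
```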