Transcriptome project with Alternative splicing alignment settings

The settings that are available for a Transcriptome alignment project with Alternative splicing are very different from the alignment settings for all other application types.

• Analysis Options

Setting	Description
Auto Detect PE Library Size	Available only if Paired Reads is selected. Select this option if you do not want to manually specify the library size. Instead, NextGENe automatically determines the library size.
Paired Reads	Select this option if you are analyzing paired reads. Note: Processing paired read data for transcriptome analysis requires at least 24 GB of RAM and significant processing time. If your system does not have sufficient RAM, or if paired end information is not critical for your project, you can clear this option to process the data as single reads.
Library Size: Min [ ] Max [ ]	Available only if Paired Reads is selected and Auto Detect PE Library Size is not selected. You must manually enter the size of the DNA fragment that is being used for sequencing.
Match Reference	Applicable only if BAM sample files were loaded. Click this option to match the reference that was used to create the BAM file with the reference that was loaded during the Load Data step for the project. See To load the reference files.
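The availability rules in the table above can be sketched as a small validation routine. This is only an illustration of the documented dependencies between Paired Reads, Auto Detect PE Library Size, and Library Size; the class and field names are hypothetical and are not part of NextGENe.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnalysisOptions:
    """Hypothetical container for the Analysis Options described above."""
    paired_reads: bool = False
    auto_detect_library_size: bool = False
    library_size_min: Optional[int] = None
    library_size_max: Optional[int] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the combination is consistent."""
        problems: List[str] = []
        has_manual_size = (self.library_size_min is not None
                           or self.library_size_max is not None)

        if not self.paired_reads:
            # Both library size settings are available only with Paired Reads.
            if self.auto_detect_library_size:
                problems.append("Auto Detect PE Library Size requires Paired Reads.")
            if has_manual_size:
                problems.append("Library Size requires Paired Reads.")
            return problems

        if self.auto_detect_library_size:
            # Manual sizes are unavailable while auto detection is selected.
            if has_manual_size:
                problems.append("Library Size is unavailable when Auto Detect is selected.")
        elif self.library_size_min is None or self.library_size_max is None:
            problems.append("Library Size Min and Max must be entered manually.")
        elif self.library_size_min > self.library_size_max:
            problems.append("Library Size Min must not exceed Max.")
        return problems
```

For example, `AnalysisOptions(paired_reads=True, library_size_min=200, library_size_max=500).validate()` returns an empty list, while selecting Auto Detect without Paired Reads returns one problem.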

• Parameters for Alternative Splicing Analysis

Setting	Description
Seed Length	The size of the seeds that should be used for the first step of the Transcriptome Alignment algorithm.
Move Step	The distance in base pairs between the starting points for each seed.
Min Coverage in Annotated Region / Minimum Coverage in Unannotated Regions	Set the value to the coverage depth that is expected for the data. If the experimental coverage for the region meets or exceeds this threshold, then an exon is called in this region. Note: A higher minimum coverage value results in faster data processing, and more specific, but less sensitive, results.
Allowable Ambiguous Number	The maximum number of allowed matches for each seed. For example, if you have a seed that matches to 100 positions in the reference sequence, and the Allowable Ambiguous Number is set to 20, then only the first 20 matches are considered for analysis. Note: The allowed range is 10-50.
Remove Non-Linked Exons	Remove any exons that do not have a link. Note: Removing these exons reduces the noise in the analysis.
Single-Strand Sequencing	Select this option if single strand sequencing was carried out on the samples. Forward and reverse coverage information is also used to separate overlapping transcripts.
Ignore Fusions Between Similar Genes	Select this option to improve the accuracy of fusion gene detection. When this option is selected, NextGENe eliminates fusion calls between genes with similar names, for example, ABCD1 and ABCD2.
Rigorous Fusion Detection	Select this option to improve the accuracy of fusion gene detection.
Ambiguous Alignment for Similar Genes	By default, NextGENe checks for similarity between transcript calls. After the initial alignment, it checks for transcripts that are 95% similar in their calls, and then after the final alignment, it checks for transcripts that are 80% similar in their calls. NextGENe removes the called transcripts that meet or exceed these similarity thresholds. Select this option to disable this check and keep all called transcripts, regardless of similarity. Note: In most cases, if you select this option, then the processing time and the number of called transcripts are increased, but the number of mapped reads is not significantly increased.
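To make the Seed Length, Move Step, and Allowable Ambiguous Number settings more concrete, the following is a minimal sketch of how seeds might be cut from a read and how seed matches might be capped. It is not NextGENe's implementation; the function names are hypothetical, and the sketch assumes that seeds start at offset 0 and that matches are already listed in the order in which they were found.

```python
from typing import List, Tuple


def make_seeds(read: str, seed_length: int, move_step: int) -> List[Tuple[int, str]]:
    """Cut seeds of seed_length bases whose start points are move_step bases apart."""
    if seed_length <= 0 or move_step <= 0:
        raise ValueError("seed_length and move_step must be positive")
    last_start = len(read) - seed_length
    return [(start, read[start:start + seed_length])
            for start in range(0, last_start + 1, move_step)]


def cap_seed_matches(match_positions: List[int],
                     allowable_ambiguous_number: int) -> List[int]:
    """Keep only the first N matches of a seed, with N limited to the documented 10-50 range."""
    if not 10 <= allowable_ambiguous_number <= 50:
        raise ValueError("Allowable Ambiguous Number must be between 10 and 50")
    return match_positions[:allowable_ambiguous_number]


# A 10-base read, Seed Length 4, Move Step 3: seeds start at offsets 0, 3, and 6.
seeds = make_seeds("ACGTACGTAC", seed_length=4, move_step=3)

# A seed that matches 100 reference positions, with the cap set to 20.
kept = cap_seed_matches(list(range(100)), allowable_ambiguous_number=20)
```

Here `seeds` holds three (offset, sequence) pairs and `kept` holds the first 20 of the 100 match positions, which mirrors the example given for Allowable Ambiguous Number.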

• Parameters for New Gene Detection

Setting	Description
Exon Size Min [ ] Max [ ]	The size range in bp for a region to be called an exon.
Average Coverage	The expected coverage for calling an exon, which is carried out in the second alignment step. This value is used similarly to the alternative splicing average coverage option of the first alignment step. Note: The value that you enter here is not an absolute threshold. It is used simply as an approximation when calling an exon.
Intron Size Min [ ] Max [ ]	The expected size range in bp for introns (the regions between called exons).
Donor-Acceptor	Defines the beginning and ending base pairs for identifying a region that can be called as an exon.
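The size and Donor-Acceptor settings above can be illustrated with a short sketch. It assumes that the Donor-Acceptor value is a pair of dinucleotides such as the canonical GT-AG splice site pair found at the start and end of an intron; this help page does not state the accepted format, so treat that as an assumption. The function names and the half-open coordinate convention are also hypothetical.

```python
def exon_size_ok(start: int, end: int, size_min: int, size_max: int) -> bool:
    """Check a candidate exon [start, end) against the Exon Size Min/Max range."""
    return size_min <= end - start <= size_max


def intron_ok(reference: str, exon_end: int, next_exon_start: int,
              intron_min: int, intron_max: int,
              donor: str = "GT", acceptor: str = "AG") -> bool:
    """Check the region between two called exons against the Intron Size range
    and the assumed donor/acceptor dinucleotides."""
    intron = reference[exon_end:next_exon_start]
    if not intron_min <= len(intron) <= intron_max:
        return False
    # Assumption: the donor marks the first bases of the intron
    # and the acceptor marks the last bases.
    return intron.startswith(donor) and intron.endswith(acceptor)


# Toy reference: a 4-base exon, a 10-base intron (GT...AG), and a 4-base exon.
reference = "AAAA" + "GT" + "CCCCCC" + "AG" + "TTTT"
plausible = intron_ok(reference, exon_end=4, next_exon_start=14,
                      intron_min=5, intron_max=100)
```

In this toy case `plausible` is `True`; raising `intron_min` above 10 or changing the first two intron bases would make the check fail.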