Please review the questions and answers below. If you still need help, please contact us.
1. General Software Questions
Mutation Surveyor detects mutations and SNPs in DNA sequencing trace data. Sample sequence traces are compared with wild-type reference traces and GenBank sequences.
There are only three simple steps to produce results using the automatic software:
I. Data Entry
II. Automated analysis
III. Reporting
The software has many advantages for mutation analysis: the software detects real mutations and ignores base calling errors; the analysis is automated, making analysis a simple, fast process; the software is 99% accurate in detecting mutations; the software uses direct trace comparison for mutation detection; there are different reporting options to meet specific research or clinical needs.
Mutation Surveyor is designed for researchers who need to adjust mutation detection parameters, whereas Mutation Explorer is designed for clinical diagnostics, where consistency is critical. The parameters are fixed in Explorer.
You can obtain a demo version (6 lanes) of the software by downloading it from our website. To obtain a 30-day trial version, register with our company and you will be given a password with which you can download the trial version. If you prefer a hard copy, we can ship a trial version of the software by Federal Express overnight anywhere in North America.
2. Data Entry Questions
There are a few options for data input:
I. If you do not have a GenBank sequence for the gene of interest, simply enter your trace samples in the third window and click "OK" and the software will automatically download the sequence from the database.
II. Enter a GenBank file in the first panel and trace samples in the third panel to compare the trace files to the GenBank sequence text.
III. You may input all of the sample traces in the reference panel and the software will choose the best quality trace as a reference.
IV. Use your own reference samples and enter them in the second window along with your trace samples in the third window (and GenBank sequence in the first window: Optional).
Data analysis is most accurate when both forward and reverse sequence traces are used.
I. The software assumes that filenames differing only by _F and _R, or by _2F and _2R, belong to the same sample.
II. If the forward and reverse filenames do not follow these rules, you may construct a 2D text file containing the sample filenames. The text file contains rows and columns, and each row lists the filenames that belong to one sample. The text file must be in the same directory as the trace files. Then check "Load 2D Match", adjacent to "OK", in the open-data-file dialog box.
III. To sort forward and reverse trace files, go to 2D Filename Match Editor in the Tools menu or the ND Filename Match Editor if you have more than 2 primers. Load the data, and the Editor will sort forward and reverse samples according to the chosen model. Save the 2D files as text files (*.txt) and check Load 2D Match when adding the samples for analysis.
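The naming rule in item I can be illustrated with a short Python sketch. This is not the software's own matching code; the regular expression, the `pair_traces` helper, and the example filenames (including the `.ab1` extension) are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Assumed convention from the text: forward and reverse files differ only
# by _F/_R or _2F/_2R. The pattern below is illustrative, not the vendor's.
DIRECTION = re.compile(r"_(2?)([FR])(?=\.[^.]+$|$)")

def pair_traces(filenames):
    """Group trace filenames into {sample_key: {"F": name, "R": name}}."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = DIRECTION.search(name)
        if match is None:
            continue  # no direction tag found; such files need a 2D match file
        # Replace the direction letter with "*" so both files share one key.
        key = name[:match.start()] + "_" + match.group(1) + "*" + name[match.end():]
        pairs[key][match.group(2)] = name
    return dict(pairs)

# Hypothetical filenames
files = ["S01_ex2_F.ab1", "S01_ex2_R.ab1", "S01_ex3_2F.ab1",
         "S01_ex3_2R.ab1", "odd_name.ab1"]
paired = pair_traces(files)
```

Files that end up without a partner, or that are skipped entirely, are the ones that would need an explicit 2D filename match file.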
If you have more than 400 samples or data from a whole gene, you need to use the Whole Gene Data feature of the software. Analyze the samples in projects of 400 or fewer samples. To merge the project files, use the Open Whole Gene Data option under the File menu. This feature displays the mutation analysis of a whole gene.
Currently Mutation Surveyor cannot read files that were generated using the SeqScape auto-analysis software. However, it is possible to export files into standard data collection software rather than allowing the samples to be analyzed with SeqScape's autoanalysis program as they are generated. By exporting the files to standard data collection software, it is possible to import the files into both SeqScape and Mutation Surveyor for analysis.
3. Data Analysis Questions
Scroll through each mutation and delete those that appear in only one direction (Forward or Reverse) and not in both. Mutations that appear in both directions and are written in blue are most likely real and should be kept. If the quality of the sample lane is poor (0-10) and a 1-directional mutation is written in red, delete either the mutation or the entire lane by right-clicking on Lane Quality.
The software detects homozygous and heterozygous point mutations and indels.
To verify the existing mutations, open the 2D Output Table and check the frequency of each mutation located below the list of mutations. If the mutation frequency is high, the mutation is most likely real. If a mutation is written in blue, the confidence is high, but if it is written in red, the confidence is low and should be checked by double-clicking the cell to bring up the electropherogram. If the background of the cell is shaded in purple, the mutation has been reported to the database.
In order to prevent the software from missing mutations, it is important to specify the directional parameters of the analysis by clicking the 2 Directional icon if you are using bi-directional data and the 1 directional icon if you are working with samples in one direction.
The software adds mutations in two different instances:
I. When the dropping factor of a mutation is > 0.15 and the intensity is > 500, yet the score does not meet the detection threshold of 5.00, the software adds this mutation because the low score is most likely due to mobility shift.
II. When the data is noisy, but the frequency of the mutation is high (>10%).
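The two cases above can be written as a small decision function. This is a minimal sketch using only the thresholds quoted in the text. The function name, the `noisy` flag, and the choice to apply both rules only to candidates below the score threshold are assumptions, not the software's documented logic.

```python
DROPPING_FACTOR_MIN = 0.15  # from the text: dropping factor > .15
INTENSITY_MIN = 500         # from the text: intensity > 500
SCORE_THRESHOLD = 5.00      # from the text: detection threshold
FREQUENCY_MIN = 0.10        # from the text: frequency > 10%

def should_add_mutation(score, dropping_factor, intensity, frequency, noisy):
    """Return True if a sub-threshold candidate would be added by rule I or II."""
    if score >= SCORE_THRESHOLD:
        return False  # already detected normally; nothing to add
    # Rule I: strong peak drop and intensity, low score blamed on mobility shift.
    rule_mobility = dropping_factor > DROPPING_FACTOR_MIN and intensity > INTENSITY_MIN
    # Rule II: noisy data, but the mutation recurs across many samples.
    rule_frequency = noisy and frequency > FREQUENCY_MIN
    return rule_mobility or rule_frequency

added_by_rule_1 = should_add_mutation(4.2, 0.20, 800, 0.00, noisy=False)
added_by_rule_2 = should_add_mutation(3.0, 0.05, 200, 0.25, noisy=True)
```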
For 2-directional data, the software deletes a false positive mutation when it exists in only one direction and that direction has lower local quality than the other.
Since the software remembers the last used settings, it is best to click the default settings for all tabs under Process >> Options. This will prevent the software from missing mutations as a result of incorrect parameter settings.
You may exclude poor quality data from the report either before or after data analysis. To exclude it before analysis, go to Process >> Options >> Output and set the Lane Quality Threshold to, for example, 5. The software will then exclude lanes with a quality score (signal-to-noise ratio) below 5 from the statistical mutation frequency calculations. You may also delete lanes with poor quality data after running the analysis by right-clicking directly above the electropherogram where it says "Quality." The mutations in this lane will then be excluded from the overall mutation frequency calculations.
First, under Process >> Options, choose the Mutation tab, then click "High Sensitivity". Second, analyze with 1D parameters by clicking the icon in the toolbar and ensuring that its colors are gray and blue rather than red and blue (red and blue indicate 2D settings). By using high sensitivity and 1D settings, you will detect mutations in areas of high noise, although you will need to review the called mutations because this method may produce more false positives.
First, under Process >> Options, choose the Mutation tab, then click "Medium Sensitivity". Second, analyze with 2D parameters by clicking the icon in the toolbar and ensuring that its colors are red and blue rather than gray and blue (gray and blue indicate 1D settings). By using medium sensitivity and 2D settings, you will miss mutations in regions of high noise.
4. System Tools Questions
To group sample files for whole gene analysis, open the 2D Output Table and click the Whole Gene Output icon. Select Filename Match and enter the desired characters. This function groups samples with the same designated characters together in the output reports. For example, if you select characters 1-7, the software will group all samples with identical characters 1-7 in the output report.
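The effect of Filename Match can be pictured with a short Python sketch. It only mimics the grouping described above; the helper name and the example filenames are hypothetical.

```python
from collections import defaultdict

def group_by_characters(filenames, first=1, last=7):
    """Group filenames by the 1-based, inclusive character range first..last."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name[first - 1:last]].append(name)
    return dict(groups)

# Hypothetical filenames: the first seven characters identify the sample.
files = ["PT00012_ex1_F.ab1", "PT00012_ex2_F.ab1", "PT00345_ex1_F.ab1"]
groups = group_by_characters(files)
```

With characters 1-7 selected, the two `PT00012` files fall into one group and the `PT00345` file into another.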
In order to change the numbering for the coding region in the SEQ and GBK File Editors, select Adjust and choose the desired number for the first coding base. If you enter 1, the number of the first base in the coding sequence will be 1.
To download GenBank Files, follow this guide.
When you copy/paste sequences into the SEQ File Editor, you need to enter the specific numbering for the "Number of the First Base" and "Coding Sequence" and then click Refresh. If you would like the coding sequence to begin at number 1, click on Adjust and have the "Number of the Coding Base" start at 1.
The user can change the reading frame for the reference trace by opening the .seq file in the SEQ File Editor (Tools menu) and specifying the reading frame (1, 2, or 3). Save the file and then enter it as the reference during data entry. The user may also specify the reading frame for .gbk files in Features 2 of the GBK File Editor.
5. Data Output Questions
The software has various output reports available, including 1 Directional, 2 Directional, Whole Gene Data, Bentley (shows flanking sequence around mutations), Pretty Base (displays the alleles at mutation sites), graphic display, and sequence text report.
To save the reports, we suggest saving them as text or htm files and then opening them in Excel, where you can edit the report before printing.
Go to Process >> Options >> Output and select either "Region of Interest Only" or "Exon Only".
To view comments in the Output Tables, go to Process >> Options >> Output and check the Comments box.
To print the mutation reports, we suggest that you save the report as either a text file (*.txt) or an Excel file (*.xls), open it in an Excel spreadsheet, and print from Excel.
To print a clinical report, click the print icon. When the page opens, click the print sample icon, and a page will open that displays electropherograms of all the mutations for that sample.
6. Mutation and Quality Score Questions
Our mutation score is derived from a theoretical calculation of the signal-to-noise ratio under a normal distribution. Signal is defined as a peak in the mutation electropherogram, and noise is defined as the smaller peaks surrounding the mutation in the electropherogram. Mutation Score = -10 log[erfc(s/n)], where erfc is the complementary error function.
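As a rough illustration of the formula, the score can be computed in Python with the standard library's `math.erfc`. The logarithm base is not stated in the formula; base 10 is assumed here, and the function name is hypothetical.

```python
import math

def mutation_score(signal, noise):
    """Score = -10 * log10(erfc(signal / noise)); base 10 is an assumption."""
    return -10.0 * math.log10(math.erfc(signal / noise))

score_equal = mutation_score(1.0, 1.0)   # signal equal to noise
score_double = mutation_score(2.0, 1.0)  # signal twice the noise
```

Under this assumption, a signal equal to the noise gives a score of about 8.03, and the score rises quickly as the ratio grows. For very large ratios, `math.erfc` underflows to zero and the logarithm raises an error, so a real implementation would need to cap the ratio.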
The Lane Quality is a measurement of the average signal-to-noise ratio. A lane quality score of 20 signifies that there is 5% noise in that lane (n/s = 1/20 = 0.05).
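The relationship between lane quality and noise is simple arithmetic, sketched below in Python. The helper names are hypothetical, and the default threshold of 5 is only the example value used for the Lane Quality Threshold elsewhere in this FAQ.

```python
def noise_fraction(lane_quality):
    """Noise as a fraction of signal: n/s = 1 / (s/n)."""
    return 1.0 / lane_quality

def passes_threshold(lane_quality, threshold=5):
    """Lanes with quality below the threshold are excluded from frequency statistics."""
    return lane_quality >= threshold

noise_at_20 = noise_fraction(20)  # 0.05, i.e. 5% noise
noise_at_5 = noise_fraction(5)    # 0.2, i.e. 20% noise
```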
7. Process Options Questions
When unselected, the software uses 300 as the starting point and sorts lanes with a starting base difference greater than 300 bases into two contigs. For example, if the reference covers bases 1-700, the first sample covers bases 10-600, and the second sample covers bases 500-900, the software would sort them into the same contig. However, it would be very difficult to find mutations, because the broad peaks at the end of one sequence are compared with the sharp peaks at the start of the other. If this option is set to 100, the software would sort the two samples into two contigs.
The "BasePatch" option corrects for basecalling errors caused by mobility shift. It adds mutations where the score threshold is not met because of mobility shift but the overlap and dropping factor are sufficient.
Score trimming is designed to eliminate false positives from regions of low quality. If the score trimming is set to 15 (similar to a phred score) for example, and 7 out of 9 consecutive bases have a score below 15, then the software will ignore these bases for mutation calling.
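One way to picture score trimming is as a sliding window over per-base scores. The sketch below is an interpretation, not the software's documented algorithm: it assumes the window slides one base at a time, that "7 out of 9" means at least 7, and that every base in a flagged window is ignored.

```python
def trimmed_mask(scores, threshold=15, window=9, min_low=7):
    """Return a list of booleans; True marks a base ignored for mutation calling."""
    ignored = [False] * len(scores)
    for start in range(len(scores) - window + 1):
        low = sum(1 for s in scores[start:start + window] if s < threshold)
        if low >= min_low:
            # Flag the whole window, including any higher-scoring bases in it.
            for i in range(start, start + window):
                ignored[i] = True
    return ignored

# Hypothetical per-base scores with a low-quality stretch in the middle.
scores = [40, 38, 35, 12, 9, 14, 20, 8, 11, 13, 10, 30, 41, 39, 42, 40]
mask = trimmed_mask(scores)
ignored_positions = [i for i, flag in enumerate(mask) if flag]
```

In this example the low-quality stretch causes positions 2 through 11 (0-based) to be ignored, while the high-quality bases at both ends remain available for mutation calling.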
End trimming allows the user to trim or cut the beginning and the end of traces in order to eliminate poor quality data.