ANGSD: Analysis of next generation Sequencing Data
Major Minor: Difference between revisions
Revision as of 16:45, 5 March 2014
The major and minor alleles can be determined from the counts of nucleotides, inferred from genotype likelihoods, or specified by the ancestral or reference sequence. You can also force both the major and the minor to specific bases, which is useful when comparing with other data such as HapMap.
Brief Overview
./angsd -doMajorMinor
Command:
/home/software/angsd/angsd0.583/angsd -doMajorMinor
    -> angsd version: 0.580 build(Feb 26 2014 11:19:53)
    -> Analysis helpbox/synopsis information:
-------------------
analysisMajorMinor.cpp:
    -doMajorMinor 0
    1: Infer major and minor from GL
    2: Infer major and minor from allele counts
    3: use major and minor from a file (requires -sites file.txt)
    4: Use reference allele as major (requires -ref)
    5: Use ancestral allele as major (requires -anc)
Details
From genotype likelihood data
- -doMajorMinor 1
When the input is sequencing data such as BAM files, or genotype likelihood data such as glfv3, the major and minor alleles can be inferred directly from the genotype likelihoods. We use a maximum likelihood approach to choose the major and minor alleles. Details of the method can be found in the theory section of this page, and it is briefly described here. For citation, use this publication: Skotte2012.
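The maximum likelihood choice can be illustrated with a small Python sketch. This section does not give the exact likelihood, so the details here are assumptions made for illustration only: each candidate allele pair is scored by averaging the three genotype likelihoods it allows (uniform weights), and the major allele is taken to be the one whose homozygote has the larger summed likelihood. The function name `infer_major_minor_gl` and the input layout are hypothetical and are not part of ANGSD.

```python
from itertools import combinations
from math import log

BASES = "ACGT"

def infer_major_minor_gl(likes):
    """Pick the allele pair with the highest likelihood across individuals.

    likes: one dict per individual mapping the 10 genotypes
    ("AA", "AC", ..., "TT") to strictly positive likelihoods (not logs).
    Scoring and ordering rules are illustrative assumptions.
    """
    best_pair, best_ll = None, float("-inf")
    for a, b in combinations(BASES, 2):
        # Assumption: uniform weight over the genotypes aa, ab, bb.
        ll = sum(log((ind[a + a] + ind[a + b] + ind[b + b]) / 3.0)
                 for ind in likes)
        if ll > best_ll:
            best_pair, best_ll = (a, b), ll
    a, b = best_pair
    # Assumption: the allele with the better-supported homozygote is major.
    hom_a = sum(ind[a + a] for ind in likes)
    hom_b = sum(ind[b + b] for ind in likes)
    return (a, b) if hom_a >= hom_b else (b, a)
```

The sketch evaluates all six unordered allele pairs per site, which is cheap, and works on raw likelihoods so that a zero homozygote likelihood does not break the ordering step.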
From counts of data
- -doMajorMinor 2
If you input sequencing data such as BAM files, you can infer the major and minor alleles by picking the two most frequently observed bases across individuals. This is the approach from here: citation.
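A minimal Python sketch of the counts-based rule follows. It pools base counts over all individuals at one site and takes the two most frequent bases. The function name `infer_major_minor_counts`, the input layout, and the alphabetical tie-break are assumptions made for illustration; they do not describe ANGSD's internals.

```python
from collections import Counter

def infer_major_minor_counts(per_individual_counts):
    """Return (major, minor) as the two most frequent bases at one site.

    per_individual_counts: one dict per individual mapping a base
    ("A", "C", "G", "T") to the number of reads carrying it.
    """
    total = Counter()
    for counts in per_individual_counts:
        total.update(counts)  # Counter.update adds counts, it does not replace
    # Sort by pooled count, highest first; ties broken alphabetically
    # (the tie-break is an assumption of this sketch).
    ranked = sorted("ACGT", key=lambda base: (-total[base], base))
    return ranked[0], ranked[1]
```

Note that the sketch always returns two bases, even when the second one has a pooled count of zero, so a caller would need to handle monomorphic sites separately.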
Pre specified Major and Minor
Using the -sites option, the major and minor alleles can be predefined for the desired sites. This is very useful when comparing with other data sources, e.g. SNP chips, where the major and minor alleles are known.
- -doMajorMinor 3
- -sites [filename]
Pre specified Major using a reference
You can force the major allele to match your reference state if you have supplied a reference with -ref.
- -doMajorMinor 4
- -ref [fasta.fa]
Pre specified Major using the ancestral state
You can force the major allele to match your ancestral state if you have supplied an ancestral sequence with -anc. We first estimate the major/minor from the data using -doMajorMinor 1 or -doMajorMinor 2, and then swap the two alleles if needed so that the major is the allele we are trying to force. If the forced allele is neither of the two estimated alleles, the site is discarded from downstream analysis.
- -doMajorMinor 5
- -anc [fasta.fa]
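The swap-or-discard rule for a forced ancestral major can be sketched in Python as below. The function name `force_ancestral_major` is hypothetical, bases are assumed to be uppercase single letters, and the reading that a site is dropped when the ancestral base matches neither estimated allele is an interpretation of the text above rather than a statement about the ANGSD source code.

```python
def force_ancestral_major(major, minor, ancestral):
    """Reorder an estimated (major, minor) pair so the ancestral base is major.

    Returns the (major, minor) pair to use downstream, or None when the
    site should be discarded.
    """
    if ancestral == major:
        return major, minor   # already in the requested orientation
    if ancestral == minor:
        return minor, major   # swap so the ancestral base becomes the major
    return None               # ancestral base is neither allele: drop the site
```

For example, with an estimated pair of G (major) and A (minor) and an ancestral base of A, the sketch returns A as major and G as minor; with an ancestral base of T it returns None.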