ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Genotype calling
The program can call genotypes based either on the genotype with the highest likelihood or by using the allele frequency as a prior (recommended; see Kim2011).
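The difference between the two approaches can be sketched in a few lines of Python. This is an illustration only, not ANGSD's implementation: the function name `genotype_posteriors` is hypothetical, the example likelihoods are made up, and the prior is assumed to follow Hardy-Weinberg proportions given the minor allele frequency.

```python
def genotype_posteriors(likelihoods, minor_freq=None):
    """Return posteriors for (major,major), (major,minor), (minor,minor).

    likelihoods: three genotype likelihoods in that order.
    minor_freq:  minor allele frequency used to build the prior;
                 None means a uniform prior.
    """
    if minor_freq is None:
        prior = (1.0 / 3, 1.0 / 3, 1.0 / 3)
    else:
        f = minor_freq
        # Assumption: genotype frequencies follow Hardy-Weinberg proportions.
        prior = ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)
    weighted = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(weighted)
    if total <= 0:
        raise ValueError("likelihoods and prior have no overlap")
    return [w / total for w in weighted]


# Illustrative likelihoods (made up for this example, not ANGSD output).
liks = (0.2, 0.5, 0.3)
uniform = genotype_posteriors(liks)
with_freq = genotype_posteriors(liks, minor_freq=0.1)
print(uniform.index(max(uniform)))      # heterozygote has the highest posterior
print(with_freq.index(max(with_freq)))  # a rare minor allele shifts the call
```

With a uniform prior the posterior is just the normalized likelihood, so the call is the genotype with the highest likelihood. With a frequency-based prior, weak evidence for a rare allele can be outweighed by the prior.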
options
- -doGeno [int]
1: print out major minor
2: print the called genotype as 0,1,2
4: print the called genotype as AA, AC, AG, ...
8: print the posterior probabilities of all 3 genotypes: (major,major),(major,minor),(minor,minor)
16: print the posterior of the called genotype
32: somewhat different: dumps the posteriors for all samples in binary format, encoded as 3*nind doubles
Use the sum of the above to get the output you want. For example, -doGeno 5 (1+4) prints the major and minor allele followed by the genotype (AA, AC, ...) for each individual.
- -doPost [int]
1: estimate the posterior genotype probability based on the allele frequency as a prior
2: estimate the posterior genotype probability assuming a uniform prior
- -postCutoff [float]
Call only a genotype with a posterior above this threshold.
NB: if the raw posterior dump is requested, -postCutoff is not used.
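Because the -doGeno values are powers of two that are summed, a given value can be split back into its parts with bitwise tests. The sketch below shows this, together with a posterior cutoff in the spirit of -postCutoff. The helper names `decode_do_geno` and `call_with_cutoff` are hypothetical, and returning `None` for an uncalled genotype is a choice made for this example rather than ANGSD's output encoding.

```python
# Descriptions taken from the -doGeno option list.
DO_GENO_FLAGS = {
    1: "major and minor allele",
    2: "called genotype as 0,1,2",
    4: "called genotype as AA, AC, AG, ...",
    8: "posteriors of all 3 genotypes",
    16: "posterior of the called genotype",
    32: "binary dump of posteriors for all samples",
}


def decode_do_geno(value):
    """List the outputs requested by a summed -doGeno value."""
    all_bits = sum(DO_GENO_FLAGS)  # 63: every documented flag set
    if value < 0 or value & ~all_bits:
        raise ValueError("value contains undocumented flags: %d" % value)
    return [desc for bit, desc in DO_GENO_FLAGS.items() if value & bit]


def call_with_cutoff(posteriors, post_cutoff=0.0):
    """Return the index (0, 1 or 2) of the best genotype, or None.

    Assumption: a genotype is called only if its posterior is strictly
    above the cutoff; None stands for "not called" in this sketch.
    """
    best = max(range(3), key=lambda g: posteriors[g])
    return best if posteriors[best] > post_cutoff else None


print(decode_do_geno(5))
print(call_with_cutoff([0.90, 0.08, 0.02], post_cutoff=0.95))
print(call_with_cutoff([0.97, 0.02, 0.01], post_cutoff=0.95))
```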
example
./angsd -bam bam.filelist -GL 1 -out outfile -doMaf 2 -doSNP 1 -doMajorMinor 1 -minLRT 24 -doGeno 5 -doPost 1 -postCutoff 0.95
gives an output like this:
1 14000202 G A GG NN NN GA NN
1 14000873 G A GG GG GG AA GA
1 14001018 T C NN NN NN CC NN
1 14001867 A G NN AA AA NN NN
1 14002342 C T CC CC CC CC CC
1 14002422 A T AA NN NN NN NN
1 14002474 T C TC TT TT TT TT
1 14003581 C T CC CC NN NN CT
1 14004623 T C TT TT TT NN TC
1 14005069 A G AA AA AA AA AA
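Records in the -doGeno 5 layout (chromosome, position, major allele, minor allele, then one genotype per individual) are easy to post-process. The following is a minimal sketch, assuming whitespace-separated plain-text records and assuming that NN marks a genotype that was not called; `parse_geno_line` and `call_rate` are hypothetical helper names.

```python
def parse_geno_line(line):
    """Split one record into site information and per-individual genotypes."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected chrom, pos, major, minor and genotypes")
    chrom, pos, major, minor = fields[:4]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "major": major,
        "minor": minor,
        "genotypes": fields[4:],
    }


def call_rate(genotypes, missing="NN"):
    """Fraction of individuals with a called genotype (assumes NN = not called)."""
    called = sum(1 for g in genotypes if g != missing)
    return called / len(genotypes)


# Two records copied from the example output.
records = [
    "1 14000202 G A GG NN NN GA NN",
    "1 14000873 G A GG GG GG AA GA",
]
for record in records:
    site = parse_geno_line(record)
    print(site["chrom"], site["pos"], call_rate(site["genotypes"]))
```

A per-site call rate like this is one way to see how strict a given -postCutoff is for a data set.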