ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Genotype calling



The program can call genotypes either by choosing the genotype with the highest likelihood or by using the allele frequency as a prior (recommended; see Kim2011).

options

-doGeno [int]

1: print out major minor

2: print the called genotype as 0,1,2

4: print the called genotype as AA, AC, AG, ...

8: print the posterior probabilities of all 3 genotypes: (major,major), (major,minor), (minor,minor)

16: print the posterior of the called genotype

32: unlike the options above, dumps the posterior probabilities for all samples in binary, encoded as 3*nind doubles

Use the sum of the above values to select the output you want. For example, -doGeno 5 (1+4) prints the major and minor allele followed by the genotype (AA, AC, ...) for each individual.
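The summing rule above treats each -doGeno value as one bit of a bitmask. The following is a minimal Python sketch of that decoding, not part of ANGSD itself; the names DOGENO_FLAGS and decode_dogeno are hypothetical, and the descriptions are paraphrased from the option list.

```python
# Meaning of each -doGeno bit, paraphrased from the option list above.
# DOGENO_FLAGS and decode_dogeno are illustrative names, not ANGSD code.
DOGENO_FLAGS = {
    1: "major and minor allele",
    2: "called genotype as 0,1,2",
    4: "called genotype as AA, AC, AG, ...",
    8: "posteriors of all 3 genotypes",
    16: "posterior of the called genotype",
    32: "binary dump of posteriors (3*nind doubles)",
}


def decode_dogeno(value):
    """Return the descriptions of the bits that are set in a -doGeno value."""
    return [desc for bit, desc in DOGENO_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # -doGeno 5 = 1 + 4
    for desc in decode_dogeno(5):
        print(desc)
```

Because every option is a distinct power of two, any sum maps back to exactly one set of outputs.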

-doPost [int]

1: estimate the posterior genotype probability based on the allele frequency as a prior

2: estimate the posterior genotype probability assuming a uniform prior
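The difference between the two priors can be illustrated with a small Python sketch. It is not ANGSD's implementation: the function name genotype_posterior is hypothetical, and the frequency-based prior is assumed here to follow Hardy-Weinberg proportions, which the text above does not state explicitly.

```python
def genotype_posterior(likelihoods, minor_freq=None):
    """Posterior for (major,major), (major,minor), (minor,minor).

    likelihoods: three genotype likelihoods in that order.
    minor_freq:  minor allele frequency; None selects a uniform prior
                 (the idea behind -doPost 2). Otherwise a Hardy-Weinberg
                 prior is used (an assumption made for this sketch of
                 the idea behind -doPost 1).
    """
    if minor_freq is None:
        prior = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    else:
        f = minor_freq
        prior = ((1.0 - f) ** 2, 2.0 * f * (1.0 - f), f ** 2)

    # Bayes' rule: posterior is proportional to likelihood times prior.
    weighted = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all weighted likelihoods are zero")
    return [w / total for w in weighted]


if __name__ == "__main__":
    liks = (0.5, 0.5, 0.0)  # made-up likelihoods for illustration
    print(genotype_posterior(liks))        # uniform prior
    print(genotype_posterior(liks, 0.1))   # frequency-based prior
```

With a rare minor allele, the frequency-based prior shifts probability toward the major homozygote when the likelihoods alone cannot separate it from the heterozygote.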

-postCutoff [float]

Call only a genotype with a posterior above this threshold.
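The thresholding step can be sketched in Python as follows. This is an illustration only: call_genotype is a hypothetical name, the use of NN for an uncalled genotype mirrors the example output on this page, and treating a posterior exactly equal to the cutoff as uncalled is an assumption.

```python
def call_genotype(posteriors, major, minor, post_cutoff):
    """Return the most probable genotype, or "NN" if it is not confident.

    posteriors:  probabilities for (major,major), (major,minor), (minor,minor).
    post_cutoff: a genotype is called only if its posterior is above this
                 value (boundary handling is an assumption of this sketch).
    """
    labels = (major + major, major + minor, minor + minor)
    best = max(range(3), key=lambda i: posteriors[i])
    if posteriors[best] <= post_cutoff:
        return "NN"
    return labels[best]


if __name__ == "__main__":
    print(call_genotype([0.96, 0.03, 0.01], "G", "A", 0.95))
    print(call_genotype([0.60, 0.40, 0.00], "G", "A", 0.95))
```

A stricter cutoff trades more missing genotypes for fewer wrong calls.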

example

./angsd -bam bam.filelist -GL 1 -out outfile -doMaf 2 -doSNP 1 -doMajorMinor 1 -minLRT 24 -doGeno 5 -doPost 1 -postCutoff 0.95
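gives an output like this (columns: chromosome, position, major allele, minor allele, then one genotype per individual; NN marks a genotype that was not called):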
1       14000202        G       A       GG      NN      NN      GA      NN      
1       14000873        G       A       GG      GG      GG      AA      GA      
1       14001018        T       C       NN      NN      NN      CC      NN      
1       14001867        A       G       NN      AA      AA      NN      NN      
1       14002342        C       T       CC      CC      CC      CC      CC      
1       14002422        A       T       AA      NN      NN      NN      NN      
1       14002474        T       C       TC      TT      TT      TT      TT      
1       14003581        C       T       CC      CC      NN      NN      CT      
1       14004623        T       C       TT      TT      TT      NN      TC      
1       14005069        A       G       AA      AA      AA      AA      AA
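Output in this layout is easy to post-process. The Python sketch below parses a few of the lines shown above and reports the fraction of individuals with a called genotype at each site; parse_geno_line and call_rate are hypothetical helper names, and the sketch assumes whitespace-separated text with NN for uncalled genotypes, as in the example.

```python
# Three lines copied from the example output above.
SAMPLE = """\
1       14000202        G       A       GG      NN      NN      GA      NN
1       14000873        G       A       GG      GG      GG      AA      GA
1       14002342        C       T       CC      CC      CC      CC      CC
"""


def parse_geno_line(line):
    """Split one -doGeno 5 style line into named fields."""
    chrom, pos, major, minor, *genotypes = line.split()
    return {
        "chrom": chrom,
        "pos": int(pos),
        "major": major,
        "minor": minor,
        "genotypes": genotypes,
    }


def call_rate(record):
    """Fraction of individuals whose genotype was called (not NN)."""
    gts = record["genotypes"]
    return sum(g != "NN" for g in gts) / len(gts)


records = [parse_geno_line(l) for l in SAMPLE.splitlines() if l.strip()]

if __name__ == "__main__":
    for r in records:
        print(r["chrom"], r["pos"], f"{call_rate(r):.2f}")
```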