ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Heterozygosity: Difference between revisions

Revision as of 16:28, 10 January 2017

The heterozygosity is the proportion of heterozygous genotypes.

This can either be a global estimate or a local estimate.

For a single diploid sample, the heterozygosity is simply the second value in the SFS/AFS (sites with exactly one derived allele), taken as a fraction of all sites. An important aspect of this approach is that we do NOT need to fix the major and minor alleles. By fixing the ancestral allele we loop over the 3 possible derived alleles; alternatively, we can use the reference as the ancestral and fold the spectrum.

Global estimate

This is simply the SFS estimation for a single sample. A short example is:

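#ancestral state known: compute the site allele frequency likelihoods
./angsd -i my.bam -anc ancestral.fa -dosaf 1 -gl 1
#OR, no ancestral state: use the reference as the ancestral and fold
./angsd -i my.bam -anc ref.fa -dosaf 1 -gl 1 -fold 1
#followed by the actual estimation (angsdput is the default output prefix)
./realSFS angsdput.saf.idx >est.ml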

The heterozygosity is then:

#in R
a<-scan("est.ml")   #expected number of sites in each SFS category
a[2]/sum(a)         #second category as a fraction of all sites
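
The same ratio can also be computed without leaving the shell. The following is only a minimal sketch, assuming est.ml holds whitespace-separated values on a single line, as read by scan() above. The function name het_from_sfs is hypothetical and is not part of ANGSD.

```shell
# het_from_sfs FILE (hypothetical helper, not part of ANGSD)
# For each non-empty line of FILE, print the second value divided by
# the sum of all values on that line, i.e. a[2]/sum(a) in the R example.
het_from_sfs() {
    awk '{
        total = 0
        for (i = 1; i <= NF; i++) total += $i
        # skip empty lines and all-zero lines to avoid dividing by zero
        if (total > 0) print $2 / total
    }' "$1"
}
```

It would be called as het_from_sfs est.ml. If the file contains several lines, one value is printed per line.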

Local estimate