ANGSD: Analysis of next generation Sequencing Data
Heterozygosity
Revision as of 16:28, 10 January 2017
The heterozygosity is the proportion of heterozygous genotypes.
This can either be a global estimate or a local estimate.
For a single diploid sample, the heterozygosity is simply the second value in the SFS/AFS divided by the sum of all its values. An important aspect of this approach is that we do NOT need to fix the major and minor alleles. Either we fix the ancestral allele and loop over the 3 possible derived alleles, or we use the reference as the ancestral and fold the spectrum.
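Written out as a formula, and assuming we label the three SFS entries of a single diploid sample as \eta_0, \eta_1 and \eta_2 (sites with 0, 1 and 2 derived alleles; this notation is ours and is not used by ANGSD), the heterozygosity H is:

```latex
H = \frac{\eta_1}{\eta_0 + \eta_1 + \eta_2}
```

For instance, a hypothetical spectrum of (990, 10, 0) sites would give H = 10 / 1000 = 0.01.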
Global estimate
This is simply the SFS Estimation for single samples. A short example is:
./angsd -i my.bam -anc ancestral.fa -dosaf 1 -gl 1
#OR
./angsd -i my.bam -anc ref.fa -dosaf 1 -gl 1 -fold 1
#followed by the actual estimation
./realSFS angsdput.saf.idx >est.ml
The heterozygosity is then:
#in R
a<-scan("est.ml")
a[2]/sum(a)
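The same ratio can also be computed without leaving the shell. The following is a minimal sketch, not part of ANGSD: the function name het_from_sfs is hypothetical, and it assumes est.ml holds whitespace-separated expected site counts on a single line (one result is printed per line if there are several).

```shell
# Hypothetical helper (not part of ANGSD): print the heterozygosity
# from a realSFS output file given as the first argument.
# Assumes each line holds the SFS as whitespace-separated counts,
# where the second value is the number of heterozygous sites.
het_from_sfs() {
    awk '{
        total = 0
        for (i = 1; i <= NF; i++) total += $i        # sum over all SFS categories
        if (total > 0) printf "%.6f\n", $2 / total   # second value divided by the sum
    }' "$1"
}
```

After defining the function, calling het_from_sfs est.ml should print the same value as a[2]/sum(a) in R, rounded to six decimals.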