Error estimation method: Difference between revisions

Latest revision as of 17:53, 12 February 2013

The estimated rates can roughly be interpreted as relative error rates, that is, the excess of errors in your sample compared to the errors in the perfect individuals. The idea is that your sample and the perfect individuals should have the same expected number of derived alleles, so any extra derived alleles in your sample are attributed to excess error. For each individual we sample a single base from the reads at each position. We use only positions where there is coverage for the chimp, the sample and the perfect man. The overall error rate is obtained from

$O_{D}=E_{D}(1-\epsilon )+E_{A}\epsilon$

where

$\epsilon$ is the error rate
$O_{D}$ is the observed number of derived alleles in the sample
$E_{D}$ is the expected number of derived alleles, obtained from the derived alleles observed in the perfect man
$E_{A}$ is the expected number of ancestral alleles, obtained from the perfect man
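
Solving the relation above for the error rate gives $\epsilon =(O_{D}-E_{D})/(E_{A}-E_{D})$. The following is a minimal sketch of that calculation, not the original implementation. It assumes that a base counts as derived when it differs from the chimp base and as ancestral when it matches it, and that each site is given as a tuple of already-sampled bases with `None` marking missing coverage. The function names and the input layout are hypothetical.

```python
def count_alleles(sites):
    """Count derived/ancestral alleles over sites covered in all three genomes.

    sites: iterable of (chimp, sample, perfect) bases, one sampled base per
    individual per position; None means no coverage (assumed input layout).
    Returns (O_D, E_D, E_A).
    """
    o_d = e_d = e_a = 0
    for chimp, sample, perfect in sites:
        # Use only positions with coverage for chimp, sample and perfect man.
        if chimp is None or sample is None or perfect is None:
            continue
        if sample != chimp:      # derived allele observed in the sample
            o_d += 1
        if perfect != chimp:     # derived allele in the perfect man
            e_d += 1
        else:                    # ancestral allele in the perfect man
            e_a += 1
    return o_d, e_d, e_a


def overall_error_rate(o_d, e_d, e_a):
    """Solve O_D = E_D * (1 - eps) + E_A * eps for eps."""
    denom = e_a - e_d
    if denom == 0:
        raise ValueError("E_A equals E_D, so the error rate is not identifiable")
    return (o_d - e_d) / denom
```

With made-up counts $E_{D}=1000$, $E_{A}=99000$ and $O_{D}=1490$, the estimate is $490/98000=0.005$.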

The type-specific error rates are obtained by maximizing the likelihood

$p(H=h\mid C=c)=p(H=h\mid C=c,\text{no error})\left(1-\sum _{h'\neq h}e_{h\to h'}\right)+\sum _{h'\neq h}p(H=h'\mid C=c,\text{no error})\,e_{h'\to h}$

where

$h$ is the allele of your sample
$c$ is the allele of the chimp
$e_{a\to b}$ is the error rate from base $a$ to base $b$
$p(H=h\mid C=c,\text{no error})$ is obtained from the perfect man, assuming that the perfect man has no errors.
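
As a rough illustration of the type-specific model, the sketch below evaluates the probability of observing base $h$ in the sample given chimp base $c$, and a log-likelihood built from it. The source does not spell out the likelihood, so the multinomial form over independent sites used here is an assumption, and the nested-dictionary layout and function names are hypothetical. The optimizer is left out; any numerical routine could maximize `log_likelihood` over the twelve off-diagonal error rates.

```python
import math

BASES = "ACGT"


def observed_prob(h, c, p_noerror, err):
    """p(H=h | C=c) under the type-specific error model.

    p_noerror[c][h]: error-free probability of base h given chimp base c,
                     estimated from the perfect man.
    err[a][b]:       error rate from true base a to observed base b (a != b).
    """
    # Probability that a true h is read without error.
    stay = 1.0 - sum(err[h][b] for b in BASES if b != h)
    # Probability that some other true base a is misread as h.
    gain = sum(p_noerror[c][a] * err[a][h] for a in BASES if a != h)
    return p_noerror[c][h] * stay + gain


def log_likelihood(counts, p_noerror, err):
    """Log-likelihood of the sample counts (assumes independent sites).

    counts[c][h]: number of sites with chimp base c and sampled base h.
    """
    total = 0.0
    for c in BASES:
        for h in BASES:
            n = counts[c][h]
            if n > 0:  # cells with zero count contribute nothing
                total += n * math.log(observed_prob(h, c, p_noerror, err))
    return total
```

For any choice of error rates the modelled probabilities still sum to one over $h$ for each $c$, because every error that leaves one base arrives at another.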

