ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Relatedness: Difference between revisions
Revision as of 14:22, 12 July 2016
NGSrelate - estimation of IBD probabilities
To estimate kinship coefficients, population allele frequencies are needed. These can be estimated from the data if you have multiple individuals. For some populations, for example most human populations, publicly available data exist. If you can obtain population allele frequencies, or have many samples from your population, we recommend that you use NGSrelate, which works with ANGSD output. From the estimated IBD probabilities you can then infer the relationship. Below is a table of the expected IBD sharing probabilities, assuming no inbreeding.
Relationship | k0 = P(IBD=0) | k1 = P(IBD=1) | k2 = P(IBD=2)
---|---|---|---
Parent-Offspring | 0 | 1 | 0
Full siblings | 0.25 | 0.5 | 0.25
Half siblings | 0.5 | 0.5 | 0
First cousins | 0.75 | 0.25 | 0
Unrelated | 1 | 0 | 0
NGSrelate has its very own website http://www.popgen.dk/software/index.php/NgsRelate
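The IBD sharing probabilities can be turned into a single kinship coefficient using the standard relation kinship = k1/4 + k2/2. The following is a minimal sketch in R, assuming no inbreeding; the object names `ibd` and `kinship` are illustrative and are not part of NGSrelate or ANGSD.

```r
# Expected IBD sharing probabilities (k0, k1, k2), assuming no inbreeding.
# Object names are illustrative only.
ibd <- rbind(
  "Parent-Offspring" = c(0,    1,    0),
  "Full siblings"    = c(0.25, 0.5,  0.25),
  "Half siblings"    = c(0.5,  0.5,  0),
  "First cousins"    = c(0.75, 0.25, 0),
  "Unrelated"        = c(1,    0,    0)
)
colnames(ibd) <- c("k0", "k1", "k2")

# Each row must be a probability distribution over IBD = 0, 1, 2
stopifnot(all(abs(rowSums(ibd) - 1) < 1e-12))

# Kinship coefficient: k1/4 + k2/2
kinship <- ibd[, "k1"] / 4 + ibd[, "k2"] / 2
print(kinship)
```

Parent-offspring and full-sibling pairs get the same kinship coefficient, so telling them apart requires the individual IBD probabilities rather than the kinship coefficient alone.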
IBS/genotype distribution
If you do not have population allele frequencies, then you cannot estimate kinship coefficients. However, you can still make some claims about the relationship of your samples based on IBS patterns. Below is an example of IBS patterns between two individuals where we ignore the allele types. G is the genotype, which counts, for example, the number of derived or non-reference alleles. Essentially, it is the 2D SFS where there is just one individual in each of the two populations.
ind1 \ ind2 | G=0 | G=1 | G=2
---|---|---|---
G=0 | A | D | G
G=1 | B | E | H
G=2 | C | F | I
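To make the 3x3 IBS pattern concrete, here is a minimal sketch in R that tabulates it from called genotypes. It assumes genotypes are already available as integer vectors coded 0/1/2 at the same sites; the vectors `g1` and `g2` are made-up toy data, and in practice ANGSD works from genotype likelihoods rather than hard calls.

```r
# Toy genotype calls (0/1/2 copies of the non-reference allele) at 8 shared sites.
# These values are invented for illustration only.
g1 <- c(0, 1, 2, 1, 0, 2, 1, 1)  # individual 1
g2 <- c(0, 1, 0, 2, 0, 2, 1, 0)  # individual 2

# Using factors with fixed levels keeps all nine cells, even the empty ones
ibs <- table(ind1 = factor(g1, levels = 0:2),
             ind2 = factor(g2, levels = 0:2))

# Counts per genotype pair, then proportions (the two-individual 2D SFS)
print(ibs)
print(ibs / sum(ibs))
```

Rows correspond to the genotype of ind1 and columns to the genotype of ind2, matching the layout of the table above.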

Relationship | Expected ratio
---|---
Parent-Offspring |
Full siblings |
Half siblings |
First cousins |
Unrelated | E/(C+G) = 2
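```r
# R code to get the expected IBS pattern
# k: IBD sharing probabilities c(k0, k1, k2); f: allele frequency
getEst <- function(k = c(1, 0, 0), f = 0.5) {
  p <- f
  q <- 1 - f
  # Joint genotype probabilities when the pair shares 0 alleles IBD
  m0 <- rbind(
    c(p^4,       2 * p^3 * q,   p^2 * q^2),
    c(2 * p^3 * q, 4 * p^2 * q^2, 2 * p * q^3),
    c(p^2 * q^2, 2 * p * q^3,   q^4)
  )
  # Joint genotype probabilities when the pair shares 1 allele IBD
  m1 <- rbind(
    c(p^3,     p^2 * q,             0),
    c(p^2 * q, p^2 * q + q^2 * p,   p * q^2),
    c(0,       q^2 * p,             q^3)
  )
  # Joint genotype probabilities when the pair shares 2 alleles IBD
  m2 <- rbind(
    c(p^2, 0,         0),
    c(0,   2 * p * q, 0),
    c(0,   0,         q^2)
  )
  return(k[1] * m0 + k[2] * m1 + k[3] * m2)
}

getEst(k = c(1, 0, 0), f = 0.5)
#        [,1]  [,2]   [,3]
# [1,] 0.0625 0.125 0.0625
# [2,] 0.1250 0.250 0.1250
# [3,] 0.0625 0.125 0.0625
```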
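The ratio E/(C+G) can be evaluated under the same joint genotype model for each relationship. The sketch below is self-contained and makes several assumptions: `ibsExpect` and `ibsRatio` are hypothetical helper names, the cells are lettered down the columns (so E is the double-heterozygote cell and C and G are the two opposite-homozygote cells), the IBD probabilities are the standard no-inbreeding values, and f = 0.5 is an arbitrary illustrative allele frequency.

```r
# Expected joint genotype distribution for IBD probabilities k = c(k0, k1, k2)
# at a single allele frequency f. Hypothetical helper, same model as getEst.
ibsExpect <- function(k = c(1, 0, 0), f = 0.5) {
  p <- f
  q <- 1 - f
  g <- c(p^2, 2 * p * q, q^2)       # Hardy-Weinberg genotype frequencies
  m0 <- outer(g, g)                 # IBD = 0: genotypes are independent
  m1 <- rbind(c(p^3,     p^2 * q,             0),
              c(p^2 * q, p^2 * q + p * q^2,   p * q^2),
              c(0,       p * q^2,             q^3))
  m2 <- diag(g)                     # IBD = 2: genotypes are identical
  k[1] * m0 + k[2] * m1 + k[3] * m2
}

# E / (C + G): double heterozygotes over opposite homozygotes
# (assumes cells are lettered column-wise: E = [2,2], C = [3,1], G = [1,3])
ibsRatio <- function(m) m[2, 2] / (m[3, 1] + m[1, 3])

# Standard IBD sharing probabilities, assuming no inbreeding
rel <- list(
  unrelated        = c(1,    0,    0),
  first_cousins    = c(0.75, 0.25, 0),
  half_sibs        = c(0.5,  0.5,  0),
  full_sibs        = c(0.25, 0.5,  0.25),
  parent_offspring = c(0,    1,    0)
)

ratios <- sapply(rel, function(k) ibsRatio(ibsExpect(k, f = 0.5)))
print(ratios)
```

For unrelated individuals the ratio is 4p²q² / 2p²q² = 2 at any allele frequency. For the other relationships the value printed here holds only for the chosen f, and for parent-offspring pairs the denominator is zero because opposite homozygotes cannot occur, so R reports `Inf`.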