ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Fst PCA: Difference between revisions
Jump to navigation
Jump to search
No edit summary |
No edit summary |
||
Line 15: | Line 15: | ||
=cite= | =cite= | ||
If you use these methods, you should cite the m. fumagalli paper http://www.ncbi.nlm.nih.gov/pubmed/23979584 | If you use these methods, you should cite the m. fumagalli paper http://www.ncbi.nlm.nih.gov/pubmed/23979584 | ||
=New Fancy Method= | |||
<pre> | |||
nRep <- 100 | |||
nPop1 <- 24 | |||
nPop2 <- 16 | |||
cmd <- paste("msms -ms",nPop1+nPop2,nRep,"-t 930 -r 400 -I 2",nPop1,nPop2,"0 -g 1 9.70406 -n 1 2 -n 2 1 -ma x 0.0 0.0 x -ej 0.07142857 2 1 >msoutput.txt ",sep=" ") | |||
system(cmd) | |||
##system("msms -ms 40 1 -t 930 -r 400 -I 2 20 20 0 -g 1 9.70406 -n 1 2 -n 2 1 -ma x 0.0 0.0 x -ej 0.07142857 2 1 >msoutput.txt ") | |||
source("../R/readms.output.R") | |||
to2dSFS <- function(p1.d,p2.d,nPop1,nPop2) | |||
sapply(0:nPop1,function(x) table(factor(p2.d[p1.d==x],levels=0:nPop2))) | |||
source("../R/readms.output.R") | |||
a<- read.ms.output(file="msoutput.txt") | |||
p1.d <- unlist((sapply(a$gam,function(x) colSums(x[1:nPop1,])))) | |||
p2.d <- unlist((sapply(a$gam,function(x) colSums(x[-c(1:nPop1),])))) | |||
par(mfrow=c(1,2)) | |||
barplot(table(p1.d)) | |||
barplot(table(p2.d)) | |||
sfs.2d <- t(sapply(0:nPop1,function(x) table(factor(p2.d[p1.d==x],levels=0:nPop2)))) | |||
</pre> |
Revision as of 20:54, 7 June 2015
Matteo Fumagalli has been working on methods to estimate Fst and doing PCA/Covariance based on ANGSD output files.
The main documentation for this is found here: https://github.com/mfumagalli/ngsTools
Fst
- Generate .saf files from each population using ANGSD SFS Estimation
- using a 2D-SFS as a prior, estimated using ngs2dSFS
- using marginal spectra as priors, estimated using realSFS
PCA
More information here: https://github.com/mfumagalli/ngsTools#ngscovar
cite
If you use these methods, you should cite the m. fumagalli paper http://www.ncbi.nlm.nih.gov/pubmed/23979584
New Fancy Method
nRep <- 100 nPop1 <- 24 nPop2 <- 16 cmd <- paste("msms -ms",nPop1+nPop2,nRep,"-t 930 -r 400 -I 2",nPop1,nPop2,"0 -g 1 9.70406 -n 1 2 -n 2 1 -ma x 0.0 0.0 x -ej 0.07142857 2 1 >msoutput.txt ",sep=" ") system(cmd) ##system("msms -ms 40 1 -t 930 -r 400 -I 2 20 20 0 -g 1 9.70406 -n 1 2 -n 2 1 -ma x 0.0 0.0 x -ej 0.07142857 2 1 >msoutput.txt ") source("../R/readms.output.R") to2dSFS <- function(p1.d,p2.d,nPop1,nPop2) sapply(0:nPop1,function(x) table(factor(p2.d[p1.d==x],levels=0:nPop2))) source("../R/readms.output.R") a<- read.ms.output(file="msoutput.txt") p1.d <- unlist((sapply(a$gam,function(x) colSums(x[1:nPop1,])))) p2.d <- unlist((sapply(a$gam,function(x) colSums(x[-c(1:nPop1),])))) par(mfrow=c(1,2)) barplot(table(p1.d)) barplot(table(p2.d)) sfs.2d <- t(sapply(0:nPop1,function(x) table(factor(p2.d[p1.d==x],levels=0:nPop2))))