ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Citing angsd
Latest revision as of 09:08, 31 August 2015
There is an overall ANGSD paper; its BibTeX entry is given below.
@article{korneliussen_angsd:_2014,
  title = {{ANGSD}: Analysis of Next Generation Sequencing Data},
  volume = {15},
  copyright = {http://creativecommons.org/licenses/by/2.0/},
  issn = {1471-2105},
  shorttitle = {{ANGSD}},
  url = {http://www.biomedcentral.com/1471-2105/15/356/abstract},
  doi = {10.1186/s12859-014-0356-4},
  abstract = {High-throughput {DNA} sequencing technologies are generating vast amounts of data. Fast, flexible and memory efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously.},
  language = {en},
  number = {1},
  urldate = {2014-11-26},
  journal = {{BMC} Bioinformatics},
  author = {Korneliussen, Thorfinn S. and Albrechtsen, Anders and Nielsen, Rasmus},
  month = nov,
  year = {2014},
  pages = {356},
}
Methods
MAF estimation from counts of alleles
- -cutoff
Li2010
Allele estimation
Allele estimation from genotype likelihoods
- -doMaf
SNP calling
SNP calling based on genotype likelihoods
- -SNP_pval
Genotype likelihoods
- -GL 1
same as in samtools Li2011
- -GL 2
same as in gatk
- -GL 3
same as in soapSNP
- -GL 4
same as in kim2011
Association
- -doAsso 2
using score statistic Skotte2012
- -doAsso 1 or 3
using allele frequencies kim2011
SFS estimation
Estimating the site frequency spectrum Nielsen2012
Neutrality tests (e.g. Tajima's D)
Korneliussen2013
Admixture
http://www.popgen.dk/software/index.php/NgsAdmix#Citation
Error rates method 1
joint GL and error estimation kim2011
Error rates method 2
based on a high quality genome orlando2013
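Contamination
from X chromosome Rasmussen2011
Relatedness
http://bioinformatics.oxfordjournals.org/content/early/2015/08/29/bioinformatics.btv509.full.pdf?keytype=ref&ijkey=ZQbzsWISGPWpPOg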
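The option-to-reference pairs listed under Methods can be kept as a small lookup table, so that the references for a given run can be collected from its command line. The sketch below is only an illustration: the `CITATIONS` table and the `citations_for` helper are hypothetical names that are not part of ANGSD, the keys are the citation labels used on this page, and the example command line is made up.

```python
import shlex

# Citation labels copied from the Methods list on this page.
# A value of None means the option is cited regardless of its argument.
# This table and the helper below are illustrative, not part of ANGSD.
CITATIONS = {
    ("-cutoff", None): ["Li2010"],
    ("-SNP_pval", None): ["kim2011"],
    ("-GL", "1"): ["Li2011"],
    ("-GL", "2"): ["gatk"],
    ("-GL", "3"): ["soapSNP"],
    ("-GL", "4"): ["kim2011"],
    ("-doAsso", "1"): ["kim2011"],
    ("-doAsso", "2"): ["Skotte2012"],
    ("-doAsso", "3"): ["kim2011"],
}

# BibTeX key of the overall ANGSD paper, always included.
MAIN_PAPER = "korneliussen_angsd:_2014"


def citations_for(command: str) -> list[str]:
    """Return the citation labels matching the options in a command line."""
    tokens = shlex.split(command)
    found = [MAIN_PAPER]
    for i, token in enumerate(tokens):
        value = tokens[i + 1] if i + 1 < len(tokens) else None
        # Try the (option, argument) pair first, then the option alone.
        for key in ((token, value), (token, None)):
            for label in CITATIONS.get(key, []):
                if label not in found:
                    found.append(label)
    return found


if __name__ == "__main__":
    # Hypothetical command line, used only to exercise the lookup.
    for label in citations_for("angsd -bam bam.filelist -GL 1 -doAsso 2 -SNP_pval 1e-6"):
        print(label)
```

Methods without a command-line option on this page (SFS estimation, neutrality tests, admixture, error rates, contamination, relatedness) are not covered by the table and would still need to be cited by hand.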