ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Abbababa
Revision as of 17:13, 2 December 2013
Available from version 0.559+.
Performs the ABBA-BABA test, also called the D-statistic. This tests for ancient admixture (or a wrong tree topology).
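For reference, the D-statistic is usually written as below. The symbols n_ABBA and n_BABA are notation chosen here for the total ABBA and BABA site counts; they are not names taken from ANGSD.

```latex
D = \frac{n_{ABBA} - n_{BABA}}{n_{ABBA} + n_{BABA}}
```

If the assumed tree is correct and there has been no admixture, ABBA and BABA sites are expected to be equally frequent, so D should be close to 0.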
<classdiagram type="dir:LR">
[BAM files{bg:orange}]->[Sequence data|Random base]
[sequence data]->[*.abbababa|ABBA and BABA counts file{bg:blue}]
</classdiagram>
<classdiagram type="dir:LR">
[*.abbababa|ABBA and BABA counts file{bg:blue}]->jackKnife.R[D stat and Z scores{bg:blue}]
</classdiagram>
Brief Overview
> ./angsd -doAbbababa
--------------
analysisAbbababa.cpp:
	-doAbbababa	0
	1: use a random base
	-rmTrans	0	remove transitions
	-blockSize	5000000	number of based in a block
This function counts the number of ABBA and BABA sites.
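To make the counting concrete, here is a minimal sketch of how a single site could be classified once one base per individual has been sampled. It is not ANGSD's implementation: the function name `classify_sites` and the input layout (one site per line, four bases in the order H1 H2 H3 outgroup) are assumptions made for illustration. The optional argument mimics the idea behind -rmTrans by skipping sites whose two alleles form a transition.

```shell
#!/bin/sh
# classify_sites [RMTRANS]: read sites from stdin, one per line, as four
# whitespace-separated bases (H1 H2 H3 outgroup), and print ABBA/BABA totals.
# Hypothetical input layout for illustration; ANGSD itself reads BAM files.
classify_sites() {
  awk -v rmtrans="${1:-0}" '
    function is_transition(x, y,    p) {
      p = x y
      return (p == "AG" || p == "GA" || p == "CT" || p == "TC")
    }
    NF == 4 {
      h1 = toupper($1); h2 = toupper($2); h3 = toupper($3); out = toupper($4)
      # Skip sites with a missing or ambiguous base.
      if ((h1 h2 h3 out) !~ /^[ACGT][ACGT][ACGT][ACGT]$/) next
      if (h1 == out && h2 == h3 && h2 != out)      { anc = h1; der = h2; kind = "abba" }
      else if (h2 == out && h1 == h3 && h1 != out) { anc = h2; der = h1; kind = "baba" }
      else next
      # Optionally drop A<->G and C<->T sites.
      if (rmtrans && is_transition(anc, der)) next
      if (kind == "abba") abba++; else baba++
    }
    END { printf "ABBA=%d BABA=%d\n", abba, baba }
  '
}

# Five toy sites: two ABBA, two BABA, one uninformative.
printf '%s\n' 'A G G A' 'G A G A' 'A C C A' 'A A A A' 'C A C A' | classify_sites
# prints: ABBA=2 BABA=2
```

Calling `classify_sites 1` on the same input drops the two A/G sites and leaves one ABBA and one BABA site.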
Options
- -doAbbababa 1
Sample a random base at each position.
- -rmTrans
Remove transitions (important for ancient DNA).
- -blockSize [INT]
Size of each block in bases. Choose a size larger than the extent of LD in the populations. For humans, 5 Mb (5000000) is usually used.
- -anc [fileName.fa]
Include an outgroup in FASTA format.
- -doCounts 1
Use -doCounts 1 to count the bases at each site after filters.
Output
- *.abbababa
Each line represents a block, with a chromosome name (column 1), a start position (column 2), and an end position (column 3). The following columns are the counts of ABBA and BABA sites: for each combination of three individuals (H1, H2, H3), two columns are printed. These numbers serve as input to the R script jackKnife.R.
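As a rough illustration of what can be done with the block counts, the sketch below computes D and a simple delete-one-block jackknife Z-score for the first (H1,H2,H3) combination. It is not a replacement for jackKnife.R and may differ from what that script does (for example, it does not weight blocks). The function name `d_stat` is hypothetical, and the column layout is assumed from the description above rather than checked against a real file: column 4 is taken as the ABBA count and column 5 as the BABA count.

```shell
#!/bin/sh
# d_stat FILE: print D, a jackknife standard error and Z for the first
# (H1,H2,H3) combination in a *.abbababa-style file.
# Assumed layout: chr, start, end, ABBA count, BABA count, ...
d_stat() {
  awk '
    # Use only lines whose columns 4 and 5 look numeric (skips any header).
    $4 ~ /^[0-9]/ && $5 ~ /^[0-9]/ {
      n++; a[n] = $4; b[n] = $5; A += $4; B += $5
    }
    END {
      if (n < 2 || A + B == 0) { print "not enough data"; exit 1 }
      D = (A - B) / (A + B)
      # Recompute D with each block left out in turn.
      for (i = 1; i <= n; i++) {
        den = (A - a[i]) + (B - b[i])
        if (den == 0) { print "not enough data"; exit 1 }
        d[i] = ((A - a[i]) - (B - b[i])) / den
        mean += d[i] / n
      }
      # Unweighted delete-one jackknife variance of D.
      for (i = 1; i <= n; i++) ss += (d[i] - mean) ^ 2
      se = sqrt((n - 1) / n * ss)
      if (se > 0) printf "D=%.4f SE=%.4f Z=%.2f\n", D, se, D / se
      else        printf "D=%.4f SE=0 Z=NA\n", D
    }
  ' "$1"
}
```

With the example further down, this would be called as `d_stat out.abbababa`. Other combinations would need different column numbers, which is part of what jackKnife.R takes care of.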
Example
Count ABBA and BABA sites for the first five BAM files, sampling a random base at each position, then pass the counts to the R script.
head -n5 smallBam.filelist > smallerBam.filelist
./angsd -out out -doAbbababa 1 -bam smallerBam.filelist -doCounts 1 -anc /space/genomes/refgenomes/ancestral/hg19/fasta/hg19ancNoChr.fa
Rscript jackKnife.R file=out.abbababa indNames=smallerBam.filelist