ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Abbababa

Revision as of 16:12, 2 December 2013 by Albrecht (talk | contribs)

Available from version 0.559+.

Performs the ABBA-BABA test, also called the D-statistic. It tests for ancient admixture (or an incorrect tree topology).
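As a rough illustration of what the D-statistic measures, the sketch below computes it from total ABBA and BABA counts using the commonly cited form D = (nABBA - nBABA) / (nABBA + nBABA). The function name and the example counts are made up for illustration, and the sign convention (positive D meaning an excess of ABBA sites) is an assumption rather than something stated on this page.

```python
def d_statistic(n_abba: int, n_baba: int) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA) for total site counts.

    Hypothetical helper for illustration; not part of ANGSD.
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites were counted")
    return (n_abba - n_baba) / total


# Made-up counts: 120 ABBA sites and 80 BABA sites.
print(d_statistic(120, 80))  # prints 0.2
```

A value near zero is what the assumed tree predicts without admixture; how far from zero counts as significant is normally judged with a Z score rather than from D alone.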

<classdiagram type="dir:LR">

[Single BAM file{bg:orange}]->[Sequence data|Random base (-doAbbababa 1)]

[Sequence data]->[*.abbababa|ABBA and BABA counts file{bg:blue}]

[*.abbababa|ABBA and BABA counts file{bg:blue}]->jackKnife.R[D stat and Z scores{bg:blue}]
</classdiagram>
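The last step of the diagram turns per-block ABBA and BABA counts into a D statistic and a Z score. The sketch below shows one way this could be done, using an unweighted delete-one block jackknife. It is not a port of jackKnife.R: the function name, the input layout (a list of per-block count pairs), and the choice of an unweighted jackknife are all assumptions made for illustration.

```python
import math


def jackknife_d(blocks):
    """Estimate D, its standard error, and a Z score from per-block counts.

    blocks: list of (n_abba, n_baba) tuples, one per genomic block.
    Uses an unweighted delete-one jackknife (an assumption; the real
    script may weight blocks differently).
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks for a jackknife")

    tot_abba = sum(a for a, _ in blocks)
    tot_baba = sum(b for _, b in blocks)
    d_all = (tot_abba - tot_baba) / (tot_abba + tot_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = []
    for a, b in blocks:
        num = (tot_abba - a) - (tot_baba - b)
        den = (tot_abba - a) + (tot_baba - b)
        leave_one_out.append(num / den)

    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    return d_all, se, d_all / se


# Made-up counts for four blocks.
d, se, z = jackknife_d([(10, 5), (8, 6), (12, 4), (9, 7)])
print(f"D={d:.3f} SE={se:.3f} Z={z:.2f}")
```

Blocks are used instead of single sites because neighbouring sites are correlated; dropping whole blocks gives a more honest estimate of the variance.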

Brief Overview

> ./angsd -doAbbababa

--------------
analysisAbbababa.cpp:
	-doAbbababa	0
	1: use a random base
	-rmTrans		0	remove transitions
	-blockSize		5000000	number of bases in a block

This function counts the number of ABBA and BABA sites.
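To make the counting step concrete, here is a minimal sketch of how ABBA and BABA sites could be tallied per block once one base has been chosen for each of the three ingroup samples (H1, H2, H3) and the outgroup. The function names, the tuple layout, the single-chromosome input, and the reading of -rmTrans as "skip sites whose two alleles form a transition" are assumptions for illustration, not a description of analysisAbbababa.cpp.

```python
from collections import defaultdict

VALID = set("ACGT")
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def classify_site(h1, h2, h3, out):
    """Return 'ABBA', 'BABA', or None; the outgroup base defines allele A."""
    bases = (h1, h2, h3, out)
    if any(b not in VALID for b in bases) or len(set(bases)) != 2:
        return None  # missing data, monomorphic, or more than two alleles
    if h1 == out and h2 == h3:
        return "ABBA"  # H1=A, H2=B, H3=B, outgroup=A
    if h2 == out and h1 == h3:
        return "BABA"  # H1=B, H2=A, H3=B, outgroup=A
    return None


def count_blocks(sites, block_size=5_000_000, rm_trans=False):
    """Tally [ABBA, BABA] counts per block of block_size bases.

    sites: iterable of (position, h1, h2, h3, out) on one chromosome.
    """
    counts = defaultdict(lambda: [0, 0])
    for pos, h1, h2, h3, out in sites:
        pattern = classify_site(h1, h2, h3, out)
        if pattern is None:
            continue
        # In both patterns H3 carries allele B and the outgroup allele A.
        if rm_trans and frozenset((h3, out)) in TRANSITIONS:
            continue
        counts[pos // block_size][0 if pattern == "ABBA" else 1] += 1
    return dict(counts)


# Made-up sites: two informative sites in block 0, one in block 1.
sites = [
    (100, "A", "G", "G", "A"),        # ABBA, A/G transition
    (200, "C", "A", "C", "A"),        # BABA, transversion
    (6_000_000, "A", "T", "T", "A"),  # ABBA, transversion
    (300, "A", "A", "A", "A"),        # monomorphic, ignored
]
print(count_blocks(sites))
print(count_blocks(sites, rm_trans=True))
```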

Options

-doFasta 1
Sample a random base at each position.
-minQ [INT]
Minimum base quality score.
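The two options above can be pictured with a short sketch: at each position, bases below the minimum quality are discarded and one of the remaining bases is drawn at random. The function name, its arguments, and the order of operations (filter first, then sample) are assumptions for illustration and are not taken from the ANGSD source.

```python
import random


def sample_base(bases, quals, min_q=0, rng=random):
    """Pick one random base among those with quality >= min_q.

    bases: the bases observed at one position, e.g. "ACCA".
    quals: matching base quality scores as integers.
    Returns None when no base passes the filter.
    """
    kept = [b for b, q in zip(bases, quals) if q >= min_q]
    if not kept:
        return None
    return rng.choice(kept)


# Made-up pileup: only the second base passes a threshold of 20.
print(sample_base("ACGT", [10, 30, 5, 2], min_q=20))  # prints C
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible, which is convenient when comparing runs.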

Output

The output is an ordinary FASTA file. With -doFasta 1, bases appear in a mix of uppercase and lowercase letters because they are copied directly from the sequencing data: the case indicates which strand the base came from in the original data. For the consensus FASTA, all letters are uppercase.

Example

Create a FASTA file from a random sample of bases.

./angsd -i smallNA07056.mapped.ILLUMINA.bwa.CEU.low_coverage.20111114.bam -doFasta 1