ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Haploid calling

Revision as of 15:27, 3 January 2016 by Albrecht (talk | contribs) (Created page with "Simple haploid output based on sampling or consensus. __TOC__ <classdiagram type="dir:LR"> [BAM files{bg:orange}]->[Sequence data|Random base;Consensus base] [sequence da...")

Simple haploid output based on sampling or consensus.


<classdiagram type="dir:LR">

[BAM files{bg:orange}]->[Sequence data|Random base;Consensus base]

[sequence data]->[*.haplo.gz|single base file{bg:blue}] </classdiagram>


Brief Overview

> ./angsd -doHaploCall
	-> angsd version: 0.910-45-g2b2b4f0-dirty (htslib: 1.2.1-192-ge7e2b3d) build(Jan  3 2016 14:45:41)
	-> Analysis helpbox/synopsis information:
	-> Command: 
./angsd -doHaploCall 	-> Sun Jan  3 15:18:15 2016
--------------
abcHaploCall.cpp:
	-doHaploCall	0
	(Sampling strategies)
	 0:	 no haploid calling 
	 1:	 (Sample single base)
	 2:	 (Concensus base)
	-doCounts	0	Must choose -doCount 1
Optional
	-minMinor	0	Minimum observed minor alleles
	-maxMis	-1	Maximum missing bases (per site)


This function outputs one base per individual at each site.

Options

-doHaploCall [int]

1: sample a random base. 2: use the most frequent (consensus) base; ties are broken by picking a random base.
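
The two strategies can be sketched in Python from the base counts of one individual at one site. This is an illustration only, not ANGSD's implementation: the function name `call_haploid`, the dictionary input, and returning N when no bases are observed are assumptions made for this example. Mode 1 is assumed to draw one observed base uniformly at random, so bases are chosen in proportion to their counts.

```python
import random

BASES = "ACGT"

def call_haploid(counts, mode, rng=random):
    """Return one base for one individual at one site.

    counts: dict mapping base -> read count (hypothetical input format).
    mode:   1 = sample a random observed base, 2 = most frequent base.
    """
    # Expand the counts into the list of observed bases.
    observed = [b for b in BASES for _ in range(counts.get(b, 0))]
    if not observed:
        return "N"  # assumption: no data is reported as N
    if mode == 1:
        # Random base: each observed read base is equally likely.
        return rng.choice(observed)
    if mode == 2:
        # Consensus base: most frequent, random choice among ties.
        top = max(counts.get(b, 0) for b in BASES)
        tied = [b for b in BASES if counts.get(b, 0) == top]
        return rng.choice(tied)
    raise ValueError("mode must be 1 or 2")
```

For example, `call_haploid({"A": 3, "C": 1}, 2)` returns A, while the same counts with mode 1 return A or C depending on the draw.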

-doCounts 1

Required. Use -doCounts 1 to count the bases at each site after filtering.

-minMinor [int]

Minimum number of observed minor alleles; only sites with at least minMinor sampled minor alleles (across individuals) are printed.

-maxMis [int]

Maximum allowed number of missing alleles (across individuals). -maxMis 0 means that only sites without missing alleles are printed.
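
The two site filters can be sketched as follows, given the bases already called for each individual at one site. The function name `keep_site` is hypothetical and the logic is a guess at the behaviour described above, not ANGSD's code. Three assumptions are made: the minor allele count is the number of non-missing calls that differ from the most common call, a site passes when that count is at least `min_minor` (consistent with the example output on this page, where -minMinor 1 prints sites with a single minor allele), and a negative `max_mis` (the default is -1) disables the missingness filter.

```python
from collections import Counter

def keep_site(calls, min_minor=0, max_mis=-1):
    """Decide whether a site would be printed (illustrative only).

    calls: list of called bases, one per individual, with N for missing.
    """
    missing = calls.count("N")
    # Assumption: a negative max_mis means no missingness filter.
    if max_mis >= 0 and missing > max_mis:
        return False
    observed = Counter(b for b in calls if b != "N")
    # Assumption: minor alleles are all calls other than the most common one.
    minor = sum(observed.values()) - max(observed.values(), default=0)
    return minor >= min_minor
```

With the calls `C N N C C C T`, `keep_site(calls, min_minor=1)` is true (one T), while `keep_site(calls, max_mis=0)` is false (two missing calls).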


Output

  • .haplo.gz

Each line represents one site: chromosome name (column 1), position (column 2), major allele (column 3), followed by one column per individual containing the sampled allele.
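
A minimal sketch for reading such a file in Python, assuming tab-separated columns in the order described above. The function names `parse_haplo_line` and `read_haplo` are hypothetical, and the reader skips any line whose position field is not an integer in case the file starts with a header row (an assumption; the sample output on this page shows none).

```python
import gzip

def parse_haplo_line(line):
    """Split one site line into (chromosome, position, major, calls)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, major = fields[0], int(fields[1]), fields[2]
    # Remaining columns: one sampled allele per individual.
    return chrom, pos, major, fields[3:]

def read_haplo(path):
    """Yield parsed sites from a gzip-compressed .haplo.gz file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split("\t")
            # Skip blank lines and any header-like line (assumption).
            if len(fields) < 3 or not fields[1].isdigit():
                continue
            yield parse_haplo_line(line)
```

Usage, with `out.haplo.gz` as a placeholder file name: `for chrom, pos, major, calls in read_haplo("out.haplo.gz"): ...`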

Example

Sample a random base for each individual at every site on chromosome 1, using -minMinor 1 to restrict the output to sites with an observed minor allele.

./angsd -bam bam.filelist -dohaplocall 1 -doCounts 1 -r 1: -minMinor 1

Output

1	14094607	C	C	N	N	C	C	C	T
1	14094618	C	C	N	N	C	G	C	N
1	14094619	G	C	N	N	G	N	G	G
1	14094628	C	G	N	N	N	C	N	G
1	14094784	G	G	G	T	T	N	G	G
1	14095072	A	A	N	A	A	A	A	C
1	14095751	C	C	C	C	C	N	T	C
1	14095773	G	G	G	G	G	G	N	T
1	14095992	C	C	A	N	N	C	A	C
1	14096030	C	C	C	N	A	N	C	N
1	14096362	G	T	G	G	G	G	G	G
1	14096635	A	T	A	A	N	A	N	N
1	14096717	C	C	N	C	C	C	C	N
1	14097480	A	A	G	A	A	A	A	A
1	14097899	T	T	T	G	T	T	G	T
1	14098042	G	T	N	G	T	T	G	T
1	14098127	A	A	N	C	A	N	A	A
1	14098140	G	G	N	G	G	G	N	G
1	14098148	C	A	N	C	C	C	N	C
1	14098346	T	T	T	T	G	T	G	G
1	14098792	T	T	N	T	A	N	T	N
1	14099223	G	G	G	T	G	G	G	G