Revision as of 19:39, 27 November 2013

Available from version 0.559+.

This option creates a fasta file from a sequencing data file (BAM file). The chromosome names and lengths are taken from the genome information in the BAM header. Sites without data are written as "N".
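The behaviour described above can be illustrated with a minimal Python sketch. This is not ANGSD's implementation: the function names, the `header` and `calls` inputs, and the 60-column line wrap are all assumptions made for illustration. It only shows the idea of filling a sequence of the length given in the header with "N" and overwriting the sites that have data.

```python
def build_sequence(length, called_bases):
    """Return a sequence of `length` characters, with "N" where no base was called.

    `called_bases` is a hypothetical input mapping 0-based positions to a base.
    """
    seq = ["N"] * length
    for pos, base in called_bases.items():
        if 0 <= pos < length:
            seq[pos] = base
    return "".join(seq)


def format_fasta(name, seq, width=60):
    """Format one fasta record; the 60-column wrap is an assumption."""
    lines = [">" + name]
    lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical header information (chromosome name -> length) and called bases.
header = {"chrA": 10}
calls = {2: "A", 3: "C", 7: "T"}

record = format_fasta("chrA", build_sequence(header["chrA"], calls))
print(record, end="")
# prints:
# >chrA
# NNACNNNTNN
```

Positions 2, 3 and 7 carry data in this toy input; every other site falls back to "N".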

<classdiagram>
[One bam file{bg:orange}]->[sequencing data|random base (-doFasta 1);consensus base (-doFasta 2)]

[sequencing data]->doFasta[fasta file{bg:blue}]

</classdiagram>

Brief Overview

> ./angsd -doFasta
--------------
analysisFasta.cpp:
	-doFasta	0
	1: use a random base
	2: use the most common base (needs -doCounts 1)
	-minQ		13	(remove bases with qscore<minQ)

options

-doFasta 1: sample a random base at each position

-doFasta 2: use the most common base. In the case of ties, a random base is chosen among the bases that share the maximum count. The "-doCounts 1" option (allele counts) is required in order to determine the most common base.

-minQ [INT]

Minimum base quality score. Bases with a quality score below this value are removed (default: 13).
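The three options can be sketched per site as follows. This is a hedged illustration, not the code in analysisFasta.cpp: the function names are hypothetical, and drawing the random base uniformly from the observed bases (so that frequent bases are chosen more often) is an assumption. The quality filter, the tie-breaking rule and the "N" for sites without data follow the descriptions on this page.

```python
import random
from collections import Counter


def filter_by_quality(bases, quals, min_q=13):
    """Drop bases whose quality score is below min_q (as -minQ does)."""
    return [b for b, q in zip(bases, quals) if q >= min_q]


def random_base(bases, rng=random):
    """Sketch of -doFasta 1: pick one observed base at random, or "N" without data."""
    if not bases:
        return "N"
    return rng.choice(bases)


def most_common_base(bases, rng=random):
    """Sketch of -doFasta 2: most common base, ties broken at random."""
    if not bases:
        return "N"
    counts = Counter(bases)  # the allele counts that -doCounts 1 provides
    top = max(counts.values())
    tied = sorted(b for b, c in counts.items() if c == top)
    return rng.choice(tied)


# Toy pileup for a single site: observed bases and their quality scores.
bases = ["A", "A", "C", "G", "C"]
quals = [30, 20, 35, 5, 40]

kept = filter_by_quality(bases, quals)  # the G (quality 5) is removed
print(kept)                    # ['A', 'A', 'C', 'C']
print(random_base(kept))       # A or C
print(most_common_base(kept))  # A or C, since both have count 2
```

Passing a seeded `random.Random` instance as `rng` makes the random choices reproducible in this sketch.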

Example

Create a fasta file by sampling a random base at each position:

./angsd -i smallNA07056.mapped.ILLUMINA.bwa.CEU.low_coverage.20111114.bam -doFasta 1

Fasta: Difference between revisions


Revision by Thorfinn (no edit summary).