ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Genotype Likelihoods


Revision as of 18:00, 19 September 2012

Genotype likelihoods from alignments

<classdiagram> // [input|bam files;SOAP files{bg:orange}]->[sequence data]

[sequence data]->[genotype likelihoods|samtools;GATK;soapSNP;kim et.al]
</classdiagram>


-GL [int]

If your input is a sequencing file, you can estimate genotype likelihoods from the mapped reads. Four different methods are available.

Samtools

-GL 1

This method has a random component. Samtools includes a stochastic component, so to get exactly the same results as samtools, use nThreads=1. With multiple threads the method is still the same, but some sites will differ slightly from the samtools output because of the stochastic component.

options

-minQ [int]

default 13. The minimum allowed base quality score.

-minMapQ [int]

default 0. The minimum allowed mapping quality score.
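Both thresholds are Phred-scaled scores, where a score Q corresponds to an error probability of 10^(-Q/10). As a rough illustration of what the default -minQ 13 means, the sketch below converts a score into a probability. The name phred_to_error is a hypothetical helper for this page, not an ANGSD command.

```shell
# Hypothetical helper (not an ANGSD command): convert a Phred-scaled
# quality score Q into the error probability 10^(-Q/10).
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

phred_to_error 13   # prints 0.0501
phred_to_error 20   # prints 0.0100
```

With the default -minQ 13, bases whose estimated error probability is above roughly 5% are therefore excluded.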

example

./angsd -bam bam.filelist -GL 1 -out outfile
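The example assumes that bam.filelist is a plain-text file with one BAM path per line. A minimal sketch for building such a list is shown here; make_bam_filelist and the directory name are hypothetical and only serve as an illustration.

```shell
# Hypothetical helper (not an ANGSD command): write the path of every
# *.bam file in a directory to a list file, one path per line.
# Usage: make_bam_filelist <directory> <output-file>
make_bam_filelist() {
    : > "$2"                          # create or truncate the list file
    for f in "$1"/*.bam; do
        if [ -e "$f" ]; then          # skip the literal pattern when nothing matches
            printf '%s\n' "$f" >> "$2"
        fi
    done
}
```

Calling make_bam_filelist on a directory of alignments (for instance a hypothetical bams/ directory) with bam.filelist as the output file produces a list that can be passed to -bam as in the example above.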

GATK

-GL 2

options

-minQ [int]

default 13. The minimum allowed base quality score.

-minMapQ [int]

default 0. The minimum allowed mapping quality score.

example

./angsd -bam bam.filelist -GL 2 -out outfile

soapSNP

-GL 3

When estimating genotype likelihoods with soapSNP, a calibration matrix must be generated first. This is done automatically if the calibration files do not already exist. They are located in angsd_tmpdir/basenameNUM.count and angsd_tmpdir/basenameNUM.qual.

options

-minQ [int]

default 13. The minimum allowed base quality score.

-minMapQ [int]

default 0. The minimum allowed mapping quality score.

-maxq [int]

default 51. The maximum allowed base quality score.

-L [int]

default 150. The maximum read length (choosing a value that is too large is not a problem).

example

./angsd -bam bam.filelist -GL 3 -out outfile -minQ 0 -ref hg19.fa 

This first run estimates nothing other than the calibration matrix. Running the same command again then performs the analysis we want:

./angsd -bam bam.filelist -GL 3 -out outfile -minQ 0 -ref hg19.fa
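Because the two runs use an identical command line, it can help to check which of them is about to happen. The sketch below looks for .count and .qual files in angsd_tmpdir, following the file locations given above. The function has_calibration is a hypothetical helper, and the assumption that the presence of these files is what decides between the two passes is inferred from the description on this page rather than from the ANGSD source.

```shell
# Hypothetical helper (not an ANGSD command): succeed only if the given
# directory holds at least one .count and one .qual calibration file.
has_calibration() {
    dir=${1:-angsd_tmpdir}
    ls "$dir"/*.count >/dev/null 2>&1 && ls "$dir"/*.qual >/dev/null 2>&1
}

if has_calibration angsd_tmpdir; then
    echo "calibration files found: the next run performs the analysis"
else
    echo "no calibration files: the next run only builds the calibration matrix"
fi
```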

Kim et al.

-GL 4

options

-error [filename]

A file with the estimated type-specific error rates (see Error_estimation).

example

./angsd -bam bam.filelist -GL 4 -out outfile -error error.file 


output genotype likelihoods