ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

RealSFS: Difference between revisions

Revision as of 09:53, 10 March 2014

This program estimates the site frequency spectrum (SFS) from a .saf file generated with ./angsd [options] -doSaf.

Brief overview

./emOptim2 afile.saf nChr [-start FNAME -P nThreads -tole tole -maxIter  -nSites -use-BFGS ]
nChr is the number of chromosomes (twice the number of diploid individuals).

By default the program uses the EM algorithm for the optimisation. See the example below for using the BFGS optimisation.

emOptim2 sfstest.saf 20 -P 4 >sfs.em
emOptim2 sfstest.saf 20 -P 4 -use-BFGS 1 >sfs.bfgs


The emOptim2 program reads in a block of the genome (from the .saf file) and estimates the SFS for that region.

The size of the block can be chosen using the -nSites argument; otherwise the program will try to read in the entire .saf file.

If your .saf file contains more sites than -nSites (you can check the number of sites in the .saf.pos file), the program will loop over the genome and output the results block by block. Each line in your Whit.saf.ml is then an SFS for one region.
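
To see how output lines could map to regions, here is a minimal Python sketch. It assumes that blocks are consecutive, non-overlapping runs of -nSites sites in file order, and that a shorter final block is also reported; neither detail is confirmed on this page. The function name block_ranges and the site counts are made up for illustration.

```python
def block_ranges(total_sites, n_sites):
    """Return (start, end) site index pairs, end exclusive, one per block.

    Assumption: blocks are consecutive and non-overlapping, and a shorter
    trailing block is kept. This is an illustration, not emOptim2's code.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return [(start, min(start + n_sites, total_sites))
            for start in range(0, total_sites, n_sites)]


# Hypothetical numbers: 250000 sites in the .saf file, -nSites 100000.
for line_no, (start, end) in enumerate(block_ranges(250000, 100000), start=1):
    print(f"output line {line_no}: sites {start}..{end - 1}")
```

Under these assumptions the number of output lines is the site count divided by -nSites, rounded up.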

Output

The main results are printed to stdout. Values are in log space.
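
Because the values are in log space, they usually need to be exponentiated before plotting or further analysis. The Python sketch below converts one output line into proportions that sum to one. It assumes the values are natural logarithms separated by whitespace, which this page does not state; the function name sfs_from_log_line and the example numbers are made up and are not program output.

```python
import math


def sfs_from_log_line(line):
    """Convert one line of log-space values into normalised proportions.

    Assumption: values are natural logs separated by whitespace.
    """
    log_values = [float(token) for token in line.split()]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_values)
    weights = [math.exp(value - peak) for value in log_values]
    total = sum(weights)
    return [weight / total for weight in weights]


# Made-up example line with three categories (not real program output).
example = "-0.223144 -2.302585 -2.302585"
print(sfs_from_log_line(example))
```

If the .saf file was processed in several blocks, the same function can be applied to each output line in turn.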

NB

Use as many sites as possible, for more reliable estimates.