ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Thetas,Tajima,Neutrality tests: Difference between revisions
Line 60: | Line 60: | ||
<pre> | <pre> | ||
( | (59999,69999)(60000,70000)(60000,70000) chr1 65000 2349.039592 2008.865974 2791.401569 3817.828656 2913.347320 -0.545594 -0.626967 -0.486984 -0.617873 0.195337 10000 | ||
( | (69999,79999)(70000,80000)(70000,80000) chr1 75000 2349.113388 1993.792014 2764.051812 3979.987797 2986.889940 -0.569871 -0.617112 -0.456779 -0.678388 0.220762 10000 | ||
(79999,89999)(80000,90000)(80000,90000) chr1 85000 2349.154140 2035.577279 2649.132059 3902.254435 2968.915852 -0.502912 -0.491556 -0.330221 -0.637555 0.214522 10000 | |||
( | (89999,99999)(90000,100000)(90000,100000) chr1 95000 2349.462773 2048.143641 2533.193917 3881.554872 2964.849262 -0.483190 -0.388552 -0.202228 -0.626111 0.212980 10000 | ||
(99999,109999)(100000,110000)(100000,110000) chr1 105000 2349.306947 2103.402129 2608.611593 3738.658529 2921.030347 -0.394355 -0.404727 -0.285429 -0.558478 0.197881 10000 | |||
( | (109999,119999)(110000,120000)(110000,120000) chr1 115000 2348.965451 1867.325681 2725.815492 4491.310734 3179.318214 -0.772512 -0.687843 -0.414876 -0.896283 0.287438 10000 | ||
(119999,129999)(120000,130000)(120000,130000) chr1 125000 2349.437816 2077.636124 2623.517860 3755.631838 2916.633993 -0.435861 -0.437286 -0.301676 -0.573043 0.196304 10000 | |||
( | |||
( | |||
( | |||
( | |||
</pre> | </pre> | ||
Format is: | Format is: | ||
<code> | <code> | ||
(indexStart,indexStop)(posStart,posStop) chrname wincenter tajD fuliD | (indexStart,indexStop)(posStart,posStop) chrname wincenter tW tP tF tH tL tajD fulif fuliD fayH zengsE | ||
</code> | </code> | ||
=Citation= | =Citation= | ||
[[Korneliussen2013]] | [[Korneliussen2013]] |
Revision as of 14:34, 7 October 2013
This method will estimate different thetas (population scaled mutation rate) and can based on these thetas calculate Tajima's D and various other neutrality test statistics. Method is described in Korneliussen2013.
NB Information on this website is for version 0.551 or higher.
Example
Below is a chain of commands used for caculating statistics. Its a 3 step procedure
- Estimate an site frequency spectrum. Output is .em.ml file.
- Calculate per-site thetas. Output is a .thetas.gz file.
- Calculate neutrality tests statistics. Output is a .thetas.gz.pestPG file.
#(estimate an SFS) ../angsd0.551/angsd -bam pop1.list -out bingo -realSFS 1 ../angsd0.551/misc/emOptim2 bingo.sfs 40 -P 24 >bingo.em.ml #(calculate thetas) ../angsd0.551/angsd -bam pop1.list -out bongo -doThetas 1 -realSFS 1 -pest bingo.em.ml #(calculate Tajimas.) ../angsd0.551/misc/bgid make_bed taj/aout2.thetas.gz ../angsd0.551/misc/bgid do_stat taj/aout2.thetas.gz -nChr 40
Sliding Window example
We can easily do a sliding window analysis by adding -win/-step arguments to the last command
../angsd0.551/misc/bgid do_stat taj/aout2.thetas.gz -nChr 40 -win 50000 -step 10000
This will calculate the test statistic using a window size of 50kb and a step size of 10kb.
Example Output
.thetas.gz is
chr10 1 -7.041327 -7.362126 -6.399318 -8.788662 -7.840050 chr10 2 -8.337345 -8.921354 -7.358819 -11.052126 -9.502293 chr10 3 -9.283203 -9.945923 -8.207280 -12.281002 -10.546671 chr10 4 -1.105216 -0.617477 -35.602360 -0.730974 -0.672616 chr10 5 -7.705427 -8.374287 -6.622234 -10.726430 -8.976529 chr10 6 -11.683662 -12.369866 -10.578860 -14.766857 -12.975926 chr10 7 -11.688400 -12.374647 -10.583545 -14.771753 -12.980717 chr10 8 -12.989104 -13.675391 -11.884201 -16.072602 -14.281470 chr10 9 -13.682495 -14.368910 -12.577435 -16.766463 -14.975017 chr10 10 -1.105216 -0.655518 -37.649835 -0.982402 -0.805662 chr10 11 -14.433381 -15.119890 -13.328206 -17.517696 -15.726019 chr10 12 -14.410507 -15.097004 -13.305348 -17.494774 -15.703129 chr10 13 -14.963982 -15.650467 -13.858836 -18.048209 -16.256590 chr10 14 -15.657183 -16.343698 -14.552002 -18.741516 -16.949827 chr10 15 -15.762558 -16.449085 -14.657361 -18.846938 -17.055218 chr10 16 -1.105216 -0.766882 -12.373455 -1.539395 -1.080326 chr10 17 -15.876465 -16.563007 -14.771249 -18.960901 -17.169143 chr10 18 -17.256944 -17.943487 -16.151728 -20.341382 -18.549623
- 1. chromosome
- 2. position
- 3. ThetaWatterson
- 4. ThetaD (nucleotide diversity)
- 5. Theta? (singleton category)
- 6. ThetaL
- 7. ThetaH
.thetas.gz.pestPG
(59999,69999)(60000,70000)(60000,70000) chr1 65000 2349.039592 2008.865974 2791.401569 3817.828656 2913.347320 -0.545594 -0.626967 -0.486984 -0.617873 0.195337 10000 (69999,79999)(70000,80000)(70000,80000) chr1 75000 2349.113388 1993.792014 2764.051812 3979.987797 2986.889940 -0.569871 -0.617112 -0.456779 -0.678388 0.220762 10000 (79999,89999)(80000,90000)(80000,90000) chr1 85000 2349.154140 2035.577279 2649.132059 3902.254435 2968.915852 -0.502912 -0.491556 -0.330221 -0.637555 0.214522 10000 (89999,99999)(90000,100000)(90000,100000) chr1 95000 2349.462773 2048.143641 2533.193917 3881.554872 2964.849262 -0.483190 -0.388552 -0.202228 -0.626111 0.212980 10000 (99999,109999)(100000,110000)(100000,110000) chr1 105000 2349.306947 2103.402129 2608.611593 3738.658529 2921.030347 -0.394355 -0.404727 -0.285429 -0.558478 0.197881 10000 (109999,119999)(110000,120000)(110000,120000) chr1 115000 2348.965451 1867.325681 2725.815492 4491.310734 3179.318214 -0.772512 -0.687843 -0.414876 -0.896283 0.287438 10000 (119999,129999)(120000,130000)(120000,130000) chr1 125000 2349.437816 2077.636124 2623.517860 3755.631838 2916.633993 -0.435861 -0.437286 -0.301676 -0.573043 0.196304 10000
Format is:
(indexStart,indexStop)(posStart,posStop) chrname wincenter tW tP tF tH tL tajD fulif fuliD fayH zengsE