ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Genotype Distribution
Works with version 0.913 and above. The latest development version can be found on GitHub.
This method allows estimation of the expected genotype counts or fractions for one or two individuals based on genotype likelihoods. This can be very useful for a number of population genetic statistics, including relatedness and heterozygosity.
Examples of genotype fractions for a single individual
All 10 possible genotypes
pAA | pAC | pAG | pAT | pCC | pCG | pCT | pGG | pGT | pTT |
---|---|---|---|---|---|---|---|---|---|
0.293 | 9.3e-05 | 0.000331 | 7.3e-05 | 0.2 | 7.7e-05 | 0.000411 | 0.204 | 7e-05 | 0.302 |
or the number of derived alleles (use the SFS method)
pAA | pAD | pDD |
---|---|---|
0.9986 | 0.0003168 | 0.001127 |
or homozygous vs. heterozygous
pHO | pHE |
---|---|
0.9987 | 0.0003168 |
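To make the relation between the ten-genotype fractions and a homozygous/heterozygous split concrete, the sketch below sums the four homozygous and six heterozygous entries of the ten-genotype example row above. It is plain arithmetic on the printed numbers, not the estimation that misc/ibs performs with -model 1, and the function name `collapse_to_ho_he` is hypothetical. The example tables in this section need not come from the same data, so the result is not expected to reproduce the pHO/pHE values shown.

```python
# Genotype order as used in the ten-genotype table above.
GENOTYPES = ["AA", "AC", "AG", "AT", "CC", "CG", "CT", "GG", "GT", "TT"]


def collapse_to_ho_he(fractions):
    """Sum 10 genotype fractions into (homozygous, heterozygous) totals.

    Hypothetical helper for illustration; it only adds up numbers and does
    not re-estimate anything from genotype likelihoods.
    """
    if len(fractions) != len(GENOTYPES):
        raise ValueError("expected 10 genotype fractions")
    p_ho = sum(f for g, f in zip(GENOTYPES, fractions) if g[0] == g[1])
    p_he = sum(f for g, f in zip(GENOTYPES, fractions) if g[0] != g[1])
    return p_ho, p_he


# Values copied from the single-individual example row above.
example = [0.293, 9.3e-05, 0.000331, 7.3e-05, 0.2,
           7.7e-05, 0.000411, 0.204, 7e-05, 0.302]
p_ho, p_he = collapse_to_ho_he(example)
print(f"pHO={p_ho:.6f} pHE={p_he:.6f}")
```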
For two individuals, the output can be the full 10x10 table of possible genotype combinations
Example of 10x10 genotype probability
| | AA | AC | AG | AT | CC | CG | CT | GG | GT | TT |
|---|---|---|---|---|---|---|---|---|---|---|
| AA | 0.0420 | 0.0130 | 0.0200 | 0.0170 | 0.0160 | 0.0170 | 0.0150 | 0.0240 | 0.0042 | 0.0500 |
| AC | 0.0030 | 0.0034 | 0.0071 | 0.0067 | 0.0074 | 0.0071 | 0.0065 | 0.0074 | 0.0032 | 0.0038 |
| AG | 0.0030 | 0.0033 | 0.0068 | 0.0064 | 0.0070 | 0.0068 | 0.0061 | 0.0070 | 0.0028 | 0.0034 |
| AT | 0.0071 | 0.0084 | 0.0110 | 0.0110 | 0.0110 | 0.0110 | 0.0100 | 0.0120 | 0.0072 | 0.0084 |
| CC | 0.0180 | 0.0045 | 0.0110 | 0.0100 | 0.0092 | 0.0100 | 0.0089 | 0.0140 | 0.0016 | 0.0240 |
| CG | 0.0015 | 0.0018 | 0.0061 | 0.0061 | 0.0067 | 0.0063 | 0.0060 | 0.0067 | 0.0019 | 0.0015 |
| CT | 0.0029 | 0.0032 | 0.0068 | 0.0064 | 0.0070 | 0.0067 | 0.0060 | 0.0069 | 0.0027 | 0.0033 |
| GG | 0.0180 | 0.0054 | 0.0110 | 0.0096 | 0.0088 | 0.0094 | 0.0085 | 0.0120 | 0.0012 | 0.0200 |
| GT | 0.0029 | 0.0033 | 0.0069 | 0.0066 | 0.0072 | 0.0070 | 0.0062 | 0.0071 | 0.0027 | 0.0031 |
| TT | 0.0400 | 0.0130 | 0.0200 | 0.0170 | 0.0150 | 0.0170 | 0.0150 | 0.0240 | 0.0038 | 0.0480 |
or the number of derived alleles (use 2D SFS method for this)
ind1 \ ind2 | pAA | pAD | pDD |
---|---|---|---|
pAA | 0.6561 | 0.1458 | 0.0081 |
pAD | 0.1458 | 0.0324 | 0.0018 |
pDD | 0.0081 | 0.0018 | 0.0001 |
or the heterozygous and homozygous combinations
HO HO | HO HE | HE HO | HE HE | HO altHO |
---|---|---|---|---|
0.6562 | 0.1476 | 0.1476 | 0.0324 | 0.0162 |
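The five-class row above can be reproduced from the 3x3 derived-allele table by adding up cells, which the following sketch illustrates. It assumes that "HO HE" means the first individual is homozygous and the second heterozygous, and that "HO altHO" means the two individuals are homozygous for different alleles; both readings are assumptions about the labels, and `collapse_pair_table` is a hypothetical helper, not part of misc/ibs.

```python
STATES = ["AA", "AD", "DD"]  # ancestral hom., heterozygous, derived hom.


def collapse_pair_table(table):
    """Collapse a 3x3 joint table (rows: ind1, columns: ind2) into 5 classes."""
    classes = {"HO HO": 0.0, "HO HE": 0.0, "HE HO": 0.0,
               "HE HE": 0.0, "HO altHO": 0.0}
    for i, g1 in enumerate(STATES):
        for j, g2 in enumerate(STATES):
            p = table[i][j]
            het1, het2 = g1 == "AD", g2 == "AD"
            if het1 and het2:
                classes["HE HE"] += p
            elif het1:
                classes["HE HO"] += p
            elif het2:
                classes["HO HE"] += p
            elif g1 == g2:
                classes["HO HO"] += p      # same homozygous genotype
            else:
                classes["HO altHO"] += p   # opposite homozygous genotypes
    return classes


# Values copied from the 3x3 example table above.
table = [
    [0.6561, 0.1458, 0.0081],
    [0.1458, 0.0324, 0.0018],
    [0.0081, 0.0018, 0.0001],
]
classes = collapse_pair_table(table)
for name, value in classes.items():
    print(f"{name}\t{value:.4f}")
```

With the example values, the five sums agree with the row printed above (0.6562, 0.1476, 0.1476, 0.0324, 0.0162).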
Brief Overview
misc/ibs
Needed arguments:
    -glf/-f          input GLF filename:
Optional arguments:
    -outFileName/-o  output filename(prefix):
    -nInd/-n         nubmer of individuals in GLF file:
    -ind1/i1         individuals 1:
    -ind2/i2         individuals 2:
    -allpairs/-a     analyse all pairs:
    -maxSites/-m     maximum sites to analyze:
    -model           ibs model:0 all 10 genotypes, 1 HO/HE
Options
- -glf [fileName]
A binary GLF file that contains 10 genotype likelihoods per site per individual, as produced by ANGSD with -doGlf 1.
- -outFileName [fileName]
Prefix for the output file names. Defaults to the GLF input filename.
- -nInd [int]
Number of individuals in the GLF file. This is needed if you have more than one individual in the GLF file.
- -ind1 [int]
If you do not want to analyse all individuals, you can specify a single individual to analyse: an integer from 0 to nInd-1. The first individual is individual 0.
- -ind2 [int]
If you only want to analyse a single pair of individuals, specify both -ind1 and -ind2, each an integer from 0 to nInd-1. The first individual is individual 0.
- -allpairs [int]
Use -allpairs 1 to analyse all pairs of individuals.
- -maxSites [int]
Maximum number of sites to analyse. This is useful if you do not have enough RAM for the whole genome.
- -model [int]
IBS model: 0 = all 10 genotypes, 1 = HO/HE.
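When misc/ibs is driven from a script, it can help to assemble the argument list in one place. The sketch below is one possible way to do that in Python, using only the flags listed above; the function `build_ibs_command`, its parameter names, and the index checks are hypothetical conveniences rather than anything shipped with ANGSD. It only builds the command and does not run it.

```python
import shlex


def build_ibs_command(glf, n_ind=None, ind1=None, ind2=None, all_pairs=False,
                      out_prefix=None, max_sites=None, model=None,
                      binary="misc/ibs"):
    """Return an argument list for misc/ibs (hypothetical helper)."""
    # Assumption of this sketch: reject indices outside 0..nInd-1 up front.
    for name, idx in (("ind1", ind1), ("ind2", ind2)):
        if idx is not None and n_ind is not None and not 0 <= idx < n_ind:
            raise ValueError(f"{name} must be between 0 and nInd-1")
    if ind2 is not None and ind1 is None:
        raise ValueError("a pair needs both ind1 and ind2")

    cmd = [binary, "-glf", glf]
    if n_ind is not None:
        cmd += ["-nInd", str(n_ind)]
    if ind1 is not None:
        cmd += ["-ind1", str(ind1)]
    if ind2 is not None:
        cmd += ["-ind2", str(ind2)]
    if all_pairs:
        cmd += ["-allpairs", "1"]
    if max_sites is not None:
        cmd += ["-maxSites", str(max_sites)]
    if model is not None:
        cmd += ["-model", str(model)]
    if out_prefix is not None:
        cmd += ["-outFileName", out_prefix]
    return cmd


pair_cmd = build_ibs_command("genolike.glf.gz", n_ind=10, ind1=0, ind2=3)
print(shlex.join(pair_cmd))  # pass pair_cmd to subprocess.run(...) to execute
```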
Output
If you analyse each individual separately (-allpairs 0)
Example of output *.ibs
ind nSites Llike pAA pAC pAG pAT pCC pCG pCT pGG pGT pTT
0 2713190 -3743970.428653 0.299576 0.000081 0.000300 0.000057 0.206840 0.000069 0.000319 0.200658 0.000080 0.292020
1 2696847 -3745104.294527 0.293158 0.000093 0.000331 0.000073 0.199560 0.000077 0.000411 0.203912 0.000070 0.302315
2 2708392 -3744487.856004 0.292558 0.000074 0.000304 0.000054 0.210749 0.000072 0.000347 0.205672 0.000064 0.290105
3 2703572 -3747132.052095 0.320657 0.000073 0.000269 0.000070 0.197760 0.000057 0.000296 0.191058 0.000076 0.289685
4 2645152 -3745343.670384 0.304946 0.000063 0.000248 0.000041 0.196731 0.000058 0.000242 0.196616 0.000047 0.301008
5 2697327 -3757413.590019 0.318968 0.000098 0.000323 0.000074 0.177954 0.000076 0.000367 0.182372 0.000097 0.319671
6 2712223 -3745037.550278 0.309327 0.000040 0.000222 0.000048 0.200149 0.000056 0.000238 0.195224 0.000053 0.294644
7 2671708 -3751258.005057 0.323357 0.000066 0.000318 0.000066 0.185810 0.000079 0.000336 0.187431 0.000072 0.302463
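A *.ibs file like the one above is easy to post-process. The following sketch reads the header and rows and sums the heterozygous genotype fractions per individual. It assumes the columns are whitespace-delimited (tabs or spaces), which is how the example is shown here; `read_ibs` and `heterozygous_fraction` are hypothetical helper names, and the two sample rows are copied from the example output.

```python
import io

# First two rows of the example *.ibs output above.
SAMPLE = """\
ind nSites Llike pAA pAC pAG pAT pCC pCG pCT pGG pGT pTT
0 2713190 -3743970.428653 0.299576 0.000081 0.000300 0.000057 0.206840 0.000069 0.000319 0.200658 0.000080 0.292020
1 2696847 -3745104.294527 0.293158 0.000093 0.000331 0.000073 0.199560 0.000077 0.000411 0.203912 0.000070 0.302315
"""


def read_ibs(handle):
    """Parse a whitespace-delimited *.ibs file into a list of dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        rows.append({
            "ind": int(row["ind"]),
            "nSites": int(row["nSites"]),
            "Llike": float(row["Llike"]),
            # "pAC" -> "AC"; keep only the genotype fraction columns
            "fractions": {name[1:]: float(value)
                          for name, value in row.items()
                          if name.startswith("p")},
        })
    return rows


def heterozygous_fraction(fractions):
    """Sum the fractions of genotypes whose two alleles differ."""
    return sum(p for g, p in fractions.items() if g[0] != g[1])


rows = read_ibs(io.StringIO(SAMPLE))
for r in rows:
    print(r["ind"], r["nSites"], f"{heterozygous_fraction(r['fractions']):.6f}")
```

For a real file, replace `io.StringIO(SAMPLE)` with an open file handle such as `open("all.ibs")`.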
Example of output *.ibspair
ind1 ind2 nSites Llike pAA_AA pAC_AA pAG_AA pAT_AA pCC_AA pCG_AA pCT_AA pGG_AA pGT_AA pTT_AA pAA_AC pAC_AC pAG_AC pAT_AC pCC_AC pCG_AC pCT_AC pGG_AC pGT_AC pTT_AC pAA_AG pAC_AG pAG_AG pAT_AG pCC_AG pCG_AG pCT_AG pGG_AG pGT_AG pTT_AG pAA_AT pAC_AT pAG_AT pAT_AT pCC_AT pCG_AT pCT_AT pGG_AT pGT_AT pTT_AT pAA_CC pAC_CC pAG_CC pAT_CC pCC_CC pCG_CC pCT_CC pGG_CC pGT_CC pTT_CC pAA_CG pAC_CG pAG_CG pAT_CG pCC_CG pCG_CG pCT_CG pGG_CG pGT_CG pTT_CG pAA_CT pAC_CT pAG_CT pAT_CT pCC_CT pCG_CT pCT_CT pGG_CT pGT_CT pTT_CT pAA_GG pAC_GG pAG_GG pAT_GG pCC_GG pCG_GG pCT_GG pGG_GG pGT_GG pTT_GG pAA_GT pAC_GT pAG_GT pAT_GT pCC_GT pCG_GT pCT_GT pGG_GT pGT_GT pTT_GT pAA_TT pAC_TT pAG_TT pAT_TT pCC_TT pCG_TT pCT_TT pGG_TT pGT_TT pTT_TT
0 1 2666273 -15101833.893403 0.044284 0.002556 0.002840 0.007315 0.017972 0.001189 0.002824 0.019690 0.003079 0.044594 0.011683 0.003200 0.003507 0.008936 0.003465 0.001633 0.003465 0.004289 0.003717 0.011984 0.016918 0.007269 0.007371 0.012387 0.009642 0.005872 0.007471 0.009886 0.007592 0.017011 0.017168 0.005425 0.005240 0.009444 0.009141 0.005412 0.005458 0.009405 0.005514 0.017127 0.017411 0.007036 0.006751 0.011750 0.009794 0.007228 0.006963 0.009600 0.007101 0.017301 0.016278 0.006881 0.006509 0.011501 0.009320 0.007174 0.006715 0.009397 0.006866 0.016429 0.018595 0.006579 0.006201 0.011048 0.009953 0.006773 0.006366 0.009590 0.006492 0.018422 0.019068 0.007246 0.006854 0.012420 0.010073 0.006641 0.007062 0.009515 0.007063 0.018884 0.002373 0.002853 0.002495 0.007785 0.001071 0.001662 0.002712 0.000920 0.002506 0.002561 0.053818 0.003643 0.003223 0.009266 0.023981 0.001400 0.003529 0.021892 0.003049 0.053435
0 2 2676526 -15816810.609543 0.041516 0.002951 0.002951 0.007122 0.018048 0.001474 0.002893 0.017909 0.002886 0.040033 0.013051 0.003357 0.003251 0.008372 0.004535 0.001808 0.003196 0.005441 0.003253 0.012761 0.020086 0.007098 0.006762 0.011401 0.011445 0.006130 0.006762 0.011151 0.006894 0.019695 0.017164 0.006746 0.006427 0.010837 0.010155 0.006090 0.006382 0.009635 0.006587 0.016748 0.015683 0.007385 0.007046 0.011432 0.009218 0.006675 0.007040 0.008797 0.007213 0.015285 0.017100 0.007088 0.006779 0.011182 0.009968 0.006274 0.006734 0.009407 0.006952 0.016626 0.014913 0.006453 0.006062 0.010040 0.008931 0.005985 0.006022 0.008548 0.006229 0.014648 0.024341 0.007421 0.006985 0.011662 0.013634 0.006674 0.006895 0.012352 0.007094 0.023660 0.004206 0.003167 0.002798 0.007235 0.001649 0.001867 0.002732 0.001216 0.002748 0.003767 0.049895 0.003815 0.003394 0.008391 0.023567 0.001494 0.003347 0.020295 0.003070 0.047911
0 3 2671296 -15660554.911695 0.037567 0.002986 0.003026 0.007970 0.012805 0.001148 0.002770 0.012966 0.002881 0.032981 0.022781 0.003080 0.003141 0.008308 0.008247 0.001530 0.002872 0.009005 0.002993 0.020261 0.021752 0.006992 0.006880 0.011773 0.010969 0.005788 0.006579 0.010640 0.006706 0.019734 0.018656 0.006450 0.006254 0.010667 0.009591 0.005672 0.006007 0.009326 0.006165 0.016988 0.017648 0.006789 0.006528 0.010714 0.009206 0.005808 0.006312 0.008878 0.006328 0.016119 0.017684 0.007068 0.006723 0.011350 0.009147 0.005854 0.006497 0.008710 0.006443 0.016055 0.018377 0.007038 0.006607 0.011320 0.009632 0.005736 0.006385 0.009067 0.006311 0.016671 0.021570 0.006779 0.006205 0.011137 0.010963 0.005538 0.006004 0.010191 0.005906 0.019442 0.017537 0.003402 0.002811 0.008049 0.007456 0.001552 0.002743 0.006172 0.002401 0.015417 0.045183 0.003889 0.003270 0.009095 0.017183 0.001220 0.003156 0.014249 0.002672 0.038899
0 4 2614962 -14819561.589144 0.043463 0.002407 0.002645 0.007394 0.016497 0.001107 0.002488 0.017405 0.002799 0.042466 0.015502 0.002873 0.003116 0.008777 0.003788 0.001351 0.002937 0.005238 0.003325 0.014871 0.018295 0.006551 0.006519 0.011845 0.009660 0.005401 0.006393 0.009730 0.006746 0.017307 0.017008 0.006034 0.005870 0.011057 0.009467 0.005539 0.005898 0.009462 0.006205 0.016197 0.018715 0.005621 0.005303 0.009626 0.010008 0.006151 0.005368 0.010072 0.005683 0.018173 0.023104 0.006546 0.006151 0.011009 0.012025 0.007019 0.006153 0.011398 0.006609 0.022200 0.016942 0.006526 0.006154 0.010938 0.009400 0.006624 0.006164 0.008933 0.006549 0.016012 0.020140 0.006080 0.005790 0.010608 0.010473 0.005667 0.005731 0.009818 0.006086 0.019118 0.001466 0.002945 0.002556 0.007448 0.000595 0.001660 0.002631 0.000536 0.002503 0.001708 0.058903 0.003621 0.003087 0.009001 0.024629 0.001300 0.003310 0.021219 0.002857 0.055710
0 5 2666711 -15227151.893570 0.036899 0.002553 0.002824 0.008822 0.010771 0.000729 0.002624 0.011793 0.003056 0.036172 0.022875 0.002763 0.002929 0.009225 0.007305 0.001029 0.002750 0.008607 0.003189 0.022559 0.021971 0.006600 0.006468 0.012324 0.010420 0.005076 0.006412 0.010733 0.006755 0.021399 0.018968 0.006145 0.005983 0.011149 0.009179 0.005045 0.005937 0.009421 0.006261 0.018479 0.017877 0.006457 0.006200 0.011085 0.008798 0.005187 0.006214 0.008926 0.006420 0.017399 0.017960 0.006619 0.006314 0.011399 0.008783 0.005235 0.006347 0.008791 0.006406 0.017389 0.018607 0.006561 0.006204 0.011295 0.009180 0.005133 0.006269 0.009132 0.006206 0.018034 0.021943 0.006268 0.005817 0.011020 0.010518 0.004960 0.005942 0.010337 0.005788 0.021241 0.017231 0.003055 0.002554 0.008172 0.006245 0.001119 0.002750 0.005446 0.002342 0.016849 0.045028 0.003483 0.002981 0.009449 0.015572 0.000810 0.003183 0.013559 0.002600 0.043110
0 6 2680960 -15533429.999386 0.044062 0.002891 0.002915 0.007490 0.016390 0.001064 0.002766 0.016855 0.002865 0.040301 0.013862 0.003288 0.003270 0.008757 0.003943 0.001378 0.003085 0.005044 0.003250 0.013000 0.021137 0.007026 0.006745 0.011713 0.011035 0.005662 0.006700 0.010895 0.006848 0.019762 0.018062 0.006638 0.006310 0.011037 0.009917 0.005632 0.006290 0.009454 0.006471 0.016941 0.016423 0.007178 0.006822 0.011603 0.008910 0.006190 0.006877 0.008591 0.006994 0.015524 0.018048 0.006978 0.006650 0.011437 0.009826 0.005892 0.006664 0.009334 0.006775 0.016896 0.015552 0.006321 0.005853 0.010214 0.008575 0.005558 0.006003 0.008188 0.005974 0.014624 0.025792 0.007423 0.006845 0.012090 0.013320 0.006362 0.006991 0.011994 0.006949 0.023881 0.004151 0.003212 0.002610 0.007736 0.001377 0.001592 0.002795 0.001003 0.002508 0.003837 0.053188 0.003932 0.003208 0.009266 0.022252 0.001276 0.003427 0.019009 0.002949 0.048090
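Each *.ibspair row carries 100 fractions whose column names follow the pattern pX_Y, with the first genotype label X changing fastest (pAA_AA, pAC_AA, ...). The sketch below turns one row into a 10x10 lookup keyed by (X, Y) and sums the diagonal, i.e. the cells where both labels name the same genotype. Which individual X and Y refer to is not stated in the header, so the code does not assume it. The row used here is synthetic and built only for illustration, and `pair_matrix` is a hypothetical helper.

```python
GENOTYPES = ["AA", "AC", "AG", "AT", "CC", "CG", "CT", "GG", "GT", "TT"]


def pair_matrix(header, fields):
    """Return {(first, second): fraction} from one *.ibspair row.

    The first four columns (ind1, ind2, nSites, Llike) are skipped.
    """
    matrix = {}
    for name, value in zip(header[4:], fields[4:]):
        first, second = name[1:].split("_")  # "pAC_AA" -> ("AC", "AA")
        matrix[(first, second)] = float(value)
    return matrix


# Header in the same order as the example output: first label varies fastest.
header = ["ind1", "ind2", "nSites", "Llike"] + [
    f"p{a}_{b}" for b in GENOTYPES for a in GENOTYPES
]

# SYNTHETIC values (not real output): 0.05 on the diagonal, the rest uniform.
values = [0.05 if a == b else 0.5 / 90 for b in GENOTYPES for a in GENOTYPES]
fields = ["0", "1", "1000", "-1234.5"] + [repr(v) for v in values]

matrix = pair_matrix(header, fields)
same_genotype = sum(matrix[(g, g)] for g in GENOTYPES)
print(f"fraction with identical genotypes: {same_genotype:.4f}")
```

To use it on real output, split the header line and a data line of the *.ibspair file on whitespace and pass the two lists to `pair_matrix`.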
Example
First generate a genotype likelihood file for chromosome 1
./angsd -GL 1 -out genolike -doGlf 1 -bam bam.filelist -r 1:
Estimate all 10 genotype fractions for the second individual (-ind1 1; individuals are in the same order as in bam.filelist)
misc/ibs -f genolike.glf.gz -nInd 10 -ind1 1
The output file is genolike.glf.gz.ibs
Estimate all 10 genotype fractions for each of the 10 individuals
misc/ibs -f genolike.glf.gz -nInd 10 -o all
The output file is all.ibs
Estimate the 10x10 genotype fraction matrix for the first (-ind1 0) and the fourth (-ind2 3) individual (same order as in bam.filelist)
misc/ibs -f genolike.glf.gz -nInd 10 -ind1 0 -ind2 3
The output file is genolike.glf.gz.ibspair
Estimate the 10x10 genotype fraction matrix for all pairs (very slow)
misc/ibs -f genolike.glf.gz -nInd 10 -allpairs 1 -o all
The output file is all.ibspair