ANGSD: Analysis of next generation Sequencing Data
Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.
Contamination: Difference between revisions
Revision as of 10:56, 27 June 2014
ANGSD can estimate contamination, but only for chromosomes that exist in a single copy (e.g. chrX in males). The method requires a list of HapMap sites along with their frequencies, and we also recommend discarding regions with low mappability.
We have included mappability and HapMap files for chrX; these are found in the RES subfolder of the ANGSD source package. So if you are working with humans and your sample is male, you can estimate the contamination with the following two commands.
- First we generate a binary count file for chrX for a single BAM file (ANGSD C program)
- Then we run a Fisher's exact test to obtain a p-value, and a jackknife to get an estimate of contamination (R program)
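To illustrate the testing idea behind the second step, here is a minimal Python sketch of a two-sided Fisher's exact test on a 2x2 table. It is not the ANGSD R program. The table layout (minor versus major allele read counts at known polymorphic sites compared with neighbouring sites) and all counts are assumptions made up for illustration, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small relative tolerance guards against float ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical read counts (assumed layout, not real data):
#                      minor allele   major allele
# polymorphic sites         60            4940
# neighbouring sites        20            4980
p_value = fisher_exact_two_sided(60, 4940, 20, 4980)
print(f"p-value: {p_value:.3g}")
```

A small p-value in such a table would indicate that minor-allele reads are enriched at the known polymorphic sites relative to the background error rate, which is the kind of signal a contaminating individual would produce on a single-copy chromosome.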
An example is found below: