RefFinder
Small, fast C program to extract bases from a FASTA file. Download here [1]
The program can be run standalone, or its API can be used to retrieve reference bases from your own code.
Install
wget http://popgen.dk/software/download/refFinder/refFinder.tar.gz
tar xzf refFinder.tar.gz
cd refFinder/
make
cd ..
Stand alone
Example
Generate a three-column file (chromosome, position, reference base) with samtools:
samtools mpileup -b smallBam.filelist -f /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa |cut -f1-3 >small.sam
Use refFinder to look up the base at each position in small.sam, then compare the result with the samtools output:
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full >tst
cmp tst small.sam
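Possible options are:
- inputIsZero: the input positions are zero-indexed rather than one-indexed
- full: print the chromosome and position along with each base
These are flags given after the FASTA file name. Without any flags, only the bases are printed: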
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa |head
a
g
c
t
a
c
t
c
g
g
Or, if we also want the chromosome and position:
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full |head
1 13999902 a
1 13999903 g
1 13999904 c
1 13999905 t
1 13999906 a
1 13999907 c
1 13999908 t
1 13999909 c
1 13999910 g
1 13999911 g
Or, if the input positions are zero-indexed rather than one-indexed:
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full inputIsZero |head
1 13999902 g
1 13999903 c
1 13999904 t
1 13999905 a
1 13999906 c
1 13999907 t
1 13999908 c
1 13999909 g
1 13999910 g
1 13999911 g
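The difference between the two conventions can be sketched with a toy sequence. The helper below, baseAt, is hypothetical and not part of refFinder; it only illustrates that a zero-indexed position p names the same base as the one-indexed position p + 1, which is why the output is shifted by one base when inputIsZero is given. The 11-base toy sequence in the usage note is copied from the two example outputs.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper for illustration only (not part of the refFinder API).
// Returns the base at position `pos` of `seq`.
//   inputIsZero == false: pos is one-indexed (the first base is position 1)
//   inputIsZero == true : pos is zero-indexed (the first base is position 0)
char baseAt(const std::string &seq, long pos, bool inputIsZero) {
    // Convert the requested position to a zero-based offset into the string.
    const long offset = inputIsZero ? pos : pos - 1;
    if (offset < 0 || offset >= static_cast<long>(seq.size())) {
        // Assumption of this sketch: out-of-range positions are an error.
        throw std::out_of_range("position outside sequence");
    }
    return seq[offset];
}
```

With the toy sequence "agctactcggg", baseAt(seq, 1, false) gives the first base, while baseAt(seq, 1, true) gives the second, matching the one-base shift between the two outputs.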
API
#include "refFinder.h"
perFasta *pf = init("hg19.fa");
char refbase = getchar("chr20",130224101,pf);
destroy(pf);
Remember to link with refFinder.o, faidx.o, razf.o and -lz:
g++ sampleProg.cpp refFinder.o faidx.o razf.o -lz