FindRepeatsPacBio
Introduction
This tool searches for and annotates repeat regions inside a BAM file. It intersect the regions provided in the bed file with the BAM file and extracts them. On the extracted regions samtools mpileup will be run and all insertions, deletions or substitutions will be counted on a per read basis
Example
To get the help menu:
biopet tool FindRepeatsPacBio -h
Usage: FindRepeatsPacBio [options]
-l <value> | --log_level <value>
Log level
-h | --help
Print usage
-v | --version
Print version
-I <file> | --inputBam <file>
-b <file> | --inputBed <file>
output file, default to stdout
To run the tool:
biopet tool FindRepeatsPacBio --inputBam myInputbam.bam \
--inputBed myRepeatRegions.bed > mySummary.txt
Since the default output of the program is printed in stdout we can use > to write the output to a text file.
Output
The Output is a tab delimited text file which looks like this:
chr | startPos | stopPos | Repeat_seq | repeatLength | original_Repeat_readLength |
---|---|---|---|---|---|
chr4 | 3076603 | 3076667 | CAG | 3 | 65 |
chr4 | 3076665 | 3076667 | GCC | 3 | 3 |
chrX | 66765158 | 66765261 | GCA | 3 | 104 |
table continues below:
Calculated_repeat_readLength | minLength | maxLength | inserts |
---|---|---|---|
61,73,68 | 61 | 73 | GAC,G,T/A,C,G,G,A,G,A,G/C,C,C,A,C,A,G |
3,3,3 | 3 | 3 | // |
98 | 98 | 98 | A,G,G |
table continues below:
deletions | notSpan |
---|---|
1,1,2,1,1,1,2//2,1,1 | 0 |
// | 0 |
1,1,1,1,1,1,2,1 | 0 |