ECCB 2016 main conference Genome

PT09 – Improve homology search sensitivity of PacBio data by correcting frameshifts


Theater (plenary hall) September 5, 2016 2:00 pm - 2:20 pm



Proceeding talk – Theme: Genome.

Abstract

Single-molecule, real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio), produces longer reads than second-generation sequencing technologies such as Illumina. However, PacBio data have a high sequencing error rate, and most of the errors are insertions or deletions. These cause frameshifts, which lead to marginal alignment scores and short alignments in homology search. As SMRT sequencing has been widely adopted for various sequencing projects, there is an urgent need for dedicated homology search tools for PacBio data. In this work, we introduce Frame-Pro, a profile homology search tool for PacBio reads. Our tool corrects sequencing errors and also outputs the profile alignments of the corrected sequences against characterized protein families. We applied our tool to both simulated and real PacBio data. The results showed that our method enables more sensitive homology search and corrects more errors than a popular error correction tool that does not rely on hybrid sequencing.
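
To make the frameshift problem concrete, the short Python sketch below translates a toy DNA sequence before and after a single-base insertion and, for contrast, after a single-base substitution. It is only an illustration of why insertion and deletion errors hurt protein-level homology search; it is not Frame-Pro and does not reflect how Frame-Pro detects or corrects errors. The sequence and the helper names (`translate`, `insert_base`, `substitute_base`) are hypothetical and chosen for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in reading frame 0, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_base(dna, pos, base):
    """Simulate an insertion error: add one base before position pos."""
    return dna[:pos] + base + dna[pos:]


def substitute_base(dna, pos, base):
    """Simulate a substitution error: replace the base at position pos."""
    return dna[:pos] + base + dna[pos + 1:]


# Toy coding sequence (hypothetical, chosen for illustration only).
reference = "ATGGCCAAGCTGGAAGTT"

with_insertion = insert_base(reference, 6, "A")
with_substitution = substitute_base(reference, 6, "C")

print(translate(reference))          # MAKLEV
print(translate(with_insertion))     # MAKAGS
print(translate(with_substitution))  # MAQLEV
```

The substitution changes a single residue, whereas the insertion alters every residue downstream of the error. A protein-level alignment of the read with the insertion would therefore lose its signal after the error position unless the frame is restored.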


Authors

Nan Du, Michigan State University, United States
Yanni Sun, Michigan State University, United States