Proceeding talk – Theme: Genome.
Abstract
Single-molecule, real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio), produces longer reads than second-generation sequencing technologies such as Illumina. However, PacBio data have a high sequencing error rate, and most of the errors are insertions or deletions. These errors cause frameshifts, which lead to marginal alignment scores and short alignments in homology search. As SMRT sequencing has been widely adopted for various sequencing projects, there is an urgent need for dedicated homology search tools for PacBio data. In this work, we introduce Frame-Pro, a profile homology search tool for PacBio reads. Our tool corrects sequencing errors and also outputs the profile alignments of the corrected sequences against characterized protein families. We applied our tool to both simulated and real PacBio data. The results show that our method enables more sensitive homology search and corrects more errors than a popular error correction tool that does not rely on hybrid sequencing.
Authors
Nan Du, Michigan State University, United States
Yanni Sun, Michigan State University, United States
