ECCB 2016 main conference Genome

PT02 – CoLoRMap: Correcting Long Reads by Mapping short reads


Theater (plenary hall), September 5, 2016, 10:40 am - 11:00 am

Proceeding talk – Theme: Genome.

Abstract

We describe CoLoRMap, a hybrid method for correcting noisy long reads, such as those produced by Pacific Biosciences (PacBio) sequencing technology, using high-quality Illumina paired-end short reads mapped onto the long reads. Our algorithm is based on two novel ideas: (1) using a classical shortest-path algorithm to find a sequence of overlapping short reads that minimizes the distance to a long read, and (2) extending corrected regions by local assembly of unmapped mates of mapped short reads. Our results show that a slight drop in correction accuracy is compensated by corrected long reads that map to the reference better and can be used more reliably in downstream analysis tasks.
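
The shortest-path idea in the abstract can be illustrated with a minimal sketch. It is not the CoLoRMap implementation. It assumes each mapped short read is summarized as a hypothetical `(start, end, cost)` triple, where `start` and `end` are positions on the long read and `cost` is some alignment cost such as an edit distance. It also assumes that two reads are linked when the second starts inside the first and extends past it, and that the weight of a link is simply the cost of the read being entered. The function name `min_cost_read_chain` and the `min_overlap` parameter are illustrative only. Dijkstra's algorithm then selects the cheapest chain of overlapping reads from the leftmost read to the rightmost one.

```python
import heapq

def min_cost_read_chain(reads, min_overlap=10):
    """Return (path, total_cost) for the cheapest chain of overlapping reads.

    reads: list of (start, end, cost) triples (hypothetical summary of each
           short-read alignment; positions are on the long read).
    The chain runs from the read with the smallest start to the read with
    the largest end. Returns None if no overlapping chain connects them.
    """
    if not reads:
        return None
    indices = range(len(reads))
    source = min(indices, key=lambda i: reads[i][0])
    target = max(indices, key=lambda i: reads[i][1])

    # Assumption: the cost of a chain is the sum of the costs of its reads.
    dist = {source: reads[source][2]}
    prev = {}
    heap = [(dist[source], source)]

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        u_start, u_end, _ = reads[u]
        for v, (v_start, v_end, v_cost) in enumerate(reads):
            # Edge u -> v: v starts after u, extends past u, and overlaps it.
            overlaps = (u_end - v_start) >= min_overlap
            if v_start > u_start and v_end > u_end and overlaps:
                nd = d + v_cost
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None

    # Walk back from the target to recover the chain of read indices.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Toy input: four short reads mapped to one long read.
example_reads = [(0, 100, 2), (50, 150, 5), (60, 160, 1), (120, 220, 0)]
print(min_cost_read_chain(example_reads))
```

In this toy input, the chain through the third read is preferred over the one through the second because its alignment cost is lower, even though both overlap the first read. A real corrector would also need to turn the selected chain into a corrected sequence, which this sketch does not attempt.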

Authors

Ehsan Haghshenas, Simon Fraser University, Canada
Faraz Hach, Simon Fraser University, Canada
S. Cenk Sahinalp, Simon Fraser University, Canada
Cedric Chauve, Department of Mathematics, Simon Fraser University, Canada