Proceeding talk – Theme: Genome.
Abstract
We describe CoLoRMap, a hybrid method for correcting noisy long reads, such as those produced by Pacific Biosciences (PacBio) sequencing technology, using high-quality Illumina paired-end short reads mapped onto the long reads. Our algorithm is based on two novel ideas: (i) using a classical shortest path algorithm to find a sequence of overlapping short reads that minimizes the distance to a long read, and (ii) extending corrected regions by local assembly of the unmapped mates of mapped short reads. Our results show that a slight drop in correction accuracy is offset by corrected long reads that map to the reference better and can be used more reliably in downstream analysis tasks.
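The first idea can be illustrated with a minimal sketch. It is not the CoLoRMap implementation. It assumes each short-read mapping is summarized by hypothetical fields (`name`, `start`, `end`, `edits`), that two mappings are linked when the second starts inside the first and ends after it, and that the cost of a chain is simply the sum of the per-read edit distances to the long read. The abstract does not specify the actual edge weights, so this weighting is an assumption. Dijkstra's algorithm then selects the chain of overlapping short reads with the smallest total distance.

```python
import heapq
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    """Hypothetical summary of one short read mapped onto a long read."""
    name: str
    start: int  # first long-read position covered (inclusive)
    end: int    # last long-read position covered (exclusive)
    edits: int  # edit distance between the short read and the long read


def overlaps(a: Mapping, b: Mapping) -> bool:
    """True if b starts inside a and extends past its end (assumed rule)."""
    return a.start < b.start < a.end < b.end


def best_chain(mappings: list[Mapping], source: str,
               target: str) -> Optional[tuple[list[str], int]]:
    """Dijkstra over the overlap graph; path cost = sum of read edit distances."""
    by_name = {m.name: m for m in mappings}
    dist = {source: by_name[source].edits}
    prev: dict[str, str] = {}
    heap = [(dist[source], source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        for v in mappings:
            if overlaps(by_name[u], v):
                nd = d + v.edits
                if nd < dist.get(v.name, float("inf")):
                    dist[v.name] = nd
                    prev[v.name] = u
                    heapq.heappush(heap, (nd, v.name))
    if target not in dist:
        return None  # no chain of overlapping reads connects source to target
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Toy example with made-up coordinates and edit distances.
reads = [
    Mapping("a", 0, 100, 2),
    Mapping("b", 50, 150, 5),
    Mapping("c", 60, 160, 1),
    Mapping("d", 140, 240, 3),
]
print(best_chain(reads, "a", "d"))
```

In this toy input, reads `b` and `c` both bridge `a` and `d`. The chain through `c` is chosen because `c` is closer to the long read.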
Authors
Ehsan Haghshenas, Simon Fraser University, Canada
Faraz Hach, Simon Fraser University, Canada
S. Cenk Sahinalp, Simon Fraser University, Canada
Cedric Chauve, Department of Mathematics, Simon Fraser University, Canada
