Technical Program

Paper Detail

Paper Title Linear-Time Encoders for Codes Correcting a Single Edit for DNA-Based Data Storage
Paper IdentifierTU1.R5.4
Authors Yeow Meng Chee, National University of Singapore, Singapore; Han Mao Kiah, Tuan Thanh Nguyen, Nanyang Technological University, Singapore
Session DNA-based Storage
Location Saint Victor, Level 3
Session Time Tuesday, 09 July, 09:50 - 11:10
Presentation Time Tuesday, 09 July, 10:50 - 11:10
Manuscript  Click here to download the manuscript
Abstract An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. We investigate codes that combat either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two linear-time encoders. One corrects a single edit with $2\ceil{\log n} + 2$ redundant bits, while the other corrects a single indel with $\ceil{\log n} + 2$ redundant bits. The latter encoder reduces the redundancy of the best known encoder of Tenengolts (1984) by at least four bits. Over the DNA alphabet, exactly half of the symbols of a GC-balanced word are either C or G. Via a modification of Knuth’s balancing technique, we provide a linear-time map that translates binary messages into GC-balanced codewords and the resulting codebook is able to correct a single edit. The redundancy of our encoder is $3\ceil{\log n} + 2$ bits and this is the first known construction of a GC-balanced code that corrects a single edit.