Abstract
The Burrows-Wheeler Transformation computes a permutation of a string of letters over an alphabet, and is well-suited to compression-related applications due to its invertability and data clustering properties. For space eciency the input to the transform can be preprocessed into Lyndon factors. We consider scenarios with uncertainty regarding the data: a position in an indeterminate or degenerate string is a set of letters. We first define Indeterminate Lyndon Words and establish their associated unique string factorization; we then introduce the novel Degenerate Burrows-Wheeler Transformation which may apply the indeterminate Lyndon factorization. A core computation in Burrows-Wheeler type transforms is the linear sorting of all conjugates of the input string - we achieve this in the degenerate case by applying lex-extension ordering. Indeterminate Lyndon factorization, and the degenerate transform and its inverse, can all be computed in linear time and space with respect to total input size of degenerate strings.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd International Conference on Algorithms for Big Data |
Subtitle of host publication | Palermo, Italy, April 07-09, 2014. |
Editors | Costas S. Iliopoulos, Alessio Langiu |
Publisher | CEUR Workshop Proceedings |
Pages | 23-29 |
Number of pages | 7 |
Publication status | Published - 16 Apr 2014 |
Externally published | Yes |