Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform

Jacqueline Daykin, Richard Groult, Yannick Guesnet, Thierry Lecroq, Arnaud Lefebvre, Martine Léonard, Laurent Mouchard, Élise Prieur-Gaston, Bruce Watson

Research output: Contribution to journalArticlepeer-review

7 Citations (Scopus)
129 Downloads (Pure)

Abstract

A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform we present a new method for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm2) for counting the number of occurrences andO(qm2+occ) for reporting the found occurrences where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice
Original languageEnglish
Pages (from-to)82-87
Number of pages6
JournalInformation Processing Letters
Volume147
Early online date15 Mar 2019
DOIs
Publication statusPublished - Jul 2019

Keywords

  • Algorithms
  • Burrows–Wheeler transform
  • Degenerate
  • Pattern matching
  • String

Fingerprint

Dive into the research topics of 'Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform'. Together they form a unique fingerprint.

Cite this