Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations

Lily Major, Amanda Clare, Jacqueline Daykin, Benjamin Mora, Leo Peña Gamboa, Christine Zarges

Research output: Chapter in Book/Report/Conference proceeding, Conference Proceeding (Non-Journal item)


Abstract

String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval's well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which are strictly smaller than all of their cyclic rotations. While Duval's algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available.
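
To make the role of the alphabet ordering concrete, the following is a minimal sketch of Duval's algorithm parameterized by an ordering, together with an exhaustive search over orderings for a tiny alphabet. The function names `duval` and `fewest_factors_order` are hypothetical, and the brute-force search is only an illustrative stand-in: it is not the evolutionary framework or the Flexi-Duval heuristic described in the abstract, and it scales factorially with alphabet size.

```python
from itertools import permutations


def duval(s, order):
    """Lyndon factorization of s under the alphabet ordering `order`.

    `order` lists the characters from smallest to largest
    (hypothetical interface, chosen for this sketch).
    """
    rank = {ch: i for i, ch in enumerate(order)}
    r = [rank[ch] for ch in s]  # compare ranks instead of raw characters
    n = len(r)
    factors = []
    i = 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix of a power
        # of a Lyndon word.
        while j < n and r[k] <= r[j]:
            if r[k] < r[j]:
                k = i      # strictly larger symbol: restart the comparison
            else:
                k += 1     # equal symbol: continue matching the period
            j += 1
        # Emit every full repetition of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def fewest_factors_order(s):
    """Exhaustively pick an ordering that minimizes the number of factors.

    Feasible only for very small alphabets; shown purely for illustration.
    """
    alphabet = sorted(set(s))
    return min(permutations(alphabet), key=lambda p: len(duval(s, p)))


if __name__ == "__main__":
    text = "banana"
    print(duval(text, "abn"))            # factorization under a < b < n
    best = fewest_factors_order(text)
    print(best, duval(text, best))       # an ordering with the fewest factors
```

For "banana", the ordering a < b < n gives four factors, while an ordering that places b first makes the whole string a single Lyndon word. This gap between orderings is what a search over permutations can exploit.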
Original language: English
Title of host publication: Parallel Problem Solving from Nature – PPSN XVI - 16th International Conference, PPSN 2020, Proceedings
Subtitle of host publication: 16th International Conference, Leiden, The Netherlands, September 5-9, 2020, Proceedings
Editors: Thomas Bäck, Mike Preuss, André Deutz, Michael Emmerich, Hao Wang, Carola Doerr, Heike Trautmann
Publisher: Springer Nature
Pages: 390-403
Number of pages: 14
ISBN (Electronic): 9783030581121
ISBN (Print): 9783030581114
Publication status: Published - 31 Aug 2020
Event: Parallel Problem Solving from Nature - Leiden, Netherlands
Duration: 05 Sept 2020 to 09 Sept 2020
Conference number: 16
https://ppsn2020.liacs.leidenuniv.nl

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12269 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Parallel Problem Solving from Nature
Abbreviated title: PPSN
Country/Territory: Netherlands
City: Leiden
Period: 05 Sept 2020 to 09 Sept 2020

Keywords

  • Alphabet ordering
  • Biosequences
  • Duval’s algorithm
  • Lyndon words
  • Permutations
  • String factorization
