Back to Search Start Over

A survey of string orderings and their application to the Burrows–Wheeler transform

Authors :
Jacqueline W. Daykin
Arnaud Lefebvre
Richard Groult
Thierry Lecroq
Yannick Guesnet
Élise Prieur-Gaston
M. Léonard
Modélisation, Information et Systèmes - UR UPJV 4290 (MIS)
Université de Picardie Jules Verne (UPJV)
Equipe Traitement de l'information en Biologie Santé (TIBS - LITIS)
Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS)
Université Le Havre Normandie (ULH)
Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN)
Normandie Université (NU)-Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie)
Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)-Université Le Havre Normandie (ULH)
Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)
Source :
Theoretical Computer Science, Theoretical Computer Science, Elsevier, 2018, 710, pp.52-65. ⟨10.1016/j.tcs.2017.02.021⟩
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

For over 20 years the data clustering properties and applications of the efficient Burrows–Wheeler transform have been researched. Lexicographic suffix-sorting is induced during the transformation, and more recently a new direction has considered alternative ordering strategies for suffix arrays and thus the transforms. In this survey we look at these distinctly ordered bijective and linear transforms. For arbitrary alphabets we discuss the V-BWT derived from V-order and the D-BWT based on lex-extension order. The binary case yields a pair of transforms, the binary Rouen B-BWT, defined using binary block order. Lyndon words are relevant to implementing the original transform; the new transforms are defined for analogous structures: V-words, indeterminate Lyndon words, and B-words, respectively. There is plenty of scope for further non-lexicographic transforms as indicated in the conclusion.

Details

Language :
English
ISSN :
18792294 and 03043975
Database :
OpenAIRE
Journal :
Theoretical Computer Science, Theoretical Computer Science, Elsevier, 2018, 710, pp.52-65. ⟨10.1016/j.tcs.2017.02.021⟩
Accession number :
edsair.doi.dedup.....3138e1e664470b5d31d26fec3fd68da6
Full Text :
https://doi.org/10.1016/j.tcs.2017.02.021⟩