Author: "Pissis S.P." / Journal: 2019 data compression conference (dcc) - Searchworks@Jio Institute Digital Library Search Results

1. Constructing Antidictionaries in Output-Sensitive Space

Author: Golnaz Badkobeh, Alice Héliou, Gabriele Fici, Solon P. Pissis, Lorraine A.K. Ayad, Department of Informatics [King's College London], King‘s College London, Goldsmiths, University of London (Goldsmiths College), University of London [London], Dipartimento di Matematica e Informatica [Palermo], Università degli studi di Palermo - University of Palermo, Centrum Wiskunde & Informatica (CWI), Equipe de recherche européenne en algorithmique et biologie formelle et expérimentale (ERABLE), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Department of Computing, Goldsmiths, University of London, Dipartimento di Matematica e Informatica, Università degli Studi di Palermo, Palermo, Italy, Storer, James A., Bilgin, Ali, Serra-Sagrista, Joan, Marcellin, Michael W., Ayad L.A.K., Badkobeh G., Fici G., Heliou A., Pissis S.P., and Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Subjects: FOS: Computer and information sciences, Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni, Output sensitive algorithms, String algorithms, Physics, Antidictionarie, Settore INF/01 - Informatica, Output sensitive algorithm, 0102 computer and information sciences, Absent words, Space (mathematics), 01 natural sciences, Antidictionaries, Combinatorics, 010201 computation theory & mathematics, TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY, Data compression, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), Computer Science::Symbolic Computation, [INFO]Computer Science [cs], Absent word, Alphabet, Word (group theory)
Abstract: A word $x$ that is absent from a word $y$ is called minimal if all its proper factors occur in $y$. Given a collection of $k$ words $y_1,y_2,\ldots,y_k$ over an alphabet $\Sigma$, we are asked to compute the set $\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{k}}$ of minimal absent words of length at most $\ell$ of word $y=y_1\#y_2\#\ldots\#y_k$, $\#\notin\Sigma$. In data compression, this corresponds to computing the antidictionary of $k$ documents. In bioinformatics, it corresponds to computing words that are absent from a genome of $k$ chromosomes. This computation generally requires $\Omega(n)$ space for $n=|y|$ using any of the plenty available $\mathcal{O}(n)$-time algorithms. This is because an $\Omega(n)$-sized text index is constructed over $y$ which can be impractical for large $n$. We do the identical computation incrementally using output-sensitive space. This goal is reasonable when $||\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{N}}||=o(n)$, for all $N\in[1,k]$. For instance, in the human genome, $n \approx 3\times 10^9$ but $||\mathrm{M}^{12}_{y_{1}\#\ldots\#y_{k}}|| \approx 10^6$. We consider a constant-sized alphabet for stating our results. We show that all $\mathrm{M}^{\ell}_{y_{1}},\ldots,\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{k}}$ can be computed in $\mathcal{O}(kn+\sum^{k}_{N=1}||\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{N}}||)$ total time using $\mathcal{O}(\mathrm{MaxIn}+\mathrm{MaxOut})$ space, where $\mathrm{MaxIn}$ is the length of the longest word in $\{y_1,\ldots,y_{k}\}$ and $\mathrm{MaxOut}=\max\{||\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{N}}||:N\in[1,k]\}$. Proof-of-concept experimental results are also provided confirming our theoretical findings and justifying our contribution., Comment: Version accepted to DCC 2019
Published: 2019
Full Text: View/download PDF

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

1 results on '"Pissis S.P."'

1. Constructing Antidictionaries in Output-Sensitive Space

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Publication Year Range

Language

Database

1 results on '"Pissis S.P."'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources