Descriptor: "bit parallelism" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"bit parallelism"' showing total 47 results

Start Over Descriptor "bit parallelism"

47 results on '"bit parallelism"'

1. Internal shortest absent word queries in constant time and linear space.

Author: Badkobeh, Golnaz, Charalampopoulos, Panagiotis, Kosolobov, Dmitry, and Pissis, Solon P.
Subjects: *VECTOR spaces, *DATA structures
Abstract: Given a string T of length n over an alphabet Σ ⊂ { 1 , 2 , ... , n O (1) } of size σ , we are to preprocess T so that given a range [ i , j ] , we can return a representation of a shortest string over Σ that is absent in the fragment T [ i ] ⋯ T [ j ] of T. We present an O (n) -space data structure that answers such queries in constant time and can be constructed in O (n log σ ⁡ n) time. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

2. Linear work generation of R-MAT graphs.

Author: Hübschle-Schneider, Lorenz and Sanders, Peter
Subjects: PARALLELISM (Linguistics), SUPERCOMPUTERS
Published: 2020
Full Text: View/download PDF

3. Regular Expression Matching

Author: Ilie, Lucian and Kao, Ming-Yang, editor
Published: 2016
Full Text: View/download PDF

4. High Performance Construction of RecSplit Based Minimal Perfect Hash Functions

Author: Dominik Bez and Florian Kurpicz and Hans-Peter Lehmann and Peter Sanders, Bez, Dominik, Kurpicz, Florian, Lehmann, Hans-Peter, Sanders, Peter, Dominik Bez and Florian Kurpicz and Hans-Peter Lehmann and Peter Sanders, Bez, Dominik, Kurpicz, Florian, Lehmann, Hans-Peter, and Sanders, Peter
Abstract: A minimal perfect hash function (MPHF) bijectively maps a set S of objects to the first |S| integers. It can be used as a building block in databases and data compression. RecSplit [Esposito et al., ALENEX'20] is currently the most space efficient practical minimal perfect hash function. It heavily relies on trying out hash functions in a brute force way. We introduce rotation fitting, a new technique that makes the search more efficient by drastically reducing the number of tried hash functions. Additionally, we greatly improve the construction time of RecSplit by harnessing parallelism on the level of bits, vectors, cores, and GPUs. In combination, the resulting improvements yield speedups up to 239 on an 8-core CPU and up to 5438 using a GPU. The original single-threaded RecSplit implementation needs 1.5 hours to construct an MPHF for 5 Million objects with 1.56 bits per object. On the GPU, we achieve the same space usage in just 5 seconds. Given that the speedups are larger than the increase in energy consumption, our implementation is more energy efficient than the original implementation.
Published: 2023
Full Text: View/download PDF

5. Bit-Parallel Multiple Pattern Matching

Author: Tran, Tuan Tu, Giraud, Mathieu, Varré, Jean-Stéphane, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Wyrzykowski, Roman, editor, Dongarra, Jack, editor, Karczewski, Konrad, editor, and Waśniewski, Jerzy, editor
Published: 2012
Full Text: View/download PDF

6. A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms

Author: Faro, Simone, Lecroq, Thierry, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, and Klasing, Ralf, editor
Published: 2012
Full Text: View/download PDF

7. Bit-(Parallelism)2: Getting to the Next Level of Parallelism

Author: Cantone, Domenico, Faro, Simone, Giaquinta, Emanuele, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Boldi, Paolo, editor, and Gargano, Luisa, editor
Published: 2010
Full Text: View/download PDF

8. An Efficient and Accurate Stochastic Number Generator Using Even-Distribution Coding.

Author: Zhakatayev, Aidyn, Kim, Kyounghoon, Choi, Kiyoung, and Lee, Jongeun
Subjects: *SYNTHETIC natural gas, *SHIFT registers, *LOGIC circuits, *BINARY sequences
Abstract: Stochastic computing (SC) is a promising approach for low-power and low-cost applications with the added benefit of high error tolerance. However, the high overhead of generating stochastic bitstreams can offset the advantages of SC especially when a large number of bitstreams are needed. In this paper, we propose a new stochastic number generator (SNG) that significantly reduces area and energy while improving accuracy. Experimental results show that the proposed SNG can reduce energy by more than 72% compared with the state-of-the-art designs. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

9. BitmapAligner: Bit-Parallelism String Matching with MapReduce and Hadoop

Author: Mary Aksa, Toqeer Mahmood, Junaid Rashid, Muhammad Wasif Nisar, Amir Hussain, and Hyuk-Yoon Kwon
Subjects: Biomaterials, Mechanics of Materials, Computer science, Modeling and Simulation, Bit parallelism, Parallel computing, String searching algorithm, Electrical and Electronic Engineering, Computer Science Applications
Published: 2021
Full Text: View/download PDF

10. Bit-Parallelism Score Computation with Multi Integer Weight

Author: Setyorini Setyorini, Dwi H. Widyantoro, Adi Pancoro, and Kuspriyanto Kuspriyanto
Subjects: Computer science, Computation, General Engineering, Bit parallelism, Arithmetic, Integer (computer science)
Published: 2019
Full Text: View/download PDF

11. Internal shortest absent word queries

Author: Badkobeh, Golnaz, Charalampopoulos, Panagiotis, Pissis, Solon P., Gawrychowski, Pawel, Starikovskaya, Tatiana, Gawrychowski, Pawel, Starikovskaya, Tatiana, Bioinformatics, AIMMS, and Bio Informatics (IBIVU)
Subjects: String algorithms, Shortest absent word, Internal queries, Bit parallelism, Theory of computation → Pattern matching
Abstract: Given a string T of length n over an alphabet Σ ⊂ {1,2,…,n^{𝒪(1)}} of size σ, we are to preprocess T so that given a range [i,j], we can return a representation of a shortest string over Σ that is absent in the fragment T[i]⋯ T[j] of T. For any positive integer k ∈ [1,log log_σ n], we present an 𝒪((n/k)⋅ log log_σ n)-size data structure, which can be constructed in 𝒪(nlog_σ n) time, and answers queries in time 𝒪(log log_σ k)., LIPIcs, Vol. 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021), pages 6:1-6:18
Published: 2021
Full Text: View/download PDF

12. Efficiently enumerating all maximal cliques with bit-parallelism

Author: Pablo San Segundo, Jorge Artieda, and Darren Strash
Subjects: FOS: Computer and information sciences, F.2.2, G.2.2, Current (mathematics), General Computer Science, Selection (relational algebra), 0102 computer and information sciences, 02 engineering and technology, Management Science and Operations Research, Clique (graph theory), 01 natural sciences, Clique problem, 010201 computation theory & mathematics, Simple (abstract algebra), Modeling and Simulation, Computer Science - Data Structures and Algorithms, 0202 electrical engineering, electronic engineering, information engineering, Enumeration, Benchmark (computing), Data Structures and Algorithms (cs.DS), 020201 artificial intelligence & image processing, Bit parallelism, Sociology, Algorithm
Abstract: The maximal clique enumeration (MCE) problem has numerous applications in biology, chemistry, sociology, and graph modeling. Though this problem is well studied, most current research focuses on finding solutions in large sparse graphs or very dense graphs, while sacrificing efficiency on the most difficult medium-density benchmark instances that are representative of data sets often encountered in practice. We show that techniques that have been successfully applied to the maximum clique problem give significant speed gains over the state-of-the-art MCE algorithms on these instances. Specifically, we show that a simple greedy pivot selection based on a fixed maximum-degree first ordering of vertices, when combined with bit-parallelism, performs consistently better than the theoretical worst-case optimal pivoting of the state-of-the-art algorithms of Tomita et al. [Theoretical Computer Science, 2006] and Naud\'e [Theoretical Computer Science, 2016]. Experiments show that our algorithm is faster than the worst-case optimal algorithm of Tomita et al. on 60 out of 74 standard structured and random benchmark instances: we solve 48 instances 1.2 to 2.2 times faster, and solve the remaining 12 instances 3.6 to 47.6 times faster. We also see consistent speed improvements over the algorithm of Naud\'e: solving 61 instances 1.2 to 2.4 times faster. To the best of our knowledge, we are the first to achieve such speed-ups compared to these state-of-the-art algorithms on these standard benchmarks., Comment: 15 pages, 1 figure
Published: 2018
Full Text: View/download PDF

13. Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences

Author: Mitsuru Funakoshi and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda and Ayumi Shinohara, Funakoshi, Mitsuru, Nakashima, Yuto, Inenaga, Shunsuke, Bannai, Hideo, Takeda, Masayuki, Shinohara, Ayumi, Mitsuru Funakoshi and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda and Ayumi Shinohara, Funakoshi, Mitsuru, Nakashima, Yuto, Inenaga, Shunsuke, Bannai, Hideo, Takeda, Masayuki, and Shinohara, Ayumi
Abstract: The equidistant subsequence pattern matching problem is considered. Given a pattern string P and a text string T, we say that P is an equidistant subsequence of T if P is a subsequence of the text such that consecutive symbols of P in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-parallel algorithms that yield o(n²) time algorithms for finding k-(sub-)cadences and equidistant subsequences. Furthermore, O(nlog² n) and O(nlog n) time algorithms, respectively for equidistant and Abelian equidistant matching for the case |P| = 3, are shown. The algorithms make use of a technique that was recently introduced which can efficiently compute convolutions with linear constraints.
Published: 2020
Full Text: View/download PDF

14. On finding k-cliques in k-partite graphs.

Author: Mirghorbani, M. and Krokhmal, P.
Abstract: In this paper, a branch-and-bound algorithm for finding all cliques of size k in a k-partite graph is proposed that improves upon the method of Grunert et al. (in Comput Oper Res 29(1):13-31, ). The new algorithm uses bit-vectors, or bitsets, as the main data structure in bit-parallel operations. Bitsets enable a new form of data representation that improves branching and backtracking of the branch-and-bound procedure. Numerical studies on randomly generated instances of k-partite graphs demonstrate competitiveness of the developed method. [ABSTRACT FROM AUTHOR]
Published: 2013
Full Text: View/download PDF

15. The wide window string matching algorithm

Author: He, Longtao, Fang, Binxing, and Sui, Jie
Subjects: *ROBOTS, *ALGORITHMS, *GAME theory, *MATHEMATICAL models
Abstract: Abstract: Generally, current string matching algorithms make use of a window whose size is equal to pattern length. In this paper, we present a novel string matching algorithm named WW (for Wide Window) algorithm, which divides the text into overlapping windows of size . In the windows, the algorithm attempts m possible occurrence positions in parallel. It firstly searches pattern suffixes from middle to right with a forward suffix automaton, shifts the window directly when it fails, otherwise, scans the corresponding prefixes backward with a reverse prefix automaton. Theoretical analysis shows that WW has optimal time complexity of in the worst, best and for average case. Experimental comparison of WW with existing algorithms validates our theoretical claims for searching long patterns. It further reveals that WW is also efficient for searching short patterns. [Copyright &y& Elsevier]
Published: 2005
Full Text: View/download PDF

16. Pattern Matching.

Author: Navarro, Gonzalo
Subjects: *AUTOMATION, *INDUSTRIAL engineering, *PATTERNMAKING, *MODELS & modelmaking, *APPROXIMATION theory, *FUNCTIONAL analysis
Abstract: An important subtask of the pattern discovery process is pattern matching, where the pattern sought is already known and we want to determine how often and where it occurs in a sequence. In this paper we review the most practical techniques to find patterns of different kinds. We show how regular expressions can be searched for with general techniques, and how simpler patterns can be dealt with more simply and efficiently. We consider exact as well as approximate pattern matching. Also we cover both sequential searching, where the sequence cannot be preprocessed, and indexed searching, where we have a data structure built over the sequence to speed up the search. [ABSTRACT FROM AUTHOR]
Published: 2004
Full Text: View/download PDF

17. Detecting $k$-(Sub-)Cadences and Equidistant Subsequence Occurrences

Author: Funakoshi, Mitsuru, Nakashima, Yuto, Inenaga, Shunsuke, Bannai, Hideo, Takeda, Masayuki, and Shinohara, Ayumi
Subjects: FOS: Computer and information sciences, 050101 languages & linguistics, cadences, 05 social sciences, 02 engineering and technology, string algorithms, pattern matching, Computer Science - Data Structures and Algorithms, bit parallelism, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, 0501 psychology and cognitive sciences, Data Structures and Algorithms (cs.DS), subsequences
Abstract: The equidistant subsequence pattern matching problem is considered. Given a pattern string P and a text string T, we say that P is an equidistant subsequence of T if P is a subsequence of the text such that consecutive symbols of P in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-parallel algorithms that yield o(n²) time algorithms for finding k-(sub-)cadences and equidistant subsequences. Furthermore, O(nlog² n) and O(nlog n) time algorithms, respectively for equidistant and Abelian equidistant matching for the case |P| = 3, are shown. The algorithms make use of a technique that was recently introduced which can efficiently compute convolutions with linear constraints., LIPIcs, Vol. 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020), pages 12:1-12:11
Published: 2020
Full Text: View/download PDF

18. Linear Work Generation of R-MAT Graphs

Author: Lorenz Hübschle-Schneider and Peter Sanders
Subjects: FOS: Computer and information sciences, sampling, Discrete Mathematics (cs.DM), Sociology and Political Science, Social Psychology, Logarithm, Alias, parallel processing, Computer science, 0102 computer and information sciences, 02 engineering and technology, Topology, 01 natural sciences, large graphs, Matrix (mathematics), graph generator, Computer Science - Data Structures and Algorithms, 0202 electrical engineering, electronic engineering, information engineering, Overhead (computing), Data Structures and Algorithms (cs.DS), Social and Information Networks (cs.SI), Communication, DATA processing & computer science, Computer Science - Social and Information Networks, Data structure, Computer Science - Distributed, Parallel, and Cluster Computing, 010201 computation theory & mathematics, bit parallelism, 020201 artificial intelligence & image processing, Node (circuits), Distributed, Parallel, and Cluster Computing (cs.DC), ddc:004, Constant (mathematics), Order of magnitude, Computer Science - Discrete Mathematics
Abstract: R-MAT (for Recursive MATrix) is a simple, widely used model for generating graphs with a power law degree distribution, a small diameter, and communitys structure. It is particularly attractive for generating very large graphs because edges can be generated independently by an arbitrary number of processors. However, current R-MAT generators need time logarithmic in the number of nodes for generating an edge— constant time for generating one bit at a time for node IDs of the connected nodes. We achieve constant time per edge by precomputing pieces of node IDs of logarithmic length. Using an alias table data structure, these pieces can then be sampled in constant time. This simple technique leads to practical improvements by an order of magnitude. This further pushes the limits of attainable graph size and makes generation overhead negligible in most situations.
Published: 2019

19. Approximate search for big data with applications in information security: A survey

Author: Slobodan Petrovic
Subjects: Big Data, information security, Chemistry, business.industry, Computer science, Digital forensics, Big data, approximate search, bit-parallelism, Intrusion detection system, Information security, computer.software_genre, lcsh:HD72-88, lcsh:Economic growth, development, planning, Search algorithm, Malware, Bit parallelism, Data mining, constraints, business, Business management, computer
Abstract: This paper is a survey of approximate search techniques in very large data sets (so-called Big Data). After a short introduction, some techniques for speeding up approximate search in such data sets based on exploitation of inherent bit-parallelism in computers are described. It then reviews the applications in search related to information security problems (digital forensics, malware detection, intrusion detection) are reviewed. Finally, the need for constraints in approximate search regarding the number of so-called elementary edit operations and the run lengths of particular elementary edit operations is explained and the status of on-going research on efficient implementation of approximate search algorithms with various constraints is given.
Published: 2015
Full Text: View/download PDF

20. String matching with lookahead

Author: Hannu Peltola and Jorma Tarhio
Subjects: Theoretical computer science, LOOP (programming language), Generalization, Computer science, Character (computing), Applied Mathematics, Binary data, Discrete Mathematics and Combinatorics, Window (computing), Bit parallelism, String searching algorithm, Variation (game tree), Algorithm
Abstract: Forward-SBNDM is a recently introduced variation of the BNDM algorithm for exact string matching. Forward-SBNDM inspects a 2-gram in the text such that the first character is the last one of an alignment window of the pattern and the second one is then outside the window. We present a generalization of this idea by inspecting several lookahead characters beyond an alignment window and integrate it with SBNDMq, a q-gram variation of BNDM. As a result, we get several new variations of SBNDMq. In addition, we introduce a greedy skip loop for SBNDM2. We tune up our algorithms and the reference algorithms with 2-byte read. According to our experiments, the best of the new variations are faster than the winners of recent algorithm comparisons for English, DNA, and binary data.
Published: 2014
Full Text: View/download PDF

21. BIT-PARALLEL ALGORITHMS FOR THE MERGED LONGEST COMMON SUBSEQUENCE PROBLEM

Author: Sebastian Deorowicz and Agnieszka Danek
Subjects: Longest common subsequence problem, Theoretical computer science, Similarity (geometry), Longest alternating subsequence, Computer Science (miscellaneous), Hunt–McIlroy algorithm, Parallel algorithm, Edit distance, Bit parallelism, Longest increasing subsequence, Algorithm, Mathematics
Abstract: It is often a necessity to compare some sequences to find out how similar they are. There are many similarity measures that can be used, e.g., longest common subsequence, edit distance, sequence alignment. Recently a merged longest common subsequence (MergedLCS) problem was formulated with applications in bioinformatics. We propose the bit-parallel algorithms for the MergedLCS problem and evaluate them in practice showing that they are usually tens times faster than the already published methods.
Published: 2013
Full Text: View/download PDF

22. A Simple Streaming Bit-Parallel Algorithm for Swap Pattern Matching

Author: Ondřej Suchý, Tomáš Valla, and Václav Blažej
Subjects: Deterministic finite automaton, Swap (finance), Computer science, Efficient algorithm, Parallel algorithm, Bit parallelism, Pattern matching, Algorithm, Bitwise operation, Blossom algorithm
Abstract: The pattern matching problem with swaps is to find all occurrences of a pattern in a text while allowing the pattern to swap adjacent symbols. The goal is to design fast matching algorithm that takes advantage of the bit parallelism of bitwise machine instructions and has only streaming access to the input. We introduce a new approach to solve this problem based on the graph theoretic model and compare its performance to previously known algorithms. We also show that an approach using deterministic finite automata cannot achieve similarly efficient algorithms. Furthermore, we describe a fatal flaw in some of the previously published algorithms based on the same model. Finally, we provide experimental evaluation of our algorithm on real-world data.
Published: 2017
Full Text: View/download PDF

23. Improving practical exact string matching

Author: Branislav Ďurian, Jorma Tarhio, Hannu Peltola, and Jan Holub
Subjects: ta113, State variable, Theoretical computer science, Parallel algorithm, Information processing, Process (computing), String searching algorithm, Computer Science Applications, Theoretical Computer Science, Experimental comparison, ta5141, Signal Processing, String matching, Point (geometry), Bit parallelism, ta513, ta512, Bit-parallelism, Algorithm, ta515, Algorithms, Information Systems, Mathematics
Abstract: We present improved variations of the BNDM algorithm for exact string matching. At each alignment our bit-parallel algorithms process a q-gram before testing the state variable. In addition we apply reading a 2-gram in one instruction. Our point of view is practical efficiency of algorithms. Our experiments show that the new variations are faster than earlier algorithms in many cases.
Published: 2010
Full Text: View/download PDF

24. An efficient pattern matching scheme in LZW compressed sequences

Author: Nai-Lun Huang and Tsern-Huei Lee
Subjects: Computer Networks and Communications, Computer science, Data_CODINGANDINFORMATIONTHEORY, String searching algorithm, computer.file_format, Uncompressed video, Space requirements, Bitmap, Bit parallelism, Pattern matching, Algorithm, computer, Information Systems, Compressed pattern matching
Abstract: Summary Compressed pattern matching (CPM) is an emerging research field addressing the problem: given a compressed sequence and a pattern, process the sequence with minimal (or no) decompression to find the pattern occurrence(s) in the uncompressed sequence. It can be applied to detect malwares and confidential information leakage in compressed files directly. In this paper, we report our work of CPM in Lempel–Ziv–Welch (LZW) compressed sequences. We propose an efficient bitmap-based realization of the Amir–Benson–Farach algorithm. We also generalize the algorithm to find all pattern occurrences and report their absolute positions in the uncompressed sequence. Experiments are conducted to test the space requirements of our proposed generalization and two related CPM schemes which can also be realized with bitmaps. Results show that our proposed generalization requires the least amount of storage for moderate and long patterns. We also conduct experiments to compare the throughput performance of our proposed generalization with these two related CPM schemes and the decompress-then-search scheme. Results show that our proposed generalization outperforms the decompress-then-search scheme significantly. When scanning a file with pattern occurrences, our proposed generalization performs slightly better than the two related CPM schemes. The difference is significant when scanning a file with no pattern occurrence. Copyright # 2008 John Wiley & Sons, Ltd.
Published: 2008
Full Text: View/download PDF

25. Word Size Removal (WSR) method for Bit Parallel String Matching

Author: Kapil Kumar Soni and Vivek Kumar Sharma
Subjects: Bit (horse), Pattern size, Computer science, Bit parallelism, String searching algorithm, Algorithm, Word (computer architecture)
Abstract: Bit Parallel String Matching algorithms are the most efficient and latest class of String Matching Algorithms. These algorithms use the intrinsic property of computer known as Bit Parallelism for performing bit operations in parallel with-in single computer word. BNDM, TNDM, BNDMq are the popular single pattern bit parallel string matching algorithms whereas Shift-OR and Shift-OR with Q-grams algorithms are popular in multiple pattern string matching. All these algorithms are restricted by the limitation of pattern size. These all algorithms are not working on patterns whose sizes are longer than computer word. In this paper, we proposed the generic method named WSR (Word Size Removal) for bit parallel string matching. With the inclusion of WSR these bit parallel string matching algorithms can work on longer than computer word size patterns. This paper presented WSR method and all modified algorithms.
Published: 2015
Full Text: View/download PDF

26. Regular expression searching on compressed text

Author: Gonzalo Navarro
Subjects: Discrete mathematics, Ziv–Lempel, Regular languages, Automata, Automaton, Theoretical Computer Science, Regular language, Computational Theory and Mathematics, Search algorithm, Compressed pattern matching, Discrete Mathematics and Combinatorics, Bit parallelism, Regular expression, Bit-parallelism, Mathematics
Abstract: We present a solution to the problem of regular expression searching on compressed text. The format we choose is the Ziv-Lempel family, specifically the LZ78 and LZW variants. Given a text of length u compressed into length n, and a pattern of length m, we report all the R occurrences of the pattern in the text in O(2m + mn + Rm log m) worst case time. On average this drops to O(m2 + (n + Rm) log m) or O(m2 + n + Ru/n) for most regular expressions. This is the first nontrivial result for this problem. The experimental results show that our compressed search algorithm needs half the time necessary for decompression plus searching, which is currently the only alternative.
Published: 2003
Full Text: View/download PDF

27. The implementation of bit-parallelism for DNA sequence alignment

Author: Dwi H. Widyantoro, Kuspriyanto, Setyorini, and Adi Pancoro
Subjects: History, Computer science, Sequence alignment, Bit parallelism, Parallel computing, Computer Science Applications, Education
Published: 2017
Full Text: View/download PDF

28. Fast Longest Common Subsequence with General Integer Scoring Support on GPUs

Author: Martin Swany, Adnan Ozsoy, and Arun Chauhan
Subjects: Longest common subsequence problem, Discrete mathematics, CUDA, Class (computer programming), Theoretical computer science, Matching (graph theory), Computer science, Benchmark (computing), Bit parallelism, Longest increasing subsequence, Parallel computing, Massively parallel, Integer (computer science)
Abstract: Graphic Processing Units (GPUs) have been gaining popularity among high-performance users. Certain classes of algorithms benefit greatly from the massive parallelism of GPUs. One such class of algorithms is longest common subsequence (LCS). Combined with bit parallelism, recent studies have been able to achieve terascale performance for LCS on GPUs. However, the reported results for the one-to-many matching problem lack correlation with weighted scoring algorithms. In this paper, we describe a novel technique to improve the score significance of the length of LCS algorithm for multiple matching. We extend the bit-vector algorithms for LCS to include integer scoring and parallelize them for hybrid CPU-GPU platforms. We benchmark our algorithm against the well-known sequence alignment algorithm on GPUs, CUDASW++, for accuracy and report performance on three different systems.
Published: 2014
Full Text: View/download PDF

29. On Pattern Matching With Swaps

Author: Fouad B. Chedid
Subjects: Dynamic programming, Computational complexity theory, Computer science, Efficient algorithm, Approximation algorithm, Bit parallelism, Variation (game tree), Pattern matching, Disjoint sets, Algorithm
Abstract: Pattern Matching with Swaps (PMS for short) is a variation of the classical pattern matching problem where a match is allowed to include disjoint local swaps. In 2009, Cantone and Faro devised a new dynamic programming algorithm for PMS, named Cross-Sampling, that runs in O(nm) time and uses O(m) space. More important, Cross-Sampling admits a linear-time implementation based on bit parallelism when the pattern's size is comparable to the word size of the machine. In this paper, we present improved dynamic programming formulations of the approach of Cantone and Faro for PMS which result in simpler algorithms that are much easier to be comprehended and implemented.
Published: 2013
Full Text: View/download PDF

30. A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms

Author: Thierry Lecroq and Simone Faro
Subjects: Nondeterministic automata, Speedup, Theoretical computer science, Text processing, Efficient algorithm, Bit parallelism, String searching algorithm, Algorithm, Natural language, Mathematics
Abstract: In this paper we present a general approach to string matching based on multiple sliding text-windows, and show how it can be applied to some among the most efficient algorithms for the problem based on nondeterministic automata and comparison of characters. From our experimental results it turns out that the new multiple sliding windows approach leads to algorithms which obtain better results than the original ones when searching texts over relatively large alphabets. Best improvements are obtained especially for short patterns.
Published: 2012

31. Bit-Parallel Multiple Pattern Matching

Author: Tuan Tu Tran, Jean-Stéphane Varré, Mathieu Giraud, Laboratoire d'Informatique Fondamentale de Lille (LIFL), Université de Lille, Sciences et Technologies-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lille, Sciences Humaines et Sociales-Centre National de la Recherche Scientifique (CNRS), Bioinformatics and Sequence Analysis (BONSAI), Université de Lille, Sciences et Technologies-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lille, Sciences Humaines et Sociales-Centre National de la Recherche Scientifique (CNRS)-Université de Lille, Sciences et Technologies-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lille, Sciences Humaines et Sociales-Centre National de la Recherche Scientifique (CNRS)-Inria Lille - Nord Europe, and Institut National de Recherche en Informatique et en Automatique (Inria)
Subjects: 0303 health sciences, Multi-core processor, sequence comparison, OpenCL, Computer science, GPU, 02 engineering and technology, Parallel computing, Extension (predicate logic), [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM], 03 medical and health sciences, Task (computing), pattern matching, neighborhood indexing, 020204 information systems, Text matching, bit parallelism, 0202 electrical engineering, electronic engineering, information engineering, Pattern matching, Multicore cpu, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Heuristics, Massively parallel, 030304 developmental biology
Abstract: International audience; Text matching with errors is a regular task in computational biology. We present an extension of the bit-parallel Wu-Manber algorithm to combine several searches for a pattern into a collection of fixed-length words. We further present an OpenCL parallelization of a redundant index on massively parallel multicore processors, within a framework of searching for similarities with seed-based heuristics. We successfully implemented and ran our algorithms on GPU and multicore CPU. Some speedups obtained are more than 60x.
Published: 2011

32. Optimal Packed String Matching

Author: Ben-Kiki, Oren, Bille, Philip, Breslauer, Dany, Gasieniec, Leszek, Grossi, Roberto, Weimann, Oren, Chakraborty, Supratik, and Kumar, Amit
Subjects: Space efficiency, Real time, 000 Computer science, knowledge, general works, 010201 computation theory & mathematics, Computer Science, 0202 electrical engineering, electronic engineering, information engineering, Bit parallelism, 020201 artificial intelligence & image processing, String matching, 0102 computer and information sciences, 02 engineering and technology, 01 natural sciences
Abstract: In the packed string matching problem, each machine word accomodates alpha characters, thus an n-character text occupies n/alpha memory words. We extend the Crochemore-Perrin constant-space O(n)-time string matching algorithm to run in optimal O(n/alpha) time and even in real-time, achieving a factor alpha speedup over traditional algorithms that examine each character individually. Our solution can be efficiently implemented, unlike prior theoretical packed string matching work. We adapt the standard RAM model and only use its AC0 instructions (i.e. no multiplication) plus two specialized AC0 packed string instructions. The main string-matching instruction is available in commodity processors (i.e. Intel's SSE4.2 and AVX Advanced String Operations); the other maximal-suffix instruction is only required during pattern preprocessing. In the absence of these two specialized instructions, we propose theoretically-efficient emulation using integer multiplication (not AC0) and table lookup.
Published: 2011
Full Text: View/download PDF

33. Filtering Based Multiple String Matching Algorithm Combining q-Grams and BNDM

Author: Guiran Chang, Xingwei Wang, and Changsheng Miao
Subjects: Statistical classification, Computer science, Preprocessor, Approximation algorithm, Algorithm design, Bit parallelism, String searching algorithm, Pattern matching, Algorithm
Abstract: We present a new algorithm for exact multiple string matching. Our algorithm is based on filtration combining BNDM and q-grams. We have tested it with experiments and compared it with other algorithms, e.g. DFA, AC_BM and MWM. The preprocessing phase of our algorithms is fast, the memory usage is fairly small, and our algorithm is considerably faster for huge sets of several thousand patterns. The benefits are due to the bit parallelism of BNDM and the improved filtering efficiency by q-grams.
Published: 2010
Full Text: View/download PDF

34. Bit-Parallelism$^2$: Getting to the Next Level of Parallelism

Author: Domenico Cantone, Emanuele Giaquinta, and Simone Faro
Subjects: Data parallelism, Computer science, string matching, design and analysis of algorithms, Task parallelism, String searching algorithm, Parallel computing, instruction-level parallelism, Scalable parallelism, Memory-level parallelism, bit parallelism, text processing, Implicit parallelism, Instruction-level parallelism
Abstract: We investigate the problem of getting to a higher instruction-level parallelism in string matching algorithms. In particular, starting from an algorithm based on bit-parallelism, we propose two flexible approaches for boosting it with a higher level of parallelism. These approaches are general enough to be applied to other bit-parallel algorithms. It turns out that higher levels of parallelism lead to more efficient solutions in practical cases, as demonstrated by an extensive experimentation.
Published: 2010

35. Fast text searching

Author: Udi Manber and Sun Wu
Subjects: Data processing, General Computer Science, Bitap algorithm, Search algorithm, Computer science, Bit parallelism, Pattern matching, Approximate string matching, Algorithm, Text searching, Compressed pattern matching
Published: 1992
Full Text: View/download PDF

36. BPBM: An Algorithm for String Matching with Wildcards and Length Constraints

Author: Yingling Liu, Jun Gao, Xuegang Hu, Gongqing Wu, Xiao-Li Hong, and Xindong Wu
Subjects: Sequence, Matching (graph theory), business.industry, Commentz-Walter algorithm, Scale (descriptive set theory), Pattern recognition, Wildcard character, computer.file_format, String searching algorithm, Bit parallelism, Pattern matching, Artificial intelligence, business, Algorithm, computer, Mathematics
Abstract: Pattern matching with wildcards and length constraints under the one-off condition is a challenging topic. We propose an algorithm BPBM, based on bit parallelism and the Boyer-Moore algorithm, that outputs an occurrence of a given pattern P as soon as the pattern appears in the given sequence. The experimental results show that our BPBM algorithm has an improved time performance of over 50% with the same matching results when compared with SAIL, a state-of-the-art algorithm of this matching problem. The superiority is even more remarkable when the scale of the pattern increases.
Published: 2009
Full Text: View/download PDF

37. Multipattern string matching with q -grams

Author: Leena Salmela, Jorma Tarhio, and Jari Kytöjoki
Subjects: Trie, Commentz-Walter algorithm, Bit parallelism, String searching algorithm, Intrusion detection system, Approximate string matching, Algorithm, Theoretical Computer Science, Mathematics
Abstract: We present three algorithms for exact string matching of multiple patterns. Our algorithms are filtering methods, which apply q -grams and bit parallelism. We ran extensive experiments with them and compared them with various versions of earlier algorithms, e.g., different trie implementations of the Aho--Corasick algorithm. All of our algorithms appeared to be substantially faster than earlier solutions for sets of 1,000--10,000 patterns and the good performance of two of them continues to 100,000 patterns. The gain is because of the improved filtering efficiency caused by q -grams.
Published: 2007
Full Text: View/download PDF

38. Optimal Packed String Matching

Author: Chakraborty, Supratik, Kumar, Amit, Ben-Kiki, Oren, Bille, Philip, Breslauer, Dany, Gasieniec, Leszek, Grossi, Roberto, Weimann, Oren, Chakraborty, Supratik, Kumar, Amit, Ben-Kiki, Oren, Bille, Philip, Breslauer, Dany, Gasieniec, Leszek, Grossi, Roberto, and Weimann, Oren
Abstract: In the packed string matching problem, each machine word accommodates – characters, thus an n-character text occupies n/– memory words. We extend the Crochemore-Perrin constantspace O(n)-time string matching algorithm to run in optimal O(n/–) time and even in real-time, achieving a factor – speedup over traditional algorithms that examine each character individually. Our solution can be efficiently implemented, unlike prior theoretical packed string matching work. We adapt the standard RAM model and only use its AC0 instructions (i.e., no multiplication) plus two specialized AC0 packed string instructions. The main string-matching instruction is available in commodity processors (i.e., Intel’s SSE4.2 and AVX Advanced String Operations); the other maximal-suffix instruction is only required during pattern preprocessing. In the absence of these two specialized instructions, we propose theoretically-efficient emulation using integer multiplication (not AC0) and table lookup.
Published: 2011

39. Optimal Packed String Matching

Author: Oren Ben-Kiki and Philip Bille and Dany Breslauer and Leszek Gasieniec and Roberto Grossi and Oren Weimann, Ben-Kiki, Oren, Bille, Philip, Breslauer, Dany, Gasieniec, Leszek, Grossi, Roberto, Weimann, Oren, Oren Ben-Kiki and Philip Bille and Dany Breslauer and Leszek Gasieniec and Roberto Grossi and Oren Weimann, Ben-Kiki, Oren, Bille, Philip, Breslauer, Dany, Gasieniec, Leszek, Grossi, Roberto, and Weimann, Oren
Abstract: In the packed string matching problem, each machine word accomodates alpha characters, thus an n-character text occupies n/alpha memory words. We extend the Crochemore-Perrin constant-space O(n)-time string matching algorithm to run in optimal O(n/alpha) time and even in real-time, achieving a factor alpha speedup over traditional algorithms that examine each character individually. Our solution can be efficiently implemented, unlike prior theoretical packed string matching work. We adapt the standard RAM model and only use its AC0 instructions (i.e. no multiplication) plus two specialized AC0 packed string instructions. The main string-matching instruction is available in commodity processors (i.e. Intel's SSE4.2 and AVX Advanced String Operations); the other maximal-suffix instruction is only required during pattern preprocessing. In the absence of these two specialized instructions, we propose theoretically-efficient emulation using integer multiplication (not AC0) and table lookup.
Published: 2011
Full Text: View/download PDF

40. Multi-Dimensional Packet Classification Using Tuple Space Based on Bit-Parallelism

Author: Wang
Subjects: Computer science, Multi dimensional, Tuple space, Bit parallelism, Parallel computing, Packet classification
Published: 2006
Full Text: View/download PDF

41. Efficient Bit-Parallel Algorithms for (δ,α)-Matching

Author: Szymon Grabowski and Kimmo Fredriksson
Subjects: Combinatorics, Matching (graph theory), Parallel algorithm, Bit parallelism, String searching algorithm, Pattern matching, Approximate string matching, Algorithm, Word (group theory), Substring, Mathematics
Abstract: We consider the following string matching problem. Pattern p0p1p2 ... pm−−1 (δ,α)-matches the text substring $t_{i_0} t_{i_1} t_{i_2} \ldots t_{i_{m-1}}$, if $|p_j-t_{i_j}|\leq\delta$ for j ∈ {0,..., m–1}, where 0 < ij+1 – ij ≤ α + 1. The task is then to find all text positions im−−1 that (δ,α)-match the pattern. For a text of length n, the best previously known algorithms for this string matching problem run in time O(nm) and in time O(n⌈mα/w⌉), where the former is based on dynamic programming, and the latter on bit-parallelism with w bits in computer word (32 or 64 typically). We improve these to take O(nδ+⌈n/w⌉m) and O(n ⌈m log(α)/w⌉), respectively, worst case time using bit-parallelism. On average the algorithms run in O(⌈n/w⌉⌈αδ/σ⌉+n)and O(n) time. Our experimental results show that the algorithms work extremely well in practice. Our algorithms handle general gaps as well, having important applications in computational biology.
Published: 2006
Full Text: View/download PDF

42. Increased bit-parallelism for approximate and multiple string matching

Author: Kimmo Fredriksson, Heikki Hyyrö, and Gonzalo Navarro
Subjects: Combinatorics, Boosting (machine learning), Speedup, Computation, Commentz-Walter algorithm, Bit parallelism, Edit distance, String searching algorithm, Approximate string matching, Theoretical Computer Science, Mathematics
Abstract: Bit-parallelism permits executing several operations simultaneously over a set of bits or numbers stored in a single computer word. This technique permits searching for the approximate occurrences of a pattern of length m in a text of length n in time O (⌈ m / w ⌉ n ), where w is the number of bits in the computer word. Although this is asymptotically the optimal bit-parallel speedup over the basic O ( mn ) time algorithm, it wastes bit-parallelism's power in the common case where m is much smaller than w , since w − m bits in the computer words are unused. In this paper, we explore different ways to increase the bit-parallelism when the search pattern is short. First, we show how multiple patterns can be packed into a single computer word so as to search for all them simultaneously. Instead of spending O ( rn ) time to search for r patterns of length m ≤ w /2, we need O (⌈ rm / w ⌉ n ) time. Second, we show how the mechanism permits boosting the search for a single pattern of length m ≤ w /2, which can be searched for in O (⌈ n /⌊ w / m ⌋⌉) bit-parallel steps instead of O ( n ). Third, we show how to extend these algorithms so that the time bounds essentially depend on k instead of m , where k is the maximum number of differences permitted. Finally, we show how the ideas can be applied to other problems such as multiple exact string matching and one-against-all computation of edit distance and longest common subsequences. Our experimental results show that the new algorithms work well in practice, obtaining significant speedups over the best existing alternatives, especially on short patterns and moderate number of differences allowed. This work fills an important gap in the field, where little work has focused on very short patterns.
Published: 2005
Full Text: View/download PDF

43. Increased Bit-Parallelism for Approximate String Matching

Author: Gonzalo Navarro, Kimmo Fredriksson, and Heikki Hyyrö
Subjects: Set (abstract data type), Speedup, Bit parallelism, Approximate string matching, Algorithm, Word (computer architecture), Power (physics), Mathematics
Abstract: Bit-parallelism permits executing several operations simultaneously over a set of bits or numbers stored in a single computer word. This technique permits searching for the approximate occurrences of a pattern of length m in a text of length n in time O(⌈m/w⌉n), where w is the number of bits in the computer word. Although this is asymptotically the optimal speedup over the basic O(mn) time algorithm, it wastes bit-parallelism’s power in the common case where m is much smaller than w, since w-m bits in the computer words get unused.
Published: 2004
Full Text: View/download PDF

44. Transposition invariant pattern matching for multi-track strings

Author: Kjell Lemström, Jorma Tarhio, and Department of Computer Science
Subjects: String algorithms, combinatorial pattern matching, music retrieval, education, bit parallelism, 113 Computer and information sciences
Published: 2003

45. Bit-parallel approach to approximate string matching in compressed texts

Author: Takuya Kida, Masayuki Takeda, Setsuo Arikawa, Tetsuya Matsumoto, and Ayumi Shinohara
Subjects: Combinatorics, Computer science, Compression (functional analysis), Bit parallelism, String searching algorithm, Approximate string matching, Time complexity, Text string, Algorithm, Word (group theory)
Abstract: Addresses the problem of approximate string matching on compressed text. We consider this problem for a text string described in terms of a collage system, which is a formal system proposed by T. Kida et al. (1999) that captures various dictionary-based compression methods. We present an algorithm that exploits bit-parallelism, assuming that our problem fits in a single machine word, e.g. (m-k)(k+1)/spl les/L, where m is the pattern length, k is the number of allowed errors and L is the length, in bits, of the machine word. For a class of simple collage systems, the algorithm runs in O(k/sup 2/(/spl par//spl Dscr//spl par/+|/spl Sscr/|)+km) time using O(k/sup 2//spl par//spl Dscr//spl par/) space, where /spl par//spl Dscr//spl par/ is the size of dictionary /spl Dscr/ and |/spl Sscr/| is the number of tokens in the sequence /spl Sscr/. The LZ78 (Lempel-Ziv, 1978) and the LZW (Lempel-Ziv-Welch, 1984) compression methods are covered by this class. Since we can regard n=/spl par//spl Dscr//spl par/+|/spl Sscr/| as the compressed length, the time and space complexities are O(k/sup 2/n+km) and O(k/sup 2/n), respectively. For general k and m, they become O(k/sup 3/mn/L+km) and O(k/sup 3/mn/L). Thus, our algorithm is competitive to the algorithm proposed by J. Ka/spl uml/rkka/spl uml/inen, et al. (2000), which runs in O(km) time using O(kmn) space, when k=O(/spl radic/L).
Published: 2000

46. Fast and flexible string matching by combining bit-parallelism and suffix automata

Author: Gonzalo Navarro and Mathieu Raffinot
Subjects: Nondeterministic algorithm, Theoretical computer science, Computer science, Suffix automaton, Bit parallelism, String searching algorithm, Pattern matching, Alphabet, Suffix, Algorithm, Theoretical Computer Science, Automaton
Abstract: The most important features of a string matching algorithm are its efficiency and its flexibility. Efficiency has traditionally received more attention, while flexibility in the search pattern is becoming a more and more important issue. Most classical string matching algorithms are aimed at quickly finding an exact pattern in a text, being Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM) family the most famous ones. A recent development uses deterministic "suffix automata" to design new optimal string matching algorithms, e.g. BDM and TurboBDM. Flexibility has been addressed quite separately by the use of "bit-parallelism", which simulates automata in their nondeterministic form by using bits and exploiting the intrinsic parallelism inside the computer word, e.g. the Shift-Or algorithm. Those algorithms are extended to handle classes of characters and errors in the pattern and/or in the text, their drawback being their inability to skip text characters. In this paper we merge bit-parallelism and suffix automata, so that a nondeterministic suffix automaton is simulated using bit-parallelism. The resulting algorithm, called BNDM, obtains the best from both worlds. It is much simpler to implement than BDM and nearly as simple as Shift-Or. It inherits from Shift-Or the ability to handle flexible patterns and from BDM the ability to skip characters. BNDM is 30%-40% faster than BDM and up to 7 times faster than Shift-Or. When compared to the fastest existing algorithms on exact patterns (which belong to the BM family), BNDM is from 20% slower to 3 times faster, depending on the alphabet size. With respect to flexible pattern searching, BNDM is by far the fastest technique to deal with classes of characters and is competitive to search allowing errors. In particular, BNDM seems very adequate for computational biology applications, since it is the fastest algorithm to search on DNA sequences and flexible searching is an important problem in that area. As a theoretical development related to flexible pattern matching, we introduce a new automaton to recognize suffixes of patterns with classes of characters. To the best of our knowledge, this automaton has not been studied before.
Published: 2000
Full Text: View/download PDF

47. A compact representation of nondeterministic (suffix) automata for the bit-parallel approach

Author: Emanuele Giaquinta, Simone Faro, and Domenico Cantone
Subjects: Theoretical computer science, TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES, Computer science, Generalized suffix tree, Timed automaton, Bit parallelism, Compact representation, Nondeterministic automata, Shift-and, Suffix automata, Theoretical Computer Science, Nondeterministic finite automaton with ε-moves, Deterministic automaton, Nondeterministic finite automaton, Generalized nondeterministic finite automaton, Mathematics, String (computer science), Automaton, Computer Science Applications, Nondeterministic algorithm, Deterministic finite automaton, Computational Theory and Mathematics, Suffix automaton, Suffix, Algorithm, Computer Science::Formal Languages and Automata Theory, Information Systems
Abstract: We present a novel technique, suitable for bit-parallelism, for representing both the nondeterministic automaton and the nondeterministic suffix automaton of a given string in a more compact way. Our approach is based on a particular factorization of strings which on the average allows to pack in a machine word of w bits automata state configurations for strings of length greater than w. We adapted the Shift-And and BNDM algorithms using our encoding and compared them with the original algorithms. Experimental results show that the new variants are generally faster for long patterns.
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

47 results on '"bit parallelism"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources