23 results for "Sitchinava, Nodari"
Search Results
2. LCP-Aware Parallel String Sorting
- Author
- Ellert, Jonas; Fischer, Johannes; Sitchinava, Nodari
- Abstract
When lexicographically sorting strings, it is not always necessary to inspect all symbols. For example, the lexicographical rank of "europar" amongst the strings "eureka", "eurasia", and "excells" only depends on its so-called relevant prefix "euro". The distinguishing prefix size $D$ of a set of strings is the number of symbols that actually need to be inspected to establish the lexicographical ordering of all strings. Efficient string sorters should be $D$-aware, i.e., their complexity should depend on $D$ rather than on the total number $N$ of all symbols in all strings. While there are many $D$-aware sorters in the sequential setting, there appear to be no such results in the PRAM model. We propose a framework yielding a $D$-aware modification of any existing PRAM string sorter. The derived algorithms are work-optimal with respect to their original counterpart: If the original algorithm requires $O(w(N))$ work, the derived one requires $O(w(D))$ work. The execution time increases only by a small factor that is logarithmic in the length of the longest relevant prefix. Our framework universally works for deterministic and randomized algorithms in all variations of the PRAM model, such that future improvements in ($D$-unaware) parallel string sorting will directly result in improvements in $D$-aware parallel string sorting., Comment: Accepted at Euro-Par 2020 and to be published by Springer as part of the conference proceedings
- Published
- 2020
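As a concrete illustration of the distinguishing prefix size $D$, here is a small sequential Python helper (an illustrative sketch under assumed conventions, not the paper's PRAM framework): sort the strings, and charge each string one symbol past its longest common prefix with a sorted neighbor.

```python
def distinguishing_prefix_size(strings):
    """D = total number of symbols that must be inspected to fix the
    lexicographic order: each string contributes one symbol past its
    longest common prefix with a sorted neighbor (capped at its length)."""
    def lcp(a, b):
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    s = sorted(strings)
    total = 0
    for i, w in enumerate(s):
        left = lcp(s[i - 1], w) if i > 0 else 0
        right = lcp(w, s[i + 1]) if i + 1 < len(s) else 0
        total += min(len(w), max(left, right) + 1)
    return total

# The abstract's example: "europar" contributes its relevant prefix
# "euro" (4 symbols); D comes out to 14, well below N = 27 total symbols.
print(distinguishing_prefix_size(["eureka", "eurasia", "excells", "europar"]))
```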
3. Fragile Complexity of Comparison-Based Algorithms
- Author
- Afshani, Peyman; Fagerberg, Rolf; Hammer, David; Jacob, Riko; Kostitsyna, Irina; Meyer, Ulrich; Penschuck, Manuel; Sitchinava, Nodari
- Abstract
We initiate a study of algorithms with a focus on the computational complexity of individual elements, and introduce the fragile complexity of comparison-based algorithms as the maximal number of comparisons any individual element takes part in. We give a number of upper and lower bounds on the fragile complexity for fundamental problems, including Minimum, Selection, Sorting and Heap Construction. The results include both deterministic and randomized upper and lower bounds, and demonstrate a separation between the two settings for a number of problems. The depth of a comparator network is a straightforward upper bound on the worst-case fragile complexity of the corresponding fragile algorithm. We prove that fragile complexity is a different and strictly easier property than the depth of comparator networks, in the sense that for some problems a fragile complexity equal to the best network depth can be achieved with less total work, and that with randomization even a lower fragile complexity is possible.
- Published
- 2019
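To make the definition tangible (an illustrative sketch with hypothetical names, not the paper's algorithms), the following Python code counts the comparisons each element takes part in and contrasts two Minimum algorithms:

```python
class Tracked:
    """Wraps a value and counts the comparisons it participates in."""
    def __init__(self, v):
        self.v = v
        self.comparisons = 0
    def __lt__(self, other):
        self.comparisons += 1
        other.comparisons += 1
        return self.v < other.v

def fragile_complexity(elems):
    return max(e.comparisons for e in elems)

def linear_min(xs):
    m = xs[0]
    for x in xs[1:]:
        if x < m:
            m = x
    return m

def tournament_min(xs):
    while len(xs) > 1:
        xs = [xs[i] if i + 1 == len(xs) or xs[i] < xs[i + 1] else xs[i + 1]
              for i in range(0, len(xs), 2)]
    return xs[0]

n = 1024
a = [Tracked(i) for i in range(n)]  # minimum sits at the front
linear_min(a)
print(fragile_complexity(a))        # 1023: the minimum is compared n-1 times

b = [Tracked(i) for i in range(n)]
tournament_min(b)
print(fragile_complexity(b))        # 10: one comparison per round, log2(n) rounds
```

The gap between $n-1$ and $\lceil\log_2 n\rceil$ comparisons charged to the minimum element is exactly the kind of separation the fragile-complexity measure is designed to expose.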
7. A parallel priority queue with fast updates for GPU architectures
- Author
- Berney, Kyle; Iacono, John; Karsin, Ben; Sitchinava, Nodari
- Abstract
The single-source shortest path (SSSP) problem is a well-studied problem that is used in many applications. In the parallel setting, a work-efficient algorithm that additionally attains $o(n)$ parallel depth has been elusive. Alternatively, various approaches have been developed that take advantage of specific properties of a particular class of graphs. On a graphics processing unit (GPU), the current state-of-the-art SSSP algorithms are implementations of the Delta-stepping algorithm, which does not perform well for graphs with large diameters. The main contribution of this work is to provide an algorithm designed for GPUs that runs efficiently for such graphs. We present the parallel bucket heap, a parallel cache-efficient data structure adapted for modern GPU architectures that supports standard priority queue operations, as well as bulk update. We analyze the structure in several well-known computational models and show that it provides both optimal parallelism and is cache-efficient. We implement the parallel bucket heap and use it in a parallel variant of Dijkstra's algorithm to solve the SSSP problem. Experimental results indicate that, for sufficiently large, dense graphs with high diameter, we outperform the current state-of-the-art SSSP implementations on an NVIDIA RTX 2080 Ti and Quadro M4000 by up to a factor of 2.8 and 5.4, respectively.
- Published
- 2019
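The parallel bucket heap itself is involved; purely as a sketch of the interface the abstract describes (a priority queue driving Dijkstra, with all relaxations out of a settled vertex applied as one bulk update), here is a sequential Python skeleton that substitutes a binary heap with lazy deletion for the paper's data structure. All names are illustrative assumptions.

```python
import heapq

def sssp_bulk_updates(graph, src):
    """Dijkstra-style SSSP; relaxations out of the settled vertex are
    formed as one batch, mimicking a bulk Update operation. Stale heap
    entries are skipped lazily. graph: {u: [(v, weight), ...]}."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                          # stale entry; u already settled
        batch = [(d + w, v) for v, w in graph.get(u, [])
                 if d + w < dist.get(v, float("inf"))]
        for nd, v in batch:                   # the "bulk update"
            dist[v] = nd
            heapq.heappush(pq, (nd, v))
    return dist

g = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 6)], 3: []}
print(sssp_bulk_updates(g, 0))  # {0: 0, 1: 3, 2: 1, 3: 4}
```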
8. A parallel priority queue with fast updates for GPU architectures
- Author
- Iacono, John; Karsin, Benjamin; Sitchinava, Nodari
- Abstract
The high computational throughput of modern graphics processing units (GPUs) makes them the de facto architecture for high-performance computing applications. However, to achieve peak performance, GPUs require highly parallel workloads, as well as memory access patterns that exhibit good locality of reference. As a result, many state-of-the-art algorithms and data structures designed for GPUs sacrifice work-optimality to achieve the necessary parallelism. Furthermore, some abstract data types are avoided completely due to there being no corresponding data structure that performs well on the GPU. One such abstract data type is the priority queue. Many well-known algorithms rely on priority queue operations as a building block. While various priority queue structures have been developed that are parallel, cache-aware, or cache-oblivious, none has been shown to be efficient on GPUs. In this paper, we present the parBucketHeap, a parallel, cache-efficient data structure designed for modern GPU architectures that supports standard priority queue operations, as well as bulk update. We analyze the structure in several well-known computational models and show that it provides both optimal parallelism and is cache-efficient. We implement the parBucketHeap and, using it, we solve the single-source shortest path (SSSP) problem. Experimental results indicate that, for sufficiently large, dense graphs with high diameter, we outperform current state-of-the-art SSSP algorithms on the GPU by up to a factor of 5. Unlike existing GPU SSSP algorithms, our approach is work-optimal and places significantly less load on the GPU, reducing power consumption.
- Published
- 2019
9. Locality
- Author
- Afshani, Peyman; Iacono, John; Jayapaul, Varunkumar; Karsin, Ben; Sitchinava, Nodari
- Abstract
Program performance on modern hardware is characterized by locality of reference, that is, it is faster to access data that is close in address space to data that has been accessed recently than data in a random location. This is due to many architectural features including caches, prefetching, virtual address translation and the physical properties of a hard disk drive; attempting to model all the components that constitute the performance of a modern machine is impossible, especially for general algorithm design purposes. What if one could prove an algorithm is asymptotically optimal on all systems that reward locality of reference, no matter how it manifests itself within reasonable limits? We show that this is possible, and that excluding some pathological cases, cache-oblivious algorithms that are asymptotically optimal in the ideal-cache model are asymptotically optimal in any reasonable setting that rewards locality of reference. This is surprising as the cache-oblivious framework envisions a particular architectural model involving blocked memory transfer into a multi-level hierarchy of caches of varying sizes, and was not designed to directly model locality-of-reference correlated performance.
- Published
- 2019
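A deliberately crude Python micro-benchmark makes the premise tangible: traversing the same array sequentially versus in random order differs in speed purely because of locality of reference. The sizes and names here are arbitrary choices, and absolute numbers will vary by machine.

```python
import array
import random
import time

n = 1 << 22
data = array.array("q", range(n))   # ~32 MB of contiguous 64-bit ints

def traverse(order):
    t0 = time.perf_counter()
    s = 0
    for i in order:
        s += data[i]
    return time.perf_counter() - t0

seq = list(range(n))
rnd = seq[:]
random.shuffle(rnd)
print("sequential:", traverse(seq))  # cache- and prefetch-friendly
print("random:    ", traverse(rnd))  # typically noticeably slower
```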
11. An efficient algorithm for the 1D total visibility-index problem and its parallelization
- Author
- Afshani, Peyman; de Berg, Mark; Casanova, Henri; Karsin, Ben; Lambrechts, Colin; Sitchinava, Nodari; Tsirogiannis, Constantinos
- Abstract
Let T be a terrain and P be a set of points on its surface. An important problem in Geographic Information Science (GIS) is computing the visibility index of a point p in P, that is, the number of points in P that are visible from p. The total visibility-index problem asks for the visibility index of every point in P. We present the first subquadratic-time algorithm to solve the one-dimensional total-visibility-index problem. Our algorithm uses a geometric dualization technique to reduce the problem to a set of instances of the red-blue line segment intersection counting problem, allowing us to find the total visibility-index in $O(n \log^2 n)$ time. We implement a naive $O(n^2)$ approach and four variations of our algorithm: one that uses an existing red-blue line segment intersection counting algorithm and three new approaches that leverage features specific to our problem. Two of our implementations allow for parallel execution, requiring $O(\log^2 n)$ time and $O(n \log^2 n)$ work in the CREW PRAM model. We present experimental results for both serial and parallel implementations on synthetic and real-world datasets using two hardware platforms. Results show that all variants of our algorithm outperform the naive approach by several orders of magnitude. Furthermore, we show that our special-case red-blue line segment intersection counting implementations outperform the existing general-case solution by up to a factor of 10. Our fastest parallel implementation is able to process a terrain of more than 100 million vertices in under 3 minutes, achieving up to 85% parallel efficiency using 16 cores.
- Published
- 2018
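For intuition, the naive quadratic baseline referenced in the abstract can be sketched as follows (an assumed reading of "naive", not the authors' code): a point j is visible from i exactly when the slope from i to j is at least every slope from i to an intermediate point.

```python
def visibility_indices(h):
    """1D terrain given as heights at unit-spaced vertices. Returns, for
    each vertex, how many other vertices it can see. O(n^2): for each
    left endpoint, sweep right keeping the steepest slope seen so far
    (grazing sight lines are counted as visible)."""
    n = len(h)
    vis = [0] * n
    for i in range(n):
        max_slope = float("-inf")
        for j in range(i + 1, n):
            slope = (h[j] - h[i]) / (j - i)
            if slope >= max_slope:   # nothing between i and j blocks the view
                vis[i] += 1
                vis[j] += 1
            max_slope = max(max_slope, slope)
    return vis

print(visibility_indices([0, 3, 1, 2, 5]))
```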
13. Analysis-driven Engineering of Comparison-based Sorting Algorithms on GPUs
- Author
- Karsin, Benjamin; Weichert, Volker; Casanova, Henri; Iacono, John; Sitchinava, Nodari
- Abstract
We study the relationship between memory accesses, bank conflicts, thread multiplicity (also known as over-subscription) and instruction-level parallelism in comparison-based sorting algorithms for Graphics Processing Units (GPUs). We experimentally validate a proposed formula that relates these parameters with asymptotic analysis of the number of memory accesses by an algorithm. Using this formula we analyze and compare several GPU sorting algorithms, identifying key performance bottlenecks in each one of them. Based on this analysis we propose a GPU-efficient multiway merge-sort algorithm, GPU-MMS, which minimizes or eliminates these bottlenecks and balances various limiting factors for specific hardware. We realize an implementation of GPU-MMS and compare it to sorting algorithm implementations in state-of-the-art GPU libraries on three GPU architectures. Despite these library implementations being highly optimized, we find that GPU-MMS outperforms them by an average of 21% for random integer inputs and 14% for random key-value pairs., Proceedings of the 32nd International Conference on Supercomputing, Beijing, China, June 12-15, 2018
- Published
- 2018
14. An Efficient Multiway Mergesort for GPU Architectures
- Author
- Casanova, Henri; Iacono, John; Karsin, Benjamin; Sitchinava, Nodari; Weichert, Volker
- Abstract
Sorting is a primitive operation that is a building block for countless algorithms. As such, it is important to design sorting algorithms that approach peak performance on a range of hardware architectures. Graphics Processing Units (GPUs) are particularly attractive architectures as they provide massive parallelism and computing power. However, the intricacies of their compute and memory hierarchies make designing GPU-efficient algorithms challenging. In this work we present GPU Multiway Mergesort (MMS), a new GPU-efficient multiway mergesort algorithm. MMS employs a new partitioning technique that exposes the parallelism needed by modern GPU architectures. To the best of our knowledge, MMS is the first sorting algorithm for the GPU that is asymptotically optimal in terms of global memory accesses and that is completely free of shared memory bank conflicts. We realize an initial implementation of MMS, evaluate its performance on three modern GPU architectures, and compare it to competitive implementations available in state-of-the-art GPU libraries. Despite these implementations being highly optimized, MMS compares favorably, achieving performance improvements for most random inputs. Furthermore, unlike MMS, state-of-the-art algorithms are susceptible to bank conflicts. We find that for certain inputs that cause these algorithms to incur large numbers of bank conflicts, MMS can achieve up to a 37.6% speedup over its fastest competitor. Overall, even though its current implementation is not fully optimized, due to its efficient use of the memory hierarchy, MMS outperforms the fastest comparison-based sorting implementations available to date.
- Published
- 2017
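For reference, the underlying multiway mergesort pattern, stripped of everything GPU-specific (partitioning across thread blocks, coalescing, bank-conflict avoidance), is just repeated k-way merging of sorted runs. This sequential Python sketch uses assumed run and fan-out sizes:

```python
import heapq

def multiway_mergesort(xs, k=4, run=1024):
    """Sort fixed-size base runs, then repeatedly merge k runs at a time
    until a single sorted run remains."""
    runs = [sorted(xs[i:i + run]) for i in range(0, len(xs), run)]
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + k]))
                for i in range(0, len(runs), k)]
    return runs[0] if runs else []
```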
15. Reconstructing Generalized Staircase Polygons with Uniform Step Length
- Author
- Sitchinava, Nodari; Strash, Darren
- Abstract
Visibility graph reconstruction, which asks us to construct a polygon that has a given visibility graph, is a fundamental problem with unknown complexity (although visibility graph recognition is known to be in PSPACE). We show that two classes of uniform step length polygons can be reconstructed efficiently by finding and removing rectangles formed between consecutive convex boundary vertices called tabs. In particular, we give an $O(n^2m)$-time reconstruction algorithm for orthogonally convex polygons, where $n$ and $m$ are the number of vertices and edges in the visibility graph, respectively. We further show that reconstructing a monotone chain of staircases (a histogram) is fixed-parameter tractable, when parameterized on the number of tabs, and polynomially solvable in time $O(n^2m)$ under reasonable alignment restrictions., Comment: Appears in the Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (GD 2017)
- Published
- 2017
17. Sorting and Permuting without Bank Conflicts on GPUs
- Author
- Afshani, Peyman; Sitchinava, Nodari
- Abstract
In this paper, we look at the complexity of designing algorithms without any bank conflicts in the shared memory of Graphics Processing Units (GPUs). Given input of size $n$, $w$ processors and $w$ memory banks, we study three fundamental problems: sorting, permuting and $w$-way partitioning (defined as sorting an input containing exactly $n/w$ copies of every integer in $[w]$). We solve sorting in optimal $O(\frac{n}{w} \log n)$ time. When $n \ge w^2$, we solve the partitioning problem optimally in $O(n/w)$ time. We also present a general solution for the partitioning problem which takes $O(\frac{n}{w} \log^3_{n/w} w)$ time. Finally, we solve the permutation problem using a randomized algorithm in $O(\frac{n}{w} \log\log\log_{n/w} n)$ time. Our results show evidence that, when working with banked memory architectures, there is a separation between these problems: the permutation and partitioning problems are not as easy as simple parallel scanning., Comment: 12 pages, 2 figures, 23rd European Symposium on Algorithms (ESA)
- Published
- 2015
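To see what a shared-memory bank conflict is, consider $w$ threads each issuing one address in lockstep; addresses that map to the same bank (word address mod $w$) are serialized. The toy model below, with hypothetical names, shows the classic stride-$w$ conflict and the padding trick that removes it.

```python
from collections import Counter

def max_bank_conflict(addresses, w=32):
    """Max number of the w simultaneous accesses landing in one bank
    (word address mod w): 1 means conflict-free, w means fully serialized."""
    return max(Counter(a % w for a in addresses).values())

w = 32
column = [i * w for i in range(w)]        # reading a column of a w x w array
padded = [i * (w + 1) for i in range(w)]  # same column with one word of row padding
print(max_bank_conflict(column, w))       # 32: every access hits bank 0
print(max_bank_conflict(padded, w))       # 1: all banks distinct
```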
18. On the Complexity of List Ranking in the Parallel External Memory Model
- Author
- Jacob, Riko; Lieber, Tobias; Sitchinava, Nodari
- Abstract
We study the problem of list ranking in the parallel external memory (PEM) model. We observe an interesting dual nature for the hardness of the problem due to limited information exchange among the processors about the structure of the list, on the one hand, and its close relationship to the problem of permuting data, which is known to be hard for the external memory models, on the other hand. By carefully defining the power of the computational model, we prove a permuting lower bound in the PEM model. Furthermore, we present a stronger $\Omega(\log^2 N)$ lower bound for a special variant of the problem and for a specific range of the model parameters, which takes us a step closer toward proving a non-trivial lower bound for the list ranking problem in the bulk-synchronous parallel (BSP) and MapReduce models. Finally, we also present an algorithm that is tight for a larger range of parameters of the model than in prior work.
- Published
- 2014
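For background, list ranking asks for every node's distance to the end of a linked list. The classic pointer-jumping solution (Wyllie's algorithm, shown here as a sequential Python simulation, not the paper's PEM algorithm or lower-bound construction) halves the remaining pointer distance each round:

```python
def list_rank(succ):
    """succ[i] is the successor of node i; the tail points to itself.
    Returns each node's distance to the tail after O(log n) synchronous
    pointer-jumping rounds."""
    n = len(succ)
    nxt = succ[:]
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # read the old rank/nxt arrays, write new ones "in parallel"
        rank, nxt = ([rank[i] + rank[nxt[i]] for i in range(n)],
                     [nxt[nxt[i]] for i in range(n)])
    return rank

print(list_rank([1, 2, 3, 3]))  # [3, 2, 1, 0]
```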
19. Bank Conflict Free Comparison-based Sorting On GPUs
- Author
- Sitchinava, Nodari; Weichert, Volker
- Abstract
In this paper we present a framework for designing algorithms in shared memory of GPUs without incurring memory bank conflicts. Using our framework we develop the first comparison-based shared memory sorting algorithm that incurs no bank conflicts. It can be used as a subroutine for GPU sorting algorithms to replace current use of sorting networks in shared memory. Using our bank conflict free shared memory sorting subroutine as a black box, we design BCFMergesort, an algorithm for merging sorted streams of data that are larger than shared memory. Our algorithm performs all accesses to global memory in a coalesced manner and incurs no bank conflicts during the merge.
- Published
- 2013
20. Empirical Evaluation of the Parallel Distribution Sweeping Framework on Multicore Architectures
- Author
- Ajwani, Deepak; Sitchinava, Nodari
- Abstract
In this paper, we perform an empirical evaluation of the Parallel External Memory (PEM) model in the context of geometric problems. In particular, we implement the parallel distribution sweeping framework of Ajwani, Sitchinava and Zeh to solve the batched 1-dimensional stabbing-max problem. While modern processors consist of sophisticated memory systems (multiple levels of caches, set associativity, TLB, prefetching), we empirically show that algorithms designed in simple models that focus on minimizing the I/O transfers between shared memory and single level cache can lead to efficient software on current multicore architectures. Our implementation exhibits significantly fewer accesses to slow DRAM and, therefore, outperforms traditional approaches based on plane sweep and two-way divide and conquer., Comment: Longer version of ESA'13 paper
- Published
- 2013
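The batched 1-dimensional stabbing-max problem: given weighted intervals and a batch of query points, report for each query the largest weight among intervals containing it. As a point of reference, a simple sequential plane sweep (a baseline sketch, not the PEM distribution-sweeping implementation evaluated in the paper) solves it with a lazily pruned max-heap:

```python
import heapq

def stabbing_max(intervals, queries):
    """intervals: [(l, r, w), ...]; queries: iterable of points.
    Returns {q: max weight of an interval with l <= q <= r, or None}."""
    ivs = sorted(intervals)                   # by left endpoint
    active = []                               # max-heap of (-w, r)
    out = {}
    i = 0
    for q in sorted(queries):
        while i < len(ivs) and ivs[i][0] <= q:
            l, r, w = ivs[i]
            heapq.heappush(active, (-w, r))
            i += 1
        while active and active[0][1] < q:    # top interval ended before q
            heapq.heappop(active)
        out[q] = -active[0][0] if active else None
    return out

print(stabbing_max([(0, 5, 2), (3, 9, 7), (6, 8, 1)], [1, 4, 6, 10]))
# {1: 2, 4: 7, 6: 7, 10: None}
```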
21. Efficient Parallel and External Matching
- Author
- Birn, Marcel; Osipov, Vitaly; Sanders, Peter; Schulz, Christian; Sitchinava, Nodari
- Abstract
We show that a simple algorithm for computing a matching on a graph runs in a logarithmic number of phases incurring work linear in the input size. The algorithm can be adapted to provide efficient algorithms in several models of computation, such as PRAM, External Memory, MapReduce and distributed memory models. Our CREW PRAM algorithm is the first $O(\log^2 n)$-time, linear-work algorithm. Our experimental results indicate the algorithm's high speed and efficiency combined with good solution quality.
- Published
- 2013
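One natural reading of the phase structure is the "local max" rule: draw random edge priorities, keep every edge that beats all edges sharing an endpoint, remove the matched vertices, and repeat. The sketch below is that reading in sequential Python, an assumption for illustration rather than the authors' implementation:

```python
import random

def local_max_matching(edges):
    """Each phase keeps the edges carrying the highest random priority at
    both of their endpoints, then discards all edges touching a matched
    vertex. At least the globally top-priority edge is matched per phase,
    so the loop always terminates."""
    matching = []
    remaining = list(edges)
    while remaining:
        prio = {e: random.random() for e in remaining}
        best = {}                      # per vertex: its highest-priority edge
        for e in remaining:
            for x in e:
                if x not in best or prio[e] > prio[best[x]]:
                    best[x] = e
        matched = [e for e in remaining if best[e[0]] is e and best[e[1]] is e]
        matching.extend(matched)
        used = {x for e in matched for x in e}
        remaining = [e for e in remaining
                     if e[0] not in used and e[1] not in used]
    return matching

print(local_max_matching([(0, 1), (1, 2), (2, 3), (3, 0)]))
```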
22. Sorting, Searching, and Simulation in the MapReduce Framework
- Author
- Goodrich, Michael T.; Sitchinava, Nodari; Zhang, Qin
- Abstract
In this paper, we study the MapReduce framework from an algorithmic standpoint and demonstrate the usefulness of our approach by designing and analyzing efficient MapReduce algorithms for fundamental sorting, searching, and simulation problems. This study is motivated by a goal of ultimately putting the MapReduce framework on an equal theoretical footing with the well-known PRAM and BSP parallel models, which would benefit both the theory and practice of MapReduce algorithms. We describe efficient MapReduce algorithms for sorting, multi-searching, and simulations of parallel algorithms specified in the BSP and CRCW PRAM models. We also provide some applications of these results to problems in parallel computational geometry for the MapReduce framework, which result in efficient MapReduce algorithms for sorting, 2- and 3-dimensional convex hulls, and fixed-dimensional linear programming. For the case when mappers and reducers have a memory/message-I/O size of $M=\Theta(N^\epsilon)$, for a small constant $\epsilon>0$, all of our MapReduce algorithms for these applications run in a constant number of rounds., Comment: 16 pages
- Published
- 2011
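As a toy, in-process simulation of one round of MapReduce-style sorting (a splitter-based shuffle; the paper's algorithms, round counts, and $M=\Theta(N^\epsilon)$ memory bounds are not modeled here), consider:

```python
import bisect
import random

def mapreduce_sort(data, reducers=4):
    """Map: route each record to a reducer chosen by sampled splitters
    (the shuffle). Reduce: each reducer sorts its partition locally;
    concatenating partitions in reducer order gives the sorted output."""
    sample = sorted(random.sample(data, min(len(data), 8 * reducers)))
    splitters = [sample[len(sample) * i // reducers]
                 for i in range(1, reducers)]
    parts = [[] for _ in range(reducers)]
    for x in data:                        # "map" + shuffle
        parts[bisect.bisect_left(splitters, x)].append(x)
    out = []
    for p in parts:                       # "reduce"
        out.extend(sorted(p))
    return out

data = [random.randrange(1000) for _ in range(100)]
assert mapreduce_sort(data) == sorted(data)
```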
23. Guard Placement For Wireless Localization
- Author
- Eppstein, David; Goodrich, Michael T.; Sitchinava, Nodari
- Abstract
Motivated by secure wireless networking, we consider the problem of placing fixed localizers that enable mobile communication devices to prove they belong to a secure region that is defined by the interior of a polygon. Each localizer views an infinite wedge of the plane, and a device can prove membership in the secure region if it is inside the wedges for a set of localizers whose common intersection contains no points outside the polygon. This model leads to a broad class of new art gallery type problems, for which we provide upper and lower bounds.
- Published
- 2006