1. Culling for Extreme-Scale Segmentation Volumes: A Hybrid Deterministic and Probabilistic Approach
- Author
- Markus Hadwiger, Ali K. Al-Awami, Haneen Mohammed, Marco Agus, Johanna Beyer, and Hanspeter Pfister
- Subjects
business.industry; Computer science; Probabilistic logic; 020207 software engineering; Pattern recognition; 02 engineering and technology; Image segmentation; Data structure; Computer Graphics and Computer-Aided Design; Hierarchical database model; Rendering (computer graphics); Data visualization; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; Segmentation; Computer Vision and Pattern Recognition; Artificial intelligence; business; Software
- Abstract
With the rapid increase in raw volume data sizes, such as terabyte-sized microscopy volumes, the corresponding segmentation label volumes have become extremely large as well. We focus on integer label data, whose efficient representation in memory, as well as fast random data access, pose an even greater challenge than the raw image data. Often, it is crucial to be able to rapidly identify which segments are located where, whether for empty space skipping for fast rendering, or for spatial proximity queries. We refer to this process as culling. In order to enable efficient culling of millions of labeled segments, we present a novel hybrid approach that combines deterministic and probabilistic representations of label data in a data-adaptive hierarchical data structure that we call the label list tree. In each node, we adaptively encode label data using either a probabilistic constant-time access representation for fast conservative culling, or a deterministic logarithmic-time access representation for exact queries. We choose the best data structures for representing the labels of each spatial region while building the label list tree. At run time, we further employ a novel query-adaptive culling strategy. While filtering a query down the tree, we prune it successively, and in each node adaptively select the representation that is best suited for evaluating the pruned query, depending on its size. We show an analysis of the efficiency of our approach with several large data sets from connectomics, including a brain scan with more than 13 million labeled segments, and compare our method to conventional culling approaches. Our approach achieves significant reductions in storage size as well as faster query times.
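The hybrid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the concrete data structures, so this sketch assumes a Bloom filter for the probabilistic constant-time representation and a sorted array with binary search for the deterministic logarithmic-time one. The class names, the `exact_threshold` build rule, and the `cull` function are all hypothetical. The run-time choice of representation by query size is not modeled here; only the per-node build-time choice and the successive pruning of the query are shown.

```python
import bisect
import hashlib


class SortedLabelList:
    """Deterministic representation (assumed): exact membership in O(log n)."""

    def __init__(self, labels):
        self.labels = sorted(set(labels))

    def may_contain(self, label):
        i = bisect.bisect_left(self.labels, label)
        return i < len(self.labels) and self.labels[i] == label


class BloomLabelSet:
    """Probabilistic representation (assumed): a Bloom filter.

    Lookups cost a fixed number of hash probes. False positives are
    possible, false negatives are not, so culling stays conservative.
    """

    def __init__(self, labels, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array
        for label in labels:
            for pos in self._positions(label):
                self.bits |= 1 << pos

    def _positions(self, label):
        for seed in range(self.num_hashes):
            data = f"{seed}:{label}".encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def may_contain(self, label):
        return all((self.bits >> pos) & 1 for pos in self._positions(label))


class Node:
    """One spatial region of a hypothetical label-list-style tree."""

    def __init__(self, name, labels, children=(), exact_threshold=4):
        self.name = name
        self.children = list(children)
        labels = set(labels)
        # Hypothetical build-time rule: small label sets are stored exactly,
        # larger ones probabilistically.
        if len(labels) <= exact_threshold:
            self.rep = SortedLabelList(labels)
        else:
            self.rep = BloomLabelSet(labels)


def cull(node, query):
    """Return names of leaves that may contain any label in `query`."""
    # Prune the query: keep only labels this node may contain.
    pruned = [label for label in query if node.rep.may_contain(label)]
    if not pruned:
        return []  # the whole subtree is culled
    if not node.children:
        return [node.name]
    hits = []
    for child in node.children:
        hits.extend(cull(child, pruned))  # pass the smaller query down
    return hits


leaf_a = Node("A", [1, 2, 3])              # stored as a sorted list
leaf_b = Node("B", [10, 11, 12, 13, 14, 15])  # stored as a Bloom filter
root = Node("root", [1, 2, 3, 10, 11, 12, 13, 14, 15], [leaf_a, leaf_b])
result = cull(root, [12])
```

In this sketch a leaf with an exact list never reports a label it does not hold, while a Bloom-filter leaf may occasionally be reported without holding the label. That is the conservative behavior the abstract describes for the probabilistic representation.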
- Published
- 2019