1. Cornac: Tackling Huge Graph Visualization with Big Data Infrastructure
- Author
- Alexandre Perrot, David Auber, Laboratoire Bordelais de Recherche en Informatique (LaBRI), and Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
- Subjects
Information Systems and Management, Computer science, Distributed computing, Big data, IEEEtran, 02 engineering and technology, Data visualization, Graph drawing, 0202 electrical engineering, electronic engineering, information engineering, Cluster analysis, Interactive visualization, business.industry, paper, template, 020207 software engineering, Computer Society, Visualization, IEEE, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], Scalability, Canopy clustering algorithm, journal, 020201 artificial intelligence & image processing, business, LaTeX, Information Systems
- Abstract
The size of available graphs has increased drastically in recent years. Real-time visualization of graphs with millions of edges is challenging, but it is necessary to grasp the information hidden in huge datasets. This article presents an end-to-end technique to visualize huge graphs using an established Big Data ecosystem and a lightweight client running in a Web browser. For that purpose, levels of abstraction and graph tiles are generated by a batch layer, and the interactive visualization is provided by a serving layer together with client-side real-time computation of edge bundling and graph splatting. A major challenge is to create techniques that work without moving data to an ad hoc system and that take advantage of the horizontal scalability of these infrastructures. We introduce two novel scalable algorithms, one that generates a canopy clustering and one that aggregates graph edges. Both algorithms are used to produce the levels of abstraction and the graph tiles. We prove that our technique guarantees a quality of visualization by controlling both the bandwidth required for data transfer and the quality of the produced visualization. Furthermore, we demonstrate the usability of our technique by providing a complete prototype. We present benchmarks on graphs with millions of elements and compare our results to those obtained by state-of-the-art techniques. Our results show that new Big Data technologies can be incorporated into the visualization pipeline to push back the size limits of the graphs one can visually analyze.
- Published
- 2018