1. MuLOT: Multi-level Optimization of the Canonical Polyadic Tensor Decomposition at Large-Scale
- Author
Nadine Cullot, Eric Leclercq, Annabelle Gillet; Laboratoire d'Informatique de Bourgogne [Dijon] (LIB) and Université de Bourgogne (UB)
- Subjects
0303 health sciences, [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB], Scale (ratio), Scala, Computer science, Computation, 010103 numerical & computational mathematics, Data structure, 01 natural sciences, 03 medical and health sciences, Spark (mathematics), Decomposition (computer science), Tensor, 0101 mathematics, computer, Algorithm, ComputingMilieux_MISCELLANEOUS, 030304 developmental biology, Sparse matrix, computer.programming_language
- Abstract
Tensors are used in a wide range of analytics tools and as intermediary data structures in machine learning pipelines. Large-scale implementations of tensor decompositions often adopt a single type of optimization and neglect the possibility of combining several. As a result, they do not benefit from all available improvements and are less effective than they could be. We propose an algorithm that uses both dense and sparse data structures and that combines coarse-grained and fine-grained optimizations with incremental computations to achieve large-scale CP (CANDECOMP/PARAFAC) tensor decomposition. We also provide MuLOT, an implementation in Scala using Spark, that outperforms the baseline of large-scale CP decomposition libraries by several orders of magnitude, and we run experiments to demonstrate its large-scale capability. Finally, we study a typical use case of CP decomposition on social network data.
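To make the pairing of sparse and dense data structures in the abstract concrete, the following is a minimal sketch of the matricized-tensor times Khatri-Rao product (MTTKRP), the kernel at the heart of standard CP-ALS. It is not taken from MuLOT, which is written in Scala on Spark; it is plain Python, and the names `mttkrp_mode0`, `cp_entry`, `nonzeros`, `A`, `B` and `C` are hypothetical. It assumes a 3-way tensor stored as a coordinate dictionary (sparse) and factor matrices stored as lists of rows (dense).

```python
# Illustrative sketch only; this is not MuLOT's code.
# Sparse 3-way tensor in coordinate form: {(i, j, k): value}.
# Dense factor matrices: lists of rows, each row of length `rank`.

def mttkrp_mode0(nonzeros, B, C, n_rows, rank):
    """Mode-0 MTTKRP: M[i][r] = sum over (j, k) of X[i,j,k] * B[j][r] * C[k][r].

    Only the stored nonzero entries are visited, so the cost scales with
    the number of nonzeros rather than with the full tensor size.
    """
    M = [[0.0] * rank for _ in range(n_rows)]
    for (i, j, k), value in nonzeros.items():
        for r in range(rank):
            M[i][r] += value * B[j][r] * C[k][r]
    return M


def cp_entry(A, B, C, i, j, k):
    """Entry (i, j, k) of the tensor rebuilt from the CP factors A, B, C."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))


# Rank-1 toy example (hypothetical data): X[i,j,k] = a[i] * b[j] * c[k].
A = [[1.0], [2.0]]
B = [[3.0], [0.0]]
C = [[1.0], [1.0]]

# Keep only the nonzero entries of the 2 x 2 x 2 tensor.
nonzeros = {
    (i, j, k): cp_entry(A, B, C, i, j, k)
    for i in range(2) for j in range(2) for k in range(2)
    if cp_entry(A, B, C, i, j, k) != 0.0
}

print(len(nonzeros))                        # 4 of the 8 entries are stored
print(mttkrp_mode0(nonzeros, B, C, 2, 1))   # [[18.0], [36.0]]
```

In a full CP-ALS iteration, the MTTKRP result would then be multiplied by the pseudo-inverse of the element-wise product of the factors' Gram matrices to update `A`; that step, and any distribution across a cluster, is omitted here.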
- Published
- 2021