1. Extracting OLAP Cubes From Document-Oriented NoSQL Database Based on Parallel Similarity Algorithms
- Author
Kambiz Majidzadeh, Farnaz Davardoost, and Amin Babazadeh Sangar
- Subjects
Database, Computer science, Relational database, Nearest neighbor search, Data management, Online analytical processing, Big data, Hash function, Engineering and technology, NoSQL, Hardware and Architecture, Information systems, Electrical engineering, electronic engineering, information engineering, Artificial intelligence & image processing, Shingling, Electrical and Electronic Engineering
- Abstract
Today, relational databases are poorly suited to data management because of the large variety and volume of data, which are mostly untrusted. NoSQL has therefore attracted the attention of companies. Although it is a proper choice for managing a variety of large-volume data, performing online analytical processing (OLAP) on NoSQL is a major challenge because NoSQL is schema-less. This article introduces a model that overcomes null values when converting document-oriented NoSQL databases into relational databases using parallel similarity techniques. The proposed model includes four phases: shingling, chunk, minhashing, and locality-sensitive hashing MapReduce (LSHMR). Each phase performs its own processing on the input NoSQL databases. The main idea of LSHMR is based on the nature of both locality-sensitive hashing (LSH) and MapReduce (MR). In this article, the LSH similarity search technique is used on the MR framework to extract OLAP cubes. LSH is used to decrease the number of comparisons, and MR enables efficient distributed and parallel computing. The proposed model is an efficient and suitable approach for extracting OLAP cubes from a NoSQL database.
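The abstract names shingling, minhashing, and LSH without showing how they combine to reduce comparisons. The following is a minimal single-process sketch of those three steps, not the authors' implementation: the function names, the use of comma-joined field names as document text, the shingle length `k`, and the `num_hashes` and `bands` values are all assumptions made for illustration. The chunk phase and the MapReduce distribution are not reproduced; the bucket grouping in `lsh_candidate_pairs` only hints at what a shuffle step might group by.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-character shingles of a string."""
    if len(text) < k:
        return {text}  # assumption: treat very short text as one shingle
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle, varied by an integer seed."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(shingle_set, num_hashes=32):
    """One minimum per seeded hash function; similar sets share many minima."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def lsh_candidate_pairs(signatures, bands=8):
    """Band the signatures; documents sharing any band bucket become candidates."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical documents, each reduced to a string of its field names.
docs = {
    "a": "name,age,city",
    "b": "name,age,city",
    "c": "sku,price,stock,warehouse",
}
signatures = {doc_id: minhash_signature(shingles(text)) for doc_id, text in docs.items()}
pairs = lsh_candidate_pairs(signatures)
```

Only the pairs returned by `lsh_candidate_pairs` would need a full similarity check, which is how LSH reduces the number of comparisons relative to comparing every document with every other.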
- Published
- 2020