DPiSAX: Massively Distributed Partitioned iSAX
- Source :
- ICDM 2017: IEEE International Conference on Data Mining, Nov 2017, New Orleans, United States. pp.1135-1140 (also recorded as pp.1-6), ⟨10.1109/ICDM.2017.151⟩, 〈http://www.ucs.louisiana.edu/~sxk6389/index.html〉
- Publication Year :
- 2017
- Publisher :
- HAL CCSD, 2017.
Abstract
- Indexing is crucial for many data mining tasks that rely on efficient and effective similarity query processing. Consequently, indexing large volumes of time series and high-performance similarity query processing have become topics of high interest. For many applications across diverse domains, however, the amount of data to be processed might be intractable for a single machine, making existing centralized indexing solutions inefficient. We propose a parallel indexing solution that gracefully scales to billions of time series, and a parallel query processing strategy that, given a batch of queries, efficiently exploits the index. Our experiments, on both synthetic and real-world data, illustrate that our index creation algorithm works on 1 billion time series in less than 2 hours, while the state-of-the-art centralized algorithms need more than 5 days. Also, our distributed querying algorithm is able to efficiently process millions of queries over collections of billions of time series, thanks to an effective load balancing mechanism.
- Subjects :
- High interest
[INFO.INFO-NA] Computer Science [cs]/Numerical Analysis [cs.NA]
Computer science
Search engine indexing
Parallel algorithm
02 engineering and technology
Load balancing (computing)
Temporal database
Database index
[INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Data mining
Details
- Language :
- English
- ISBN :
- 978-1-5386-3835-4
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....b9e6c8a503bdafb28c0fcfd868a25334