Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data.

Authors :: Ghosh, Moumita
Roy, Anirban
Sil, Pritam
Mondal, Kartick Chandra
Source :: Innovations in Systems & Software Engineering; Sep 2023, Vol. 19 Issue 3, p283-301, 19p
Publication Year :: 2023
Abstract: The efficient discovery of frequent itemsets from a transaction database is the fundamental step of association rule mining in data analytics. Interesting associations among the items in a transaction database contribute to knowledge enrichment and thereby ease decision-making and pattern generation from massive amounts of data. However, a major problem with frequent itemset mining algorithms is their excessive memory requirement, which makes them unsuitable for larger datasets containing itemsets of high cardinality. Several novel data structures for mining frequent itemsets have been introduced in recent years, for example N-List, NodeSet, DiffNodeSet, and proximity list; these offer a coherent mining approach that improves execution time while still leaving scope for further improvement in memory requirements. In this paper, we propose a novel algorithm that uses cellular learning automata (CLA) and multiple FP-tree structures for frequent itemset mining and is efficient in both time and memory. The proposed method was compared experimentally with leading algorithms on publicly available real and synthetic datasets designed specifically for pattern mining algorithms. The results indicate that the proposed method is memory-efficient and shows comparable execution time across varying dataset dimensions and densities, which supports its robustness. In addition to the new methodology for frequent itemset mining, its potential domain-specific use in species biodiversity data analysis is discussed. Which groups of species are closely related can be derived from the large occurrence records in species datasets. This could help in understanding species co-occurrence across multiple sites, which in turn assists in addressing ecology-related issues in afforestation and reforestation. It could be a step forward toward the advantageous use of computer science in the biodiversity domain. [ABSTRACT FROM AUTHOR]
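
To make the notion of a frequent itemset concrete, the following is a minimal Python sketch of a plain level-wise (Apriori-style) search over site-by-species occurrence records. It is not the CLA and FP-tree algorithm proposed in the paper, whose details are not given in this record; the function name `frequent_itemsets`, the `min_support` parameter, and the toy species data are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least
    `min_support` transactions. Baseline level-wise search only; this is
    NOT the paper's CLA/FP-tree method (assumed illustration)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = Counter(item for t in transactions for item in t)
    result = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    current = set(result)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                level[c] = support
        result.update(level)
        current = set(level)
        k += 1
    return result


# Hypothetical occurrence records: each set lists the species seen at one site.
sites = [
    {"oak", "fern", "moss"},
    {"oak", "fern"},
    {"oak", "moss"},
    {"fern", "moss", "pine"},
    {"oak", "fern", "moss"},
]

for itemset, support in sorted(
    frequent_itemsets(sites, min_support=3).items(),
    key=lambda kv: (len(kv[0]), sorted(kv[0])),
):
    print(sorted(itemset), support)
```

Here each site plays the role of a transaction and each species the role of an item, so a frequent itemset is a group of species that co-occur at enough sites. A search of this kind rescans the data at every level, which is the sort of cost that tree-based structures such as the FP-tree aim to reduce.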

Details

Language :: English
ISSN :: 1614-5046
Volume :: 19
Issue :: 3
Database :: Complementary Index
Journal :: Innovations in Systems & Software Engineering
Publication Type :: Academic Journal
Accession number :: 170040563
Full Text :: https://doi.org/10.1007/s11334-022-00500-3
