Binary Count Tree: An Efficient and Compact Structure for Mining Rare and Frequent Itemsets

Shwetha Rai, M. Geetha*, Preetham Kumar, B. Giridhar

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Rare and frequent itemsets can be discovered efficiently when the dataset being processed is stored within main memory. In recent years, various data structures have been developed to represent a large dataset in a compact form when it could not otherwise be held in main memory as a whole. Binary Count Tree (BIN-Tree), the tree data structure proposed in this paper, represents the entire dataset in a compact and complete form without any information loss. Each transaction is encoded and stored as a node in the tree, in contrast to existing algorithms that store each item as a node. The efficiency of BIN-Tree for datasets of varying size and dimensions was evaluated against Single Scan Pattern Tree (SSP-Tree) and Weighted Count Tree (WC-Tree). The results show BIN-Tree to be 95% and 75% more space-efficient than SSP-Tree and WC-Tree, respectively. BIN-Tree construction and the discovery of itemsets from a large dataset were found to be 93% and 22% more time-efficient than with SSP-Tree and WC-Tree, respectively. BIN-Tree is equally efficient at discovering rare and frequent itemsets from a small dataset in main memory.
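The idea of encoding a whole transaction as a single unit, rather than storing one node per item, can be illustrated with a small Python sketch. This is not the BIN-Tree algorithm from the paper, whose node layout and traversal are not described in the abstract. It is only an assumed, simplified stand-in: each transaction becomes an integer bitmask, identical transactions are collapsed into one entry with a count, and itemset support is computed from those counts. The function names, the `item_index` mapping, the `min_freq` threshold, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def encode(items, item_index):
    """Encode a set of items as an integer bitmask (bit i set if item i is present)."""
    mask = 0
    for item in items:
        mask |= 1 << item_index[item]
    return mask


def build_counts(transactions, item_index):
    """Collapse identical transactions into one (bitmask -> count) entry."""
    return Counter(encode(t, item_index) for t in transactions)


def support(itemset, counts, item_index):
    """Count the transactions that contain every item in `itemset`."""
    target = encode(itemset, item_index)
    return sum(c for mask, c in counts.items() if mask & target == target)


def classify_itemsets(counts, item_index, min_freq, max_size=2):
    """Split itemsets of up to `max_size` items into frequent and rare lists.

    Assumption for this sketch: an itemset is frequent if its support is at
    least `min_freq`, and rare if its support is non-zero but below it.
    """
    frequent, rare = [], []
    for k in range(1, max_size + 1):
        for itemset in combinations(sorted(item_index), k):
            s = support(itemset, counts, item_index)
            if s >= min_freq:
                frequent.append((itemset, s))
            elif s > 0:
                rare.append((itemset, s))
    return frequent, rare


# Hypothetical sample data, not taken from the paper.
item_index = {"a": 0, "b": 1, "c": 2}
transactions = [["a", "b"], ["a", "b"], ["a", "c"], ["b"], ["a", "b", "c"]]
counts = build_counts(transactions, item_index)
frequent, rare = classify_itemsets(counts, item_index, min_freq=3)
```

In this toy data the two identical transactions share one encoded entry, which is the kind of saving a per-transaction encoding can offer when duplicates are common. The brute-force enumeration in `classify_itemsets` is only for illustration and would not scale to large item universes.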

Original language: English
Pages (from-to): 185-194
Number of pages: 10
Journal: Engineered Science
Volume: 17
Publication status: Published - 2022

All Science Journal Classification (ASJC) codes

  • General Engineering
  • Physical and Theoretical Chemistry
  • Chemistry (miscellaneous)
  • General Materials Science
  • Energy Engineering and Power Technology
  • Artificial Intelligence
  • Applied Mathematics
