TY - JOUR
T1 - Partial Weighted Count Tree for Discovery of Rare and Frequent Itemsets
AU - Rai, Shwetha
AU - Geetha, M.
AU - Kumar, Preetham
AU - Giridhar, B.
N1 - Funding Information:
The authors would like to acknowledge Mr. Nakul Shetty and Dr. Salmataj S. A. for their assistance while writing this research paper. The authors would like to thank the Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal for providing the lab facilities to conduct the experiments.
Publisher Copyright:
© Engineered Science Publisher LLC 2022.
PY - 2022/12/1
Y1 - 2022/12/1
N2 - Time and space utilization for discovering interesting patterns from a database plays an important role in analyzing information for major sectors like education, medicine, and e-business. Association rule mining (ARM) technique is used to discover associations among the patterns from large volumes of data. In most ARM algorithms, rare and frequent itemsets discovery is optimized by mining pruned databases stored in the main memory. However, in this case, any change in requirements would necessitate re-scanning of the database. Weighted count tree (WC-Tree), and Single scan pattern tree (SSP-Tree) store the database in the main memory without pruning. WC-Tree stores the entire transaction as a node in the tree. However, if the weight is large, the actual information may be lost due to the precision error. In the current work, an efficient data structure, Partial weighted count tree (PWC-Tree), is proposed to store the database as a complete and compact structure in the main memory without losing the information. The work revealed that PWC-Tree construction is in O(n²) for n transactions in the database. The experimental results show that, for a large dataset, the PWC-Tree is time as well as space-efficient when compared with WC-Tree and SSP-Tree.
AB - Time and space utilization for discovering interesting patterns from a database plays an important role in analyzing information for major sectors like education, medicine, and e-business. Association rule mining (ARM) technique is used to discover associations among the patterns from large volumes of data. In most ARM algorithms, rare and frequent itemsets discovery is optimized by mining pruned databases stored in the main memory. However, in this case, any change in requirements would necessitate re-scanning of the database. Weighted count tree (WC-Tree), and Single scan pattern tree (SSP-Tree) store the database in the main memory without pruning. WC-Tree stores the entire transaction as a node in the tree. However, if the weight is large, the actual information may be lost due to the precision error. In the current work, an efficient data structure, Partial weighted count tree (PWC-Tree), is proposed to store the database as a complete and compact structure in the main memory without losing the information. The work revealed that PWC-Tree construction is in O(n²) for n transactions in the database. The experimental results show that, for a large dataset, the PWC-Tree is time as well as space-efficient when compared with WC-Tree and SSP-Tree.
UR - http://www.scopus.com/inward/record.url?scp=85137202420&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137202420&partnerID=8YFLogxK
U2 - 10.30919/es8d731
DO - 10.30919/es8d731
M3 - Article
AN - SCOPUS:85137202420
SN - 2576-988X
VL - 20
JO - Engineered Science
JF - Engineered Science
ER -