Abstract
Data mining is a field of computer science concerned with extracting useful information from varied sources. As information has become a basic necessity, its relevance and usefulness have grown in importance. The most important part of association rule mining is the mining of frequent itemsets. Companies perform market basket analysis to retrieve itemsets that are frequent and often used together by customers. The Apriori algorithm is a widely used technique for finding such itemsets. However, as the length of the frequent itemsets increases, the algorithm must pass through many iterations and, as a result, its performance degrades drastically. In this paper, we propose a modification to the Apriori algorithm that uses a hash function to divide the frequent itemsets into buckets. Further, we propose a novel technique, to be used in conjunction with the Apriori algorithm, that eliminates infrequent itemsets from the candidate set. This top-down approach finds the frequent itemsets without going through several iterations, thus saving time and space. Once a large maximal frequent itemset is discovered early in the algorithm, all of its subsets are known to be frequent, so they no longer need to be scanned. The proposed technique therefore has an advantage over the existing Apriori algorithm when the maximal frequent itemset is long.
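The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract does not specify the hash function, the bucket layout, or the traversal order. The bucket step below is a swapped-in, well-known variant (hash-based pruning of candidate pairs), and the top-down step is a brute-force enumeration from the largest candidate downward. The names `bucket_of`, `frequent_pairs_hashed`, `top_down_maximal`, the parameter `n_buckets`, and the sample transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def bucket_of(pair, n_buckets):
    """Deterministic toy hash for a pair of string items (an assumption)."""
    return sum(ord(ch) for item in pair for ch in item) % n_buckets


def frequent_pairs_hashed(transactions, min_support, n_buckets=7):
    """Find frequent pairs, pruning candidates with bucket counts first."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets
    # Pass 1: count single items and hash every pair into a bucket.
    for t in transactions:
        item_counts.update(t)
        for pair in combinations(sorted(t), 2):
            bucket_counts[bucket_of(pair, n_buckets)] += 1
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)
    # A pair can only be frequent if its bucket total reaches min_support,
    # so pairs in light buckets are dropped without an exact count.
    candidates = [
        frozenset(p)
        for p in combinations(frequent_items, 2)
        if bucket_counts[bucket_of(p, n_buckets)] >= min_support
    ]
    # Pass 2: exact support counting for the surviving candidates only.
    counts = {c: support(c, transactions) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_support}


def top_down_maximal(transactions, min_support):
    """Search from the largest candidate downward for maximal frequent itemsets."""
    item_counts = Counter(i for t in transactions for i in t)
    items = sorted(i for i, c in item_counts.items() if c >= min_support)
    maximal = []
    for size in range(len(items), 0, -1):
        for cand in combinations(items, size):
            cand = frozenset(cand)
            # Subsets of a known frequent itemset are frequent: skip the scan.
            if any(cand <= m for m in maximal):
                continue
            if support(cand, transactions) >= min_support:
                maximal.append(cand)
    return maximal


# Hypothetical market-basket data, used only for illustration.
transactions = [
    {"bread", "milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk", "butter", "jam"},
    {"milk", "jam"},
    {"bread", "butter"},
]
pairs = frequent_pairs_hashed(transactions, min_support=3)
maximal = top_down_maximal(transactions, min_support=3)
print(pairs)
print(maximal)
```

In this toy data set the three-item set of bread, butter, and milk is found to be frequent on the first top-down check, so none of its subsets needs a database scan. The brute-force enumeration is exponential in the number of frequent items, so the sketch only shows the pruning idea and makes no claim about the paper's reported performance.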
Original language | English |
---|---|
Title of host publication | Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 747-752 |
Number of pages | 6 |
ISBN (Electronic) | 9781509023981 |
Publication status | Published - 25-04-2017 |
Externally published | Yes |
Event | 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016 - Bengaluru, Karnataka, India; Duration: 21-07-2016 → 23-07-2016 |
Conference
Conference | 2nd International Conference on Applied and Theoretical Computing and Communication Technology, iCATccT 2016 |
---|---|
Country/Territory | India |
City | Bengaluru, Karnataka |
Period | 21-07-16 → 23-07-16 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Computer Science Applications
- Signal Processing
- Software