TY - GEN
T1 - Prefix-Suffix trees
T2 - 2nd International Conference on Pattern Recognition and Machine Intelligence, PReMI 2007
AU - Pai, Radhika M.
AU - Ananthanarayana, V. S.
PY - 2007
Y1 - 2007
N2 - An important goal in data mining is to generate an abstraction of the data. Such an abstraction helps reduce the time and space requirements of the overall decision-making process. It is also important that the abstraction be generated from the data in a small number of scans. In this paper, we propose a novel scheme called Prefix-Suffix trees for compact storage of patterns in data mining, which forms an abstraction of the patterns and is generated from the data in a single scan. This abstraction takes less space and hence forms a compact storage of patterns. Further, we propose a clustering algorithm based on this storage and show experimentally that this type of storage reduces the space and time requirements. This has been established using large data sets of handwritten numerals, namely the OCR data, the MNIST data, and the USPS data. The proposed algorithm is compared with other similar algorithms, and the efficacy of our scheme is thus established.
AB - An important goal in data mining is to generate an abstraction of the data. Such an abstraction helps reduce the time and space requirements of the overall decision-making process. It is also important that the abstraction be generated from the data in a small number of scans. In this paper, we propose a novel scheme called Prefix-Suffix trees for compact storage of patterns in data mining, which forms an abstraction of the patterns and is generated from the data in a single scan. This abstraction takes less space and hence forms a compact storage of patterns. Further, we propose a clustering algorithm based on this storage and show experimentally that this type of storage reduces the space and time requirements. This has been established using large data sets of handwritten numerals, namely the OCR data, the MNIST data, and the USPS data. The proposed algorithm is compared with other similar algorithms, and the efficacy of our scheme is thus established.
UR - http://www.scopus.com/inward/record.url?scp=38149103161&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38149103161&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:38149103161
SN - 3540770453
SN - 9783540770459
VL - 4815 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 316
EP - 323
BT - Pattern Recognition and Machine Intelligence - Second International Conference, PReMI 2007, Proceedings
Y2 - 18 December 2007 through 22 December 2007
ER -