TY - GEN
T1 - Categorical Boosting Machine for Tamil Character Recognition Using Shape Based Features
AU - Vishnu, Vishnu Mukundan
AU - Nevatia, Isha
AU - Mishra, Tusar Kanti
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Tamil is one of the world's oldest surviving languages, and modern Indian scripts draw inspiration from it. However, Optical Character Recognition (OCR) techniques for Tamil characters remain underdeveloped. This paper describes an OCR model for handwritten Tamil vowels. The approach uses a CHFR feature extraction technique to extract three distinct feature vectors from the contours of each individual character's centroid. These feature vectors are condensed into a single vector, which is passed to a CatBoost classifier. Hyperparameter optimisation is performed using the Optuna framework, together with a comparative analysis of existing models. The proposed scheme is applied to a dataset of 6,972 handwritten samples evenly divided into 12 classes representing the vowels of the Tamil alphabet. The primary evaluation metric is accuracy, and the model achieves a testing accuracy of 84.36%.
UR - https://www.scopus.com/pages/publications/85187266635
UR - https://www.scopus.com/pages/publications/85187266635#tab=citedBy
U2 - 10.1109/ICCUBEA58933.2023.10392044
DO - 10.1109/ICCUBEA58933.2023.10392044
M3 - Conference contribution
AN - SCOPUS:85187266635
T3 - 2023 7th International Conference On Computing, Communication, Control And Automation, ICCUBEA 2023
BT - 2023 7th International Conference On Computing, Communication, Control And Automation, ICCUBEA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 7th International Conference On Computing, Communication, Control And Automation, ICCUBEA 2023
Y2 - 18 August 2023 through 19 August 2023
ER -