Abstract
This paper proposes a part-of-speech (POS) tagging system for the South Indian language Kannada using supervised machine learning. POS tagging is an important step in natural language processing, with applications such as word sense disambiguation and natural language understanding. Based on an extensive review of methods used for POS tagging, Conditional Random Fields (CRFs) were chosen as the algorithm. CRFs are used for sequence modeling in tasks such as POS tagging and named entity recognition, and serve as an alternative to Hidden Markov Models. Three very large corpora are used, and their results are compared. The feature set is also varied for each of the three corpora. The best method for the task is determined from these results.
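To make the idea of varying the feature set concrete, the following is a minimal Python sketch of per-token feature extraction of the kind commonly fed to a CRF tagger. The abstract does not list the paper's actual features, so every feature here (suffixes, neighbouring words, the `use_context` and `max_suffix` switches) and the romanized example sentence are illustrative assumptions, not the authors' method. The output shape (one list of feature dictionaries per sentence) is the input format that CRF toolkits such as sklearn-crfsuite accept.

```python
from typing import Dict, List


def token_features(sentence: List[str], i: int,
                   use_context: bool = True,
                   max_suffix: int = 3) -> Dict[str, object]:
    """Build a feature dict for the token at position i.

    All features are hypothetical examples; the paper's real feature
    sets are not given in the abstract.
    """
    word = sentence[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word": word,
        "is_digit": word.isdigit(),
        "length": len(word),
    }
    # Suffix features: assumed useful because Kannada marks case and
    # tense with suffixes. Slicing works on code points, so native-script
    # text may need grapheme-aware splitting instead.
    for n in range(1, max_suffix + 1):
        if len(word) > n:
            feats[f"suffix{n}"] = word[-n:]
    # Optional context features: switching these off gives a smaller
    # feature set to compare against.
    if use_context:
        feats["prev_word"] = sentence[i - 1] if i > 0 else "<BOS>"
        feats["next_word"] = (
            sentence[i + 1] if i < len(sentence) - 1 else "<EOS>"
        )
    return feats


def sentence_features(sentence: List[str], **kwargs) -> List[Dict[str, object]]:
    """Feature dicts for every token in one sentence."""
    return [token_features(sentence, i, **kwargs) for i in range(len(sentence))]


if __name__ == "__main__":
    # Romanized, illustrative tokens only; not taken from the paper's corpora.
    tokens = ["avanu", "shalege", "hoguttane"]
    full = sentence_features(tokens)
    reduced = sentence_features(tokens, use_context=False, max_suffix=2)
    print(len(full[1]), "features vs", len(reduced[1]), "features")
```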
Original language | English |
---|---|
Pages (from-to) | 2418-2421 |
Number of pages | 4 |
Journal | International Journal of Engineering and Technology (UAE) |
Volume | 7 |
Issue number | 4 |
DOIs | |
Publication status | Published - 2018 |
All Science Journal Classification (ASJC) codes
- Biotechnology
- Computer Science (miscellaneous)
- Environmental Engineering
- General Chemical Engineering
- General Engineering
- Hardware and Architecture