TY - JOUR
T1 - Semi-supervised deep learning based named entity recognition model to parse education section of resumes
AU - Gaur, Bodhvi
AU - Saluja, Gurpreet Singh
AU - Sivakumar, Hamsa Bharathi
AU - Singh, Sanjay
N1 - Publisher Copyright: © 2020, The Author(s).
PY - 2021/6
Y1 - 2021/6
AB - A job seeker’s resume contains several sections, including educational qualifications. Educational qualifications capture the knowledge and skills relevant to the job. Machine processing of the education sections of resumes has been a difficult task. In this paper, we attempt to identify the names of educational institutions and degrees from a resume’s education section. Neural network-based named entity recognition techniques usually require a significant amount of annotated data. We use a semi-supervised approach to overcome the lack of a large annotated dataset. We train a deep neural network model on an initial (seed) set of resume education sections. This model is used to predict entities in unlabeled education sections, and its predictions are rectified using a correction module. The education sections containing the rectified entities are added to the seed set. The updated seed set is used for retraining, which yields better accuracy than the previously trained model. In this way, the approach can provide high overall accuracy without the need for a large annotated dataset. Our model achieves an accuracy of 92.06% on the named entity recognition task.
UR - http://www.scopus.com/inward/record.url?scp=85091156602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091156602&partnerID=8YFLogxK
U2 - 10.1007/s00521-020-05351-2
DO - 10.1007/s00521-020-05351-2
M3 - Article
AN - SCOPUS:85091156602
SN - 0941-0643
VL - 33
SP - 5705
EP - 5718
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 11
ER -