
BERT based Transformers lead the way in Extraction of Health Information from Social Media

  • Sidharth Ramesh
  • Abhiraj Tiwari
  • Parthivi Choubey
  • Saisha Kashyap
  • Sahil Khose
  • Kumud Lakara
  • Nishesh Singh
  • Ujjwal Verma

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

This paper describes our submissions for the Social Media Mining for Health (SMM4H) 2021 shared tasks. We participated in two tasks: (1) classification, extraction and normalization of adverse drug effect (ADE) mentions in English tweets (Task-1) and (2) classification of COVID-19 tweets containing symptoms (Task-6). Our approach for the first task uses the language representation model RoBERTa with a binary classification head. For the second task, we use BERTweet, which is based on RoBERTa. We fine-tune the pre-trained models for both tasks. Each model sits on top of a custom, domain-specific pre-processing pipeline. Our system ranked first among all the submissions for subtask-1(a) with an F1-score of 61%. For subtask-1(b), our system obtained an F1-score of 50%, with improvements of up to +8% F1 over the score averaged across all submissions. The BERTweet model achieved an F1-score of 94% on SMM4H 2021 Task-6.
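
The results above are all reported as F1-scores. As a reminder of what that figure measures, the following is a minimal sketch of F1 for a binary classification task, assuming labels are encoded as 0/1 and that class 1 (for example, "tweet mentions an ADE") is the positive class. The function name `f1_score` and the toy labels are illustrative assumptions; they come neither from the paper nor from the shared-task evaluation scripts.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score of the positive class for two parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive prediction: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels): 2 true positives, 1 false positive, 1 false negative.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
score = f1_score(gold, pred)
print(f"F1 = {score:.2%}")
```

Because F1 is the harmonic mean of precision and recall, it stays low unless both are high, which is why it is commonly preferred over accuracy when positive examples are rare.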

Original language: English
Title of host publication: Social Media Mining for Health, SMM4H 2021 - Proceedings of the 6th Workshop and Shared Tasks
Editors: Arjun Magge, Ari Z. Klein, Antonio Miranda-Escalada, Mohammed Ali Al-garadi, Ilseyar Alimova, Zulfat Miftahutdinov, Eulalia Farre-Maduell, Salvador Lima Lopez, Ivan Flores, Karen O'Connor, Davy Weissenbacher, Elena Tutubalina, Abeed Sarker, Juan M Banda, Martin Krallinger, Graciela Gonzalez-Hernandez
Publisher: Association for Computational Linguistics (ACL)
Pages: 33-38
Number of pages: 6
ISBN (Electronic): 9781954085312
Publication status: Published - 2021
Event: 6th Workshop and Shared Tasks on Social Media Mining for Health, SMM4H 2021 - Mexico City, Mexico
Duration: 10-06-2021 → …

Publication series

Name: Social Media Mining for Health, SMM4H 2021 - Proceedings of the 6th Workshop and Shared Tasks

Conference

Conference: 6th Workshop and Shared Tasks on Social Media Mining for Health, SMM4H 2021
Country/Territory: Mexico
City: Mexico City
Period: 10-06-21 → …

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Software
