Statistical Detection of Data Drift in Real-time Social Network Conversations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The increasing reliance on conversational datasets for natural language processing (NLP) applications calls for a thorough understanding of data drift. This paper investigates how data drift arises in conversational datasets over time, with the aim of developing effective methods for detecting and mitigating it. Our approach analyzes temporal changes in the distribution of conversation data, focusing on linguistic patterns, user preferences, and contextual nuances. We propose a novel framework that uses statistical methods and machine learning techniques to quantify and detect data drift within a dataset. The methodology is designed to adapt to the evolving nature of language use, capturing subtle shifts in conversational dynamics that may affect model performance. We also present experimental results on a diverse set of conversational datasets that demonstrate the efficacy of our approach in identifying and characterizing data drift. The findings highlight the importance of continuous monitoring and adaptation to evolving linguistic patterns for keeping NLP models robust and able to generalize over time. This research contributes to the broader understanding of data drift in conversational datasets and provides a foundation for developing adaptive NLP models that maintain high performance in dynamic linguistic environments. The proposed framework enhances the reliability of existing models and lays the groundwork for future research on the challenges that data drift poses in natural language conversations.
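The abstract does not say which statistical tests the framework relies on. Purely as an illustration of what "quantifying temporal changes in the distribution of conversation data" can look like, the sketch below compares the token distributions of two time windows using the Jensen-Shannon divergence. The choice of divergence, the whitespace tokenizer, the function names, and the 0.3 threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def token_distribution(messages):
    """Relative frequency of lowercase, whitespace-separated tokens."""
    counts = Counter(tok for msg in messages for tok in msg.lower().split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 = identical, 1 = disjoint support."""
    vocab = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint vocabulary.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}

    def kl_to_mixture(a):
        # KL(A || M); tokens with zero probability in A contribute nothing.
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def drift_score(reference_window, current_window, threshold=0.3):
    """Return (score, flagged). The threshold is an arbitrary placeholder."""
    score = js_divergence(
        token_distribution(reference_window),
        token_distribution(current_window),
    )
    return score, score > threshold
```

In practice the threshold would need to be calibrated on windows known to be drift-free, and a real pipeline would likely compare richer features than raw token counts.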

Original language: English
Title of host publication: 2024 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350370249
DOIs
Publication status: Published - 2024
Event: 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024 - Kamand, India
Duration: 24-06-2024 to 28-06-2024

Publication series

Name: 2024 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024

Conference

Conference: 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024
Country/Territory: India
City: Kamand
Period: 24-06-24 to 28-06-24

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Decision Sciences (miscellaneous)
  • Information Systems and Management
  • Health Informatics
  • Communication
