Streamlined Data Pipeline for Real-Time Threat Detection and Model Inference

  • Rajkanwar Singh*
  • V. Aravindan
  • Sanket Mishra
  • Sunil Kumar Singh

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Real-time threat detection in streaming data is crucial yet challenging due to varying data volumes and speeds. This paper presents an architecture designed to manage large-scale, high-speed data streams using deep learning and machine learning models. The system uses Apache Kafka for high-throughput data transfer and a publish-subscribe model to facilitate continuous threat detection. Various machine learning techniques, including XGBoost, Random Forest, and LightGBM, are evaluated to identify the best model for classification. Within this architecture, the ExtraTrees model achieves exceptional performance on the SensorNetGuard dataset, with accuracy, precision, recall, and F1 score all reaching 99%. The PyFlink framework, with its parallel processing capabilities, supports real-time training and adaptation of these models. The system calculates prediction metrics every 2,000 data points, ensuring efficient and accurate real-time threat detection.
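
The periodic evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes binary labels (1 = threat, 0 = benign) and non-overlapping windows of 2,000 records whose counters reset after each report; the abstract does not say whether the metrics are per-window or cumulative. The function name `windowed_metrics` is hypothetical, and a synthetic in-memory stream stands in for records consumed from a Kafka topic.

```python
from typing import Dict, Iterable, Iterator, Tuple


def windowed_metrics(
    pairs: Iterable[Tuple[int, int]], window: int = 2000
) -> Iterator[Dict[str, float]]:
    """Yield accuracy, precision, recall and F1 after every `window` records.

    `pairs` is any iterable of (true_label, predicted_label) with 1 = threat.
    Counters reset after each report (assumed tumbling-window behaviour).
    """
    tp = fp = fn = tn = 0
    for count, (y_true, y_pred) in enumerate(pairs, start=1):
        # Update the confusion-matrix counters for the current record.
        if y_true == 1 and y_pred == 1:
            tp += 1
        elif y_true == 0 and y_pred == 1:
            fp += 1
        elif y_true == 1 and y_pred == 0:
            fn += 1
        else:
            tn += 1

        if count % window == 0:
            # Guard against division by zero when a class is absent.
            precision = tp / (tp + fp) if (tp + fp) else 0.0
            recall = tp / (tp + fn) if (tp + fn) else 0.0
            denom = precision + recall
            f1 = 2 * precision * recall / denom if denom else 0.0
            yield {
                "records_seen": count,
                "accuracy": (tp + tn) / window,
                "precision": precision,
                "recall": recall,
                "f1": f1,
            }
            tp = fp = fn = tn = 0  # start a fresh window


def synthetic_stream(n: int) -> Iterator[Tuple[int, int]]:
    """Hypothetical stand-in for a labelled prediction stream.

    Labels alternate 0/1; every 100th prediction is deliberately wrong.
    """
    for i in range(n):
        y_true = i % 2
        y_pred = 1 - y_true if i % 100 == 0 else y_true
        yield y_true, y_pred


results = list(windowed_metrics(synthetic_stream(4000), window=2000))
for report in results:
    print(report)
```

Because `windowed_metrics` only needs an iterable of label pairs, the same function could in principle be fed from a message consumer loop or a stream-processing operator instead of the synthetic generator; that wiring is left out here because the abstract does not describe it.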

Original language: English
Title of host publication: 2025 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1148-1153
Number of pages: 6
Edition: 2025
ISBN (Electronic): 9798331531195
Publication status: Published - 2025
Event: 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025 - Bengaluru, India
Duration: 06-01-2025 to 10-01-2025

Conference

Conference: 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025
Country/Territory: India
City: Bengaluru
Period: 06-01-25 to 10-01-25

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
