Abstract
Real-time threat detection in streaming data is crucial yet challenging due to varying data volumes and speeds. This paper presents an architecture designed to manage large-scale, high-speed data streams using deep learning and machine learning models. The system uses Apache Kafka for high-throughput data transfer and a publish-subscribe model to facilitate continuous threat detection. Various machine learning techniques, including XGBoost, Random Forest, and LightGBM, are evaluated to identify the best model for classification. Within this architecture, the ExtraTrees model achieves accuracy, precision, recall, and F1 score of 99% each on the SensorNetGuard dataset. The PyFlink framework, with its parallel processing capabilities, supports real-time training and adaptation of these models. The system calculates prediction metrics every 2,000 data points, ensuring efficient and accurate real-time threat detection.
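The periodic evaluation described above (metrics recomputed every 2,000 data points) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary labels (1 = threat, 0 = benign), leaves out Kafka and PyFlink entirely, and the names `WindowMetrics` and `windowed_metrics` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class WindowMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a window has no positives.
    return num / den if den else 0.0


def windowed_metrics(
    pairs: Iterable[Tuple[int, int]], window: int = 2000
) -> Iterator[WindowMetrics]:
    """Yield metrics once per `window` (y_true, y_pred) pairs.

    Assumption: labels are binary, with 1 meaning "threat".
    A trailing partial window is not reported.
    """
    tp = fp = tn = fn = 0
    seen = 0
    for y_true, y_pred in pairs:
        if y_pred == 1 and y_true == 1:
            tp += 1
        elif y_pred == 1:
            fp += 1
        elif y_true == 1:
            fn += 1
        else:
            tn += 1
        seen += 1
        if seen == window:
            precision = _safe_div(tp, tp + fp)
            recall = _safe_div(tp, tp + fn)
            yield WindowMetrics(
                accuracy=(tp + tn) / window,
                precision=precision,
                recall=recall,
                f1=_safe_div(2 * precision * recall, precision + recall),
            )
            # Reset the counters so each window is scored independently.
            tp = fp = tn = fn = 0
            seen = 0
```

In a streaming deployment, `pairs` would be fed by whatever consumer reads labeled predictions from the message broker; here it is any Python iterable, which keeps the sketch runnable on its own.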
| Original language | English |
|---|---|
| Title of host publication | 2025 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1148-1153 |
| Number of pages | 6 |
| Edition | 2025 |
| ISBN (Electronic) | 9798331531195 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025 - Bengaluru, India; Duration: 06-01-2025 → 10-01-2025 |
Conference
| Conference | 17th International Conference on COMmunication Systems and NETworkS, COMSNETS 2025 |
|---|---|
| Country/Territory | India |
| City | Bengaluru |
| Period | 06-01-25 → 10-01-25 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Networks and Communications
- Hardware and Architecture
- Information Systems
- Information Systems and Management
- Safety, Risk, Reliability and Quality