Abstract
The rapid growth of malware propagation across a wide range of file formats presents a significant challenge for existing detection systems, which often rely on either format-specific rules or homogeneous data representations. These methods fail to cope with obfuscation, encryption, or structural changes and are thus of limited use in large forensic and security processes that interact with varied and changing sources of data. To address these issues, this paper proposes a cross-modal deep-learning framework, named X-MET (Cross-Modal Entropy-based Technique), for static malware detection in a format-independent manner. X-MET combines image-based features with multi-order Rényi entropy values to classify malware consistently across diverse file formats. From an initial corpus of 408,000 samples, 100,253 representative samples across 11 file formats were obtained using stratified sampling and selective augmentation. The framework employs a dual-stream convolutional neural network that processes six-channel inputs constructed from grayscale images, Shannon entropy, multi-order Rényi entropy, and local entropy variance. These features are then adaptively combined by a cross-modal attention mechanism. On unseen test data, X-MET achieved an F1 score of 0.9617 and an area under the receiver operating characteristic curve (AUROC) of 0.9847. Generalization was further validated with repeated stratified cross-validation, which yielded a mean F1 score of 0.9593 and the best overall recall (98.22%) among all benchmarked models. Being reliable and free of format-dependent preprocessing, X-MET is scalable and practical for large-scale malware detection in diverse environments.
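The entropy channels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the window size, the Rényi orders (0.5 and 2.0), and all function names are assumptions chosen for illustration, and the image channel, local entropy variance, and attention mechanism are left out. The sketch uses the standard definitions, Shannon entropy H = -Σ p_i log2 p_i and Rényi entropy of order α ≠ 1, H_α = log2(Σ p_i^α) / (1 - α), computed over the byte-value distribution of each window.

```python
import math
from collections import Counter


def byte_probabilities(data: bytes) -> list[float]:
    """Empirical probability of each byte value that occurs in data."""
    counts = Counter(data)
    total = len(data)
    return [count / total for count in counts.values()]


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    return -sum(p * math.log2(p) for p in byte_probabilities(data))


def renyi_entropy(data: bytes, alpha: float) -> float:
    """Rényi entropy of order alpha in bits per byte.

    Falls back to Shannon entropy at alpha == 1, which is the limit
    of the Rényi formula as alpha approaches 1.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if not data:
        return 0.0
    if math.isclose(alpha, 1.0):
        return shannon_entropy(data)
    power_sum = sum(p ** alpha for p in byte_probabilities(data))
    return math.log2(power_sum) / (1.0 - alpha)


def entropy_features(data: bytes, window: int = 256,
                     orders: tuple[float, ...] = (0.5, 2.0)) -> list[list[float]]:
    """One row per non-overlapping window: [Shannon, Rényi(order_1), ...].

    The window size and orders are hypothetical defaults, not values
    taken from the paper.
    """
    rows = []
    for start in range(0, len(data), window):
        chunk = data[start:start + window]
        rows.append([shannon_entropy(chunk)]
                    + [renyi_entropy(chunk, alpha) for alpha in orders])
    return rows
```

Because the features depend only on raw byte statistics, the same routine applies to any file format; a window with a uniform byte distribution scores 8 bits under every order, while a constant window scores 0.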
| Original language | English |
|---|---|
| Pages (from-to) | 27972-27992 |
| Number of pages | 21 |
| Journal | IEEE Access |
| Volume | 14 |
| DOIs | |
| Publication status | Accepted/In press - 2026 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Fingerprint
Dive into the research topics of 'X-MET: Multimodal Entropy–Visual Deep Learning Technique for Strengthening Cyber Governance and Resilient Digital Infrastructure'. Together they form a unique fingerprint.