Abstract
With the exponential growth of digital data across many domains, multimedia retrieval has emerged as a critical research area. Although feature extraction, indexing methods, and deep learning-based retrieval models have advanced considerably, persistent challenges still hamper the creation of efficient and scalable multimedia retrieval systems. This review systematically analyzes recent research in multimedia retrieval, highlighting key methodologies, findings, and limitations across studies. The major gaps identified include the semantic gap between low-level features and high-level semantics, the difficulty of handling large datasets, privacy and security concerns, and the limited explainability of deep learning models. In addition, noisy and imbalanced data, multimodal data sources, and the lack of standardized benchmarking frameworks further limit the performance of existing systems. This paper presents these gaps comprehensively and proposes future research directions to bridge them, guiding the development of more robust, scalable, and user-centered multimedia retrieval systems. To address persistent challenges in versatility, semantic alignment, and trustworthiness, we introduce a unified framework that integrates multimodal semantic fusion, explainable AI, privacy-preserving learning, and continual adaptation for robust and future-ready multimedia retrieval.
| Original language | English |
|---|---|
| Pages (from-to) | 143688-143712 |
| Number of pages | 25 |
| Journal | IEEE Access |
| Volume | 13 |
| Publication status | Published - 2025 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Title: A Comprehensive Review of Recent Advances in Multimodal Multimedia Indexing and Retrieval