TY - GEN
T1 - A Deep Learning Approach Using Vision Transformer (ViT) for Alzheimer Detection in MRI Images
AU - Chandan, Dhriti
AU - Rashmi, R.
AU - Pathan, Sumaiya
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Accurate detection of Alzheimer's Disease (AD) through MRI is integral to early diagnosis and intervention. This paper offers a fresh perspective on Alzheimer's detection using Vision Transformers (ViTs) for brain MRI images. The study uses an Alzheimer MRI Disease Classification dataset, which categorizes MRI images into four stages: Mild Demented, Moderate Demented, Non-Demented, and Very Mild Demented. We fine-tune Google's Vision Transformer model, vit-base-patch16-224-in21k, to improve classification accuracy. Compared to Convolutional Neural Networks (CNNs), computational efficiency and classification accuracy are enhanced by utilizing the ViT's capability to handle image patches directly. The MRI images are preprocessed into RGB format and converted into tensors for input to the model. The results reveal that the Vision Transformer model achieves a classification accuracy of 95.55%. These results can serve as a benchmark for upcoming research in AD detection and demonstrate the effectiveness of the Vision Transformer (ViT) in the medical field. This study emphasizes the capability of ViTs to improve the accuracy of AD detection and highlights the importance of further research to optimize this model.
AB - Accurate detection of Alzheimer's Disease (AD) through MRI is integral to early diagnosis and intervention. This paper offers a fresh perspective on Alzheimer's detection using Vision Transformers (ViTs) for brain MRI images. The study uses an Alzheimer MRI Disease Classification dataset, which categorizes MRI images into four stages: Mild Demented, Moderate Demented, Non-Demented, and Very Mild Demented. We fine-tune Google's Vision Transformer model, vit-base-patch16-224-in21k, to improve classification accuracy. Compared to Convolutional Neural Networks (CNNs), computational efficiency and classification accuracy are enhanced by utilizing the ViT's capability to handle image patches directly. The MRI images are preprocessed into RGB format and converted into tensors for input to the model. The results reveal that the Vision Transformer model achieves a classification accuracy of 95.55%. These results can serve as a benchmark for upcoming research in AD detection and demonstrate the effectiveness of the Vision Transformer (ViT) in the medical field. This study emphasizes the capability of ViTs to improve the accuracy of AD detection and highlights the importance of further research to optimize this model.
UR - https://www.scopus.com/pages/publications/105010205916
UR - https://www.scopus.com/pages/publications/105010205916#tab=citedBy
U2 - 10.1109/INCIP64058.2025.11020410
DO - 10.1109/INCIP64058.2025.11020410
M3 - Conference contribution
AN - SCOPUS:105010205916
T3 - Proceedings - International Conference on Next Generation Communication and Information Processing, INCIP 2025
SP - 931
EP - 936
BT - Proceedings - International Conference on Next Generation Communication and Information Processing, INCIP 2025
A2 - Bukya, Mahipal
A2 - Kumar, Pramod
A2 - Rawat, Sanyog
A2 - Jangid, Mahesh
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 International Conference on Next Generation Communication and Information Processing, INCIP 2025
Y2 - 23 January 2025 through 24 January 2025
ER -