COMPARATIVE ANALYSIS OF VISION TRANSFORMERS AND CONVOLUTIONAL NEURAL NETWORKS IN DIABETIC RETINOPATHY DIAGNOSIS
Abstract
Diabetic retinopathy can cause serious visual complications and substantially reduces patients' quality of life. This study compares the performance of Vision Transformer (ViT) models and convolutional neural network (CNN) models in diagnosing diabetic retinopathy and evaluates their potential as an alternative to traditional diagnostic methods. Four ViT architectures and four CNN architectures were trained and tested, and their performance was analyzed comparatively. The ViT "tiny," "base," "small," and "large" models achieved accuracy rates of 97.83%, 98.41%, 95.2%, and 98.26%, respectively. The CNN models based on VGG13, ResNet18, ResNet50, and SqueezeNet achieved accuracy rates of 96.1%, 97.83%, 90.9%, and 93.93%, respectively. The best ViT model (base, 98.41%) outperformed the best CNN model (ResNet18, 97.83%), and three of the four ViT models matched or exceeded every CNN model. On this basis, it was concluded that ViT methods were more successful in the diagnosis of diabetic retinopathy.
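The comparison reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary names and the `summarize` helper are assumptions introduced here, not part of the study, and the only data used are the accuracy figures quoted in the abstract.

```python
# Accuracy rates (%) exactly as quoted in the abstract.
# The variable names and the helper below are hypothetical, for illustration.
vit_accuracy = {"tiny": 97.83, "base": 98.41, "small": 95.2, "large": 98.26}
cnn_accuracy = {"VGG13": 96.1, "ResNet18": 97.83, "ResNet50": 90.9, "SqueezeNet": 93.93}


def summarize(scores):
    """Return (best model name, best accuracy, mean accuracy) for one family."""
    best = max(scores, key=scores.get)
    mean = sum(scores.values()) / len(scores)
    return best, scores[best], mean


vit_summary = summarize(vit_accuracy)
cnn_summary = summarize(cnn_accuracy)

for family, (best, top, mean) in (("ViT", vit_summary), ("CNN", cnn_summary)):
    print(f"{family}: best = {best} ({top:.2f}%), mean = {mean:.2f}%")
```

A per-family best and mean is only one way to summarize eight numbers; it does not replace the per-class metrics or statistical testing that a full evaluation would report.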
Keywords
Details
Primary Language
English
Subjects
Software Engineering (Other)
Section
Research Article
Publication Date
June 3, 2025
Submission Date
July 24, 2024
Acceptance Date
January 23, 2025
Published in Issue
Year 2025 Volume: 28 Issue: 2
APA
Yüzgeç Özdemir, E., Koç, C., & Özyurt, F. (2025). COMPARATIVE ANALYSIS OF VISION TRANSFORMERS AND CONVOLUTIONAL NEURAL NETWORKS IN DIABETIC RETINOPATHY DIAGNOSIS. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 28(2), 592-600. https://doi.org/10.17780/ksujes.1521858