Research Article

COMPARATIVE ANALYSIS OF VISION TRANSFORMERS AND CONVOLUTIONAL NEURAL NETWORKS IN DIABETIC RETINOPATHY DIAGNOSIS

Year 2025, Volume: 28 Issue: 2, 592 - 600, 03.06.2025
https://doi.org/10.17780/ksujes.1521858

Abstract

Diabetic retinopathy is a disease that can lead to serious visual complications and significantly affects individuals' quality of life. This study emphasizes the importance of diagnosing diabetic retinopathy at an early stage, draws attention to the limitations of current diagnostic methods, and examines the potential of Vision Transformer (ViT) models as an alternative to traditional methods. The training and test performance of four ViT architectures and four convolutional neural network (CNN) models was comparatively analyzed. The ViT models "tiny", "base", "small", and "large" achieved accuracies of 97.83%, 98.41%, 95.2%, and 98.26%, respectively. Models trained with the CNN architectures VGG13, ResNet18, ResNet50, and SqueezeNet achieved accuracies of 96.1%, 97.83%, 90.9%, and 93.93%, respectively. Overall, the ViT architectures reached higher accuracy than the CNN architectures. Based on these results, it was concluded that ViT methods are more successful in diagnosing diabetic retinopathy.

References

  • Alhawas, N., & Tüfekçi, Z. (2022). The Identification of Red-Meat Types using The Fine-Tuned Vision Transformer and MobileNet Models. European Journal of Science and Technology. https://doi.org/10.31590/ejosat.1112892
  • Beyer, L., Zhai, X., & Kolesnikov, A. I. (2022). Better plain ViT baselines for ImageNet-1k. arXiv preprint arXiv:2205.01580. https://doi.org/10.48550/arxiv.2205.01580
  • Chen, J., He, Y., Frey, E. C., Li, Y., & Du, Y. (2021). ViT-V-Net: Vision Transformer for unsupervised Volumetric Medical Image Registration. arXiv.org. https://arxiv.org/abs/2104.06468
  • Chintamreddy, D., & Seshasayee, U. R. (2024, June). Detection of Diabetic Retinopathy (DR) Severity from Fundus Photographs using Conv-ViT. In 2024 International Conference on Advancements in Power, Communication and Intelligent Systems (APCI) (pp. 1-6). IEEE.
  • Darabi, P. K. (n.d.). Competitions Contributor. Kaggle. https://www.kaggle.com/pkdarabi/competitions. Accessed [24.07.2024].
  • Dosovitskiy, A., et al. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv.org. https://arxiv.org/abs/2010.11929
  • Fang, L., & Qiao, H. (2022). Diabetic retinopathy classification using a novel DAG network based on multi-feature of fundus images. Biomedical Signal Processing and Control, 77, 103810. https://doi.org/10.1016/j.bspc.2022.103810
  • Huang, Y.-H., et al. (2023). Model long-range dependencies for multi-modality and multi-view retinopathy diagnosis through transformers. Knowledge-Based Systems, 271, 110544. https://doi.org/10.1016/j.knosys.2023.110544
  • Karthika, S., & Durgadevi, M. (2024). Improved ResNet_101 assisted attentional global transformer network for automated detection and classification of diabetic retinopathy disease. Biomedical Signal Processing and Control, 88, 105674. https://doi.org/10.1016/j.bspc.2023.105674
  • Lian, J., & Li, T. (2024). Lesion identification in fundus images via convolutional neural network-vision transformer. Biomedical Signal Processing and Control, 88, 105607. https://doi.org/10.1016/j.bspc.2023.105607
  • Manzari, O. N., Ahmadabadi, H., Kashiani, H., Shokouhi, S. B., & Ayatollahi, A. (2023). MedViT: A robust vision transformer for generalized medical image classification. Computers in Biology and Medicine, 157, 106791. https://doi.org/10.1016/j.compbiomed.2023.106791
  • Özçelik, Y. B., & Altan, A. (2021). Diyabetik Retinopati Teşhisi için Fundus Görüntülerinin Derin Öğrenme Tabanlı Sınıflandırılması. European Journal of Science and Technology. December 2021. https://doi.org/10.31590/ejosat.1011806
  • Özçelik, Y. B., & Altan, A. (2023). Overcoming nonlinear dynamics in diabetic retinopathy classification: a robust AI-based model with chaotic swarm intelligence optimization and recurrent long short-term memory. Fractal and Fractional, 7(8), 598.
  • Patil, M. S., Chickerur, S., Abhimalya, C., Naik, A., Kumari, N., & Maurya, S. K. (2023). Effective deep learning data augmentation techniques for diabetic retinopathy classification. Procedia Computer Science, 218, 1156-1165. https://doi.org/10.1016/j.procs.2023.01.094
  • Powers, D. M. (2020). Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061.
  • Rahmanlar, H., Atılgan, C. Ü., Çıtırık, M., Yaradilmiş, İ. M., & Gürsöz, H. (2019). Türkiye’de diyabetik retinopati tanısında endikasyon dışı ilaç kullanımı. Sakarya Medical Journal, 9(3), 499-505. https://doi.org/10.31832/smj.543998
  • Sunkari, S., et al. (2024). A refined ResNet18 architecture with Swish activation function for Diabetic Retinopathy classification. Biomedical Signal Processing and Control, 88, 105630. https://doi.org/10.1016/j.bspc.2023.105630
  • Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jégou, H. (2021). Training data-efficient image transformers & distillation through attention. arXiv preprint arXiv:2012.12877.
  • Uğurlu, N., Taşlıpınar, A. G., Yülek, F., Özdemir, D., Ersoy, R., & Çakır, B. (2018). Evaluation of Retinal Microvascular Structures in Type 1 Diabetic Patients without Diabetic Retinopathy. Ankara Medical Journal. December 2018. https://doi.org/10.17098/amj.501136
  • Wang, Z., Dong, N., & Voiculescu, I. (2022). Computationally-Efficient Vision transformer for medical image semantic segmentation via dual Pseudo-Label supervision. 2022 IEEE International Conference on Image Processing (ICIP). https://doi.org/10.1109/icip46576.2022.9897482
  • Wu, J., Hu, R., Xiao, Z., Chen, J., & Liu, J. (2021). Vision Transformer‐based recognition of diabetic retinopathy grade. Medical Physics, 48(12), 7850-7863.
  • Wu, K., et al. (2023). TinyCLIP: CLIP distillation via affinity mimicking and weight inheritance. arXiv.org. https://arxiv.org/abs/2309.12314
  • Zhou, D., et al. (2021). DeepViT: Towards Deeper Vision Transformer. arXiv.org. https://arxiv.org/abs/2103.11886

COMPARATIVE ANALYSIS OF VISION TRANSFORMERS AND CONVOLUTIONAL NEURAL NETWORKS IN DIABETIC RETINOPATHY DIAGNOSIS

Abstract

Diabetic retinopathy can lead to serious visual complications and significantly affects individuals' quality of life. This study compares the performance of Vision Transformer (ViT) models and convolutional neural network (CNN) models in diabetic retinopathy diagnosis and evaluates their potential as an alternative to traditional diagnostic methods. The performance of four ViT architectures and four CNN architectures in the training and testing phases was comparatively analyzed. The ViT models achieved accuracies of 97.83%, 98.41%, 95.2%, and 98.26% for "tiny," "base," "small," and "large," respectively. The CNN models trained with the VGG13, ResNet18, ResNet50, and SqueezeNet architectures achieved accuracies of 96.1%, 97.83%, 90.9%, and 93.93%, respectively. Overall, the ViT architectures achieved higher accuracy than the CNN architectures: the best ViT model ("base," 98.41%) exceeded the best CNN model (ResNet18, 97.83%). Based on these results, it was concluded that ViT methods were more successful in the diagnosis of diabetic retinopathy.
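
The comparison between the two model families can be checked arithmetically from the accuracies stated in the abstract. The following is a minimal sketch that uses only those eight figures; the dictionary and function names are illustrative and do not come from the paper, and the per-family mean is a derived summary rather than a result the authors report.

```python
# Accuracies (%) exactly as stated in the abstract.
# All identifiers below are illustrative, not taken from the paper.
vit_accuracy = {"tiny": 97.83, "base": 98.41, "small": 95.2, "large": 98.26}
cnn_accuracy = {"VGG13": 96.1, "ResNet18": 97.83, "ResNet50": 90.9, "SqueezeNet": 93.93}


def summarize(scores):
    """Return (best model name, best accuracy, mean accuracy) for one family."""
    best = max(scores, key=scores.get)
    mean = sum(scores.values()) / len(scores)
    return best, scores[best], mean


vit_best, vit_top, vit_mean = summarize(vit_accuracy)
cnn_best, cnn_top, cnn_mean = summarize(cnn_accuracy)

# How many ViT variants match or exceed the strongest CNN?
vit_at_or_above_cnn_top = sum(acc >= cnn_top for acc in vit_accuracy.values())

print(f"ViT best: {vit_best} ({vit_top}%), mean {vit_mean:.3f}%")
print(f"CNN best: {cnn_best} ({cnn_top}%), mean {cnn_mean:.3f}%")
print(f"ViT variants at or above best CNN: {vit_at_or_above_cnn_top} of {len(vit_accuracy)}")
```

Under these figures the "small" ViT variant falls below both VGG13 and ResNet18, so the family-level conclusion rests on the best and average results rather than on every pairwise comparison.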

There are 23 citations in total.

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Computer Engineering
Authors

Esra Yüzgeç Özdemir 0000-0003-2914-2603

Canan Koç 0000-0002-2651-9471

Fatih Özyurt 0000-0002-8154-6691

Publication Date June 3, 2025
Submission Date July 24, 2024
Acceptance Date January 23, 2025
Published in Issue Year 2025, Volume: 28 Issue: 2

Cite

APA Yüzgeç Özdemir, E., Koç, C., & Özyurt, F. (2025). COMPARATIVE ANALYSIS OF VISION TRANSFORMERS AND CONVOLUTIONAL NEURAL NETWORKS IN DIABETIC RETINOPATHY DIAGNOSIS. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 28(2), 592-600. https://doi.org/10.17780/ksujes.1521858