HAYCAM VS EIGENCAM FOR WEAKLY-SUPERVISED OBJECT DETECTION ACROSS VARYING SCALES
Year 2024,
Volume: 27 Issue: 3, 1078 - 1088, 03.09.2024
Ahmet Ornek, Murat Ceylan
Abstract
Class Activation Maps, one of the Explainable Artificial Intelligence approaches, reveal the regions of an input image that influence a classification; in other words, they show which part of the image the classifier looks at when making a decision. In this study, a 200-class classification model was trained on the open-source CUB-200-2011 dataset, and the classification results were visualized with the EigenCAM and HayCAM methods. When object detection performance is compared on the basis of the regions that influence classification, EigenCAM reaches an IoU (Intersection over Union) of 30.88%, while HayCAM reaches 41.95%. These results indicate that the outputs obtained with Principal Component Analysis (HayCAM) are better than those obtained with Singular Value Decomposition (EigenCAM).
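The IoU comparison described in the abstract can be illustrated with a minimal sketch. It assumes the activation map has already been computed and is treated as a small 2D grid, that a box is obtained by thresholding the map at a fraction of its peak value, and that boxes use inclusive pixel indices. The function names `cam_to_box` and `iou`, the 0.5 threshold, and the toy values are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) with inclusive pixel indices (an assumption).
Box = Tuple[int, int, int, int]


def cam_to_box(cam: List[List[float]], rel_threshold: float = 0.5) -> Optional[Box]:
    """Tight box around all cells whose value is >= rel_threshold * max(cam).

    Returns None when the map has no positive activation.
    """
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None
    cutoff = rel_threshold * peak
    xs = [x for row in cam for x, value in enumerate(row) if value >= cutoff]
    ys = [y for y, row in enumerate(cam) for value in row if value >= cutoff]
    return min(xs), min(ys), max(xs), max(ys)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two boxes with inclusive pixel indices."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    inter = max(inter_w, 0) * max(inter_h, 0)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


# Toy 4x4 activation map and a hypothetical ground-truth box.
example_cam = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
ground_truth: Box = (1, 1, 3, 2)

predicted = cam_to_box(example_cam)          # (1, 1, 2, 2)
if predicted is not None:
    print(predicted, round(iou(predicted, ground_truth), 4))  # (1, 1, 2, 2) 0.6667
```

In this toy case the predicted box covers 4 cells, the ground-truth box covers 6, and they share 4, so the IoU is 4 / 6. A dataset-level score such as the percentages reported above would typically be an average of per-image values, although the exact aggregation used in the study is not stated here.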
Ethical Statement
The paper reflects the authors' own research and analysis in a truthful and complete manner.
Supporting Institution
Huawei Türkiye R&D Center
Acknowledgments
Huawei Türkiye R&D Center
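FARKLI ÖLÇEKLERDE ZAYIF DENETİMLİ NESNE TESPİTİ İÇİN HAYCAM VE EIGEN KARŞILAŞTIRILMASI (Turkish title: Comparison of HayCAM and Eigen for Weakly Supervised Object Detection at Different Scales)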