ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION
Abstract
Phishing attacks, which aim to steal sensitive information by redirecting users to fraudulent websites, are a major cybersecurity threat. Traditional blacklist- and rule-based methods are often insufficient, especially against zero-day attacks. This study presents a systematic benchmark framework for phishing URL detection that integrates hybrid feature selection, Bayesian hyperparameter optimization, model comparison, ensemble learning, explainability analysis, and additional validation strategies. Experiments were conducted on a dataset of 88,647 URLs with 111 features, evaluating 7 hybrid feature selection methods, 3 Bayesian optimization techniques, and 252 optimized model configurations, complemented by deep learning baselines. The findings show that hybrid feature selection and optimization provide a clear contribution, particularly for tree-based models. The LightGBM model optimized on the L1+Boruta feature set with Scikit-Optimize produced the strongest single-model performance (Test Accuracy: 97.28%, F1: 95.99%, AUC: 99.54%). The best ensemble (CAT+XGB+LGBM, Soft Voting) reached 97.34% accuracy, 96.17% F1, and 99.60% AUC, a level of performance comparable to the best single model. Deep learning models yielded lower performance than the tree-based models, while additional validation experiments demonstrated that the ensemble structure provided a consistent improvement over the baseline across different data splits.
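The soft-voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each base model outputs a phishing probability per URL and that the ensemble averages those probabilities with equal weights before thresholding at 0.5. The probability values, the threshold, and the function names below are hypothetical and are not taken from the study; the sketch uses only the Python standard library and is not the authors' implementation.

```python
from statistics import fmean


def soft_vote(prob_rows):
    """Average per-model phishing probabilities for each URL.

    prob_rows holds one list per model; each list has P(phishing) per URL.
    """
    n_urls = len(prob_rows[0])
    if any(len(row) != n_urls for row in prob_rows):
        raise ValueError("every model must score the same URLs")
    # zip(*prob_rows) groups the scores that all models gave to one URL.
    return [fmean(column) for column in zip(*prob_rows)]


def predict(avg_probs, threshold=0.5):
    """Label a URL as phishing (1) when its averaged probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in avg_probs]


def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for the positive (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical scores for four URLs from three base models (illustrative only).
cat_probs = [0.90, 0.20, 0.60, 0.10]
xgb_probs = [0.80, 0.40, 0.30, 0.20]
lgbm_probs = [0.70, 0.30, 0.70, 0.00]
y_true = [1, 0, 1, 0]

avg = soft_vote([cat_probs, xgb_probs, lgbm_probs])
labels = predict(avg)
print(labels, accuracy_and_f1(y_true, labels))
```

In this toy data the second model alone would miss the third URL (score 0.30), but the averaged probability stays above the threshold, which is the kind of effect soft voting is meant to provide.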
Supporting Institution
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethical Statement
This study does not require ethics committee approval because it uses a publicly available dataset and involves no human, animal, or personal data collection.
Acknowledgments
The authors thank Vrbančič et al. for making publicly available the phishing URL dataset that made this research possible.
Details
Primary Language
English
Subjects
System and Network Security
Section
Research Article
Publication Date
June 3, 2026
Submission Date
November 28, 2025
Acceptance Date
April 24, 2026
Published in Issue
Year 2026 Volume: 29 Issue: 2
APA
Berkil, H., & Batur Dinler, Ö. (2026). ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 29(2), 678-698. https://izlik.org/JA44UW96BD