Research Article

ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION

Volume: 29 Number: 2 June 3, 2026

Abstract

Phishing attacks are a major cybersecurity threat: they aim to steal sensitive information by redirecting users to fraudulent websites. Traditional blacklist- and rule-based methods are often insufficient, especially against zero-day attacks. This study presents a systematic benchmark framework for phishing URL detection that integrates hybrid feature selection, Bayesian hyperparameter optimization, model comparison, ensemble learning, explainability analysis, and additional validation strategies. Experiments were conducted on a dataset of 88,647 URLs with 111 features, evaluating 7 hybrid feature selection methods, 3 Bayesian optimization techniques, and 252 optimized model configurations, complemented by deep learning baselines. The findings show that hybrid feature selection and optimization make a clear contribution, particularly for tree-based models. The LightGBM model optimized on the L1+Boruta feature set with Scikit-Optimize produced the strongest single-model performance (Test Accuracy: 97.28%, F1: 95.99%, AUC: 99.54%). The best ensemble (CAT+XGB+LGBM, Soft Voting) reached 97.34% accuracy, 96.17% F1, and 99.60% AUC, a level comparable to the best single model. The deep learning baselines performed worse than the tree-based models, and additional validation experiments showed that the ensemble structure consistently improved on the baseline across different data splits.
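As a rough illustration of what soft voting means for an ensemble such as CAT+XGB+LGBM, the sketch below averages per-model class probabilities and picks the class with the highest mean. It is not the authors' implementation: the function name `soft_vote`, the equal default weights, and the probability values are hypothetical, and the trained models are replaced by hard-coded outputs.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Average class-probability vectors from several models.

    probas[m][c] is the probability model m assigns to class c.
    Returns (predicted class index, averaged probability vector).
    """
    if not probas:
        raise ValueError("need at least one model's probabilities")
    n_classes = len(probas[0])
    if any(len(p) != n_classes for p in probas):
        raise ValueError("all models must score the same classes")
    if weights is None:
        weights = [1.0] * len(probas)  # assumption: equal weights
    if len(weights) != len(probas):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    # Weighted mean of each class probability across models.
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    # The predicted class is the one with the highest averaged probability.
    predicted = max(range(n_classes), key=avg.__getitem__)
    return predicted, avg


# Hypothetical outputs for one URL, ordered [P(legitimate), P(phishing)].
catboost_p = [0.40, 0.60]
xgboost_p = [0.55, 0.45]
lightgbm_p = [0.30, 0.70]

label, avg = soft_vote([catboost_p, xgboost_p, lightgbm_p])
print(label, avg)
```

Unlike hard (majority) voting, this lets a confident model outweigh two uncertain ones. In scikit-learn, `VotingClassifier` with `voting="soft"` performs the same kind of averaging over fitted estimators.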

Keywords

Supporting Institution

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Ethical Statement

This study does not require ethical approval as it uses a publicly available dataset and does not involve human subjects, animals, or personal data collection.

Acknowledgments

The authors would like to thank Vrbančič et al. for making the phishing URL dataset publicly available, which made this research possible.

Details

Primary Language

English

Subjects

System and Network Security

Journal Section

Research Article

Publication Date

June 3, 2026

Submission Date

November 28, 2025

Acceptance Date

April 24, 2026

Published in Issue

Year 2026 Volume: 29 Number: 2

APA
Berkil, H., & Batur Dinler, Ö. (2026). ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 29(2), 678-698. https://izlik.org/JA44UW96BD
AMA
1. Berkil H, Batur Dinler Ö. ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. KSU J. Eng. Sci. 2026;29(2):678-698. https://izlik.org/JA44UW96BD
Chicago
Berkil, Hacer, and Özlem Batur Dinler. 2026. “ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION”. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi 29 (2): 678-98. https://izlik.org/JA44UW96BD.
EndNote
Berkil H, Batur Dinler Ö (June 1, 2026) ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi 29 2 678–698.
IEEE
[1] H. Berkil and Ö. Batur Dinler, “ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION”, KSU J. Eng. Sci., vol. 29, no. 2, pp. 678–698, June 2026, [Online]. Available: https://izlik.org/JA44UW96BD
ISNAD
Berkil, Hacer - Batur Dinler, Özlem. “ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION”. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi 29/2 (June 1, 2026): 678-698. https://izlik.org/JA44UW96BD.
JAMA
1. Berkil H, Batur Dinler Ö. ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. KSU J. Eng. Sci. 2026;29:678–698.
MLA
Berkil, Hacer, and Özlem Batur Dinler. “ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION”. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, vol. 29, no. 2, June 2026, pp. 678-98, https://izlik.org/JA44UW96BD.
Vancouver
1. Hacer Berkil, Özlem Batur Dinler. ADVANCED HYBRID FEATURE SELECTION AND BAYESIAN OPTIMIZATION FOR PHISHING URL DETECTION. KSU J. Eng. Sci. [Internet]. 2026 Jun. 1;29(2):678-98. Available from: https://izlik.org/JA44UW96BD

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).