Stacking Large Language Models is All You Need: A Case Study on Phishing URL Detection
Published Online: Jul 11, 2025
Page range: 337 - 356
Received: Mar 16, 2025
Accepted: Jun 12, 2025
DOI: https://doi.org/10.2478/jaiscr-2025-0017
© 2025 Hawraa Nasser et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Prompt-engineered Large Language Models (LLMs) have gained widespread adoption across various applications due to their ability to perform complex tasks without requiring additional training. Despite their impressive performance, there is considerable scope for improvement, particularly in addressing the limitations of individual models. One promising avenue is the use of ensemble learning strategies, which combine the strengths of multiple models to enhance overall performance. In this study, we investigate the effectiveness of stacking ensemble techniques for chat-based LLMs in text classification tasks, with a focus on phishing URL detection. Notably, we introduce and evaluate three stacking methods: (1)
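To make the general idea concrete, the sketch below illustrates a plain stacking ensemble in which several prompted base LLMs each score a URL, and a meta-classifier is trained on those scores. This is only a minimal illustration of the stacking concept, not the paper's three methods: the model names, the query_llm helper, and the logistic-regression meta-learner are illustrative assumptions.

```python
# Minimal sketch of a stacking ensemble over prompted LLMs for phishing URL
# detection. query_llm() is a hypothetical stand-in for whatever chat-based
# LLM API is available; the base model names and the logistic-regression
# meta-learner are assumptions, not the paper's exact configuration.
from typing import List
import numpy as np
from sklearn.linear_model import LogisticRegression

BASE_MODELS = ["model_a", "model_b", "model_c"]  # placeholder base LLMs


def query_llm(model: str, prompt: str) -> float:
    """Hypothetical helper: return the model's phishing probability in [0, 1]."""
    raise NotImplementedError("Replace with a call to an actual LLM API.")


def base_predictions(urls: List[str]) -> np.ndarray:
    """Level-0 features: one phishing score per base LLM for each URL."""
    prompt = "Classify this URL as phishing or legitimate: {url}"
    return np.array([
        [query_llm(m, prompt.format(url=u)) for m in BASE_MODELS]
        for u in urls
    ])


def train_stack(train_urls: List[str], labels: List[int]) -> LogisticRegression:
    """Level-1 meta-learner trained on the base LLMs' outputs."""
    meta = LogisticRegression()
    meta.fit(base_predictions(train_urls), labels)
    return meta


def predict(meta: LogisticRegression, urls: List[str]) -> np.ndarray:
    """Final labels: 1 = phishing, 0 = legitimate."""
    return meta.predict(base_predictions(urls))
```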