Li-Hua Li¹, Agus Cahyo Nugroho¹,²*, Yung-Cheng Chuang¹, Radius Tanone¹

¹ Department of Information Management, Chaoyang University of Technology, Taichung, Taiwan

² Department of Information System, Unika Soegijapranata, Semarang, Indonesia


 



ABSTRACT


Classifying short message service (SMS) spam is critical for identifying unauthorized and potentially harmful messages, especially given the increasing number of crimes associated with such communications. This study compares the effectiveness of large language models (LLMs) with traditional machine-learning techniques in SMS spam classification. The results demonstrate that LLMs outperform commonly used traditional methods, including Support Vector Machine (SVM), Decision Tree (DT), and Naïve Bayes (NB), which sets this research apart from prior work. To ensure robust evaluation, the study uses a comprehensive dataset of diverse SMS spam samples together with preprocessing techniques such as tokenization, case transformation, and English stopword filtering. Three LLMs (Phi-3.5 Classifier, H2O-Danube, and DistilBERT) were fine-tuned to optimize performance. In the experiments, the Phi-3.5 Classifier and H2O-Danube achieved identical results of 99% for accuracy, precision, recall, and F1-score. DistilBERT also performed exceptionally well, reaching 99% on each of these metrics. These results significantly surpass those obtained from the traditional machine-learning models, highlighting the superior accuracy of LLMs in spam classification. The findings have implications for integrating LLMs into sentiment analysis and spam detection systems, for establishing performance benchmarks for LLM-based SMS spam detection, and for improving both SMS communication security and the overall efficiency of spam mitigation strategies.


Keywords: Large language models, Sentiment analysis, Spam classification, SMS, Traditional machine learning.
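The preprocessing steps and evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `preprocess` and `binary_metrics`, the tiny stopword list, the `"spam"`/`"ham"` label strings, and the choice to score spam as the positive class are all assumptions made for illustration.

```python
import re

# Tiny illustrative English stopword list (an assumption; the study's
# actual stopword list is not given in the abstract).
STOPWORDS = {"a", "an", "the", "to", "is", "you", "your", "for", "and", "of", "in"}


def preprocess(message):
    """Case transformation, tokenization, and stopword filtering."""
    lowered = message.lower()                      # case transformation
    tokens = re.findall(r"[a-z0-9]+", lowered)     # simple regex tokenization
    return [t for t in tokens if t not in STOPWORDS]


def binary_metrics(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("WIN a FREE prize! Call 0800 now"))
    # Hypothetical labels, only to exercise the metric function.
    truth = ["spam", "ham", "spam", "ham"]
    preds = ["spam", "ham", "ham", "ham"]
    print(binary_metrics(truth, preds))
```

The same four metrics apply unchanged whether the predictions come from a traditional classifier or a fine-tuned LLM, which is what makes the comparison in the abstract possible.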





ARTICLE INFORMATION


Received: 2025-02-19
Revised: 2026-01-02
Accepted: 2026-02-10
Available Online: 2026-03-25


Cite this article:

Li, L.H., Nugroho, A.C., Chuang, Y.C., Tanone, R., 2026. Future SMS spam filtering: comparative fine-tuning of machine learning and LLMs. International Journal of Applied Science and Engineering, 23, 2025057. https://doi.org/10.6703/IJASE.202606_23(2).003

  Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.