International Journal of Applied Science and Engineering
Published by Chaoyang University of Technology

Mahesh Kumar Thota1*, Francis H Shajin2, P. Rajesh3

1 Research Scholar, Department of Computer Science Engineering, KL University, Guntur, India
2 Department of Electronics and Communication Engineering, Anna University, Chennai, India
3 Department of Electrical and Electronics Engineering, Anna University, Chennai, India



Recent advances in technology have increased the demand for both hardware and software applications. Alongside this technical growth, the software industry has seen a sharp rise in demand for software across many applications. For any software company, developing good-quality software and maintaining that quality for end users is among the most important tasks for growth, and software engineering plays a central role in achieving it. Software applications are developed through computer programming, in which code is written to perform a desired task. This code generally contains some faulty instances, and the resulting defects can lead to buggy software. In software engineering, software defect prediction is considered one of the most important tasks for maintaining software quality. Defect prediction results provide a list of defect-prone source code artefacts so that the quality assurance team can effectively allocate limited resources for validating software products by putting more effort into the defect-prone source code. As software projects grow larger, defect prediction techniques will play an important role in supporting developers and in shortening time to market while delivering more reliable software products. Finding and fixing defects is considered one of the most exhaustive and expensive parts of embedded software development. Because of complex infrastructure, scale, cost and time limitations, monitoring and ensuring quality is a major challenge, especially in automotive embedded systems. Nevertheless, achieving high product quality and reliability is mandatory. Hence, great importance is given to verification and validation (V&V). Software testing is an integral part of software V&V and focuses on ensuring correct functionality and long-term reliability of software systems.
At the same time, software testing requires as much effort, cost, infrastructure and expertise as development itself. These costs and efforts rise further in safety-critical software systems. Therefore, a good testing strategy is essential for any industry with high software development costs. In this work, we plan to develop an efficient approach to software defect prediction using soft-computing-based machine learning techniques, which help to optimize the features and learn them efficiently.

Keywords: Defect prediction, Soft computing, Verification, Validation.






Received: 2019-09-06
Revised: 2019-12-12
Accepted: 2020-07-03
Available Online: 2020-12-01

Cite this article:

Thota, M.K., Shajin, F.H., Rajesh, P. 2020. Survey on software defect prediction techniques. International Journal of Applied Science and Engineering, 17, 331–344.

  Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.