Mining Bug Report Repositories to Identify Significant Information for Software Bug Fixing

Bancha Luaphol, Jantima Polpinij, Manasawee Kaneampornpan

Abstract

Most studies of bug reports aim to automatically identify the information needed to fix software bugs. However, such studies typically focus on a single issue, whereas more complete and comprehensive bug fixing would be facilitated by assessing multiple issues concurrently. This study addresses that challenge by presenting a method that identifies severe bug reports in a bug report repository and assembles the reports related to each of them, in order to visualize the overall picture of a software problem domain. The proposed method is called “mining bug report repositories”. It applies two text mining techniques as its main mechanisms. First, classification is applied to identify severe bug reports (“bug severity classification”); then “threshold-based similarity analysis” is applied to assemble the bug reports related to each severe bug report. The datasets come from three open-source projects, namely SeaMonkey, Firefox, and Core:Layout, downloaded from Bugzilla. Finally, the best models from the proposed method are selected and compared with two baseline methods. For identifying severe bug reports with the classification technique, the results show that our method improved accuracy, F1, and AUC scores over the baseline by 11.39%, 11.63%, and 19%, respectively. For assembling related bug reports with the threshold-based similarity technique, our method improved precision and likelihood scores over the other baseline by 15.76% and 9.14%, respectively. These results demonstrate that the proposed method may help increase the chance of fixing bugs completely.
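As an illustration only, the threshold-based similarity step might be sketched as follows. The abstract does not specify the term weighting, the similarity measure, or the threshold value, so the TF-IDF weighting, the cosine similarity, the threshold of 0.3, and all function names below are assumptions made for this sketch rather than the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real pipeline would also
    # remove stop words and apply stemming).
    return text.lower().split()


def tf_idf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document,
    # using a smoothed inverse document frequency.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf}
        )
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_reports(severe_idx, docs, threshold=0.3):
    # Return indices of reports whose similarity to the severe report
    # meets the (hypothetical) threshold.
    vecs = tf_idf_vectors(docs)
    target = vecs[severe_idx]
    return [
        i
        for i, v in enumerate(vecs)
        if i != severe_idx and cosine(target, v) >= threshold
    ]


reports = [
    "browser crashes when opening pdf file",
    "crashes when opening pdf attachment in browser",
    "font rendering is blurry on linux",
]
related = related_reports(0, reports, threshold=0.3)
```

In this toy example the first report is treated as the severe one; raising the threshold makes the assembled group smaller and more precise, while lowering it admits more loosely related reports.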

DOI: 10.14416/j.asep.2021.03.005

Applied Science and Engineering Progress