Classification of Extracted Features from Software Requirements Specification Documents using Support Vector Machine Learning Technique

Authors

1. Sadiq Waziri, Abubakar Tafawa Balewa University, Student, Nigeria

Abstract

This paper presents the results of an experiment on the classification of features extracted from Software Requirements Specification (SRS) documents using several Machine Learning techniques. The primary focus was the linear Support Vector Machine (SVM), which was compared against three additional techniques: Decision Tree (DT), Naïve Bayes (NB), and K-Nearest Neighbors (KNN). In the experiment, features, which are fundamental building blocks of Software Product Lines [1], were classified as either optional or mandatory. This distinction captures both the variability and the commonality within a product family [1]. While previous research has explored similar classifications using diverse techniques, this study specifically identifies the most effective method for binary classification of features for feature modeling. The experiment was conducted on nine selected documents from the PURE dataset. The performance of each model was evaluated on accuracy, precision, recall (sensitivity), and F1-score. The findings indicate which classification technique is best suited to this task, which can support the development and management of software product lines.
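The four evaluation metrics named in the abstract can all be derived from the confusion-matrix counts of a binary classifier. The following is a minimal sketch in plain Python, not the study's own evaluation code: the function name `binary_metrics`, the choice of "mandatory" as the positive class, and the sample labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive="mandatory"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted/actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels, for illustration only (not results from the study).
y_true = ["mandatory", "mandatory", "optional", "optional", "mandatory", "optional"]
y_pred = ["mandatory", "optional", "optional", "mandatory", "mandatory", "optional"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Treating "mandatory" as the positive class is arbitrary here; swapping the `positive` argument to "optional" gives the metrics for the other class, and the two sets can differ when the classes are imbalanced.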

Keywords

Requirements; Feature; Feature Extraction; Feature Classification; Feature Modeling; Support Vector Machine

Conclusion

This study highlighted the potential of automating Software Product Line Engineering (SPLE) processes through feature extraction and classification. Automating these tasks can streamline SPLE, saving time and reducing errors. A key aspect of SPLE involves classifying features as mandatory (present in all products) or optional (included only in specific variants). The study demonstrated that SVM performed this binary classification effectively. The advantage of a simple classification model, like the one used in the study, lies in its efficiency and ease of interpretation. However, very complex product lines with numerous features and intricate relationships may require more advanced models.
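The mandatory/optional distinction defined above can be made concrete with a small sketch. Note that this is not the study's method: the study predicts the label from SRS text with a trained classifier, whereas the snippet below only illustrates the definition itself by labeling features from known product configurations. The function name `classify_features` and the product family are hypothetical.

```python
def classify_features(products):
    """Label a feature 'mandatory' if every product has it, else 'optional'."""
    if not products:
        raise ValueError("at least one product is required")
    feature_sets = [set(features) for features in products.values()]
    all_features = set().union(*feature_sets)
    # Features shared by every product in the family.
    common = set.intersection(*feature_sets)
    return {
        feature: "mandatory" if feature in common else "optional"
        for feature in sorted(all_features)
    }


# Hypothetical product family, for illustration only.
products = {
    "basic": ["login", "search"],
    "premium": ["login", "search", "export"],
    "mobile": ["login", "notifications"],
}
labels = classify_features(products)
print(labels)
```

In this example only "login" appears in all three products, so it is the only mandatory feature; "search" is optional because the "mobile" variant omits it.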

References

[1] Pohl, K., Böckle, G., & van der Linden, F. (2005). Software Product Line Engineering: Foundations, Principles, and Techniques. Springer, Heidelberg, Berlin, Germany.
[2] Quba, G. Y., Al Qaisi, H., Althunibat, A., & AlZu’bi, S. (2021). Software Requirements Classification using Machine Learning Algorithms. 2021 International Conference on Information Technology (ICIT), Amman, Jordan, pp. 685-690. doi: 10.1109/ICIT52682.2021.9491688
[3] Jakkula, V. (n.d.). Tutorial on Support Vector Machine (SVM). School of EECS, Washington State University, Pullman 99164. [Unpublished]
[4] Müller, A. C., & Guido, S. (2016). Introduction to Machine Learning with Python: A Guide for Data Scientists. O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
[5] Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
[6] Shawe-Taylor, J., & Cristianini, N. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge: Cambridge University Press.
[7] Mavroforakis, M., & Theodoridis, S. (2006). A Geometric Approach to Support Vector Machine (SVM) Classification. IEEE Transactions on Neural Networks, Vol. 17, No. 3, May 2006. doi: 10.1109/TNN.2006.873281
[8] Awad, M., & Khanna, R. (2015). Support Vector Machines for Classification. In: Efficient Learning Machines. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4302-5990-9_3
[9] Alpaydin, E. (2014). Introduction to Machine Learning. The MIT Press, Cambridge, Massachusetts; London, England.
[10] Canedo, E. D., & Mendes, B. C. (2020). Software Requirements Classification using Machine Learning Algorithms. Entropy. doi: 10.3390/e22091057
[11] Cunha, V. C., Magoni, D., Inácio, P. R. M., & Freire, M. M. (2022). Impact of Self C Parameter on SVM-based Classification of Encrypted Multimedia Peer-to-Peer Traffic. In: Barolli, L., Hussain, F., Enokido, T. (eds) Advanced Information Networking and Applications. AINA 2022. Lecture Notes in Networks and Systems, vol 449. Springer, Cham. https://doi.org/10.1007/978-3-030-99584-3_16

Author Contribution

The experiment, implementation, and manuscript writing were carried out by S. M. Waziri. F. U. Zambuk and B. I. Ya’u supervised the project and contributed throughout the writing of the manuscript. Dr. M. A. Lawal, the departmental project coordinator, identified and recommended suitable journals for publication and provided guidance on formatting.

Funding

The Department of Computer Science, Abubakar Tafawa Balewa University, Nigeria, approved this study and supported the research; however, the study was funded by the main author.

Conflict of Interest

There was no conflict of interest arising from financial, commercial, legal, or professional relationships with organizations or individuals.

Acknowledgment

I acknowledge the Department of Computer Science, Faculty of Science, Abubakar Tafawa Balewa University, Bauchi, Nigeria, for support and guidance.

Data availability

The dataset generated in this study is available in the PURE (Public Requirements) data pool at http://nlreqdataset.isti.cnr.it/.