Classification of Extracted Features from Software Requirements Specification Documents using Support Vector Machine Learning Technique
Sadiq Waziri, Abubakar Tafawa Balewa University, Student, Nigeria
This paper presents the
results of an experiment on the classification of extracted features from
Software Requirements Specification (SRS) documents using various Machine
Learning techniques. The primary focus was on the linear Support Vector Machine
(SVM) technique, with comparative analysis involving three additional
techniques, namely Decision Tree (DT), Naïve Bayes (NB), and K-Nearest
Neighbors (KNN). During the experimentation, features, which are fundamental
building blocks of Software Product Lines [1], were classified into
optional and mandatory. This differentiation facilitates both variability and
commonality within a product family [1]. While previous research has
explored similar classifications using diverse techniques, this study
compares the four techniques to identify the most effective one for binary
classification of features for feature modeling. The experiment was conducted
on nine selected documents from the PURE dataset. The performance of each model
was evaluated on accuracy, precision, recall (sensitivity), and F1-score. The
findings indicate which classification technique performed best, which can
support the development and management of software product lines.
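As an illustration of the four evaluation measures named above, the following minimal sketch computes them for a binary mandatory/optional labeling. The function name `binary_metrics`, the choice of "mandatory" as the positive class, and the example labels are assumptions made for this sketch; they are not the study's data or results.

```python
def binary_metrics(y_true, y_pred, positive="mandatory"):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions for eight features.
y_true = ["mandatory", "mandatory", "optional", "optional",
          "mandatory", "optional", "mandatory", "optional"]
y_pred = ["mandatory", "optional", "optional", "mandatory",
          "mandatory", "optional", "mandatory", "optional"]

scores = binary_metrics(y_true, y_pred)
print(scores)
```

With these invented labels there are three true positives, one false positive, one false negative, and three true negatives, so all four measures come out to 0.75.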
Requirements, Feature, Feature Extraction, Feature Classification, Feature Modeling, Support Vector Machine
This study highlighted the potential of automating Software Product Line
Engineering (SPLE) processes through feature extraction and classification.
Automating these tasks can streamline SPLE, saving time and reducing errors. A
key aspect of SPLE involves classifying features as mandatory (present in all
products) or optional (included only in specific variants). The study
demonstrated that SVM effectively performed binary classification. The advantage of a simple
classification model, like the one used in the study, lies in its efficiency
and ease of interpretation. However, for very complex product lines with
numerous features and intricate relationships, more advanced models may be necessary.
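To make the point about efficiency and interpretability concrete, here is a minimal sketch of a linear SVM trained by batch sub-gradient descent on the regularized hinge loss, written in plain Python. It is not the study's implementation: the two input columns (a hypothetical "share of documents mentioning the feature" and a hypothetical "rate of optionality cues"), the six data points, and the hyperparameters `lam`, `lr`, and `epochs` are all assumptions chosen for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w.x_i + b)))."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge term is active for this point
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Label a feature vector by the sign of the linear decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "mandatory" if score >= 0 else "optional"


# Hypothetical toy data: +1 = mandatory, -1 = optional.
X = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
     [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
labels = [predict(w, b, x) for x in X]
print("weights:", w, "bias:", b)
print("predicted:", labels)
```

The learned model is just a weight per input plus a bias, so the sign of each weight shows which input pushes a feature toward "mandatory" (positive) or "optional" (negative). On this toy data the first weight ends up positive and the second negative.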
[1] Pohl, K., Böckle, G., & van der Linden, F. (2005). Software Product Line Engineering: Foundations, Principles, and Techniques. Springer, Heidelberg, Berlin, Germany.
[2] Quba, G. Y., Al Qaisi, H., Althunibat, A., & AlZu’bi, S. (2021). Software Requirements Classification using Machine Learning Algorithms. 2021 International Conference on Information Technology (ICIT), Amman, Jordan, pp. 685-690. DOI: 10.1109/ICIT52682.2021.9491688
[3] Jakkula, V. (n.d.). Tutorial on Support Vector Machine (SVM). School of EECS, Washington State University, Pullman 99164. [Unpublished]
[4] Müller, A. C., & Guido, S. (2002). Introduction to Machine Learning with Python: A Guide for Data Scientists. O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
[5] Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
[6] Shawe-Taylor, J., & Cristianini, N. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge: Cambridge University Press.
[7] Mavroforakis, M., & Theodoridis, S. (2006). A Geometric Approach to Support Vector Machine (SVM) Classification. IEEE Transactions on Neural Networks, Vol. 17, No. 3, May 2006. DOI: 10.1109/TNN.2006.873281
[8] Awad, M., & Khanna, R. (2015). Support Vector Machines for Classification. In: Efficient Learning Machines. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4302-5990-9_3
[9] Alpaydin, E. (2014). Introduction to Machine Learning. The MIT Press, Cambridge, Massachusetts, London, England.
[10] Canedo, E. D., & Mendes, B. C. (2020). Software Requirements Classification using Machine Learning Algorithms. Entropy. DOI: 10.3390/e22091057
[11] Cunha, V. C., Magoni, D., Inácio, P. R. M., & Freire, M. M. (2022). Impact of Self C Parameter on SVM-based Classification of Encrypted Multimedia Peer-to-Peer Traffic. In: Barolli, L., Hussain, F., Enokido, T. (eds) Advanced Information Networking and Applications. AINA 2022. Lecture Notes in Networks and Systems, vol 449. Springer, Cham. https://doi.org/10.1007/978-3-030-99584-3_16
The experiment, implementation, and manuscript writing were carried out by S. M. Waziri. F. U. Zambuk and B. I. Ya’u supervised the project and contributed throughout the writing of the manuscript. Dr. M. A. Lawal, the departmental project coordinator, identified and recommended suitable journals for publication and provided guidance on formatting.
The Department of Computer Science, Abubakar Tafawa Balewa University, Nigeria, approved this study. The department also supported the research; however, the study was funded by the main author.
There was no conflict of interest in financial, commercial, legal, or professional relationships with organizations or individuals.
I acknowledge the Department of Computer Science, Faculty of Science, Abubakar Tafawa Balewa University, Bauchi, Nigeria, for support and guidance.
The dataset used in this study is available in
the PURE (Public Requirements) data pool at http://nlreqdataset.isti.cnr.it/.