A Review on Classification of Extracted Features from Software Requirements Specification Documents using Support Vector Machine Learning Technique

Authors

1. Sadiq Waziri, Abubakar Tafawa Balewa University Bauchi, Student, Nigeria

Abstract

Manual classification of extracted features from large datasets can be tedious and time-consuming. This paper reviews methods for classifying features extracted from Software Requirements Specification (SRS) documents using Machine Learning (ML), with a focus on the linear Support Vector Machine (SVM) technique. We also explore other classification techniques, such as decision trees (DT), naïve Bayes (NB), and k-nearest neighbors (KNN), for classifying the extracted features as mandatory or optional. Previous studies have compared different classification techniques for feature modeling. The primary goal of this review is to identify the best method for binary classification of features for software product line engineering (SPLE). The proposed system will be tested on nine SRS documents chosen from the Public Requirements dataset, with accuracy, precision, recall, and F1 scores used for evaluation.

Keywords

Requirements, Feature, Feature Extraction, Feature Classification, Feature Modeling, Support Vector Machine

Conclusion

After implementing the proposed system, we found that SVM outperformed DT, NB, and KNN on all four average metrics shown in Table 1 (accuracy, precision, recall, and F1-score). This highlights SVM as the most promising of the four techniques for feature classification in SPLE.
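A comparison of this kind can be sketched as follows. This is a minimal illustration and not the system evaluated in [22]: it assumes scikit-learn is available, assumes TF-IDF as the text representation (the feature extraction method is not stated here), and uses a few made-up feature descriptions and labels purely as placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: short feature descriptions with binary labels.
TRAIN_TEXTS = [
    "user login with password",
    "user account registration",
    "process customer payment",
    "export report to pdf",
    "change interface colour theme",
    "share report on social media",
]
TRAIN_LABELS = [
    "mandatory", "mandatory", "mandatory",
    "optional", "optional", "optional",
]


def build_classifiers():
    """Return one TF-IDF pipeline per technique compared in the review."""
    return {
        "SVM": make_pipeline(TfidfVectorizer(), LinearSVC()),
        "DT": make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(random_state=0)),
        "NB": make_pipeline(TfidfVectorizer(), MultinomialNB()),
        # n_neighbors=3 is an arbitrary choice for this tiny example.
        "KNN": make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=3)),
    }


def classify(train_texts, train_labels, new_texts):
    """Fit every classifier on the training data and label the new texts."""
    predictions = {}
    for name, model in build_classifiers().items():
        model.fit(train_texts, train_labels)
        predictions[name] = list(model.predict(new_texts))
    return predictions


if __name__ == "__main__":
    for name, labels in classify(TRAIN_TEXTS, TRAIN_LABELS, ["user password reset"]).items():
        print(name, labels)
```

With so little data the predictions carry no evidential weight. The point is only that all four techniques can share one pipeline, so that differences in the scores come from the classifier rather than from preprocessing.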

Table 1. Results of Performance Evaluation [22]

Model   Av. Accuracy   Av. Precision   Av. Recall   Av. F1-Score
SVM     0.86           0.89            0.83         0.86
DT      0.82           0.83            0.80         0.82
NB      0.80           0.82            0.78         0.79
KNN     0.81           0.82            0.79         0.81

“Av.” means “Average”.
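For reference, the four metrics reported in Table 1 can be computed and averaged as in the sketch below. It uses the standard definitions for binary classification and assumes that "mandatory" is treated as the positive class and that each average is taken over per-document scores. Neither assumption is confirmed here, and the function names are illustrative.

```python
from statistics import mean


def binary_metrics(y_true, y_pred, positive="mandatory"):
    """Accuracy, precision, recall and F1 for one document's predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def average_metrics(per_document):
    """Average each metric over a list of per-document metric dictionaries."""
    return {key: mean(doc[key] for doc in per_document) for key in per_document[0]}
```

If the averages are indeed taken per document, the average F1-score need not equal the harmonic mean of the average precision and average recall. This may explain why, for example, the DT row reports 0.82 although 0.83 and 0.80 alone would give about 0.81. Rounding of the reported values is another possible cause.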

Future research could focus on:

  1. Developing more efficient SVM algorithms for large-scale datasets.
  2. Investigating the impact of different feature extraction techniques on classification performance.
  3. Exploring alternative machine learning techniques that may be better suited for specific features or datasets.

Addressing these areas could contribute positively to the development of software product lines.

References

[1] Pohl, K., Bockle, G., & van der Linden, F. (2005). Software Product Line Engineering: Foundations, Principles, and Techniques. Springer, Heidelberg, Berlin, Germany.

[2] Oncea, B. (2023). Automatic Classification using Supervised Machine Learning in Price Statistics. MDPI, Basel, Switzerland. Retrieved July 20, 2023 from https://www.mdpi.com/journal/mathematics

[3] Chowdhury, S. and Shoen, M. (2020). Research Paper Classification using Machine Learning Techniques. DOI: 10.1109/IETC47856.2020.9249211

[4] Macedo, D. (2020). Improving Image Classification Accuracy Using Hybrid Systems of SVM and CNN.

[5] Lumbanraja, F., Fitri, E., Junaidi, A. & Prabowo, R. (2022). Abstract Classification using SVM Algorithm (Case Study: Abstract in Computer Science Journal).

[6] Sukhpreet, S. & Malik, K. (2022). Feature Selection and Classification Improvement of kinnow fruits using SVM Classifier.

[7] Osisanwo, F. et al. (2017). Supervised Machine Learning Algorithms: Classification and Comparison. International Journal of Computer Trends and Technology (IJCTT), Volume 48, Number 3, June 2017.

[8] Shawe-Taylor, J., & Cristianini, N. (2000). Support Vector Machines (Vol. 2). Cambridge: Cambridge University Press.

[9] Winkler, J. & Vogelsang, A. (2017). Automatic Classification of Requirements Based on Convolutional Neural Networks. In Requirements Engineering Conference Workshops (REW), IEEE International. New York: IEEE. DOI: https://doi.org/10.1109/REW.2016.021

[10] Pandey, S., Taralekar, A., Yadav, R., Deshmukh, S. & Suryavanshi, S. (2020). Email Spam Detection and Classification using SVM. Paper in Journal of Computer Science and Information Technologies, Vol. II (1).

[11] Hassan, S. et al. (2022). Analytics of ML-Based Algorithms for Text Classification. Published by Elsevier B.V. on behalf of KeAi Communications Co., Ltd.

[12] Aurnob, F. et al. (2022). Sentiment Analysis on Corona Virus Tweets. Ahsanullah University of Science and Technology, Dhaka, Bangladesh.

[13] Quba, G. Y., Al Qaisi, H., Althunibat, A. and AlZu’bi, S. (2021). Software Requirements Classification using Machine Learning Algorithms. 2021 International Conference on Information Technology (ICIT), Amman, Jordan, 2021, pp. 685-690. DOI: 10.1109/ICIT52682.2021.9491688

[14] Pang, B., Lee, L. and Vaithyanathan, S. (2002). Thumbs Up: Sentiment Classification using ML Techniques. In Proc. 2002 Conf. on Empirical Methods in Natural Language Processing (EMNLP).

[15] Mavroforakis, M. & Theodoridis, S. (2006). A Geometric Approach to Support Vector Machine (SVM) Classification. IEEE Transactions on Neural Networks, Vol. 17, No. 3, May 2006. DOI: 10.1109/TNN.2006.873281

[16] Müller, A. C. and Guido, S. (2002). Introduction to Machine Learning with Python: A Guide for Data Scientists. Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

[17] Simplilearn (2023). https://www.simplilearn.com/image-processing-article/

[18] https://www.techtarget.com/whatis/definition/support-vector-machine-SVM

[19] https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/#Pros_and_Cons_of_SVM

[20] https://www.geeksforgeeks.org/support-vector-machine-in-machine-learning/

[21] Awad, M., Khanna, R. (2015). Support Vector Machines for Classification. In: Efficient Learning Machines. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4302-5990-9_3

[22] Waziri, S. et al. (2024). Classification of Extracted Features from Software Requirements Specification Documents using Support Vector Machine Learning Technique. https://15.207.161.74/journal/SRJSET/currentissue/classification-of-extracted-features-from-software-requirements-specification-documents-using-support-vector-machine-learning-technique

Author Contribution

Sadiq Mohammed Waziri conceptualized and led the research; Fatima Umar Zambuk supervised the write-up; Badamasi Imam Ya’u served as second supervisor and contributed substantially to the literature review.

Funding

There was no funding received for the research work.

Conflict of Interest

The authors declare that there was no conflict of interest.

Acknowledgement

The authors acknowledge the Computer Science Department of Abubakar Tafawa Balewa University Bauchi for providing the necessary support required for the research.

Data availability

The data supporting the findings of this study are available from the corresponding author upon reasonable request.