International Journal of Technology (IJTech)
Vol 14, No 4 (2023)

ReqGo: A Semi-Automated Requirements Management Tool

Shi-Jing Koh, Fang-Fang Chua


Cite this article as:
Koh, S.-J., Chua, F.-F., 2023. ReqGo: A Semi-Automated Requirements Management Tool. International Journal of Technology. Volume 14(4), pp. 713-723

Shi-Jing Koh, Faculty of Computing and Informatics, Multimedia University, 63100 Cyberjaya, Selangor, Malaysia
Fang-Fang Chua, Faculty of Computing and Informatics, Multimedia University, 63100 Cyberjaya, Selangor, Malaysia

Abstract

This study addresses issues of change in requirements management by dealing with requirements ambiguity and prioritization. The hypothesis that machine learning techniques can be integrated into requirements management processes is confirmed. The study highlights efforts in automating requirements ambiguity identification, requirements classification, and prioritization under multiple decision-making criteria through Natural Language Processing (NLP) techniques and the Universal Sentence Encoder. A Naive Bayes (NB) classifier has been applied for its strong performance in binary requirements classification. Although existing methods have been shown to improve one or two of these processes significantly, they rarely integrate the whole requirements management activity. The proposed tool helps the development team manage requirements systematically. The prioritization algorithm is shown to work as expected, considering multiple constraints before calculating the priority value, while ambiguity in requirements is identified automatically. The ambiguity classifier correctly identifies 87.5% of requirements. Possible future work includes improving the prioritization module by allowing automated estimation of priority values upon requirements change, and extending the automation coverage with test case generation.

Keywords: NLP; Requirements; Requirements ambiguity; Requirements prioritization

Introduction

    Requirements management is the backbone of most software development projects in achieving each project's goals. It handles rapidly changing requirements through proper planning, analysis, documentation, prioritization, and integration of requirements to keep a project's requirements up to date.
    It is crucial to get the "right" requirements from the clients and put the requirements in the "right" place (Hafeez, Rasheed, and Khan, 2017). Nevertheless, constantly changing requirements may grow into a large-scale system and demand considerable effort to keep all information up to date. A key part of requirements management is managing these changes. The manual process of labeling requirements can be time-consuming and error-prone (Iqbal, Elahidoost, and Lucio, 2018). Failure to identify issues in requirements at an early stage can delay a project, leading to loss of revenue, a tarnished reputation, and loss of trust (Riazi and Nawi, 2018), in addition to adversely affecting client expectations and the final product, and ultimately leading to project failure.
    This paper proposes a semi-automated requirements management tool, ReqGo, and analyzes how it benefits the existing requirements management process. More precisely, we pursue the following research questions: 1) How can ambiguity in requirements be identified automatically? 2) How can requirements prioritization tasks be improved through semi-automation? 3) How can automated tasks be integrated into the requirements management process? The proposed tool makes use of Natural Language Processing (NLP) to classify requirements, detect requirement ambiguity, and automatically prioritize requirements using multi-criteria decision-making to facilitate effective resource utilization, besides providing the ability for users to manage their requirements and relevant artifacts. Among the various NLP algorithms available to analyze and process the data, Naive Bayes (NB) has been chosen because it supports binary classification, which is ideal for classifying a requirement into one of two categories, i.e., Functional and Non-Functional Requirements.
    According to previous studies, the correct classification of requirements and the clear definition of requirements have been the main focus of researchers, enabling requirements to be filtered and prioritized. Numerous algorithms, including Term Frequency-Inverse Document Frequency (TF-IDF) (Wein and Briggs, 2021; Dias Canedo and Cordeiro Mendes, 2020) and machine learning techniques such as Support Vector Machine (SVM) (Shariff, 2021; Kurtanovic and Maalej, 2017), Naïve Bayes (NB) (Shariff, 2021), Logistic Regression (LR) (Dias Canedo and Cordeiro Mendes, 2020), and Natural Language Processing (NLP) (Wein and Briggs, 2021; Asadabadi et al., 2020; Aysolmaz et al., 2018; Kurtanovic and Maalej, 2017; Emebo, Olawande, and Charles, 2016), have been applied to various requirements management tasks to analyze and classify requirements by going through requirements normalization, feature extraction, feature selection, and finally classification.
    Existing requirements management software such as IBM DOORS Next (IBM, n.d.) and MaramaAIC (Kamalrudin, Hosking, and Grun, 2017) provides the ability to manage requirements for complex software and systems requirements environments, with NLP techniques supported to improve the detection of requirements quality issues and traceability through a degree of automation. However, these tools still lack integrated automatic requirements prioritization. Inspired by them, ReqGo emphasizes capturing requirements issues at an early stage while offering a semi-automated requirements prioritization module to reduce the human effort of ranking requirements.
    The remainder of the paper is structured as follows: Section 2 describes the fundamental theory and working procedure of ReqGo, including its overall architecture, the workflows of requirements ambiguity identification and requirements prioritization with their corresponding algorithms, and screenshots of the tool. Section 3 presents the results and summarizes the major findings of our study. Finally, Section 4 concludes the paper.

Experimental Methods

2.1. ReqGo Architecture and Workflow

    In general, ReqGo contains five modules: User Account Management, Requirement Record Management, Requirement Prioritization, Requirement Verification & Validation, and Requirement Traceability. Figure 1 presents a high-level flow of how a user utilizes the tool. Using ReqGo, users insert the requirements collected in natural language. The extracted requirements are then ready for documentation and analysis. The requirements are processed and tagged with their categories and possible ambiguity, then prioritized accordingly, and finally verified and validated.

Figure 1 High-Level Flow of ReqGo

2.2. Data Collection

     Before proceeding with any classification process, we need to collect a dataset for training and evaluation purposes. The collected dataset is divided into two portions: the first part is used to train the classifier, while the second part serves as the validation set. For the requirements classification phase, the datasets are gathered from three sources (Lima et al., 2019; Ferrari et al., 2017; Cleland-Huang et al., 2007). Figure 2 presents a summary of the requirements and their composition in the dataset. The dataset is divided into twelve categories: Functional (F) requirements and Non-Functional requirement categories, namely Security (SE), Usability (US), Operational (O), Performance (PE), Look and Feel (LF), Availability (A), Maintainability (MN), Scalability (SC), Fault Tolerance (FT), Legal (L), and Portability (PO).

Figure 2 Distribution of Requirement Compositions on Datasets
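The train/validation split described above can be sketched as follows. This is a minimal stdlib-only illustration; the 80/20 ratio, the random seed, and the toy labeled requirements are assumptions for demonstration, not values reported in this study:

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Shuffle labeled requirements, then split them into a training
    portion (for fitting the classifier) and a validation portion."""
    data = list(samples)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * train_ratio)
    return data[:cut], data[cut:]

# Toy labeled requirements: (text, category code)
dataset = [
    ("The system shall encrypt stored passwords", "SE"),
    ("The user shall be able to export reports", "F"),
    ("Pages shall load within two seconds", "PE"),
    ("The interface shall follow the corporate style guide", "LF"),
    ("The system shall log every failed login attempt", "F"),
]

train, validation = split_dataset(dataset)
print(len(train), len(validation))  # 4 1
```

Keeping the shuffle seeded makes the split reproducible across training runs, so classifier results can be compared fairly.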

2.3. Requirements Classification and Ambiguity

     ReqGo deploys machine learning techniques to classify Functional and Non-Functional Requirements and to identify ambiguous terms in requirements. The machine learning implementation can be divided into three processes, namely Text Normalization, Text Vectorization, and Text Classification. Before a requirement can be processed further, text normalization is essential to convert the requirement into a standard form. In our case, we make use of Tokenization, Part-Of-Speech (POS) tagging, and Lemmatization to break down the words in terms of sentence structure and vocabulary. Then, we utilize Google's Universal Sentence Encoder to embed the text into meaningful vector representations (Figure 3). In ReqGo, we utilize its ability to measure the degree to which two pieces of text carry similar meanings. This is useful in sentence classification tasks: semantic similarity is analyzed over the generated vectors through the cosine similarity calculation, cos(u, v) = (u · v) / (||u|| ||v||).

Figure 3 Illustration of the process of Universal Sentence Encoder
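The cosine similarity used to compare embedded sentences can be written as a small function. The three-dimensional toy vectors below are stand-ins for the high-dimensional embeddings produced by the Universal Sentence Encoder:

```python
import math

def cosine_similarity(u, v):
    """cos(u, v) = (u . v) / (||u|| * ||v||), ranging from -1 to 1;
    values near 1 mean the two embedded sentences are semantically close."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Vectors pointing the same way score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # ~1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Because cosine similarity depends only on direction and not magnitude, it is well suited to comparing sentence embeddings of differing lengths.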

In ReqGo, we employ the NB classifier, a supervised classification algorithm, to group the requirements automatically. NB supports binary classification, dividing the requirements into functional and non-functional categories; the requirements are further identified as ambiguous or non-ambiguous to uncover any possible issues. The classifier is trained on the collected datasets before use, to provide more accurate results. Figure 4 presents the algorithm for initializing requirements classification, including model training and model testing. The function requires a labeled dataset as input and produces a trained requirement classifier. Several values are set at the beginning of the process, such as the total amount of data in the dataset, the amount of data used for testing the classifier, and the amount used for training it. After the classifier has been trained and tested, it is stored in a JSON file for reuse.
       To analyze the ambiguity that exists in requirements, an ambiguity checker is implemented. The process starts with the step in line 1 of Algorithm 2 (shown in Figure 5), where the threshold value is set. The threshold value plays an important role, as a requirement is identified as ambiguous once its similarity score exceeds this value. Using TensorFlow.js, we embed the requirements and the ambiguous terms through the Universal Sentence Encoder, then examine the similarity of words using the generated high-dimensional vectors. Finally, the outcome is stored in the database. When ambiguous requirements are detected, users are notified so they can take action to prevent any issues.

Figure 4 Algorithm of Requirement Classifier Training and Testing
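The training-and-testing flow of Algorithm 1 can be illustrated with a minimal from-scratch Naive Bayes sketch. ReqGo trains on the collected dataset and persists the model as JSON; this toy version instead uses hand-made requirements, whitespace tokenization, and add-one (Laplace) smoothing, all of which are simplifying assumptions:

```python
import math
from collections import Counter

class NaiveBayesClassifier:
    """Minimal multinomial Naive Bayes for binary requirement
    classification (functional "F" vs non-functional "NFR")."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.labels}
        self.class_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for c in self.labels:
            # log P(c) + sum of log P(word | c), with add-one smoothing
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[c][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label

clf = NaiveBayesClassifier()
clf.fit(
    ["the user shall export a report",
     "the user shall delete an account",
     "the system shall respond within two seconds",
     "the system shall be available 99 percent of the time"],
    ["F", "F", "NFR", "NFR"],
)
print(clf.predict("the user shall create a report"))  # F
```

A production classifier would of course be trained on the full labeled dataset and evaluated on the held-out validation portion before being stored for reuse.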

Figure 5 Algorithm of Requirement Ambiguity Checker
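The thresholded similarity check of Algorithm 2 can be sketched as follows. ReqGo compares Universal Sentence Encoder embeddings via cosine similarity in TensorFlow.js; here a lexical ratio from Python's difflib stands in for that semantic score, and both the list of ambiguous terms and the threshold value are illustrative assumptions:

```python
from difflib import SequenceMatcher

# Typical weak words that signal ambiguity in a requirement statement.
AMBIGUOUS_TERMS = ["user-friendly", "fast", "easy", "appropriate", "flexible"]

def check_ambiguity(requirement, threshold=0.8):
    """Flag a requirement as ambiguous when any of its tokens is
    sufficiently similar to a known ambiguous term. The difflib ratio
    is a lexical stand-in for the cosine similarity over Universal
    Sentence Encoder embeddings used by ReqGo."""
    hits = []
    for token in requirement.lower().split():
        for term in AMBIGUOUS_TERMS:
            score = SequenceMatcher(None, token, term).ratio()
            if score >= threshold:
                hits.append((token, term, round(score, 2)))
    return hits

print(check_ambiguity("The interface shall be user-friendly and fast"))
print(check_ambiguity("The system shall log every failed login attempt"))  # []
```

Raising the threshold makes the checker stricter (fewer requirements flagged); lowering it catches more near-matches at the cost of false positives, which is why the threshold is set explicitly at the start of the algorithm.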

2.4. Requirements Prioritization Logic

      ReqGo provides a semi-automated requirements prioritization method based on a requirement priority value (RPV) formulation function (Hujainah et al., 2021), comprising a decision-making method, priority clustering, and insertion sort. To simplify the process, ReqGo applies the sorting algorithm directly after calculating the RPV of each requirement, without the clustering algorithms. Figure 6 presents the flow of how the two categories of prioritization tasks interact to produce the desired outcome, i.e., reducing the manual effort of finalizing the priority values and the implementation order of the collected requirements. Figure 7 illustrates the flow of the requirements prioritization algorithm, which gathers all the necessary data: stakeholder weights, the priority values assigned by each stakeholder, requirement dependencies and their occurrence as parent requirements, and the effort required to turn each requirement into reality.
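The RPV computation and the subsequent insertion sort can be sketched as follows. The exact RPV formulation follows Hujainah et al. (2021) and is not reproduced here, so the specific weighting of dependencies and effort below is an illustrative stand-in rather than ReqGo's actual formula:

```python
def requirement_priority_value(stakeholder_weights, assigned_priorities,
                               dependents=0, effort=1.0):
    """Illustrative RPV: the weighted average of the priorities assigned
    by each stakeholder, boosted by how many requirements depend on this
    one and discounted by the implementation effort."""
    weighted = sum(w * p for w, p in zip(stakeholder_weights, assigned_priorities))
    base = weighted / sum(stakeholder_weights)
    return base * (1 + 0.1 * dependents) / effort

def insertion_sort_by_rpv(requirements):
    """Order (name, rpv) pairs by descending RPV using insertion sort."""
    ordered = list(requirements)
    for i in range(1, len(ordered)):
        current = ordered[i]
        j = i - 1
        while j >= 0 and ordered[j][1] < current[1]:
            ordered[j + 1] = ordered[j]
            j -= 1
        ordered[j + 1] = current
    return ordered

weights = [0.5, 0.3, 0.2]  # relative importance of three stakeholders
reqs = [
    ("R1", requirement_priority_value(weights, [3, 4, 2], dependents=2, effort=2.0)),
    ("R2", requirement_priority_value(weights, [5, 5, 4], dependents=0, effort=1.0)),
    ("R3", requirement_priority_value(weights, [2, 1, 3], dependents=1, effort=0.5)),
]
print([name for name, _ in insertion_sort_by_rpv(reqs)])  # ['R2', 'R3', 'R1']
```

Note how R3, despite its low stakeholder scores, overtakes R1 because its low effort raises its RPV; this is the multi-criteria behavior the prioritization module aims for, where no single input dictates the final order.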