As mobile applications gradually replace traditional desktop software as the mainstream tools in people’s daily work, study, and life, they introduce complex new features that make quality assurance more challenging than for desktop software [1]-[8]. However, the growth in testing demand contrasts with the limited availability of testing tools and testers. Currently, both industry and academia increasingly emphasize the adoption and exploration of automated testing techniques to meet the testing needs of mobile applications [9]-[12].
Model-based automated testing is a widely studied testing method that constructs test models by mining the state and behavior of the application and then uses these models to generate test cases [13]. It typically involves static modeling of the mobile application based on its GUI and dynamic modeling based on behavior transitions, and it describes application behavior using inter-state relationship models such as the Finite State Machine (FSM). For instance, GUI Ripper [14] supports automated exploration modeling of applications, while tools such as UI Automator [15]-[16] obtain the GUI tree of an application and select target events to complete exploration modeling of mobile applications.
Research on improving model-based automated testing has followed two directions. On the one hand, some work aims at automatic evolution of the model: e.g., Gu et al. [17] identify differences and quickly evolve the model after an application version update, and DeltaDroid [18] builds a defect model that generates new cases under different conditions, based on existing test cases combined with actual GUI and system actions, to detect dynamic installation defects in Android applications. On the other hand, some work guides modeling with enhanced application knowledge: e.g., MEGDroid [19] uses model abstraction and model-to-model migration to achieve accurate generation of application events, and Pan et al. [20]-[23] manually construct richer test models to guide automated testing.
To overcome the limitations of mobile application automation test modeling mentioned above, this chapter proposes a semantic model-based automation test approach. This approach fuses the mobile application domain model to achieve a semantic matching association between mobile application GUI state and domain knowledge. This facilitates the automated generation of a mobile application GUI semantic test model, which is subsequently used to verify the testing effectiveness of the modeling method.
When it comes to mobile application testing, a typical model-based approach involves modeling the relationship between mobile application GUI states and GUI events to establish a series of jumps triggered by events between different GUI states in the application. One example of such an approach is the finite state machine (FSM) model [24]. However, current model-based approaches for mobile application testing are limited to direct records of GUI states and GUI events. These models can only describe the jump-trigger relationship between different GUI states of the application, without understanding the functional meaning of the application. To overcome the aforementioned problems, this paper proposes a method that integrates semantic ontology models with traditional GUI testing models. This method involves building a semantic testing model for mobile applications by extracting semantic information from the GUI of mobile applications and attaching semantic concepts to GUI states and GUI events.
A typical OWL semantic definition describes an ontology formally as a six-tuple (C, Ac, R, AR, H, X):
C denotes the set of concepts; each concept ci in C represents a set of objects of the same kind that can be described by the same set of attributes.
Ac denotes the set of attributes of each concept; Ac(ci) is the attribute set of concept ci.
R denotes the set of relations between concepts; each relation ri(cp, cq) in R is a binary relation between concepts cp and cq, and an instance of this relation is a pair of concept objects (cp, cq).
AR denotes the set of attributes of each relation; AR(ri) is the attribute set of relation ri.
H denotes the concept hierarchy, i.e., the set of parent-child relations that exist between the concepts in C.
X denotes the set of axioms; each axiom constrains the attribute values of a concept or relation, or constrains the relations between concept objects.
The mobile application domain metamodel is defined as a 4-tuple (C, I, R, X):
The set of mobile application ontology concepts is denoted by C and is composed of three subsets: entity, action, and task. Each subset contains a concept identifier, concept type, and semantic name.
I represents a collection of instances of mobile application ontology concepts. These instances are concrete textual representations of mobile application component values that are recognized knowledge in the domain. For instance, examples of instances for the entity “place name” could include “Xi’an”, “Beijing”, and “Shanghai”. These instances can be used as the origin or destination in an air service application.
R denotes the set of semantic relations of the mobile application, which contains the inter-concept relation Rcc and the concept-instance relation Rci.
X is the constraint definition of the mobile application domain model, which encompasses inter-concept relationship constraints, constraints on concept attributes, instance data constraints, and more. For instance, one constraint could be that a task must consist of at least one entity and one action, and that the instance of the entity “place name” can only be described by text.
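As a rough illustration of the 4-tuple, the sketch below stores C, I, R, and X in a small Python container and checks the example constraint that a task must consist of at least one entity and one action. The class names, field names, the "cc" and "ci" relation tags, and the sample task "book flight" are assumptions made for this sketch; they are not part of the original model definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    concept_id: str
    concept_type: str  # one of "entity", "action", "task"
    name: str


@dataclass
class DomainModel:
    concepts: dict = field(default_factory=dict)     # C: id -> Concept
    instances: dict = field(default_factory=dict)    # I: concept id -> set of texts
    relations: set = field(default_factory=set)      # R: ("cc" | "ci", source, target)
    constraints: list = field(default_factory=list)  # X: functions model -> [messages]

    def add_concept(self, concept):
        self.concepts[concept.concept_id] = concept

    def add_instance(self, concept_id, text):
        # Record the instance together with its concept-instance relation (Rci).
        self.instances.setdefault(concept_id, set()).add(text)
        self.relations.add(("ci", concept_id, text))

    def relate(self, source_id, target_id):
        # Record an inter-concept relation (Rcc).
        self.relations.add(("cc", source_id, target_id))

    def violations(self):
        return [msg for check in self.constraints for msg in check(self)]


def task_needs_entity_and_action(model):
    """Example constraint from X: a task has at least one entity and one action."""
    errors = []
    for concept in model.concepts.values():
        if concept.concept_type != "task":
            continue
        kinds = {
            model.concepts[target].concept_type
            for kind, source, target in model.relations
            if kind == "cc" and source == concept.concept_id
        }
        if not {"entity", "action"} <= kinds:
            errors.append(f"task {concept.name!r} lacks an entity or an action")
    return errors


adm = DomainModel(constraints=[task_needs_entity_and_action])
adm.add_concept(Concept("t1", "task", "book flight"))      # hypothetical task
adm.add_concept(Concept("e1", "entity", "place name"))
adm.add_concept(Concept("a1", "action", "enter text"))     # hypothetical action
adm.relate("t1", "e1")
adm.relate("t1", "a1")
for city in ("Xi'an", "Beijing", "Shanghai"):
    adm.add_instance("e1", city)
```

Keeping the constraints as plain functions means new rules, such as restricting "place name" instances to text, could be appended to X without changing the container.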
The domain model action flow graph is defined as a 5-tuple (A, T, F, as, af):
Action state set A: the finite set of mobile application interaction actions.
Action flow control set T: the control set of mobile application action sequences.
Action flow relation set F: the set of relations from one action state to another action state.
Initial state: as, as ∈ A. End state: af, af ∈ A.
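The 5-tuple can be pictured with a minimal Python sketch. It assumes that F is stored as ordered pairs of action states and leaves the control set T as uninterpreted labels, since the text does not detail its contents. The `accepts` helper and the search-task flows are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionFlowGraph:
    actions: frozenset   # A: action states
    controls: frozenset  # T: control labels, kept uninterpreted in this sketch
    flows: frozenset     # F: (from_action, to_action) pairs
    start: str           # initial state a_s
    end: str             # end state a_f

    def __post_init__(self):
        # a_s and a_f must be members of A, and F may only relate members of A.
        if self.start not in self.actions or self.end not in self.actions:
            raise ValueError("start and end must belong to the action set")
        for src, dst in self.flows:
            if src not in self.actions or dst not in self.actions:
                raise ValueError(f"flow ({src!r}, {dst!r}) leaves the action set")

    def accepts(self, sequence):
        """True if the sequence runs from a_s to a_f using only flows in F."""
        if not sequence or sequence[0] != self.start or sequence[-1] != self.end:
            return False
        return all(step in self.flows for step in zip(sequence, sequence[1:]))


# Hypothetical flows for a search task, including the type/clear iteration.
search_afg = ActionFlowGraph(
    actions=frozenset({"click to type", "clear search", "click to search"}),
    controls=frozenset({"sequence", "loop"}),
    flows=frozenset({
        ("click to type", "click to search"),
        ("click to type", "clear search"),
        ("clear search", "click to type"),
    }),
    start="click to type",
    end="click to search",
)
```

With these flows, a sequence that types, clears, types again, and then searches is accepted, while a sequence that starts with "clear search" is rejected.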
A component in a GUI that cannot be split is an atomic component. The basic components of a GUI are generally atomic components, such as text buttons (Button), text (TextView), and images (ImageView). AC denotes the set of atomic components in a GUI. For ∀ac ∈ AC: act denotes the component type of the atomic component ac; acv denotes the value of ac, which is an optional attribute; aca denotes the possible actions of ac, e.g., a button is usually bound to a click action; acs denotes the semantics of ac, i.e., the semantic entity obtained by mapping acv to the domain model.
A component composed of atomic components in a GUI is a semantic composite component, e.g., ListView or ToolBar. CC denotes the set of semantic composite components in a GUI. ccac denotes the composition of the semantic composite component cc, i.e., the set of atomic components of which cc is composed.
cct denotes the component type of a semantic composite component. This attribute is optional, because a semantic composite component may have no corresponding GUI component type.
ccs denotes the semantics of the semantic composite component cc, which is determined by the relationship in the domain model among the semantics of its constituent atomic components. This can be denoted ∧acs→ccs. Although cc is a composite component, its semantics still corresponds to an entity in the domain model.
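The component definitions can be made concrete with a small Python sketch. It replaces the semantic matching described in the text with an exact dictionary lookup, and it models the rule ∧acs→ccs as a lookup keyed by the set of constituent semantics. Both simplifications, as well as the names `annotate`, `composite_semantics`, and the sample entities "search entry" and "flight query", are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AtomicComponent:
    ac_type: str                     # act, e.g. "Button"
    value: Optional[str] = None      # acv, optional
    actions: tuple = ()              # aca, e.g. ("click",)
    semantics: Optional[str] = None  # acs, filled in by annotate()


def annotate(component, value_to_entity):
    """Map acv to a domain-model entity by exact lookup (stand-in for semantic matching)."""
    component.semantics = value_to_entity.get(component.value)
    return component


def composite_semantics(components, composite_rules):
    """Derive ccs from the set of acs values of the constituent atomic components."""
    key = frozenset(c.semantics for c in components if c.semantics is not None)
    return composite_rules.get(key)


# Hypothetical domain knowledge for an air service application.
value_to_entity = {
    "Beijing": "place name",
    "Shanghai": "place name",
    "Search": "search entry",
}
composite_rules = {frozenset({"place name", "search entry"}): "flight query"}

origin = annotate(AtomicComponent("TextView", "Beijing"), value_to_entity)
button = annotate(AtomicComponent("Button", "Search", ("click",)), value_to_entity)
logo = annotate(AtomicComponent("ImageView"), value_to_entity)  # no value, no semantics

toolbar_semantics = composite_semantics([origin, button, logo], composite_rules)
```

Components without a recognized value, such as the image here, simply contribute nothing to the composite key.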
The extended semantic FSM model, FSM-ES, extends the 5-tuple of the typical FSM model by adding semantic information to the expression of the state-transition relationship. S denotes the finite non-empty set of GUI states, which encompasses all possible states of the application under test. Σ denotes the set of GUI events. δ: S × Σ → S is the state transition function: for ∀s ∈ S and ∀e ∈ Σ, δ(s, e) denotes the state reached from the GUI state s through event e.
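A minimal sketch of such a model in Python follows, assuming that δ is stored as a dictionary keyed by (state, event) and that the semantic extension is a mapping from events to action concepts. The class name `FsmEs`, the event identifiers, and the specific transitions between S13, S14, and S15 are hypothetical.

```python
from collections import deque


class FsmEs:
    def __init__(self, initial):
        self.initial = initial
        self.states = {initial}     # S
        self.events = set()         # Σ
        self.delta = {}             # δ: (state, event) -> state
        self.event_semantics = {}   # event -> action concept from the domain model

    def add_transition(self, src, event, dst, action_concept=None):
        self.states.update((src, dst))
        self.events.add(event)
        self.delta[(src, event)] = dst
        if action_concept is not None:
            self.event_semantics[event] = action_concept

    def run(self, events):
        """Apply δ repeatedly from the initial state."""
        state = self.initial
        for event in events:
            state = self.delta[(state, event)]  # KeyError if δ is undefined here
        return state

    def reachable(self):
        """Breadth-first search over δ from the initial state."""
        seen, queue = {self.initial}, deque([self.initial])
        while queue:
            state = queue.popleft()
            for (src, _event), dst in self.delta.items():
                if src == state and dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return seen


fsm = FsmEs("S13")
fsm.add_transition("S13", "tap_search_box", "S14", "click to type")
fsm.add_transition("S14", "tap_search", "S15", "click to search")
fsm.add_transition("S15", "tap_clear", "S13", "clear search")
```

Storing the action concept next to each event is what lets a later step compare explored behavior against the action flow graph of the domain model.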
We propose a semantic model-based mobile application testing method, which consists of two parts: visual semantic model-driven GUI modeling and task subgraph-based test case generation.
In the realm of semantic model-driven automated test modeling, the following critical modules are present:
The process of semantic model-driven GUI modeling is illustrated in Algorithm 1, which takes the mobile application domain model (ADM), the generic model (GDM), and the application under test (AUT) as inputs and produces the FSM-ES model of the application under test as outputs. The following pseudo code outlines this process:
Input: ADM, GDM, AUT. Output: fsm_es.
Get the current GUI gui_current of the application under test
Perform visual recognition of GUI component elements on gui_current to get the GUI component information gui_info
Match the gui_info of the current GUI against the ADM by semantic similarity to get the vector gui_vc
gui_action ← inferAction(gui_vc)
Get the response interface vector gui_vr
Mark gui_a in gui_action as executed
Generate a path f from "gui_vc, gui_a, gui_vr"
Add path f to fsm_es
Log exceptions
inferAction:
Reasoning generates the GUI actions in the domain that correspond to the GUI state and saves them to gui_action
Generate executable actions for the interface based on GDM probabilities and save them to gui_action
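The exploration loop behind this process can be sketched in Python under strong simplifications. Visual recognition and semantic matching are omitted, so each GUI state is already represented by its semantic label, and the application under test is replaced by a toy transition table. The function `build_fsm_es`, its parameters, and the toy application are assumptions for illustration and are not the original algorithm.

```python
def build_fsm_es(start, infer_action, execute, max_paths=1000):
    """Explore from `start`, recording (gui_vc, gui_a, gui_vr) paths."""
    fsm_es = []        # recorded paths
    executed = set()   # (state, action) pairs already tried
    exceptions = []    # logged failures
    pending = [start]
    while pending and len(fsm_es) < max_paths:
        gui_vc = pending.pop()
        for gui_a in infer_action(gui_vc):
            if (gui_vc, gui_a) in executed:
                continue
            executed.add((gui_vc, gui_a))        # mark gui_a as executed
            try:
                gui_vr = execute(gui_vc, gui_a)  # response interface
            except RuntimeError as error:
                exceptions.append((gui_vc, gui_a, str(error)))
                continue
            fsm_es.append((gui_vc, gui_a, gui_vr))
            pending.append(gui_vr)
    return fsm_es, exceptions


# Toy application: the "broken" action has no transition and simulates a crash.
TOY_TRANSITIONS = {
    ("home", "open search"): "search",
    ("search", "type"): "search",
    ("search", "back"): "home",
}
TOY_ACTIONS = {"home": ["open search"], "search": ["type", "back", "broken"]}


def toy_infer_action(state):
    return TOY_ACTIONS.get(state, [])


def toy_execute(state, action):
    if (state, action) not in TOY_TRANSITIONS:
        raise RuntimeError(f"crash on {action!r} in {state!r}")
    return TOY_TRANSITIONS[(state, action)]


paths, errors = build_fsm_es("home", toy_infer_action, toy_execute)
```

Because every (state, action) pair is executed at most once, the loop terminates on any finite state and action set; `max_paths` is only an extra guard.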
The primary objective of the FSM-ES model is to explore the attainable states of the application and associate each state with the task ontology outlined by the domain model. The structure of the FSM-ES model is shown in TABLE I.
TABLE I. FSM-ES semantic structure of music playback application Cathay Pacific
Area | Mission | State Collection | Action Semantic Set |
---|---|---|---|
Music playback | Register | {S0, S1, S2, S3, S4, S12} | {select login, agree to the agreement, enter the username, enter the password, click login} |
Music playback | Basic information | {S5, S6, S7, S8, S9} | {recently played, locally downloaded, personal cloud drive, friend list, favorite playlist} |
Music playback | Discovering music | {S10, S11, S12} | {daily recommendation, click like, click play} |
Music playback | Search services | {S13, S14, S15, S16} | {click to type, click to search, clear search, listen to music and recognize music} |
Music playback | Personal settings | {S17, S18, S19, S20, S21, S22, S23} | {message center, personal privacy, personalized services, advanced settings, about, log out, switch accounts} |
Error type | Defect type found | Semantic modeling in robot vision environment | Semantic modeling in simulated environment | Humanoid | Stoat |
---|---|---|---|---|---|
Application crash defects | ActivityNotFoundException | 1 | 1 | 3 | 2 |
Application crash defects | IllegalArgumentException | 3 | 2 | 1 | 1 |
Application crash defects | IllegalStateException | 3 | 2 | 2 | 2 |
Application crash defects | NullPointerException | 4 | 3 | 2 | 0 |
Application crash defects | OutOfMemoryError | 2 | 2 | 1 | 2 |
Total | | 13 | 10 | 9 | 7 |
The FSM-ES model is mapped to the action flow diagram of the application domain model to generate task subgraphs for the functional tasks of the application, from which test cases are produced.
If every action concept in the task concept, as defined in the FSM-ES model, corresponds to an action state found in an AFG in the domain, then the task concept is deemed to comply with a specific action flow graph (AFG). The following pseudo code outlines this process:
Input: domain model action flow graph AFG
task_path ← generatePath(task, AFG)
If task_path is consistent with TFG, generate TFG by storing task_path in sequence
Otherwise, return the exception and the interrupt location
generatePath:
Generate a feasible action flow based on AFG inference action
task_path sets the continuity flag
task_path sets the end flag
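The compliance test described above, where every action concept of a task must correspond to an action state of the AFG, can be sketched as a short Python function that also reports the interrupt location. The additional check that consecutive actions follow a flow relation is an assumption added for this sketch, and the function name `check_task` and the sample data are hypothetical.

```python
def check_task(task_actions, afg_actions, afg_flows):
    """Return (True, None) if the task complies, else (False, interrupt index)."""
    for index, action in enumerate(task_actions):
        if action not in afg_actions:
            return False, index  # action concept has no matching action state
        if index > 0 and (task_actions[index - 1], action) not in afg_flows:
            return False, index  # assumed extra check: no flow between the two actions
    return True, None


afg_actions = {"click to type", "clear search", "click to search"}
afg_flows = {
    ("click to type", "click to search"),
    ("click to type", "clear search"),
    ("clear search", "click to type"),
}

ok_result = check_task(["click to type", "click to search"], afg_actions, afg_flows)
bad_result = check_task(["click to type", "click like"], afg_actions, afg_flows)
```

Returning the index of the first offending action gives the caller enough information to report where the task path was interrupted.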
As observed, the domain knowledge incorporated in the action flow diagram can be leveraged to supplement the unexplored inter-state action behavior, thereby generating test judgment criteria for mobile applications.
[Figure: task states explored by FSM-ES; task subgraph of the domain action flowchart]
Upon examining the actual GUI of NetEase cloud music’s search song task in (f), it becomes apparent that while the FSM-ES has explored the GUI states, it has failed to account for the iterative behaviors of keying and deleting in S13 state and S15 state. This is because the FSM-ES prioritizes state exploration over other factors. However, by extending the corresponding action flow diagram definition in the domain ontology library based on domain knowledge, a pathway can be generated, and this domain knowledge can be effectively applied to GUI modeling.
The process of generating task subgraphs from FSM-ES is essentially the instantiation of the mobile application domain model on the application under test. This transforms the abstract concept relationships of the mobile application domain model into a concrete sequence of application action behaviors, making the mobile application testable for execution.
Test case generation comprises two main components: test sequence generation and test data generation. Test sequences are generated by defining semantics-oriented test coverage criteria to guide the traversal of task subgraphs.
Coverage decision rule 1: Existence decision c(A, B), i.e., if the element in A exists in B, the coverage decision is satisfied.
Coverage decision rule 2: Sequential decision s(A, B), i.e., if the elements in A are sequential and conform to the sequential arrangement in B, the coverage decision is satisfied.
Coverage decision rule 3: Key point decision d(A, B), i.e., if there is an element in A that matches the key element in B, the coverage decision is satisfied.
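The three decision rules can be read as simple predicates. The Python sketch below is one possible interpretation: it assumes that the existence decision requires every element of A to appear in B, that the sequential decision treats A as an order-preserving subsequence of B, and that the key elements of B are passed in explicitly. The function names are hypothetical.

```python
def exists(a, b):
    """c(A, B): every element of A exists in B (assumed reading)."""
    b_set = set(b)
    return all(item in b_set for item in a)


def sequential(a, b):
    """s(A, B): the elements of A appear in B in the same order (subsequence)."""
    remaining = iter(b)
    # `item in remaining` consumes the iterator, so order is enforced.
    return all(item in remaining for item in a)


def key_point(a, key_elements):
    """d(A, B): at least one element of A matches a key element of B."""
    keys = set(key_elements)
    return any(item in keys for item in a)


states_in_model = ["S13", "S14", "S15", "S16"]
```

For example, `sequential(["S13", "S15"], states_in_model)` holds, while the reversed list does not, because the order of B is not respected.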
Semantic concept entity coverage criteria: The set of test cases ensures coverage of both the conceptual entities present in the domain model of the application under test and all GUI states formed by these entities. The entity coverage calculation uses the following terms:
c(FA, OA): the coverage of the conceptual entities FA involved in the FSM-ES model of the application under test with respect to the entities OA included in the domain model.
s(TG, FG): the coverage of all GUI states TG involved in the test case set with respect to the GUI states FG contained in the FSM-ES model of the application under test.
[Figure: Login action test case]
Semantic concept action coverage criteria: The test cases cover the actions involved in the action flow diagram of the domain model of the application under test. Action coverage is evaluated independently for each action flow diagram. The coverage of semantic concept actions is calculated using equation (3).
M is the number of task subgraphs included in the FSM-ES model of the application under test; the action sequence coverage within a subtask is calculated for each task subgraph.
Semantic concept task coverage criteria: The test cases cover the subgraph of the task being tested, effectively covering all relationships between GUI states. The coverage of semantic concept tasks is calculated using equation (4).
X: the number of application subtasks covered by the test case set.
XD: the number of tasks defined in the domain model.
Figure 3.
The coverage of semantic concept tasks, which focuses on the application’s functional completeness, is calculated by assessing how well the test case set covers the tasks in the domain model. The number of subtasks covered by the test case set is represented by X, and subtask coverage is assessed using the key point decision rule d. Specifically, if a test case covers any pathway from the initial state to the end state of a task subgraph, the subgraph is deemed covered and assigned a value of 1. Conversely, if no test case covers a pathway from the initial state to the end state of the task subgraph, it is regarded as not covered and assigned a value of 0.
The three coverage criteria are combined into an overall coverage, where α, β, and γ are the adjustable weights of the three coverage criteria.
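A small Python sketch can illustrate how the task coverage and the weighted combination might be computed. It assumes that task coverage is the ratio X / XD and that the overall coverage is the weighted sum of the three criteria; neither formula is reproduced from the original equations. The function names, the sample task paths, and the weight values are hypothetical.

```python
def covered_tasks(test_cases, task_paths):
    """X: number of task subgraphs with at least one start-to-end path covered.

    task_paths maps a task name to its set of start-to-end paths (tuples of states).
    Each covered subgraph counts as 1 and each uncovered subgraph as 0.
    """
    cases = {tuple(case) for case in test_cases}
    return sum(1 for paths in task_paths.values() if cases & paths)


def task_coverage(test_cases, task_paths, tasks_in_domain):
    """Assumed ratio X / XD."""
    return covered_tasks(test_cases, task_paths) / tasks_in_domain


def combined_coverage(entity, action, task, alpha, beta, gamma):
    """Assumed weighted sum of the three coverage criteria."""
    return alpha * entity + beta * action + gamma * task


task_paths = {
    "search": {("S13", "S14", "S15")},
    "login": {("S0", "S1", "S2")},
}
test_cases = [("S13", "S14", "S15")]

x = covered_tasks(test_cases, task_paths)
t_cov = task_coverage(test_cases, task_paths, tasks_in_domain=2)
total = combined_coverage(0.8, 0.6, t_cov, alpha=0.5, beta=0.25, gamma=0.25)
```

The sketch does not force the weights to sum to 1; whether that normalization applies depends on the original equation.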
To generate test cases based on task subgraphs, it is necessary to cover as many feasible paths as possible to achieve high test case coverage and optimize testing effectiveness.
node ← getStartNode(tfg)
Create a new sequence TCi, with node number i at the beginning
p ← generatePath(node, tfg) // generate test behavior path
Add the node and events from p to TCi
Set the corresponding path in tfg to the traversed state
p ← generatePath(node, tfg) // generate test behavior path
node ← p.nodef
i++
generatePath(node, tfg):
Generate p from this unique path of the node
Generate p from the path to the node that has the highest degree
else if node has a path that has not been traversed then generate p from this untraversed path
Generate p as a path between the node and its traversed neighbors
Generate p as a path from the node to a random neighbor node
Generate p as a path between the node and any of the next nodes
For a given task subgraph TSG, the next test behavior path is chosen as follows:
If the current node has a unique pathway, i.e., only one neighboring node, the pathway between the current node and that neighboring node is generated as the next test behavior path.
If the current node has multiple neighboring nodes with different out-degrees, the pathway between the current node and the neighboring node with the highest out-degree is generated as the next test behavior path.
If the current node has multiple neighboring nodes and there is an untraversed path between it and one of them, the untraversed path is generated as the next test behavior path.
If the current node has multiple neighboring nodes and there are multiple untraversed paths between it and them, the path between the current node and a traversed neighboring node is generated as the next test behavior path.
If there are no traversed neighboring nodes, the path to a random neighboring node is generated as the next test behavior path.
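The selection rules above can be sketched as a single Python function that applies them in the order listed. The sketch assumes the task subgraph is an adjacency dictionary, that "traversed" edges and visited nodes are tracked in sets supplied by the caller, and that ties for the highest out-degree go to the first neighbor listed. The function name `generate_path` and the sample graphs are hypothetical.

```python
import random


def generate_path(node, graph, traversed_edges, visited_nodes, rng=random):
    """Pick the next edge (node, neighbor), or None if node has no neighbors."""
    neighbors = graph.get(node, [])
    if not neighbors:
        return None
    if len(neighbors) == 1:                       # rule 1: unique pathway
        return (node, neighbors[0])
    out_degree = {n: len(graph.get(n, [])) for n in neighbors}
    if len(set(out_degree.values())) > 1:         # rule 2: differing out-degrees
        return (node, max(neighbors, key=out_degree.get))
    untraversed = [n for n in neighbors if (node, n) not in traversed_edges]
    if len(untraversed) == 1:                     # rule 3: one untraversed path
        return (node, untraversed[0])
    seen = [n for n in untraversed if n in visited_nodes]
    if seen:                                      # rule 4: prefer traversed neighbors
        return (node, seen[0])
    return (node, rng.choice(untraversed or neighbors))  # rule 5: random neighbor


# Hypothetical search-task subgraph with the type/clear loops back to S13.
search_graph = {"S13": ["S14"], "S14": ["S15", "S13"], "S15": ["S16", "S13"], "S16": []}
# Hypothetical graph whose neighbors all have the same out-degree.
flat_graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
```

Because rule 2 is checked before the traversal state, a driver loop built on this function would need its own stopping condition, for example a step limit or a check that all edges have been traversed.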
When selecting applications for testing, those with similar functions are grouped together as domain applications, such as airline service applications, file management applications, news applications, and so on. To establish the domain models, two teams, each consisting of five lab personnel, were invited to create two domain models based on the definition of domain models. These models were cross-checked for accuracy and consistency.
Test effectiveness is evaluated at three levels: (1) the success rate of test action execution, which determines whether the actions in the test script can be executed without error; (2) the success rate of test script execution, which determines whether the test script can be executed in its entirety without encountering any errors; and (3) the success rate of defect discovery, which evaluates whether known defects in the application set can be identified.
After the application under test has been explored and the semantic test model built, the coverage of the semantic test model with respect to the domain model is determined by analyzing the successful matches between application states and the domain model, as recorded by the exploration algorithm. The coverage results of the FSM-ES models created for each tested application, relative to the corresponding domain model, are presented in TABLE II. In the robot vision environment, the table reports an average entity concept coverage of 89%, an average action concept coverage of 89%, and an average task concept coverage of 81%. The non-intrusive environment may affect the modeling process, particularly through factors such as GUI recognition accuracy.
APP | Robot vision environment: Entity Coverage | Robot vision environment: Action Coverage | Robot vision environment: Task Coverage | Simulated environment: Entity Coverage | Simulated environment: Action Coverage | Simulated environment: Task Coverage |
---|---|---|---|---|---|---|
Apple Music | 84% | 82% | 87% | 84% | 82% | 89% |
QQ Music | 83% | 89% | 85% | 86% | 89% | 83% |
Music | 88% | 87% | 92% | 89% | 90% | 85% |
TunePro Music | 93% | 92% | 93% | 93% | 92% | 93% |
Shazam | 92% | 89% | 91% | 91% | 91% | 92% |
Spotify | 90% | 90% | 93% | 92% | 89% | 91% |
ES File Explorer | 89% | 87% | 87% | 89% | 86% | 89% |
Average | 89% | 89% | 81% | 88% | 90% | 89% |
Application crash defects:
ActivityNotFoundException, Activity not found exception.
IllegalArgumentException, illegal parameter exception.
IllegalStateException, illegal state exception.
NullPointerException, null pointer exception.
Functional anomalies:
Missing button: the test should have interacted with a component in the GUI, but the component is not found.
Missing information: part of the information that should exist is missing.
GUI anomaly: the GUI changes after interaction, but it is displayed differently than it should be.
GUI display anomalies, such as unresponsive buttons, abnormal GUI information, and missing information, appeared most frequently in the experiment. In addition, only a few exceptions were found for the commercial application, which may be because it has been adequately tested, while the open-source application has more defects. No functional anomalies were found for the commercial application in the experiments, only application crashes.
The experiments conducted in this study validate the efficacy of the proposed semantic model-driven automated testing approach. By using the domain semantic model as the core, this approach enhances the reusability of the testing model and introduces a new perspective on mobile application testing. Furthermore, it enables a completely non-invasive testing approach, particularly in the robot vision environment.
To address the shortcomings of current mobile application functional testing, a semantic model-driven automated testing approach is proposed. The paper investigates an extended semantic model-driven automated testing method based on a domain model of mobile applications. It first explores the states of the tested application with the goal of achieving maximum reachable states, thereby establishing an extended semantic FSM-ES model. Subsequently, based on the domain model’s action flowchart, the FSM-ES model is extended and mapped to a task subgraph with feasible paths as the goal, aiming to cover application functionality. This modeling of the tested application is accomplished from two perspectives: the GUI state reachability relationships (FSM-ES) and feasible paths between GUI states (task subgraph). Following this, by defining semantic coverage-oriented testing criteria, the goal is to achieve the broadest path coverage within the task subgraph. This process generates test cases targeting application functionality. Through testing verification in various application domains such as aviation services, among 13 discovered defect categories totaling 34 defects, the test cases generated by the semantic testing model achieved defect detection rates of 70.6% in the robot vision environment and 82.4% in a simulated environment. Moreover, the semantic model-generated test cases were able to simultaneously detect application crashes and functional anomalies, supporting complex automated testing of functionalities with strict requirements for behavior sequences and test inputs.