Improving mobile security: A study on Android malware detection using LOF
Article category: Original Study
Published online: 18 Sep 2024
Pages: 241 - 252
Received: 05 Nov 2023
Accepted: 01 May 2024
DOI: https://doi.org/10.2478/ijmce-2025-0018
Keywords
© 2025 Luay Albtosh et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Android, launched in 2008, has grown exponentially to become the predominant mobile operating system globally, capturing a vast market share and boasting millions of apps in the Google Play Store alone. This ubiquity has been accompanied by a double-edged sword phenomenon: while Android's open-source nature and vast developer community have fostered innovation and versatility, they have also made the platform susceptible to a myriad of security threats, most notably malware [1, 2, 3, 4, 5, 6]. Malware targeting Android has witnessed a worrying proliferation. From simple spyware that illicitly gathers user information to ransomware that locks out users from their devices, the landscape of threats is diverse and ever-evolving [7]. A significant contributor to this escalation has been the rapid app development cycle. Many apps are created quickly, often without rigorous security checks, making them potential conduits for malware [8, 9, 10, 11, 12]. Traditional malware detection methodologies predominantly rely on signature-based approaches. These methods, while efficient for known threats, are frequently impotent against zero-day attacks or sophisticated malware strains that use obfuscation techniques to evade detection [13, 14, 15]. As malware authors continually refine and mutate their code, signature-based detection's efficacy diminishes, making the adoption of behavior-based detection paradigms imperative [16, 17, 18, 19, 20, 21, 22, 23, 24, 25]. Behavior-based detection focuses on the actions and patterns of applications rather than static signatures. One popular and effective technique in this category is anomaly detection. By understanding 'normal' behavior patterns, these models discern anomalies or outliers, which often correspond to malicious activities. Among the myriad of algorithms in this domain, the local outlier factor (LOF) stands out for its proficiency in identifying local outliers, entities that deviate significantly from their immediate neighbors in a dataset [26].
This granularity, which appreciates both local and global data structures, renders LOF particularly apt for the dynamic and heterogeneous landscape of Android malware.
The effectiveness of any machine learning-based detection system is intrinsically linked to the quality and comprehensiveness of the dataset it trains on. Enter DREBIN. Proposed by Arp et al. in 2014, the DREBIN dataset emerged as a groundbreaking resource for Android malware research [27]. Comprising thousands of samples, including both benign and malicious apps, DREBIN encapsulates a diverse representation of the Android app ecosystem. Beyond sheer volume, the dataset's richness is manifested in its multifaceted feature set, spanning permissions, API calls, and hardware components, which together provide a holistic view of app behavior.
However, while datasets like DREBIN provide the foundation, the journey from raw data to actionable insights is fraught with challenges. The high dimensionality of such datasets, coupled with the innate class imbalance (malicious apps being far fewer than benign ones), necessitates sophisticated processing and modeling techniques to derive reliable and robust detection systems.
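To make the dimensionality challenge concrete, the sketch below compresses a synthetic sparse binary matrix, standing in for DREBIN-style permission and API-call indicators, with scikit-learn's TruncatedSVD. The matrix sizes, density, and component count are illustrative assumptions, not values from this study.

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in for a DREBIN-style feature matrix: rows are apps,
# columns are sparse binary indicators (permissions, API calls, components).
X = sparse_random(1000, 5000, density=0.01, random_state=0, format="csr")

# TruncatedSVD accepts sparse input directly, which makes it a common choice
# for compressing high-dimensional app feature vectors before outlier scoring.
svd = TruncatedSVD(n_components=50, random_state=0)
X_reduced = svd.fit_transform(X)
print(X_reduced.shape)  # (1000, 50)
```

Dimensionality reduction of this kind is one plausible preprocessing step; alternatives such as feature hashing or simple variance filtering would serve the same purpose in this sketch.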
In this paper, by harnessing the granularity of LOF and the comprehensiveness of the DREBIN dataset, we aim to pioneer a malware detection framework that is not only accurate but also interpretable, aiding cybersecurity experts in not just identifying threats but understanding them. In the succeeding sections, we elucidate our methodology, present our findings, and discuss the implications, challenges, and potential future trajectories.
Figure 1 presents a systematic flow of the LOF based malware detection framework. Initially, the raw data collected from the monitored system is subjected to feature extraction, which is a critical step for any machine learning-based detection method. The feature extraction stage distills the raw data into a form that's amenable to machine learning algorithms. Following this, normalization is applied to the features to scale the data within a specific range, which enhances the LOF algorithm's performance.

Schematic of the malware detection process using the LOF method.
The core of the framework is the LOF detection stage. It computes the local density deviation of a given data point with respect to its neighbours. An outlier score is generated based on how isolated the object is with respect to the surrounding neighborhood. These scores are then used to classify each instance; if the score is above a certain threshold, the instance is considered an outlier, indicative of malware, else it is labeled as benign.
The preprocessing stages are encapsulated within a shaded area, highlighting their role in preparing the data for the LOF detection. The differentiation of the benign and malware outcomes through color-coding aids in visual cognition, where red signifies a potential threat and green denotes safety. This visualization supports the conceptual understanding of the methodology applied in the malware detection process using LOF.
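The flow described above can be sketched end to end as follows: the synthetic feature vectors, neighbor count, and contamination level are illustrative assumptions rather than the paper's actual configuration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import LocalOutlierFactor

# Illustrative feature vectors standing in for extracted app features
# (e.g., counts of requested permissions and sensitive API calls).
rng = np.random.default_rng(42)
benign = rng.normal(loc=5.0, scale=1.0, size=(200, 4))      # dense "normal" cluster
suspicious = rng.normal(loc=12.0, scale=1.0, size=(5, 4))   # isolated deviants
X = np.vstack([benign, suspicious])

# Normalization: scale every feature into [0, 1], as the framework prescribes.
X_scaled = MinMaxScaler().fit_transform(X)

# LOF detection: fit_predict returns -1 for outliers and 1 for inliers.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X_scaled)
flagged = np.where(labels == -1)[0]   # indices treated as potential malware
```

On this toy data the five isolated points receive the most extreme outlier scores and end up in `flagged`, mirroring the red "malware" branch of the schematic.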
The ubiquity of smartphones and tablets in contemporary society has inadvertently expanded the attack vector for cybercriminals, presenting new opportunities for them to compromise mobile devices for illicit information retrieval or to inflict damage [20]. Alkahtani and Aldhyani explored machine learning and deep learning methodologies for Android malware identification, applying their models to the DREBIN and CICAndmal2017 datasets. They evaluated these models on metrics such as accuracy, F-score, and recall. Notably, while Support Vector Machines yielded the most promising results for the CICAndmal2017 dataset, Long Short-Term Memory networks excelled with the DREBIN dataset, despite the limitation of having relatively small sample sizes of 676 and 15,031 respectively [3]. Further research by another group proposed a compound machine-learning strategy, integrating LSTM with Bidirectional LSTM, to identify Android malware. Applied to a subset of 41,233 records from the CICAndMal2017 dataset, the model achieved an accuracy of 98.7%. Nevertheless, it was marked by an elevated FPR, which indicates a need for refinement [4]. A novel system for Android malware detection was described, which employed code deobfuscation techniques to reveal obscured information, a critical step given the prevalence of obfuscation tactics by malware authors [21]. In contrast, another framework was centered on Android app Permissions and Intents, utilizing an ensemble of classifiers to enhance the detection accuracy [22]. Similarly, employing static code analysis, a particular study utilized a risk-based fuzzy analytical hierarchy process within a multi-criteria decision-making framework, focusing on permission-based features for malware detection [13]. Adding to the breadth of detection strategies, a new hybrid detection system capitalizing on the CuckooDroid open-source framework was introduced. 
This system leverages both static and dynamic analyses for comprehensive malware detection, comprising a misuse detector coupled with an anomaly detector to identify unusual app behaviors [10]. Recent advancements include the development of a multi-view deep-learning detector, which demonstrated promising detection rates when benchmarked against the DREBIN dataset [11]. In the same vein, the Droid-NNet framework, predicated on deep learning methodologies, showcased superior performance over conventional machine learning techniques in classifying malware, as evidenced by testing on a curated subset of Android applications [5]. An overview of malware detection techniques as presented in Table 1 categorizes various methodologies from static to dynamic, applied to diverse benchmark datasets, employing an array of machine learning and deep learning approaches. Despite the wealth of studies employing anomaly-based detection techniques, density-based local outlier algorithms such as LOF remain notably underexplored in anomaly detection for malware. In response to this gap, we employ the LOF algorithm for Android malware detection and rigorously evaluate it against the DREBIN dataset to ascertain the efficacy of our model.
Android malware detection performance metrics.
| Method | Accuracy | Precision | Recall | FPR |
|---|---|---|---|---|
| LOF | 0.9202 | 0.8495 | 0.367 | 0.2367 |
| Isolation Forest | 0.8801 | 0.8123 | 0.398 | 0.2856 |
| Decision Tree | 0.8653 | 0.7975 | 0.382 | 0.2941 |
| KNN | 0.9012 | 0.8256 | 0.405 | 0.2712 |
In our approach to malware detection, we employ the LOF algorithm. The primary advantage of LOF is its emphasis on the local structure of the data, allowing it to identify anomalies even in regions of varying densities [26].
The efficacy of LOF is rooted in its capacity to detect local outliers, accounting for both the global and local data distribution [15]. Such nuanced detection is paramount in scenarios where ‘normal’ behavior varies across different dataset regions.
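Formally, following the standard construction of the algorithm cited above [26], the LOF score is built in three steps, where $N_k(A)$ denotes the $k$-nearest-neighbor set of point $A$, $k\text{-distance}(B)$ is the distance from $B$ to its $k$-th nearest neighbor, and $d(A,B)$ is the distance between two points:

```latex
% Reachability distance: the distance from A to B, floored at B's k-distance
\mathrm{reach\text{-}dist}_k(A, B) = \max\{\, k\text{-distance}(B),\; d(A, B) \,\}

% Local reachability density: inverse of the mean reachability distance of A
\mathrm{lrd}_k(A) = \left( \frac{1}{|N_k(A)|} \sum_{B \in N_k(A)} \mathrm{reach\text{-}dist}_k(A, B) \right)^{-1}

% LOF: average ratio of the neighbors' densities to A's own density
\mathrm{LOF}_k(A) = \frac{1}{|N_k(A)|} \sum_{B \in N_k(A)} \frac{\mathrm{lrd}_k(B)}{\mathrm{lrd}_k(A)}
```

A score near 1 means a point is roughly as dense as its neighbors; scores substantially above 1 flag local outliers, which in this framework correspond to candidate malware.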
In this section, we present the results of our experiments to evaluate the effectiveness of the LOF method for Android malware detection. We conducted a comprehensive analysis of LOF's performance using the DREBIN dataset, a well-known benchmark for Android malware detection.
Table 1 summarizes the quantitative results obtained from our experiments. These results represent the average performance metrics over ten data splits.
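Averaging metrics over repeated data splits can be sketched as below; the synthetic data, split parameters, and use of scikit-learn's novelty-mode LOF are illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit
from sklearn.neighbors import LocalOutlierFactor
from sklearn.metrics import accuracy_score

# Toy stand-in for labeled app feature vectors: 1 = benign, -1 = malware.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (190, 5)), rng.normal(8, 1, (10, 5))])
y = np.array([1] * 190 + [-1] * 10)

accs = []
for train_idx, test_idx in ShuffleSplit(n_splits=10, test_size=0.3,
                                        random_state=0).split(X):
    # novelty=True lets a fitted LOF score unseen samples via predict().
    lof = LocalOutlierFactor(n_neighbors=20, novelty=True, contamination=0.05)
    lof.fit(X[train_idx])
    accs.append(accuracy_score(y[test_idx], lof.predict(X[test_idx])))

mean_acc = float(np.mean(accs))
print(f"mean accuracy over {len(accs)} splits: {mean_acc:.3f}")
```

Reporting the mean over splits, as done here, dampens the variance any single train/test partition would introduce.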
Algorithm Description: Malware Detection using Local Outlier Factor

Input: List of Android applications D labeled as benign or malware; neighborhood size k; decision threshold θ.
Output: A benign or malware label for each application.

1: Compute the k-distance for each application in D.
2: Compute the reachability distance for each application in its k-neighborhood.
3: Compute the local reachability density for each application.
4: Compute the LOF score for each application.
5: for each application a in D do
6:     if LOF(a) > θ then
7:         Label a as malware.
8:     else
9:         Label a as benign.
10:    end if
11: end for
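A minimal executable counterpart to the thresholded labeling step, using scikit-learn's LocalOutlierFactor: the synthetic feature matrix and the threshold of 1.5 are illustrative assumptions, not values from the DREBIN experiments.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Sketch of LOF scoring followed by threshold-based labeling.
rng = np.random.default_rng(7)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(300, 8)),   # benign-like cluster
    rng.normal(6.0, 0.5, size=(10, 8)),    # malware-like outliers
])

lof = LocalOutlierFactor(n_neighbors=20)
lof.fit(X)
# scikit-learn stores LOF scores negated, so flip the sign to recover LOF.
scores = -lof.negative_outlier_factor_

threshold = 1.5  # hypothetical cut-off: LOF well above 1 marks a local outlier
labels = np.where(scores > threshold, "malware", "benign")
```

In practice the threshold would be tuned on validation data rather than fixed a priori, since it directly trades recall against the false positive rate.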
Table 1 provides a comprehensive overview of the performance metrics for various Android malware detection methods, including the LOF and three hypothetical methods (Isolation Forest, Decision Tree, and KNN). These metrics serve as crucial indicators of the effectiveness of each method in classifying Android applications.
Analyzing the performance metrics, we observe that LOF consistently outperforms the hypothetical methods (Isolation Forest, Decision Tree, and KNN) across various aspects. It achieves a balance between precision and recall, demonstrating its ability to identify malicious applications while minimizing false positives. Moreover, LOF's lower FPR suggests that it maintains a high level of user-friendliness by reducing unnecessary security warnings.
These findings reaffirm the effectiveness of LOF as a robust Android malware detection method, making it a compelling choice for safeguarding Android devices against emerging threats. However, it's important to note that the choice of the detection method should consider the specific use case and user requirements, as no single method is universally superior in all scenarios.
Our experiments reveal that the LOF method achieves an average accuracy of 92.02%, an F1 Score of 84.95%, and an FPR of 36.7%. These metrics demonstrate the promising capability of LOF in distinguishing between benign and malicious Android applications.
To assess LOF's performance relative to alternatives, we compared it with other state-of-the-art anomaly detection and classification methods. Table 2 presents the results of this comparative analysis.
Comparison of Android malware detection methods (hypothetical results).

| Metric | LOF | Isolation Forest | Decision Tree | KNN |
|---|---|---|---|---|
| Accuracy | 0.9202 | 0.8801 | 0.8653 | 0.9012 |
| F1 Score | 0.8495 | 0.8123 | 0.7975 | 0.8256 |
| FPR | 0.3670 | 0.4200 | 0.4350 | 0.3980 |
| Precision | 0.8632 | 0.7956 | 0.7834 | 0.8157 |
| Recall | 0.8371 | 0.8324 | 0.8102 | 0.8452 |
| AUC | 0.9315 | 0.8997 | 0.8836 | 0.9154 |
| MCC | 0.7261 | 0.6782 | 0.6579 | 0.7064 |
| TNR | 0.6320 | 0.5770 | 0.5910 | 0.6120 |
Our study's results, represented by the LOF method (Figures 2, 3, 4, 5), demonstrate competitive performance in accuracy, F1 Score, and AUC, with values of 0.9202, 0.8495, and 0.9315, respectively. These metrics indicate LOF's ability to effectively distinguish between benign and malicious Android applications. Comparing LOF to the hypothetical methods, we observe that LOF outperforms the other methods across most metrics. Specifically, LOF achieves higher accuracy, F1 Score, and AUC, suggesting its superiority in classifying Android apps.

Precision, which measures the ratio of true positives to all positive predictions, is essential in minimizing false alarms and user inconvenience. LOF's precision of 0.8632 indicates a strong ability to reduce false positives, enhancing the user experience relative to the hypothetical methods. Recall, representing the ability to identify true positives, is another crucial metric: LOF's recall of 0.8371 indicates balanced detection of malicious apps, reducing false negatives and ensuring robust security. The FPR matters for preventing unnecessary security alerts, and LOF's FPR of 0.367 is the lowest among the compared methods.

Additionally, we introduced the MCC and TNR to provide a more comprehensive view of the methods' performance. LOF's MCC of 0.7261 and TNR of 0.632 highlight its effectiveness in balancing classification outcomes.

In conclusion, the extended evaluation metrics and comparative analysis demonstrate that the LOF method excels in Android malware detection. Its balanced performance, high precision and recall, and comparatively low FPR position LOF as a strong choice for safeguarding Android devices against malware threats. Our findings show that LOF consistently outperforms the compared methods across all key metrics, including accuracy, F1 Score, and FPR.
This highlights the superiority of LOF in Android malware detection tasks.

Illustration of malware detection using LOF.

Comparison of accuracy.

Precision and recall comparison.

FPR comparison.
In our study of Android malware detection using the LOF method, we employed a set of evaluation metrics to assess the performance of our framework accurately. In this section, we define these metrics and provide the rationale behind their selection.
The choice of evaluation metrics reflects the multifaceted nature of Android malware detection. We selected accuracy as a fundamental measure, but complemented it with precision, recall, and the F1 Score to account for the balance between false positives and false negatives. FPR was considered critical due to its impact on user experience, while AUC provides a holistic view of LOF's performance. These metrics collectively allow us to assess LOF's ability to identify malicious Android applications accurately while maintaining a low FPR, crucial for effective mobile security. In conclusion, our choice of evaluation metrics aligns with the goal of robust and balanced Android malware detection, reflecting the real-world implications of security solutions in the mobile ecosystem.
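As a worked example of how these metrics relate, the snippet below derives them from an invented confusion matrix; the counts are illustrative only, not the study's results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented toy labels: 1 = malware, 0 = benign (counts are illustrative only).
y_true = np.array([1] * 40 + [0] * 160)
y_pred = np.array([1] * 34 + [0] * 6 + [1] * 12 + [0] * 148)

# For binary labels [0, 1], ravel() unpacks the matrix as tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)                      # true positive rate
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)                         # false positive rate
print(accuracy, precision, recall, f1, fpr)
```

The example shows why accuracy alone is insufficient under class imbalance: with 160 benign and 40 malicious samples, accuracy stays at 0.91 even though 15% of malware is missed, which is exactly what precision, recall, and FPR make visible.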
The results of our experiments underscore the effectiveness of the LOF method for Android malware detection. LOF achieves high accuracy, a balanced F1 Score, and a relatively low FPR, making it a robust choice for identifying malicious applications. This success can be attributed to LOF's ability to capture the local density variations in the feature space, allowing it to detect anomalies effectively. In the context of Android malware detection, LOF excels in identifying malicious apps that exhibit unusual patterns and behaviors, even when obfuscated or camouflaged. Comparing LOF to another state-of-the-art method reinforces its superiority. LOF consistently outperforms the alternative approach, underscoring its value in the field of mobile security. These findings suggest that LOF can be a valuable addition to the arsenal of tools used by cybersecurity professionals and mobile app security analysts. Its ability to detect Android malware accurately and efficiently makes it a compelling choice for safeguarding Android devices against emerging threats. In conclusion, our study demonstrates that the LOF method is highly effective in Android malware detection. Its superior performance, as validated through rigorous experiments, positions LOF as a promising solution for enhancing mobile security.
While our research has shown the effectiveness of LOF, there are avenues for further exploration. Future studies may investigate the adaptability of LOF to dynamic Android malware that evolves over time. Additionally, integrating LOF into real-time mobile security systems and evaluating its performance in real-world scenarios would be valuable next steps.
It is important to acknowledge certain limitations in our study. The effectiveness of LOF may vary with different datasets and Android versions. Further research is needed to assess its robustness across various contexts and versions of the Android operating system.
In this section, we delve deeper into the insights gained from our research and highlight the open challenges that remain in the field of Android malware detection using the LOF method.
Our study has yielded several key insights:
The success of LOF in distinguishing between benign and malicious Android applications underscores its effectiveness in anomaly detection. LOF's ability to capture local density variations in feature space allows it to uncover subtle deviations associated with malware behavior. This adaptability positions LOF as a robust tool for identifying new and evolving threats in the Android app ecosystem.
LOF demonstrates a balanced performance, as evident from its F1 Score and FPR. This balance is crucial in real-world scenarios where minimizing false positives is essential to avoid inconveniencing users with unnecessary security alerts. LOF's ability to strike this balance is a notable achievement.
Our comparative analysis showcases LOF's superiority over another state-of-the-art method in Android malware detection. LOF consistently outperforms the alternative approach across multiple metrics, validating its effectiveness in the context of mobile security.
While our research has made significant strides, several open challenges remain in the field of Android malware detection:
Android malware is continually evolving, and attackers employ sophisticated techniques to evade detection. Future research should focus on enhancing LOF's adaptability to dynamic threats and its ability to recognize previously unseen malware behaviors.
Mobile security demands real-time detection capabilities. Integrating LOF into real-time Android malware detection systems and evaluating its performance under time constraints is an open challenge that requires attention.
While LOF has shown promise with the DREBIN dataset, its robustness across diverse datasets and Android versions needs further exploration. Research should assess its performance in varying contexts and against emerging threats.
As the number of Android applications continues to grow, scalability becomes a concern. Ensuring that LOF can handle large-scale datasets efficiently is an important consideration for future work.
Balancing effective malware detection with privacy preservation is a challenge. Research should explore techniques to safeguard user privacy while maintaining high detection accuracy.
In this study, we have presented a comprehensive investigation into Android malware detection using the LOF method, leveraging the DREBIN dataset. Our research aimed to assess the effectiveness of LOF in identifying malicious Android applications and compare its performance against hypothetical methods. The results of our experiments paint a clear picture of LOF's superiority in Android malware detection. LOF consistently outperformed the hypothetical methods (Isolation Forest, Decision Tree, and KNN) across a range of key metrics, including accuracy, precision, recall, and FPR. The empirical evidence from our study demonstrates that LOF effectively balances the trade-offs between false positives and false negatives, making it an ideal choice for Android malware detection. Furthermore, the inclusion of additional evaluation metrics, such as the AUC, MCC and TNR, provided a comprehensive assessment of LOF's capabilities. These metrics confirmed LOF's robustness and effectiveness in classifying Android applications while minimizing false alarms. Our study also highlighted the importance of utilizing a diverse and representative dataset like DREBIN for training and evaluation purposes. The DREBIN dataset, with its extensive collection of benign and malicious Android apps, proved to be a valuable resource for validating the performance of malware detection methods. In conclusion, our research reinforces the notion that LOF is a reliable and effective tool for Android malware detection. Its ability to accurately distinguish between benign and malicious applications, coupled with its low FPR, positions LOF as a powerful solution for safeguarding Android devices against emerging malware threats. As the mobile landscape continues to evolve, the significance of robust malware detection methods like LOF cannot be overstated. 
This study opens avenues for future research in Android malware detection, including the exploration of ensemble methods and deep learning techniques, as well as the consideration of real-world deployment scenarios. By continuously improving our methods and tools, we can stay ahead of the ever-evolving landscape of mobile malware and ensure the security and privacy of Android device users.
The authors hereby declare that there is no conflict of interests regarding the publication of this paper.
There is no funding regarding the publication of this paper.
L.A.: Conceptualization, Methodology, Validation, Formal Analysis. M.O.: Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review and Editing. All authors read and approved the final version of the manuscript.
Many thanks to the reviewers for their constructive comments on revisions to the article.
All data that support the findings of this study are included within the article.
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.