SecuGuard: Leveraging pattern-exploiting training in language models for advanced software vulnerability detection

Detecting vulnerabilities in source code is essential to software quality and security. This study introduces a semi-supervised learning method that combines pattern-exploiting training with cloze-style questions. A language model is trained on the SARD and Devign datasets, both of which contain vulnerable code fragments. During training, selected segments of the code are masked and the model is prompted to predict the masked tokens. Empirical results indicate that the method is effective at identifying vulnerabilities in source code and that it benefits substantially from patterns learned from the code fragments. These findings point to the potential of combining pattern-exploiting training with cloze-style questions to improve the precision of vulnerability detection in source code.
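As an illustration of the cloze-style setup described in the abstract, the following is a minimal sketch in plain Python. It does not reproduce the authors' implementation: the pattern wording, the verbalizer words ("Yes"/"No"), the helper names, and the example scores are all assumptions made for illustration. Only the general idea (mask part of the input, then read a label off the word predicted at the mask) comes from the text.

```python
from typing import Dict

MASK = "<mask>"  # RoBERTa's mask token, used here only as a plain string

# Hypothetical verbalizer: maps each class label to one word that a
# masked language model could predict at the masked position.
VERBALIZER: Dict[str, str] = {"vulnerable": "Yes", "safe": "No"}


def build_pattern(code: str) -> str:
    """Wrap a code fragment in a cloze-style question with one masked slot."""
    return f"{code.strip()}\nQuestion: is this code vulnerable? Answer: {MASK}."


def mask_token(code: str, target: str) -> str:
    """Hide the first occurrence of `target` so a model must recover it."""
    if target not in code:
        raise ValueError(f"{target!r} not found in code")
    return code.replace(target, MASK, 1)


def label_from_scores(mask_scores: Dict[str, float]) -> str:
    """Return the label whose verbalizer word scores highest at the mask."""
    return max(
        VERBALIZER,
        key=lambda label: mask_scores.get(VERBALIZER[label], float("-inf")),
    )


snippet = "char buf[8];\nstrcpy(buf, user_input);"
prompt = build_pattern(snippet)          # cloze question about the fragment
masked = mask_token(snippet, "strcpy")   # fragment with one token hidden
# Made-up scores standing in for a language model's output at the mask.
predicted = label_from_scores({"Yes": 2.3, "No": 0.4})
```

In a real pipeline the scores passed to `label_from_scores` would come from a masked language model such as RoBERTa evaluated on `prompt`; here they are hard-coded so the sketch stays self-contained.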

eISSN: 2956-7068
Language: English

Publication frequency: 2 times per year
Journal subjects: Computer Sciences, other, Engineering, Introductions and Overviews, Mathematics, General Mathematics, Physics


Article Category: Original Study

Published online: 02 Jun 2024

Pages: -

Received: 28 Oct 2023

Accepted: 16 Jan 2024

DOI: https://doi.org/10.2478/ijmce-2025-0005

Keywords: Language models, software vulnerabilities, vulnerability detection, cloze-style questions, pattern-exploiting training, RoBERTa

© 2025 Mahmoud Basharat et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
