Open Access

From text to threats: A language model approach to software vulnerability detection



Fig. 1

An Overview of our defense framework.

Fig. 2

Comparison of F1 scores across different models and datasets.

Fig. 3

Model size comparison across the three models.

Comparison of models’ performance on various datasets.

Model          Score  SARD  SeVC  Devign  D2A
VulBERTa       88.7   84.2  80.5  81.8    79.9
SySeVR         81.5   82.6  78.3  80.2    72.7
DistilVulBERT  94.0   91.4  82.2  87.5    85.9

Fine-tuning time comparison.

Model          Dataset  Fine-tuning time (hours)
VulBERTa       SARD     1.2
SySeVR         SeVC     1.1
DistilVulBERT  SARD     0.8
DistilVulBERT  SeVC     0.9

Model overhead analysis.

Model          Parameters (millions)  Training time (hours)
VulBERTa       110                    8.2
SySeVR         90                     6.5
DistilVulBERT  66                     5.0
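
To make the overhead figures easier to compare, the relative savings of DistilVulBERT can be derived directly from the table above. The following is a minimal sketch; the dictionary layout and the helper name `relative_reduction` are illustrative assumptions, and only the numbers come from the table.

```python
# Values copied from the model overhead table; the structure is illustrative.
overhead = {
    "VulBERTa": {"params_m": 110, "train_h": 8.2},
    "SySeVR": {"params_m": 90, "train_h": 6.5},
    "DistilVulBERT": {"params_m": 66, "train_h": 5.0},
}


def relative_reduction(baseline, value):
    """Return the percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


student = overhead["DistilVulBERT"]

# Percentage savings of DistilVulBERT relative to each of the other models.
reductions = {
    name: {
        "params_pct": relative_reduction(row["params_m"], student["params_m"]),
        "train_pct": relative_reduction(row["train_h"], student["train_h"]),
    }
    for name, row in overhead.items()
    if name != "DistilVulBERT"
}

for name, r in reductions.items():
    print(f"vs {name}: {r['params_pct']:.1f}% fewer parameters, "
          f"{r['train_pct']:.1f}% less training time")
```

Under these figures, DistilVulBERT has about 40% fewer parameters than VulBERTa and needs roughly 39% less training time.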

Hyperparameters of the models.

Hyperparameter   GPT-2  CodeBERT  LSTM
Learning rate    0.001  0.0005    0.01
Batch size       32     64        128
Epochs           5      10        3
Optimizer        Adam   AdamW     RMSprop
Dropout rate     0.1    0.05      0.2
Hidden units     768    312       256
Attention heads  12     8         -
Layers           12     12        1


Require: Set of labeled training data $D = \{(x_i, y_i)\}_{i=1}^{n}$
Require: Set of $K$ teacher models $T = \{T_k\}_{k=1}^{K}$
Require: Student model $S$
Ensure: Trained student model
1: Initialize student model parameters $\theta_S$ randomly.
2: for each teacher model $T_k \in T$ do
3:   Compute predictions $p_k(x_i)$ for each $x_i \in D$.
4:   Initialize student model weights to match $T_k$.
5:   Train student model on $D$ using:
     $$\mathrm{KDLoss}(\theta_S, \theta_T^{(k)}; D) = \frac{1}{n} \sum_{i=1}^{n} D_{KL}\big(p_k(x_i) \parallel q_s(x_i; \theta_S, \theta_T^{(k)})\big)$$
     where $D_{KL}$ denotes the Kullback-Leibler divergence and $q_s(x_i; \theta_S, \theta_T^{(k)})$ is the softmax output of the student model.
6: end for
7: return Trained student model $S$
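
The loss in step 5 can be illustrated with a small, dependency-free sketch. It only evaluates the mean Kullback-Leibler divergence between teacher predictions and the student's softmax output for each teacher; it does not reproduce the training loop or the weight initialization of the algorithm. The function names, the toy probabilities, and the toy logits are all hypothetical and chosen purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def kd_loss(teacher_probs, student_logits):
    """Mean over samples of D_KL(p_k(x_i) || q_s(x_i))."""
    n = len(teacher_probs)
    return sum(
        kl_divergence(p, softmax(z))
        for p, z in zip(teacher_probs, student_logits)
    ) / n


# Hypothetical predictions of K = 2 teachers on two samples
# (classes: vulnerable, not vulnerable).
teacher_predictions = {
    "teacher_1": [[0.9, 0.1], [0.2, 0.8]],
    "teacher_2": [[0.7, 0.3], [0.4, 0.6]],
}

# Hypothetical student logits for the same two samples.
student_logits = [[2.0, 0.0], [0.0, 1.0]]

# One loss value per teacher, mirroring the loop over T_k.
losses = {
    name: kd_loss(probs, student_logits)
    for name, probs in teacher_predictions.items()
}

for name, value in losses.items():
    print(f"{name}: KD loss = {value:.4f}")
```

The loss is zero when the student's softmax output matches the teacher's predictions exactly and grows as the two distributions diverge, which is what makes it usable as a training signal for the student.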
eISSN: 2956-7068
Language: English
Publication frequency: 2 times per year
Journal subjects: Computer Sciences, other, Engineering, Introductions and Overviews, Mathematics, General Mathematics, Physics