A Baseline for Violence Behavior Detection in Complex Surveillance Scenarios
About the article
Publication date: 31 Dec 2024
Page range: 48-58
DOI: https://doi.org/10.2478/ijanmc-2024-0036
© 2024 Yingying Long et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Detection results of 3D-CBAM attention model embedded at different locations
Network | Embedding position | UCF101-24 | JHMDB | VioData |
---|---|---|---|---|
I3D | - | 84.4% | 80.4% | 86.5% |
I3D | 3D Inc_1 | 86.1% | 83.7% | 89.0% |
I3D | 3D Inc_2 | 86.7% | 83.3% | 88.3% |
I3D | 3D Inc_3 | 85.9% | 84.2% | 89.6% |
I3D | 3D Inc_1+3D Inc_2 | 88.2% | 87.5% | 90.7% |
I3D | 3D Inc_1+3D Inc_3 | 89.8% | 88.6% | 91.8% |
I3D | 3D Inc_2+3D Inc_3 | 88.0% | 88.0% | 91.4% |
I3D | 3D Inc_1+3D Inc_2+3D Inc_3 | 90.0% | 88.7% | 92.0% |
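To make the effect of each embedding position easier to read, the following minimal sketch computes the gain in percentage points over the configuration without 3D-CBAM. The accuracy values are copied from the table above; the names `RESULTS`, `GAINS`, and `gains_over_baseline` are illustrative and do not come from the article.

```python
# Accuracy (%) per embedding position, copied from the table above.
# Tuple order: (UCF101-24, JHMDB, VioData).
DATASETS = ("UCF101-24", "JHMDB", "VioData")
RESULTS = {
    "-": (84.4, 80.4, 86.5),  # no 3D-CBAM embedded
    "3D Inc_1": (86.1, 83.7, 89.0),
    "3D Inc_2": (86.7, 83.3, 88.3),
    "3D Inc_3": (85.9, 84.2, 89.6),
    "3D Inc_1+3D Inc_2": (88.2, 87.5, 90.7),
    "3D Inc_1+3D Inc_3": (89.8, 88.6, 91.8),
    "3D Inc_2+3D Inc_3": (88.0, 88.0, 91.4),
    "3D Inc_1+3D Inc_2+3D Inc_3": (90.0, 88.7, 92.0),
}


def gains_over_baseline(results, baseline_key="-"):
    """Return the gain in percentage points of each position over the baseline row."""
    base = results[baseline_key]
    return {
        position: tuple(round(acc - b, 1) for acc, b in zip(accs, base))
        for position, accs in results.items()
        if position != baseline_key
    }


GAINS = gains_over_baseline(RESULTS)
for position, deltas in GAINS.items():
    print(position, dict(zip(DATASETS, deltas)))
```

Read this way, embedding the module at all three positions adds only 0.1 to 0.2 points over 3D Inc_1+3D Inc_3 on each dataset.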
Parameter settings in network training
Parameter | Setting |
---|---|
Initial Learning Rate | 0.001 |
Epoch | 230 |
ReSize | (416,416) |
Weight Decay | 0.0005 |
Optimizer | Adam |
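As a rough illustration, the settings above could be collected in a single configuration object. This is a sketch only: the article page does not name a framework, and `TrainConfig`, its field names, and `optimizer_kwargs` are hypothetical. Only the values come from the table.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings from the table above (field names are hypothetical)."""

    initial_lr: float = 0.001
    epochs: int = 230
    input_size: tuple = (416, 416)  # the "ReSize" entry
    weight_decay: float = 0.0005
    optimizer: str = "Adam"

    def optimizer_kwargs(self):
        # Keyword arguments that an Adam-style optimizer constructor typically accepts.
        return {"lr": self.initial_lr, "weight_decay": self.weight_decay}


CONFIG = TrainConfig()
print(asdict(CONFIG))
```

If PyTorch were the framework (an assumption), these keyword arguments could be passed as `torch.optim.Adam(model.parameters(), **CONFIG.optimizer_kwargs())`.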
Violence detection accuracy of different models
Method | UCF101-24 | JHMDB | VioData |
---|---|---|---|
MPS | 82.4% | - | 85.3% |
P3D-CTN | - | 84.0% | 84.9% |
STEP | 83.1% | - | 86.4% |
YOWO | 82.5% | 85.7% | 88.0% |
Ours | 89.8% | 88.6% | 91.8% |
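The margin of the proposed model over the strongest previously reported method can be worked out per dataset, as in the sketch below. Values are copied from the table above, entries shown as "-" are treated as not reported, and the MPS result on VioData is read as 85.3%. The names `PRIOR`, `OURS`, and `margin_over_best_prior` are illustrative, not from the article.

```python
# Accuracy (%) from the comparison table; None marks an unreported ("-") entry.
# Tuple order: (UCF101-24, JHMDB, VioData).
DATASETS = ("UCF101-24", "JHMDB", "VioData")
PRIOR = {
    "MPS": (82.4, None, 85.3),
    "P3D-CTN": (None, 84.0, 84.9),
    "STEP": (83.1, None, 86.4),
    "YOWO": (82.5, 85.7, 88.0),
}
OURS = (89.8, 88.6, 91.8)


def margin_over_best_prior(prior, ours):
    """For each dataset, return (best prior method, margin in percentage points)."""
    margins = {}
    for i, dataset in enumerate(DATASETS):
        # Keep only the methods that report a result on this dataset.
        reported = {m: accs[i] for m, accs in prior.items() if accs[i] is not None}
        best = max(reported, key=reported.get)
        margins[dataset] = (best, round(ours[i] - reported[best], 1))
    return margins


MARGINS = margin_over_best_prior(PRIOR, OURS)
for dataset, (method, margin) in MARGINS.items():
    print(f"{dataset}: +{margin} points over {method}")
```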
Detection results after embedding the ASPP module and introducing spatio-temporal depthwise separable convolution
Network | UCF101-24 | JHMDB | VioData |
---|---|---|---|
Baseline | 78.5% | 75.3% | 78.9% |
CSPDarkNet-Tiny+ASPP | 80.7% | 76.6% | 82.0% |
CSPDarkNet-Tiny+ASPP+I3D (Improved 3D Inc) | 84.8% | 80.4% | 86.5% |