A Baseline for Violence Behavior Detection in Complex Surveillance Scenarios

Violence detection can improve the ability to respond to emergencies, but there is still no dataset dedicated to violence detection. In this work, we propose VioData, a dataset specialized for violence detection in complex surveillance scenarios. To assess the dataset more accurately, we also propose a violence detection model based on target detection and 3D convolution. The model consists of two key modules: a spatio-temporal feature extraction module and a spatio-temporal feature fusion module. The feature extraction module comprises a spatial feature extraction module, which extracts key-frame features using ordinary convolutional networks, and a temporal feature extraction module, which builds temporal features using 3D convolution. The feature fusion module, Channel Fusion and Attention Mechanism (CFAM), fuses the temporal and spatial features. Experimental results indicate that the proposed model achieves higher precision than other violence detection models on the UCF101-24 and JHMDB behavior detection datasets and on our proposed violence detection dataset, VioData. This not only verifies the validity of the dataset but also provides a baseline for subsequent research and improvement in this area.
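
The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give CFAM's internals, so the code below is an assumption, not the authors' implementation: it concatenates the two feature maps along the channel axis and then applies a Gram-matrix channel attention with a residual connection. The function names, the `alpha` weight, and the toy feature values are hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math


def concat_channels(spatial, temporal):
    """Channel fusion: stack both feature maps along the channel axis.

    Each feature map is a list of channels; each channel is a flat list of
    values over the same H*W spatial positions.
    """
    if len(spatial[0]) != len(temporal[0]):
        raise ValueError("both branches must cover the same spatial positions")
    return [ch[:] for ch in spatial] + [ch[:] for ch in temporal]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features, alpha=1.0):
    """Assumed attention step: reweight channels by their mutual similarity.

    The Gram matrix holds dot products between every pair of channels; a
    row-wise softmax turns it into attention weights, and the attended
    features are added back to the input, scaled by alpha.
    """
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]
    attn = [softmax(row) for row in gram]
    positions = len(features[0])
    fused = []
    for i, channel in enumerate(features):
        mixed = [sum(attn[i][j] * features[j][k] for j in range(len(features)))
                 for k in range(positions)]
        fused.append([x + alpha * m for x, m in zip(channel, mixed)])
    return fused


def fuse(spatial, temporal, alpha=1.0):
    """Concatenate spatial and temporal features, then apply attention."""
    return channel_attention(concat_channels(spatial, temporal), alpha)


# Toy inputs: 2 spatial channels and 1 temporal channel over a flattened 2x2 grid.
spatial = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
temporal = [[0.5, 0.5, 0.5, 0.5]]
fused = fuse(spatial, temporal)
```

With `alpha=0.0` the attention term vanishes and `fuse` reduces to plain channel concatenation, which makes the contribution of the attention step easy to isolate.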

Language: English

Publication schedule: 4 times per year
Journal subjects: Computer Science, Computer Science, other

Yingying Long

Zongxin Wang

Hanzhu Wei

Xiaojun Bai

Published online: 31 Dec 2024

Pages: 48-58

DOI: https://doi.org/10.2478/ijanmc-2024-0036

Keywords: Violent Behavior Detection, Datasets, Spatio-temporal Feature, Target Detection, Feature Fusion

© 2024 Yingying Long et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.