Research on Video Fire Recognition Method Based on Spatial and Temporal Networks
Submitted: 2025-08-19    Revised: 2025-10-14
Funding: National Natural Science Foundation of China (51977074)
|
Keywords: video fire recognition; spatiotemporal network; convolutional neural network; bidirectional long short-term memory network; feature difference structure
Abstract:
To improve the recognition and detection of fires in video, deep learning is applied to mining the dynamic and static features of fire, and a spatiotemporal fire recognition network that accounts for both time-domain and spatial-domain information is constructed. The network takes multiple consecutive frames of a fire video as input and uses depthwise separable convolutions combined with an attention mechanism to build a lightweight convolutional structure, which, together with a feature grouping strategy, extracts the static spatial-domain features of each frame. On top of these multi-frame static features, a bidirectional long short-term memory (BiLSTM) network is introduced to fully mine the dynamic features between video frames from the time-domain perspective, and an adaptive fusion scheme combines the multi-frame fire predictions to safeguard recognition accuracy. In addition, given the high repetitiveness of real scenes, a similar-feature differential filtering structure is designed: the Hamming distance is used to compute the feature repetition rate between adjacent input frames, reducing redundant computation and improving the efficiency of video fire recognition. Experiments on multiple standard public datasets show that the proposed spatiotemporal network fully extracts the dynamic and static features of video fires; compared with other methods, it effectively balances recognition accuracy and efficiency, offers higher robustness and generalization, and adapts better to different fire scenarios.
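The abstract mentions adaptively fusing per-frame fire predictions into one video-level result. As a minimal sketch of that idea, the snippet below weights each frame's fire probability by a softmax over its confidence (taken here as distance from the undecided point 0.5); the confidence measure and this fusion rule are illustrative assumptions, not the paper's exact design.

```python
import math

def adaptive_fusion(frame_probs):
    """Fuse per-frame fire probabilities with confidence-based softmax weights.

    Assumption (not from the paper): confidence is the distance of each
    probability from 0.5, so frames the network is sure about contribute
    more to the fused video-level score.
    """
    conf = [abs(p - 0.5) for p in frame_probs]
    exps = [math.exp(c) for c in conf]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * p for w, p in zip(weights, frame_probs))

# Three confident fire frames and one ambiguous frame: the ambiguous
# frame is down-weighted, pulling the fused score above the plain mean.
probs = [0.92, 0.88, 0.55, 0.95]
fused = adaptive_fusion(probs)
```

With these numbers the fused score lands near 0.85, slightly above the unweighted mean of 0.825, showing how confident frames dominate the decision.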
| 关闭 |
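The similar-feature differential filtering described in the abstract can be sketched as follows: each frame is reduced to a binary feature signature, the Hamming distance between adjacent signatures gives a repetition rate, and near-duplicate frames are skipped to cut redundant computation. The bit-signature representation and the 0.9 threshold here are illustrative assumptions, not the paper's exact parameters.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two equal-width bit signatures."""
    return bin(a ^ b).count("1")

def repetition_rate(a: int, b: int, n_bits: int) -> float:
    """Fraction of identical bits; 1.0 means the signatures match exactly."""
    return 1.0 - hamming_distance(a, b) / n_bits

def filter_frames(signatures, n_bits=64, threshold=0.9):
    """Keep a frame index only if its signature differs enough from the
    last kept frame's signature (repetition rate below the threshold)."""
    kept = []
    last = None
    for i, sig in enumerate(signatures):
        if last is None or repetition_rate(sig, last, n_bits) < threshold:
            kept.append(i)
            last = sig
    return kept

# Two near-duplicates of the first frame are dropped; only the first
# frame and the clearly changed fourth frame survive.
sigs = [0b1111000011110000, 0b1111000011110001,
        0b1111000011110000, 0b0000111100001111]
print(filter_frames(sigs, n_bits=16, threshold=0.9))  # → [0, 3]
```

Only the surviving frames would then be passed to the convolutional backbone, which is where the efficiency gain the abstract claims would come from.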
|
|