TIG: A Multitask Temporal Interval Guided Framework for Key Frame Detection

Abstract

Key frame detection in videos has attracted substantial attention in recent years. It is a point-level task with considerable research value and practical application prospects: video surveillance systems, video cover generation, and highlight-moment flashback all require key frame detection. However, the task is beset by challenges such as the sparsity of key frame instances, the imbalance between target frames and background frames, and the absence of a suitable post-processing method. In response to these problems, we introduce a novel and effective Temporal Interval Guided (TIG) framework to precisely localize specific frames. The framework incorporates a proposed Point-Level-Soft non-maximum suppression (PLS-NMS) post-processing algorithm, which is suited to point-level tasks and relies on a well-designed confidence score decay function. Furthermore, we propose a TIG-loss, which is sensitive to the temporal interval from the target frame, to optimize the two-stage framework. The proposed method can be broadly applied to key frame detection in video understanding, including action start detection and static video summarization. Extensive experiments validate the efficacy of our approach on the action start detection benchmark datasets THUMOS’14 and ActivityNet v1.3, where it achieves state-of-the-art performance. Competitive results are also demonstrated on the SumMe and TVSum datasets for deep-learning-based static video summarization.

Publication
In IEICE TRANSACTIONS on Information
Shijie Wang(王师捷)
Master's student (2022-)

Xuejiao Hu(胡雪娇)
PhD student (2019-2024)

Sheng Liu(刘晟)
Combined Master's-PhD student (2021-)

Ming Li(李明)
Combined Master's-PhD student (2017-2024)

Yang Li(李杨)
Associate Professor

Sidan Du(都思丹)
Professor
