Tube-CNN: Modeling temporal evolution of appearance for object detection in video

Tuan-Hung Vu; Anton Osokin; Ivan Laptev

Preprint, Working Paper. Year: 2019

Tuan-Hung Vu

Role: Author
PersonId : 959043

Département d'informatique - ENS Paris

Anton Osokin

Role: Author

Statistical Machine Learning and Parsimony

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Abstract

Object detection in video is crucial for many applications. Compared to images, video provides additional cues that can help disambiguate the detection problem. Our goal in this paper is to learn discriminative models for the temporal evolution of object appearance and to use such models for object detection. To model temporal evolution, we introduce space-time tubes corresponding to temporal sequences of bounding boxes. We propose two CNN architectures for generating and classifying tubes, respectively. Our tube proposal network (TPN) first generates a large number of spatio-temporal tube proposals, maximizing object recall. The Tube-CNN then implements a tube-level object detector in the video. Our method improves on the state of the art on two large-scale datasets for object detection in video: HollywoodHeads and ImageNet VID. Tube models show particular advantages in difficult dynamic scenes.
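To make the notion of a space-time tube concrete, the following minimal sketch represents a tube as a temporal sequence of bounding boxes and compares two tubes by their overlap. The `Tube` class, the `(x1, y1, x2, y2)` box layout, and the `tube_iou` measure (mean per-frame IoU over the temporal union, with frames covered by only one tube counting as zero) are illustrative assumptions; the abstract does not specify the representation or matching criterion used in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box layout (hypothetical, not taken from the paper): (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


@dataclass
class Tube:
    """A space-time tube: one bounding box per consecutive frame."""
    start_frame: int   # index of the first frame covered by the tube
    boxes: List[Box]   # boxes[i] belongs to frame start_frame + i


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(t1: Tube, t2: Tube) -> float:
    """Illustrative tube overlap: mean per-frame IoU over the temporal union.

    Frames covered by only one of the two tubes contribute an IoU of 0.
    """
    end1 = t1.start_frame + len(t1.boxes)
    end2 = t2.start_frame + len(t2.boxes)
    span = max(end1, end2) - min(t1.start_frame, t2.start_frame)
    if span == 0:
        return 0.0
    # Only frames present in both tubes can have non-zero overlap.
    shared = range(max(t1.start_frame, t2.start_frame), min(end1, end2))
    total = sum(
        box_iou(t1.boxes[f - t1.start_frame], t2.boxes[f - t2.start_frame])
        for f in shared
    )
    return total / span
```

Under these assumptions, a proposal tube could be matched to a ground-truth tube by thresholding `tube_iou`, in the same way that single-frame detectors threshold box IoU.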

Domains

Computer Vision and Pattern Recognition [cs.CV]

https://hal.science/hal-01980339

Submitted on: Monday, January 14, 2019, 13:01:24

Last modified on: Friday, April 19, 2024, 16:18:56

Dates and versions

hal-01980339 , version 1 (14-01-2019)

Identifiers

HAL Id : hal-01980339 , version 1
ARXIV : 1812.02619

Cite

Tuan-Hung Vu, Anton Osokin, Ivan Laptev. Tube-CNN: Modeling temporal evolution of appearance for object detection in video. 2019. ⟨hal-01980339⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL
