Conference paper, Year: 2019

Where to Focus on for Human Action Recognition?

Abstract

In this paper, we present a new attention model for recognizing human actions from RGB-D videos. We propose an attention mechanism based on 3D articulated pose, whose objective is to focus on the most relevant body parts involved in the action. For action classification, we propose a classification network composed of spatio-temporal sub-networks modeling the appearance of human body parts and an RNN attention sub-network implementing our attention mechanism. Furthermore, we train the proposed network end-to-end with a regularized cross-entropy loss, jointly training the RNN to deliver attention globally over the whole set of spatio-temporal features extracted by 3D ConvNets. Our method outperforms state-of-the-art methods on the largest human activity recognition dataset available to date, the multi-view NTU RGB+D dataset, as well as on a human action recognition dataset with object interaction, the Northwestern-UCLA Multiview Action 3D dataset.
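
The following is a minimal, illustrative sketch (not the authors' code) of the idea described in the abstract: an RNN runs over the 3D pose sequence and produces attention weights over per-body-part spatio-temporal features (stand-ins here for 3D-ConvNet features), the weighted features are classified, and training uses cross-entropy plus a regularization term. All dimensions, module names, and the exact form of the regularizer are assumptions made for illustration.

```python
# Hypothetical sketch of a pose-driven attention model for action recognition.
# Dimensions, names, and the regularizer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PoseDrivenAttention(nn.Module):
    def __init__(self, num_joints=25, num_parts=5, feat_dim=512,
                 hidden_dim=128, num_classes=60):
        super().__init__()
        # RNN attention sub-network over the 3D pose sequence
        self.rnn = nn.LSTM(input_size=num_joints * 3,
                           hidden_size=hidden_dim, batch_first=True)
        self.att_fc = nn.Linear(hidden_dim, num_parts)
        # Classifier over the attention-weighted body-part features
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, pose_seq, part_feats):
        # pose_seq:   (B, T, num_joints * 3)  flattened 3D joint coordinates
        # part_feats: (B, num_parts, feat_dim) e.g. 3D-ConvNet features per body part
        _, (h_n, _) = self.rnn(pose_seq)
        att = F.softmax(self.att_fc(h_n[-1]), dim=-1)         # (B, num_parts)
        fused = (att.unsqueeze(-1) * part_feats).sum(dim=1)   # (B, feat_dim)
        return self.classifier(fused), att


def regularized_loss(logits, labels, att, lam=0.01):
    # Cross-entropy plus a simple penalty discouraging attention collapse
    # onto a single body part (the paper's regularizer may differ).
    ce = F.cross_entropy(logits, labels)
    return ce + lam * (att ** 2).sum(dim=1).mean()


if __name__ == "__main__":
    model = PoseDrivenAttention()
    pose = torch.randn(4, 30, 25 * 3)      # 4 clips, 30 frames, 25 joints
    feats = torch.randn(4, 5, 512)         # pre-extracted part features
    labels = torch.randint(0, 60, (4,))
    logits, att = model(pose, feats)
    loss = regularized_loss(logits, labels, att)
    loss.backward()                        # end-to-end training step
    print(loss.item())
```

Because the loss is differentiable through both the attention RNN and the classifier, a single backward pass trains the attention and classification components jointly, which is the end-to-end property the abstract emphasizes.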
Main file: 421.pdf (767.4 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01927432, version 1 (19-11-2018)

Identifiers

  • HAL Id: hal-01927432, version 1

Cite

Srijan Das, Arpit Chaudhary, Francois Bremond, Monique Thonnat. Where to Focus on for Human Action Recognition?. WACV 2019 - IEEE Winter Conference on Applications of Computer Vision, Jan 2019, Waikoloa Village, Hawaii, United States. pp.1-10. ⟨hal-01927432⟩