Multi-stream Multi-attention Deep Neural Network for Context-Aware Human Action Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Technological innovations in deep learning models have enabled solutions reasonably close to human-level ability for a wide variety of computer vision tasks, such as object detection and face recognition. Human Action Recognition (HAR), on the other hand, is still far from human-level ability because of several challenges, such as the diversity in how actions are performed. Because RGB-D cameras provide data in multiple modalities, video recorded by them is frequently used in current HAR research. This paper proposes an approach for recognizing human actions using depth and skeleton data captured with the Kinect depth sensor. Attention modules have been introduced in recent years to help models focus on the most important features in computer vision tasks. This paper proposes a multi-stream deep learning model with multiple attention blocks for HAR. First, the action data of the depth and skeletal modalities are represented using two distinct action descriptors. Each descriptor generates an image from the action data gathered across multiple frames. The proposed deep learning model is trained using these descriptors. Additionally, we propose a set of score fusion techniques for accurate HAR using all the features and the trained CNN + LSTM streams. The proposed method is evaluated on two benchmark datasets using the well-known cross-subject evaluation protocol. It achieved 89.83% and 90.7% accuracy on the MSRAction3D and UTDMHAD datasets, respectively. The experimental results establish the validity and effectiveness of the proposed model.
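The abstract does not say which score fusion rules the paper uses, so the following is only a minimal sketch of score-level fusion in general, assuming each trained stream outputs one vector of class scores per action sample. The function names (`softmax`, `fuse_scores`, `predict`), the three rules shown (average, product, max), and the example logits are all hypothetical illustrations, not the authors' method.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw stream outputs into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(stream_scores: Sequence[Sequence[float]], rule: str = "average") -> List[float]:
    """Combine per-stream class probabilities into one fused score vector.

    The rules here are common generic choices (an assumption); the paper's
    own fusion techniques are not described in the abstract.
    """
    if not stream_scores:
        raise ValueError("at least one stream is required")
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("all streams must score the same number of classes")

    per_class = list(zip(*stream_scores))  # group the scores class by class
    if rule == "average":
        fused = [sum(c) / len(c) for c in per_class]
    elif rule == "product":
        fused = [math.prod(c) for c in per_class]
    elif rule == "max":
        fused = [max(c) for c in per_class]
    else:
        raise ValueError(f"unknown fusion rule: {rule}")

    total = sum(fused)  # renormalize so the fused scores sum to 1
    return [f / total for f in fused]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the highest-scoring action class."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical outputs of a depth stream and a skeleton stream for 3 classes.
depth_scores = softmax([2.0, 1.0, 0.1])
skeleton_scores = softmax([0.5, 2.5, 0.3])
label = predict(fuse_scores([depth_scores, skeleton_scores], rule="average"))
```

In this toy example the depth stream alone favors class 0, while the more confident skeleton stream favors class 1; fusing the two lets the stronger evidence decide, which is the usual motivation for combining modalities at the score level.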

Original language: English
Title of host publication: 2022 IEEE Region 10 Symposium, TENSYMP 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665466585
Publication status: Published - 2022
Event: 2022 IEEE Region 10 Symposium, TENSYMP 2022 - Mumbai, India
Duration: 01-07-2022 to 03-07-2022

Publication series

Name: 2022 IEEE Region 10 Symposium, TENSYMP 2022

Conference

Conference: 2022 IEEE Region 10 Symposium, TENSYMP 2022
Country/Territory: India
City: Mumbai
Period: 01-07-22 to 03-07-22

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Information Systems and Management
  • Health Informatics
  • Computer Science Applications
  • Artificial Intelligence
  • Computer Networks and Communications
