Evaluating the Efficacy of Different Neural Network Deep Reinforcement Algorithms in Complex Search-and-Retrieve Virtual Simulations

  • Ishita Vohra
  • Shashank Uttrani
  • Akash K. Rao*
  • Varun Dutt

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

In recent years, Deep Reinforcement Learning (DRL) has been extensively used to solve problems in various domains such as traffic control, healthcare, and simulation-based training. Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) are DRL's latest state-of-the-art on-policy and off-policy algorithms, respectively. Although previous studies have shown that SAC generally performs better than PPO, hyperparameter tuning can significantly impact the performance of these algorithms. Moreover, a systematic evaluation of the efficacy of these algorithms after hyperparameter tuning in dynamic and complex environments is missing and much needed in the literature. This research evaluates the effect of the number of layers and nodes in the SAC and PPO algorithms in a search-and-retrieve task developed in the Unity 3D game engine. In the task, a bot had to navigate through the physical mesh and collect 'target' objects while avoiding 'distractor' objects. We compared the SAC and PPO models on four test conditions that differed in the ratios of targets to distractors. Results revealed that PPO performed better than SAC across all test conditions when the number of layers and units in the architecture was the lowest. When targets outnumbered distractors (9:1), PPO outperformed SAC, especially when the number of units and layers was large. Furthermore, increasing the number of layers and units per layer improved the performance of both PPO and SAC. Results also imply that similar hyperparameter settings should be used when comparing models developed using DRL algorithms. We discuss the implications of these results and explore possible applications of modern, state-of-the-art DRL algorithms for learning the semantics and idiosyncrasies of complex and dynamic environments.
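The abstract notes that the number of layers and units per layer was the hyperparameter varied for both algorithms. Assuming the Unity task was trained through Unity's ML-Agents toolkit (the standard DRL interface for the Unity 3D engine; the record does not name it explicitly), a trainer configuration contrasting the two architectures might be sketched as follows — the behavior name and all numeric values here are illustrative, not taken from the paper:

```yaml
# Hypothetical ML-Agents trainer configuration (illustrative values only).
# Swapping trainer_type between ppo and sac, while holding network_settings
# fixed, is one way to keep hyperparameters comparable across algorithms.
behaviors:
  SearchRetrieveBot:
    trainer_type: ppo        # or: sac
    hyperparameters:
      batch_size: 128
      learning_rate: 3.0e-4
    network_settings:
      num_layers: 2          # network depth, varied in the study
      hidden_units: 128      # units per layer, varied in the study
    max_steps: 500000
```

Keeping `network_settings` identical across the PPO and SAC runs reflects the abstract's suggestion that comparisons between DRL algorithms should use similar hyperparameter settings.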

Original language: English
Title of host publication: Advanced Computing - 11th International Conference, IACC 2021
Editors: Deepak Garg, Sarangapani Jagannathan, Ankur Gupta, Lalit Garg, Suneet Gupta
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 348-361
Number of pages: 14
ISBN (Print): 9783030955014
DOIs
Publication status: Published - 2022
Event: 11th International Advanced Computing Conference, IACC 2021 - Msida, Malta
Duration: 18-12-2021 - 19-12-2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1528 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 11th International Advanced Computing Conference, IACC 2021
Country/Territory: Malta
City: Msida
Period: 18-12-21 - 19-12-21

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics
