
Evaluating the Efficacy of Different Neural Network Deep Reinforcement Algorithms in Complex Search-and-Retrieve Virtual Simulations

  • Ishita Vohra
  • Shashank Uttrani
  • Akash K. Rao*
  • Varun Dutt

  *Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    In recent years, Deep Reinforcement Learning (DRL) has been extensively used to solve problems in various domains such as traffic control, healthcare, and simulation-based training. Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) are state-of-the-art on-policy and off-policy DRL algorithms, respectively. Though previous studies have shown that SAC generally performs better than PPO, hyperparameter tuning can significantly impact the performance of these algorithms. Moreover, a systematic evaluation of the efficacy of these algorithms after hyperparameter tuning in dynamic and complex environments is missing from the literature and much needed. This research aims to evaluate the effect of the number of layers and nodes in SAC and PPO algorithms in a search-and-retrieve task developed in the Unity 3D game engine. In the task, a bot had to navigate through the physical mesh and collect ‘target’ objects while avoiding ‘distractor’ objects. We compared the SAC and PPO models on four different test conditions that differed in the ratios of targets and distractors. Results revealed that PPO performed better than SAC for all test conditions when the number of layers and units present in the architecture was the lowest. When the number of targets exceeded the number of distractors (9:1), PPO outperformed SAC, especially when the numbers of units and layers were large. Furthermore, increasing the number of layers and the units per layer improved the performance of both PPO and SAC. Results also implied that similar hyperparameter settings might be used when comparing models developed using DRL algorithms. We discuss the implications of these results and explore the possible applications of using modern, state-of-the-art DRL algorithms to learn the semantics and idiosyncrasies associated with complex and dynamic environments.

    Original language: English
    Title of host publication: Advanced Computing - 11th International Conference, IACC 2021
    Editors: Deepak Garg, Sarangapani Jagannathan, Ankur Gupta, Lalit Garg, Suneet Gupta
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 348-361
    Number of pages: 14
    ISBN (Print): 9783030955014
    Publication status: Published - 2022
    Event: 11th International Advanced Computing Conference, IACC 2021 - Msida, Malta
    Duration: 18-12-2021 to 19-12-2021

    Publication series

    Name: Communications in Computer and Information Science
    Volume: 1528 CCIS
    ISSN (Print): 1865-0929
    ISSN (Electronic): 1865-0937

    Conference

    Conference: 11th International Advanced Computing Conference, IACC 2021
    Country/Territory: Malta
    City: Msida
    Period: 18-12-21 to 19-12-21

    All Science Journal Classification (ASJC) codes

    • General Computer Science
    • General Mathematics
