
Deep Reinforcement Learning for Energy Management in Hybrid Electric Vehicles with Softmax Double-Actor Regularized Critics

  • Jewaliddin Shaik
  • Sri Phani Krishna Karri*
  • Anugula Rajamallaiah
  • Kishore Bingi
  • Ramani Kannan
  • Vikas Singh Panwar*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Enhancing fuel efficiency in hybrid electric vehicles (HEVs) requires energy management strategies (EMSs) that can operate effectively under nonlinear powertrain dynamics and uncertain, time-varying driving conditions. This paper proposes a deep reinforcement learning (DRL)-based EMS using the double actors regularized critics softmax deep deterministic policy gradient (DARC SD3) algorithm, which integrates Boltzmann-softmax value estimation, a dual-actor architecture, and critic regularization to improve learning stability and value-estimation accuracy. Simulation results show that the proposed DARC SD3 achieves faster convergence, improved state-of-charge (SOC) regulation, and reduced value-estimation bias compared with DDPG, TD3, and baseline SD3. Under the FTP-75 driving cycle, the proposed EMS attains 94.6% of the dynamic programming (DP) benchmark fuel economy while reducing engine transients and smoothing battery power flow. Further evaluation on an unseen composite driving cycle confirms that the trained policy maintains consistent fuel economy and SOC control, demonstrating strong generalization across diverse driving conditions.
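
As a rough illustration of the Boltzmann-softmax value estimation named in the abstract, the sketch below computes a softmax-weighted average over a handful of sampled Q-values. It is not the paper's implementation: the function name `boltzmann_softmax_value`, the inverse-temperature parameter `beta`, and the sample values are all hypothetical, and the dual actors, critic regularization, and action sampling of DARC SD3 are omitted.

```python
import math


def boltzmann_softmax_value(q_values, beta):
    """Softmax-weighted average of Q-values (hypothetical helper).

    beta = 0 gives the plain mean; a large beta approaches the maximum.
    """
    if not q_values:
        raise ValueError("q_values must be non-empty")
    q_max = max(q_values)
    # Shift by the maximum before exponentiating for numerical stability.
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


# Illustrative Q-values for three sampled actions (made-up numbers).
q_samples = [1.0, 2.0, 4.0]
mean_like = boltzmann_softmax_value(q_samples, beta=0.0)
soft = boltzmann_softmax_value(q_samples, beta=1.0)
max_like = boltzmann_softmax_value(q_samples, beta=50.0)
```

The intermediate estimate sits between the mean and the maximum of the sampled values, which is the general intuition behind using a softmax operator to temper the bias of a hard max or min over critic estimates.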

Original language: English
Pages (from-to): 723-736
Number of pages: 14
Journal: IEEE Open Journal of Vehicular Technology
Volume: 7
Publication status: Accepted/In press - 2026

All Science Journal Classification (ASJC) codes

  • Automotive Engineering
