Review of Deep Reinforcement Learning (DRL) Algorithms for MPPT and Partial Shading in Solar PV applications
Abstract
This study reviews Deep Reinforcement Learning (DRL) algorithms applied to photovoltaic (PV) systems. A literature survey was conducted on DRL techniques for Maximum Power Point Tracking (MPPT) and partial shading conditions. The survey shows that Deep Deterministic Policy Gradient (DDPG) is the most widely implemented technique because of its fast convergence. Deep Q-Network (DQN) was reported to respond faster than DDPG. Twin Delayed Deep Deterministic Policy Gradient (TD3) was considered preferable, while the Soft Actor-Critic (SAC) approach better eliminates power oscillations under partially shaded conditions. For effective learning, a DRL-based MPPT implementation requires defining the state variables, action variables, and reward function for the PV module. It is therefore important to observe voltage, current, irradiance, and temperature data, which allow the controller to adapt to changing environmental conditions. DRL requires more computational effort than conventional methods because of its training phase. However, trained models can operate with relatively low computational effort, making DRL a promising approach for real-time applications. The survey also showed that the exploration–exploitation trade-off is a fundamental challenge in DRL-based MPPT control. Effective management of this trade-off, together with bridging the gap between simulation and real-world hardware implementation, will therefore enable DRL to become a practical solution for MPPT in PV systems.
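To make the state, action, and reward definitions mentioned in the abstract concrete, the following is a minimal sketch and not an implementation taken from any of the surveyed works. It assumes a state built from the four observed quantities, a discrete action set of duty-cycle changes (suited to a DQN-style agent; DDPG, TD3, and SAC would instead output a continuous value), a reward equal to the change in output power, and epsilon-greedy selection as one simple way to handle the exploration–exploitation trade-off. All names and step sizes (`PVState`, `ACTIONS`, `reward`, `epsilon_greedy`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PVState:
    """Observation passed to the agent (assumed set of measurements)."""
    voltage: float      # PV terminal voltage, V
    current: float      # PV output current, A
    irradiance: float   # solar irradiance, W/m^2
    temperature: float  # module temperature, deg C

    @property
    def power(self) -> float:
        return self.voltage * self.current


# Hypothetical discrete actions: change applied to the converter duty cycle.
ACTIONS = (-0.01, 0.0, 0.01)


def reward(prev: PVState, curr: PVState) -> float:
    """Assumed reward: change in output power between consecutive steps."""
    return curr.power - prev.power


def epsilon_greedy(q_values, epsilon, rng):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    prev = PVState(30.0, 5.0, 1000.0, 25.0)
    curr = PVState(31.0, 5.0, 1000.0, 25.0)
    print("reward:", reward(prev, curr))
    idx = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1, rng=rng)
    print("duty-cycle change:", ACTIONS[idx])
```

A power-difference reward is only one possible choice; a design could equally penalise oscillation around the operating point or normalise by rated power, and a large epsilon early in training that decays over time is a common way to shift from exploration to exploitation.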
DOI: http://dx.doi.org/10.55579/jaec.2026102.521
Copyright (c) 2026 Journal of Advanced Engineering and Computation

This work is licensed under a Creative Commons Attribution 4.0 International License.









