Abstract
The maximum power point tracking (MPPT) technique is often used in photovoltaic (PV) systems to extract the maximum power under various environmental conditions. The perturbation and observation (P&O) method is one of the most well-known MPPT methods; however, it may suffer from large oscillations around the maximum power point (MPP) or low tracking efficiency. In this paper, two reinforcement learning-based maximum power point tracking (RL MPPT) methods are proposed using the Q-learning algorithm: one constructs a Q-table and the other adopts a Q-network. These two proposed methods do not require prior information about the actual PV module and can track the MPP through offline training in two phases, namely the learning phase and the tracking phase. From the experimental results, both the reinforcement learning-based Q-table maximum power point tracking (RL-QT MPPT) and the reinforcement learning-based Q-network maximum power point tracking (RL-QN MPPT) methods show smaller ripples and faster tracking speeds when compared with the P&O method. In addition, between the two proposed methods, the RL-QT MPPT method exhibits smaller oscillations and the RL-QN MPPT method achieves a higher average power.
Highlights
Sustainable energy such as solar energy is often seen as one of the solutions to reduce pollution caused by thermal power generation
The state representation needs to be discretized for the tabular method, which may cause a loss of maximum power point tracking (MPPT) control accuracy
In the reinforcement learning (RL)-QN MPPT method, the Q-table is approximated by a neural network, so that discretization of the states is not needed
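The exact network architecture and state variables used in the paper are not given in this summary. As a rough illustration only, the sketch below shows how a small neural network can replace the Q-table lookup by taking a continuous (non-discretized) PV state directly as input; the state layout, layer sizes, and action set are assumptions made for the example.

```python
import torch
import torch.nn as nn

# Hypothetical continuous state: (PV voltage, PV power), used without binning.
# The network maps this state directly to Q-values for assumed duty-cycle actions,
# replacing the discretized Q-table lookup.
q_network = nn.Sequential(
    nn.Linear(2, 32),
    nn.ReLU(),
    nn.Linear(32, 3),   # one output per assumed action {-step, 0, +step}
)

state = torch.tensor([[18.5, 62.3]])        # example (voltage in V, power in W)
q_values = q_network(state)                 # no state discretization required
action_index = int(torch.argmax(q_values))  # greedy action for the tracking phase
```

Training such a network (e.g., by minimizing the temporal difference error) uses the same bootstrapped target as the tabular update; only the way the Q values are stored changes.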
Summary
Sustainable energy such as solar energy is often seen as one of the solutions to reduce pollution caused by thermal power generation. The adaptive P&O method basically modifies the step size based on the power difference between two consecutive perturbations. To achieve the best performance, the ratio between the step size and the power difference needs to be tuned according to the actual model. Q-learning [19] is a model-free temporal difference (TD) method for performing RL. A Q-table is constructed through bootstrapping to store the optimal action value of every state-action pair. For the next state s', an optimal action a* is expected to be selected from the action set A(s') so that the Q value at s' is maximized, i.e., max Q(s', a*). According to the interaction experiences, the Q values can be updated by (12) and stored in tabular form, which is called a Q-table.
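As a concrete illustration of the tabular update described above, the sketch below applies the standard Q-learning rule Q(s, a) ← Q(s, a) + α[r + γ max Q(s', a*) − Q(s, a)] to an MPPT-style setting. The state discretization, duty-cycle action set, and hyperparameters are assumptions for illustration only, not the paper's actual choices, and equation (12) of the paper is not reproduced in this excerpt.

```python
import numpy as np

# Assumed discretization: states index bins of the PV operating point,
# actions are duty-cycle perturbations (hypothetical values).
N_STATES = 20                           # assumed number of state bins
ACTIONS = [-0.01, 0.0, +0.01]           # assumed duty-cycle step choices
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed hyperparameters

q_table = np.zeros((N_STATES, len(ACTIONS)))

def select_action(state):
    """Epsilon-greedy selection over the Q-table row of the current state."""
    if np.random.rand() < EPSILON:
        return np.random.randint(len(ACTIONS))
    return int(np.argmax(q_table[state]))

def update(state, action, reward, next_state):
    """Tabular Q-learning (TD) update: bootstrap from max Q of the next state."""
    td_target = reward + GAMMA * np.max(q_table[next_state])
    q_table[state, action] += ALPHA * (td_target - q_table[state, action])
```

In the tracking phase, the learned table would simply be used greedily (the argmax of the current state's row) at each control step, while the update above corresponds to the learning phase.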