Agile Earth observation satellites (AEOS) possess active imaging capabilities, which enable variable image durations during observations. Because the resulting scheduling problem is highly complex, traditional heuristic algorithms cannot provide optimal solutions within a feasible timeframe. In this paper, we propose an improved deep reinforcement learning approach (IDRL) to solve the multi-objective scheduling problem for AEOS with variable image duration (MO-SPVID). The approach has two variants, IDRL based on optimal observation quality (IDRL-MQ) and IDRL based on the longest observation duration (IDRL-MD), each using a different strategy to determine the start time and duration of an observation. The experimental results demonstrate that IDRL-MD outperforms both IDRL-MQ and the competitive heuristic algorithm ALNS-NSGAII in terms of solution quality and solution diversity. The effectiveness of the chosen heuristic strategies supports their rationale. Furthermore, the results obtained on instances of varying scales indicate that IDRL generalizes well and is robust.