Portfolio optimization is an active management strategy that aims to maximize returns while keeping risk within acceptable limits. Proximal Policy Optimization (PPO), a robust on-policy actor-critic deep reinforcement learning (DRL) algorithm, is gaining popularity in portfolio optimization because it can reduce emotional biases and take systematic investment actions. However, some research has found that PPO does not achieve the same remarkable performance in portfolio optimization as it does in games or robotic control. In this paper, a novel GraphSAGE and DRL coupled model (GRL) is proposed to improve the architecture of the PPO agent by introducing a GraphSAGE-based feature extractor that captures the complex non-Euclidean relationships among market indexes, industry indexes, and stocks. In addition, the explainable model SHAP is used to select a small set of important features for GRL learning, and a method for generating a static financial graph is defined; together these improve the robustness and training efficiency of the GRL model. We provide a holistic performance evaluation of GRL on three datasets using five metrics: Return On Investment (ROI), Sharpe Ratio, Sortino Ratio, Maximum Drawdown, and Calmar Ratio. The results show that the GRL model outperforms both the Equal Weight strategy and the S&P 500 index. Moreover, a comparative analysis shows that the Shared-Extractor GRL and the Separate-Extractor GRL significantly outperform a PPO baseline without a feature extractor. This implies that integrating a GraphSAGE-based feature extractor into the PPO agent can improve its performance and robustness in portfolio optimization tasks.
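As a sketch of how the five evaluation metrics above can be computed from a series of periodic portfolio returns, the following is a minimal, self-contained illustration; the function name, annualization factor (252 trading days), and zero risk-free rate are illustrative assumptions, not details taken from the paper:

```python
import math
import statistics

def portfolio_metrics(returns, periods_per_year=252, risk_free=0.0):
    """Illustrative computation of ROI, Sharpe, Sortino, Max Drawdown, Calmar
    from a list of periodic (e.g. daily) simple returns."""
    # Build the equity curve starting from an initial wealth of 1.0.
    equity = [1.0]
    for r in returns:
        equity.append(equity[-1] * (1.0 + r))

    # Return On Investment over the whole period.
    roi = equity[-1] - 1.0

    # Sharpe Ratio: mean excess return over its standard deviation, annualized.
    excess = [r - risk_free for r in returns]
    mean = statistics.mean(excess)
    std = statistics.stdev(excess)
    sharpe = mean / std * math.sqrt(periods_per_year) if std > 0 else float("inf")

    # Sortino Ratio: like Sharpe, but penalizes only downside deviation.
    downside = [min(e, 0.0) for e in excess]
    dd_std = math.sqrt(sum(d * d for d in downside) / len(downside))
    sortino = mean / dd_std * math.sqrt(periods_per_year) if dd_std > 0 else float("inf")

    # Maximum Drawdown: largest peak-to-trough decline of the equity curve.
    peak, mdd = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        mdd = max(mdd, (peak - v) / peak)

    # Calmar Ratio: annualized return divided by Maximum Drawdown.
    years = len(returns) / periods_per_year
    ann_return = equity[-1] ** (1.0 / years) - 1.0
    calmar = ann_return / mdd if mdd > 0 else float("inf")

    return {"ROI": roi, "Sharpe": sharpe, "Sortino": sortino,
            "MDD": mdd, "Calmar": calmar}
```

Higher Sharpe, Sortino, and Calmar and a lower Maximum Drawdown indicate a better risk-adjusted strategy, which is how the paper compares GRL against the Equal Weight strategy and the S&P 500 index.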