The dynamic, non-destructive grasping of thin-skinned fruits with flexible robotic hands, which requires three-dimensional (3D) spatial structure information together with adaptive planning and motion control for the target object, is a challenging topic in agricultural intelligence. To address 3D detection, we extract features from RGB images and LiDAR point clouds and construct a multi-modal depth fusion convolutional neural network (MDF-CNN) that performs classification and image segmentation. Exploiting the advantages of a variable palm structure, we establish an evaluation mechanism for optimal grasping stability (EM-OGS) that hybridizes best-configuration and force-closure methods, and on this basis build a comprehensive-performance optimal configuration planning (CPO-CP) method driven by multiple grasping performance indexes. We further create three cross-related nonlinear prediction models, P-MGF, P-ODAP, and P-OBA, together with a forward-looking non-destructive grasp control algorithm (FL-NGCA) that grasps thin-skinned fruits with minimum grasping force. The control algorithm learns online and autonomously during the flexible hand's actual grasping process, continually improving the accuracy of the prediction models. Experimental results show that our approach markedly improves the flexible hand's comprehensive grasping performance, outperforming state-of-the-art methods for non-destructive grasping of delicate fruits on most metrics.
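The abstract names a multi-modal depth fusion network that combines RGB-image features with LiDAR point-cloud features. The sketch below is not the paper's MDF-CNN; it only illustrates, under assumed shapes, the generic fusion idea of concatenating per-pixel feature maps from the two modalities and applying a learned projection (the weights here are random stand-ins, and `fuse_features` is a hypothetical helper name).

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(rgb_feat, lidar_feat, w, b):
    """Concatenate per-pixel RGB and LiDAR-depth features along the
    channel axis, then apply a (stand-in) learned linear projection."""
    fused = np.concatenate([rgb_feat, lidar_feat], axis=-1)  # (H, W, Cr + Cl)
    return fused @ w + b                                     # (H, W, Cout)

H, W = 4, 4
rgb_feat = rng.standard_normal((H, W, 16))   # assumed CNN features from the RGB image
lidar_feat = rng.standard_normal((H, W, 8))  # assumed features from projected LiDAR depth
w = rng.standard_normal((16 + 8, 32)) * 0.1  # 24 input channels -> 32 fused channels
b = np.zeros(32)

out = fuse_features(rgb_feat, lidar_feat, w, b)
print(out.shape)  # (4, 4, 32)
```

In a real network this projection would be a trained convolutional layer and the fusion could occur at several depths; the point here is only the channel-wise combination of the two modalities before downstream classification and segmentation.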