ABSTRACT
This study monitored 37 longliners fishing in waters near the Marshall Islands from 2020 to 2022 using the operation management system of Liancheng Overseas Fishery (Shenzhen) Co., Ltd. Nine predictive models were developed to relate catch per unit effort (CPUE) of yellowfin tuna (Thunnus albacares) to environmental data. The environmental data integrate 48 variables, including eddy kinetic energy, chlorophyll a concentration, sea surface height, and additional measures of vertical oceanic conditions, alongside spatiotemporal parameters (year, month, day, longitude, and latitude). The nine models were built at each of four spatial resolutions (0.25° × 0.25°, 0.5° × 0.5°, 1° × 1°, and 2° × 2°): KNN, RF, GBDT, CART, LightGBM, XGBoost, CatBoost, AdaBoost, and Stacking (RF, KNN, GBDT, and LR). These models, with a daily temporal resolution, were trained on 75% of the data and tested on the remaining 25%. The optimal spatial resolution and model were determined by comparing model evaluation metrics across these spatial resolutions. The SMOTETomek algorithm was then applied to resample the 75% training portion at the optimal spatial resolution, forming a new training dataset. This dataset was used to refine the model, which was then tested on the remaining 25% of the data. Results indicated that (1) the optimal spatial resolution is 0.25° × 0.25° and the optimal model is RF; (2) the SMOTETomek algorithm enhances the model's predictive performance; and (3) the developed SMK-RF model, with Acc and AUC values of 76.73% and 82.47%, respectively, accurately predicts the central fishing grounds for yellowfin tuna, in close agreement with actual fishing activity.
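To make the role of spatial resolution concrete, the following is a minimal sketch of how daily fishing records might be aggregated to grid cells at the four resolutions named above. The abstract does not state how positions are assigned to cells or how CPUE is computed, so two things here are assumptions: cells are identified by their south-west corner, and CPUE is taken as fish per 1000 hooks. The function names and the sample records are hypothetical and are not data from the study.

```python
import math
from collections import defaultdict

# The four grid sizes (in degrees) listed in the abstract.
RESOLUTIONS = (0.25, 0.5, 1.0, 2.0)


def cell_origin(lon, lat, res):
    """Return the south-west corner of the res x res cell containing a point.

    Assumption: cells are aligned to multiples of `res` from 0 degrees.
    """
    return (math.floor(lon / res) * res, math.floor(lat / res) * res)


def grid_cpue(records, res):
    """Aggregate records to one CPUE value per (date, cell).

    records: iterable of (date, lon, lat, catch, hooks) tuples.
    CPUE is assumed to be fish per 1000 hooks; the abstract does not
    give the unit, so treat this as illustrative only.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [catch, hooks]
    for date, lon, lat, catch, hooks in records:
        key = (date, *cell_origin(lon, lat, res))
        totals[key][0] += catch
        totals[key][1] += hooks
    return {k: 1000.0 * c / h for k, (c, h) in totals.items() if h > 0}


# Hypothetical sample records: (date, lon, lat, catch, hooks).
records = [
    ("2021-03-01", 168.10, 7.30, 12, 3000),
    ("2021-03-01", 168.40, 7.30, 6, 3000),
    ("2021-03-01", 169.60, 8.90, 0, 2800),
]

for res in RESOLUTIONS:
    cells = grid_cpue(records, res)
    print(f"{res} degree grid: {len(cells)} cell-day samples")
```

Coarser grids merge neighbouring sets into one cell-day sample, which reduces the number of training samples and smooths the CPUE signal. This trade-off is one plausible reason a resolution comparison of the kind described above is needed.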