This study aimed to evaluate classification algorithms for predicting largemouth bass (Micropterus salmoides) occurrence in South Korea. Fish monitoring and environmental data (temperature, precipitation, flow rate, water quality, elevation, and slope) were collected from 581 locations throughout four major river basins over 5 years (2011–2015). Initially, 13 classification models available in the caret package were evaluated for predicting largemouth bass occurrence. Based on accuracy (>0.8) and kappa (>0.5) criteria, the top three classification algorithms (i.e., random forest (rf), C5.0, and conditional inference random forest) were selected to develop ensemble models. However, combining the best individual models did not outperform the best individual model (rf) at predicting the frequency of largemouth bass occurrence. Additionally, annual mean temperature (12.1 °C) and fall mean temperature (13.6 °C) were the most important environmental variables for discriminating between the presence and absence of largemouth bass. The evaluation process proposed in this study will be useful for selecting a model to predict freshwater fish occurrence, but further study is required to ensure ecological reliability.
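The accuracy and kappa screening described above can be illustrated with a minimal sketch. It is written in plain Python, not with the caret package used in the study, and the function names, model labels, and confusion-matrix counts are all hypothetical values chosen for illustration, not data from the study. Only the two thresholds (accuracy > 0.8, kappa > 0.5) come from the abstract.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 presence/absence confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement, i.e. accuracy
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def passes_criteria(acc, kappa, min_acc=0.8, min_kappa=0.5):
    """Apply the screening thresholds stated in the abstract."""
    return acc > min_acc and kappa > min_kappa


# Hypothetical counts (tp, fp, fn, tn); NOT results from the study.
candidates = {
    "model_a": (40, 10, 8, 142),  # accurate on both classes
    "model_b": (2, 3, 20, 175),   # accurate mostly by predicting "absent"
}

selected = [
    name
    for name, counts in candidates.items()
    if passes_criteria(*accuracy_and_kappa(*counts))
]
print(selected)
```

In this made-up example, `model_b` exceeds the accuracy threshold but falls well below the kappa threshold because it rarely predicts presence. This is one reason a kappa criterion is useful alongside accuracy when presences are much rarer than absences.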