Identifying farmland use has long been an important topic in large-scale agricultural production management. This study used multi-temporal visible RGB images acquired by UAV over agricultural areas in Taiwan to build a model for classifying field types. We combined color and texture features to extract more information from the RGB images. The vectorized gray-level co-occurrence matrix (GLCMv), instead of the common Haralick feature, was used as the texture feature to improve classification accuracy. To understand whether changes in the appearance of crops at different times affect image features and classification, this study designed a labeling method that combines image acquisition time with land use type. The Extreme Gradient Boosting (XGBoost) algorithm was chosen to build the classifier, and two classical algorithms, the Support Vector Machine (SVM) and Classification and Regression Tree (CART) algorithms, were used for comparison. In the testing results, the highest overall accuracy reached 82%, and the best balanced accuracy across categories reached 97%. In our comparison, the color feature provided the most information to the classification model and produced the most accurate classifier. Combining the color feature with the GLCMv improved the accuracy by about 3%. In contrast, the Haralick feature did not improve the accuracy, indicating that the GLCM itself contains more information useful for prediction. The results also show that including the image acquisition time in the label reduced the within-group sum of squares by 2–31% and increased the accuracy by 1–2% for some categories, indicating that the change in crop appearance over time is also an important factor affecting image features.
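
To make the distinction between the GLCMv and the Haralick feature concrete, the sketch below shows one plausible way to build both, together with a simple color feature, using scikit-image and NumPy. It is an illustrative reconstruction, not the authors' exact pipeline; the gray-level quantization (32 levels), the distance and angle settings, and the per-channel color statistics are assumed parameters chosen for demonstration.

```python
# Illustrative sketch (not the authors' exact pipeline): contrasting the
# vectorized GLCM (GLCMv) with conventional Haralick-style statistics.
# Requires scikit-image >= 0.19 (older versions spell these functions
# greycomatrix / greycoprops).
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import graycomatrix, graycoprops
from skimage.util import img_as_ubyte

LEVELS = 32  # assumed gray-level quantization

def glcm_of(rgb_patch):
    """Quantize an RGB patch to gray levels and compute a normalized GLCM."""
    gray = img_as_ubyte(rgb2gray(rgb_patch)) // (256 // LEVELS)
    return graycomatrix(gray, distances=[1], angles=[0, np.pi / 2],
                        levels=LEVELS, symmetric=True, normed=True)

def glcmv_feature(rgb_patch):
    """GLCMv: flatten the full co-occurrence matrix into one feature vector."""
    return glcm_of(rgb_patch).ravel()

def haralick_style_feature(rgb_patch):
    """Haralick-style summary: a few scalar statistics computed from the GLCM."""
    glcm = glcm_of(rgb_patch)
    props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

def color_feature(rgb_patch):
    """Simple per-channel color statistics (mean and standard deviation)."""
    pixels = rgb_patch.reshape(-1, 3).astype(float)
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])

# Example: a combined color + GLCMv feature vector for one image patch,
# ready to pass to a classifier such as xgboost.XGBClassifier.
patch = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
x = np.concatenate([color_feature(patch), glcmv_feature(patch)])
print(x.shape)  # (2054,) = 6 color statistics + 32 * 32 GLCM entries * 2 angles
```

In this sketch the GLCMv keeps every entry of the co-occurrence matrix as a feature, whereas the Haralick-style summary compresses the matrix into a handful of scalars, which is consistent with the abstract's observation that the full GLCM retains more usable information.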