A non-destructive method for determining the color value of pelletized red peppers is crucial for pepper processing factories. This study investigated the potential of visible and fluorescence images for determining the color value of pelletized red pepper. The imaging problem caused by the cylindrical shape and irregular cross-sectional features of the pellets was reduced by extracting an approximately planar region. To integrate the information in the visible and fluorescence images, a baseline convolutional neural network (CNN) architecture was designed, and low-level, middle-level, and high-level fusion models (denoted LL-CNN, ML-CNN, and HL-CNN, respectively) were built on the baseline CNN. The effects of input image size and color space were examined. Based on the training results, the CNN fusion models were developed using visible images in the L*a*b* color space and fluorescence images in the RGB color space, with an input image size of 56×56. Among the three types of CNN fusion models, the HL-CNN performed best, with an Rv² of 0.828 and an RMSEV of 0.351. This study suggests that fusing visible and fluorescence images through a CNN is a practical approach that saves testing time and can replace the traditional destructive method. The low cost and compact structure of the imaging systems can help maintain the approach's commercial appeal for the pepper industry.
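The two figures of merit reported above can be illustrated with a minimal sketch. It assumes that Rv² is the usual coefficient of determination and RMSEV the usual root mean square error, both computed on a validation set; the abstract does not spell out these definitions. The function name `validation_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math


def validation_metrics(y_true, y_pred):
    """Return (R^2, RMSE) for paired reference and predicted values.

    Assumed definitions (not confirmed by the abstract):
      R^2  = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual sum of squares: squared prediction errors.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R^2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Hypothetical color values, for illustration only.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]

r2, rmse = validation_metrics(reference, predicted)
print(f"R^2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

Under these definitions, a higher Rv² and a lower RMSEV both indicate closer agreement between the predicted and the reference color values, which is the sense in which the HL-CNN is reported as the best of the three fusion models.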