Machine learning is a promising technique for developing models that extract relevant information from image data. This study applies convolutional neural networks, trained end-to-end, to predict the mechanical properties of apples (var. Granny Smith) from micro-CT image data collected during in vitro gastric digestion. Models were trained to output compression curves directly, which allowed them to represent complex curve shapes that changed throughout the digestion process. Models evaluated using 3-fold cross-validation demonstrated high predictive performance, with an RMSE of 4.36 N and an R² of 0.939 relative to measured data. Performance decreased to an RMSE of 14.3 N and an R² of 0.296 when the models were applied to an out-of-distribution dataset. Saliency mapping, used to interpret model output, demonstrated a mechanistic link between typical biophysical tissue changes and model attention. Overall, the end-to-end deep learning approach represents a promising method for rapid, nondestructive evaluation of mechanical properties during food processing and digestion.
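For readers less familiar with the two reported metrics, the following is a minimal sketch of how RMSE and the coefficient of determination could be computed between measured and predicted compression curves. It assumes that each curve is a list of force values in newtons sampled at the same displacement points, and that errors are pooled over all points of all curves; the abstract does not state how the study aggregated its metrics, so this pooling is an assumption. The function names `rmse` and `r_squared` and the toy data are hypothetical and are not taken from the study.

```python
import math


def _flatten(curves):
    """Pool every force value from a list of curves into one flat list."""
    return [float(value) for curve in curves for value in curve]


def rmse(measured, predicted):
    """Root-mean-square error, in the same unit as the inputs (here N)."""
    m, p = _flatten(measured), _flatten(predicted)
    if not m or len(m) != len(p):
        raise ValueError("inputs must be non-empty and of equal size")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m, p)) / len(m))


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m, p = _flatten(measured), _flatten(predicted)
    if not m or len(m) != len(p):
        raise ValueError("inputs must be non-empty and of equal size")
    mean_m = sum(m) / len(m)
    ss_res = sum((a - b) ** 2 for a, b in zip(m, p))  # residual sum of squares
    ss_tot = sum((a - mean_m) ** 2 for a in m)        # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: two compression curves sampled at four points (N).
measured = [[0.0, 10.0, 20.0, 30.0], [0.0, 5.0, 10.0, 15.0]]
predicted = [[1.0, 9.0, 21.0, 29.0], [1.0, 4.0, 11.0, 14.0]]

print(f"RMSE = {rmse(measured, predicted):.3f} N")
print(f"R2   = {r_squared(measured, predicted):.3f}")
```

Because RMSE carries the unit of the target, it reads directly as a typical force error, while R² expresses how much of the variance in the measured forces the predictions explain. Under this definition R² can fall well below the in-distribution value, or even below zero, when a model is applied to data unlike its training set.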