Computed tomography (CT) can capture volumes large enough to measure a statistically meaningful number of micron-sized particles at a resolution sufficient for analyzing individual particles. However, the development of methods to efficiently investigate such image data and to model the observed particle features in an interpretable way is still an active field of research. When the image data shows particles with a wide range of shapes and sizes, traditional image segmentation methods, such as the classic watershed algorithm, struggle to recognize particles with satisfactory accuracy. Thus, more advanced machine learning methods are needed for image segmentation to improve the validity of subsequent analyses. Moreover, CT data does not include information about the mineralogical composition of particles and, therefore, additional SEM-EDS image data has to be acquired. In this paper, micro-CT image data of a particle system mostly consisting of zinnwaldite–quartz composites is considered. First, an image segmentation method is applied which uses deep convolutional neural networks, in particular an adaptation of the U-net architecture. This approach has the advantage of requiring less hand-labeling than other machine learning methods, while also being more flexible owing to the possibility of transfer learning. Then, fully parameterized models based on vine copulas are designed to determine multivariate probability distributions of descriptor vectors for the size, shape, texture and composition of particles, which allows for the estimation and interpretable characterization of interdependencies between particle descriptors. For model fitting, the segmented three-dimensional CT data and co-registered two-dimensional SEM-EDS data are used. The models are then applied to predict the mineralogical composition of particles solely on the basis of particle descriptors observed in CT data.
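The final step described above, predicting a composition descriptor from a descriptor observed in CT data, can be illustrated with a minimal sketch. It is not the model of this paper: it uses a single bivariate Gaussian pair copula, the kind of building block from which vine copulas are assembled, rather than a full vine, and it is fitted to synthetic data. The descriptor names `gray_value` and `zinnwaldite_fraction`, the dependence strength `TRUE_RHO` and all function names are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf() and inv_cdf()


def pseudo_observations(values):
    """Rank-transform a sample to the open interval (0, 1) via rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def fit_gaussian_copula(x, y):
    """Estimate the Gaussian copula parameter as the correlation of normal scores."""
    zx = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(x)]
    zy = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(y)]
    return pearson(zx, zy)


def conditional_median(u_x, rho):
    """Median of U_y given U_x = u_x under a Gaussian copula with parameter rho."""
    return STD_NORMAL.cdf(rho * STD_NORMAL.inv_cdf(u_x))


# Synthetic stand-in data (NOT from the paper): two dependent descriptors with
# non-normal margins, generated from correlated standard normal variables.
rng = random.Random(0)
TRUE_RHO = 0.8  # assumed dependence strength, chosen for illustration
gray_value, zinnwaldite_fraction = [], []
for _ in range(2000):
    z1 = rng.gauss(0.0, 1.0)
    z2 = TRUE_RHO * z1 + math.sqrt(1.0 - TRUE_RHO ** 2) * rng.gauss(0.0, 1.0)
    gray_value.append(math.exp(z1))                  # positive, skewed margin
    zinnwaldite_fraction.append(STD_NORMAL.cdf(z2))  # margin on (0, 1)

rho = fit_gaussian_copula(gray_value, zinnwaldite_fraction)

# Predicted quantile level of the composition descriptor for a particle whose
# CT descriptor lies at the 90% quantile of its margin.
u_pred = conditional_median(0.9, rho)
print(f"estimated rho: {rho:.3f}, predicted quantile level: {u_pred:.3f}")
```

The prediction is returned on the copula scale, that is, as a quantile level of the composition descriptor; mapping it back to a physical value would require the fitted marginal distribution of that descriptor. A vine copula generalizes this idea to more than two descriptors by combining several such pair copulas, possibly from different parametric families.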