Recommender systems act as filtering algorithms that provide users with items likely to meet their interests, based on expressed preferences and items' characteristics. As of today, the collaborative filtering paradigm, along with deep learning techniques to learn high-quality representations of users and items, constitutes the de facto standard for personalized recommendation, showing remarkable recommendation accuracy. Nevertheless, recommendation remains a highly challenging task. Among the most debated open issues in the community, this thesis considers two algorithmic and conceptual ones, namely: (i) the inexplicable nature of users' preferences, especially when they come in the form of implicit feedback; (ii) the effective exploitation of collaborative information in the design and training of recommendation models. In domains such as fashion, food, and media content recommendation, an item's shallow profile can be enhanced through the multimodal characteristics describing it [Malitesta et al., 2023]. Driven by these assumptions, in the first part of this thesis, we apply multimodal deep learning strategies to multimedia recommendation; the goal is to study and design recommendation algorithms based upon the principles of multimodality to possibly match each item's characteristics to the implicit preference expressed by the user [Deldjoo et al., 2022], thus addressing issue (i). Recent collaborative filtering approaches profile users and items through embedding vectors in the latent space. However, such models disregard structural properties naturally encoded in the user-item interaction data. Indeed, recommendation datasets are easily described as a bipartite, undirected graph, with users and items being the graph nodes connected at multiple distance hops. 
In this respect, the application of graph neural networks, recent deep learning techniques specifically tailored to learn from non-Euclidean data, can provide a refined representation of users and items to mine near- and long-distance relationships in the user-item graph [Anelli et al., 2023b]. Indeed, this is one possible solution to exploit the collaborative information, which is effectively propagated within the user-item graph, addressing issue (ii). Finally, this thesis aims to combine the two families of recommendation strategies by leveraging graph neural networks and multimodal data [Anelli et al., 2022]. In doing so, numerous other micro-aspects within the two macro-areas introduced above are examined. Indeed, the thesis is a systematic compendium of careful analyses regarding, among others, reproducibility, novel evaluation dimensions [Anelli et al., 2023a], and tasks/scenarios complementary to recommendation.
Awarded by: Politecnico di Bari, Bari, Italy on 30 January 2024.
Supervised by: Tommaso Di Noia.
Available at: https://hdl.handle.net/11589/264941.