Cannabis sativa is a globally commercialized plant used for medicine, food, fiber production, and recreation, so forensic work requires effective identification methods to distinguish legal from illegal varieties. This research uses multivariate statistical models and machine learning approaches to establish correlations between specific genotypes and tetrahydrocannabinol (Δ9-THC) content (%) in C. sativa samples. A total of 132 cannabis leaf samples were obtained from legal growers in Piedmont, Italy, and from illegal drug seizures in Turin. Samples were genetically profiled using a 13-locus STR multiplex, and their Δ9-THC content was determined by quantitative GC-MS analysis. The study assesses whether supervised classification modelling of genetic data can separate cannabis samples into legal and illegal categories; the analysis revealed distinct clusters characterized by unique allele profiles and THC content. t-distributed Stochastic Neighbor Embedding (t-SNE), Random Forest (RF), and Partial Least Squares Regression (PLS-R) were used for the machine learning modelling. All the tested models were effective in discriminating between legal and illegal samples. Although further validation is necessary, this study presents a novel forensic investigative approach that could aid law enforcement in significant marijuana seizures or in tracking illicit drug trafficking routes.
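
Models such as RF or PLS-R operate on numeric features, so STR genotypes have to be encoded before modelling. The abstract does not state which encoding was used. The following is a minimal sketch of one common choice, allele counts per (locus, allele) pair, assuming diploid genotypes. The locus names, allele values, and the helper functions `build_allele_index` and `encode` are hypothetical and serve only as illustration.

```python
from collections import Counter

def build_allele_index(genotypes):
    """Map every (locus, allele) pair seen in the data to a column index."""
    pairs = sorted({(locus, allele)
                    for sample in genotypes
                    for locus, alleles in sample.items()
                    for allele in alleles})
    return {pair: i for i, pair in enumerate(pairs)}

def encode(sample, index):
    """Return a vector of allele counts (0, 1, or 2 per column for a diploid locus)."""
    vec = [0] * len(index)
    for locus, alleles in sample.items():
        for allele, count in Counter(alleles).items():
            col = index.get((locus, allele))
            if col is not None:  # alleles absent from the index are ignored
                vec[col] += count
    return vec

# Hypothetical two-locus genotypes; names and allele values are made up.
samples = [
    {"locus_A": (5, 5), "locus_B": (7, 9)},
    {"locus_A": (5, 6), "locus_B": (9, 9)},
]
index = build_allele_index(samples)
X = [encode(s, index) for s in samples]
```

Each row of `X` could then be paired with a legal/illegal label for classification or with the measured Δ9-THC content (%) for regression. Under this assumed encoding, a 13-locus profile would yield one column per distinct allele observed across all loci.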