Criminal investigations, particularly sexual assaults, frequently require the identification of body fluid type in addition to body fluid donor to provide context. In most cases this can be achieved by conventional methods; in certain scenarios, however, alternative molecular methods are required. An example is the detection of menstrual fluid and vaginal material, which cannot be identified using conventional techniques. Endpoint reverse-transcription PCR (RT-PCR) is currently used for this purpose to amplify body fluid-specific messenger RNA (mRNA) transcripts in forensic casework. Real-time quantitative reverse-transcription PCR (RT-qPCR) is a similar method but utilises fluorescent markers to generate quantitative results in the form of threshold cycle (Cq) values. Despite the uncertainty surrounding body fluid identification, most interpretation guidelines utilise categorical statements. Probabilistic modelling is more realistic as it reflects biological variation as well as the known performance of the method. This research describes the application of various machine learning models to single-source mRNA profiles obtained by RT-qPCR and assesses their performance. Multinomial logistic regression (MLR), Naïve Bayes (NB), and linear discriminant analysis (LDA) were used to discriminate between the following body fluid categories: saliva, circulatory blood, menstrual fluid, vaginal material, and semen. The performance of MLR was slightly better when the quantitative dataset of original Cq values was used (overall accuracy of approximately 0.95) than when presence/absence coded data were used (overall accuracy of approximately 0.94). This indicates that the quantitative information obtained by RT-qPCR amplification is useful in assigning body fluid class. Of the three classification methods, MLR performed best. When receiver operating characteristic curves were used to examine performance by body fluid class, it was clear that all methods had difficulty classifying menstrual fluid samples. Future work will involve the modelling of body fluid mixtures, which are common in samples analysed as part of sexual assault investigations.
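
For readers unfamiliar with MLR, the model can be written in its standard softmax form. This is a generic sketch, not necessarily the parameterisation used in this study. The notation is assumed for illustration: x is a sample's vector of marker measurements (Cq values or presence/absence indicators), β_{k0} and β_k are the intercept and coefficient vector for class k, and K = 5 corresponds to the body fluid categories listed above.

```latex
% Standard multinomial logistic (softmax) model; all symbols are assumed notation.
P(Y = k \mid \mathbf{x})
  = \frac{\exp\!\left(\beta_{k0} + \boldsymbol{\beta}_k^{\top}\mathbf{x}\right)}
         {\sum_{j=1}^{K} \exp\!\left(\beta_{j0} + \boldsymbol{\beta}_j^{\top}\mathbf{x}\right)},
  \qquad k = 1, \dots, K
```

Under this form, each sample receives a probability for every body fluid class, and these probabilities sum to one. That is the kind of output that would support probabilistic rather than categorical reporting.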