Systematic reviews of diagnostic test accuracy (DTA) studies increasingly compare the accuracy of multiple tests to facilitate selection of the best-performing test(s). Common approaches to comparing multiple tests include multiple meta-analyses or meta-regression with the test type as a covariate. Within-study correlation between tests is typically not considered in these approaches. Several DTA network meta-analysis (DTA-NMA) models have been suggested to compare the accuracy of multiple index tests in a single model. Our aim was to identify all DTA-NMA methods for comparing the accuracy of multiple diagnostic tests. We conducted a methodological review of DTA-NMA models. We searched PubMed, Web of Science, and Scopus from inception until the end of July 2019. Studies of any design published in English were eligible for inclusion. We also reviewed relevant unpublished material. The methods were applied to a network of 37 studies comparing human papillomavirus (HPV) DNA, mRNA, and cytology (ASCUS+/LSIL+ threshold) for the diagnosis of cervical intraepithelial neoplasia grade 2 or worse (CIN2+). We included 10 relevant studies and identified four Bayesian hierarchical DTA-NMA methods that incorporate the 2×2 data table for each index test. Using CIN2+ as a case study, we applied the DTA-NMA methods to determine the most promising test in terms of sensitivity and specificity. All models identified the mRNA test as the most accurate, followed by HPV DNA: relative sensitivity compared with cytology was 1.36-1.39 and 1.33-1.35, respectively. However, both tests had similar or worse specificity than cytology (relative specificity range 0.96-0.98 for mRNA and 0.94-0.95 for HPV DNA). Both the sensitivity and the specificity of mRNA were associated with the highest uncertainty across all models (widest 95% credible intervals 0.68-0.97 and 0.74-0.94, respectively).
The precision of the estimates and the estimated between-study and within-study variability varied across models, possibly because of differences in the key properties of the models. Different DTA-NMA methods may lead to different results. The choice of a DTA-NMA method for the comparison of multiple diagnostic tests may depend on the available data, e.g., threshold data, as well as on clinically related factors.
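
As a minimal illustration of the summary measures reported above, the following Python sketch computes sensitivity, specificity, and their ratios relative to a reference test from a single 2×2 table per test. The counts, the `TwoByTwo` class, and the `relative_accuracy` function are hypothetical and introduced only for illustration; they are not data or code from the review, and the Bayesian hierarchical DTA-NMA models themselves are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of one index test against the reference standard (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def sensitivity(self) -> float:
        # Proportion of diseased participants that the test detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of non-diseased participants that the test clears.
        return self.tn / (self.tn + self.fp)


def relative_accuracy(index: TwoByTwo, reference: TwoByTwo) -> tuple[float, float]:
    """Return (relative sensitivity, relative specificity) of index versus reference."""
    return (
        index.sensitivity / reference.sensitivity,
        index.specificity / reference.specificity,
    )


# Hypothetical counts, NOT taken from the 37-study network.
cytology = TwoByTwo(tp=60, fp=50, fn=40, tn=950)
mrna = TwoByTwo(tp=90, fp=80, fn=10, tn=920)

rel_sens, rel_spec = relative_accuracy(mrna, cytology)
print(f"relative sensitivity: {rel_sens:.2f}")
print(f"relative specificity: {rel_spec:.2f}")
```

A relative sensitivity above 1 indicates that the index test detects a larger share of diseased participants than the reference test, and a relative specificity below 1 indicates more false positives. This crude ratio from single tables ignores between-study heterogeneity and the within-study correlation between tests, which is what the DTA-NMA models are designed to account for.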