Can Artificial Intelligence help a clinical laboratory to draw useful information from limited data sets? Application to mixed connective tissue disease

Daniel Bertin, Pierre Bongrand, Nathalie Bardin

doi:10.32629/jai.v6i2.664

Abstract

<p>Diagnosis is a key step of patient management. For decades, refined decision algorithms and numerical scores based on conventional statistical methods have been developed to ensure optimal reliability. More recently, a number of machine learning tools have been developed and applied to increasingly large data sets, including up to millions of items, yielding sophisticated classification models. While this approach has proved impressively effective in some cases, practical limitations stem from the high number of parameters that a model may require, which increases the cost and delay of decision making. In addition, information specific to local patient recruitment may be lost, hampering any simplification of universal models. Here, we explored the capacity of currently available artificial intelligence tools to classify patients seen at a single health center on the basis of a limited number of parameters. As a model, the discrimination between systemic lupus erythematosus (SLE) and mixed connective tissue disease (MCTD) on the basis of thirteen biological parameters was studied with eight widely used classifiers (including logistic regression, support vector machine, nearest neighbor classifier, random forests and neural networks). A retrospective study including 44 patients (34 SLE, 10 MCTD) was conducted at the Marseilles hospital organization. On test sets, the best area under the ROC curve was 0.83 ± 0.03 (standard error, SE) for classifiers using all 13 parameters and 0.86 ± 0.02 SE for classifiers using 5 selected parameters. We conclude that classification efficiency may be significantly improved by a knowledge-based selection of discriminating parameters.</p>
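The "area under ROC curve ± SE" figures quoted in the abstract can be illustrated with a minimal sketch. The abstract does not state how the standard error was obtained; the code below assumes it is the standard error of the mean over repeated test splits. The function names (`roc_auc`, `mean_and_se`) and all numbers are hypothetical and are not data from the study.

```python
import math
import statistics


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_and_se(values):
    """Mean and standard error of the mean (sample std / sqrt(n)).
    Assumption: SE is taken across repeated test splits."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical test split: 1 = MCTD, 0 = SLE; scores are classifier outputs.
labels = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)

# Hypothetical AUC values from four repeated train/test splits.
split_aucs = [0.80, 0.90, 0.85, 0.85]
mean_auc, se_auc = mean_and_se(split_aucs)
print(f"single-split AUC = {auc:.3f}")
print(f"mean AUC = {mean_auc:.2f} ± {se_auc:.2f} SE")
```

With only 10 MCTD cases in the cohort, each test split contains very few positives, so single-split AUC values vary widely; averaging over many splits and reporting the standard error is one common way to summarize that variability.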
