Background: According to the World Health Organization (WHO), over 23% of the human population is infected with the tuberculosis bacillus, M. tuberculosis. Every infected person remains at risk of progressing from latent tuberculosis infection (LTBI) to active tuberculosis. WHO considers testing and treatment for LTBI in groups at high risk of reactivation a necessary condition for tuberculosis elimination. Early detection of the transition from LTBI to active tuberculosis is difficult because the onset of the disease has no clinically or radiographically distinguishable signs; clinicians therefore rely on immunodiagnostics, including a skin test with a recombinant tuberculosis allergen, and on clinical laboratory diagnostics. The objective of our study was to demonstrate the possibilities of using an artificial intelligence system to identify the level of activity of tuberculosis infection in children with small tuberculosis changes in the respiratory organs detected by X-ray.

Material and methods: A total of 489 patients registered in anti-tuberculosis institutions were enrolled in the study: the main group, a training sample of patients with confirmed active tuberculosis (n1 = 369), and the control group, a test sample of patients in whom the pathogen was in an inactive form (n2 = 120). The variables used for calculations (anamnesis, laboratory parameters and X-ray data) were obtained by routine methods in accordance with current national standards of care and clinical guidelines, and required no additional invasive interventions, equipment or material costs.
The survey results listed above (age, gender, medical history, BCG vaccination, blood biochemical parameters over time, X-ray signs), formalized according to the binary principle (presence/absence), were extracted from patient files into a study database built on MS Excel spreadsheets for further processing. At the initial stage, the Wolfram Mathematica software package was used for calculations; six classical machine learning (ML) methods were applied, including Logistic Regression, Naive Bayes, Nearest Neighbors, Neural Network and Random Forest.

Results: The calculations based on combinations of categorical features did not give a forecast of satisfactory quality, so we proceeded to search for a decision rule based on quantitative features. All of the above methods predicted the presence of the disease significantly better than its absence. The Random Forest method showed the best results for both categorical and quantitative features; however, its results could not be interpreted for clinical decision making. Having found the classical ML methods to be suboptimal, we decided to apply the author’s committee machine method, which allows minimal correction of the conditions when the sets to be separated differ significantly in cardinality and admits a subsequent geometric interpretation of the results. The committee machine method identified the 7 most informative parameters for a decision rule that makes it possible to distinguish, among children with suspected tuberculosis, patients whose pathogen is inactive and who do not require treatment.

Conclusion: The committee machine method in its geometric formulation made it possible to localize areas in the feature space that correspond to sick and healthy patients from the training sample.
These areas were described unambiguously as a system of inequalities that can be easily explained to clinicians, which allows moving from a geometric interpretation to a meaningful description of cause-and-effect relationships between the laboratory parameters in a given area and the patient’s condition.
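
To illustrate how a decision rule can be written as a system of linear inequalities combined by a committee, the following minimal Python sketch may help. It is not the rule obtained in the study: the two-feature inputs, the weights, the thresholds and the names `member_vote`, `committee_decision` and `EXAMPLE_COMMITTEE` are hypothetical, and simple majority voting is assumed as the way the committee combines its members.

```python
from typing import List, Sequence, Tuple

# One committee member is a single linear inequality: w . x > b.
# All weights and thresholds here are hypothetical, for illustration only.
Member = Tuple[Sequence[float], float]


def member_vote(member: Member, x: Sequence[float]) -> bool:
    """Return True if the feature vector x satisfies this member's inequality."""
    weights, threshold = member
    return sum(w * xi for w, xi in zip(weights, x)) > threshold


def committee_decision(committee: List[Member], x: Sequence[float]) -> bool:
    """Return True if a strict majority of members vote True (assumed rule)."""
    votes = sum(member_vote(m, x) for m in committee)
    return 2 * votes > len(committee)


# Hypothetical committee of three inequalities over two features.
EXAMPLE_COMMITTEE: List[Member] = [
    ([1.0, 0.0], 0.5),  # x1 > 0.5
    ([0.0, 1.0], 0.5),  # x2 > 0.5
    ([1.0, 1.0], 1.5),  # x1 + x2 > 1.5
]

if __name__ == "__main__":
    for point in ([1.0, 1.0], [1.0, 0.0]):
        print(point, committee_decision(EXAMPLE_COMMITTEE, point))
```

Because each member is an explicit inequality, the region of feature space assigned to each class can be read directly from the weights and thresholds, which is the kind of geometric interpretation the conclusion refers to.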