Abstract

The digitalization of many spheres of economic and social life is accompanied by the emergence of large amounts of data, which must be processed to identify dependencies and to build models of processes and systems. This study is devoted to the development and investigation of a mathematical model for classifying data on medical care in the medical organization of the Lipetsk region. The inputs are indicators of medical care, divided into five groups: data describing the patient; data describing the medical organization in which the care was provided; indicators of the disease; data on the health employee who provided the care; and indicators characterizing the specific features of the patient's visits to a particular specialist. The study was conducted on more than one million records. Its purpose is to propose models and approaches for identifying erroneous records as well as cases of falsification. The paper presents a statement of the binary classification problem. Anomaly detection is the problem of finding data that do not correspond to the expected behavior of a process or indicator in the system. When building systems for detecting anomalous observations, much attention must be paid to the model underlying the system. The study constructs a model for detecting anomalous values of a fixed indicator that combines an isolation forest algorithm, used to estimate the anomaly index of an observation, with the subsequent application of a neural network classifier. The study reports the results of computational experiments to determine the threshold value that divides records into anomalous observations and data showing no signs of abnormality.
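The two-stage scheme described above (anomaly index from an isolation forest, a threshold that splits records into two classes, then a neural network classifier trained on those classes) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the scikit-learn estimators stand in for whatever implementation the authors used, and the 0.95 quantile threshold, the helper name `label_by_anomaly_index`, and the network size are hypothetical choices, not values from the study.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neural_network import MLPClassifier


def label_by_anomaly_index(X, quantile=0.95, seed=0):
    """Score records with an isolation forest and split them by a threshold.

    The quantile-based threshold is an assumption made for this sketch;
    the study selects its threshold through computational experiments.
    """
    forest = IsolationForest(n_estimators=100, random_state=seed).fit(X)
    # score_samples is lower for more abnormal points, so negate it
    # to obtain an index that grows with abnormality.
    anomaly_index = -forest.score_samples(X)
    threshold = np.quantile(anomaly_index, quantile)
    labels = (anomaly_index > threshold).astype(int)  # 1 = anomalous
    return anomaly_index, threshold, labels


# Synthetic stand-in for the medical-care records (not real data).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(1000, 5))
outliers = rng.normal(6.0, 1.0, size=(20, 5))
X = np.vstack([normal, outliers])

anomaly_index, threshold, labels = label_by_anomaly_index(X)

# Second stage: a small neural network learns the two classes, so that new
# records can be classified without re-running the forest.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
clf.fit(X, labels)
predicted = clf.predict(X)
```

Varying `quantile` and inspecting how the class sizes change is one simple way to explore the threshold choice on real data.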
To evaluate which factors should be passed to the input of the neural network classifier (in order to improve the time efficiency of data processing), an approach to reducing the neural network model based on sensitivity analysis is proposed. The classical approach to system sensitivity is to find the sensitivity with respect to a parameter of the system under study; however, there is also a direction of sensitivity analysis that treats the factors of the system as the estimated parameters. The proposed approach applies the Analysis of Finite Fluctuations. This analysis replaces the mathematical model of the dependence of the system output on the factors with a model of the dependence of the finite fluctuation of the output on the finite fluctuations of the factors. In mathematical analysis such a structure is known: it is the Lagrange mean value theorem. The approach makes it possible to determine the values of the so-called factor loads. The paper presents a new approach to averaging the obtained factor loads and constructing interval characteristics for their estimation. A study of the stability of the proposed procedure for calculating the sensitivity coefficients of the model is also presented.
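As a rough illustration of the finite-fluctuation idea, the Lagrange mean value theorem states that for a differentiable model y = f(x) there is some alpha in (0, 1) such that f(x + dx) - f(x) equals the sum over i of df/dx_i evaluated at x + alpha*dx, multiplied by dx_i. Each term of that sum can be read as the load of factor i. The sketch below finds alpha numerically for a toy single-neuron model; the model, its weights, the input values, and the function names are all hypothetical, and the averaging and interval estimates described in the paper are not reproduced here.

```python
import math


def model(x, w, b):
    """Toy model y = tanh(w.x + b); a stand-in for the real classifier."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient(x, w, b):
    """Analytic gradient of the toy model with respect to the factors x."""
    y = model(x, w, b)
    return [(1.0 - y * y) * wi for wi in w]


def factor_loads(x, dx, w, b, grid=200, tol=1e-12):
    """Return (alpha, dy, loads) such that sum(loads) matches dy."""
    dy = model([xi + di for xi, di in zip(x, dx)], w, b) - model(x, w, b)

    def residual(alpha):
        point = [xi + alpha * di for xi, di in zip(x, dx)]
        return sum(g * di for g, di in zip(gradient(point, w, b), dx)) - dy

    # Scan [0, 1] for a sign change of the residual, then bisect inside it.
    lo, f_lo = 0.0, residual(0.0)
    for k in range(1, grid + 1):
        hi = k / grid
        if f_lo * residual(hi) <= 0.0:
            break
        lo, f_lo = hi, residual(hi)
    else:
        raise ValueError("no sign change found; try a finer grid")

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid

    alpha = 0.5 * (lo + hi)
    point = [xi + alpha * di for xi, di in zip(x, dx)]
    loads = [g * di for g, di in zip(gradient(point, w, b), dx)]
    return alpha, dy, loads


# Hypothetical weights, base point, and fluctuations of three factors.
w, b = [0.8, -0.5, 0.3], 0.1
x, dx = [0.2, 0.4, 0.1], [0.3, -0.2, 0.5]
alpha, dy, loads = factor_loads(x, dx, w, b)
print(alpha, dy, loads)
```

Because the loads sum to the output fluctuation, their relative magnitudes indicate how much each factor contributed to this particular change; factors whose loads stay small across many records are candidates for removal from the classifier input.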
