Abstract
Bringing order to classification methods should increase their role in solving applied problems, in particular in the diagnostics of materials. Developing the requirements that classification methods must meet is the first concern, and an initial formulation of these requirements is the main content of this work. Mathematical methods of classification are considered as a part of applied statistics. The natural requirements on data analysis methods and on the presentation of calculation results, which follow from the achievements and ideas accumulated by the national probabilistic and statistical scientific school, are discussed. Specific recommendations are given on a number of issues, along with criticism of individual errors. In particular, methods of data analysis must be invariant with respect to the permissible transformations of the scales in which the data are measured, i.e., the methods should be adequate in terms of measurement theory. Every statistical method of data analysis rests on a particular probabilistic model, which must be clearly described and whose premises must be justified either theoretically or experimentally. Data processing methods intended for real-world problems should be tested for robustness with respect to acceptable deviations of the initial data and of the model assumptions. The accuracy of the solutions provided by a method should be determined. When the results of a statistical analysis of real data are published, their accuracy must be indicated (confidence intervals). As an estimate of the quality of a classification algorithm, it is recommended to use predictive power instead of the proportion of correct forecasts. Methods of mathematical research are divided into "exploratory analysis" and "evidence-based statistics". Specific requirements for data processing methods arise from their "docking", that is, from the need for them to fit together when they are executed in sequence.
The limits of applicability of probabilistic-statistical methods are discussed. Specific formulations of classification problems and typical errors in applying different methods for solving them are also considered.
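The invariance requirement can be illustrated with a minimal sketch. The data, the relabeling, and the helper `compare` below are hypothetical and are not taken from the paper. For data on an ordinal scale, every strictly increasing relabeling is a permissible transformation, so a conclusion should not change under it. In this example the comparison of two samples by their means reverses after such a relabeling, while the comparison by medians does not.

```python
from statistics import mean, median

def compare(stat, a, b):
    """Return 1, 0, or -1 according to whether stat(a) >, ==, or < stat(b)."""
    sa, sb = stat(a), stat(b)
    return (sa > sb) - (sa < sb)

# Hypothetical ordinal scores (for example, expert grades from 1 to 5).
sample_a = [1, 1, 5]
sample_b = [2, 2, 2]

# A strictly increasing relabeling of the grades: permissible on an ordinal scale.
relabel = {1: 1, 2: 4, 3: 5, 4: 6, 5: 7}
sample_a_t = [relabel[x] for x in sample_a]  # [1, 1, 7]
sample_b_t = [relabel[x] for x in sample_b]  # [4, 4, 4]

# Comparison by means: the conclusion reverses after relabeling.
mean_before = compare(mean, sample_a, sample_b)      # 7/3 > 2
mean_after = compare(mean, sample_a_t, sample_b_t)   # 3 < 4

# Comparison by medians: the conclusion is unchanged.
median_before = compare(median, sample_a, sample_b)     # 1 < 2
median_after = compare(median, sample_a_t, sample_b_t)  # 1 < 4

print("means:  ", mean_before, mean_after)
print("medians:", median_before, median_after)
```

In this sketch, a conclusion based on the arithmetic mean depends on an arbitrary choice of numeric labels, whereas a conclusion based on the median survives the relabeling.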