At present, problems related to data mining are arising in various poorly formalizable areas of research (genetics, archaeology, medicine, etc.). Such problems are characterized by insufficient knowledge about the objects under study, which hampers the development of mathematical models for them; by a large number of multitype (quantitative and qualitative) factors and a relatively small amount of data; and by the requirement that analysis results be represented in a form clear to experts in applied areas.

A promising approach to solving such problems is based on the class of logical decision functions, which are frequently represented in the form of a decision tree. A detailed description of this class, with references, can be found in [1].

The construction of decision functions with a minimal risk of erroneous predictions is an important task both in methods based on logical decision functions and in other data analysis techniques. It is well known that the complexity of the class of decision functions (the capacity characteristic of the class [2]) is an important factor affecting the quality of decisions. To achieve the best quality, we need a certain tradeoff between the complexity of the class and the accuracy of decisions obtained on the training sample. Thus, the problem arises of choosing an optimal complexity of the class.

In numerous applications, empirical data (the training sample) are supplemented with expert knowledge that is not tightly related to a distribution model: expert knowledge can be specified as upper bounds for the risk; it can express constraints on the class of distributions or the class of decision functions, specify preference rules used in decision making, etc. When an optimal complexity of the class is chosen, the available empirical data and expert knowledge have to be taken into account jointly.

This problem can be solved within the framework of Bayesian learning theory.
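As a toy illustration of the complexity–accuracy tradeoff described above (our own sketch, not the construction from the paper), consider one quantitative feature on [0, 1), a binary pattern index, and a class of logical decision functions whose "complexity" is the number k of intervals in a partition of the feature axis; each interval predicts its majority class. A simple additive penalty on k (here an invented constant, 0.05 per interval) stands in for knowledge that discourages overly complex functions:

```python
# Sketch: choosing the complexity k of a partition-based decision function
# by minimizing empirical risk plus a complexity penalty. All names,
# thresholds, and the penalty value are illustrative assumptions.

def train_and_risk(sample, k):
    """Fit a k-interval decision function; return its empirical risk and rules."""
    # Assign each training object (x, y) to one of k equal-width intervals.
    bins = [[] for _ in range(k)]
    for x, y in sample:
        bins[min(int(x * k), k - 1)].append(y)
    # Majority-class prediction in each interval (default class 0 if empty).
    rules = [max(set(b), key=b.count) if b else 0 for b in bins]
    errors = sum(y != rules[min(int(x * k), k - 1)] for x, y in sample)
    return errors / len(sample), rules

def choose_complexity(sample, max_k, penalty=0.05):
    """Pick k minimizing empirical risk plus a penalty proportional to k."""
    return min(range(1, max_k + 1),
               key=lambda k: train_and_risk(sample, k)[0] + penalty * k)

# Toy sample: class changes at x = 0.5, with one mislabeled object.
sample = [(i / 20, int(i / 20 >= 0.5)) for i in range(20)]
sample[3] = (3 / 20, 1)  # noise: a flipped label
print(choose_complexity(sample, max_k=10))  # → 2
```

With no penalty, larger k would keep reducing training error by fitting the noisy object; the penalty makes the two-interval function optimal, mirroring the tradeoff between class complexity and accuracy on the training sample.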
The idea behind this approach is to use a priori knowledge about the problem in question so that each possible distribution (strategy, state of nature) is assigned a certain weight. This weight reflects the expert's intuitive confidence in whether the unknown true distribution coincides with the one under consideration.

In this paper, we propose a Bayesian recognition model based on a finite set of events. The model is used to develop and analyze methods for constructing logical decision functions. While choosing an optimal complexity of the class, the model takes into account both empirical data and expert knowledge.

In pattern recognition, the task is to predict the pattern's index for an arbitrary object of the population described by a set of variables. The prediction is made by analyzing a training sample consisting of the values of these variables, together with the index of the corresponding pattern for each object. The variables can be of different types; i.e., some of them can be quantitative, while the others can be qualitative. As a rule, the problem is solved in a class of decision functions in which an optimal function is sought according to a given criterion. The class of logical decision functions is defined on a set of partitions of the feature space into a finite number of subdomains described by conjunctions of simple predicates. The number of subdomains determines the degree of complexity of a logical function.

A Bayesian pattern recognition model on a finite set of events is defined by formulating certain statements that avoid the local metric properties of the feature space (the transition from points of the space to events, where an event is understood as that of the original variables taking values from a certain subdomain of the partition). Moreover, the pattern recognition problem is considered on the values of a discrete unordered variable, the concept of a learning method (a mapping from
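To make the notion of a logical decision function concrete, here is a minimal sketch (a hypothetical example of ours, not taken from the paper) with one quantitative and one qualitative variable. Each subdomain of the partition is described by a conjunction of simple predicates on individual variables, and the degree of complexity of the function is the number of subdomains (three here); the feature names and thresholds are invented:

```python
# Sketch: a logical decision function as a partition of the feature space
# into subdomains, each given by a conjunction of simple predicates.
# Variables are multitype: "age" is quantitative, "smoker" is qualitative.

# Each subdomain: (conjunction of predicates, predicted pattern index).
# The three subdomains below partition the space (smoker in {"yes", "no"}).
subdomains = [
    ([lambda o: o["age"] <= 40, lambda o: o["smoker"] == "no"], 0),
    ([lambda o: o["age"] <= 40, lambda o: o["smoker"] == "yes"], 1),
    ([lambda o: o["age"] > 40], 1),
]

def decide(obj):
    """Return the pattern index of the subdomain whose conjunction holds."""
    for predicates, pattern in subdomains:
        if all(p(obj) for p in predicates):
            return pattern
    raise ValueError("subdomains do not cover this object")

print(decide({"age": 35, "smoker": "no"}))  # → 0
```

Such a rule list is equivalent to a small decision tree: each root-to-leaf path corresponds to one conjunction, and the transition from points of the space to events amounts to asking only which subdomain an object falls into.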