Abstract

Database records can often be interpreted as descriptions of the states of some world, system, or generic object, where the states occur independently and are described by binary properties. If the records contain no missing values, there is a close relationship between association rules and propositions about state properties. Data mining usually yields many association rules with high confidence and high support. Since interpreting them is often cumbersome, a quantitative measure of their informativeness would be very helpful.

The main aim of the paper is to define a measure of the amount of information contained in an association rule. For this purpose we make use of the tight correspondence between association rules and logical implications. First, a quantitative measure of the information content of logical formulas is introduced and studied. The information content of an association rule is then defined as the information content of the corresponding logical implication when no knowledge about dependence among the properties of world states is available. Intuitively, an association rule contains more information if it allows a more appropriate correction of the distribution of world states obtained under the unjustified assumption that state properties are independent. A more appropriate correction here means a correction of the current probability distribution of states that yields a distribution closer to the true distribution in the sense of the Kullback-Leibler divergence.
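The two ingredients named above, the distribution of states under assumed independence and the Kullback-Leibler divergence from the true distribution, can be illustrated with a minimal Python sketch. It is not the paper's definition of information content: the function names, the two-property example, and the probabilities in `true_dist` are all hypothetical, and the step that corrects the distribution using a rule is deliberately left out because the abstract does not specify it.

```python
import math
from itertools import product


def independent_distribution(joint, n):
    """Product-of-marginals approximation of a joint over n binary properties."""
    # marginal[i] is the probability that property i holds in a state.
    marginal = [
        sum(p for state, p in joint.items() if state[i] == 1) for i in range(n)
    ]
    approx = {}
    for state in product((0, 1), repeat=n):
        prob = 1.0
        for i, bit in enumerate(state):
            prob *= marginal[i] if bit == 1 else 1.0 - marginal[i]
        approx[state] = prob
    return approx


def kl_divergence(p, q):
    """D_KL(p || q) in bits; states with p(s) = 0 contribute nothing."""
    total = 0.0
    for state, ps in p.items():
        if ps > 0.0:
            total += ps * math.log2(ps / q[state])
    return total


# Hypothetical "true" distribution over states (A, B) in which A mostly implies B.
true_dist = {(0, 0): 0.40, (0, 1): 0.20, (1, 0): 0.02, (1, 1): 0.38}

# Distribution obtained by (wrongly) assuming A and B are independent.
baseline = independent_distribution(true_dist, 2)

# How far the independence assumption is from the truth, in bits.
baseline_gap = kl_divergence(true_dist, baseline)
print(f"KL(true || independent) = {baseline_gap:.4f} bits")
```

Any candidate corrected distribution can be scored with the same `kl_divergence` call. Under the intuition stated in the abstract, a rule is more informative when its correction brings that value further below `baseline_gap`.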
