Mining Association Algorithm Based on ROC Convex Hull Method in Bibliographic Navigation System

Minoru Kawahara, Hiroyuki Kawano

doi:10.1007/3-540-46846-3_36

Abstract

Minoru Kawahara and Hiroyuki Kawano

1 Data Processing Center, Kyoto University, Kyoto 6068501, JAPAN, kawahara@kudpc.kyoto-u.ac.jp, http://www.kudpc.kyoto-u.ac.jp/~kawahara/index.html
2 Department of Systems Science, Kyoto University, Kyoto 6068501, JAPAN, kawano@i.kyoto-u.ac.jp, http://www.kuamp.kyoto-u.ac.jp/~kawano/index.html

In order to resolve or ease retrieval difficulties on bibliographic databases, we have been developing a bibliographic navigation system that implements our proposed mining algorithms [1]. Our navigation system shows related keywords derived from the query entered by a user, and navigates users toward appropriate bibliographies. The thresholds used in the mining association algorithm are usually given by the system administrator, so a method is required that gives thresholds which can derive appropriate association rules for the bibliographic navigation system. In this paper, we propose a method which specifies the optimal thresholds based on ROC (Receiver Operating Characteristic) analysis [2], and we evaluate the performance of the method on our practical navigation system.

According to [2], ROC graphs have long been used in signal detection theory to depict tradeoffs between hit rate and false alarm rate. ROC graphs illustrate the behavior of a classifier without regard to class distribution or error cost, and so they decouple classification performance from these factors. The ROC convex hull method compares multiple classifiers on an ROC graph and specifies the optimal classifier, that is, the one which supplies the highest performance. An ROC graph characterizes each classifier by two parameters, the true positive rate TP and the false positive rate FP.
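As an illustration of the two ROC parameters named above, the following minimal sketch computes TP and FP from the counts of a confusion matrix. The `Confusion` class, the `roc_point` function, and the example counts are hypothetical names and values chosen for illustration; they do not come from the navigation system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Hypothetical container for confusion-matrix counts (illustration only)."""
    tp: int  # positive instances classified as positive (hits)
    fn: int  # positive instances classified as negative (misses)
    fp: int  # negative instances classified as positive (false alarms)
    tn: int  # negative instances classified as negative


def roc_point(c: Confusion) -> tuple[float, float]:
    """Return (FP rate, TP rate), i.e. the (x, y) position on an ROC graph."""
    tp_rate = c.tp / (c.tp + c.fn)  # fraction of positives that are detected
    fp_rate = c.fp / (c.fp + c.tn)  # fraction of negatives that are misclassified
    return fp_rate, tp_rate


# Example with made-up counts: 10 positive and 10 negative instances.
example = roc_point(Confusion(tp=8, fn=2, fp=1, tn=9))
```

Each classifier (here, each threshold setting) would contribute one such point to the ROC graph.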
If FP is plotted on the X axis and TP on the Y axis for several instances, the resulting curve is called the ROC curve. A curve drawn nearer the point at which TP is higher and FP is lower, that is, the most-northwest curve, is better. Although an ROC graph illustrates classification performance separately from class distribution and error cost, the ROC convex hull method can take them into account. Assume that c(classification, class) is a two-place error cost function, where c(n, P) is the cost of a false negative error and c(y, N) is the cost of a false positive error, and that p(P) is the prior probability of a positive instance, so that the prior probability of a negative instance is p(N) = 1 − p(P). Then the slope of an iso-performance line can be represented by p(N)/p(P) · c(y, N)/c(n, P).

S. Arikawa, K. Furukawa (Eds.): DS’99, LNAI 1721, pp. 333–334, 1999. © Springer-Verlag Berlin Heidelberg 1999

Table 1. Minsups at Rerror = 145 and the average distances from the point (1, 0) on the ROC graph. “AllPos” means deriving all and “AllNeg” means deriving nothing.

Category  p(N)/p(P) · 1/Rerror  Optimal Minsup  ROC distance     ROC distance
                                                (Minsup = 0.08)  (ROC Algorithm)
1         0.0000 ∼ 0.2211       AllPos ∼ 0.02   0.8211           0.9477
2         0.2211 ∼ 0.7139       0.02 ∼ 0.04     0.8940           0.9725
3         0.7141 ∼ 2.2706       0.04 ∼ 0.25     0.9119           0.9008
4         2.2728 ∼ 7.1847       0.25 ∼ 0.40     0.9322           0.9857
5         7.2075 ∼ 22.565       0.40 ∼ 0.60     0.9926           0.9976
6         22.790 ∼ 69.076       0.60            0.9929           0.9968
7         71.235 ∼ 207.24       0.60            1.0262           1.0001
8         227.97 ∼ 569.93       AllNeg          1.0159           1.0000
9         759.91 ∼ 1139.9       AllNeg          1.0351           1.0000
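The iso-performance slope and the distance from the point (1, 0) described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the navigation system: the function names, the example ROC points, and the choice of a monotone-chain upper hull are all hypothetical. `cost_fp` and `cost_fn` stand for c(y, N) and c(n, P), and the optimal classifier is taken to be the hull point touched by the most-northwest iso-performance line of the given slope.

```python
import math

Point = tuple[float, float]  # (FP rate, TP rate)


def iso_performance_slope(p_pos: float, cost_fp: float, cost_fn: float) -> float:
    """Slope p(N)/p(P) * c(y, N)/c(n, P) of an iso-performance line."""
    p_neg = 1.0 - p_pos
    return (p_neg / p_pos) * (cost_fp / cost_fn)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (negative for a clockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: list[Point]) -> list[Point]:
    """Upper convex hull of the ROC points, including the trivial classifiers
    AllNeg at (0, 0) and AllPos at (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: list[Point] = []
    for p in pts:
        # Drop points that fall on or below the segment from hull[-2] to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def optimal_point(hull: list[Point], slope: float) -> Point:
    """Hull point on the most-northwest line of the given slope, i.e. the one
    maximizing the intercept TP - slope * FP."""
    return max(hull, key=lambda p: p[1] - slope * p[0])


def distance_from_1_0(p: Point) -> float:
    """Euclidean distance from the point (1, 0) on the ROC graph."""
    return math.hypot(p[0] - 1.0, p[1])


# Made-up ROC points for three hypothetical threshold settings.
hull = roc_convex_hull([(0.1, 0.8), (0.5, 0.6), (0.3, 0.85)])
best = optimal_point(hull, iso_performance_slope(p_pos=0.5, cost_fp=1.0, cost_fn=1.0))
```

In this sketch a very small slope selects AllPos at (1, 1) and a very large slope selects AllNeg at (0, 0), which mirrors the way the optimal Minsup in Table 1 moves from AllPos to AllNeg as p(N)/p(P) · 1/Rerror grows. Both trivial classifiers lie at distance 1 from the point (1, 0).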
