Abstract
Email filtering is a cost-sensitive task, because losing a legitimate message is more harmful than letting a spam message through. Evaluating the error risk of a filter trained on a given labeled dataset is therefore important for this task. This paper surveys research on Receiver Operating Characteristic (ROC) curve analysis. Using experimental results from four filters compared on four publicly available corpora, we then discuss how ROC curve analysis techniques can be used to evaluate the risk of email filters. In our view, this work is useful for designing a practical, everyday filter.