Test-Cost-Sensitive Attribute Reduction of Data with Normal Distribution Measurement Errors

Hong Zhao, William Zhu, Fan Min

doi:10.1155/2013/946070

Open Access

Abstract

Normally distributed measurement error is ubiquitous in applications. Generally, a smaller measurement error requires a better instrument and a higher test cost. In decision making, we select an attribute subset with appropriate measurement error to minimize the total test cost. Recently, the error-range-based covering rough set with uniformly distributed errors was proposed to investigate this issue. However, measurement errors follow a normal distribution rather than a uniform distribution, which is too simple for most applications. In this paper, we introduce normal distribution measurement errors into the covering-based rough set model and address the test-cost-sensitive attribute reduction problem in this new model. The major contributions of this paper are fourfold. First, we build a new data model based on normal distribution measurement errors. Second, we construct the covering-based rough set model with measurement errors through the “3-sigma” rule of the normal distribution. With this model, coverings are constructed from data rather than assigned by users. Third, we redefine the test-cost-sensitive attribute reduction problem on this covering-based rough set. Fourth, we propose a heuristic algorithm to deal with this problem. The experimental results show that the algorithm is more effective and efficient than the existing one. This study suggests new research trends concerning cost-sensitive learning.

Highlights

The measurement error is the difference between a measurement value and its true value.
We introduce normal distribution measurement errors into the covering-based rough set model and address the test-cost-sensitive attribute reduction problem in this new model.
We use the normal distribution and its “3-sigma” rule to build a new covering-based rough set model that addresses normal distribution measurement errors (NDME).
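
To make the “3-sigma” idea concrete, the following is a minimal sketch rather than the paper's definition. It first computes the probability that a normally distributed error stays within three standard deviations, then builds a covering from data by treating two objects as indistinguishable when their values differ by at most `k * sigma` on every attribute. The function names, the parameter `k`, and this particular neighborhood rule are assumptions made for illustration.

```python
from statistics import NormalDist


def three_sigma_coverage() -> float:
    """Probability that a zero-mean normal error lies within 3 standard deviations."""
    nd = NormalDist(mu=0.0, sigma=1.0)
    return nd.cdf(3.0) - nd.cdf(-3.0)


def neighborhood(data, sigmas, x, k=3.0):
    """Indices of objects treated as indistinguishable from object x.

    Assumption (not taken from the source): y is a neighbor of x when,
    on every attribute a, |a(x) - a(y)| <= k * sigmas[a].
    """
    return [
        y
        for y, row in enumerate(data)
        if all(abs(row[a] - data[x][a]) <= k * sigmas[a] for a in range(len(sigmas)))
    ]


def covering(data, sigmas, k=3.0):
    """One neighborhood per object; together the neighborhoods cover the data set."""
    return [neighborhood(data, sigmas, x, k) for x in range(len(data))]


if __name__ == "__main__":
    print(round(three_sigma_coverage(), 4))
    print(covering([[1.0], [1.2], [2.0]], sigmas=[0.1]))
```

In this sketch the covering depends only on the data and the per-attribute standard deviations, which mirrors the statement that coverings are constructed from data rather than assigned by users.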

Summary

Introduction

The measurement error is the difference between a measurement value and its true value. It can come from the measuring instrument, from the item being measured, from the environment, from the operator, and from other sources [1]. The data model based on measurement errors is an important form of uncertain data (see, e.g., [2,3,4]). A data item can be obtained by a number of measurement methods with different test costs. A higher test cost is required to obtain data with a smaller measurement error. We select an attribute subset with appropriate measurement error to minimize the total test cost while preserving the necessary information of the original decision system.
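
The selection problem described above can be sketched as follows. This is an illustrative exhaustive search, not the heuristic algorithm proposed in the paper. It assumes that "necessary information" means the positive region (the objects whose whole neighborhood shares their decision label), and that neighborhoods are defined by a per-attribute tolerance. The names `positive_region`, `cheapest_subset`, `tolerances`, and `costs` are hypothetical.

```python
from itertools import combinations


def positive_region(data, labels, tolerances, attrs):
    """Objects whose whole neighborhood (restricted to attrs) shares their label.

    Assumption: y is a neighbor of x when |a(x) - a(y)| <= tolerances[a]
    for every attribute a in attrs.
    """
    pos = set()
    for x, row_x in enumerate(data):
        consistent = all(
            labels[y] == labels[x]
            for y, row_y in enumerate(data)
            if all(abs(row_x[a] - row_y[a]) <= tolerances[a] for a in attrs)
        )
        if consistent:
            pos.add(x)
    return pos


def cheapest_subset(data, labels, tolerances, costs):
    """Exhaustively find the cheapest attribute subset that keeps the positive region."""
    all_attrs = tuple(range(len(costs)))
    target = positive_region(data, labels, tolerances, all_attrs)
    best, best_cost = all_attrs, sum(costs)
    for r in range(len(all_attrs) + 1):
        for attrs in combinations(all_attrs, r):
            cost = sum(costs[a] for a in attrs)
            if cost < best_cost and positive_region(data, labels, tolerances, attrs) == target:
                best, best_cost = attrs, cost
    return best, best_cost


if __name__ == "__main__":
    data = [[1.0, 5.0], [1.1, 9.0], [3.0, 5.1]]
    labels = [0, 1, 0]
    print(cheapest_subset(data, labels, tolerances=[0.3, 0.3], costs=[2, 5]))
```

The search enumerates every subset, so its cost grows exponentially with the number of attributes; it is only practical for small examples, which is why a heuristic is attractive for real data sets.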

Methods

Results

Conclusion