Abstract

This paper presents a new means of selecting quality data for mining multiple data sources. Traditional data-mining strategies obtain the necessary data from internal and external data sources and pool it all into one huge homogeneous dataset for discovery. In contrast, our data-mining strategy identifies the quality data in internal and external data sources for a given mining task. We advocate a framework for generating quality data. Experimental results demonstrate that this new data-collecting technique not only identifies quality data but also efficiently reduces the amount of data that must be considered during mining.
