Abstract

Sampling is one of the fundamental techniques for data preprocessing and mining. It helps reduce computational costs and improve mining quality. A sampling method is typically developed independently for a specific problem and a specific user's interest, because it is hard to develop a method that generalizes across various users' interests. The absence of a general framework for sampling makes it inefficient to develop or revise a sampling method as a user's interest changes. This paper proposes a general framework, isampling, which helps a user develop sampling methods and easily modify the sampling interest encoded in a method. In the framework, a user explicitly describes her sampling interest in a graph model called an interest model. Then, isampling automatically selects a sample set that satisfies the user's interest according to the model. To demonstrate the effectiveness of our framework, we use it to develop new trajectory sampling methods; trajectory sampling is a challenging problem because of the high complexity of the data and the variety of user interests. We demonstrate the flexibility of our framework by showing how easily trajectory samples for different interests can be generated within it.
