Abstract

There is an increasing demand for similarity searches in a multidimensional non-ordered discrete data space (NDDS) from application areas such as bioinformatics and data mining. The non-ordered and discrete nature of an NDDS raises new challenges for developing efficient indexing methods for similarity searches. In this article, we propose a new indexing technique, called the NSP-tree , to support efficient similarity searches in an NDDS. As we know, overlap causes a performance degradation for indexing methods (e.g., the R-tree) for a continuous data space. In an NDDS, this problem is even worse due to the limited number of elements available on each dimension of an NDDS. The key idea of the NSP-tree is to use a novel discrete space-partitioning (SP) scheme to ensure no overlap at each level in the tree. A number of heuristics and strategies are incorporated into the tree construction algorithms to deal with the challenges for developing an SP-based index tree for an NDDS. Our experiments demonstrate that the NSP-tree is quite promising in supporting efficient similarity searches in NDDSs. We have compared the NSP-tree with the ND-tree, a data-partitioning-based indexing technique for NDDSs that was proposed recently, and the linear scan using different NDDSs. It was found that the search performance of the NSP-tree was better than those of both methods.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.