Abstract

In practice, multi-dimensional data often do not follow a single type of distribution as a whole; instead, different regions of the data space clearly exhibit different types of distribution. To address this problem, the authors proposed COCA-Hist, a new kind of hybrid multi-dimensional histogram based on hybrid data distributions. Under a given space budget, the method built COCA-Hist from different kinds of buckets, chosen according to the distribution characteristics of the different regions of the data space. The aim was to improve the estimation accuracy of multi-dimensional histograms in general. Because COCA-Hist had to scan the tree structure of the histogram under construction twice, once to identify the different data regions and once to allocate the space budget among them, it was slightly less efficient to build. However, the improvement in both generality and estimation accuracy made the extra time cost worthwhile.
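
The general idea of mixing bucket types under a space budget can be illustrated with a small sketch. This is not the COCA-Hist algorithm, whose bucket types, tree structure, and allocation rule are not given in the abstract. Everything below is an assumption made for illustration: the regions are supplied by hand, a "uniform" bucket stores one counter, a "grid" bucket stores k x k counters, skew is measured as the largest deviation from the mean cell count, and the budget is allocated greedily to the most skewed regions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi), half-open


def overlap_fraction(cell: Box, query: Box) -> float:
    """Fraction of the cell's area that the query rectangle covers."""
    w = max(0.0, min(cell[2], query[2]) - max(cell[0], query[0]))
    h = max(0.0, min(cell[3], query[3]) - max(cell[1], query[1]))
    area = (cell[2] - cell[0]) * (cell[3] - cell[1])
    return (w * h) / area if area > 0 else 0.0


@dataclass
class Bucket:
    box: Box
    k: int             # k x k sub-cells; k == 1 is a plain uniform bucket
    counts: List[int]  # one counter per sub-cell, length k * k

    def estimate(self, query: Box) -> float:
        """Estimate the number of points in the query, assuming each
        sub-cell is uniformly populated."""
        x0, y0, x1, y1 = self.box
        dx, dy = (x1 - x0) / self.k, (y1 - y0) / self.k
        total = 0.0
        for i in range(self.k):
            for j in range(self.k):
                cell = (x0 + i * dx, y0 + j * dy,
                        x0 + (i + 1) * dx, y0 + (j + 1) * dy)
                total += self.counts[i * self.k + j] * overlap_fraction(cell, query)
        return total


def build_bucket(points: List[Point], box: Box, k: int) -> Bucket:
    """Count the points that fall in each of the k x k sub-cells of box."""
    x0, y0, x1, y1 = box
    counts = [0] * (k * k)
    for x, y in points:
        if x0 <= x < x1 and y0 <= y < y1:
            i = min(int((x - x0) / (x1 - x0) * k), k - 1)
            j = min(int((y - y0) / (y1 - y0) * k), k - 1)
            counts[i * k + j] += 1
    return Bucket(box, k, counts)


def skew(bucket: Bucket) -> float:
    """Largest deviation of a sub-cell count from the mean (assumed measure)."""
    mean = sum(bucket.counts) / len(bucket.counts)
    return max(abs(c - mean) for c in bucket.counts)


def build_hybrid(points: List[Point], regions: List[Box],
                 budget: int, k: int = 2) -> List[Bucket]:
    """Pass 1 probes each region for skew; pass 2 spends the counter
    budget on grid buckets for the most skewed regions first."""
    if budget < len(regions):
        raise ValueError("budget must cover one counter per region")
    probes = [build_bucket(points, r, k) for r in regions]   # pass 1
    skews = [skew(p) for p in probes]
    remaining = budget - len(regions)
    extra = k * k - 1  # additional counters a grid bucket needs
    upgraded = set()
    for i in sorted(range(len(regions)), key=lambda i: skews[i], reverse=True):
        if skews[i] > 0 and remaining >= extra:              # pass 2
            upgraded.add(i)
            remaining -= extra
    return [probes[i] if i in upgraded else build_bucket(points, regions[i], 1)
            for i in range(len(regions))]


def estimate(buckets: List[Bucket], query: Box) -> float:
    """Sum the per-bucket estimates for a range query."""
    return sum(b.estimate(query) for b in buckets)
```

In this sketch, a region whose points are spread evenly keeps a single-counter bucket, while a region whose points are clustered receives the finer grid bucket if the budget allows, so the available counters go where the uniformity assumption would otherwise cause the largest estimation error.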

