Abstract

We propose a non-parametric method to cluster mixed data containing both continuous and discrete random variables. The product space of the continuous and discrete sample space is transformed into a new product space based on adaptive quantization on the continuous part. Detection of cluster patterns on the product space is determined locally by using a weighted modified chi-squared test. Our algorithm does not require any user input since the number of clusters is determined automatically by data. Simulation studies and real data analysis results show that our proposed method outperforms the benchmark method, AutoClass, in various settings.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call