Abstract

The proliferation and affordability of sensing systems, smart networks, and data storage and transmission technologies mean that water utility companies can now collect larger datasets than ever before. This information revolution opens up hitherto unseen possibilities for data mining. Existing (nonparametric) non-linear visualization techniques can struggle with real-world, high-dimensional data because they are generally not capable of retaining both the local and the global structure of the data in a single map. t-SNE is a technique for human-intuitive (two-dimensional) visualization of high-dimensional data. It converts similarities between data points to joint probabilities and then minimizes the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and those of the original high-dimensional data. Consequently, clusters that are separate in the high-dimensional space, and that PCA could merge, can remain separate in the low-dimensional space with t-SNE. The parametric version of t-SNE uses deep neural networks. Results are provided for artificial data and for a smart water meter dataset.
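
The objective described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes a single shared Gaussian bandwidth `sigma` in place of the per-point, perplexity-tuned bandwidths of standard t-SNE, it performs no optimization, and the function names and toy points are hypothetical. It only evaluates the Kullback-Leibler divergence for two candidate two-dimensional embeddings of the same data.

```python
import math


def _squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _normalize(weights):
    # Divide by the grand total so that all pairwise entries sum to 1.
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def gaussian_joint(points, sigma=1.0):
    """High-dimensional joint probabilities from a Gaussian kernel.

    Assumption: one shared bandwidth `sigma`; standard t-SNE instead tunes
    a bandwidth per point to match a target perplexity.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = _squared_distance(points[i], points[j])
                weights[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return _normalize(weights)


def student_t_joint(embedding):
    """Low-dimensional joint probabilities from a Student-t kernel (1 d.o.f.)."""
    n = len(embedding)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = _squared_distance(embedding[i], embedding[j])
                weights[i][j] = 1.0 / (1.0 + d2)
    return _normalize(weights)


def kl_divergence(p, q):
    """KL(P || Q) over all ordered pairs; zero-probability terms contribute 0."""
    n = len(p)
    return sum(
        p[i][j] * math.log(p[i][j] / q[i][j])
        for i in range(n)
        for j in range(n)
        if p[i][j] > 0.0
    )


# Hypothetical toy data: two well-separated clusters in three dimensions.
high_dim = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
# One embedding keeps the clusters apart; the other pairs points across clusters.
separated = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
mixed = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (5.1, 5.0)]

p_joint = gaussian_joint(high_dim)
kl_separated = kl_divergence(p_joint, student_t_joint(separated))
kl_mixed = kl_divergence(p_joint, student_t_joint(mixed))
print(kl_separated < kl_mixed)
```

In this sketch the embedding that keeps the two clusters apart scores a lower divergence than the one that mixes them, which is the property that minimizing the objective exploits.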
