Abstract

Generative Stochastic Networks (GSNs) for supervised tasks generalize denoising autoencoders by fixing the deepest layer to the output variables (e.g., the class label) and estimating the input–output joint distribution as the stationary distribution of a Markov chain whose transition operator the network learns. Because GSNs are multi-layer architectures with stochastic neurons, their performance depends strongly on the chosen architecture and on network training. To improve this performance, we introduce supervised kernel-based learning within the GSN framework. First, the considered network model induces a temporal model that acts as a data filter, extracting refined data representations. Then, we fix the hidden-layer size using a conventional exhaustive search strategy. Lastly, we propose a novel supervised layer-wise pre-training that initializes the GSN fine-tuning stage with more discriminative projection matrices, easing the optimization of the non-convex cost function. The initial matrices are computed by maximizing the centered kernel alignment (CKA) metric, which measures the affinity between the projected samples and the labels. We compare the performance of our proposal against random, autoencoder-based, and principal component analysis initializations. As a result, the CKA-based pre-training captures the complex dependencies between parameters, speeds up convergence during learning, and unravels the data distribution to favor class discrimination on five widely used image collections for object recognition tasks.
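For reference, the sketch below shows a minimal implementation of the CKA criterion described above, assuming linear kernels on the projected samples and one-hot label encodings; the variable names (`X`, `Y`, `W`) are illustrative placeholders, not the paper's notation, and the optimization over the projection matrix is omitted.

```python
import numpy as np

def center(K):
    """Center a kernel matrix: Kc = H K H, with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def cka(K, L):
    """Centered kernel alignment between two kernel matrices,
    <Kc, Lc>_F / (||Kc||_F ||Lc||_F), in [0, 1] for PSD kernels."""
    Kc, Lc = center(K), center(L)
    return np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc))

# Toy example (hypothetical shapes): linear kernel of projected samples
# X @ W versus the label kernel built from one-hot labels Y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))           # input samples
Y = np.eye(3)[rng.integers(0, 3, 100)]   # one-hot class labels
W = rng.normal(size=(20, 5))             # candidate projection matrix
K = (X @ W) @ (X @ W).T                  # kernel over projected samples
L = Y @ Y.T                              # label affinity kernel
print(cka(K, L))  # pre-training would maximize this value w.r.t. W
```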
