Abstract
Conventional mini-batch gradient descent algorithms typically rely only on local batch-level distribution information, which leads to a ``zig-zag'' effect in the learning process. To capture the correlation between the batch-level distribution and the global data distribution, we propose a novel learning scheme called epoch-evolving Gaussian process guided learning (GPGL), which encodes global data distribution information in a non-parametric way. Built upon a set of class-aware anchor samples, our GP model estimates the class distribution for each sample in a mini-batch through label propagation from the anchor samples to the batch samples. This class distribution, also named the context label, complements the ground-truth one-hot label. The class distribution structure is smooth and usually carries a rich body of contextual information, which is capable of speeding up the convergence process. Guided by both the context label and the ground-truth label, the GPGL scheme optimizes more efficiently by updating the model parameters with a triangle consistency loss. Furthermore, our GPGL scheme generalizes naturally to current deep models, outperforming state-of-the-art optimization methods on six benchmark datasets.
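To make the scheme concrete, below is a minimal sketch of the two core steps the abstract describes: GP-based label propagation from anchor samples to batch samples, and a loss that combines the ground-truth label with the resulting context label. The RBF kernel, the softmax normalization of the GP posterior mean, the KL-based consistency term, and all hyperparameters (`lengthscale`, `noise`, `alpha`) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def context_labels(batch_feats, anchor_feats, anchor_onehot,
                   lengthscale=1.0, noise=1e-2):
    """Estimate a class distribution ("context label") for each batch sample
    via GP-style label propagation from class-aware anchor samples.

    batch_feats:   (B, D) features of the current mini-batch
    anchor_feats:  (A, D) features of the anchor set
    anchor_onehot: (A, C) one-hot labels of the anchors
    """
    def rbf(x, y):
        # Assumed RBF covariance between two sets of feature vectors.
        d2 = torch.cdist(x, y) ** 2
        return torch.exp(-d2 / (2 * lengthscale ** 2))

    K_aa = rbf(anchor_feats, anchor_feats)   # (A, A) anchor covariance
    K_ba = rbf(batch_feats, anchor_feats)    # (B, A) batch-anchor cross covariance
    A = anchor_feats.size(0)
    eye = torch.eye(A, device=anchor_feats.device, dtype=anchor_feats.dtype)
    # GP posterior mean: K_ba (K_aa + noise * I)^{-1} Y
    mean = K_ba @ torch.linalg.solve(K_aa + noise * eye, anchor_onehot)
    # Normalize the propagated labels into a class distribution (assumption).
    return F.softmax(mean, dim=1)

def triangle_consistency_loss(logits, target, context, alpha=0.5):
    """Combine the ground-truth one-hot loss with a consistency term
    pulling the prediction toward the context label (sketched form)."""
    ce = F.cross_entropy(logits, target)
    kl = F.kl_div(F.log_softmax(logits, dim=1), context.detach(),
                  reduction="batchmean")
    return ce + alpha * kl
```

In this sketch, the context labels would be recomputed as the anchor features evolve over epochs, which is how the epoch-evolving aspect of GPGL could be realized.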