Abstract
Steepest descent algorithms, which are commonly used in deep learning, use the gradient as the descent direction, either as-is or after a direction shift by preconditioning. In many scenarios, calculating the gradient is numerically hard because the cost function is complex or non-differentiable, particularly near singular points. This difficulty has commonly been overcome by increasing DNN model size and complexity. In this work we propose a novel mechanism, which we refer to as Cost Unrolling, for improving the ability of a given DNN model to solve a complex cost function without modifying its architecture or increasing computational complexity. We focus on the derivation of the Total Variation (TV) smoothness constraint commonly used in unsupervised cost functions. We introduce an iterative, differentiable alternative to the TV smoothness constraint, which we show produces more stable gradients during training, enables faster convergence, and improves the predictions of a given DNN model. We test our method on several tasks, including image denoising and unsupervised optical flow. Replacing the TV smoothness constraint with our loss during DNN training, we report improved results in all tested scenarios. Specifically, our method improves the flows predicted in occluded regions, a crucial task in itself, resulting in sharper motion boundaries.
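To make the difficulty with the TV term concrete, the following is a minimal sketch of a standard anisotropic TV penalty on a 2-D grid, together with the sign-based subgradient of the absolute value. It is not the Cost Unrolling method proposed in this work, only an illustration of the baseline constraint it replaces. The function names `tv_loss` and `abs_subgradient` and the two test grids are hypothetical and chosen for illustration.

```python
def tv_loss(u):
    """Anisotropic TV of a 2-D grid (list of rows): sum of absolute forward differences."""
    rows, cols = len(u), len(u[0])
    horiz = sum(abs(u[i][j + 1] - u[i][j]) for i in range(rows) for j in range(cols - 1))
    vert = sum(abs(u[i + 1][j] - u[i][j]) for i in range(rows - 1) for j in range(cols))
    return horiz + vert


def abs_subgradient(d):
    """One valid subgradient of |d| with respect to d: the sign of d (0 chosen at d == 0)."""
    return (d > 0) - (d < 0)


# A constant grid has no variation; a grid with one vertical edge is penalized per row.
flat = [[0.0] * 4 for _ in range(4)]
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

print(tv_loss(flat), tv_loss(step))
# The subgradient flips between -1 and +1 for arbitrarily small differences around zero.
print(abs_subgradient(-1e-12), abs_subgradient(1e-12))
```

Because the subgradient of the absolute value jumps at zero, nearly smooth regions can receive gradient signals of full magnitude and alternating sign, which is one plausible reading of the instability near singular points described above.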
More From: IEEE transactions on pattern analysis and machine intelligence