Abstract

Many recent text recognition studies have achieved strong performance by applying a sequential-label prediction framework such as connectionist temporal classification. Meanwhile, regularization is known to be essential for avoiding overfitting when training deep neural networks, and regularization techniques that allow for semi-supervised learning have a greater impact than those that do not. Among widely researched single-label regularization techniques, virtual adversarial training (VAT) performs well by smoothing posterior distributions around training data points. However, VAT has been applied almost exclusively to single-label prediction tasks, not to sequential-label prediction tasks. This is because the number of candidate label sequences increases exponentially with the sequence length, making it impractical to calculate the posterior distributions and the divergence between them. Investigating this problem, we found that this divergence has an easily computable upper bound. We propose fast distributional smoothing (FDS), a method that drastically reduces computational costs by minimizing this upper bound. FDS allows regularization at practical computational costs in both supervised and semi-supervised learning. An experiment under simple settings confirms that minimizing the upper bound also decreases the divergence. Experiments also show that FDS improves scene text recognition performance and enhances the performance of state-of-the-art regularization. Furthermore, experiments show that FDS enables efficient semi-supervised learning in sequential-label prediction tasks and that it outperforms a conventional semi-supervised method.
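To make the cost argument concrete, the following toy sketch enumerates every alignment path of a small connectionist-temporal-classification-style model, computes the exact divergence between two label-sequence posteriors, and compares it with a cheap bound. The abstract does not state the form of the upper bound, so the bound used here (the sum of frame-wise KL divergences, which dominates the sequence-level KL by the data-processing inequality) is an assumption chosen for illustration and may differ from the one FDS minimizes. All function names, the blank index, and the toy sizes `T` and `K` are likewise hypothetical.

```python
import itertools
import math
import random

BLANK = 0  # index of the blank symbol (assumed convention)


def collapse(path):
    """CTC-style collapse: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def random_frame_posteriors(T, K, rng):
    """Return T per-frame categorical distributions over K symbols."""
    rows = []
    for _ in range(T):
        w = [rng.random() + 0.05 for _ in range(K)]
        z = sum(w)
        rows.append([x / z for x in w])
    return rows


def label_sequence_distribution(frames):
    """Exact label-sequence posterior, obtained by enumerating all K**T paths."""
    K = len(frames[0])
    dist = {}
    for path in itertools.product(range(K), repeat=len(frames)):
        p = math.prod(frames[t][s] for t, s in enumerate(path))
        key = collapse(path)
        dist[key] = dist.get(key, 0.0) + p
    return dist


def kl(p, q):
    """KL divergence between two categorical distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def exact_sequence_kl(frames_p, frames_q):
    """KL between label-sequence posteriors; cost grows as K**T."""
    dp = label_sequence_distribution(frames_p)
    dq = label_sequence_distribution(frames_q)
    return sum(v * math.log(v / dq[k]) for k, v in dp.items() if v > 0)


def framewise_kl_bound(frames_p, frames_q):
    """Sum of per-frame KLs; cost grows as T * K (assumed form of the bound)."""
    return sum(kl(p, q) for p, q in zip(frames_p, frames_q))


rng = random.Random(0)
T, K = 6, 4  # toy sizes; realistic sequences make K**T intractable
frames_p = random_frame_posteriors(T, K, rng)
frames_q = random_frame_posteriors(T, K, rng)
exact = exact_sequence_kl(frames_p, frames_q)
bound = framewise_kl_bound(frames_p, frames_q)
print(f"paths enumerated: {K ** T}")
print(f"exact sequence-level KL: {exact:.4f}")
print(f"frame-wise upper bound:  {bound:.4f}")
```

Even at `T = 6` and `K = 4` the exact computation visits 4096 paths, whereas the frame-wise sum touches only 24 probabilities per distribution. In a VAT-style setup, `frames_q` would presumably be the posteriors for a perturbed input rather than random values.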
