DOI: https://doi.org/10.1109/icassp.2018.8461602 | Publication Date: Apr 1, 2018
Simply feeding the last hidden layer of a deep neural network (DNN) back to the input layer was recently found to be effective for noise-robust acoustic modeling. This high-level feature strengthens the robustness of the DNN-based acoustic model, but at roughly twice the computational cost. In this paper, we propose to feed such high-level features iteratively back to lower layers, which we refer to as a multi-scale feedback connection. Specifically, we first extract the high-level feature at the last hidden layer of the DNN. This high-level feature is then fed back to lower-scale features, which generate a subsequent prediction as well as a subsequent high-level feature. That subsequent high-level feature is in turn fed down to lower layers. We evaluated the proposed approach on both TIMIT and a large-scale internal dataset comprising voice search and far-field data. Our findings are twofold. First, at equivalent computational cost, the multi-scale feedback connection outperforms the plain DNN, the DNN with skip connections, and the DNN with a feedback connection, with larger improvements on the far-field dataset. Second, pair-layer-wise pretraining helps the proposed approach converge better.
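A minimal PyTorch sketch of one plausible reading of the multi-scale feedback connection described above. All layer sizes, the ReLU activation, the number of passes, and the schedule that injects the fed-back feature one layer lower on each pass are illustrative assumptions, not the paper's exact configuration:

import torch
import torch.nn as nn

class MultiScaleFeedbackDNN(nn.Module):
    def __init__(self, input_dim=440, hidden_dim=512, num_layers=5, num_classes=1000):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            in_dim = input_dim if i == 0 else hidden_dim
            # Every layer reserves room for a concatenated fed-back
            # high-level feature (zeros when no feedback is injected there).
            self.layers.append(nn.Linear(in_dim + hidden_dim, hidden_dim))
        self.output = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, num_passes=3):
        h_top = x.new_zeros(x.size(0), self.hidden_dim)
        logits = None
        # Pass 0 is a plain forward pass; each later pass injects the current
        # high-level feature one layer lower than the pass before, so the
        # feedback walks down the network scale by scale (assumed schedule).
        for t in range(num_passes):
            inject_at = len(self.layers) - t
            h = x
            for i, layer in enumerate(self.layers):
                fb = h_top if i == inject_at else torch.zeros_like(h_top)
                h = torch.relu(layer(torch.cat([h, fb], dim=-1)))
            h_top = h                # subsequent high-level feature
            logits = self.output(h)  # subsequent prediction
        return logits

For example, model = MultiScaleFeedbackDNN(); model(torch.randn(8, 440)) returns logits of shape (8, 1000) for a batch of eight spliced feature frames. Note that each additional pass re-runs the full stack, so compute scales with num_passes, which is why the abstract compares methods at equivalent computational cost.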