Speech intelligibility can be affected by multiple factors, such as noisy environments, channel distortions, or physiological issues. In this work, we address the problem of automatically predicting the speech intelligibility level in the latter case. Building on our previous work, a non-intrusive system based on LSTM networks with an attention mechanism designed for this task, we present two main contributions. First, we propose the use of per-frame modulation spectrograms as input features, instead of compact representations derived from them that discard important temporal information. Second, we explore two strategies for combining per-frame acoustic log-mel and modulation spectrograms within the LSTM framework: fusion at the decision level (late fusion) and fusion at the utterance level (Weighted-Pooling, WP, fusion). The proposed models are evaluated on the UA-Speech database, which contains dysarthric speech with different degrees of severity. On the one hand, results show that attentional LSTM networks are able to adequately model the modulation spectrogram sequences, producing classification rates similar to those obtained with log-mel spectrograms. On the other hand, both combination strategies, late and WP fusion, outperform the single-feature systems, suggesting that per-frame log-mel and modulation spectrograms carry complementary information for the task of speech intelligibility prediction that can be effectively exploited by the LSTM-based architectures; the system combining the WP fusion strategy with Attention-Pooling achieves the best results.
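
The following is a minimal, illustrative PyTorch sketch of the two fusion strategies mentioned above, not the authors' exact implementation. It assumes details that the abstract does not specify: feature dimensionalities (40 log-mel bands, 60 modulation-spectrogram features), a single-layer LSTM of 128 units per branch, and three intelligibility classes; all class, function, and parameter names (AttentionPooling, Branch, WPFusion, late_fusion) are hypothetical.

```python
# Illustrative sketch only; dimensions, layer sizes, and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPooling(nn.Module):
    """Collapses a per-frame LSTM output sequence into a single
    utterance-level vector via learned attention weights."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, h):                        # h: (batch, frames, hidden)
        alpha = F.softmax(self.score(h), dim=1)  # attention weight per frame
        return (alpha * h).sum(dim=1)            # (batch, hidden)


class Branch(nn.Module):
    """One single-feature branch: LSTM over per-frame features + attention pooling."""
    def __init__(self, feat_dim, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.pool = AttentionPooling(hidden_dim)

    def forward(self, x):                        # x: (batch, frames, feat_dim)
        h, _ = self.lstm(x)
        return self.pool(h)                      # utterance-level embedding


class WPFusion(nn.Module):
    """Utterance-level (WP) fusion: concatenate the pooled embeddings of the
    log-mel and modulation-spectrogram branches before a joint classifier."""
    def __init__(self, mel_dim=40, mod_dim=60, hidden_dim=128, n_classes=3):
        super().__init__()
        self.mel_branch = Branch(mel_dim, hidden_dim)
        self.mod_branch = Branch(mod_dim, hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, n_classes)

    def forward(self, mel, mod):
        z = torch.cat([self.mel_branch(mel), self.mod_branch(mod)], dim=-1)
        return self.classifier(z)                # intelligibility-class logits


def late_fusion(logits_mel, logits_mod):
    """Decision-level (late) fusion: average the per-branch class posteriors."""
    return 0.5 * (F.softmax(logits_mel, dim=-1) + F.softmax(logits_mod, dim=-1))
```

In this sketch, WP fusion merges the two feature streams before the final classifier, so a single decision is made from a joint utterance-level embedding, whereas late fusion trains (or applies) one classifier per feature stream and only combines their posterior probabilities.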