Abstract

Acoustic-to-articulatory inversion has potential applications in a number of fields. For decades, the average root mean square error and the Pearson correlation coefficient have been the most prevalent measures for evaluating the performance of acoustic-to-articulatory inversion. Various inversion methods have been developed to reduce the average root mean square error, but very few studies have explored whether it is an appropriate measure for evaluating and comparing different inversion methods. In this study, we address this issue by comparing not only the average root mean square error but also the root mean square error of each articulatory channel and the root mean square error of the critical and non-critical portions of each articulatory channel, for methods within the same group and across different groups. We find that: i) when the inversion methods belong to the same group, the root mean square error of each articulatory channel, and of the critical and non-critical portions of each channel, decreases as the average root mean square error decreases; ii) exceptions arise when the inversion methods belong to different groups; iii) the average root mean square error is dominated by that of the non-critical portions of the articulatory channels. This suggests that new methods that pay more attention to inversion performance on critical articulators, and that facilitate performance comparison between methods belonging to different groups, should be developed in the future.
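
The three levels of error compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-frame boolean `critical` mask, and the choice to define the average as the mean of per-channel errors are all assumptions made for illustration.

```python
import math


def rmse(pred, true):
    """Root mean square error between two equal-length sequences."""
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))


def channel_report(pred, true, critical):
    """RMSE of one articulatory channel over all, critical and non-critical frames.

    `critical` is a hypothetical per-frame boolean mask marking the frames in
    which this channel is critical; how such a mask is obtained is not
    specified in the abstract.
    """
    def part(keep):
        pairs = [(p, t) for p, t, c in zip(pred, true, critical) if c == keep]
        # An empty portion has no defined error, so report NaN.
        return rmse(*zip(*pairs)) if pairs else float("nan")

    return {
        "all": rmse(pred, true),
        "critical": part(True),
        "non_critical": part(False),
    }


def average_rmse(preds, trues):
    """Mean of the per-channel RMSEs (one common convention, assumed here)."""
    errors = [rmse(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)


# Toy data: one channel, four frames, only the first frame is critical.
report = channel_report(
    pred=[2.0, 0.0, 0.0, 0.0],
    true=[0.0, 0.0, 0.0, 0.0],
    critical=[True, False, False, False],
)
```

In this toy case the channel error over all frames is 1.0 although the error on the single critical frame is 2.0. When critical frames are few, the whole-channel figure mostly reflects the non-critical frames, which is the kind of effect described in finding iii.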
