A recent study [Hogden et al., J. Acoust. Soc. Am. 91, 2443 (A) (1992)] showed that articulatory speech synthesizer parameters describing tongue positions can be recovered from synthesized acoustic signals using continuity mapping. Those results are extended here to two neural network implementations of continuity mapping. Both networks are tested on a two-dimensional spatial analog of the articulator tracking problem. The networks' inputs (analogous to acoustic signals in the previous study) are the distances between a randomly moving robot (analogous to the tongue) and its nearest obstacles. The networks' outputs encode the robot's position. The first implementation uses a frequency-sensitive competitive learning network (FSCLN) to vector quantize the inputs, followed by a linear layer with one output unit for each degree of freedom of the robot. At each time t, the delta rule is used to teach the network to produce an output, O(t), that is similar to O(t−1). The second implementation follows the FSCLN with a Kohonen-like neural layer. The relationship between these simulations and the acoustic-to-articulatory mapping problem in speech will be discussed.
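As a rough illustration of the first implementation, the sketch below combines frequency-sensitive competitive learning (vector quantization in which frequently winning codes are penalized) with a linear output layer trained by the delta rule so that the current output O(t) is pulled toward the previous output O(t−1). This is only a minimal sketch under assumed details: the codebook size, learning rates, number of input distances, and the placeholder input sequence are illustrative and are not taken from the original study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 8     # number of robot-to-obstacle distances per time step (assumed)
n_codes = 64     # FSCLN codebook size (assumed)
n_outputs = 2    # one output unit per degree of freedom of the robot

codebook = rng.normal(size=(n_codes, n_inputs))
counts = np.ones(n_codes)                         # usage counts for frequency sensitivity
W = rng.normal(scale=0.01, size=(n_outputs, n_codes))
eta_vq, eta_out = 0.05, 0.01                      # illustrative learning rates

def fscl_step(x):
    """Frequency-sensitive competitive learning: the winner minimizes
    distance scaled by its usage count, so rarely used codes win more
    often; the winning code vector moves toward the input."""
    d = np.linalg.norm(codebook - x, axis=1) * counts
    k = int(np.argmin(d))
    codebook[k] += eta_vq * (x - codebook[k])
    counts[k] += 1
    return k

def continuity_step(x_t, prev_output):
    """Quantize the input, then apply the delta rule so that the output
    O(t) is trained to resemble the previous output O(t-1)."""
    k = fscl_step(x_t)
    h = np.zeros(n_codes)
    h[k] = 1.0                                    # one-hot code activation
    o_t = W @ h
    if prev_output is not None:
        W[:, k] += eta_out * (prev_output - o_t)  # pull O(t) toward O(t-1)
    return o_t

# Usage: feed a sequence of distance vectors from the moving robot.
prev = None
for t in range(1000):
    x = rng.normal(size=n_inputs)                 # placeholder input sequence
    prev = continuity_step(x, prev)
```

In this sketch the continuity constraint alone supplies the training signal; the robot's true position is never shown to the network, mirroring the unsupervised character of continuity mapping described in the abstract.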