Abstract

A machine for neural computation of acoustical patterns for use in real-time speech recognition, comprising a plurality of analog electronic neurons connected for the analysis and recognition of acoustical patterns, including speech. Input to the neural net is provided by a set of bandpass filters that separate the input acoustical patterns into frequency ranges. The neural net itself is organized into two parts: the first performs the real-time decomposition of the input patterns into their primitives of energy, space (frequency) and time relations, and the second decodes the resulting set of primitives into known phonemes and diphones. During operation, the outputs of the individual bandpass filters are rectified and fed to sets of neurons in an opponent center-surround organization of synaptic connections ("on center" and "off center"). These units compute maxima and minima of energy at different frequencies. The next sets of neurons compute the temporal boundaries ("on" and "off"), while the following sets of neurons compute the movement of the energy maxima (formants) up or down the frequency axis. Then, in order to recognize speech sounds at the phoneme or diphone level, the set of primitives belonging to the phoneme is decoded such that only one neuron, or a non-overlapping group of neurons, fires when a particular sound pattern is present at the input. The output from these neurons is then fed to an Erasable Programmable Read Only Memory (EPROM) decoder and computer for displaying, in real time, a phonetic representation of the speech input.
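
The patent describes this pipeline in analog hardware. As a rough aid to reading the abstract, the following is a minimal software sketch of the front-end stages it names: a bandpass filter bank with rectification, an on-center/off-center opponent stage that emphasizes spectral energy maxima, and simple temporal "on"/"off" (onset/offset) detectors. The filter edges, surround width, window length, and threshold below are illustrative assumptions, not values taken from the patent, and the phoneme/diphone decoding and EPROM stages are omitted.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def filter_bank(signal, fs, bands):
    """Split the input into rectified energy envelopes, one per frequency band."""
    channels = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, signal)
        channels.append(np.abs(band))          # rectification
    return np.stack(channels)                  # shape: (n_bands, n_samples)

def center_surround(energy, surround=2):
    """On-center/off-center opponent organization across the frequency axis:
    each channel is excited by its own energy and inhibited by its neighbors,
    so local spectral maxima (formant-like peaks) stand out."""
    n_bands, _ = energy.shape
    out = np.zeros_like(energy)
    for i in range(n_bands):
        lo, hi = max(0, i - surround), min(n_bands, i + surround + 1)
        neighbors = [j for j in range(lo, hi) if j != i]
        out[i] = np.maximum(energy[i] - energy[neighbors].mean(axis=0), 0.0)
    return out

def on_off_units(channel, fs, window_ms=20.0, threshold=1e-3):
    """Temporal boundary ('on' and 'off') detection: compare smoothed energy in
    adjacent samples and flag rises (onsets) and falls (offsets)."""
    win = max(1, int(fs * window_ms / 1000))
    smooth = np.convolve(channel, np.ones(win) / win, mode="same")
    diff = np.diff(smooth, prepend=smooth[0])
    onsets = diff > threshold
    offsets = diff < -threshold
    return onsets, offsets

if __name__ == "__main__":
    fs = 16000
    t = np.arange(0, 0.5, 1 / fs)
    # A toy input: a 700 Hz tone that switches on after 0.1 s.
    x = np.where(t > 0.1, np.sin(2 * np.pi * 700 * t), 0.0)
    bands = [(100, 300), (300, 600), (600, 1200), (1200, 2400), (2400, 4800)]
    energy = filter_bank(x, fs, bands)
    peaks = center_surround(energy)
    on, off = on_off_units(peaks[2], fs)       # band containing 700 Hz
    print("first onset near t =", t[np.argmax(on)], "s")
```

In the patented machine these operations are carried out continuously by analog neurons; the digital sketch above only mirrors their functional roles so the sequence of primitives (energy maxima, temporal boundaries, formant movement) is easier to follow.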
