Abstract

Four hundred analog electronic neurons have been assembled and connected for the analysis and recognition of acoustical patterns, including speech. Input to the net comes from a set of 18 band-pass filters (Qmax 300 dB/octave; 180 to 6000 Hz, log scale). The net is organized into two parts: the first decomposes the input patterns, in real time, into their primitives of energy, space (frequency), and time relations; the second decodes the set of primitives.

Pattern decomposition uses 216 neurons. The output of each filter is rectified and fed to two sets of 18 neurons in an opponent center-surround organization of synaptic connections (“on center” and “off center”). These units compute maxima and minima of energy at different frequencies. The next two sets of neurons compute the temporal boundaries (“on” and “off”), and the following two compute the movement of the energy maxima (formants) up or down the frequency axis. In addition, there are “hyperacuity” units, which expand the frequency resolution to 36; other units tuned to a particular range of duration of the “on center” units; and others tuned exclusively to very low energy sounds.

To recognize speech sounds at the phoneme or diphone level, the set of primitives belonging to the phoneme is decoded such that only one neuron, or a non-overlapping group of neurons, fires when the sound pattern is present at the input. For display and translation into phonetic symbols, the output from these neurons is fed into an EPROM decoder and a computer, which displays a phonetic representation of the speech input in real time.
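The opponent center-surround stage can be illustrated with a minimal software sketch. This is not the analog circuit described in the abstract: the function name `center_surround`, the nearest-neighbor surround, the `surround_weight` value, and the edge handling are all assumptions chosen for illustration. Each "on center" unit is excited by its own band and inhibited by the two adjacent bands, so it responds at a local energy maximum; each "off center" unit has the opposite sign and responds at a local minimum.

```python
def center_surround(energies, surround_weight=0.5):
    """Return (on_center, off_center) responses for rectified band energies.

    Assumptions (not from the abstract): the surround is the two adjacent
    bands, each weighted by surround_weight, and edge bands reuse their own
    value for the missing neighbor.
    """
    n = len(energies)
    on_center, off_center = [], []
    for i in range(n):
        left = energies[i - 1] if i > 0 else energies[i]
        right = energies[i + 1] if i < n - 1 else energies[i]
        # Center excitation minus surround inhibition.
        drive = energies[i] - surround_weight * (left + right)
        # Half-wave rectify: a unit cannot fire at a negative rate.
        on_center.append(max(0.0, drive))
        off_center.append(max(0.0, -drive))
    return on_center, off_center


# Hypothetical input: 18 rectified filter outputs with one energy peak.
bands = [0.0] * 18
bands[5] = 1.0
on, off = center_surround(bands)
print("on-center peak at band", on.index(max(on)))
```

With these assumed weights a flat spectrum produces no response in either set of units, so only spectral peaks and troughs are passed on to later stages.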

