Abstract

Several interrelated strands of research in linguistics, acoustic phonetics, and cognitive neuroscience suggest a host of new directions for the development of end-to-end computational models of speech perception and recognition. Natural candidates for exploration include (i) phonological representations in terms of distinctive features; (ii) nonlinear detectors for distinctive feature landmarks (or any other set of perceptually salient acoustic events), which define a sparse point process representation of the speech signal; (iii) syllable-metered temporal processing and/or syllable-sized integration windows; and (iv) point process models and hierarchical strategies for recognizing words, syllables, phonemes, and features. A computational framework built around these ideas has been developed and has led to phonetic recognition and keyword spotting performance that is competitive with equivalent hidden Markov model-based systems. The framework thus provides a computational platform for benchmarking competing scientific theories while simultaneously advancing toward a viable technological solution to the speech recognition problem.
