This paper presents a digital filtering algorithm that clarifies dysphonic speech while preserving the speaker’s individuality. The study deals with the clarification of oesophageal speech and of the speech of patients with cerebral palsy, and the filtering ability is being evaluated by listening experiments. Over 20,000 patients currently suffer from laryngeal cancer in Japan, and the only treatment for the terminal symptoms requires removal of the larynx, including the vocal cords. The authors are developing a clarification filtering algorithm for oesophageal speech; the initial software clarification algorithm and its effectiveness were reported at the previous ICDVRAT. Several clarification algorithms have since been developed and implemented, and are being evaluated by questionnaires. The algorithms were also extended and applied to the clarification of the speech of patients with cerebral palsy.

The voice is the most important and effective medium, employed not only in daily communication but also in logical discussion. Only humans are able to use words as a means of verbal communication, although almost all animals have voices. Vocal sounds are generated by the coordinated operation of the vocal organs, such as the lungs, trachea, vocal cords, vocal tract, tongue and muscles. The airflow from the lungs causes the vocal cords to vibrate and generate a source sound; the resulting glottal wave is then led to the vocal tract, which works as a sound filter to form the spectral envelope of a particular voice. If any part of the vocal organs is injured or disabled, vocalization may be impaired and, in the worst case, the voice may be lost. Over 20,000 patients currently suffer from laryngeal cancer in Japan, and the only treatment for the terminal symptoms is to remove the larynx.
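The source-filter description above (a glottal source sound shaped by the vocal tract acting as a filter) can be illustrated with a minimal sketch. This is not the authors' clarification algorithm. It assumes an idealized impulse-train source and a cascade of two-pole resonators standing in for the vocal tract. All function names, the sample rate and the formant values are hypothetical choices made for illustration only.

```python
import math


def glottal_pulse_train(f0_hz, sample_rate_hz, n_samples):
    """Idealized source sound: one unit impulse per pitch period (an assumption)."""
    period = round(sample_rate_hz / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate_hz):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius
    theta = 2.0 * math.pi * centre_hz / sample_rate_hz      # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def vocal_tract(signal, formants, sample_rate_hz):
    """Cascade of resonators; each (centre, bandwidth) pair models one formant."""
    for centre_hz, bandwidth_hz in formants:
        signal = resonator(signal, centre_hz, bandwidth_hz, sample_rate_hz)
    return signal


def dft_magnitude(signal, freq_hz, sample_rate_hz):
    """Magnitude of a single DFT component at freq_hz."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(w * n) for n, x in enumerate(signal))
    im = -sum(x * math.sin(w * n) for n, x in enumerate(signal))
    return math.hypot(re, im)


SAMPLE_RATE_HZ = 8000
FORMANTS = [(700, 100), (1200, 120)]  # illustrative values, not from the paper

source = glottal_pulse_train(100, SAMPLE_RATE_HZ, 800)
voiced = vocal_tract(source, FORMANTS, SAMPLE_RATE_HZ)
```

In this sketch the source has equal energy at every harmonic of the 100 Hz pitch, and the spectral envelope of `voiced` comes entirely from the resonator cascade, which is the division of roles the paragraph describes.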
The removal of the vocal cords means the loss of the voice and causes various difficulties in communicating with other people, since the voice is essential for human verbal communication. There are two main ways to recover the voice. One is to use an artificial larynx, a hand-held device with a pulse generator that produces a vocal cord-like vibration. This device, an electrolarynx, has a vibrating plastic diaphragm that is placed against the neck during speech. The vibration of the diaphragm generates a source sound in the throat, and the speaker then articulates with the tongue, palate, throat and lips as in normal vocalization. The device has the advantage that it is used simply by holding it against the neck and is easy to master, but the sound quality is rather electronic and artificial. Furthermore, one hand is occupied holding the device during speech, which hinders gestural communication. The other way is to learn oesophageal speech, a method of speech production using the oesophagus (Sato, 1993; Max et al., 1996). In this method, air is inhaled and caught in the upper oesophagus instead of being swallowed, and the released air then makes the oesophagus vibrate to produce a “belch-like” sound that can be shaped into speech. Oesophageal speech is difficult to master (several years of practice are ordinarily required); however, the voice retains the speaker’s individuality, since the speech is generated by the speaker’s own vocal organs, although it has several distinctive characteristics. Moreover, the speaker can use the hands and facial expressions freely and actively to assist the speech.

Cerebral palsy (CP) is a condition caused by an injury to the parts of the brain that control the ability to use the muscles and body. The injury may happen before birth, sometimes during delivery, or soon after