Critical Issues in the Development of Speech Technology for Man-Machine Integration

S Joy Mountford, Wayne A Lea

doi:10.1177/154193128202600302

Abstract

Speech technology, both recognition and generation, has grown rapidly in the last few years, with new commercial and industrial products appearing almost overnight. Speech technology offers the potential of a natural, efficient, and hands-free communication medium for humans to interact with computers. This exciting new man-machine interface concept is not merely looking for a new home; it is in desperate need of good human factors research. The capabilities demonstrated by human and machine are very different. This means that the integration of a new technology should enhance, in particular, those human tasks that are difficult and fatiguing. In other words, the primary consideration should be designing for the human's needs and capabilities. The machine and the interface can be redesigned and improved with time and scientific progress. Speech technology as a new interface medium offers another opportunity for the roles of human and machine to be complementary. Speech technology requires an expansion of the traditional concept of the visual-manual interface. A verbal capability permits new dialogue formats and information access and should not be viewed as a mere one-for-one substitute for visual or manual operations. The research reviewed will include performance studies in time-shared environments, which illustrate that voice recognition can be a preferred method of input over a keyboard. The speech generation studies reviewed indicate the utility of remote location spoken messages for alerting functions. However, there appears to be a lack of research effort in exploiting the interplay of the two speech technologies as a natural conversational dialogue interface. Some key basic research studies are sorely missing, and their suggested format will be described.
This presentation attempts to summarize the kinds of research issues that have been addressed in the application of speech technology to the human-machine interface, especially within the context of military and industrial environments. Merely determining technically that a particular task can be performed using the speech medium does not imply that this same task should be implemented using speech technology. Such features as user utility, role within the whole work station, type of payoff, and additional environmental factors need to be simultaneously considered and weighted accordingly for each particular application. Some methodological approaches that have been developed to aid in these determinations will be discussed. Guidelines for both human factors considerations and methodology development will be described to include the following:
a. Criteria for assessing task utility
b. Vocabulary selection using confusion matrices
c. Structure of task-oriented grammar
d. Structure of dialogue tasks
e. Potential linguistic-semantic enhancements
f. Recommended performance evaluation tests
g. Operator training recommendations
h. Impact of environmental constraints
i. Problems encountered in restricted communication modes
j. Potential of multi-modal communications
k. Enhancements to hardware and software configurations
In addition to these guidelines for speech technology implementations, some specific considerations that apply to speech recognition and speech generation individually will be described. For example, the impact that connected or continuous speech recognition may have on man-machine interfaces, and the use of flexible word order entry, need to be considered. There are also some concerns about the additional auditory memory loadings that may be placed on operators using lengthy speech generation feedback messages.
Furthermore, implementing speech generation requires the development of complex prioritization and inhibition logic trees to prevent the simultaneous receipt of two messages with vastly different user impact. Guidelines and recommendations for speech technology users and researchers alike will be discussed. The current gap in the development of speech technology as a successful and useful input/output mode centers on the need for good human factors research. This discussion portrays what is known about the strengths and limitations of speech technology. In doing so, it illustrates the kinds of human factors issues and cautions that have to be addressed before speech technology can find its rightful role in enhancing any man-machine interface.
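The prioritization and inhibition requirement for spoken messages mentioned above can be illustrated with a minimal sketch. It is not the authors' design: the abstract refers to logic trees, whereas this sketch substitutes a simpler priority queue with a flat inhibition table. The class name `MessageArbiter`, the category names, the numeric priorities (lower means more urgent), and the rule that a spoken message discards queued messages of the categories it inhibits are all assumptions made for illustration.

```python
import heapq
import itertools

# Hypothetical inhibition table: when a message of the key category is
# spoken, queued messages of the listed categories are discarded.
INHIBITS = {
    "warning": {"advisory"},
}


class MessageArbiter:
    """Releases one spoken message at a time, most urgent first."""

    def __init__(self, inhibits=None):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order
        self._inhibits = INHIBITS if inhibits is None else inhibits

    def submit(self, category, priority, text):
        # Lower priority number = more urgent (an assumed convention).
        entry = (priority, next(self._counter), category, text)
        heapq.heappush(self._heap, entry)

    def next_message(self):
        """Return (category, text) of the next message, or None if idle."""
        if not self._heap:
            return None
        _, _, category, text = heapq.heappop(self._heap)
        # Drop queued messages that the released message inhibits.
        suppressed = self._inhibits.get(category, set())
        if suppressed:
            self._heap = [m for m in self._heap if m[2] not in suppressed]
            heapq.heapify(self._heap)
        return category, text


# Illustrative messages only; none of these come from the source.
arbiter = MessageArbiter()
arbiter.submit("advisory", 3, "Routine check due")
arbiter.submit("warning", 1, "Immediate action required")
arbiter.submit("caution", 2, "Pressure low")

first = arbiter.next_message()   # the warning, which suppresses the advisory
second = arbiter.next_message()  # the caution
third = arbiter.next_message()   # None: the advisory was inhibited
```

Because only one message is released per call, two messages never reach the operator at once. Whether an inhibited message should be discarded, as here, or deferred until later is a design question the abstract does not settle.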

Full Text