The cockpit voice recorder (CVR) is one of the two mandatory flight recording devices carried aboard commercial aircraft. Its analysis is crucial to understanding the context of an air incident or accident. In such scenarios, when the audio recordings are usable, they may contain strong mixtures of crew member speech signals, radio communications, and cockpit alarms. Yet, contrary to the “cocktail party problem” that blind source separation (BSS) aims to tackle, CVR mixtures (here named the “cockpit party problem”) have never been modeled before. In this paper, the authors thus propose a CVR mixture model and highlight its limitations. Although doing so is not trivial, even in a two-source scenario, BSS methods can be applied to real CVR recordings. It is found that taking into account several BSS outputs provided by various methods may help audio analysts transcribe the CVR data. Specifically, nearly 90% of unintelligible words can be transcribed from CVR recordings processed by BSS methods.