Recent developments in deep learning have revolutionized Speech and Language Technologies (SLT). Deep learning models often rely on massive naturalistic datasets to support the model complexity required for superior performance. However, most massive SLT datasets are not publicly available, which limits their potential for academic research. In this work, we showcase the CRSS-UTDallas-led efforts to recover, digitize, and openly distribute over 50,000 hours of speech data recorded during the 12 NASA Apollo manned missions, and we outline our continuing efforts to digitize the remaining 100,000 hours and create metadata for them through diarization. We present novel deep learning-based speech processing solutions developed to extract high-level information from this massive dataset. The Fearless-Steps APOLLO resource is a 50,000-hour audio collection from 30-track analog tapes originally used to document Apollo missions 1, 7, 8, 10, 11, and 13. A customized tape read-head, developed to digitize all 30 channels simultaneously, has been deployed to expedite digitization of the remaining mission tapes. Diarized transcripts for these unlabeled audio communications have also been generated to facilitate open research by the speech science, historical archive, education, and speech technology communities. Robust technologies developed to generate human-readable transcripts include: (i) speaker diarization, (ii) speaker tracking, and (iii) text output from speech recognition systems.