Abstract

The inestimable volumes of multimedia associated with spoken documents that have been made available to the public over the past two decades have brought spoken document understanding and organization to the forefront of research. Among the related subtasks, spoken document indexing, retrieval, and summarization can be regarded as the cornerstones of this research area. Statistical language modeling (LM), which purports to quantify the acceptability of a given piece of text, has long been an interesting yet challenging research area, and LM-based approaches to spoken document processing have enjoyed remarkable empirical success. Motivated by the great importance of and interest in language modeling for various spoken document processing tasks (i.e., indexing, retrieval, and summarization), language modeling forms the backbone of this thesis. In real-world applications, a serious challenge faced by search engines is that queries usually consist of only a few words to express users’ information needs. This thesis starts with a general survey of this practical challenge, and then not only proposes a principled framework that unifies the relationships among several widely used approaches but also extends this family of techniques to spoken document summarization. Next, inspired by the i-vector technique, an i-vector based language modeling framework is proposed for spoken document retrieval and reformulated to represent users’ information needs more accurately. Furthermore, although language models have shown preliminary success in extractive speech summarization, a central challenge facing the LM approach is how to formulate sentence models and accurately estimate their parameters for each sentence in the spoken document to be summarized.
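As a purely illustrative point of reference for the LM-based retrieval setting described above, a standard baseline is the query-likelihood criterion, which scores each document by the probability that a language model estimated from the document generates the query. The sketch below assumes a simple unigram model with Jelinek-Mercer smoothing and an arbitrary interpolation weight `lam`; it is not claimed to match the thesis's exact formulation.

```python
from collections import Counter

def query_likelihood(query, doc, collection, lam=0.5):
    """Score a document by P(query | doc) under a unigram language model.

    Jelinek-Mercer smoothing interpolates the document model with a
    collection (background) model; `lam` is an assumed, untuned weight.
    `query`, `doc`, and `collection` are lists of word tokens.
    """
    doc_counts = Counter(doc)
    col_counts = Counter(collection)
    score = 1.0
    for w in query:
        p_doc = doc_counts[w] / len(doc) if doc else 0.0
        p_col = col_counts[w] / len(collection)
        score *= lam * p_doc + (1 - lam) * p_col
    return score
```

Because smoothing assigns non-zero probability to query words absent from a document, even the very short queries mentioned above still yield a usable ranking over all documents.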
To address this challenge, the thesis proposes a framework that builds on recurrent neural network language models and a curriculum learning strategy, which shows promise in capturing not only word usage cues but also long-span structural information about word co-occurrence relationships within spoken documents, thus eliminating the strict bag-of-words assumption made by most existing LM-based methods. Lastly, word embedding has become a popular research area owing to its excellent performance in many natural language processing (NLP) tasks; however, as far as we are aware, relatively few studies have investigated its use in extractive text or speech summarization. To this end, the thesis first builds novel and efficient ranking models based on general word embedding methods for extractive speech summarization, and then proposes a novel probabilistic modeling framework for learning word and sentence representations, which not only inherits the advantages of the original word embedding methods but also boasts a clear and rigorous probabilistic foundation.
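As a rough illustration of embedding-based sentence ranking for extractive summarization (a common baseline, not the thesis's actual models), each sentence can be represented by the average of its word vectors and ranked by cosine similarity to the whole-document vector. The toy two-dimensional embeddings and the simple averaging scheme below are assumptions made solely to keep the example self-contained and runnable.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def sentence_vector(words, embeddings, dim):
    """Average the embeddings of in-vocabulary words (zero vector if none)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(component) / len(vecs) for component in zip(*vecs)]

def rank_sentences(sentences, embeddings, dim=2):
    """Return sentence indices ordered by similarity to the document vector."""
    doc_words = [w for s in sentences for w in s]
    doc_vec = sentence_vector(doc_words, embeddings, dim)
    scored = [(cosine(sentence_vector(s, embeddings, dim), doc_vec), i)
              for i, s in enumerate(sentences)]
    return [i for _, i in sorted(scored, reverse=True)]
```

Sentences closest to the document's overall semantic direction rank highest; a real system would use pretrained high-dimensional embeddings and the richer representation-learning models the thesis develops.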

