Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing

Xiaozhong Liu, Ying Ding, Miao Chen, Min Song

doi:10.1145/2513549

Abstract

It is our great pleasure to welcome you to the 2013 ACM International Workshop on Mining Unstructured Big Data using Natural Language Processing, which will be held at the ACM International Conference on Information and Knowledge Management (CIKM 2013). Unstructured text data is heterogeneous and comes in many forms, such as text documents, scientific publications, web pages, and customer comments. The availability of many large unstructured text datasets both enables and challenges researchers to discover and explore valuable information and knowledge using different techniques. Mining semantics with Natural Language Processing (NLP) methodologies is an important approach to uncovering the latent knowledge and semantics of unstructured text data. Over the past decade, a number of NLP-based features have been used successfully to enhance the performance of text mining and information retrieval systems, but challenges remain. For instance, most NLP algorithms are computationally expensive and can hardly be applied directly to large-scale text data in online systems. This workshop brings together different but closely related research communities, namely NLP, text mining, and information retrieval (IR) researchers, to investigate the opportunities and challenges in semantic mining. Nine interesting papers, covering semantic analysis, social media mining, real-time information extraction, and other topics, will be presented. The workshop offers the NLP and text mining research communities an opportunity to draw on their research experience to clarify the opportunities and challenges in NLP-based semantic mining for big unstructured text data. We also encourage attendees to attend the keynote presentation, "HathiTrust Data, Opportunities and Challenges for Text Mining and NLP," by Dr. Beth A. Plale, Director of the Data to Insight Center and Professor at the School of Informatics and Computing, Indiana University.
HathiTrust is a partnership of academic and research institutions, offering a collection of millions of titles digitized from libraries around the world, as well as effective API access. We hope that you will find this program interesting and thought-provoking, and that the workshop will provide you with a valuable opportunity to share ideas with other researchers and practitioners from institutions around the world.
