Complex System Analysis of Social Networks Extracted from Literary Fictions

Gyeong-Mi Park, Hwan-Gue Cho, Sung-Hwan Kim, Hye-Ryeon Hwang

doi:10.7763/ijmlc.2013.v3.282

Abstract

Social network analysis, which focuses on social entities, has recently been applied in the social and behavioral sciences and in web science, as well as in economics and marketing. In this paper we present a method for constructing a social network from literary fiction by simple lexical analysis, without using complex natural language processing tools. We show that these social graphs, which we call literary social graphs, exhibit a power-law distribution in some features, which is a typical characteristic of complex systems. We also show that the social network extracted from literary data reflects a network structure similar to the one semantically designed by the authors of the fiction. In addition, we propose the concept of the kernel of a literary social network, by which we can classify the abstract level of the protagonists appearing in a work of fiction. Our study shows that the metric distance among characters in the linear text is very similar to the intrinsic, semantic relationships described by fiction writers, which implies that the proposed social network could serve as another representation of literary fiction. We can therefore apply other scientific and quantitative approaches by analyzing the concrete social graph model extracted from textual data.

Extracting useful information from large textual repositories is becoming essential in the data mining field. One difficulty in this work is how to deal with the variety of natural languages. Most work has been based on English texts, but recently work on other languages, including East Asian languages, has shown interesting results. With the recent prevalence of SNS (social network services), people try to extract sentiment from the text data shared among online users. After obtaining the attitude of an individual, we are able to identify the connection structure of the elements in a community. The main issue in this kind of sentiment analysis is how to identify the polarity of adjectives based on the conjunctions linking them in a large corpus. Another active research area is mining information from online discussions by observing discussion threads. By mining the words used or the reply patterns within discussion threads, we can identify friendly or conflicting groups. Our basic idea is that we can regard a complicated text (e.g. a long literary fiction) as a typical complex system.
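
The abstract does not spell out the lexical procedure, so the following Python sketch is only one plausible reading of "simple lexical analysis" and "metric distance among characters in the linear text". It assumes that the character names are known in advance, that each name is a single token, and that two characters are linked whenever their mentions fall within a fixed number of tokens of each other. The function names and the `window` parameter are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def build_cooccurrence_graph(text, names, window=10):
    """Count how often two different names occur within `window` tokens.

    Assumption (not from the paper): a link is a window-based
    co-occurrence of single-token character names.
    Returns a Counter mapping a sorted (name_a, name_b) pair to a weight.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    name_set = set(names)
    # Positions of every character mention in the linear text.
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in name_set]

    edges = Counter()
    for idx, (i, a) in enumerate(mentions):
        for j, b in mentions[idx + 1:]:
            if j - i > window:
                break  # later mentions are even farther away
            if a != b:
                edges[tuple(sorted((a, b)))] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each character."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree
```

Applied to a full novel, the histogram of `weighted_degree(...)` values could then be inspected on a log-log scale to check for the heavy-tailed, power-law-like shape the abstract reports. That check is a suggestion for exploration, not a reproduction of the authors' analysis.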
