Abstract
With the development of the Internet, a growing volume of user-generated content provides valuable information to the public. Microblogs are a new platform where people discuss all kinds of topics, and they give researchers a good opportunity to explore online public opinion. News collection and summarization have attracted considerable research attention. However, manually labeling large amounts of data is impractical because the task is time-consuming. In this paper, we focus on news summarization with few labeled samples. We propose a semi-supervised learning method to tackle the problem, employing Co-Training to extract news information. The posts and replies of a microblog are treated as two independent views for training a classification model. The entity, time, place, and incident of each news item are identified as well. Experimental results on different datasets show that the proposed method outperforms the baseline methods.
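The two-view idea described above can be illustrated with a minimal Co-Training sketch. It is not the paper's implementation: the abstract does not specify the classifier, the features, or the selection rule, so the naive Bayes model (`TinyNB`), the helpers `co_train` and `predict`, the `threshold` and `per_round` parameters, the "news"/"chat" labels, and the toy data are all assumptions made for illustration. Each round, one classifier is trained per view (post text, reply text), and each view's most confident predictions on unlabeled pairs are added to the shared labeled set.

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c])
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.counts[c][word] + 1) / total)
            scores[c] = score
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def fit_views(labeled):
    """Train one classifier per view: index 0 = post text, index 1 = reply text."""
    labels = [y for _, _, y in labeled]
    post_clf = TinyNB().fit([p for p, _, _ in labeled], labels)
    reply_clf = TinyNB().fit([r for _, r, _ in labeled], labels)
    return post_clf, reply_clf


def co_train(labeled, unlabeled, rounds=5, per_round=1, threshold=0.6):
    """labeled: (post, reply, label) triples; unlabeled: (post, reply) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picked = {}
        for view, clf in enumerate(fit_views(labeled)):
            scored = []
            for i, example in enumerate(pool):
                probs = clf.predict_proba(example[view])
                label = max(probs, key=probs.get)
                scored.append((probs[label], i, label))
            scored.sort(reverse=True)
            # Each view contributes its most confident pseudo-labels.
            for confidence, i, label in scored[:per_round]:
                if confidence >= threshold:
                    picked.setdefault(i, label)
        if not picked:
            break
        labeled += [(pool[i][0], pool[i][1], y) for i, y in picked.items()]
        pool = [ex for i, ex in enumerate(pool) if i not in picked]
    post_clf, reply_clf = fit_views(labeled)
    return post_clf, reply_clf, labeled


def predict(post_clf, reply_clf, post, reply):
    """Combine the two views by multiplying their class probabilities."""
    p, r = post_clf.predict_proba(post), reply_clf.predict_proba(reply)
    return max(p, key=lambda c: p[c] * r[c])


# Toy data, invented for this sketch.
seed = [
    ("earthquake hits city officials report damage",
     "hope everyone is safe terrible news", "news"),
    ("government announces new policy on tax",
     "is this report confirmed by officials", "news"),
    ("had a great lunch with friends today", "lol looks yummy", "chat"),
    ("so bored at home watching tv", "haha same here", "chat"),
]
pool = [
    ("officials report flood damage in city",
     "terrible news hope everyone is safe"),
    ("watching tv with friends at home", "haha lol same"),
]
post_clf, reply_clf, augmented = co_train(seed, pool)
print(predict(post_clf, reply_clf,
              "officials announce earthquake damage report", "is everyone safe"))
```

In this toy run the post view and the reply view each pseudo-label a different unlabeled pair, so the labeled set grows from four to six examples before the final classifiers are trained. Extracting the entity, time, place, and incident of a news item is a separate step that this sketch does not cover.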