Abstract

With the spread of internet technology, online channels have become the main means of news dissemination, but redundant page elements such as navigation bars and advertisements hinder readers' access to news content. This research aims to let users obtain clean news content from cluttered web pages. First, based on the narrative characteristics of literary news, the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm is employed to extract the news content from parsed web pages. The algorithm uses keyword matching, text analysis, and semantic processing to determine the boundaries and key information of the news content. Second, a news text classification algorithm is selected from among support vector machine, K-nearest neighbor, and AdaBoost through comparative experiments, and a news extraction system based on keyword features and an extended Document Object Model (DOM) tree is constructed. DOM parsing is used to analyze the web page structure and extract key elements and information. Finally, the narrative characteristics of American literary news are derived by studying the narrative order and structure of 15 American literary news reports. The results reveal that the most common narrative orders in American literary news are chronological narration and flashback. Narrative duration is dominated by ellipsis and summary, supplemented by scenes and pauses. In addition, 53.3% of the narrative structures used in literary news are time-connected. This structure gives reporters a clear conceptual framework when writing, helps readers quickly grasp the context of an event and the life course of the protagonists in the report, and increases the report's readability. This research on the narrative characteristics of American literary news can provide media practitioners with a reference on news narrative techniques and strategies.
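As a rough illustration of the TF-IDF weighting named in the abstract, the sketch below scores the terms of a few text blocks so that words shared by boilerplate blocks receive lower weights than words specific to one block. It is not the authors' implementation: the function name `tf_idf`, the sample blocks, and the unsmoothed variant tf × log(N / df) are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical text blocks from one page: two navigation-like blocks
# and one block of news body text.
blocks = [
    "home news sports login".split(),
    "home news subscribe login".split(),
    "the mayor opened the new bridge on monday".split(),
]
weights = tf_idf(blocks)
```

In this toy input, terms such as "home" that recur across blocks score lower than terms that appear in only one block, which is the property a keyword-based extractor would rely on. A term that occurs in every block scores exactly zero under this variant.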
