Sensing Semantics of RSS Feeds by Fuzzy Matchmaking

M.W. Yuan, X.N. Wang, P. Jiang, J. Zhu

doi:10.4236/iim.2010.22014

Abstract

RSS feeds provide a fast and effective way to publish up-to-date information or renew outdated content for information subscribers. So far, RSS information has mostly been managed by content publishers, and Internet users have little initiative to choose what they really need. More attention needs to be paid to techniques for user-initiated information discovery from RSS feeds. In this paper, a quantitative semantic matchmaking method for RSS-based applications is proposed. Semantic information is extracted from an RSS feed as numerical vectors, so that semantic matching can then be conducted quantitatively. An ontology is applied to provide a commonly agreed matching basis for the quantitative matchmaking. To avoid the semantic ambiguity of literal statements from distributed and heterogeneous RSS publishers, fuzzy inference is used to transform an individual-dependent vector into an individual-independent vector. Semantic similarities can be revealed as a result.

Highlights

The Internet is a complex environment with dynamically changing contents and large-scale distributed users.
This paper proposes a quantitative method for information acquisition from RSS feeds with the aid of Semantic Web techniques.
BEGIN
  Get root;
  conceptVector(0) = root.Name;
  pos = 0;   // index of the last concept stored so far
  l = 0;     // current level; level 0 holds only the root
  count the childNum of root;
  While childNum > 0
    k = 0;   // counts the children of the entities in level l
    For each entity i in level l
      For each child j of entity i
        k++;
        conceptVector(pos + k) = j.Name;
      Endfor
    Endfor
    pos = pos + k;   // advance past the concepts just appended
    childNum = k;
    l = l + 1;
  Endwhile
END

After flattening an ontology graph into a concept vector, a semantic distance matrix can be obtained to depict the semantic difference between any two concepts defined in the ontology.
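The level-by-level flattening described above can be sketched in Python as a breadth-first traversal. This is a minimal illustration, not the authors' implementation. The `children` dictionary, the function names, and the job-related concept names are hypothetical. The distance shown is a plain hop count between concepts in a tree-shaped ontology, which is an assumption made here because this section does not give the paper's exact distance definition.

```python
from collections import deque

def flatten_ontology(root, children):
    """Flatten a tree-shaped ontology into a concept vector, level by level."""
    concept_vector = [root]
    queue = deque([root])
    while queue:
        concept = queue.popleft()
        for child in children.get(concept, []):
            concept_vector.append(child)
            queue.append(child)
    return concept_vector

def hop_distance_matrix(concept_vector, children):
    """Pairwise hop counts between concepts (assumed stand-in distance)."""
    # Assumption: every concept has at most one parent (a tree).
    parent = {c: p for p, kids in children.items() for c in kids}

    def path_to_root(concept):
        path = [concept]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        return path

    paths = {c: path_to_root(c) for c in concept_vector}
    n = len(concept_vector)
    dist = [[0] * n for _ in range(n)]
    for i, a in enumerate(concept_vector):
        steps_from_a = {node: steps for steps, node in enumerate(paths[a])}
        for j, b in enumerate(concept_vector):
            # Walk up from b until reaching an ancestor shared with a.
            for steps_from_b, node in enumerate(paths[b]):
                if node in steps_from_a:
                    dist[i][j] = steps_from_a[node] + steps_from_b
                    break
    return dist

# Hypothetical ontology fragment for illustration only.
children = {
    "Job": ["IT", "Finance"],
    "IT": ["Programmer", "Tester"],
    "Finance": ["Accountant"],
}
vector = flatten_ontology("Job", children)
matrix = hop_distance_matrix(vector, children)
```

Here `vector` lists the root first, then each level from left to right, so row and column `i` of `matrix` refer to the concept `vector[i]`.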

Summary

Introduction

The Internet is a complex environment with dynamically changing contents and large-scale distributed users. Because heterogeneous RSS publishers act autonomously in an open environment, it is impossible to force them to use strictly consistent terminology and sentences. This introduces ambiguities into text mining of RSS feeds and makes the keyword-based vector space model [15] difficult to apply in RSS-based applications. This paper proposes a quantitative method for information acquisition from RSS feeds with the aid of Semantic Web techniques. The proposed method targets general RSS feed formats: it can be applied to RSS 1.0, RSS 2.0, or other RSS-like formats (e.g. Atom). It acts as a real-time sensor of RSS feeds on the Internet, with the capability of semantic awareness and fuzzy sensing.
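To illustrate what handling several feed formats involves before any semantic matching, the sketch below pulls the title and descriptive text out of each entry of an RSS 2.0, RSS 1.0, or Atom document using Python's standard `xml.etree.ElementTree`. It is only a sketch under stated assumptions: the function name `extract_entries` and the sample feed are hypothetical, and the paper's own extraction step may differ.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # Atom namespace
RSS1 = "{http://purl.org/rss/1.0/}"      # RSS 1.0 (RDF) namespace

def extract_entries(xml_text):
    """Return (title, text) pairs for every item or entry in a feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for elem in root.iter():
        if elem.tag == "item":                # RSS 2.0: no namespace
            title = elem.findtext("title", default="")
            body = elem.findtext("description", default="")
        elif elem.tag == RSS1 + "item":       # RSS 1.0
            title = elem.findtext(RSS1 + "title", default="")
            body = elem.findtext(RSS1 + "description", default="")
        elif elem.tag == ATOM + "entry":      # Atom
            title = elem.findtext(ATOM + "title", default="")
            body = elem.findtext(ATOM + "summary", default="")
        else:
            continue
        entries.append((title.strip(), body.strip()))
    return entries

# Hypothetical RSS 2.0 feed used only for illustration.
sample = (
    '<rss version="2.0"><channel><title>Jobs</title>'
    "<item><title>Java developer</title>"
    "<description>Senior role in Shanghai</description></item>"
    "</channel></rss>"
)
entries = extract_entries(sample)
```

The extracted text of each entry is the kind of literal statement that would then be mapped onto ontology concepts, instead of being compared keyword by keyword.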

Semantic Distance between Concepts in Ontology

Semantic Matchmaking of RSS Feeds

An RSS Filter Agent for Job Hunting

Findings

Conclusions