Abstract

This chapter presents a tutorial introduction to modern information retrieval concepts, models, and systems. It begins with a reference architecture for current Information Retrieval (IR) systems, which provides a backdrop for the rest of the chapter. Text preprocessing is discussed using a mini Gutenberg corpus. Next, a categorization of IR models is presented, followed by a description of the Boolean IR model. The positional index is introduced, and the execution of phrase and proximity queries is discussed. Various term weighting schemes are discussed next, followed by descriptions of three IR models: Vector Space, Probabilistic, and Language models. Approaches to evaluating IR systems are presented. Relevance feedback techniques are described as a means of improving retrieval effectiveness. Various IR libraries, frameworks, and test collections are indicated. The chapter concludes by outlining facets of IR research and indicating additional reading.

