Abstract

It is now time to introduce probabilistic approaches to data mining and machine learning in a formal way. We begin by outlining fundamental aspects of probability theory that are widely used in data mining and practical machine learning. The maximum likelihood approach is presented, along with methods for learning with hidden variables, including the well-known expectation maximization algorithm. Maximum likelihood methods are situated in the context of more Bayesian approaches, and the roles of variational methods and sampling procedures are also discussed. Bayesian networks are presented and used to describe a wide variety of methods, such as mixture models, principal component analysis, and latent Dirichlet allocation, as well as other probabilistic methods. Conditional probability models are discussed, including key derivations for regression and multiclass classification, and their relationships to widely used generalized linear models are outlined. Methods for modeling and extracting information from sequences and for structured prediction problems are also presented, including Markov models, hidden Markov models, Markov random fields, and conditional random fields. Software packages that specialize in creating and using these models and methods are also surveyed.
