Abstract

This thesis considers probabilistic finite automata (PFA), distributions over sequences from finite alphabets, the links between the two, and their learnability. Pervasive in scientific fields ranging from computer science to electrical engineering to information theory, PFA models also find numerous practical applications in speech recognition, bioinformatics, and natural language processing. PFA are the most general of the many syntactic objects that provide probabilistic extensions of finite state machines. Closely related to hidden Markov models (HMMs), PFA have been the focus of extensive research, yet they continue to pose interesting theoretical and practical problems to this day. The thesis presents geometric insights into the PFA learning problem, a characterization theorem for the family of distributions induced by PFA models, and a number of applications of this theorem. For a subclass of PFA called probabilistic deterministic finite automata (PDFA), several learnability results are presented. These results place limits on the PDFA subclasses that are learnable using a class of algorithms collectively known as state merging. The sample complexity of learning general distributions over countable sets is considered, and lower and upper bounds, which asymptotically match up to a logarithmic factor, are developed. An example is constructed exhibiting a class of PDFA models that is efficiently learnable by state merging. It is demonstrated that the distributions induced by this class are not efficiently learnable by direct estimation (making no assumptions about the distribution's source), in the sense that the sample complexity is bounded below by an exponential in the number of states.

