Linear Algebra and Learning From Data [Bookshelf]

George Cybenko

doi:10.1109/mcs.2020.2976390

Abstract

This book contains the key linear algebra and optimization techniques at the forefront of active data science and machine learning practice today. This is an appropriate choice of content because, while state-of-the-art machine learning applications can change each month (as in reinforcement learning, language translation, game playing, or image classification), the underlying mathematical concepts and algorithms do not. The text is an offspring of a current Massachusetts Institute of Technology (MIT) mathematics course, Matrix Methods in Data Analysis and Signal Processing, which, in turn, was greatly influenced by a current University of Michigan electrical engineering and computer science course, Matrix Methods for Signal Processing, Data Analysis, and Machine Learning. Both courses are aimed at advanced undergraduate and graduate students in science and engineering. The book is divided into seven parts. Part 1 presents highlights of linear algebra and covers a variety of basic matrix concepts, starting with matrix multiplication and moving up to singular values, principal components, low-rank approximations, and matrix/tensor factorizations. Part 2 discusses computation with large matrices, focusing on matrix factorizations, iterative methods, and recent powerful algorithms for approximating matrix problem solutions using randomization and projection. Due to the writing style, this is one of those books that you can open to almost any page and find a topic you did not know about or did not fully understand. After reading a few pages, it is quite likely that you will learn something new and useful. To me, such a resource has great value above and beyond its use solely as a textbook. I believe this textbook would be valuable for a matrix methods class oriented toward signal processing and machine learning. It would also be useful as a reference for someone who is self-teaching or reviewing material learned years ago (but not used recently).
Given that there is no discussion of system identification or recurrent networks for learning systems or automata, the book may be slightly less appealing to control theorists. Overall, a book of this kind is a very ambitious project. It covers many topics, several of which are typically reserved for dedicated graduate courses. Consequently, some depth must be sacrificed when such breadth is attempted. To help readers delve deeper, a more coherent approach to presenting references would be useful. As it stands, some references are embedded in the main text, while others are listed at the end of a section or chapter, but with inconsistent cross-references. Some sections simply do not reference anything for further reading. Although one might argue that this is less of a problem (because Google Scholar and other web searches can help find original articles quite efficiently), it would still be helpful to get pointers about where to start. As this textbook becomes more widely used and the material continues to evolve, I believe a second edition that combines Jupyter notebooks with MATLAB, Python, or TensorFlow examples would be a very welcome addition, consistent with the spirit of how machine learning practitioners actually work. In summary, this text is a valuable foundational asset for teaching, learning, and reference and is surely worth considering for your classroom and/or personal library.
