Abstract

Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias–variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton–proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute.

Highlights

  • Splitting the data into mutually exclusive training and test sets provides an unbiased estimate of the predictive performance of the model; this is known as cross-validation in the Machine Learning (ML) and statistics literature (see the first sketch after this list)

  • Stochastic Gradient Descent (SGD) is almost always used with a “momentum” or inertia term that serves as a memory of the direction we are moving in parameter space (see the second sketch after this list)
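
The first highlight can be illustrated with a minimal sketch using scikit-learn. The synthetic regression data, the linear model, and the 80/20 split ratio below are illustrative assumptions, not the review's own notebooks.

```python
# Hold-out estimate of predictive performance via a train/test split.
# Synthetic regression data stand in for a physics dataset (e.g. Ising
# configurations); the 80/20 split and the linear model are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                            # features
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=1000)  # noisy targets

# The test set is never seen during fitting, so its score gives an
# (approximately) unbiased estimate of out-of-sample performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("in-sample R^2:    ", model.score(X_train, y_train))
print("out-of-sample R^2:", model.score(X_test, y_test))
```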
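The second highlight can likewise be sketched in a few lines: gradient descent with a momentum term on a simple quadratic cost. The cost function and the hyperparameters eta (learning rate) and gamma (momentum coefficient) are illustrative assumptions.

```python
# Gradient descent with momentum: the velocity v is a running memory of
# the direction of motion in parameter space. The quadratic cost and the
# hyperparameter values are illustrative assumptions.
import numpy as np

A = np.array([[10.0, 0.0],
              [0.0,  1.0]])   # ill-conditioned cost E = 0.5 * theta^T A theta

def grad_E(theta):
    return A @ theta

theta = np.array([1.0, 1.0])  # initial parameters
v = np.zeros_like(theta)      # velocity (momentum) term
eta, gamma = 0.02, 0.9        # learning rate, momentum coefficient

for _ in range(300):
    v = gamma * v + eta * grad_E(theta)  # accumulate past gradients
    theta = theta - v                    # step along the smoothed direction

print(theta)  # converges toward the minimum at the origin
```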

Summary

INTRODUCTION

Machine Learning (ML), data science, and statistics are fields that describe how to learn from, and make predictions about, data. The review is based on an advanced topics graduate course taught at Boston University in the Fall of 2016. As such, it assumes a level of familiarity with several topics found in graduate physics curricula (partition functions, statistical mechanics) and a fluency in mathematical techniques such as linear algebra, multivariate calculus, variational methods, probability theory, and Monte-Carlo methods. It also assumes a familiarity with basic computer programming and algorithmic design.

What is Machine Learning?
Scope and structure of the review
Setting up a problem in ML and data science
Polynomial Regression
BASICS OF STATISTICAL LEARNING THEORY
Bias-Variance Decomposition
GRADIENT DESCENT AND ITS GENERALIZATIONS
Gradient Descent and Newton’s method
Limitations of the simplest gradient descent algorithm
Adding Momentum
Methods that use the second moment of the gradient
Comparison of various methods
Gradient descent in practice: practical tips
OVERVIEW OF BAYESIAN INFERENCE
Bayes Rule
Bayesian Decisions
Hyperparameters
LINEAR REGRESSION
Using Linear Regression to Learn the Ising Hamiltonian
Recap and a general perspective on regularizers
LOGISTIC REGRESSION
Identifying the phases of the 2D Ising model
An Example of SoftMax Classification
COMBINING MODELS
Revisiting the Bias-Variance Tradeoff for Ensembles
Bias-Variance Decomposition for Ensembles
Summarizing the Theory and Intuitions behind Ensembles
Applications to the Ising model and Supersymmetry Datasets
The basic building block: neurons
Dropout
Batch Normalization
Deep learning packages
Approaching the learning problem
SUSY dataset
The structure of convolutional neural networks
Convolution
HIGH-LEVEL CONCEPTS IN DEEP NEURAL NETWORKS
Neural networks as representation learning
Some of the challenges of high-dimensional data
DIMENSIONAL REDUCTION AND DATA VISUALIZATION
CLUSTERING
Maximization
Ward’s linkage
Clustering and Latent Variables via the Gaussian Mixture Models
Maximization step
ENERGY BASED MODELS
An overview of energy-based generative models
MaxEnt models in statistical mechanics
Generalized Ising Models from MaxEnt
Cost functions for training energy-based models
Maximum likelihood
Regularization
DEEP GENERATIVE MODELS
Practical Considerations
VAEs as variational models
Training via the reparametrization trick
Connection to the information bottleneck
Implementing the Gaussian VAE
VAEs for the MNIST dataset
OUTLOOK
Research at the intersection of physics and ML
Rebranding Machine Learning as “Artificial Intelligence”
Ising dataset
MNIST Dataset
