Abstract

Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias–variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton–proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute.

Highlights

  • Splitting the data into mutually exclusive training and test sets provides an unbiased estimate of the predictive performance of the model; this is known as cross-validation in the Machine Learning (ML) and statistics literature (see the first sketch after this list)

  • Stochastic Gradient Descent (SGD) is almost always used with a “momentum” or inertia term that serves as a memory of the direction we are moving in parameter space (see the second sketch after this list)
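
The first highlight can be illustrated with a minimal sketch using scikit-learn. The synthetic regression data, the linear model, and the 80/20 split ratio below are illustrative assumptions, not the review's own notebooks.

```python
# Hold-out estimate of predictive performance via a train/test split.
# Synthetic regression data stand in for a physics dataset (e.g. Ising
# configurations); the 80/20 split and the linear model are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                            # features
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=1000)  # noisy targets

# The test set is never seen during fitting, so its score gives an
# (approximately) unbiased estimate of out-of-sample performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("in-sample R^2:    ", model.score(X_train, y_train))
print("out-of-sample R^2:", model.score(X_test, y_test))
```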
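The second highlight can likewise be sketched in a few lines: gradient descent with a momentum term on a simple quadratic cost. The cost function and the hyperparameters eta (learning rate) and gamma (momentum coefficient) are illustrative assumptions.

```python
# Gradient descent with momentum: the velocity v is a running memory of
# the direction of motion in parameter space. The quadratic cost and the
# hyperparameter values are illustrative assumptions.
import numpy as np

A = np.array([[10.0, 0.0],
              [0.0,  1.0]])   # ill-conditioned cost E = 0.5 * theta^T A theta

def grad_E(theta):
    return A @ theta

theta = np.array([1.0, 1.0])  # initial parameters
v = np.zeros_like(theta)      # velocity (momentum) term
eta, gamma = 0.02, 0.9        # learning rate, momentum coefficient

for _ in range(300):
    v = gamma * v + eta * grad_E(theta)  # accumulate past gradients
    theta = theta - v                    # step along the smoothed direction

print(theta)  # converges toward the minimum at the origin
```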

Summary

INTRODUCTION

Machine Learning (ML), data science, and statistics are fields that describe how to learn from, and make predictions about, data. The review is based on an advanced topics graduate course taught at Boston University in the Fall of 2016. As such, it assumes a level of familiarity with several topics found in graduate physics curricula (partition functions, statistical mechanics) and a fluency in mathematical techniques such as linear algebra, multivariate calculus, variational methods, probability theory, and Monte-Carlo methods. It also assumes a familiarity with basic computer programming and algorithmic design.

What is Machine Learning?
Scope and structure of the review
Setting up a problem in ML and data science
Polynomial Regression
BASICS OF STATISTICAL LEARNING THEORY
Bias-Variance Decomposition
GRADIENT DESCENT AND ITS GENERALIZATIONS
Gradient Descent and Newton’s method
Limitations of the simplest gradient descent algorithm
Adding Momentum
Methods that use the second moment of the gradient
Comparison of various methods
Gradient descent in practice: practical tips
OVERVIEW OF BAYESIAN INFERENCE
Bayes Rule
Bayesian Decisions
Hyperparameters
LINEAR REGRESSION
Using Linear Regression to Learn the Ising Hamiltonian
Recap and a general perspective on regularizers
LOGISTIC REGRESSION
Identifying the phases of the 2D Ising model
An Example of SoftMax Classification
COMBINING MODELS
Revisiting the Bias-Variance Tradeoff for Ensembles
Bias-Variance Decomposition for Ensembles
Summarizing the Theory and Intuitions behind Ensembles
Applications to the Ising model and Supersymmetry Datasets
The basic building block: neurons
Dropout
Batch Normalization
Deep learning packages
Approaching the learning problem
SUSY dataset
The structure of convolutional neural networks
Convolution
HIGH-LEVEL CONCEPTS IN DEEP NEURAL NETWORKS
Neural networks as representation learning
Some of the challenges of high-dimensional data
DIMENSIONAL REDUCTION AND DATA VISUALIZATION
CLUSTERING
Maximization
Ward’s linkage
Clustering and Latent Variables via the Gaussian Mixture Models
Maximization step
ENERGY BASED MODELS
An overview of energy-based generative models
MaxEnt models in statistical mechanics
Generalized Ising Models from MaxEnt
Cost functions for training energy-based models
Maximum likelihood
Regularization
DEEP GENERATIVE MODELS
Practical Considerations
VAEs as variational models
Training via the reparametrization trick
Connection to the information bottleneck
Implementing the Gaussian VAE
VAEs for the MNIST dataset
OUTLOOK
Research at the intersection of physics and ML
Rebranding Machine Learning as “Artificial Intelligence”
Ising dataset
MNIST Dataset
