Topological Data Analysis Methods for Data Set Characterization to Augment Machine Learning Methods

Christopher Griffin, Trevor Karn, Benjamin Apple

doi:10.48448/2xhx-bm27

Abstract

Neural networks have proven to be extremely effective tools for machine learning in broad contexts, including natural language processing, image processing, and adversarial game playing (for example, chess and Go). Despite this success, the design of a neural network (which is inescapably tied to its performance) is often a matter of iterated re-engineering to find the architecture that performs best on a small set of predetermined metrics. As a result, the parsimony of such systems is vague at best. Over the past few years it has become apparent that the mathematics of topology can be used to understand some theoretical fundamentals of neural networks. Roughly speaking, topology is the field of mathematics that studies spaces that are considered to be "the same" up to a continuous stretching (i.e., homeomorphism). For that reason it is known as "rubber sheet geometry." The main reason topology provides tools to study neural networks is that a deep neural network can be thought of as an iterated sequence of continuous transformations between spaces. In this work we develop an algorithm that automatically generates a trainable deep neural network from the geometric and topological properties of the training data. Our approach first finds a dense ellipsoidal covering of the training data set that is consistent with the classification information. We then find an (approximately) minimum sub-cover that models the classification information. A neural network is constructed that approximates the structure of the minimum sub-cover and encodes logical statements representative of the data geometry. We show empirically that, after training, the neural network retains the imprinted geometric information, making each module in the neural network geometrically interpretable. Theoretical and experimental results characterize this approach across a variety of data sets.
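The covering and sub-cover steps described above can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps the ellipsoids for Euclidean balls centred on training points, takes each radius to be the distance to the nearest point of the other class, and approximates the minimum sub-cover with the standard greedy set-cover heuristic. The function names `consistent_balls` and `greedy_subcover` are hypothetical.

```python
import numpy as np


def consistent_balls(X, y):
    """Build one open ball per training point (an assumed stand-in for ellipsoids).

    The radius of ball i is the distance from X[i] to the nearest point with a
    different label, so no open ball contains a point of the other class.
    Returns the radii and a boolean matrix member[i, j] = "ball i contains point j".
    """
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    other_class = y[:, None] != y[None, :]
    radii = np.where(other_class, D, np.inf).min(axis=1)
    member = D < radii[:, None]  # strict inequality keeps the other class out
    return radii, member


def greedy_subcover(member):
    """Greedy set cover: repeatedly pick the ball covering the most uncovered points."""
    uncovered = np.ones(member.shape[1], dtype=bool)
    chosen = []
    while uncovered.any():
        gains = (member & uncovered).sum(axis=1)
        best = int(gains.argmax())
        if gains[best] == 0:
            # Happens only if some point coincides with a point of the other class.
            raise ValueError("some points cannot be covered consistently")
        chosen.append(best)
        uncovered &= ~member[best]
    return chosen
```

Calling `consistent_balls(X, y)` on a feature matrix `X` and label vector `y`, then passing the membership matrix to `greedy_subcover`, returns the indices of the selected ball centres. The greedy choice is only an approximation to a minimum sub-cover, which matches the "(approximately) minimum" wording above but says nothing about how the paper computes it.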
In addition, we show how the proposed approach, when combined with methods from topological data analysis (specifically homology), can be used to quantify the likelihood that any neural network classifier will perform well on a given binary classification data set before neural network engineering takes place. All results are illustrated using multiple data sets.
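As a rough illustration of the kind of homological quantity involved, the sketch below computes the zeroth Betti number (the number of connected components) of a point cloud at a fixed scale. The abstract does not say which homological features the authors use or how they relate them to classifier performance, so this shows only the simplest such invariant. The function name `betti0` and the scale parameter `eps` are hypothetical.

```python
import numpy as np


def betti0(X, eps):
    """Count connected components of the graph joining points at distance <= eps.

    This equals the zeroth Betti number of the Vietoris-Rips complex at scale eps.
    """
    n = len(X)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    for i in range(n):
        for j in range(i + 1, n):
            if D[i, j] <= eps:
                parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(n)})
```

Applied separately to the points of each class, a count like this gives a coarse picture of how fragmented a class is at a given scale; a dedicated topological data analysis library would be the usual choice for higher-dimensional homology or persistence.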
