Abstract

Without assuming any functional or distributional structure, we select collections of major factors embedded within response-versus-covariate (Re-Co) dynamics via two selection criteria, [C1: confirmable] and [C2: irreplaceable], which are based on information theoretic measurements. The two criteria are constructed under the computing paradigm called Categorical Exploratory Data Analysis (CEDA) and are linked to Wiener–Granger causality. All the information theoretic measurements, including conditional mutual information and entropy, are evaluated on the contingency table platform, which rests primarily on the categorical nature of all involved features, whatever their data type: quantitative or qualitative. Our selection task identifies one chief collection, together with several secondary collections, of major factors of various orders underlying the targeted Re-Co dynamics. The reliability of each selected collection, and of each major factor within it individually, is checked algorithmically against the finite sample phenomenon. The development of our selection protocol is illustrated in detail through two experimental examples: a simple one and a complex one. We then apply this protocol to two data sets pertaining to two somewhat related but distinct pitching dynamics of two pitch types: slider and fastball. In particular, we consider a specific Major League Baseball (MLB) pitcher and data from multiple seasons.

Highlights

  • Two news articles have recently been published on the topic of Major League Baseball (MLB) pitchers’ performance being drastically empowered or caused by baseball’s spin rate increases

  • Cole Went from So-So to Unbeatable?”, and the other is a 2021 New York Times article with the title, “Once Again, MLB Faces a Crisis of Its Own Making”

  • We demonstrate the merits of a Categorical Exploratory Data Analysis (CEDA)-based selection protocol for collections of major factors as a brand new means of studying the dynamics underlying a single complex system as well as multiple complex systems in a collective fashion

Summary

Introduction

Two news articles have recently been published on the topic of Major League Baseball (MLB) pitchers’ performance being drastically empowered or caused by increases in baseball spin rate. If we take a collection of pitches delivered by an MLB pitcher in a single season as a data set observed from pitcher-specific pitching dynamics with complexity [1,2], one fundamental problem, called the Many System Problem (MSP), underlies both news articles. Imposing no assumed functional structure upon the targeted Re-Co dynamics that characterize an MSP under study, we construct a selection protocol that employs information theoretic measures, such as conditional entropy and mutual information, to identify one chief collection and several alternative collections of major factors underlying the targeted Re-Co dynamics. Such a computational resolution, in the form of a collection of major factors, aims to realistically discover the constituent parts or mechanisms of the complex system under study. Based on the computed collections of major factors of various orders, we are able to draw a conclusion regarding the roles of spin rate in Cole’s slider and fastball pitching dynamics across the three MLB seasons.
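To make the information theoretic measures named above concrete, the following is a minimal sketch of how conditional entropy and mutual information can be read off a contingency table. It is an illustration under stated assumptions, not the selection protocol itself: the function names, the row/column convention (rows are categories of a covariate, columns are categories of the response), and the toy tables are all hypothetical and do not come from the paper.

```python
import math


def entropy(counts):
    """Shannon entropy (in nats) of the distribution given by raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(response | covariate) from a contingency table.

    Assumed layout (hypothetical): each row is one covariate category,
    each column is one response category, each cell is a count.
    """
    n = sum(sum(row) for row in table)
    # Weight each row's entropy by the share of observations in that row.
    return sum((sum(row) / n) * entropy(row) for row in table if sum(row) > 0)


def mutual_information(table):
    """I(response; covariate) = H(response) - H(response | covariate)."""
    column_totals = [sum(col) for col in zip(*table)]
    return entropy(column_totals) - conditional_entropy(table)


# Toy tables (made-up counts, for illustration only).
independent = [[10, 10], [10, 10]]  # covariate carries no information
dependent = [[20, 0], [0, 20]]      # covariate determines the response

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
ce_dependent = conditional_entropy(dependent)
```

In this sketch, a covariate that lowers the conditional entropy of the response substantially is one that carries information about it. How such drops are turned into the criteria [C1: confirmable] and [C2: irreplaceable], and how reliability is assessed, is specified in the paper and is not reproduced here.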

Background
Methods
A Simple Illustrative Example for a Single System
Structural Formation and Major Factors in Complex Systems
A Complex Illustrative Example for One System
Gerrit Cole’s Pitching Dynamics
Gerrit Cole’s Slider Pitching Dynamics
Gerrit Cole’s Fastball Pitching Dynamics
Summary of Gerrit Cole’s Slider and Fastball Pitching Dynamics
Conclusions