Kernel Smoothed Probability Mass Functions for Ordered Datatypes

Jeffrey Racine, Karen X Yan, Qi Li

doi:10.2139/ssrn.3064732

Abstract

We propose a kernel function for ordered categorical data that overcomes certain limitations present in ordered kernel functions that have appeared in the literature on the estimation of probability mass functions for multinomial ordered data. Some of these limitations arise from assumptions made about the support of the random variable that may be at odds with the data at hand. Furthermore, many existing ordered kernel functions lack a particularly appealing property, namely the ability to deliver discrete uniform probability estimates for some value of the smoothing parameter. To overcome these limitations, we propose an asymmetric empirical support kernel function that adapts to the data at hand and possesses certain desirable features. In particular, it has no difficulties with zero counts caused by gaps in the data, and it encompasses both the empirical proportions and the discrete uniform probabilities at the lower and upper boundaries of the smoothing parameter. We propose using likelihood and least squares cross-validation for smoothing parameter selection, and study the asymptotic behaviour of these data-driven methods. We use Monte Carlo simulations to examine the finite sample performance of the proposed estimator, and we also provide a simple empirical example to illustrate the usefulness of the proposed estimator in applied settings.
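The boundary behaviour described in the abstract can be illustrated with a minimal Python sketch. The kernel below is an assumption made for illustration only and is not taken from the paper: it weights a support point by `lam ** |xi - x|` and normalizes over the values actually observed in the sample. The names `ordered_kernel` and `smoothed_pmf` are hypothetical.

```python
from collections import Counter


def ordered_kernel(xi, x, lam, support):
    """Illustrative ordered kernel (an assumption, not the paper's formula).

    The weight decays geometrically in the distance |xi - x| and is
    normalized over the empirical support, so the weights for a fixed
    xi sum to one. Because the normalizer depends on xi, the kernel is
    generally not symmetric in (xi, x).
    """
    weights = {z: lam ** abs(xi - z) for z in support}  # 0 ** 0 == 1 in Python
    return weights[x] / sum(weights.values())


def smoothed_pmf(data, lam):
    """Kernel-smoothed probability estimates on the empirical support."""
    support = sorted(set(data))
    counts = Counter(data)
    n = len(data)
    return {
        x: sum(c * ordered_kernel(xi, x, lam, support) for xi, c in counts.items()) / n
        for x in support
    }


if __name__ == "__main__":
    sample = [0, 0, 0, 1, 1, 2, 5, 5]  # support with a gap: {0, 1, 2, 5}
    for lam in (0.0, 0.3, 1.0):
        print(lam, smoothed_pmf(sample, lam))
```

In this sketch, `lam = 0` reproduces the empirical proportions and `lam = 1` returns the discrete uniform probabilities 1/c over the c observed values. Because only observed values enter the support, the gap between 2 and 5 never produces a zero-count cell.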

Highlights

In multinomial discrete support random variable settings, it is common to encounter situations in which the support contains only a handful of values, and such values may contain gaps (e.g., {0, 1, 2, 5}). When such data are of the ordered type, using a kernel function that recognizes the order present in the data can lead to improved accuracy relative to kernel functions that ignore order.
Unlike Hall (1987), who considered likelihood cross-validation in a density estimation context and demonstrated how its asymptotic properties are profoundly influenced by tail properties of the kernel function and of the unknown density function, our approach is immune to this phenomenon: because we explicitly treat our problem as one having finite support, there is no “tail” in the sense of Hall (1987).
Ryzin (1981) and Ahmad and Cerrito (1994), but they presume that the support is the set of all consecutive integers, which may not be the case for the data at hand, while there is no value of the smoothing parameter for which the kernel function is the discrete uniform (the same goes for the ordered kernels proposed by Rajagopalan and Lall (1995), Chu et al. (2017) and others).

Summary

Introduction

In multinomial discrete support random variable settings, it is common to encounter situations in which the support contains only a handful of values, and such values may contain gaps (e.g., {0, 1, 2, 5}). When such data are of the ordered type, using a kernel function that recognizes the order present in the data can lead to improved accuracy relative to kernel functions that ignore order (e.g., binary unordered counting kernel functions). The proposed approach exhibits better finite sample performance than estimators based on kernel functions that ignore the order present in the data, and better than the empirical proportions themselves. Unlike Hall (1987), who considered likelihood cross-validation in a density estimation context and demonstrated how its asymptotic properties are profoundly influenced by tail properties of the kernel function and of the unknown density function, our approach is immune to this phenomenon: because we explicitly treat our problem as one having finite support, there is no “tail” in the sense of Hall (1987).
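Likelihood cross-validation on a finite support can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the kernel is a hypothetical geometric-decay kernel normalized over the observed values, the support is taken from the full sample, and the smoothing parameter is chosen by a simple grid search on [0, 1]. The names `kernel`, `loo_log_likelihood` and `select_lambda` are invented for this example.

```python
import math
from collections import Counter


def kernel(xj, x, lam, support):
    # Hypothetical ordered kernel (an assumption): geometric decay in
    # |xj - x|, normalized over the empirical support.
    weights = {z: lam ** abs(xj - z) for z in support}
    return weights[x] / sum(weights.values())


def loo_log_likelihood(data, lam):
    """Leave-one-out log-likelihood of the kernel-smoothed estimator."""
    support = sorted(set(data))
    counts = Counter(data)
    n = len(data)
    total = 0.0
    for x, cx in counts.items():
        # Estimate at x from the other n - 1 observations: one observation
        # equal to x is dropped from the count of its own cell.
        s = sum(
            (c - (1 if xj == x else 0)) * kernel(xj, x, lam, support)
            for xj, c in counts.items()
        )
        p = s / (n - 1)
        if p <= 0.0:
            return -math.inf  # a cell observed once gets zero mass at lam = 0
        total += cx * math.log(p)
    return total


def select_lambda(data, grid_size=101):
    """Grid search for the smoothing parameter maximizing the criterion."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda lam: loo_log_likelihood(data, lam))


if __name__ == "__main__":
    sample = [0, 0, 0, 1, 1, 2, 5, 5]
    lam_hat = select_lambda(sample)
    print(lam_hat, loo_log_likelihood(sample, lam_hat))
```

Since every candidate estimate lives on the same finite set of observed values, the criterion is a finite sum of log-probabilities over that set and involves no behaviour outside the observed range. A grid search is used here only for simplicity; a numerical optimizer could replace it.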

Background

Data-driven Smoothing Parameter Selection Methods

Monte Carlo Simulation

The Multivariate Case

Multivariate Data-Driven Smoothing Parameter Selection

Empirical Illustration

Summary

Some Useful Lemmas

Journal: SSRN Electronic Journal
Publication Date: Nov 3, 2017
License type: cc-by

Similar Papers

Kernel smoothed probability mass functions for ordered datatypes
Jeffrey S Racine ... Karen X Yan
Journal of Nonparametric Statistics | VOL. 32
12 May 2020

Enhanced mean shift tracking algorithm based on evolutive asymmetric kernel
Dai Yuan-Ming ... Lin Yi-Ning
01 Jul 2011

A Robust Asymmetric Kernel Function for Bayesian Optimization, With Application to Image Defect Detection in Manufacturing Systems
Areej Albahar ... Inyoung Kim
IEEE Transactions on Automation Science and Engineering | VOL. 19
01 Oct 2022

A deep feature extraction method for bearing fault diagnosis based on empirical mode decomposition and kernel function
Fengtao Wang ... Gang Deng
Advances in Mechanical Engineering | VOL. 10
01 Sep 2018
