THE DISCRETE NEW XLINDLEY DISTRIBUTION: A STATISTICAL FRAMEWORK FOR MODELLING MEDICAL AND BIOLOGICAL SCIENCE DATA

Abstract

Modelling the frequency of events is a significant problem that has received considerable attention in recent years. Discrete probability distributions such as the Poisson, Negative Binomial, Geometric, and Poisson-Lindley are commonly used for this purpose. However, these traditional distributions often exhibit limited flexibility in capturing the complexity of real-world count data. In this regard, we study the New Discrete XLindley distribution introduced by Maya et al. (2024) and discuss its various structural properties. A Bayesian analysis is conducted to enhance the inferential understanding of the model. To address the presence of excess zeros in count data, we propose a zero-inflated extension of the New Discrete XLindley model. Parameters are estimated using the Maximum Likelihood Estimation method, and the performance of the estimators is assessed via simulation studies. The practical relevance of the proposed model is demonstrated through its application to a real-life dataset. Finally, a Likelihood Ratio Test is employed to test the significance of the zero-inflation parameter, providing strong evidence in support of the extended model. Overall, the zero-inflated New Discrete XLindley model offers a flexible and effective tool for modelling zero-inflated count data.
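
The workflow described in the abstract (zero-inflated extension, maximum likelihood fitting, likelihood ratio test for the zero-inflation parameter) can be sketched in a few lines. This is a minimal illustration only: the abstract does not state the New Discrete XLindley pmf, so a Poisson base model is used here as a stand-in, and the function names (`zi_poisson_pmf`, `fit_zip_em`, `lrt_zero_inflation`) are hypothetical. The fit uses an EM iteration, which is one common way to obtain the maximum likelihood estimates and is not necessarily the authors' procedure. Because the null value of the zero-inflation parameter lies on the boundary of the parameter space, the p-value below uses the usual 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom.

```python
import math

def poisson_pmf(k, lam):
    # Stand-in base count model (assumption); NOT the New Discrete XLindley pmf.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zi_poisson_pmf(k, pi, lam):
    # Zero-inflated pmf: point mass pi at zero, weight (1 - pi) on the base model.
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def log_lik(data, pi, lam):
    return sum(math.log(zi_poisson_pmf(k, pi, lam)) for k in data)

def fit_zip_em(data, iters=500):
    # EM iteration for the zero-inflated Poisson MLE (requires at least one nonzero count).
    n = len(data)
    n0 = sum(1 for k in data if k == 0)
    total = sum(data)
    pi, lam = 0.5 * n0 / n, total / n
    for _ in range(iters):
        # E-step: posterior probability that an observed zero is a structural zero.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam)) if pi > 0 else 0.0
        # M-step: closed-form updates for pi and lam.
        pi = n0 * z / n
        lam = total / (n - n0 * z)
    return pi, lam

def lrt_zero_inflation(data):
    lam0 = sum(data) / len(data)          # base-model MLE under H0: pi = 0
    pi1, lam1 = fit_zip_em(data)          # zero-inflated MLE under H1
    stat = max(0.0, 2.0 * (log_lik(data, pi1, lam1) - log_lik(data, 0.0, lam0)))
    # Boundary null: half the chi-square(1) tail probability.
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0)) if stat > 0 else 1.0
    return stat, p_value, pi1, lam1

if __name__ == "__main__":
    # Hypothetical counts with many zeros, for illustration only.
    sample = [0] * 60 + [1] * 10 + [2] * 15 + [3] * 10 + [4] * 5
    print(lrt_zero_inflation(sample))
```

Swapping `poisson_pmf` for another base pmf changes only the E-step and the M-step for the base parameters; the zero-inflation construction and the test statistic stay the same.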

Similar Papers
  • Research Article
  • 10.29020/nybg.ejpam.v18i2.6144
A Novel Discrete Probability Distribution with Theoretical and Inferential Insights: Cutting-Edge Approaches to Sustainable Dispersion Data Modeling
  • May 1, 2025
  • European Journal of Pure and Applied Mathematics
  • Mohamed Eliwa + 2 more

As real-world data becomes increasingly complex, there is a growing need for advanced probability models to support sustainable discrete data analysis. This study presents a novel and flexible extension of the discrete Gompertz distribution, developed within the framework of the half-logistic model. Named the discrete Gompertz half-logistic (DGzHLo) model, this new formulation enhances the adaptability of existing discrete distributions to better capture intricate data patterns. To understand its theoretical foundation, key mathematical and statistical properties are examined, including the probability mass function, cumulative distribution function, reliability function, and hazard rate function. Measures such as dispersion, skewness, and kurtosis offer insights into the model's behavior, while entropy and order statistics further reveal its structural characteristics. The model's parameters are estimated using the maximum likelihood estimation method, and a thorough simulation study assesses the accuracy and efficiency of the estimators across various sample sizes. To illustrate its practical relevance, the model is applied to three real-world datasets, demonstrating its superior flexibility and robustness in capturing complex data structures compared to existing models. These findings underscore the DGzHLo model's effectiveness in advancing discrete data modeling and analysis.

  • Research Article
  • Citations: 1
  • 10.3390/axioms13100707
On the Conflation of Negative Binomial and Logarithmic Distributions
  • Oct 13, 2024
  • Axioms
  • Anfal A Alqefari + 2 more

In recent decades, the study of discrete distributions has received increasing attention in the field of statistics, mainly because discrete distributions can model a wide range of count data. One common distribution used for modeling count data, for instance, is the negative binomial distribution (NBD), which performs well with over-dispersed data. In this paper, a new count distribution is introduced, called the conflation of negative binomial and logarithmic distributions; it is formed by conflating the two distributions and possesses some of the properties of each. The distribution has two parameters and is defined on the positive integers. Two modifications of the distribution are proposed that include zero as a support point. The new distribution is valuable from a theoretical perspective since it is a member of the weighted negative binomial distribution family. In addition, the distribution differs from the NBD in the sense that the probability of lower counts is inflated. This study discusses the characteristics of the proposed distribution and its modified versions, such as moments, probability generating functions, likelihood stochastic ordering, log-concavity, and unimodality properties. Real-world data are used to evaluate the performance of the proposed models against other models. All computations shown in this paper were produced using the R programming language.

  • Research Article
  • 10.3390/axioms14070518
On the Conflation of Poisson and Logarithmic Distributions with Applications
  • Jul 6, 2025
  • Axioms
  • Abdulhamid A Alzaid + 2 more

It is frequent for real-life count data to show inflation in lower values; however, most of the well-known count distributions cannot capture such a feature. The present paper introduces a new distribution for modeling inflated count data in small values based on a conflation of distributions approach. The new distribution inherits some properties from Poisson distribution (PD) and logarithmic distribution (LD), making it a powerful modeling tool. It can serve as an alternative to PD, LD, and zero-truncated distributions. The new distribution is worth considering theoretically, as it belongs to the weighted PD family. With zero as a support point, two additional models are suggested for the new distribution. These modifications yield distributions that demonstrate overdispersion models comparable to the negative binomial distribution (NBD) while retaining essential PD properties, making them suitable for accurately representing count data with frequent events of low frequency and high variance. Furthermore, we discuss the superior performance of three new distributions in modeling real count data compared to traditional count distributions such as PD and NBD, as well as other discrete distributions. This paper examines the key statistical properties of the proposed distributions. A comparison of the novel and other distributions in the literature is shown employing real-life data from some domains. All of the computations shown in this study are generated using the R programming language.

  • Research Article
  • 10.18502/jbe.v9i1.13977
Beta-Geometric Regression for Modeling Count Data on First Antenatal Care Visit (ANC) with Application
  • Oct 31, 2023
  • Journal of Biostatistics and Epidemiology
  • Zainab M Al-Balushi + 2 more

Introduction: Little attention has been paid to modeling count data with the geometric distribution. There are many real-life phenomena with a constant probability of first success. However, in practice, the probability of the first success may vary, making simple geometric models unsuitable for modeling such data. One can assume one of many continuous distributions for modeling the probability of first success with the parameter space [0, 1]. In this respect Beta distribution defined on the standard unit interval [0,1] is the most useful distribution due to its ability to accommodate a wide range of shapes. Thus, in this paper, by mixing Beta and geometric distribution, we developed a Beta-geometric distribution for modeling the count data through application to real-life count data on time to the first antenatal care (ANC) visit.
 Methods: The estimation of the distribution parameters using the method of moments, maximum likelihood estimation (MLE) method, and Bayesian estimation approach are provided. Based on the Beta-geometric distribution, we developed a new Beta-geometric regression model for analyzing count data that follow the geometric distribution. The goodness of fit of the derived model has been tested using real data on time to the first ANC visit.
 Results: The Beta-geometric distribution has a simple form for its probability mass function (pmf) and is flexible in capturing both underdispersion and overdispersion that may be present in count data. It was found that the proposed Beta-geometric regression model fit the count data on the first ANC visit better than the simple geometric distribution or the Negative Binomial distribution.
 Conclusion: Unlike the Poisson or Negative Binomial distribution, Beta-geometric distribution does not need an additional parameter to accommodate underdispersion or overdispersion and thus could be a flexible choice for analyzing any count data. The goodness of fit test of the Beta-geometric model provides better fitting of the model to real data on time to first ANC visit than geometric or Negative binomial models.

  • Research Article
  • Citations: 4
  • 10.34172/jrhs13846
Social hidden groups size analyzing: application of count regression models for excess zeros.
  • Sep 17, 2013
  • Journal of Research in Health Sciences
  • Ali Akbar Haghdoost + 5 more

In the case of sensitive questions such as the number of alcoholics known, the majority of respondents might give an answer of zero. The Poisson regression model (P) is the standard tool to analyze count data. However, P provides a poor fit in the case of zero-inflated counts, when over-dispersion exists. Therefore, the aims are to compare the performance of alternative count regression models and to investigate whether characteristics of respondents affect their responses. A total of 700 participants were asked about the number of people they know in hidden groups: alcoholics, methadone users, and Female Sex Workers (FSW). Five regression models were fitted to these outcomes: Logistic, P, Negative Binomial (NB), Zero Inflated Poisson (ZIP), and Zero Inflated Negative Binomial (ZINB). Models were compared in terms of the Likelihood Ratio Test (LRT), the Vuong test, AIC, and Sum Square of Error (SSE). The percentages of zeros were 35% for the number of alcoholics, 50% for methadone users, and 65% for FSWs. ZINB provided the best fit for alcoholics, and NB provided the best fit for the other outcomes. In addition, we noticed that young respondents, males, and those with low education were more likely to know or reveal sensitive information. Although P is the first choice for modeling count data in many cases, it seems that, because of the over-dispersion of zero-inflated counts in the case of sensitive questions, other models, specifically NB and ZINB, might have better goodness of fit.

  • Book Chapter
  • 10.1016/b0-08-043076-7/00409-5
Distributions, Statistical: Special and Discrete
  • Jan 1, 2001
  • International Encyclopedia of Social & Behavioral Sciences
  • C.D Kemp

  • Research Article
  • Citations: 246
  • 10.1080/10543400600719384
On the Use of Zero-Inflated and Hurdle Models for Modeling Vaccine Adverse Event Count Data
  • Aug 1, 2006
  • Journal of Biopharmaceutical Statistics
  • C E Rose + 3 more

We compared several modeling strategies for vaccine adverse event count data in which the data are characterized by excess zeroes and heteroskedasticity. Count data are routinely modeled using Poisson and Negative Binomial (NB) regression but zero-inflated and hurdle models may be advantageous in this setting. Here we compared the fit of the Poisson, Negative Binomial (NB), zero-inflated Poisson (ZIP), zero-inflated Negative Binomial (ZINB), Poisson Hurdle (PH), and Negative Binomial Hurdle (NBH) models. In general, for public health studies, we may conceptualize zero-inflated models as allowing zeroes to arise from at-risk and not-at-risk populations. In contrast, hurdle models may be conceptualized as having zeroes only from an at-risk population. Our results illustrate, for our data, that the ZINB and NBH models are preferred but these models are indistinguishable with respect to fit. Choosing between the zero-inflated and hurdle modeling framework, assuming Poisson and NB models are inadequate because of excess zeroes, should generally be based on the study design and purpose. If the study's purpose is inference then the modeling framework should be considered. For example, if the study design leads to count endpoints with both structural and sample zeroes then generally the zero-inflated modeling framework is more appropriate, while in contrast, if the endpoint of interest, by design, only exhibits sample zeroes (e.g., at-risk participants) then the hurdle model framework is generally preferred. Conversely, if the study's primary purpose is to develop a prediction model then both the zero-inflated and hurdle modeling frameworks should be adequate.
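
The distinction between the two frameworks can be made concrete with a small sketch. It assumes a Poisson count component for both models (the abstract also considers negative binomial versions), and the function names are hypothetical. In the zero-inflated model a zero can come from two sources, a structural part and a sampling part; in the hurdle model every zero comes from a single zero component and positive counts follow a zero-truncated distribution.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zero_inflated_pmf(k, pi, lam):
    # Mixture: structural zero with probability pi, otherwise an ordinary Poisson draw.
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, p_zero, lam):
    # Two-part model: zero with probability p_zero, otherwise a zero-truncated Poisson.
    if k == 0:
        return p_zero
    truncated = poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - p_zero) * truncated

def zero_sources(pi, lam):
    # Split of P(X = 0) in the zero-inflated model into structural and sampling zeros.
    return pi, (1.0 - pi) * poisson_pmf(0, lam)

if __name__ == "__main__":
    structural, sampling = zero_sources(pi=0.3, lam=1.5)
    print("structural zeros:", structural, "sampling zeros:", sampling)
    print("hurdle P(X=0):", hurdle_pmf(0, p_zero=0.45, lam=1.5))
```

With suitably chosen parameters both models can assign the same probability to every count, which is consistent with the abstract's remark that the two frameworks can be indistinguishable with respect to fit while differing in interpretation.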

  • Research Article
  • Citations: 304
  • 10.1126/science.172.3988.1089
Fitting discrete probability distributions to evolutionary events.
  • Jun 11, 1971
  • Science
  • Thomas Uzzell + 1 more

The assumptions underlying the use of the Poisson distribution are essentially that the probability of an event is small but nearly identical for all occurrences and that the occurrence of an event does not alter the probability of recurrence of such events. These assumptions do not seem to be met for evolutionary events since (i) the probability of fixing nucleotide codon substitutions is not equal for all substitutions at a codon, and probably varies for the same substitution in different lineages; (ii) the probability of fixing codon substitutions varies among positions of a cistron; and (iii) the fixation of a nucleotide codon substitution at one position in a cistron modifies, and may even promote, the fixation of a codon substitution elsewhere along the cistron. Natural selection presumably is the causative factor that acts to modify the probability of a nucleotide codon substitution's being fixed in a population. The use of the negative binomial distribution is consistent with the evidence that selective pressure on amino acid or nucleotide codon positions varies both among codon positions of a cistron and at a particular position during evolutionary time. If the number of fixations of nucleotide codon substitutions per position of cistrons encoding cytochromes c are phyletically inferred (phylogeny based on a paleontological record) rather than phenetically inferred (based on paired comparisons of extant species' differences in the absence of a phylogeny) the distribution of these fixation data cannot be described adequately by a single Poisson distribution. The fit of these same data to a negative binomial distribution is very satisfactory. It has been argued that the fit of phenetically inferred fixation data, which do not take account of parallel or reverse fixations, to the Poisson distribution was supportive evidence for the hypothesis that protein evolution results from the fixation of selectively neutral codon substitutions. 
This argument now appears to be undercut by the evidence that data on nucleotide codon fixation are more probably distributed according to the negative binomial distribution. The fact that fixation data can be described by a particular discrete probability distribution does not of itself provide insight into the mechanisms of the evolutionary process. However, the facts-(i) that the assumptions underlying the use of the negative binomial distribution adequately deal with the varying probability of fixing amino acid or nucleotide codon substitutions at and among the positions of a cistron and (ii) that the negative binomial distribution provides an excellent fit for the phyletically inferred fixation data-suggest that the negative binomial is a very appropriate discrete probability distribution for describing evolutionary events. Amino acids or their nucleotide codon substitutions may be fixed at a position of a cistron as though selectively neutral relative to the codon being replaced, even though the codon position will not be selectively neutral, since many amino acids cannot function there. The negative binomial distribution treats this situation well whereas a single Poisson distribution could only be satisfactory if all codon positions that could vary were selectively neutral.

  • Research Article
  • Citations: 12
  • 10.52876/jcs.878742
A WEB-BASED SOFTWARE FOR THE CALCULATION OF THEORETICAL PROBABILITY DISTRIBUTIONS
  • Jun 29, 2021
  • The Journal of Cognitive Systems
  • Fatma Hilal Yağin + 2 more

Abstract— Aim: The aim of this study is to develop a public web-based theoretical probability distributions software (KODY) that can calculate probabilities for discrete and continuous distributions. Materials and Methods: The Discrete Uniform, Bernoulli, Binomial, Multinomial, Poisson, Geometric, Negative Binomial, Hypergeometric and Zeta (Zipf) distributions from the discrete distributions are explained. Among the continuous distributions, The Continuous Uniform, Beta, Normal, Log-Normal, Exponential, Gamma, Weibull, Rayleigh, Logistics, Pareto, Laplace, Cauchy and Erlang distributions are elucidated. Illustrative examples are presented on hypothetical medical data. The software was developed using the MATH and DASH libraries of the Python programming language. Results: When making statistical analysis, the feature of the distribution is essential. Because the descriptive and analytical statistical methods to be applied to data with different distributions are also different. Probability distributions of variables are important in the effectiveness of these methods. For this reason, it is an essential step for researchers to determine the probability distributions of their data before starting their studies. It is thought that the software developed in this study will enable researchers to make the necessary calculations in probabilistic estimates regarding the theoretical probability distributions. The developed software can be accessed at http://biostatapps.inonu.edu.tr/KODY/. Conclusion: The open access web-based software with Turkish/English language options may guide and contribute to researchers in probabilistic estimation processes regarding theoretical distributions. In the later stages of this study, it is foreseen to develop simulation processes based on each probability distribution. Keywords— Discrete Probability Distributions, Continuous Probability Distributions, Web - Based Software, Python.

  • Research Article
  • Citations: 130
  • 10.1016/j.aap.2009.07.012
Zero-state Markov switching count-data models: An empirical assessment
  • Aug 3, 2009
  • Accident Analysis & Prevention
  • Nataliya V Malyshkina + 1 more

  • Research Article
  • Citations: 3
  • 10.53570/jnt.902066
Models for Overdispersion Count Data with Generalized Distribution: An Application to Parasites Intensity
  • Jun 30, 2021
  • Journal of New Theory
  • Öznur İşçi Güneri + 1 more

The Poisson regression model is widely used for count data. This model assumes equidispersion. In practice, however, equidispersion is seldom observed: in real-life data, the variance usually exceeds the mean. This situation is known as overdispersion. The negative binomial distribution and other Poisson mixture models are often used to model overdispersed count data. Another model for count data, an extension of the negative binomial distribution, is the univariate generalized Waring distribution. In addition, the model developed by Famoye can be used in the analysis of count data. When the count data contain a large number of zeros, it is necessary to use zero-inflated models. In this study, different generalized regression models are emphasized for the analysis of count data with excess zeros. For this purpose, a real data set was analysed with the generalized Poisson model, the generalized negative binomial model, the generalized negative binomial Famoye model, the generalized Waring model, and the corresponding zero-inflated models. The log-likelihood, Akaike information criterion, Bayes information criterion, and Vuong statistics were used for model comparisons.
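
Equidispersion and overdispersion can be checked on a sample with the index of dispersion, the ratio of the sample variance to the sample mean. The sketch below is a simple descriptive check, not the model comparison used in the study; the function name and the example counts are hypothetical.

```python
import statistics

def dispersion_index(counts):
    # Sample variance divided by sample mean; close to 1 under a Poisson model,
    # above 1 for overdispersed data, below 1 for underdispersed data.
    return statistics.variance(counts) / statistics.mean(counts)

if __name__ == "__main__":
    # Hypothetical counts with several zeros and a long right tail.
    counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 10]
    print(dispersion_index(counts))
```

An index well above 1, as in this made-up sample, is the situation in which negative binomial or other mixture models are usually considered in place of the Poisson model.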

  • Research Article
  • 10.1088/1757-899x/546/5/052023
Generalized Linier Autoregressive Moving Average (GLARMA) Negative Binomial Regression Models with Metropolis Hasting Algorithm
  • Jun 1, 2019
  • IOP Conference Series: Materials Science and Engineering
  • Popy Febritasari + 2 more

This paper discusses regression models for count data in which the variance is not equal to the mean. This occurs in data on deaths caused by traffic accidents in the jurisdiction of the Dharmasraya Police Resort, where the variance is larger than the mean, a situation called overdispersion. In this case we used negative binomial regression for time series with generalized linear autoregressive moving average (GLARMA) models. The parameters were estimated using the maximum likelihood estimation (MLE) method and the Metropolis-Hastings algorithm with a burn-in period of 100 and 150,000 iterations. With the chosen prior distribution and number of iterations, the Metropolis-Hastings algorithm had a smaller mean square error (MSE) than the MLE method. Predictions for the next period were made using the model fitted with the Metropolis-Hastings algorithm.

  • Research Article
  • Citations: 9
  • 10.1111/j.1440-6055.1983.tb01858.x
DISPERSION OF ARTHROPODS, FLOWER BUDS AND FRUIT IN COTTON FIELDS: EFFECTS OF POPULATION DENSITY AND SEASON ON THE FIT OF PROBABILITY DISTRIBUTIONS
  • May 1, 1983
  • Australian Journal of Entomology
  • L T Wilson + 2 more

Three years' data on insects and plant parts gathered using 3 sample methods (visual whole plant field observations, removal of plants in bags, sweepnet) were examined to determine the influence of several factors on the ‘fit’ of 8 discrete probability distributions. To enable ‘fits’ to be tested, critical values of the Kolmogorov‐Smirnov statistic for use with discrete distributions were obtained by Monte Carlo simulation. For data from all 3 sample methods most distributions gave worse fits as population density increased. Better fits were obtained early in the growing season but the effect was small. The Negative Binomial appeared to be the most robust model on which to base simultaneous sampling of plant parts and several species of arthropods because it was least sensitive to effects of population density. In data collected by sweep sampling, clumping was less evident, and the Poisson and Poisson with Zeros gave the best fits at low densities when the data were uniformly distributed (variance less than mean) and the Negative Binomial at high densities when the distributions were clumped.

  • Research Article
  • Citations: 2
  • 10.3390/sym16091123
A Novel Discrete Linear-Exponential Distribution for Modeling Physical and Medical Data
  • Aug 29, 2024
  • Symmetry
  • Khlood Al-Harbi + 3 more

Count data are important in many fields. In this article, a new form of the one-parameter discrete linear-exponential distribution is derived using the survival function as a discretization technique. An extensive study of this distribution is conducted under its new form, including characteristic functions and statistical properties. It is shown that this distribution is appropriate for modeling over-dispersed count data. Moreover, its probability mass function is right-skewed with different shapes. The unknown model parameter is estimated using the maximum likelihood method, with more attention given to Bayesian estimation methods. The Bayesian estimator is computed based on three different loss functions: a square error loss function, a linear exponential loss function, and a generalized entropy loss function. A simulation study is implemented to examine the distribution’s behavior and compare the classical and Bayesian estimation methods; it indicates that the Bayesian method under the generalized entropy loss function with positive weight is the best for all sample sizes, with the minimum mean squared errors. Finally, the discrete linear-exponential distribution proves its efficiency in fitting real-life discrete physical and medical lifetime count data against other related distributions.
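
The survival-function discretization mentioned in this abstract assigns to each non-negative integer x the probability S(x) - S(x + 1), where S is the survival function of a continuous lifetime distribution. The sketch below illustrates the technique under an assumption: it uses the exponential survival function as a stand-in, since the abstract does not give the linear-exponential form, and the helper names are hypothetical. Discretizing the exponential distribution in this way yields a geometric distribution, which gives a simple check of the construction.

```python
import math

def discretize(survival, x):
    # pmf at integer x >= 0 obtained from a continuous survival function S.
    return survival(x) - survival(x + 1)

def exponential_survival(theta):
    # S(t) = exp(-theta * t); stand-in for the survival function used in the paper.
    return lambda t: math.exp(-theta * t)

if __name__ == "__main__":
    S = exponential_survival(0.5)
    q = math.exp(-0.5)
    for x in range(5):
        # The discretized pmf agrees with the geometric pmf (1 - q) * q**x.
        print(x, discretize(S, x), (1 - q) * q ** x)
```

The probabilities telescope, so their sum over x = 0, 1, 2, ... equals S(0) = 1 for any valid survival function.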

  • Book Chapter
  • Citations: 1
  • 10.1002/9781118182635.efm0097
Discrete Probability Distributions
  • Dec 15, 2012
  • Encyclopedia of Financial Models
  • Markus Höchstötter + 2 more

Discrete probability distributions are needed whenever the random variable is to describe a quantity that can assume values from a countable set, either finite or infinite. A discrete probability distribution (or law) is quite intuitive in that it assigns certain values positive probabilities adding up to one, while any other value automatically has zero probability. In general, neglecting some of the mathematical rigor, discrete distributions can be understood from the insight gained from descriptive statistics. For example, the random number of defaults in a bond portfolio inside of a given period of time can be modeled with a discrete probability distribution. Another example is given by sampling when we are interested in whether an observation belongs to a certain group. Also, simple stock price models are based on discrete laws where the stock price can only change to one of a finite number of possible values. Keywords: Discrete random variables; probability distribution; probability law; discrete law; discrete cumulative distribution; variance; standard deviation; Bernoulli distribution; drawing with replacement; binomial distribution; binomial coefficient; binomial tree; hypergeometric; multinomial distribution; discrete uniform distribution; Poi; binomial coefficient; factorial; multinomial coefficient; polynomial coefficient; path-dependent; Markov
