A decomposition of Fisher's information to inform sample size for developing or updating fair and precise clinical prediction models - part 2: time-to-event outcomes.
When developing a clinical prediction model using time-to-event data (i.e. with censoring and different lengths of follow-up), previous research has focused on the sample size needed to minimise overfitting and to estimate the overall risk precisely. However, the instability of individual-level risk estimates may still be large. We propose using a decomposition of Fisher's information matrix to help examine and calculate the sample size required for developing a model that aims for precise and fair risk estimates. We propose a six-step process which can be used either before data collection or when an existing dataset is available. Steps 1 to 5 require researchers to specify: the overall risk in the target population at a key time-point of interest; an assumed pragmatic 'core model' in the form of an exponential regression model; the (anticipated) joint distribution of the core predictors included in that model; and the distribution of censoring times. The 'core model' can be specified directly or based on a specified C-index and relative effects of (standardised) predictors. The joint distribution of predictors may be available directly in an existing dataset, in a pilot study or in a synthetic dataset provided by other researchers. We derive closed-form solutions that decompose the variance of an individual's estimated event rate into Fisher's unit information matrix, the individual's predictor values and the total sample size; this allows researchers to calculate and examine uncertainty distributions around individual risk estimates and misclassification probabilities for specified sample sizes. We provide an illustrative example in breast cancer, emphasise the importance of clinical context, including any risk thresholds for decision-making, and examine fairness concerns for pre- and postmenopausal women. Lastly, in two empirical evaluations, we provide reassurance that uncertainty interval widths based on our exponential approach are close to those from more flexible parametric models.
Our approach allows users to identify the (target) sample size required to develop a prediction model for time-to-event outcomes, via the pmstabilityss module. It aims to facilitate models with improved trust, reliability and fairness in individual-level predictions.
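The variance decomposition described above can be sketched numerically. The following is a minimal illustration, assuming an invented two-predictor exponential 'core model' with administrative censoring; all coefficients, distributions and numbers are hypothetical, and the snippet mimics (rather than reproduces) what the pmstabilityss module automates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 'core model': exponential hazard lambda(x) = exp(b0 + b1*x1 + b2*x2)
# (illustrative coefficients, not taken from the paper)
beta = np.array([-2.0, 0.5, -0.3])
c_admin = 5.0   # assumed administrative censoring time (years)
t_key = 5.0     # key time point at which risk is predicted

# Anticipated joint predictor distribution: intercept plus two standard normals
X = np.column_stack([np.ones(200_000),
                     rng.standard_normal(200_000),
                     rng.standard_normal(200_000)])
lam = np.exp(X @ beta)

# Unit Fisher information for the exponential model with censoring at c_admin:
# each subject contributes Pr(event by c_admin) * x x^T in expectation.
p_event = 1.0 - np.exp(-lam * c_admin)
I1 = (X * p_event[:, None]).T @ X / len(X)

def risk_interval(x, n, z=1.96):
    """95% uncertainty interval for an individual's risk at t_key,
    given development sample size n (delta method on the log-rate)."""
    eta = x @ beta
    var_eta = x @ np.linalg.solve(I1, x) / n       # x' I1^{-1} x / n
    to_risk = lambda e: 1.0 - np.exp(-np.exp(e) * t_key)
    se = np.sqrt(var_eta)
    return to_risk(eta - z * se), to_risk(eta), to_risk(eta + z * se)

x_new = np.array([1.0, 1.0, 0.0])                  # an individual of interest
lo500, est, hi500 = risk_interval(x_new, n=500)
lo2000, _, hi2000 = risk_interval(x_new, n=2000)
```

On the log-rate scale, quadrupling n halves the interval width; examining such intervals across individuals is the lever the six-step process uses to set a target sample size.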
- Research Article
1
- 10.1186/s41512-025-00193-9
- Jul 8, 2025
- Diagnostic and Prognostic Research
Background: When using a dataset to develop or update a clinical prediction model, small sample sizes increase concerns of overfitting, instability, poor predictive performance and a lack of fairness. For models estimating the risk of a binary outcome, previous research has outlined sample size calculations that target low overfitting and a precise overall risk estimate. However, more guidance is needed for targeting precise and fair individual-level risk estimates.
Methods: We propose a decomposition of Fisher’s information matrix to help examine sample sizes required for developing or updating a model, aiming for precise and fair individual-level risk estimates. We outline a five-step process for use before data collection or when an existing dataset or pilot study is available. It requires researchers to specify the overall risk in the target population, the (anticipated) distribution of key predictors in the model and an assumed ‘core model’, either specified directly (i.e. a logistic regression equation is provided) or based on a specified C-statistic and relative effects of (standardised) predictors.
Results: We produce closed-form solutions that decompose the variance of an individual’s risk estimate into the Fisher’s unit information matrix, predictor values and the total sample size. This allows researchers to quickly calculate and examine the anticipated precision of individual-level predictions and classifications for specified sample sizes. The information can be presented to key stakeholders (e.g. health professionals, patients, grant funders) to inform target sample sizes for prospective data collection or whether an existing dataset is sufficient. Our proposal is implemented in our new software module pmstabilityss. We provide two real examples and emphasise the importance of clinical context, including any risk thresholds for decision-making and fairness checks.
Conclusions: Our approach helps researchers examine potential sample sizes required to target precise and fair individual-level predictions when developing or updating prediction models for binary outcomes.
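The binary-outcome decomposition can be sketched the same way. Below is a minimal illustration using an assumed logistic 'core model'; the coefficients and predictor distribution are invented, and the code imitates the kind of calculation pmstabilityss performs rather than its actual implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
expit = lambda z: 1.0 / (1.0 + np.exp(-z))

# Assumed 'core' logistic model (illustrative coefficients, not from the paper)
beta = np.array([-1.5, 0.8, 0.4])

# Anticipated predictor distribution: intercept plus two standardised predictors
X = np.column_stack([np.ones(200_000),
                     rng.standard_normal(200_000),
                     rng.standard_normal(200_000)])
p = expit(X @ beta)

# Unit Fisher information for logistic regression: E[p(1-p) x x']
I1 = (X * (p * (1 - p))[:, None]).T @ X / len(X)

def pred_interval(x, n, z=1.96):
    """Anticipated 95% uncertainty interval for an individual's risk
    when the model is developed on n participants (delta method)."""
    lp = x @ beta
    se = np.sqrt(x @ np.linalg.solve(I1, x) / n)   # sqrt(x' I1^{-1} x / n)
    return expit(lp - z * se), expit(lp), expit(lp + z * se)

x_new = np.array([1.0, 1.5, -0.5])
lo1, est1, hi1 = pred_interval(x_new, n=1000)
lo4, _, hi4 = pred_interval(x_new, n=4000)
```

Comparing such an interval against a clinical decision threshold gives the anticipated misclassification probability for that individual at each candidate sample size.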
- Research Article
236
- 10.1002/ijch.198000018
- Jan 1, 1980
- Israel Journal of Chemistry
Standard concepts of information theory, including Shannon's entropy, Fisher's information, Jaynes' principle of entropy maximization, Fisher's locality information matrix, and Kullback and Leibler's information measure, are described and extended to many dimensions as appropriate, to establish precise connections between the many-body quantum-mechanical kinetic energy functional T[Ψ] and information measures. Implications for density functional theory of electronic structure are discussed, and elementary examples are displayed to illustrate the argument. Among the several exact relations obtained, one of special interest is the identity in which the first term is the intrinsic accuracy, or Fisher's information for locality, of the one-particle density (normalized to 1), $\rho(1) = \int \cdots \int |\psi|^{2}\, d\tau_{2} \cdots d\tau_{N}$, and the second term is the average over the one-particle density of Fisher's information associated with the conditional density $f(2, 3, \dots, N \mid 1) \equiv |\psi|^{2}/\rho(1)$; that is, the second term is the average over the marginal distribution ρ(1) of the trace of Fisher's information matrix for the distribution f(2, 3, …, N|1). Because of this formula, the quantum mechanical variation principle may be precisely stated as a principle of minimal information.
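The identity itself is not reproduced in the abstract. From the description of its two terms, a sketch of the decomposition in standard notation (an assumption about the form, not the paper's verbatim statement) is:

```latex
I\!\left[\,|\psi|^{2}\,\right]
  \;=\;
  I\!\left[\rho(1)\right]
  \;+\;
  \int \rho(1)\,
       \operatorname{tr} I\!\left[f(2,3,\dots,N \mid 1)\right]
       \, d\tau_{1},
\qquad
\rho(1) = \int\!\cdots\!\int |\psi|^{2}\, d\tau_{2}\cdots d\tau_{N},
```

where each $I[\cdot]$ denotes the Fisher locality information of the indicated density; the paper's exact relations further tie such sums to the kinetic-energy functional T[Ψ].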
- Research Article
1
- 10.1051/0004-6361/202555342
- Oct 1, 2025
- Astronomy & Astrophysics
Context. In point spread function (PSF) photometry, the selection of the fitting aperture radius plays a critical role in determining the precision of flux and background estimations. Traditional methods often rely on maximizing the signal-to-noise ratio (S/N) as a criterion for aperture selection. However, S/N-based approaches do not necessarily provide the optimal precision for joint estimation problems as they do not account for the statistical limits imposed by the Fisher information in the context of the Cramér-Rao lower bound (CRLB). Aims. This study aims to establish an alternative criterion for selecting the optimal fitting radius based on Fisher information rather than S/N. Fisher information serves as a fundamental measure of estimation precision, providing theoretical guarantees on the achievable accuracy for parameter estimation. By leveraging Fisher information, we seek to define an aperture selection strategy that minimizes the loss of precision. Methods. We conducted a series of numerical experiments that analyze the behavior of Fisher information and estimator performance as a function of the PSF aperture radius. Specifically, we revisited fundamental photometric models and explored the relationship between aperture size and information content. We compared the empirical variance of classical estimators, such as maximum likelihood and stochastic weighted least squares, against the theoretical CRLB derived from the Fisher information matrix. Results. Our results indicate that aperture selection based on the Fisher information provides a more robust framework for achieving optimal estimation precision. The findings reveal that S/N-based aperture selection may lead to significant discrepancies, with potential precision losses of up to 70%. In contrast, Fisher information-based selection allows a more accurate and consistent estimation process, ensuring that the empirical variance closely aligns with the theoretical limits.
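As a toy illustration of the paper's point (not its actual experiments), the sketch below compares an S/N-maximising aperture with one that minimises the CRLB on the flux when flux and background are estimated jointly; the 1-D Gaussian PSF, the Poisson noise model and all numbers are assumptions:

```python
import numpy as np

# Toy 1-D detector: a Gaussian PSF star on a flat background, Poisson noise.
F, B, sigma = 5000.0, 50.0, 2.0            # flux, background/pixel, PSF width
x = np.arange(-30, 31, dtype=float)        # pixel grid
g = np.exp(-0.5 * (x / sigma) ** 2)
g /= g.sum()                               # normalised PSF

def crlb_flux(r):
    """CRLB on the flux when (F, B) are estimated jointly from pixels |x| <= r."""
    m = np.abs(x) <= r
    lam = F * g[m] + B                     # per-pixel Poisson means
    I = np.array([[np.sum(g[m] ** 2 / lam), np.sum(g[m] / lam)],
                  [np.sum(g[m] / lam),      np.sum(1.0 / lam)]])
    return np.linalg.inv(I)[0, 0]          # [I^{-1}]_FF

def snr(r):
    m = np.abs(x) <= r
    s = F * g[m].sum()
    return s / np.sqrt(s + B * m.sum())

radii = np.arange(1.0, 25.0)
r_fisher = radii[np.argmin([crlb_flux(r) for r in radii])]
r_snr = radii[np.argmax([snr(r) for r in radii])]
```

Because every added pixel still carries information about the background, the joint-estimation CRLB keeps improving with radius while the S/N criterion stops early; this gap is the kind of discrepancy the paper quantifies.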
- Research Article
18
- 10.1016/s0024-3795(01)00231-2
- Apr 5, 2001
- Linear Algebra and its Applications
On Stein's equation, Vandermonde matrices and Fisher's information matrix of time series processes. Part I: The autoregressive moving average process
- Research Article
5
- 10.1162/neco_a_01411
- Jul 26, 2021
- Neural computation
The Fisher information matrix (FIM) plays an essential role in statistics and machine learning as a Riemannian metric tensor or a component of the Hessian matrix of loss functions. Focusing on the FIM and its variants in deep neural networks (DNNs), we reveal their characteristic scale dependence on the network width, depth, and sample size when the network has random weights and is sufficiently wide. This study covers two widely used FIMs for regression with linear output and for classification with softmax output. Both FIMs asymptotically show pathological eigenvalue spectra in the sense that a small number of eigenvalues become large outliers depending on the width or sample size, while the others are much smaller. It implies that the local shape of the parameter space or loss landscape is very sharp in a few specific directions while almost flat in the other directions. In particular, the softmax output disperses the outliers and makes a tail of the eigenvalue density spread from the bulk. We also show that pathological spectra appear in other variants of FIMs: one is the neural tangent kernel; another is a metric for the input signal and feature space that arises from feedforward signal propagation. Thus, we provide a unified perspective on the FIM and its variants that will lead to more quantitative understanding of learning in large-scale DNNs.
- Book Chapter
2
- 10.1049/pbce123e_ch10
- Jul 14, 2019
The Fisher information matrix (FIM) has long been of interest in statistics and other areas. It is widely used to measure the amount of information and to calculate the lower bound on the variance for maximum likelihood estimation (MLE). In practice, we do not always know the actual FIM. This is often because obtaining the first- or second-order derivative of the log-likelihood function is difficult, or simply because the calculation of the FIM is too formidable. In such cases, we need to utilize an approximation of the FIM. In general, there are two ways to estimate the FIM. One is to use the product of the gradient and its transpose, and the other is to calculate the Hessian matrix and then take its negative. In practice, most people use the latter method. However, this is not necessarily the optimal choice. To find out which of the two methods is better, we conduct a theoretical study to compare their efficiency. In this paper, we mainly focus on the case where the unknown parameter to be estimated by MLE is scalar and the random variables are independent. In this scenario, the FIM is effectively the Fisher information number (FIN). Using the Central Limit Theorem (CLT), we obtain asymptotic variances for the two methods, by which we compare their accuracy. Taylor expansion assists in estimating the two asymptotic variances. A numerical study is provided as an illustration of the conclusion. We then summarise the limitations of this paper and enumerate several directions of interest for future study.
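A minimal numerical sketch of the two approximations in the scalar iid setting, using a Poisson sample as an illustrative choice (the chapter's analysis covers the general case):

```python
import numpy as np

rng = np.random.default_rng(2)

# Scalar-parameter illustration: iid Poisson(theta) data.
theta_true = 3.0
x = rng.poisson(theta_true, size=50_000).astype(float)
theta_mle = x.mean()                       # Poisson MLE

# Per-observation derivatives of log f(x; t) = x*log(t) - t - log(x!),
# both evaluated at the MLE, as in the chapter's setting.
score = x / theta_mle - 1.0                # first derivative
hess = -x / theta_mle ** 2                 # second derivative

fin_outer = np.mean(score ** 2)            # gradient-product estimate of FIN
fin_hess = -np.mean(hess)                  # negative-Hessian estimate of FIN
fin_true = 1.0 / theta_true                # exact FIN for Poisson: 1/theta
```

Both estimates converge to the true FIN; the chapter's contribution is comparing their asymptotic variances, which this sketch does not reproduce.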
- Conference Article
12
- 10.1109/acc.2012.6315584
- Jun 1, 2012
Covariance matrix and confidence interval calculations for maximum likelihood estimates (MLEs) are commonly used in system identification and statistical inference. To accurately construct such confidence intervals, one typically needs to know the covariance of the MLE. Standard statistical theory states that the normalized MLE is asymptotically normally distributed with mean zero and covariance equal to the inverse of the Fisher Information Matrix (FIM) at the unknown parameter. Two common estimates for the covariance of the MLE are the inverse of the observed FIM (the same as the Hessian of the negative log-likelihood) and the inverse of the expected FIM (the same as the FIM). Both the observed and the expected FIM are evaluated at the MLE from the sample data. We show that, under reasonable conditions, the expected FIM outperforms the observed FIM under a mean squared error criterion. This result suggests that, under certain conditions, the expected FIM is a better estimate for the covariance of the MLE in confidence interval calculations.
- Research Article
2
- 10.1007/s40300-018-0145-3
- Nov 21, 2018
- METRON
A common approach to analyzing categorical correlated time series data is to fit a generalized linear model (GLM) with past data as covariate inputs. There remain challenges to conducting inference for time series with short length. By treating the historical data as covariate inputs, standard errors of estimates of GLM parameters computed from the empirical Fisher information do not fully account for the auto-correlation in the data. To overcome this serious limitation, we derive the exact conditional Fisher information matrix of a general logistic autoregressive model with endogenous covariates for any series length T. Moreover, we also develop an iterative computational formula that allows for relatively easy implementation of the proposed estimator. Our simulation studies show that confidence intervals derived using the exact Fisher information matrix tend to be narrower than those utilizing the empirical Fisher information matrix while maintaining type I error rates at or below nominal levels. Further, we establish that, as T tends to infinity, the exact Fisher information matrix approaches the asymptotic Fisher information matrix previously derived for binary time series data. The developed exact conditional Fisher information matrix is applied to time-series data on respiratory rate among a cohort of expectant mothers, where it is found to provide narrower confidence intervals for functionals of scientific interest and lead to greater statistical power when compared to the empirical Fisher information matrix.
- Conference Article
- 10.1109/ciss.2009.5054760
- Mar 1, 2009
Confidence intervals for the maximum likelihood estimates (MLEs) are commonly used in statistical inference. To accurately construct such confidence intervals, one typically needs to know the distribution of the MLE. Standard statistical theory says the normalized MLE is asymptotically normal with mean zero and variance being a function of the Fisher Information Matrix (FIM) at the unknown parameter. Two common estimates for the variance of the MLE are the observed FIM (the same as the Hessian of the negative log-likelihood) and the expected FIM, both of which are evaluated at the MLE given the sample data. We show that, under reasonable conditions, the expected FIM tends to outperform the observed FIM under a mean-squared error criterion. This result suggests that, under certain conditions, the expected FIM is a better estimate for the variance of the MLE when used in confidence interval calculations.
- Research Article
- 10.2991/jsta.2017.16.4.12
- Dec 1, 2017
- Journal of Statistical Theory and Applications
In this paper, the Fisher information matrix (FIM) contained in n record values is considered for two-parameter distributions belonging to the exponentiated and inverse exponentiated classes of distributions. The problem of existence and uniqueness of the maximum likelihood estimates of the parameters for these families is also considered based on record values. Explicit expressions for the elements of the FIM contained in record values, as well as in independent and identically distributed (iid) observations, are obtained. The Fisher information (FI) matrices are compared using the relative efficiency, the total information and the total variance. A simulation study is carried out to compare the FI matrices. A real data analysis has also been performed for illustrative purposes.
- Report Series
3
- 10.1920/wp.cem.2005.0405
- Apr 5, 2005
The existence of a uniformly consistent estimator for a particular parameter is well-known to depend on the uniform continuity of the functional that defines the parameter in terms of the model. Recently, Pötscher (Econometrica, 70, pp 1035 - 1065) showed that estimator risk may be bounded below by a term that depends on the oscillation (osc) of the functional, thus making the connection between continuity and risk quite explicit. However, osc has no direct statistical interpretation. In this paper we slightly modify the definition of osc so that it reflects a (generalized) derivative (der) of the functional. We show that der can be directly related to the familiar statistical concepts of Fisher information and identification, and also to the condition numbers that are used to measure ‘distance from an ill-posed problem’ in other branches of applied mathematics. We begin the analysis assuming a fully parametric setting, but then generalize to the nonparametric case, where the inverse of the Fisher information matrix is replaced by the covariance matrix of the efficient influence function. The results are applied to a number of examples, including the structural equation model, spectral density estimation, and estimation of variance and precision.
- Research Article
- 10.13160/ricns.2014.7.2.138
- Jun 30, 2014
- Journal of the Chosun Natural Science
The Fisher information matrix plays an important role in statistical inference for unknown parameters. In particular, it is used in objective Bayesian inference, where we calculate the posterior distribution using a noninformative prior distribution, and it also serves as an example of a metric function in geometry. To estimate parameters in a distribution, we can use the Fisher information matrix. As the number of parameters increases, its matrix form becomes more complicated. In this paper, using Mathematica programs, we derive the Fisher information matrix for the 4-parameter generalized gamma distribution, which is used in reliability theory.
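The same symbolic recipe can be reproduced in Python with SymPy. As a smaller stand-in for the 4-parameter generalized gamma, the sketch below derives the FIM of the ordinary two-parameter gamma distribution (shape k, rate lam), substituting E[x] = k/lam to take expectations:

```python
import sympy as sp

# Symbolic log-density of Gamma(shape=k, rate=lam)
x, k, lam = sp.symbols('x k lam', positive=True)
logf = k * sp.log(lam) - sp.log(sp.gamma(k)) + (k - 1) * sp.log(x) - lam * x

params = (k, lam)
H = sp.hessian(logf, params)               # matrix of second derivatives

# FIM = -E[Hessian]; the only random quantity is x, with E[x] = k/lam.
FIM = sp.Matrix(2, 2, lambda i, j: sp.simplify(-H[i, j].subs(x, k / lam)))
```

For this parametrization the Hessian happens to be free of x, so the expectation step is immediate; richer families such as the generalized gamma require further moments (e.g. of log x), but the pipeline is the same.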
- Research Article
71
- 10.1002/sim.1041
- Jul 12, 2002
- Statistics in Medicine
We address the problem of the choice and the evaluation of designs in population pharmacokinetic studies that use non-linear mixed-effects models. Criteria, based on the Fisher information matrix, have been developed to optimize designs and adapted to such models. We optimize designs under different constraints and evaluate them for a population pharmacokinetics study, within a new phase III trial of enoxaparin, a low molecular weight heparin. To do this, we approximate the expression of the Fisher information matrix for non-linear mixed-effects models including the residual error variance as a parameter to be estimated. We use the Fedorov-Wynn algorithm to minimize the inverse of the determinant of this matrix as required by the D-optimality criterion. Two optimal designs, as well as a design defined by pharmacologists, are evaluated by the simulation of 30 replicated data sets with NONMEM; all designs involve 220 patients with four measurements per patient. We also evaluate the relevance of the standard errors of estimation given from the Fisher information matrix by comparison with those given by NONMEM. The three designs provide more precise population parameter estimates; the optimal design gives the best precision and offers a simple clinical implementation. The expected standard errors given by the information matrix are close to those obtained by NONMEM on the simulation. Moreover, the proposed criterion of D-optimality appears to be a good measure to compare designs for population studies.
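A scaled-down sketch of the design-evaluation idea: evaluating the D-criterion (determinant of the expected FIM) over candidate sampling-time designs. This uses a fixed-effects one-compartment-style curve rather than the paper's non-linear mixed-effects enoxaparin model, and all values are invented:

```python
import numpy as np

# Toy model y = A*exp(-k*t) + eps, eps ~ N(0, sigma2), assumed population values
A, k, sigma2 = 10.0, 0.8, 1.0

def fim(times):
    """Expected FIM of (A, k) for one subject sampled at the given times."""
    t = np.asarray(times, dtype=float)
    J = np.column_stack([np.exp(-k * t),               # df/dA
                         -A * t * np.exp(-k * t)])     # df/dk
    return J.T @ J / sigma2

# Candidate designs: four samples per subject, placed differently in time
designs = {
    "early only": [0.25, 0.5, 0.75, 1.0],
    "spread":     [0.25, 1.0, 2.5, 5.0],
    "late only":  [4.0, 5.0, 6.0, 7.0],
}

# D-criterion: maximise det(FIM), equivalently minimise det of its inverse
dcrit = {name: np.linalg.det(fim(ts)) for name, ts in designs.items()}
best = max(dcrit, key=dcrit.get)
```

Algorithms such as Fedorov-Wynn search over candidate times rather than scoring a fixed menu, and mixed-effects models need the approximated population FIM the paper describes, but the criterion being optimised is the same determinant.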
- Research Article
4
- 10.3390/e16042023
- Apr 8, 2014
- Entropy
In this survey paper, a summary of results, which are to be found in a series of papers, is presented. The subject of interest is matrix-algebraic properties of the Fisher information matrix (FIM) of stationary processes. The FIM is an ingredient of the Cramér-Rao inequality and belongs to the basics of asymptotic estimation theory in mathematical statistics. The FIM is interconnected with the Sylvester, Bezout and tensor Sylvester matrices. Through these interconnections it is shown that the FIM of scalar and multiple stationary processes fulfills the resultant matrix property. A statistical distance measure involving entries of the FIM is presented. In quantum information, a different statistical distance measure is set forth; it is related to the Fisher information, but the information about one parameter in a particular measurement procedure is considered. The FIM of scalar stationary processes is also interconnected with the solutions of appropriate Stein equations, and conditions for the FIM to satisfy certain Stein equations are formulated. The presence of Vandermonde matrices is also emphasized.
- Research Article
4
- 10.1109/tmi.2015.2410342
- Mar 5, 2015
- IEEE Transactions on Medical Imaging
The accurate determination of the local impulse response and the covariance in voxels from penalized maximum likelihood reconstructed images requires performing reconstructions from many noise realizations of the projection data. As this is usually a very time-consuming process, efficient analytical approximations based on the Fisher information matrix (FIM) have been extensively used in PET and SPECT to estimate these quantities. For 3D imaging, however, additional approximations need to be made to the FIM in order to speed up the calculations. The most common approach is to use the local shift-invariant (LSI) approximation of the FIM, but this assumes specific conditions which are not always necessarily valid. In this paper we take a single-pinhole SPECT system and compare the accuracy of the LSI approximation against two other methods that have been more recently put forward: the non-uniform object-space pixelation (NUOP) and the subsampled FIM. These methods do not assume such restrictive conditions while still increasing the speed of the calculations considerably. Our results indicate that in pinhole SPECT the NUOP and subsampled FIM approaches could be more reliable than the LSI approximation, especially when a high accuracy is required.