Fisher Scoring for crossed factor linear mixed models

Thomas Maullin-Sapey, Thomas E Nichols

doi:10.1007/s11222-021-10026-6

Abstract

The analysis of longitudinal, heterogeneous or unbalanced clustered data is of primary importance to a wide range of applications. The linear mixed model (LMM) is a popular and flexible extension of the linear model specifically designed for such purposes. Historically, a large proportion of material published on the LMM concerns the application of popular numerical optimization algorithms, such as Newton–Raphson, Fisher Scoring and expectation maximization to single-factor LMMs (i.e. LMMs that only contain one “factor” by which observations are grouped). However, in recent years, the focus of the LMM literature has moved towards the development of estimation and inference methods for more complex, multi-factored designs. In this paper, we present and derive new expressions for the extension of an algorithm classically used for single-factor LMM parameter estimation, Fisher Scoring, to multiple, crossed-factor designs. Through simulation and real data examples, we compare five variants of the Fisher Scoring algorithm with one another, as well as against a baseline established by the R package lme4, and find evidence of correctness and strong computational efficiency for four of the five proposed approaches. Additionally, we provide a new method for LMM Satterthwaite degrees of freedom estimation based on analytical results, which does not require iterative gradient estimation. Via simulation, we find that this approach produces estimates with both lower bias and lower variance than the existing methods.

Highlights

Since its conception in the seminal work of Laird and Ware (1982), the literature on linear mixed model (LMM) estimation and inference has evolved rapidly
We have presented derivations for, and demonstrated potential applications of, score vector and Fisher Information matrix expressions for LMMs containing multiple random factors
While many of the examples presented in this paper were benchmarked against existing software, it is not the authors’ intention to suggest that the proposed methods are superior to existing software packages

Summary

Introduction

1.1 Background

Since its conception in the seminal work of Laird and Ware (1982), the literature on linear mixed model (LMM) estimation and inference has evolved rapidly. Many software packages exist that can perform estimation and inference for large and complex LMMs quickly and with little memory. For some packages, this speed and efficiency arise from simplifying model assumptions, while for others, complex mathematical operations such as sparse matrix methodology and sweep operators are used to improve performance (Wolfinger et al 1994; Bates et al 2015). To perform a mass-univariate analysis within a practical timeframe, vectorized computation, which exploits the repetitive nature of simple operations to streamline calculation, must be employed (Smith and Nichols 2018; Li et al 2019). Alternative methodology, using more conceptually simple mathematical operations for which vectorized support exists, is required.
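To make the idea of vectorized mass-univariate computation concrete, the following is a minimal sketch, not the authors' method. It uses ordinary least squares rather than an LMM, and it assumes every model shares one design matrix. The names `fit_loop` and `fit_vectorized` and the simulated data are hypothetical and introduced only for illustration. The sketch fits the same simple model many times, once with an explicit loop and once with a single batched linear solve.

```python
import numpy as np


def fit_loop(X, Y):
    """Fit one OLS model per row of Y using an explicit Python loop."""
    XtX = X.T @ X
    return np.stack([np.linalg.solve(XtX, X.T @ y) for y in Y])


def fit_vectorized(X, Y):
    """Fit every model at once: one solve with many right-hand sides."""
    XtX = X.T @ X                        # (p, p), shared by all models
    XtY = X.T @ Y.T                      # (p, v), one column per model
    return np.linalg.solve(XtX, XtY).T   # (v, p), one row per model


# Simulated data (assumption): v independent responses, one shared design.
rng = np.random.default_rng(0)
n, p, v = 50, 3, 1000
X = rng.standard_normal((n, p))
true_beta = rng.standard_normal((v, p))
Y = true_beta @ X.T + 0.1 * rng.standard_normal((v, n))

beta_loop = fit_loop(X, Y)
beta_vec = fit_vectorized(X, Y)
```

Both functions return the same estimates. The vectorized version replaces `v` small solves with one call on a stacked right-hand side, which is the kind of repetitive, simple operation that benefits from vectorized support.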

Objectives

Methods

Findings

Conclusion