Articles published on Conditional independence
2522 Search results
- Research Article
- 10.1038/s41467-025-67017-8
- Dec 7, 2025
- Nature communications
- Helen R Wagstaffe + 25 more
Identifying host factors that mediate protection against newly-emergent viruses is needed for improved pandemic preparedness. Here, we analysed pre- and early post-exposure immune factors associated with resisting SARS-CoV-2 infection after human challenge in seronegative individuals, using multiplex protein, cytometric and RNA sequencing approaches in the nasopharynx and circulation. Pre-existing cross-reactive antibodies correlate poorly with clinical outcome. Instead, protection is associated with heightened nasopharyngeal CCL13 levels locally produced by conventional dendritic cells and monocytes, along with cross-reactive T cells and less differentiated NK cells. Conditional independence network analysis implicates nasal CCL13 as the central node connected to pre-existing non-structural protein-specific T cells by CD1c+ DCs. In those who became infected, baseline cross-reactive T cell and less differentiated NK cell frequencies also correlate with shorter infection duration. Thus, pre-existing mucosal chemokine levels may promote rapid innate and innate-like responses that effectively block infection. ClinicalTrials.gov identifier NCT04865237.
- Research Article
- 10.1093/ehr/ceaf150
- Nov 29, 2025
- English Historical Review
- James Moore
Egypt’s historical development between the two world wars represents an interesting case of gradual decolonisation. Many historians have emphasised how Britain continued to exercise power over local Egyptian administration following the country’s conditional independence in 1922. This article explores the role of Britain in managing urban domestic policy questions in Egypt and, in particular, the response of British officials to one of the most serious political crises of the inter-war period, the Alexandria Corniche Scandal. A local dispute over contractors’ payments revealed a web of incompetence and corruption that contributed to the downfall of a prime minister and the disgrace of British engineers. The response of the British High Commissioner and British officials in the Egyptian government revealed the difficulties that colonial officials had in managing domestic Egyptian affairs and their weakness in the face of Egyptian politicians with a strong local political base. Unable to build effective political alliances and to unite the Alexandrian European community behind a vision of local government reform, senior British officials were unable to shape the future of local administration in Egypt or obtain any of their core political objectives. The case has important wider implications as it highlights the declining influence of the British over Egyptian domestic affairs and how the issue of corruption could be used to challenge the image of competent and sophisticated colonial oversight. It also demonstrates some of the difficulties Britain faced in managing the long-term process of decolonisation in the face of competing local political interests.
- Research Article
- 10.1080/10618600.2025.2571167
- Nov 27, 2025
- Journal of Computational and Graphical Statistics
- Changhao Ge + 1 more
Single-cell RNA-sequencing (scRNA-seq) technologies have advanced our understanding of cell-type-specific gene expression networks by enabling the direct inference of conditional independence structures among genes. However, scRNA-seq data are characterized by count distributions with numerous zeros, rendering standard Gaussian-based network inference methods inadequate. To address this challenge, we propose SPLN, a hierarchical Poisson log-normal model that simultaneously estimates multiple gene networks under different conditions or across distinct samples, while leveraging shared structural information. We develop an efficient estimation procedure that combines variational expectation–maximization (EM) with the alternating direction method of multipliers (ADMM), optimized for parallel processing. Through extensive simulation studies, SPLN demonstrates superior performance over existing methods in terms of network structure recovery and parameter estimation. We illustrate its utility on two scRNA-seq datasets: one from yeast cells measured under 11 environmental conditions, and another from 13 patients with inflammatory bowel disease. In both applications, SPLN uncovers a broader range of conditional dependence networks among genes, offering deeper insights into the underlying gene expression mechanisms. Supplementary materials for this article are available online.
- Research Article
- 10.3758/s13428-025-02861-6
- Nov 11, 2025
- Behavior research methods
- Nikola Sekulovski + 3 more
Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.
- Research Article
- 10.1111/cdoe.70034
- Nov 6, 2025
- Community dentistry and oral epidemiology
- Roger Keller Celeste + 3 more
Many oral health-related quality of life instruments have been developed but few have undergone a comprehensive psychometric assessment. One commonly used measure is the Oral Impact on Daily Performance (OIDP). This study revised the configural and metric properties as well as the performance of items based on Item Response Theory (IRT) of a dichotomous-item version of OIDP in Brazil. The nine-item dichotomous version of the OIDP was analysed using data from a nationally representative sample from the Oral Health Survey (SBBrasil 2010). It consisted of 30 064 individuals aged 12 to 75 and was split into two partitions comprising n1 = 20 040 and n2 = 10 024, respectively. Confirmatory factor analyses (CFA) were conducted on the larger partition and cross-validated on the smaller to assess configural and metric properties. The item performance was evaluated using a 2-parameter item response theory (IRT) model. Sampling weights were used in all analyses. The unidimensional model presented two violations of conditional independence, one between items i5 (practising sports) and i4 (going out) and another between items i6 (trouble in speaking) and i7 (shame of speaking or smiling). A CFA of the most parsimonious model (removing i5, i6 and i7) yielded a RMSEA = 0.02, WRMR = 1.42, CFI = 0.99 and TLI = 0.99. The IRT analyses showed that three pairs of items had very similar levels of difficulty and discrimination suggesting redundancy. A shorter dichotomous version of the OIDP scale has acceptable configural and metric properties. Being more concise and thus efficient, it may be better suited for large-scale population surveys than the version currently in use.
- Research Article
- 10.1016/j.jenvman.2025.127318
- Nov 1, 2025
- Journal of environmental management
- Zongsen Wang + 6 more
Mechanisms behind evapotranspiration dynamics in the Middle Yellow River Basin: Role of climate and vegetation.
- Research Article
- 10.1016/j.neunet.2025.108275
- Nov 1, 2025
- Neural networks : the official journal of the International Neural Network Society
- Chenran Zhao + 7 more
D3HRL: A distributed hierarchical reinforcement learning approach based on causal discovery and spurious correlation detection.
- Research Article
- 10.1093/bioinformatics/btaf598
- Nov 1, 2025
- Bioinformatics
- Feng Chen + 1 more
Motivation: Biological functions are governed by gene regulatory networks (GRNs). Accurately inferring GRNs from high-dimensional and noisy single-cell data remains a major challenge in systems biology. Conventional approaches often struggle with robustness and interpretability, particularly when applied to complex biological processes such as cell fate decisions and complex diseases. Results: In this study, we propose GGANO, a hybrid framework that integrates Gaussian Graphical Models for conditional independence learning with Neural Ordinary Differential Equations for dynamic modeling and inference. Benchmark analyses show that GGANO achieves superior accuracy and stability compared to existing methods, particularly under high-noise conditions. Furthermore, GGANO enables the inference of stochastic dynamics from single-cell data. Applying GGANO to the EMT datasets, we uncover intermediate cellular states and key regulatory genes driving EMT progression. Availability and implementation: The source code is available at GitHub: https://github.com/ChenFeng87/GGANO.
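As a rough illustration of the kind of conditional independence learning a Gaussian graphical model performs (this is not GGANO's implementation, which lives in the linked repository), the sketch below computes a first-order partial correlation from pairwise correlations. In a Gaussian model, a zero partial correlation between two genes given a third means the two are conditionally independent, so no edge joins them. The function name and all correlation values are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of X and Y after linearly adjusting both for Z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("Z is perfectly correlated with X or Y")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical chain X -> Z -> Y: the X-Y correlation (0.4) is fully
# explained by Z, because 0.8 * 0.5 = 0.4.
chain = partial_correlation(r_xy=0.4, r_xz=0.8, r_yz=0.5)

# Hypothetical case with a direct X-Y link that Z cannot explain.
direct = partial_correlation(r_xy=0.7, r_xz=0.8, r_yz=0.5)

print(f"chain:  {chain:.3f}")
print(f"direct: {direct:.3f}")
```

With more than three variables, the same quantities are usually read off the inverse covariance (precision) matrix rather than computed pair by pair.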
- Research Article
- 10.1080/10618600.2025.2579526
- Oct 28, 2025
- Journal of Computational and Graphical Statistics
- Beatrice Foroni + 3 more
This paper introduces a novel hidden Markov quantile graphical model for capturing time-varying conditional dependence structures in multivariate time series. The proposed method allows the identification of state-specific graphs and the dynamic relationships between variables across hidden regimes via joint mixtures of hidden Markov quantile regressions. We leverage the sparsity pattern of the quantile regression coefficients to recover conditional independence networks within each latent state. Estimation of model parameters is achieved through pseudo maximum likelihood using a penalized Expectation-Maximization algorithm to induce sparsity in the quantile regression coefficients. The performance of the method is validated through simulations and compared with existing approaches. The proposed model is applied to air pollution data in Northern Italy, analyzing the interdependence of PM 2.5 concentration levels across 14 major cities from 2019 to 2022.
- Research Article
- 10.63313/aerpc.9054
- Oct 27, 2025
- Advances in Engineering Research Possibilities and Challenges
- Wenming Lu + 1 more
Causal discovery is crucial for understanding complex system dynamics, especially with non-stationary time series data. This paper presents an Attention-Driven Framework with Kernel Conditional Independence Testing (AKCD), a novel two-stage framework to enhance time series causal discovery accuracy and robustness. In the first stage, AKCD uses a Transformer-based Multi-Scale Temporal Self-Attention Network (MS-TSAN) to capture complex temporal dependencies in time series, generating attention matrices and hidden representations for subsequent causal inference. In the second stage, it applies kernel-based conditional independence (KCI) tests on the hidden representations to generate a conditional independence (CI) matrix, which is used as prior knowledge to guide causal graph optimization via regularization constraints. This hybrid approach combines deep learning’s feature extraction advantages with statistical tests’ causal inference rigor, making AKCD particularly suitable for non-stationary time series data.
- Research Article
- 10.1080/10618600.2025.2551268
- Oct 14, 2025
- Journal of Computational and Graphical Statistics
- Shuoyang Wang + 1 more
Functional data refer to data that are realizations of random functions varying over a continuum, such as images or signals. In many modern fields, including neuroscience, medical science, and traffic monitoring, observations are better modeled as multivariate random functions rather than as vectors. To capture the conditional independence structure of such multivariate functional data, functional graphical models have been developed. In this article, we propose a novel and flexible method to estimate the neighborhood of each node using a deep neural network-based functional data regression and feature selection approach with an arbitrary nonparametric form. The full graph structure is then recovered by combining the estimated neighborhoods. Our approach avoids common distributional assumptions on the random functions and circumvents the need for a well-defined precision operator, which may not exist in the functional data context. Furthermore, we establish model consistency for the proposed algorithm. The convergence rate attains the classical nonparametric regression rate up to a logarithmic factor. We discover a novel critical sampling frequency that governs the convergence rates of the deep neural network estimator for both densely and sparsely observed functional data. The empirical performance of our method is demonstrated through simulation studies and a real data application. Supplementary materials for this article are available online.
- Research Article
- 10.1145/3763166
- Oct 9, 2025
- Proceedings of the ACM on Programming Languages
- Yuanfeng Shi + 2 more
Bayesian program analysis is a systematic approach to learn from external information for better accuracy by converting logical deduction in conventional program analysis into Bayesian inference. A key challenge in Bayesian program analysis is how to select program abstractions to effectively generalize from external information. A recent approach addresses this challenge by learning a selection policy on training programs but may result in sub-optimal performance on new programs due to its learning nature and when the training set selection is not ideal. To address this problem, we propose an approach that is inspired by the framework of counterexample-guided refinement to search for an abstraction on the fly. Our key innovation is to apply the theory of conditional independence to refine the abstraction so that incorrect generalizations can be removed. To demonstrate the effectiveness of our approach, we have instantiated it on a Bayesian thread-escape analysis and a Bayesian datarace analysis and shown that it significantly improves the performance of the analyses.
- Research Article
- 10.1093/biomtc/ujaf128
- Oct 8, 2025
- Biometrics
- Eleanor M Pullenayegum + 1 more
Longitudinal data are often subject to irregular and informative visit times. Weighting generalized estimating equations by the inverse of the visit rate yields asymptotically unbiased estimates of regression coefficients provided that outcomes and visit times are conditionally independent, given the covariates in the visit model. Adding other covariates has no impact on the asymptotic bias of estimated regression coefficients, provided that conditional independence is maintained, but the impact on their variances is unknown. We show that variances are unchanged on adding variables associated with neither outcome nor visit process, and decrease on adding variables associated with outcome but not visit process. Adding variables associated with visits but not outcome may either increase or decrease variances of estimated outcome model regression coefficients, depending on the correlation structure of the covariates and the outcome. Application to a study of major depressive disorder found that the variances of estimated regression coefficients were of a similar magnitude when predictors of outcome but not visits were added to the visit rate model but consistently larger, in some cases by a factor of 2, on adding predictors of visits but not outcome. We recommend that visit process models include variables associated with outcome, but that those unassociated with the outcome be treated with caution.
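The weighting idea can be illustrated with a deliberately small sketch: an intercept-only estimate (a plain mean) in which each observed visit is weighted by the inverse of that patient's visit rate. This is not the authors' estimator. The data are hypothetical, the two patient types are assumed equally common, and the visit rates are assumed known rather than estimated from a visit model.

```python
# Each tuple is one observed visit: (outcome, visit_rate_of_that_patient).
# Hypothetical data: patients with outcome 10 are seen four times as often
# as patients with outcome 20, so frequent visitors dominate the raw data.
visits = [(10.0, 4.0)] * 4 + [(20.0, 1.0)] * 1


def naive_mean(visits):
    """Unweighted mean over all observed visits."""
    return sum(y for y, _ in visits) / len(visits)


def inverse_rate_weighted_mean(visits):
    """Mean with each visit weighted by 1 / visit rate."""
    weights = [1.0 / rate for _, rate in visits]
    total = sum(w * y for w, (y, _) in zip(weights, visits))
    return total / sum(weights)


naive = naive_mean(visits)
weighted = inverse_rate_weighted_mean(visits)

print(f"naive mean:    {naive:.1f}")     # pulled toward frequent visitors
print(f"weighted mean: {weighted:.1f}")  # each patient type counts equally
```

Under the stated assumptions the population mean is 15, which the weighted mean recovers while the naive mean gives 12.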
- Research Article
- 10.1093/biomtc/ujaf133
- Oct 8, 2025
- Biometrics
- Baoying Yang + 3 more
Conditional independence is a foundational concept for understanding probabilistic relationships among variables, with broad applications in fields such as causal inference and machine learning. This study focuses on testing conditional independence, $T \perp X \mid Z$, where T represents survival data possibly subject to right censoring, Z represents established risk factors for T, and X represents potential novel biomarkers. The goal is to identify novel biomarkers that offer additional merits for further risk assessment and prediction. This can be achieved by using either the partial or parametric likelihood ratio statistic to evaluate whether the coefficient vector of X in the conditional model of T given $(X^{\top}, Z^{\top})^{\top}$ is equal to zero. Traditional tests such as directly comparing likelihood ratios to chi-squared distributions may produce erroneous type-I error rates under model misspecification. As an alternative, we propose a resampling-based method to approximate the distribution of the likelihood ratios. A key advantage of the proposed test is its double robustness: it achieves approximately correct type-I error rates when either the conditional outcome model or the working model of $\mathrm{pr}(X \mid Z)$ is correctly specified. Additionally, machine learning techniques can be incorporated to improve test performance. Simulation studies and the application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) data demonstrate the finite-sample performance of the proposed tests.
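For reference, the null hypothesis can be written out as below. The first line is the standard definition of conditional independence. The second restates the abstract's coefficient condition, with $\beta_X$ as notation assumed here for the coefficient vector of X; it holds only when the conditional model of T given (X, Z) is correctly specified.

```latex
H_0:\quad T \perp X \mid Z
\;\iff\;
\mathrm{pr}(T \le t \mid X, Z) = \mathrm{pr}(T \le t \mid Z)
\quad \text{for all } t,

\beta_X = 0 \;\implies\; T \perp X \mid Z
\quad \text{(under a correctly specified model of } T \text{ given } (X, Z)\text{)}.
```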
- Research Article
- 10.1093/jrsssb/qkaf059
- Oct 4, 2025
- Journal of the Royal Statistical Society Series B: Statistical Methodology
- Eliana Duarte + 1 more
We address the problem of representing context-specific causal models based on both observational and experimental data collected under general (e.g. hard or soft) interventions by introducing a new family of context-specific conditional independence models called CStrees. This family is defined via a novel factorization criterion that allows for a generalization of the factorization property defining general interventional directed acyclic graph (DAG) models. We derive a graphical characterization of model equivalence for observational CStrees that extends the Verma and Pearl criterion for DAGs. This characterization is then extended to CStree models under general, context-specific interventions. To obtain these results, we formalize a notion of context-specific intervention that can be incorporated into concise graphical representations of CStree models. We relate CStrees to other context-specific models, showing that the families of DAGs, CStrees, labelled DAGs, and staged trees form a strict chain of inclusions. We then present an algorithm for learning CStrees from a combination of observational and interventional data where the intervention targets are assumed to be unknown with hard or soft and possibly context-specific effects. The algorithm, evaluated on simulated and real data, performs well in the recovery of context-specific dependence structure as well as context-specific interventional perturbations.
- Research Article
- 10.1080/02664763.2025.2567980
- Oct 4, 2025
- Journal of Applied Statistics
- Qiying Wu + 2 more
Understanding and modeling distribution-valued data, an important form of symbolic data, has garnered significant attention in statistics because of its effectiveness in handling large datasets. Conventional statistical inference methods are not directly applicable to distribution-valued data, which has prompted extensive research efforts aimed at addressing this challenge. However, graphical models, which are powerful tools in applied statistics, have not yet been fully developed for distribution-valued data. To fill this gap, this study proposes a novel nonparametric graphical model estimation method for distribution-valued data. The proposed method first removes the inherent constraints of distributions, effectively capturing both position information (as a scalar) and shape information (as a function). We subsequently propose an aggregation method, which is based on the conditional independence test, to integrate the position information and shape information for graphical model estimation. Several numerical simulations have validated that our method outperforms other potential competing methods. Furthermore, we apply our method to construct the network of stocks that constitute the SSE 50 Index using daily distribution-valued data of five-minute returns. The empirical results reveal sector-specific relationships as well as cross-sector influences, highlighting the evolving interconnections between stocks from different sectors over time.
- Research Article
- 10.55214/2576-8484.v9i10.10344
- Oct 3, 2025
- Edelweiss Applied Science and Technology
- Nidhal Ziadi Ellouze + 2 more
This research aims to develop predictive models to estimate the probability of bank customer default based on socio-economic factors. Relying on Bayesian inference frameworks, the study implements both Naive Bayes classifiers and Bayesian Neural Networks (BNNs) to improve prediction accuracy. The methodology is grounded in Bayes’ theorem and conditional independence, with practical implementation using Python libraries such as scikit-learn and TensorFlow Probability. Following rigorous preprocessing and model training on financial datasets, the results show effective segmentation of customers according to their credit risk levels. The models enable personalized financial product recommendations, including tailored interest rates and guarantees. The findings demonstrate the statistical robustness of Bayesian approaches and their ability to deliver interpretable solutions for credit risk assessment. This approach supports strategic decision-making by aligning banking offers with individual risk profiles, ultimately contributing to risk mitigation and enhanced customer relationship management.
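To make the role of Bayes' theorem and conditional independence concrete, here is a minimal hand-rolled sketch of a Naive Bayes posterior for two binary features. It is not the study's scikit-learn or TensorFlow Probability code. The function name, feature names, and every probability below are hypothetical values chosen for illustration.

```python
def naive_bayes_posterior(prior, likelihoods_pos, likelihoods_neg):
    """P(class = positive | features), assuming the features are
    conditionally independent given the class (the Naive Bayes assumption)."""
    joint_pos = prior
    for p in likelihoods_pos:
        joint_pos *= p          # P(positive) * prod P(feature | positive)
    joint_neg = 1.0 - prior
    for p in likelihoods_neg:
        joint_neg *= p          # P(negative) * prod P(feature | negative)
    return joint_pos / (joint_pos + joint_neg)


# Hypothetical numbers: 20% of customers default overall.
prior_default = 0.2
# P(unemployed | class) and P(low income | class) for each class.
given_default = [0.5, 0.6]
given_no_default = [0.1, 0.3]

posterior = naive_bayes_posterior(prior_default, given_default, given_no_default)
print(f"P(default | unemployed, low income) = {posterior:.3f}")
```

The conditional independence assumption is what allows the per-feature likelihoods to be multiplied; without it, the joint likelihood of all features given the class would have to be modelled directly.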
- Research Article
- 10.1175/aies-d-24-0114.1
- Oct 1, 2025
- Artificial Intelligence for the Earth Systems
- Peter Miersch + 3 more
Estimating causal drivers of high-impact extreme events such as floods from data is an aspiring pursuit. Time series causal discovery methods, such as the conditional-independence-based PC Momentary Conditional Independence (PCMCI) framework, are designed to identify causal relationships from complex multivariate observational time series. However, the application to extreme event data remains a challenge due to, by the nature of extremes, data length limitations, conditional independence testing for nonlinear relationships, and potential violations of the methods’ assumptions. So far, these challenges have mostly been explored on synthetic data with limited transferability to real-world applications. In this study, we evaluate causal discovery on real and pseudoreal data generated with a hydrological model across 45 catchments with varying flood-generating processes. Because no detailed causal ground truth exists, we focus on the robustness of output graphs. To this end, we simulate a large sample, identify discharge peaks, and investigate the robustness of the causal discovery algorithm PCMCI+ when applied to different realizations of the same setting for various sample sizes. We find that the robustness generally increases with sample size, yet a significant proportion of inferred causal edges remain inconsistent even for large datasets. Notably, while some flood drivers are reliably identified, other key hydrological mechanisms are systematically missed even for very large sample sizes, highlighting methodological limitations. Our study provides a blueprint for investigating the real-world performance of causal discovery methods and illustrates their current limitations for identifying causal drivers of floods.
Significance Statement: Revealing the causes of extremes in the Earth system, like floods, is important for climate risk assessments. Causal discovery is a modern machine learning approach aiming to find these causal drivers in ever more abundant observational data. However, the reliability of causal discovery algorithms depends on many sources of uncertainty. Here, we evaluate the robustness of causal discovery on real and pseudoreal data generated with a state-of-the-art hydrological model and find that current observational sample sizes may not be enough to reliably estimate causal drivers in such challenging settings.
- Research Article
- 10.1016/j.aap.2025.108181
- Oct 1, 2025
- Accident; analysis and prevention
- Yifan Wang + 1 more
Causal relationship discovery for highway crash analysis using semi-data-driven Bayesian network.
- Research Article
- 10.1016/j.artint.2025.104391
- Oct 1, 2025
- Artificial Intelligence
- Yixin Ren + 7 more
Regression-based conditional independence test with adaptive kernels