Holdout Dataset Research Articles

Abstract Study question Which follicle sizes on trigger day (TD) are key to obtaining mature oocytes, embryos, and blastocysts, thereby enhancing live birth rates? Summary answer Follicles sized 13-18mm are most likely to yield mature oocytes. Live birth rate (LBR) was increased with a greater proportion of follicles in this range. What is known already At the time of oocyte retrieval, follicles that are either too small or too large are less likely to yield mature oocytes. However, there is only limited data examining which follicle sizes on TD administration are optimally sized to yield mature oocytes, nor any data examining whether live birth rates are impacted. Explainable artificial intelligence (XAI) offers a valuable opportunity to leverage clinical data to determine the range of follicle sizes that are most likely to yield mature oocytes and lead to improved downstream outcomes. Study design, size, duration A retrospective study incorporating 19,082 patients undergoing their first IVF/ICSI cycle (2005-2023) across 11 European clinics. Follicle sizes on TD were related to oocyte maturity (14,140 patients), fertilization (17,822 patients), high-quality blastocyst development (17,488 patients), and pregnancy outcomes (12,672 patients). Secondary analyses included stratification by age (&gt;35 years: 4,717 patients; ≤35 years: 5,707), and suppressant protocol (GnRH agonist: n = 5,420 patients; GnRH antagonist: n = 3,982). Participants/materials, setting, methods An ensemble-based XAI model, developed through ‘internal-external validation’, was trained, Bayesian optimized, and independently iteratively tested on hold-out clinic datasets. The model determined follicle sizes contributing most to successful laboratory outcomes by harnessing permutation importance and ‘SHAP’ values. Model performance was evaluated and optimized for mean absolute error (MAE). The top 50% contiguous cohort of influential follicle sizes on TD were identified, and further validated through analysis of follicle sizes on preceding days. Main results and the role of chance XAI analysis revealed that follicles sized 13-18mm on TD were most influential in yielding mature oocytes, especially those sized 15-18mm (MAE: 3.60 ±0.35). Predictive follicle sizes from scans 1-2 days preceding TD administration were consistent with expected mean follicle growth rates. For high-quality blastocysts, follicles sized 14-20mm were contributory, narrowing to 15-18mm in the subset of patients where oocytes had been confirmed to be mature. In hCG-triggered cycles, follicles sized 12-19mm were most likely to yield mature oocytes in GnRH antagonist cycles, whereas follicles sized 14-20mm were most contributory in GnRH agonist cycles. In patients ≤35 years, follicles 13-18mm were most contributory, whereas a wider range of follicles (11-20mm) were contributory in patients &gt;35 years. In line with current practice, having three lead follicles ≥17mm on TD was associated with a median improvement in mature oocyte yield of 10% versus those that did not (p &lt; 0.0001). By comparison, maximizing the proportion of follicles on TD within the optimal size range could better improve mature oocyte yield, for example, by 42% when at least 70% of follicles were sized 15-18mm (p &lt; 0.0001). Likewise, LBR improved as the proportion of follicles sized 13-18mm was increased (&lt;10%: 23.3%; 10-20%: 25.7%; 20-30%: 28.8%; 30-40%: 31.6%). Limitations, reasons for caution These data could help guide when to administer the trigger to optimize clinical outcomes, however prospective assessment of different TD strategies based on the proportion of follicles within the optimal follicle size ranges identified in comparison to current practice based on lead follicle size is required prior to clinical adoption. Wider implications of the findings These findings provide a data-driven approach to personalizing ovarian stimulation in IVF. The findings generated by applying an XAI model could assist clinicians in optimizing TD timing, potentially enhancing clinical outcomes by moving beyond the conventional focus on lead follicle size. Trial registration number not appicable

Read full abstract

Background and objectivesBio-medical image segmentation models typically attempt to predict one segmentation that resembles a ground-truth structure as closely as possible. However, as medical images are not perfect representations of anatomy, obtaining this ground truth is not possible. A surrogate commonly used is to have multiple expert observers define the same structure for a dataset. When multiple observers define the same structure on the same image there can be significant differences depending on the structure, image quality/modality and the region being defined. It is often desirable to estimate this type of aleatoric uncertainty in a segmentation model to help understand the region in which the true structure is likely to be positioned. Furthermore, obtaining these datasets is resource intensive so training such models using limited data may be required. With a small dataset size, differing patient anatomy is likely not well represented causing epistemic uncertainty which should also be estimated so it can be determined for which cases the model is effective or not. MethodsWe use a 3D probabilistic U-Net to train a model from which several segmentations can be sampled to estimate the range of uncertainty seen between multiple observers. To ensure that regions where observers disagree most are emphasised in model training, we expand the Generalised Evidence Lower Bound (ELBO) with a Constrained Optimisation (GECO) loss function with an additional contour loss term to give attention to this region. Ensemble and Monte-Carlo dropout (MCDO) uncertainty quantification methods are used during inference to estimate model confidence on an unseen case. We apply our methodology to two radiotherapy clinical trial datasets, a gastric cancer trial (TOPGEAR, TROG 08.08) and a post-prostatectomy prostate cancer trial (RAVES, TROG 08.03). Each dataset contains only 10 cases each for model development to segment the clinical target volume (CTV) which was defined by multiple observers on each case. An additional 50 cases are available as a hold-out dataset for each trial which had only one observer define the CTV structure on each case. Up to 50 samples were generated using the probabilistic model for each case in the hold-out dataset. To assess performance, each manually defined structure was matched to the closest matching sampled segmentation based on commonly used metrics. ResultsThe TOPGEAR CTV model achieved a Dice Similarity Coefficient (DSC) and Surface DSC (sDSC) of 0.7 and 0.43 respectively with the RAVES model achieving 0.75 and 0.71 respectively. Segmentation quality across cases in the hold-out datasets was variable however both the ensemble and MCDO uncertainty estimation approaches were able to accurately estimate model confidence with a p-value < 0.001 for both TOPGEAR and RAVES when comparing the DSC using the Pearson correlation coefficient. ConclusionsWe demonstrated that training auto-segmentation models which can estimate aleatoric and epistemic uncertainty using limited datasets is possible. Having the model estimate prediction confidence is important to understand for which unseen cases a model is likely to be useful.

Read full abstract

Holdout Dataset Research Articles

Related Topics

Articles published on Holdout Dataset

ValveVision AI - a multimodal language model for qualitative reporting of the aortic valve in echocardiography

Comorbidity neural network model for personalised recommendations in cardiometabolic risk factor management: towards early precision medicine in routine clinical practice

Prediction model for major bleeding in anticoagulated patients with cancer-associated venous thromboembolism using machine learning and natural language processing.

Topology-Driven Discovery of Transmembrane Protein S-Palmitoylation.

Major depletion of insulin sensitivity-associated taxa in the gut microbiome of persons living with HIV controlled by antiretroviral drugs

Enhancing the reliability of deep learning-based head and neck tumour segmentation using uncertainty estimation with multi-modal images

A Deep Transfer Learning Approach for Sleep Stage Classification and Sleep Apnea Detection Using Wrist-Worn Consumer Sleep Technologies.

Adaptive fine-tuning based transfer learning for the identification of MGMT promoter methylation status.

Prognostic value of platelet levels in patients with aneurysmal Subarachnoid Hemorrhage

Novel Domain Knowledge-Encoding Algorithm Enables Label-Efficient Deep Learning for Cardiac CT Segmentation to Guide Atrial Fibrillation Treatment in a Pilot Dataset.

Contrastive Self-supervised Learning for Neurodegenerative Disorder Classification.

O-219 Explainable artificial intelligence (XAI) to determine the follicle sizes on trigger day that maximize mature oocyte yield

LGG-15. DEEP LEARNING ENABLES LONGITUDINAL RISK PREDICTION FOR PEDIATRIC LOW-GRADE GLIOMAS AFTER SURGERY

DentalSegmentator: Robust open source deep learning-based CT and CBCT image segmentation

Pulmonary Fibrosis Diagnosis and Disease Progression Detected Via Hair Metabolome Analysis.

Epistatic Features and Machine Learning Improve Alzheimer's Disease Risk Prediction Over Polygenic Risk Scores.

Uncertainty estimation using a 3D probabilistic U-Net for segmentation with small radiotherapy clinical trial datasets

A digital pathology AI model to predict immune-oncology biomarker status and pembrolizumab response in a real-world cohort of patients with colorectal and breast cancer.

Design of a novel deep network model for spinal cord injury prediction

A computational clinical decision-supporting system to suggest effective anti-epileptic drugs for pediatric epilepsy patients based on deep learning models using patient’s medical history

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Holdout Dataset Research Articles

Related Topics

Articles published on Holdout Dataset

ValveVision AI - a multimodal language model for qualitative reporting of the aortic valve in echocardiography

Comorbidity neural network model for personalised recommendations in cardiometabolic risk factor management: towards early precision medicine in routine clinical practice

Prediction model for major bleeding in anticoagulated patients with cancer-associated venous thromboembolism using machine learning and natural language processing.

Topology-Driven Discovery of Transmembrane Protein S-Palmitoylation.

Major depletion of insulin sensitivity-associated taxa in the gut microbiome of persons living with HIV controlled by antiretroviral drugs

Enhancing the reliability of deep learning-based head and neck tumour segmentation using uncertainty estimation with multi-modal images

A Deep Transfer Learning Approach for Sleep Stage Classification and Sleep Apnea Detection Using Wrist-Worn Consumer Sleep Technologies.

Adaptive fine-tuning based transfer learning for the identification of MGMT promoter methylation status.

Prognostic value of platelet levels in patients with aneurysmal Subarachnoid Hemorrhage

Novel Domain Knowledge-Encoding Algorithm Enables Label-Efficient Deep Learning for Cardiac CT Segmentation to Guide Atrial Fibrillation Treatment in a Pilot Dataset.

Contrastive Self-supervised Learning for Neurodegenerative Disorder Classification.

O-219 Explainable artificial intelligence (XAI) to determine the follicle sizes on trigger day that maximize mature oocyte yield

LGG-15. DEEP LEARNING ENABLES LONGITUDINAL RISK PREDICTION FOR PEDIATRIC LOW-GRADE GLIOMAS AFTER SURGERY

DentalSegmentator: Robust open source deep learning-based CT and CBCT image segmentation

Pulmonary Fibrosis Diagnosis and Disease Progression Detected Via Hair Metabolome Analysis.

Epistatic Features and Machine Learning Improve Alzheimer's Disease Risk Prediction Over Polygenic Risk Scores.

Uncertainty estimation using a 3D probabilistic U-Net for segmentation with small radiotherapy clinical trial datasets

A digital pathology AI model to predict immune-oncology biomarker status and pembrolizumab response in a real-world cohort of patients with colorectal and breast cancer.

Design of a novel deep network model for spinal cord injury prediction

A computational clinical decision-supporting system to suggest effective anti-epileptic drugs for pediatric epilepsy patients based on deep learning models using patient’s medical history