Weighted Kappa Analysis Research Articles

Background: Magnetic resonance imaging (MRI) is key in evaluating central cartilage tumors. The Birmingham Atypical Cartilaginous Tumour Imaging Protocol (BACTIP) assesses central cartilage tumor risk based on tumor size and the degree of endosteal scalloping on MRI. It provides a management protocol for assessment, follow-up, or referral of central cartilage tumors.

Objective: Our study compared four MRI sequences, T1-weighted (T1-w), fluid-sensitive short tau inversion recovery (STIR-w), and their grayscale inversions (T1-w GSI and STIR-w GSI), to see how reliably endosteal scalloping was detected.

Materials and Methods: Two senior consultant musculoskeletal radiologists reviewed 60 randomly selected, representative central cartilage tumor cases with varying degrees of endosteal scalloping to reflect a spectrum of BACTIP pathologies. The endosteal scalloping was graded as per the definition of BACTIP A, B, and C. They agreed on a consensus BACTIP grade for each of the 240 key images (60 cases × 4 sequences), which was considered the final “consensus” BACTIP grade. These 240 images were then randomized into a test set and given to two fellowship-trained consultant musculoskeletal radiologists for analysis. They assigned a BACTIP grade to each of the 240 selected images while being blinded to the final “consensus” BACTIP grade. The image set was further subdivided into three groups based on MR image quality (good, average, and poor) to ascertain whether the quality of the acquired images influenced intraobserver and interobserver agreement on BACTIP grading. The two observers were blinded to the image quality grade.

Results: Linearly weighted kappa analysis was performed to measure the agreement between the BACTIP grades assigned by each of the two observers and the “consensus” BACTIP grades, as well as the agreement between the two observers themselves. The analysis revealed that the T1-w and STIR-w sequences demonstrated more consistent and higher agreement across different image qualities. The T1-w GSI and STIR-w GSI sequences, however, exhibited lower agreement, particularly for poor-quality images. T1-w imaging demonstrated substantial agreement between BACTIP gradings for poor-quality images, suggesting potential resilience of the T1-w sequence in challenging imaging conditions.

Conclusion: T1-w imaging is the best sequence for BACTIP grading of endosteal scalloping, followed by fluid-sensitive STIR sequences.
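The linearly weighted kappa used in the abstract above can be illustrated with a short sketch. It assumes the standard Cohen formulation with disagreement weights |i − j| / (k − 1) over k ordered grades. The function name `linear_weighted_kappa` and the ten example gradings are hypothetical and are not data from the study.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    `categories` must list the ordered grades, e.g. ["A", "B", "C"].
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Joint counts of (grade by rater A, grade by rater B) and the marginals.
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    row = Counter(index[a] for a in ratings_a)
    col = Counter(index[b] for b in ratings_b)

    observed = 0.0  # weighted observed disagreement (in counts)
    expected = 0.0  # weighted chance disagreement (in counts squared)
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * joint[(i, j)]
            expected += w * row[i] * col[j]
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one grade")
    # observed / n and expected / n**2 are proportions; their ratio simplifies.
    return 1.0 - (observed * n) / expected


# Hypothetical gradings for ten images (illustration only, not study data).
observer = ["A", "A", "B", "B", "C", "C", "A", "B", "C", "B"]
consensus = ["A", "A", "B", "C", "C", "C", "A", "B", "B", "B"]
kappa = linear_weighted_kappa(observer, consensus, ["A", "B", "C"])
print(f"linearly weighted kappa = {kappa:.3f}")
```

With linear weights, an A-versus-C disagreement is penalized twice as heavily as an A-versus-B disagreement, which suits an ordinal scale such as BACTIP A, B, and C.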


Background: We aimed to evaluate the image quality, feasibility, and diagnostic performance of three-dimensional ultrashort echo time magnetic resonance imaging (3D UTE-MRI) to assess idiopathic pulmonary fibrosis (IPF) compared with high-resolution computed tomography (HRCT) and half-Fourier single-shot turbo spin-echo (HASTE) MRI.

Methods: A total of 36 patients with IPF (34 men; mean age: 62±8 years, age range: 43 to 78 years) were prospectively included and underwent HRCT and chest MRI on the same day. Chest MRI was performed with a free-breathing 3D spiral UTE pulse sequence and a HASTE sequence on a 1.5 T MRI. Two radiologists independently evaluated the image quality of the HRCT, HASTE, and 3D UTE-MRI. They assessed the representative imaging features of IPF, including honeycombing, reticulation, traction bronchiectasis, and ground-glass opacities. Image quality of the 3D UTE-MRI, HASTE, and HRCT was assessed using a 5-point visual scoring method. Kappa and weighted kappa analyses were used to measure intra- and inter-observer and inter-method agreement. Sensitivity (SE), specificity (SP), and accuracy (AC) were used to assess the performance of 3D UTE-MRI for detecting image features of IPF and monitoring the extent of pulmonary fibrosis. Linear regressions and Bland-Altman plots were generated to assess the correlation and agreement between the assessments of the extent of pulmonary fibrosis made by the 2 observers.

Results: The image quality of HRCT was higher than that of HASTE and UTE-MRI (HRCT vs. UTE-MRI vs. HASTE: 4.9±0.3 vs. 4.1±0.7 vs. 3.0±0.3; P<0.001). Interobserver agreement of HRCT, HASTE, and 3D UTE-MRI when assessing pulmonary fibrosis was substantial to excellent (HRCT: 0.727≤ κ ≤1, P<0.001; HASTE: 0.654≤ κ ≤1, P<0.001; 3D UTE-MRI: 0.719≤ κ ≤0.824, P<0.001). In addition, reticulation (SE: 97.1%; SP: 100%; AC: 97.2%; κ =0.654), honeycombing (SE: 83.3%; SP: 100%; AC: 86.1%; κ =0.625), and traction bronchiectasis (SE: 94.1%; SP: 100%; AC: 94.4%; κ =0.640) were also well visualized on 3D UTE-MRI, which was significantly superior to HASTE. Compared with HRCT, the sensitivity of 3D UTE-MRI to detect signs of pulmonary fibrosis (n=35) was 97.2%. The interobserver agreement in evaluation of the extent of pulmonary fibrosis with HRCT and 3D UTE-MRI was R2=0.84 (P<0.001) and R2=0.84 (P<0.001), respectively. The extent of pulmonary fibrosis assessed with 3D UTE-MRI [median =9, interquartile range (IQR): 6.25 to 10.00] was lower than that from HRCT (median =12, IQR: 9.25 to 13.00; U=320.00, P<0.001); however, the two had a positive correlation (R=0.72, P<0.001).

Conclusions: As a radiation-free, non-contrast-enhanced imaging method, 3D UTE-MRI has image quality inferior to that of HRCT, but it shows high reproducibility in identifying the imaging features of IPF and evaluating the extent of pulmonary fibrosis.
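As a rough illustration of how sensitivity, specificity, accuracy, and unweighted kappa relate to one another, the sketch below derives all four from a single 2×2 table. The abstract does not report the underlying counts. The table used here (34 true positives, 1 false negative, 0 false positives, 1 true negative, with HRCT as the reference) is an assumption chosen because it sums to the 36 patients and reproduces the reticulation figures quoted above. The function name `diagnostic_agreement` is hypothetical.

```python
def diagnostic_agreement(tp, fn, fp, tn):
    """Sensitivity, specificity, accuracy, and Cohen's kappa from a 2x2 table.

    Rows are the reference method (positive/negative); columns are the test method.
    """
    n = tp + fn + fp + tn
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("reference method needs both positive and negative cases")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n  # also the observed agreement p_o

    # Chance agreement p_e from the marginal totals of both methods.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Assumed counts (not reported in the abstract), n = 36.
result = diagnostic_agreement(tp=34, fn=1, fp=0, tn=1)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts a single missed case gives 97.2% accuracy yet only κ ≈ 0.654, because with 35 of 36 patients positive on the reference method, most of the agreement is expected by chance.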


Articles published on Weighted Kappa Analysis

Automated AI-based coronary calcium scoring using retrospective CT data from SCAPIS is accurate and correlates with expert scoring.

Postmastectomy Breast Reconstruction in Irradiated Patients: A 12-year follow-up of Deep Inferior Epigastric Perforator and Latissimus Dorsi Flap Outcomes

Optimal Magnetic Resonance Sequence for Assessment of Central Cartilage Tumor Scalloping

Comparison between clinical and computerized methods for assessing gingival pigmentation.

The relative validity of a semiquantitative food frequency questionnaire among pregnant women in the United Arab Emirates: The Mutaba'ah study.

Community directed assessment of pain in a northern Saskatchewan Cree community

The Role of Artificial Intelligence in Coronary Calcium Scoring in Standard Cardiac Computed Tomography and Chest Computed Tomography With Different Reconstruction Kernels.

Inter-grader reliability in the Danish screening programme for diabetic retinopathy.

Prevalence and associated factors of drug-drug interactions in elderly outpatients in a tertiary care hospital: a cross-sectional study based on three databases.

Deep learning reconstruction for the evaluation of neuroforaminal stenosis using 1.5T cervical spine MRI: comparison with 3T MRI without deep learning reconstruction.

Diagnostic Value of Fully Automated Artificial Intelligence Powered Coronary Artery Calcium Scoring from 18F-FDG PET/CT.

Three-dimensional ultrashort echo time magnetic resonance imaging in assessment of idiopathic pulmonary fibrosis, in comparison with high-resolution computed tomography.

Accuracy and Reliability of Whole Blood Bilirubin Measurements Using a Roche Blood Gas Analyzer for Neonatal Hyperbilirubinemia Screening and Risk Stratification.

The Impact of COVID-19 on the Choice of Treatment for Hand Fractures: A Single-Centre Concordance Study.

Oxygen desaturation index as alternative parameter in screening patients with severe obstructive sleep apnea.

The diagnostic value of susceptibility-weighted imaging for identifying acute intraarticular hemorrhages.

Verification of Atellica 1500 and comparison with Iris urine analyser and urine culture.

Free-breathing radial 3D fat-suppressed T1-weighted gradient echo (r-VIBE) sequence for assessment of pulmonary lesions: a prospective comparison of CT and MRI

Evaluation of calcium to phosphorus ratio in spot urine samples as a practical method to monitor phosphorus intake adequacy in sows.

The Patient Health Questionnaire-9 vs. the Hamilton Rating Scale for Depression in Assessing Major Depressive Disorder.
