Abstract

Background
ChatGPT is an artificial intelligence tool used by practitioners to answer clinical questions. It is unknown whether ChatGPT provides quality responses to infectious diseases (ID)-specific questions. This study surveyed ID pharmacist subject matter experts (SMEs) to assess the quality of ChatGPT responses.

Methods
The primary outcome was the percentage of ChatGPT responses considered useful. Secondary outcomes were SMEs' ratings of correctness, completeness, and safety (C/C/S). One hundred questions clinically encountered by ID pharmacists were assembled and internally validated. Questions were entered into ChatGPT version 3.5, and responses were recorded. Definitions of useful and C/C/S were based on prior definitions and literature. A 0-10 rating scale for C/C/S was developed and validated for interrater reliability using a random sample. Questions and their ChatGPT responses were sent to five SMEs for evaluation. Interrater reliability was assessed using the average-measures intraclass correlation coefficient (ICC) for ordinal variables and Fleiss' multirater kappa (FMK) for categorical variables. SMEs' responses were compared using the Kruskal-Wallis test for ordinal variables and the chi-square test for categorical variables. A post hoc analysis using the Mann-Whitney U test with Bonferroni correction was performed to locate differences between SME ratings of C/C/S by question difficulty and category.

Results
SMEs considered 41.8% of responses useful. Median (IQR) ratings for C/C/S were 7 (4-9), 5 (3-8), and 8 (4-10), respectively. The FMK for usefulness was 0.379 (95% CI 0.317-0.441), indicating fair agreement; the ICCs for C/C/S were 0.820 (95% CI 0.758-0.870), 0.745 (95% CI 0.656-0.816), and 0.833 (95% CI 0.775-0.880), respectively, indicating substantial agreement. No significant difference was observed between SMEs in the percentage of responses considered useful. Neither question category nor difficulty produced a difference in SMEs' ratings of C/C/S or in the percentage of responses considered useful.

Conclusion
Fewer than half of ChatGPT responses were considered useful by SMEs. However, responses were mostly rated correct and safe, though often deemed incomplete.

Disclosures
Conan MacDougall, PharmD, MAS, Merck: Grant/Research Support. Elias Chahine, Pharm.D., Seqirus: Advisor/Consultant|Seqirus: Honoraria. Wesley D. Kufel, Pharm.D., BCPS, BCIDP, Merck & Co.: Grant/Research Support|Shionogi, Inc: Grant/Research Support.
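The two interrater-reliability measures named in the Methods (average-measures ICC for the ordinal 0-10 ratings, Fleiss' kappa for the categorical useful/not-useful judgments) can be sketched as follows. This is a minimal NumPy illustration on synthetic ratings; the function names and example data are illustrative assumptions, not the study's analysis code or data:

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' multirater kappa from an (n_subjects, n_categories)
    count table, where each row sums to the number of raters."""
    counts = np.asarray(counts, dtype=float)
    n, _ = counts.shape
    k = counts[0].sum()                        # raters per subject
    p_j = counts.sum(axis=0) / (n * k)         # overall category proportions
    P_i = (np.square(counts).sum(axis=1) - k) / (k * (k - 1))
    P_bar, P_e = P_i.mean(), np.square(p_j).sum()
    return (P_bar - P_e) / (1 - P_e)

def icc_avg(ratings):
    """Two-way random, absolute-agreement, average-measures ICC(2,k)
    from an (n_subjects, n_raters) matrix of ordinal scores."""
    X = np.asarray(ratings, dtype=float)
    n, k = X.shape
    grand = X.mean()
    MSB = k * np.square(X.mean(axis=1) - grand).sum() / (n - 1)  # subjects
    MSC = n * np.square(X.mean(axis=0) - grand).sum() / (k - 1)  # raters
    SSE = np.square(X - grand).sum() - (n - 1) * MSB - (k - 1) * MSC
    MSE = SSE / ((n - 1) * (k - 1))
    return (MSB - MSE) / (MSB + (MSC - MSE) / n)

# Synthetic example: 3 questions, 2 raters, 3 "useful?" categories.
kappa = fleiss_kappa([[2, 0, 0], [0, 2, 0], [0, 0, 2]])  # perfect agreement
# Synthetic example: 3 questions, 3 raters, 0-10 scale scores.
icc = icc_avg([[7, 8, 7], [4, 5, 4], [9, 9, 10]])
```

Both are computed per outcome (one FMK for usefulness, one ICC each for correctness, completeness, and safety); in practice a library such as `statsmodels` (Fleiss' kappa) or `pingouin` (ICC, with confidence intervals) would typically be used instead of hand-rolled formulas.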