Abstract

Knottnerus et al. continue the debate in response to the earlier article by Sacristan and Dilla, who proposed a shift from generalization of average results to application of study results to the individual patient [1: Sacristan JA, Dilla T. Pragmatic trials revisited: applicability is about individualization. J Clin Epidemiol 2018;99:164-166]. Knottnerus et al. argue that several subtypes of generalization accepted in the social sciences (statistical generalization, variation-covering generalization, theory-supported generalization, and exemplary generalization) can provide important information for individualizing study results to populations and individuals with essentially different characteristics from the average.

Another paper directly relevant to generalization and applicability, by Bokor-Billmann et al., provides an update on the ongoing failure to adequately report gender and ethnicity. In this journal back in 2003, Corbie-Smith et al. showed little progress in the reporting of sex and ethnicity 10 years after the NIH Revitalization Act of 1993 (PL 103-43) [2: https://www.ncbi.nlm.nih.gov/books/NBK236531/]. Now, despite the ICMJE recommendations on the selection and description of participants [3: Iverson C, Christiansen S, Flanagin A, Fontanarosa PB, Glass RM, Gregoline B, et al. International Committee of Medical Journal Editors. Recommendations for the conduct, reporting, editing, and publication of scholarly work in medical journals. i. Selection and description of participants. http://www.icmje.org/icmjerecommendations.pdf], Bokor-Billmann et al. report that among 995 original articles (28 interventional and 916 observational studies) in the top 10 ranked medical journals, sex and ethnicity are still reported in only a third of the articles. Even when reported, there is no consensus on how to define and classify race and ethnicity: only 5% of articles provided a formal definition of race/ethnicity, and 81 different race/ethnicity classifications were identified, too often imprecise and open to interpretation.

Five papers address methods issues of RCTs. Are placebo-controlled trials still ethical in surgery? For sure, some of the most important debunking of highly popular but fatally flawed surgery has occurred only after a definitive RCT: in the last century, the classic trial demonstrating the absence of benefit from internal mammary artery ligation for severe angina [4: Cobb LA, Thomas GI, Dillard DH, Merendino KA, Bruce RA. An evaluation of internal-mammary-artery ligation by a double-blind technic. N Engl J Med 1959;260:1115-1118]; and more recently, in 2017, a sham surgical trial was necessary to halt 'liberation therapy', a popular but biologically inexplicable venous balloon angioplasty procedure for multiple sclerosis [5: Zamboni P, Tesio L, Galimberti S, Massacesi L, Salvi F, D'Alessandro R, et al. Efficacy and safety of extracranial vein angioplasty in multiple sclerosis: a randomized clinical trial. JAMA Neurol 2018;75:35-43], an operation that in some jurisdictions had previously been funded from public plans. Sham surgery is the optimal design to isolate the specific effects of the treatment from the incidental effects of anesthesia, incisional trauma, pre- and postoperative care, and the patient's perception of having had a regular operation.
However, placebo-controlled surgical trials are rare, and even those funded have difficulty recruiting sufficient patients. There are a number of initiatives to increase these numbers, but as Cousins et al. report after reviewing 96 recently published trials, serious flaws remain, such as failure to document that patients understood that the placebo group could not benefit, and absence of technical details of the actual surgery, of risk minimization, of adequate follow-up, and of a commitment to offer the intervention to the placebo group.

Tapering or immediately stopping ('cold turkey') long-term medications is a frequent and important deprescribing decision, for example when there are concerns that immediate stopping may increase the chance of a flare of the condition. One stimulus for this is the 'Choosing Wisely' campaign to reduce polypharmacy. While evidence on the effects of deprescribing is still scarce [6: Knottnerus JA, Tugwell P. Research-agenda bias. J Clin Epidemiol 2018;98:vii-viii], an increasing number of trials evaluate it, so it is timely that Dirven et al. have reviewed these trials using the Template for Intervention Description and Replication (TIDieR) to see whether the details are reported with sufficient quality. In the 27 trial reports studied, details of the intervention were insufficiently reported in most, with high variability between studies. The authors suggest that such deficits are a real problem for clinicians making decisions on deprescribing.

Health information technologies (e.g., clinical decision support notifications/alerts/reminders, electronic communication between patient and provider) are increasingly used in the pediatric environment. These need to be tailored to the specific needs of children and families, and evaluation studies need to assess outcomes of importance and interest to key stakeholders (including patients, families/parents, health care professionals, and policy makers). Neame et al. reviewed 45 RCTs involving over 300,000 children; most measured 'process' outcomes such as utilization, but only 26% of the trials included patient-focused outcomes in their methods. Moreover, not all of these patient-focused outcomes were reported in the final publication, and only 7% reported adverse effects. There is an urgent need to improve the quality of these trials through publication of study protocols and the development of an outcome reporting framework and a core outcome set.

Transparency is improving in oncology trials. Seung et al. report that in a sample of 625 oncology trials in five major journals (Annals of Oncology, Journal of Clinical Oncology, JAMA Oncology, Lancet Oncology, New England Journal of Medicine), clinical trial transparency significantly increased over a 5-year period (2013-2017). Trial registration rose to 92%; it is then surprising that a disappointingly low 27% provided publicly available protocols, with industry-funded trials doing worse than non-industry trials. The editors of these journals have a major role here: having successfully established that they will not accept a paper based on an unregistered trial, they can do the same and insist on a publicly available protocol, since a protocol is almost always required for ethics approval before a trial begins.
Trial recruitment failure is the Achilles heel of trials, so new recruitment strategies are needed. van der Worp et al. decided to use social media, achieving the required sample size of 250 by recruiting nearly half of the participants through social media and the rest through general practitioners. That this resulted in comparable samples in an RCT of an app for incontinence is very encouraging. Incident and prevalent cases were recruited through 89 participating GPs, and prevalent cases were approached by letter in 14 of the 30 collaborating practices. The (social) media campaign consisted of interviews in regional newspapers spread through LinkedIn, Facebook, and Twitter; interviews on national and regional radio as well as local TV; and directed advertisements on Facebook in the study region. The only difference found between the two component samples when baseline characteristics and outcomes were analyzed was a difference in average age of 5 years. Since the social media strategy was able to recruit patients who had not accessed medical care, this approach may offer wider generalizability to underserved populations. Previous experience with social media recruitment has had mixed results, so if it is now employed by others, the baseline characteristics and outcomes of each group do need to be compared and reported.

Seven papers address systematic review methods. Santesso et al., in the 26th JCE GRADE series article, update the GRADE recommendations on wording options for making informative statements to communicate the findings of systematic reviews of interventions. They reaffirm the essential need to always include both (a) an estimate of the magnitude of effect and (b) the certainty/trustworthiness of the synthesized evidence. To communicate these two concepts and distinguish different levels of each, they propose that the following terms be used consistently: for magnitude, the terms 'large' and 'small'; for certainty/trustworthiness, the word 'may' for low certainty, the word 'likely' for moderate certainty, and no qualifying word for high certainty. While this may seem somewhat picayune, we agree that use of a common vocabulary will really help knowledge users understand these important gradations of the two concepts, magnitude of effect and certainty/trustworthiness of systematic review results, when looking at summary of findings tables and plain language summaries.
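To make the proposed vocabulary concrete, here is a minimal sketch of how the wording rules could be applied mechanically. The function name, labels, and the 'improves' direction are our own illustrative inventions under the stated conventions, not part of the GRADE guidance itself:

```python
def informative_statement(intervention: str, outcome: str,
                          certainty: str, magnitude: str) -> str:
    """Compose an informative statement using the wording conventions
    proposed by Santesso et al.: 'may' for low certainty, 'likely' for
    moderate certainty, no qualifier for high certainty, plus
    'large'/'small' for magnitude. (Illustrative sketch only; the
    direction 'improves' is hard-coded for brevity.)"""
    qualifiers = {"low": "may improve",
                  "moderate": "likely improves",
                  "high": "improves"}
    if certainty not in qualifiers:
        raise ValueError("certainty must be low, moderate, or high")
    if magnitude not in ("large", "small"):
        raise ValueError("magnitude must be 'large' or 'small'")
    return (f"{intervention} {qualifiers[certainty]} {outcome} "
            f"({magnitude} effect)")

# Example: moderate certainty, small effect
print(informative_statement("Exercise therapy", "pain", "moderate", "small"))
# -> "Exercise therapy likely improves pain (small effect)"
```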
Assessing and reporting the trustworthiness/certainty of conclusions is a hallmark of leading systematic review organizations such as Cochrane (formerly known as the Cochrane Collaboration) and GRADE [7: Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schunemann HJ. What is "quality of evidence" and why is it important to clinicians? BMJ 2008;336:995-998]. In 2011 the Cochrane handbook adopted a new approach to assessing risk of bias: instead of assessing the risk of bias of a whole study (which usually includes a number of different outcomes with varying precision and risks of bias), reviewers moved to assessing each clinically important outcome and reporting the risk of bias for each outcome in a Summary of Findings table (felt to be the 'beating heart' of Cochrane reviews). Sensitivity analyses presenting results 'stratified according to summary risk of bias, or restricted to studies at low risk of bias' are also listed in the core guidance document, the Methodological Expectations of Cochrane Intervention Reviews (MECIR) [8: https://methods.cochrane.org/methodological-expectations-cochrane-intervention-reviews].

Babic et al., in their review of nearly 1,500 Cochrane reviews published in 2015-2018, found that a fifth (300) still reported risk of bias by whole study rather than by outcome, and although 80% planned a sensitivity analysis in their protocol, only 18% reported one. Of special importance, 56 reviews reported a statistically significant change in the effect of at least one outcome. Although the quality of Cochrane reviews is consistently reported as higher than that of non-Cochrane reviews, this inconsistency in assessing outcomes rather than whole studies, exacerbated by the evidence in over 50 systematic reviews of an actual change in the statistical significance of an outcome, needs resolving.
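For readers unfamiliar with what such a sensitivity analysis entails, the following sketch shows the comparison MECIR asks for: re-pooling after restricting to studies at low risk of bias. The effect estimates are invented, and a simple inverse-variance fixed-effect pool is used for brevity (random-effects pooling would often be preferred in practice):

```python
import math

# Hypothetical per-study log odds ratios, standard errors, and
# overall risk-of-bias ratings (illustrative data, not from Babic et al.)
studies = [
    {"logOR": -0.45, "se": 0.20, "rob": "low"},
    {"logOR": -0.30, "se": 0.25, "rob": "low"},
    {"logOR": -0.80, "se": 0.30, "rob": "high"},
    {"logOR": -0.95, "se": 0.35, "rob": "high"},
]

def pool_fixed(subset):
    """Inverse-variance fixed-effect pooled log odds ratio and its SE."""
    weights = [1 / s["se"] ** 2 for s in subset]
    est = sum(w * s["logOR"] for w, s in zip(weights, subset)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return est, se

def report(label, subset):
    est, se = pool_fixed(subset)
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(f"{label}: OR={math.exp(est):.2f} "
          f"(95% CI {math.exp(lo):.2f}-{math.exp(hi):.2f})")

report("All studies      ", studies)
report("Low risk of bias ", [s for s in studies if s["rob"] == "low"])
# A material shift between the two pooled estimates is the kind of
# change in effect (or its significance) that Babic et al. found
# actually reported in 56 reviews.
```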
In conducting systematic reviews, a core requirement of the Cochrane Handbook, PRISMA, and MOOSE is to contact authors to confirm data on missing items. Thus the finding by Tsokani et al., who after contacting the authors of 116 RCTs in four systematic reviews managed only an 8% success rate in confirming data on missing items, is disappointing to say the least. This failure has the potential to impact substantively on the robustness and trustworthiness of meta-analyses as reported. They suggest ways of improving this, including data repositories and data-sharing platforms, and formalizing scientific collaboration between authors of original studies and evidence synthesis experts.

A study comparing Cochrane and non-Cochrane systematic reviews is reported by Hacke and Nunan, who studied the reasons for discrepancies between 24 pairs of Cochrane and non-Cochrane systematic reviews evaluating the effect of physical activity for the prevention and treatment of major chronic diseases, including cancer, cardiovascular disease, diabetes, mental health conditions, and osteoarthritis. Based on AMSTAR 2, the non-Cochrane reviews performed uniformly poorly, whereas the Cochrane reviews performed much better (70% rated moderate to high). Discrepancies in effect size were found in almost all pairs (22 of 24): 62% of effect sizes were higher in non-Cochrane reviews, 33% were higher in Cochrane reviews, and only one pair had the same effect size. Reasons varied but most often were differences in the studies included and in eligibility criteria for study design (12%), intervention (21%), controls (21%), condition definition (13%), and outcome (13%). Again this demonstrates that authors and journals are not adhering to reporting guidelines.

Gao et al. report on the quality of 30 non-Cochrane systematic reviews and their 30 updates, identified between 1994 and 2018 by searching PubMed and Embase. All were assessed as low or critically low on AMSTAR, mainly due to failure to fully report the protocol and registration, to adequately present data collection, and to adequately assess risk of bias. There was no substantive improvement in the updates. This suggests that new mechanisms are needed to increase adherence by researchers, journal editors, and peer reviewers alike.

Systematic reviews of prevalence and cumulative incidence are a growing, clinically important type of systematic review, with over half of all those listed in PubMed published since 2015. The paper by Hoffman et al. cites as examples (a) for prevalence, the prevalence of psychiatric disorders in nursing home residents and the use of psychotropic drugs in patients with autism spectrum disorders; and (b) for cumulative incidence (the proportion of an at-risk population with new events over a period), the occurrence of urinary leakage after laparoscopic radical prostatectomy and the use of harmful practices in the management of childhood diarrhea in low- and middle-income countries. They examined the characteristics of a sample of 215 such systematic reviews published up to 2018. These reviews appeared in a surprisingly large number (187) of different journals; only 5% registered their protocols in PROSPERO, three-quarters had language restrictions, half failed to assess risk of bias or study quality, and half performed a meta-analysis. It is clear that this area needs attention: the authors call for consensus guidance on how to conduct and report prevalence and cumulative incidence systematic reviews.

There are an increasing number of epidemiological studies of genotypes or genogroups of pathogens, so researchers and clinicians turn to reviews of prevalence, genotypes, and risk factors to keep up with the rising genomic knowledge. The quality of such reviews is rarely appraised, so their robustness is unknown. Tran et al. searched nine electronic databases in September 2017 and found three formats to be most common: (1) weighted pooling meta-analysis (16 studies); (2) summary statistics based on unweighted analysis of study-level measures (15 studies); and (3) no data pooling (5 studies). Methodological quality was assessed with three tools (Assessment of Multiple Systematic Reviews (AMSTAR), Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), and Risk Of Bias In Systematic reviews (ROBIS)), and the 16 studies that used weighted pooling meta-analysis performed best. The authors conclude that weighted pooling meta-analysis is the most suitable method for reaching sound conclusions in epidemiological studies of pathogen genotypes/genogroups.
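As a minimal illustration of the weighted pooling Tran et al. favor, the sketch below pools invented study-level prevalences by inverse-variance weighting on the logit scale (one common transform choice; all numbers are hypothetical) and contrasts the result with an unweighted mean of the estimates:

```python
import math

# Hypothetical studies: (cases, sample size) -- invented numbers
studies = [(12, 150), (40, 520), (9, 80), (25, 410)]

def logit_pool(studies):
    """Inverse-variance fixed-effect pooling of prevalences on the
    logit scale, a common approach in prevalence meta-analysis."""
    ests, weights = [], []
    for cases, n in studies:
        p = cases / n
        ests.append(math.log(p / (1 - p)))
        weights.append(n * p * (1 - p))   # 1 / delta-method variance of logit(p)
    pooled_logit = sum(w * e for w, e in zip(weights, ests)) / sum(weights)
    return 1 / (1 + math.exp(-pooled_logit))  # back-transform to a proportion

unweighted = sum(c / n for c, n in studies) / len(studies)
print(f"Weighted pooled prevalence:   {logit_pool(studies):.3f}")
print(f"Unweighted mean of estimates: {unweighted:.3f}")
# The gap between the two illustrates why unweighted summary
# statistics were the weaker of the formats Tran et al. compared.
```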
Luiken et al. report on how changes in predictor measurement procedures can affect the performance of prediction models, using clinical examples. Clinical prediction models are commonly applied in practice to assist health care professionals in determining a patient's diagnosis or prognosis. These models are typically applied to patients who were not part of the study population used to derive the model, often with the aim of estimating the probability of the presence of a disease or of a future health status. When applied to new patients, the performance in estimating these probabilities often differs from the performance in the derivation data. Discrepancies in predictive performance between derivation and validation settings are commonly explained by the specific modeling strategies used (which may result in overfitting) and by differences in case-mix distribution across settings [9: Steyerberg EW, Uno H, Ioannidis JP, Van Calster B, Ukaegbu C, Dhingra T, et al. Poor performance of clinical prediction models: the harm of commonly applied methods. J Clin Epidemiol 2018;98:133-143]. This paper presents evidence for another important factor, predictor measurement heterogeneity, identified through three real-world examples (the diagnosis of ovarian cancer, hereditary nonpolyposis colorectal cancer, and intrauterine pregnancy). The authors recommend that it be assessed to explain unanticipated predictive performance (a toy illustration is sketched at the end of this abstract).

Finally, a review of the quality of observational studies in psychiatry suggests that their methodology is often substantively suboptimal. Munkholm et al. examined 120 articles published in 2015-2018 in the five psychiatry journals with the highest 5-year impact factor (World Psychiatry, JAMA Psychiatry, Lancet Psychiatry, American Journal of Psychiatry, and Molecular Psychiatry). They found that only 5% followed a reporting guideline for observational studies such as STROBE. Nearly half failed even to mention confounding or bias in the abstract or discussion, and only one article expressed any caution related to confounders in the abstract or conclusion.
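The toy illustration promised above: a hypothetical logistic model is derived with a biomarker measured by one assay and then applied to patients whose biomarker is measured on a systematically shifted scale. The resulting miscalibration arises solely from predictor measurement heterogeneity, not from overfitting or case mix; the scenario, coefficients, and assay shift are all invented:

```python
import math, random

random.seed(1)

# Derivation model (hypothetical): risk depends on one biomarker x
# via logit(p) = -2.0 + 0.8 * x, with x measured by assay A.
b0, b1 = -2.0, 0.8
predict = lambda x: 1 / (1 + math.exp(-(b0 + b1 * x)))

# Simulate a validation cohort whose true biology matches derivation...
patients = [random.gauss(0.0, 1.0) for _ in range(10_000)]
outcomes = [1 if random.random() < predict(x) else 0 for x in patients]

# ...but whose biomarker is measured with assay B, which reads
# 0.5 units higher (predictor measurement heterogeneity).
measured = [x + 0.5 for x in patients]

mean_pred = sum(predict(x) for x in measured) / len(measured)
observed = sum(outcomes) / len(outcomes)
print(f"Observed event rate: {observed:.3f}")
print(f"Mean predicted risk: {mean_pred:.3f}")
# The systematic over-prediction (miscalibration) comes purely from
# the measurement shift, which is why Luiken et al. recommend
# assessing measurement heterogeneity when validation results surprise.
```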
