Articles published on Incorrect Options

75 Search results
  • Research Article
  • 10.31436/imjm.v25i01.3213
Comparative Evaluation of ChatGPT and Microsoft Copilot in Solving Clinical Vignette-style multiple-choice questions (MCQs) in Physiology
  • Mar 3, 2026
  • IIUM Medical Journal Malaysia
  • Rekha Prabhu + 2 more

INTRODUCTION: Large language models (LLMs) are increasingly used by MBBS students as supplementary resources for exam preparation. The objective of this study was to evaluate the performance of ChatGPT and Microsoft Copilot in answering clinical vignette-style physiology MCQs from widely used resources for the United States Medical Licensing Examination (USMLE). MATERIALS AND METHODS: Fifty clinical vignette-style physiology multiple-choice questions (MCQs) from various USMLE question banks were submitted to ChatGPT and Microsoft Copilot for selection of the correct option. Their performance was assessed against the answers provided in the question banks. Two experienced physiologists independently reviewed the explanations provided by ChatGPT and Microsoft Copilot for each MCQ. The explanations were rated from one to three points based on whether they were completely incorrect, partially correct with inaccurate information, or correct with adequate information. RESULTS: ChatGPT and Microsoft Copilot correctly answered 48 and 47 of the 50 questions, accuracy rates of 96% and 94%, respectively. One MCQ each on hypothyroidism and arrhythmia was answered incorrectly by both ChatGPT and Microsoft Copilot. ChatGPT provided inaccurate explanations for two MCQs, and Microsoft Copilot for four. CONCLUSION: ChatGPT and Microsoft Copilot both demonstrated more than 90% accuracy in answering case-based MCQs from USMLE Step 1 resources. Their incorrect option choices on hypothyroidism MCQs and inaccurate explanations for some MCQs highlight the need for cautious use of AI by students.

  • Research Article
  • 10.1016/j.dib.2026.112597
SSC-BanglaTutor: A curriculum-aligned Bengali dataset for intelligent tutoring systems
  • Feb 24, 2026
  • Data in Brief
  • Eshraque Jabid Ifti + 7 more

This article presents a Bengali-language dataset designed to fine-tune AI-powered hint-based tutoring systems for the Secondary School Certificate (SSC) science curriculum in Bangladesh. The data include 11,286 hint-based question–answer entries, comprising 4859 questions from Biology covering 14 chapters, 3034 from Chemistry across 12 chapters, and 3393 from Physics spanning 14 chapters. All items were created manually using government-issued textbooks, SSC-focused study materials, and past exam question banks. Each question is paired with candidate answers containing one correct option and several closely related but incorrect options to help measure the effectiveness of the hints. A convergence score is attached to each entry, estimating how far a student may need to go through the hints to answer correctly. These features support personalized feedback and offer meaningful insight into the students’ learning progress. The dataset is encoded in UTF-8, with some English terms retained for scientific precision and consistency with source materials. This makes it accessible to native learners while remaining valuable for low-resource Natural Language Processing (NLP) applications. By emphasizing curriculum alignment, ranked hinting, and learner modeling, the dataset provides a strong foundation for fine-tuning large language models (LLMs) and developing intelligent tutoring systems that are both linguistically inclusive and educationally effective.

  • Research Article
  • 10.3390/computers15020130
Rethinking Distractor Quality in Multimodal Multiple-Choice Questions: Automated Evaluation and Hard Benchmark Construction
  • Feb 18, 2026
  • Computers
  • Wenjian Ding + 3 more

In Multimodal Multiple-Choice Questions, distractors play a pivotal role in rigorously evaluating the cross-modal reasoning capabilities of Multimodal Large Language Models by serving as plausible yet incorrect options. A comprehensive and reliable evaluation of distractor quality is therefore imperative for fostering genuine progress in this domain. However, prevailing evaluation approaches face a fundamental dilemma: they either rely on model-based metrics that fail to fully capture semantic nuances, or depend on human evaluation, which is resource-intensive and prone to subjective bias. To address these limitations, we introduce a comprehensive suite of 9 automated metrics, spanning both intrinsic and extrinsic dimensions, to reliably quantify distractor quality. Leveraging this framework, we propose a metric-driven ensemble strategy for constructing hard benchmarks. Specifically, we aggregate candidate pools from diverse advanced baselines and rigorously select the optimal subset of distractors that yield the highest quality scores under our proposed metrics. Extensive evaluations involving 33 Multimodal Large Language Models across 16 diverse benchmarks demonstrate that our method generates distractors with significantly higher confusability, posing a more rigorous challenge to current state-of-the-art models.

  • Research Article
  • 10.1186/s12909-026-08656-3
Consistency over accuracy: run-to-run stability of contemporary large language models on Turkish curriculum-aligned theoretical anatomy multiple-choice questions.
  • Jan 23, 2026
  • BMC medical education
  • Ömer Alperen Gürses + 1 more

Stability across repeated administrations is essential for educational use of large language models (LLMs), yet it is rarely quantified in non-English, curriculum-aligned anatomy contexts. Eleven contemporary LLMs answered 100 Turkish, faculty-authored, curriculum-aligned anatomy multiple-choice questions from AYDEP, targeting the undergraduate Physiotherapy and Rehabilitation anatomy curriculum in three independent runs (≥ 12-hour intervals). Testing used developers' web interfaces in Turkey with browsing disabled and default generation settings (August-September 2025). Performance was summarized with a stability-aware 0-3 item score (number of correct responses across three runs) and predefined response-consistency classes. A subset of models achieved near-ceiling totals with high run-to-run stability, whereas others showed greater session-to-session variability (i.e., changes in the selected option across independent runs initiated as separate sessions under identical inputs). Several nominal differences among higher-performing systems did not remain significant after multiplicity control. Within-family updates produced selective, not universal, gains. Many models exhibited medians of 3 (IQR 3-3) on the 0-3 scale (ceiling effects), and lower means were accompanied by larger dispersion. Consistency profiles provided information beyond mean accuracy by distinguishing reliably correct from volatile behavior. In addition, we observed "consistent & wrong" patterns on a subset of items, where the same incorrect option was repeatedly selected across runs. In Turkish, curriculum-aligned anatomy items, contemporary LLMs can be both accurate and stable, but single-trial accuracy can mask volatility and stable systematic errors. Adoption decisions should prioritize stability-aware appraisal (including consistent-correct and consistent-wrong rates), with local validation on institutional item banks and periodic re-evaluation as models evolve. 
Extending this framework to multimodal anatomy and constructed-response tasks will further inform trustworthy, learner-facing use.
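The stability-aware item score described in this abstract (number of correct responses across three runs) can be illustrated with a minimal Python sketch. The function name `stability_profile` and the three class labels are assumptions made for illustration; the paper's own predefined response-consistency classes are not listed in the abstract and may differ.

```python
def stability_profile(selected, key):
    """Summarize one item across three independent runs.

    selected: list of the three options a model chose (one per run).
    key: the correct option for the item.
    Returns (score, label), where score is the 0-3 count of correct runs.
    The labels are hypothetical stand-ins for the paper's consistency classes.
    """
    if len(selected) != 3:
        raise ValueError("expected exactly three runs")

    # 0-3 item score: how many of the three runs picked the keyed option.
    score = sum(1 for choice in selected if choice == key)

    if score == 3:
        label = "consistent-correct"
    elif len(set(selected)) == 1:
        # Same incorrect option chosen in every run.
        label = "consistent-wrong"
    else:
        # The selected option changed between runs.
        label = "volatile"
    return score, label


print(stability_profile(["B", "B", "B"], key="A"))
```

Separating "consistent-wrong" from "volatile" is what lets a profile like this carry information beyond mean accuracy: both can score 0 on an item, but only the former indicates a stable systematic error.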

  • Research Article
  • 10.1002/bdm.70025
Mouse Cursor Movements in Cognitive Bias Tasks Reveal Underlying Processing Differences
  • Jul 1, 2025
  • Journal of Behavioral Decision Making
  • Jinjin Wu + 2 more

Biases are prevalent in human judgment and decision‐making (JDM). Previous research has suggested that some biases might share common underlying causes and can be accounted for under dual‐process theories in which fast and error‐prone System 1 drives erroneous behavior. Here, we use an online paradigm to investigate similarities and differences in behavior across three commonly studied cognitive bias phenomena: cognitive reflection test (CRT), gambler's fallacy (GF), and conjunction fallacy (CF). These are all thought to emerge during biased System 1 processing. Critically, we examine both summative performance metrics and process tracing measures derived from mouse cursor movements and growth curve analysis (GCA). Summative performance in these tasks was broadly in line with previous studies, and we replicated correlations in accuracy between tasks (CRT vs. CF and CRT vs. GF). However, we found key differences in our GCA of mouse trajectories. Specifically, in the CRT and the CF tasks, participants tended to choose the incorrect option more quickly relative to the correct option, as might be expected. However, the opposite tendency was observed for GF: people tended to take longer to choose the wrong answer. We also found evidence from the mouse movement analyses for between‐task differences in the extent to which participants were tempted by the option they did not choose. These findings challenge prominent dual‐process accounts of JDM and highlight the potential of process tracing (and in particular mouse movement analyses) for revealing insights into cognitive processes.

  • Research Article
  • 10.1111/cogs.70076
Adults Represent Others’ Logical Inferences Even When It Is Unnecessary
  • Jun 1, 2025
  • Cognitive Science
  • Dóra Fogd + 2 more

Successful social interactions require representing not only what others know, but also what they may deductively infer from evidence. For instance, to help someone decide between two alternatives, we may just reveal the incorrect option, expecting them to draw the correct conclusion. Seemingly, we readily track others’ logical inferences if it is necessary for our goals. However, it is currently unknown whether we also track them when we do not have to, and whether these inferences affect our own conclusions. To address this, in four online experiments, we presented adults with scenarios where an agent could arrive at the same or different conclusions as the participant, based on what she witnessed (via excluding one or two out of three target locations). Participants rated the likelihood of an outcome from self or from the agent's perspective. We hypothesized that if participants track others’ inferences also when making self‐perspective judgments, that is, when they could respond without even paying attention to the other, the spontaneous representation of the other's different conclusion may result in higher ratings for the outcome the agent (but not the participant) considers possible, compared to the one both consider impossible. In three experiments, we found such an altercentric bias in self‐perspective judgments, suggesting that participants spontaneously encoded the conclusions the agent could draw (Experiments 1 and 2), even when this required multistep inferences (Experiment 4), although there were considerable individual differences and the bias was absent when task demands were high (Experiment 3), implying a potentially resource‐dependent use of the capacity.

  • Research Article
  • 10.32473/flairs.38.1.138995
Generating Distractors for Code Completion Problems: Can LLM Assist Instructors?
  • May 14, 2025
  • The International FLAIRS Conference Proceedings
  • Mohammad Hassany + 5 more

Code completion problems are an effective type of formative assessment, especially when used to practice newly learned concepts or topics. While there is a growing body of research in computing education on the use of large language models (LLMs) to support learning content development, the use of LLMs for producing high-quality code completion problems has not yet been explored. In this paper, we analyze the capability of LLMs to generate effective distractors (i.e., plausible but incorrect options) and explanations for completion problems. We utilize common student misconceptions to improve the quality of the generated distractors. Our study suggests that LLMs are capable of generating reasonable distractors and explanations. At the same time, we identify the lack of a sufficiently granular taxonomy of common student misconceptions, which would be needed to align the generated distractors with common misconceptions and errors, a gap that should be addressed in future work.

  • Research Article
  • Citations: 2
  • 10.1016/j.neunet.2025.107138
Read, Eliminate, and Focus: A reading comprehension paradigm for distant supervised relation extraction.
  • May 1, 2025
  • Neural networks : the official journal of the International Neural Network Society
  • Zechen Meng + 7 more

  • Research Article
  • 10.1609/aaai.v39i24.34722
Thought-Path Contrastive Learning via Premise-Oriented Data Augmentation for Logical Reading Comprehension
  • Apr 11, 2025
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Chenxu Wang + 2 more

Logical reading comprehension is a challenging task that entails grasping the underlying semantics of text and applying reasoning to deduce the correct answer. Prior research has primarily focused on enhancing logical reasoning capabilities through Chain-of-Thought (CoT) or data augmentation. However, previous work constructing chain-of-thought rationales concentrates solely on analyzing correct options, neglecting the incorrect alternatives. Additionally, earlier efforts on data augmentation by altering contexts rely on rule-based methods, which result in generated contexts that lack diversity and coherence. To address these issues, we propose a Premise-Oriented Data Augmentation (PODA) framework. This framework can generate CoT rationales including analyses for both correct and incorrect options, while constructing diverse and high-quality counterfactual contexts from incorrect candidate options. We integrate summarizing premises and identifying premises for each option into rationales. Subsequently, we employ multi-step prompts with identified premises to construct counterfactual contexts. To help the model better differentiate the reasoning process associated with each option, we introduce a novel thought-path contrastive learning method that compares reasoning paths between the original and counterfactual samples. Experimental results on three representative LLMs demonstrate that our method can improve the baselines substantially across two challenging logical reasoning benchmarks (ReClor and LogiQA 2.0).

  • Research Article
  • 10.21105/joss.07783
MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models
  • Apr 10, 2025
  • Journal of Open Source Software
  • Sayak Chakrabarty + 1 more

This paper introduces Multiple Choice Reasoning via Process of Elimination using Multi-Modal Models, also known as Multi-Modal Process of Elimination (MM-PoE), a method to enhance vision language models' performance on multiple-choice visual reasoning tasks by employing a two-step scoring system that first eliminates incorrect options and then predicts from the remaining ones. Our experiments across three question-answering datasets show the method's effectiveness, particularly in visual reasoning tasks. This method addresses one of the key limitations of the paper (Ma & Du, 2023) by extending to tasks involving multi-modalities and also includes experimentation techniques for few-shot settings. Statement of Need: Large language models (LLMs) excel at in-context learning for multiple-choice reasoning tasks but often treat all options equally, unlike humans, who typically eliminate incorrect choices before selecting the correct answer. The same is true for vision language models (VLMs) in visual question-answering tasks with multiple choices. This discrepancy can limit the effectiveness of vision language models in accurately solving such tasks. To address this, we introduce Multi-Modal Process of Elimination (MM-PoE), a two-step scoring method designed to enhance VLM performance by mimicking human reasoning strategies in multi-modal settings. In the first step, the method evaluates and scores each option, systematically eliminating those that appear incorrect. The second step involves masking these eliminated options, allowing the VLM to focus solely on the remaining viable choices to make a final prediction. Our zero-shot experiments across three datasets demonstrate MM-PoE's effectiveness, particularly in logical reasoning scenarios. Additionally, MM-PoE proves adaptable to few-shot settings and is compatible with current state-of-the-art vision language models (VLMs). Using this tool, researchers and practitioners can experiment with and significantly improve the accuracy and reliability of VLMs in multiple-choice reasoning tasks, making it a valuable tool for advancing machine learning models for visual reasoning.
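The two-step scoring idea in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the MM-PoE package's API: the function name `process_of_elimination`, the `rescore` callback, and the below-average elimination rule are all assumptions, and the hard-coded scores stand in for what a vision language model would produce.

```python
def process_of_elimination(first_pass_scores, rescore):
    """Two-step multiple-choice prediction (hypothetical sketch).

    first_pass_scores: dict mapping option -> score (higher = more plausible).
    rescore: callable that takes the surviving options and returns a dict
             option -> score, standing in for a second model call in which
             the eliminated options are masked.
    Returns (prediction, survivors).
    """
    # Step 1: eliminate options scoring below the mean (assumed criterion).
    mean_score = sum(first_pass_scores.values()) / len(first_pass_scores)
    survivors = [o for o, s in first_pass_scores.items() if s >= mean_score]

    # Step 2: predict only among the options that survived elimination.
    second_pass = rescore(survivors)
    prediction = max(survivors, key=lambda o: second_pass[o])
    return prediction, survivors


# Toy scores in place of real model outputs.
step1 = {"A": 0.10, "B": 0.40, "C": 0.35, "D": 0.15}
step2 = {"B": 0.45, "C": 0.55}
print(process_of_elimination(step1, lambda opts: {o: step2[o] for o in opts}))
```

In this toy run the first pass alone would pick B, while the second pass over the two surviving options picks C, which is the kind of change in prediction the elimination step is meant to allow.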

  • Research Article
  • Citations: 3
  • 10.1038/s41598-025-85827-0
The importance of clinical experience in AI-assisted corneal diagnosis: verification using intentional AI misleading
  • Jan 9, 2025
  • Scientific Reports
  • Hiroki Maehara + 15 more

We developed an AI system capable of automatically classifying anterior eye images as either normal or indicative of corneal diseases. This study aims to investigate the influence of AI’s misleading guidance on ophthalmologists’ responses. This cross-sectional study included 30 cases each of infectious and immunological keratitis. Responses regarding the presence of infection were collected from 7 corneal specialists and 16 non-corneal-specialist ophthalmologists, first based on the images alone and then after presenting the AI’s classification results. The AI’s diagnoses were deliberately altered to present a correct classification in 70% of the cases and an incorrect one in 30%. The overall accuracy of the ophthalmologists did not significantly change after AI assistance was introduced [75.2 ± 8.1%, 75.9 ± 7.2%, respectively (P = 0.59)]. In cases where the AI presented incorrect diagnoses, the accuracy of corneal specialists showed no significant change before and after AI assistance [60.3 ± 35.2% and 53.2 ± 30.9%, respectively (P = 0.11)]. In contrast, the accuracy for non-corneal specialists dropped significantly from 54.5 ± 27.8% to 31.6 ± 29.3% (P < 0.001), especially in cases where the AI presented incorrect options. Less experienced ophthalmologists were misled by incorrect AI guidance, but corneal specialists were not. Even with the introduction of AI diagnostic support systems, the ophthalmologist’s experience remains crucial.

  • Research Article
  • Citations: 2
  • 10.35631/ijmoe.623029
THE DEVELOPMENT OF FORCE & MOTION ACHIEVEMENT TEST (FMAT) FOR FORM TWO STUDENTS
  • Dec 23, 2024
  • International Journal of Modern Education
  • Putri Sathirah Saaban + 1 more

Force and Motion (FM) is one of the most challenging science concepts for students. However, the instruments for assessing students’ understanding of this concept are limited. As such, this study focused on the development process of an achievement test specifically on this topic. The Force and Motion Achievement Test (FMAT) was developed based on the Standard Curriculum of Secondary Schools (KSSM) for the Science subject. It consists of 25 items, namely 22 objective items (Section A) and 3 subjective items (Section B). The FMAT was validated by 3 experts with more than 10 years of experience. Consequently, 3 items were amended. A pilot study was conducted on 40 form two students, followed by a reliability procedure using internal consistency with KR-20 for Section A and the split-half method for Section B. The reliability analysis yielded acceptable alpha coefficient values for both sections: 0.757 for Section A and 0.732 for Section B. In terms of difficulty index (p), there were 4 difficult items, 23 moderately difficult items, and 9 easy items. These findings indicate that the FMAT is valid and reliable for use in any study involving an achievement test on the Force and Motion topic. Future researchers are encouraged to perform item discrimination and distractor analysis to evaluate the items’ ability to distinguish between high- and low-achieving students, and how well the incorrect options divert students away from the correct answer. The implication of this study lies in providing a scientifically validated and reliable instrument for measuring students' understanding of the Force and Motion topic that helps teachers identify learning gaps and enhance teaching strategies, ultimately improving the quality of science education.
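For readers unfamiliar with the statistics named in this abstract, the following Python sketch shows how KR-20 and the per-item difficulty index p are commonly computed from dichotomous (0/1) item scores. The function name `kr20`, the toy response matrix, and the use of population variance for total scores are assumptions for illustration; the study's own data and variance convention are not given in the abstract.

```python
def kr20(responses):
    """Kuder-Richardson 20 for dichotomously scored items (sketch).

    responses: list of per-student lists, each holding 0/1 scores per item.
    Returns (kr20_value, difficulty), where difficulty[i] is the proportion
    of students who answered item i correctly (the difficulty index p).
    """
    n = len(responses)       # number of students
    k = len(responses[0])    # number of items

    # Difficulty index p_i: share of students answering item i correctly.
    difficulty = [sum(r[i] for r in responses) / n for i in range(k)]

    # Variance of students' total scores (population variance: an assumption,
    # some texts use the sample variance instead).
    totals = [sum(r) for r in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # KR-20 = k/(k-1) * (1 - sum(p_i * q_i) / var_total), with q_i = 1 - p_i.
    sum_pq = sum(p * (1 - p) for p in difficulty)
    return (k / (k - 1)) * (1 - sum_pq / var_total), difficulty


# Toy data: 4 students, 3 items (not from the study).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```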

  • Open Access
  • Research Article
  • 10.15354/sief.24.or617
Development of a Protein Concept Inventory: A Proposal for Item Scoring and Responding
  • Aug 29, 2024
  • Science Insights Education Frontiers
  • Güntay Taşçi

The present study has aimed to develop and validate a protein concept inventory (PCI) consisting of 25 multiple-choice (MC) questions to assess students’ understanding of protein, which is a fundamental concept across different biology disciplines. The development process of the PCI involved a literature review to identify protein-related content, validation interviews to iteratively validate and refine the created items (n = 26), and data collection from a large sample (n = 291) for statistical analysis. An expert interview was held with two different field experts regarding the content validity of the draft PCI tool, the suitability of the options, and the clarity of the items. Free choice format (multiple marking) was used to answer the developed MC items. In scoring these items, positive points were given to correct options, and negative points were given to incorrect options. Evidence regarding the psychometric properties of the PCI trial form was collected through factor analysis, group differentiation, internal consistency, and item analysis using quantitative data. The evidence collected demonstrates that the validity and reliability of the PCI as a measurement tool have been confirmed. PCI’s scoring approach and the use of response patterns created by multiple markings in teaching are discussed.

  • Research Article
  • Citations: 1
  • 10.3389/feduc.2024.1423602
Identifying student profiles in a digital mental rotation task: insights from the 2017 NAEP math assessment
  • Aug 5, 2024
  • Frontiers in Education
  • Xin Wei + 2 more

Mental rotation (MR), a key aspect of spatial reasoning, is highly predictive of success in STEM fields. This study analyzed strategies employed by 27,600 eighth-grade students during a digital MR task from the 2017 National Assessment of Educational Progress (NAEP) in mathematics. Utilizing K-means cluster analysis to categorize behavioral and performance patterns, we identified four distinct profiles: Cognitive Offloaders (15% of the sample), Internal Visualizers (55%), External Visualizers (5%), and Non-Triers (25%). Cognitive Offloaders, skilled at minimizing cognitive load by eliminating incorrect options, demonstrated the highest MR accuracy rates at 45%. Internal Visualizers, relying less on digital tools and more on mental strategies, achieved robust performance with an average score of 38%. External Visualizers, despite their extensive use of assistive tools and greater time investment, scored an average of 36%. Non-Triers showed minimal engagement and correspondingly the lowest performance, averaging 29%. These findings not only underscore the diverse strategies students adopt in solving MR tasks but also emphasize the need for educational strategies that are tailored to accommodate different cognitive styles. By integrating MR training into the curriculum and enhancing teacher preparedness to support diverse learning needs, this study advocates for educational reforms to promote equitable outcomes in mathematics and broader STEM fields.

  • Research Article
  • Citations: 39
  • 10.1016/j.artmed.2024.102938
MedExpQA: Multilingual benchmarking of Large Language Models for Medical Question Answering
  • Jul 31, 2024
  • Artificial Intelligence In Medicine
  • Iñigo Alonso + 2 more

Large Language Models (LLMs) have the potential to facilitate the development of Artificial Intelligence technology that assists medical experts with interactive decision support. This potential has been illustrated by the state-of-the-art performance obtained by LLMs in Medical Question Answering, with striking results such as passing marks in medical licensing exams. However, while impressive, the required quality bar for medical applications remains far from being achieved. Currently, LLMs remain challenged by outdated knowledge and by their tendency to generate hallucinated content. Furthermore, most benchmarks to assess medical knowledge lack reference gold explanations, which means that it is not possible to evaluate the reasoning behind LLMs' predictions. Finally, the situation is particularly grim if we consider benchmarking LLMs for languages other than English, which remains, as far as we know, a totally neglected topic. In order to address these shortcomings, in this paper we present MedExpQA, the first multilingual benchmark based on medical exams to evaluate LLMs in Medical Question Answering. To the best of our knowledge, MedExpQA includes for the first time reference gold explanations, written by medical doctors, of the correct and incorrect options in the exams. Comprehensive multilingual experimentation using both the gold reference explanations and Retrieval Augmented Generation (RAG) approaches shows that the performance of LLMs, with best results around 75 accuracy for English, still has large room for improvement, especially for languages other than English, for which accuracy drops 10 points. Therefore, despite using state-of-the-art RAG methods, our results also demonstrate the difficulty of obtaining and integrating readily available medical knowledge that may positively impact results on downstream evaluations for Medical Question Answering. Data, code, and fine-tuned models will be made publicly available at https://huggingface.co/datasets/HiTZ/MedExpQA.

  • Research Article
  • Citations: 4
  • 10.4103/jiaps.jiaps_232_23
Unaided Visual Inspection for Assessment of Penile Curvature in the Clinical Setting of Hypospadias Surgery: Survey of Members of Society of Pediatric Urology (India).
  • Jul 1, 2024
  • Journal of Indian Association of Pediatric Surgeons
  • V V S Chandrasekharam + 3 more

To compare the accuracy of unaided visual inspection (UVI) to Software App measurement (SAM) of penile curvature (PC) during hypospadias surgery. Seven clinical pictures of PC (15°-60°) taken during hypospadias repair were shared with 300 members of the Society of Pediatric Urology (India). The respondents were asked to assess the angles by UVI and indicate their preferred correction method of that PC. For each picture, the angles of curvature estimated by UVI were compared with the objective angle measured using an app (SAM), which was considered an accurate estimation. Statistical analysis was done using software; P<0.05 was considered as statistically significant. Ninety-one of 101 (90%) respondents preferred UVI to measure PC during hypospadias surgery. For 6/7 pictures, <40% of participants estimated the angle correctly by UVI (P < 0.001), with the difference in estimation being 3.6°-14.9°. For pictures with PC >30°, the error in UVI estimation was >10°, with no correlation between the accuracy of UVI estimate and surgeon experience. A significant proportion of surgeons chose the incorrect option for PC correction, which was the lowest (69%) for PC 35.8°. Most surgeons preferred UVI to assess PC; UVI is an erroneous technique to measure PC angle, especially in the PC range 30°-60°, where the error was >10°. Most errors were an underestimation of the PC, irrespective of surgeon experience. There was a significant error in the choice of technique for PC correction for a PC of 35°. These results strongly support the objective assessment of PC using SAM during hypospadias repair.

  • Research Article
  • Citations: 26
  • 10.1111/emip.12590
Using OpenAI GPT to Generate Reading Comprehension Items
  • Jan 24, 2024
  • Educational Measurement: Issues and Practice
  • Ayfer Sayin + 1 more

The purpose of this study is to introduce and evaluate a method for generating reading comprehension items using template‐based automatic item generation. To begin, we describe a new model for generating reading comprehension items called the text analysis cognitive model assessing inferential skills across different reading passages. Next, the text analysis cognitive model is used to generate reading comprehension items where examinees are required to read a passage and identify the irrelevant sentence. The sentences for the generated passages were created using OpenAI GPT‐3.5. Finally, the quality of the generated items was evaluated. The generated items were reviewed by three subject‐matter experts. The generated items were also administered to a sample of 1,607 Grade‐8 students. The correct options for the generated items produced a similar level of difficulty and yielded strong discrimination power while the incorrect options served as effective distractors. Implications of augmented intelligence for item development are discussed.

  • Research Article
  • Citations: 2
  • 10.1177/07342829231167892
The Features of Plausible but Incorrect Options: Distractor Plausibility in Synonym-Based Vocabulary Tests
  • Apr 6, 2023
  • Journal of Psychoeducational Assessment
  • Ulrich Ludewig + 2 more

A better understanding of how distractor features influence the plausibility of distractors is essential for an efficient multiple-choice (MC) item construction in educational assessment. The plausibility of distractors has a major influence on the psychometric characteristics of MC items. Our analysis utilizes the nominal categories model to investigate German fourth graders' (N = 924) selection of response options in a German MC Vocabulary test. We used principles from cognitive psychology to identify relevant option features capturing the option’s potential to distract students from the correct answer. The results show that only a few option characteristics explain option choice behavior to a large extent. Options with distracting features (i.e., semantic relatedness and orthographic similarity) increase the item difficulty and discrimination, whereas distractors that are less synonym than the attractor decrease item discrimination. Implications for test score interpretations and item construction guidelines are highlighted.

  • Research Article
  • 10.5406/21638195.95.1.03
The Nightmare Island: Representations of St. Barthélemy in Swedish Novels
  • Apr 1, 2023
  • Scandinavian Studies
  • Ale Pålsson

  • Research Article
  • Cite Count Icon 6
  • 10.1016/j.burns.2023.01.007
Public perception of household risks for pediatric burn injuries and assessment of management readiness
  • Jan 27, 2023
  • Burns
  • Tomer Lagziel + 9 more

