Generation Method Research Articles

Ongoing research attempts to benchmark large language models (LLMs) against physicians' fund of knowledge by assessing LLM performance on medical examinations. No prior study has assessed LLM performance on internal medicine (IM) board examination questions, and limited data exist on how knowledge derived from medical texts and supplied to the models improves LLM performance. The performance of GPT-3.5, GPT-4.0, LaMDA, and Llama 2, with and without additional model input augmentation, was assessed on 240 randomly selected IM board-style questions. Questions were sourced from the Medical Knowledge Self-Assessment Program (MKSAP) released by the American College of Physicians, with each question serving as part of the LLM prompt. When available, LLMs were accessed both through their application programming interface (API) and through their corresponding chatbot. Model inputs were augmented with Harrison's Principles of Internal Medicine using retrieval-augmented generation. LLM-generated explanations for 25 correctly answered questions were presented in a blinded fashion alongside the MKSAP explanations to an IM board-certified physician tasked with selecting the human-generated response. GPT-4.0, accessed through either Bing Chat or its API, scored 77.5-80.7%, outperforming GPT-3.5, human respondents, LaMDA, and Llama 2, in that order. GPT-4.0 outperformed human MKSAP users on every tested IM subject, with its lowest and highest percentile scores in Infectious Disease (80th) and Rheumatology (99.7th), respectively. Performance of both GPT-3.5 and GPT-4.0 decreased by 3.2-5.3% when the LLM was accessed through its API instead of its online chatbot, and increased by 4.5-7.5% when the API inputs were additionally augmented. The blinded reviewer correctly identified the human-generated MKSAP response in 72% of the 25-question sample set. GPT-4.0 performed best on IM board-style questions, outperforming human respondents. Augmenting with domain-specific information improved performance, making retrieval-augmented generation a possible technique for improving the accuracy of LLM responses on medical examinations.
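
The abstract does not describe how passages were retrieved or how the prompt was assembled, so the following is only a minimal sketch of the general idea of retrieval-augmented prompting, not the study's pipeline. The word-overlap scoring, the function names (`overlap_score`, `retrieve`, `build_prompt`), the prompt wording, and the placeholder passages are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Count tokens shared by query and passage (toy relevance score, an assumption)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved passages to the question (hypothetical prompt layout)."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        f"Reference passages:\n{context}\n\n"
        f"Question: {question}\nAnswer with a single option letter."
    )


# Placeholder passages standing in for excerpts from a reference text.
passages = [
    "Placeholder text about gout and uric acid crystals in joints.",
    "Placeholder text about bacterial pneumonia and antibiotics.",
    "Placeholder text about thyroid hormone levels.",
]
question = "Which crystals are found in gout joints?"
prompt = build_prompt(question, passages, k=1)
```

A production system would more likely rank passages with dense embeddings than with word overlap; the overlap score is used here only to keep the sketch self-contained.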

Objective: Autonomous vehicles (AVs) have the potential to revolutionize the future of mobility by significantly improving traffic safety. This study presents a novel method for validating the safety performance of AVs in high-risk scenarios involving powered 2-wheelers (PTWs). By generating high-risk scenarios from in-depth crash data, the study addresses a limitation of public-road testing: public-road scenarios often lack the complexity and risk needed to evaluate the capabilities of AVs in high-risk situations. Method: Our approach employs a Wasserstein generative adversarial network (WGAN) to generate high-risk scenes, focusing on PTW scenarios. We extract 314 car-to-PTW crashes from the China In-depth Mobility Safety Study–Traffic Accident database and simulate them using PC-Crash software. The data are divided into scenes at 0.1-s intervals, from which the WGAN generates numerous high-risk scenes. Using a cumulative distribution function (CDF), we sample and analyze the vehicle’s dynamic information to generate complete scenarios applicable to testing. The generated scenarios are validated on the SVL Simulator and Baidu Apollo joint simulation platform by evaluating the AV’s driving behavior and its interactions with PTWs. Results: Model generation results are evaluated by comparing distributions, using Wasserstein distance as the indicator. The generator converges after approximately 200 epochs, with the iterator converging quickly. Subsequently, 10,000 new scenes are generated. The distributions of several key parameters in the generated scenes approximate those of the original scenes. After sampling, 64.76% of the generated scenarios are usable. Virtual simulations confirm the effectiveness of the scenario generation method: the crash rate of the generated scenarios, 16.50%, closely reflects the original rate of 15.0%, showing that the method can produce realistic and hazardous scenarios. Conclusions: The experimental results suggest that these scenarios exhibit a level of risk similar to the original crashes and are effective for testing AVs. Consequently, the generated scenarios enhance the diversity of the scenario library and accelerate the overall testing process of AVs.
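
The abstract names two building blocks, CDF-based sampling and Wasserstein distance as a distribution-comparison indicator, without giving implementation details. The sketch below illustrates one common reading of each for a single scalar parameter: inverse-transform sampling from an empirical CDF, and the one-dimensional Wasserstein-1 distance between two equal-size samples. The function names and the toy values are assumptions for illustration and are not taken from the study.

```python
import bisect
import random


def empirical_cdf(samples):
    """Return sorted values and their cumulative probabilities i/n."""
    xs = sorted(samples)
    n = len(xs)
    probs = [(i + 1) / n for i in range(n)]
    return xs, probs


def sample_from_cdf(xs, probs, rng):
    """Inverse-transform sampling: draw u in [0, 1) and return the first
    value whose cumulative probability is at least u."""
    u = rng.random()
    idx = bisect.bisect_left(probs, u)
    return xs[min(idx, len(xs) - 1)]


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1D samples:
    the mean absolute difference of the sorted values."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Toy values standing in for one dynamic parameter (hypothetical data).
original = [8.0, 10.0, 12.0, 14.0]
xs, probs = empirical_cdf(original)
rng = random.Random(0)
resampled = [sample_from_cdf(xs, probs, rng) for _ in range(4)]
distance = wasserstein_1d(original, resampled)
```

A smaller distance indicates that the resampled values follow the original distribution more closely; for multivariate scene data, a different estimator would be needed.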

Articles published on Generation Method

Performance of Publicly Available Large Language Models on Internal Medicine Board-style Questions.

High-value application of kaolin by wet mixing method in low heat generation and high wear-resistant natural rubber composites

Memory-Based Learning and Fusion Attention for Few-Shot Food Image Generation Method

Is It Hard to Generate Holistic Commit Message?

Comparing AIGC and traditional idea generation methods: Evaluating their impact on creativity in the product design ideation phase

Recent advances of 5-endo-trig radical cyclization: promoting strategies and applications.

Guiding Diffusion Models for Antibody Sequence and Structure Co-design with Developability Properties

A method for selecting the type of energy storage for power systems with high penetration of renewable energy with multi-application scenarios

Target-Specific De Novo Peptide Binder Design with DiffPepBuilder.

High-risk powered two-wheelers scenarios generation for autonomous vehicle testing using WGAN

Multiple-ResNet GAN: An enhanced high-resolution image generation method for translation from fundus structure image to fluorescein angiography.

Precision strike: Precise backdoor attack with dynamic trigger

An approach to generate cross-polarization modulation-enabled optical frequency comb with enhanced spectral flatness in traveling-wave semiconductor optical amplifiers

Brain tumoroids: treatment prediction and drug development for brain tumors with fast, reproducible and easy-to-use personalized models.

APMG: 3D Molecule Generation Driven by Atomic Chemical Properties.

A high‐precision timing and frequency synchronization algorithm for multi‐h CPM signals

RT-SNDETR: real-time supernova detection via end-to-end image transformers

Towards similar alignment and unique uniformity in collaborative filtering

RESEARCH ON THE CONSTRUCTION OF AI COMPOSITION SYSTEM BASED ON HMM

Does Differentially Private Synthetic Data Lead to Synthetic Discoveries?
