An evaluation on large language model outputs: Discourse and memorization

Adrian De Wynter, Xun Wang, Alex Sokolov, Qilong Gu, Si-Qing Chen

doi:10.1016/j.nlp.2023.100024

Open Access

Journal: Natural Language Processing Journal
Publication Date: Jul 5, 2023
Citations: 5
License type: CC BY-NC-ND

Affiliation: Microsoft (United States), University of York

#Large Language Models #Evaluate Mitigation Strategies

Abstract

We present an empirical evaluation of various outputs generated by nine of the most widely available large language models (LLMs). Our analysis is done with off-the-shelf, readily available tools. We find a correlation between percentage of memorized text, percentage of unique text, and overall output quality, when measured with respect to output pathologies such as counterfactual and logically flawed statements, and general failures like not staying on topic. Overall, 80.0% of the outputs evaluated contained memorized data, but outputs containing the most memorized content were also more likely to be considered of high quality. We discuss and evaluate mitigation strategies, showing that, in the models evaluated, the rate of memorized text being output is reduced.
