The objective was to systematically assess the ChatGPT large language model on diverse tasks relevant to pharmacokinetic data analysis. ChatGPT was evaluated with prototypical tasks related to report writing, code generation, non-compartmental analysis, and pharmacokinetic word problems. The writing task consisted of writing an introduction for this paper from a draft title. The coding tasks consisted of generating R code for semi-logarithmic graphing of concentration-time profiles and for calculating the area under the curve and the area under the moment curve from time zero to infinity. Pharmacokinetic word problems on single intravenous, extravascular bolus, and multiple dosing were taken from a pharmacokinetics textbook. Chain-of-thought prompting and problem separation were assessed as prompt engineering strategies when errors occurred. ChatGPT performed satisfactorily on the report writing and code generation tasks and provided accurate information on the principles and methods underlying pharmacokinetic data analysis. However, ChatGPT had high error rates in numerical calculations involving exponential functions. The outputs generated by ChatGPT were not reproducible: for different instances of the same prompt, the precise content of the output varied, although it was not necessarily erroneous. Incorporating prompt engineering strategies reduced but did not eliminate errors in numerical calculations. ChatGPT has the potential to become a powerful productivity tool for writing, knowledge encapsulation, and coding tasks in pharmacokinetic data analysis. The poor accuracy of ChatGPT in numerical calculations requires resolution before it can be reliably used for pharmacokinetic and pharmacometric data analysis.
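The non-compartmental quantities named in the coding task can be illustrated with a minimal sketch. It is written in Python rather than R and is not the code generated in the study; the function name `nca_auc_aumc_inf`, the `n_terminal` parameter, and the choice of the linear trapezoidal rule with log-linear terminal extrapolation are assumptions made for illustration only.

```python
import math


def nca_auc_aumc_inf(times, concs, n_terminal=3):
    """Hypothetical helper: AUC and AUMC from time zero to infinity.

    Assumes the linear trapezoidal rule up to the last sample and a
    mono-exponential terminal phase fitted to the last `n_terminal` points.
    """
    if len(times) != len(concs):
        raise ValueError("times and concs must have the same length")
    if n_terminal < 2 or len(times) < n_terminal:
        raise ValueError("need at least two terminal points")

    # Linear trapezoidal rule for AUC(0-t) and AUMC(0-t).
    auc_t = 0.0
    aumc_t = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        auc_t += dt * (concs[i] + concs[i + 1]) / 2.0
        aumc_t += dt * (times[i] * concs[i] + times[i + 1] * concs[i + 1]) / 2.0

    # Terminal rate constant: least-squares slope of ln(C) versus time.
    t_term = times[-n_terminal:]
    ln_c = [math.log(c) for c in concs[-n_terminal:]]
    t_mean = sum(t_term) / n_terminal
    y_mean = sum(ln_c) / n_terminal
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_term, ln_c))
    sxx = sum((t - t_mean) ** 2 for t in t_term)
    lambda_z = -sxy / sxx
    if lambda_z <= 0:
        raise ValueError("terminal slope is not negative; cannot extrapolate")

    # Extrapolate from the last observed concentration to infinity.
    t_last, c_last = times[-1], concs[-1]
    auc_inf = auc_t + c_last / lambda_z
    aumc_inf = aumc_t + c_last * t_last / lambda_z + c_last / lambda_z ** 2

    return {"lambda_z": lambda_z, "auc_inf": auc_inf, "aumc_inf": aumc_inf}


# Illustrative synthetic profile (not study data): C(t) = 10 * exp(-0.1 * t).
example_times = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0]
example_concs = [10.0 * math.exp(-0.1 * t) for t in example_times]
example_result = nca_auc_aumc_inf(example_times, example_concs)
```

For a mono-exponential profile the analytical values are AUC = C0/k and AUMC = C0/k², so the trapezoidal estimates can be checked against them. Sparse late sampling biases the linear trapezoidal rule, which is one reason a log-linear rule is sometimes preferred for the declining phase.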