Background: Electronic health records (EHRs) store valuable clinical data, often as unstructured narrative text that must be extracted manually. Manual extraction is time-consuming, costly, and error-prone. Natural language processing (NLP) offers a promising way to automate data extraction and improve research efficiency while maintaining accuracy, but its generalizability and reliability remain under active investigation. This study evaluates the performance of an NLP tool for extracting clinical conditions, medications with dosage, and echocardiographic parameters, compared with manual retrieval.
Objective: To assess the accuracy, sensitivity, and specificity of an NLP tool for extracting clinical data from unstructured EHR narratives, using manual data extraction as the comparator.
Methods: This prospective study was conducted in three tertiary care hospitals in Punjab, Pakistan, from December 2023 to May 2024. A total of 500 participants were included, stratified by urban (340, 68%) and rural (160, 32%) residency. The NLP tool was evaluated on 5,700 data points across three categories: 3,000 clinical conditions, 1,500 medications with dosage, and 1,200 echocardiographic parameters. Accuracy, sensitivity, and specificity were calculated by comparing the tool's results with manual retrieval. Discrepancies were analyzed to identify root causes, including algorithmic and human errors.
Results: The NLP tool achieved an accuracy of 98.5%, a sensitivity of 96.7%, and a specificity of 97.2%, closely aligning with manual retrieval at 99.0%, 97.5%, and 97.8%, respectively. For clinical conditions, the tool correctly retrieved 2,955 of 3,000 data points (98.5%), compared with 2,970 (99.0%) for manual retrieval. For medications with dosage, the tool correctly retrieved 1,452 of 1,500 data points (96.8%), compared with 1,488 (99.2%) manually. Similarly, the tool correctly retrieved 1,178 of 1,200 echocardiographic parameters (98.2%), compared with 1,185 (98.8%) manually. Urban participants (242 males, 98 females) outnumbered rural participants (106 males, 54 females), and most participants (75%) were aged 31–70 years.
Conclusion: The NLP tool demonstrated high accuracy, close to that of manual retrieval, in extracting structured data from unstructured EHR narratives. Its performance across clinical conditions, medications, and echocardiographic parameters highlights its potential to streamline clinical research while reducing manual workload. Further refinement is required to address context-sensitive errors and to improve generalizability across diverse datasets.
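The per-category percentages reported in the Results can be reproduced from the stated counts. The following is a minimal sketch, not the authors' analysis code: the names `CategoryCounts`, `accuracy_pct`, `pooled_accuracy_pct`, `sensitivity`, and `specificity` are hypothetical, and the pooled figure assumes every data point is weighted equally. The abstract gives no confusion-matrix counts, so the sensitivity and specificity helpers show only the standard definitions and are not applied to any data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryCounts:
    """Counts taken from the abstract for one data category."""
    name: str
    total: int           # data points in the category
    nlp_correct: int     # correctly retrieved by the NLP tool
    manual_correct: int  # correctly retrieved manually


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage: correct / total * 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def sensitivity(tp: int, fn: int) -> float:
    """Standard definition: TP / (TP + FN). Counts are not given in the abstract."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Standard definition: TN / (TN + FP). Counts are not given in the abstract."""
    return tn / (tn + fp)


# Counts as reported in the Results section of the abstract.
CATEGORIES = [
    CategoryCounts("clinical conditions", 3000, 2955, 2970),
    CategoryCounts("medications with dosage", 1500, 1452, 1488),
    CategoryCounts("echocardiographic parameters", 1200, 1178, 1185),
]


def pooled_accuracy_pct(categories, method: str) -> float:
    """Pool all categories, weighting each data point equally (an assumption)."""
    attr = {"nlp": "nlp_correct", "manual": "manual_correct"}[method]
    correct = sum(getattr(c, attr) for c in categories)
    total = sum(c.total for c in categories)
    return accuracy_pct(correct, total)


for c in CATEGORIES:
    nlp = accuracy_pct(c.nlp_correct, c.total)
    manual = accuracy_pct(c.manual_correct, c.total)
    print(f"{c.name}: NLP {nlp:.1f}%, manual {manual:.1f}%")

print(f"pooled: NLP {pooled_accuracy_pct(CATEGORIES, 'nlp'):.2f}%, "
      f"manual {pooled_accuracy_pct(CATEGORIES, 'manual'):.2f}%")
```

Under the equal-weighting assumption, the pooled manual accuracy comes to 99.0%, matching the reported overall figure, while the pooled NLP accuracy comes to about 98.0%, slightly below the reported 98.5%. The abstract does not state how the overall NLP figure was aggregated.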