Abstract

Background
Copy Number Variations (CNVs) are critical genetic markers in diversity and disease, yet their accurate extraction from medical literature remains challenging due to the complexity of genetic data. While specialized NLP models like CNV-ETLAI have been developed for this task, the advent of Large Language Models (LLMs) such as GPT-4 presents a potential alternative with broader applicability. This study evaluates the efficacy of GPT-4 against CNV-ETLAI in extracting CNVs from medical journal articles, aiming to enhance genetic research and clinical decision-making.

Methods
We configured GPT-4 to process and interpret medical journal PDFs, designing custom prompts for CNV information extraction. The performance of GPT-4 was benchmarked against CNV-ETLAI using a dataset of 146 true positive CNVs extracted from 23 journal articles. Performance metrics focused on accuracy in extracting CNVs from both text and tables, recognizing the importance of structured data interpretation in genomic analysis.

Results
CNV-ETLAI demonstrated superior accuracy, achieving a 98% success rate in CNV extraction, compared to GPT-4's 49%. Specifically, CNV-ETLAI outperformed GPT-4 in table extraction accuracy (99% vs. 41.2%) and context extraction accuracy (96% vs. 63.2%). Despite GPT-4's lower performance, its capacity for improvement and adaptability was noted, indicating potential future applicability in medical data extraction.

Conclusions
The study highlights CNV-ETLAI's current superiority in extracting CNVs from medical texts, particularly in interpreting structured data like tables. However, the adaptability and potential for growth in LLMs like GPT-4 suggest they could soon become valuable tools for medical data extraction, offering a more versatile and powerful solution across a broader range of applications.
The promise of LLMs, despite their current limitations, underscores the need for continued research and development in AI technologies for genomic data interpretation.
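
The accuracy figures reported in the Results can be read as the fraction of gold-standard CNVs that a system recovers, overall and split by where the article reports them. The following is a minimal sketch of such a metric, not the study's actual evaluation code. The `CNV` record layout, the exact-match rule, the `extraction_accuracy` helper, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class CNV(NamedTuple):
    # Hypothetical record layout; the study's actual schema is not given.
    chrom: str
    start: int
    end: int
    kind: str    # "gain" or "loss"
    source: str  # "table" or "text": where the article reports the CNV


def extraction_accuracy(gold, predicted):
    """Return the fraction of gold CNVs recovered, overall and per source."""
    # Assumption: a CNV counts as recovered only on an exact match of
    # chromosome, coordinates, and type. The source field is used for
    # grouping the gold records, not for matching.
    found = {(p.chrom, p.start, p.end, p.kind) for p in predicted}
    hits = defaultdict(int)
    totals = defaultdict(int)
    for g in gold:
        for key in ("overall", g.source):
            totals[key] += 1
            if (g.chrom, g.start, g.end, g.kind) in found:
                hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Toy data for illustration only; not taken from the study.
gold = [
    CNV("chr1", 1000, 5000, "loss", "table"),
    CNV("chr2", 2000, 8000, "gain", "table"),
    CNV("chr7", 300, 900, "gain", "text"),
    CNV("chrX", 10, 70, "loss", "text"),
]
predicted = [gold[0], gold[2], gold[3]]

scores = extraction_accuracy(gold, predicted)
print(scores)
```

Exact matching is the strictest choice. A real evaluation might instead allow coordinate tolerance or normalize genome builds before comparing, which would change the resulting percentages.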