Abstract: The success of large language models (LLMs) with attention mechanisms in natural language processing has inspired a series of LLMs for single-cell analysis, such as scGPT. These models are reported to perform well as foundation models across various downstream tasks. This study aims to validate scGPT and explore its potential in cancer research using real-world clinical trial data. We assessed scGPT's performance in both zero-shot and fine-tuned scenarios on various tasks using single-nucleus sequencing data from pancreatic ductal adenocarcinoma (PDAC) patients undergoing different treatments, including patients treated at the Dana-Farber Cancer Institute. Our methodology encompassed a comprehensive evaluation of scGPT's capabilities, including zero-shot clustering, gene expression prediction, and cell reconstruction. We fine-tuned the model for downstream tasks such as cell type annotation, treatment perturbation prediction, and gene regulatory network inference. Additionally, we extracted and analyzed multi-head, multi-layer embeddings and attention matrices, visualizing the flow of information within the model and how these representations respond to different perturbations or fine-tuning objectives, in order to investigate the model's learning process and its correlation with biological information. The study used two main datasets for fine-tuning and evaluating scGPT: an unpublished clinical trial dataset featuring metastatic PDAC patients under three comparative treatment arms with pre- and on-treatment single-nucleus sequencing data, and a published clinical trial dataset containing sequencing data from PDAC patients who underwent various treatment regimens. Our results validate scGPT's potential to effectively extract biological information as a foundation model for single-cell biology. In the zero-shot setting, scGPT performed strongly on tasks such as clustering but was suboptimal on others, such as gene expression prediction and cell reconstruction.
In addition, fine-tuning significantly boosted the model's capabilities across various tasks. The model responded well to multiple fine-tuning objectives, capturing information that distinguishes between individuals and treatments, and accurately predicting patient and treatment groups at the single-cell level. The analysis of model structure and information flow revealed that multi-head attention and representations could, to some extent, capture biological information. For example, the model's attention to certain genes varied across heads and layers and shifted toward specific patterns in response to the fine-tuning objectives. Our analysis of the model's architecture and attention mechanisms offers preliminary insights into the relationship between model behavior and biological processes. It also provides a framework for interpreting complex biological information through the lens of attention mechanisms in single-cell LLMs, paving the way for future studies exploring the intersection of foundation models and cancer biology, potentially leading to transformative progress in clinical cancer care.

Citation Format: Runzi Tan, Haotian Cui, Bo Wang, Kimberly Perez, Andressa Dias Costa, Alexander Jordan, Thomas Karacic, Dalia Elganainy, Dan Y Gui, Suryun Kim, Chen Yuan, Morgan Truitt, Michael Downes, Ronald Evans, Tae Gyu Oh, Peter O’Dwyer, Andrew Aguirre, Jonathan A Nowak, Brian Wolpin, Simona Cristea. Evaluating and interpreting scGPT: A foundation model for single-cell biology in real-world cancer clinical trial data [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Advances in Pancreatic Cancer Research; 2024 Sep 15-18; Boston, MA. Philadelphia (PA): AACR; Cancer Res 2024;84(17 Suppl_2):Abstract nr A029.