Cancer Research Research Articles

Overview
143382 Articles

Published in last 50 years

Related Topics

  • Clinical Cancer Research
  • Translational Cancer Research
  • Lung Cancer Research

Articles published on Cancer Research

136967 Search results
Mapping the landscape of locally advanced colon cancer research: a 30-year bibliometric perspective

  • Journal: Discover Oncology
  • Publication date: Jul 16, 2025
  • Authors: Yanwu Sun + 6

Advancing prostate cancer research: an exploration of periprostatic adipose stem cells.

Prostate cancer (PCa) is the most prevalent cancer among men, highlighting the urgent need for innovative treatment strategies. The periprostatic adipose tissue (PPAT) plays a crucial role in the PCa tumor microenvironment, with direct crosstalk between PPAT and PCa cells, particularly in advanced stages with extraprostatic extension, a feature linked to poor prognosis. Owing to their migratory capacity, adipose stem cells (ASCs) are promising in regenerative medicine and play a key role in tissue engineering and cancer research. These findings offer potential for novel approaches in targeted drug delivery and gene therapy for PCa. While ASCs within PPAT influence the tumor stroma, the mechanisms behind their interactions with PCa cells are not fully understood, with studies reporting both inhibitory and promoting effects on cancer progression. The adipose tissue secretome, including PPAT-ASC exosomal proteins, mediates communication between PPAT and PCa cells, with exosomal dysregulation observed in stage T3 PCa. This dysregulation implicates key cancer pathways such as integrin-mediated cell interactions, epithelial-mesenchymal transition, and mRNA stability regulation. Although ASCs show promise as therapeutic carriers, their use is complicated by the need to prevent unwanted interactions with cancer cells. Moreover, environmental contaminants such as endocrine disruptors can alter ASC behavior, potentially influencing PCa development. This review synthesizes current knowledge on the multifaceted roles of ASCs and ASC-derived exosomes in PCa biology, their therapeutic applications, and the impact of environmental toxicants on their function and cancer-related outcomes. Further research into the underlying biological mechanisms is needed, highlighting the need for safe, targeted therapeutic approaches in PCa treatment.

  • Journal: Journal of translational medicine
  • Publication date: Jul 14, 2025
  • Authors: Paula Alejandra Sacca + 1

Process evaluation for the STAMINA randomised controlled trial: A protocol

Background: STAMINA is a randomised controlled trial of a complex lifestyle intervention incorporating exercise prescription into a prostate cancer care pathway. The 12-month intervention aims to improve disease specific quality of life and reduce fatigue of people receiving androgen deprivation therapy for prostate cancer. Previously published work outlines the development of the trial intervention which included recruitment and training of healthcare professionals and exercise professionals to embed a lifestyle intervention and referral pathway within NHS prostate cancer care. Methods: A mixed-methods process evaluation, embedded within the STAMINA trial, will be conducted to assess quantitative process outcomes (recruitment, intervention reach, dose and fidelity), together with up to 45 qualitative interviews with patients, healthcare professionals and exercise professionals. Interviews will explore the perceptions and experiences of those involved in the STAMINA trial, and the organisational implications of embedding and sustaining the intervention. Quantitative process data will be analysed descriptively. Qualitative interview data will be analysed before trial outcomes are known using an inductive and deductive approach. Findings from the different elements will be reported separately and then integrated to inform interpretation of trial outcomes. Conclusion: This process evaluation protocol provides a detailed description of relevant data collection methods and trial processes of the STAMINA randomised controlled trial which will allow us to determine whether the intervention can be delivered with fidelity, is acceptable to patients, healthcare professionals and exercise professionals, and understand the implications for embedding and sustaining the intervention in routine care. Trial registration: ISRCTN 46385239, registered on 30/07/2020. Cancer Research UK 17002, retrospectively registered on 24/08/2022.

  • Journal: PLOS One
  • Publication date: Jul 14, 2025
  • Authors: Saïd Ibeggazene + 11

Factor Associated with Work-Related Musculoskeletal Injuries Among Health Care Professionals Working in the Operation Room at Shaukat Khanum Cancer Memorial Hospital and Research Center

Background: Work-related musculoskeletal injuries (MSIs) represent a significant occupational hazard among healthcare professionals, particularly those working in operating rooms (OR), where prolonged static postures, repetitive movements, and high physical demands increase the risk of musculoskeletal disorders, potentially compromising worker health and patient care quality. Objective: This study aimed to assess the occurrence and severity of MSIs among OR healthcare professionals at Shaukat Khanum Memorial Cancer Hospital and Research Center and to determine associations between MSIs and demographic and occupational factors. Methods: An observational analytical study was conducted over four months among OR staff, employing universal sampling. Data were collected via a structured, validated questionnaire capturing demographics, work tasks, and MSI symptoms across various body regions. Statistical analyses, including Chi-square tests and calculation of odds ratios, were performed using SPSS version 20 to examine associations between MSIs and demographic or occupational factors, with significance set at p < 0.05. Results: MSIs were reported by 40% of participants, with the upper back (46.5%), neck (33.5%), and wrists/hands (23.5%) most frequently affected. Significant associations were observed between MSIs and age (p=0.015), profession (p=0.011), work experience (p=0.003), patient lifting (p=0.021), instrument retraction (p=0.047), and poor posture during procedures (p=0.001). Conclusion: High MSI prevalence among OR staff underscores the need for ergonomic interventions and training programs to reduce occupational injuries and enhance healthcare workforce sustainability and patient safety.

  • Journal: Journal of Health, Wellness, and Community Research
  • Publication date: Jul 12, 2025
  • Authors: Fozia Ali + 4
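
The abstract above reports Chi-square tests and odds ratios for associations between MSIs and occupational factors. As a rough illustration only, the sketch below computes an uncorrected Pearson chi-square statistic, its p-value (1 degree of freedom), and an odds ratio with a Wald 95% confidence interval for a single 2x2 table. The function name `two_by_two` and the cell counts are hypothetical and are not taken from the study, which used SPSS.

```python
import math

def two_by_two(a, b, c, d, z=1.96):
    """Association measures for a 2x2 table (illustrative helper, not from the study).

    a: exposed with injury      b: exposed without injury
    c: unexposed with injury    d: unexposed without injury
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    n = a + b + c + d
    # Pearson chi-square without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail p-value of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Odds ratio with a Wald confidence interval on the log scale
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return chi2, p_value, odds_ratio, (lower, upper)

# Hypothetical counts: 30 of 50 exposed staff and 20 of 50 unexposed staff report an injury
chi2, p_value, odds_ratio, ci = two_by_two(30, 20, 20, 30)
print(f"chi2={chi2:.2f} p={p_value:.4f} OR={odds_ratio:.2f} CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

A statistics package may also report a continuity-corrected chi-square, which would give a somewhat larger p-value for the same table.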

Simultaneous clustering and joint modeling of multivariate binary longitudinal and time-to-event data.

Joint modeling of longitudinal outcomes and time-to-event data has been extensively used in medical studies because it can simultaneously model the longitudinal trajectories and assess their effects on the event-time. However, in many applications we come across heterogeneous populations, and therefore the subjects need to be clustered for a powerful statistical inference. We consider multivariate binary longitudinal outcomes for which we use Bayesian data-augmentation and get the corresponding latent continuous outcomes. These latent outcomes are clustered using Bayesian consensus clustering, and then we perform a cluster-specific joint analysis. Longitudinal outcomes are modeled by generalized linear mixed models, and we use the proportional hazards model for modeling time-to-event data. Our work is motivated by a clinical trial conducted by Tata Translational Cancer Research Center, Kolkata, where 184 cancer patients were treated for the first two years, and then were followed for the next three years. Three biomarkers (lymphocyte count, neutrophil count and platelet count), categorized as normal/abnormal, were measured during the treatment, and the relapse time (if any) was recorded for each patient. Our analysis finds three latent clusters for which the effects of the covariates and the median non-relapse probabilities substantially differ. Through a simulation study we illustrate the effectiveness of the proposed simultaneous clustering and joint modeling.

  • Journal: Lifetime data analysis
  • Publication date: Jul 12, 2025
  • Authors: Srijan Chattopadhyay + 4

The state of cancer research and its association with the cancer burden in Ecuador: a bibliometric study

Purpose: Cancer has emerged as a major public health concern in Ecuador, reflecting global trends. Thus, it is imperative to understand the country's cancer research landscape. We aim to conduct a bibliometric analysis of Ecuadorian cancer research publications from 2008 to 2021 to identify research trends, institutional contributions, international collaborations, and the association with the national cancer burden. Methods: Articles were retrieved from Scopus, PubMed, and LILACS databases. Descriptive statistics and chi-square tests were employed to analyze different bibliometric indicators. Results: A marked increase in cancer-related research output was observed, particularly after 2014. The most common study designs were case reports (n = 244, 30.7%), cross-sectional studies (n = 174, 21.9%) and review articles (n = 131, 16.5%). Universities were the main contributors to national cancer research, accounting for 32.4% (n = 256) of all publications, with private institutions more frequently publishing in higher-ranked journals. Collaborative efforts between universities and hospitals represented 25.3% (n = 200) of publications, though 45.1% of these were indexed in the lowest SCImago Journal Rank quartile (Q4). The most frequently studied cancer types by body location/system were gastrointestinal, gynecologic, and breast cancer. This trend contrasts with national cancer statistics reported in 2022, in which the most common cancer types were breast, prostate (genitourinary), and stomach (gastrointestinal) cancers. Conclusion: Our study provides a comprehensive overview of oncology research in Ecuador over a 14-year period. While research output has increased, there remains a need to enhance research quality and ensure closer alignment with the country’s primary cancer burdens to better inform national cancer control strategies.

  • Journal: Discover Oncology
  • Publication date: Jul 11, 2025
  • Authors: Santiago D Padilla-Sánchez + 2

Significance of Artificial Intelligence in Animal Disease Recognition

Abstract: Artificial intelligence (AI) is a rapidly expanding field of innovative technology that has great potential to transform many different scientific and technological fields. AI can be used in veterinary treatment and animal disease management to produce better outcomes for people and animals. AI can help in many disciplines, including genetics, cancer research, epidemiology, disease surveillance, therapy and vaccine development, studies on antimicrobial resistance (AMR), and is presented as an essential tool to address worldwide health issues in many fields. Most investigational AI-driven animal care research focuses on data collection, processing, evaluation, and analysis for animal behaviour detection, disease surveillance, growth estimation, and environmental monitoring. This paper describes and investigates the potential consequences of different elements of AI on animal disease and how AI is developing across many disciplines; the most prominent are deep learning and machine learning. “Machine learning” (ML) can be used to create models capable of predicting the future through algorithms that discover patterns in data. The development of AI technologies has sped up the process of drug discovery by locating possible therapeutic targets and improving candidate medications. This paper discusses these advancements while also analyzing the opportunities that lie ahead for artificial intelligence in the field of animal disease control. We also highlight the potential of AI to preserve the wellness of humans and animals across nations, highlighting the role AI plays in advancing the management of animal illnesses.

  • Journal: Current Signal Transduction Therapy
  • Publication date: Jul 11, 2025
  • Authors: Ravinder Kumar + 2

An evidence gap map of the personalized medicine in bladder cancer.

The study aims to develop an Evidence Gap Map (EGM) to summarize the current evidence cited in personalized medicine (PM) in bladder cancer, focusing on systematic reviews and high-level evidence syntheses. The review involved a comprehensive search in databases up to June 2024, and involved a two-phase analysis, using data from PubMed for scientometric analysis and R Studio with the Biblioshiny tool for co-occurring word network analysis. After filtering out irrelevant articles, the selection was narrowed to 3,705 items. The most frequently occurring words were aged, middle-aged, animals, cell lines, tumor, prognosis, gene expression, and mice. The study identified gaps and under-researched categories in PM and bladder cancer research, with Immunotherapy, Neoadjuvant/Adjuvant Therapy, and Gene Therapy being the most researched areas. The evidence map revealed a predominance of low or moderate quality evidence in most domains of PM in bladder cancer, particularly within clinical trials for immunotherapy and biomarker. The field of PM in bladder cancer requires robust research methodologies and interdisciplinary collaboration to overcome challenges. By improving study design and data quality, the field can translate scientific discoveries into clinical innovations, utilizing molecular profiling and targeted therapies to enhance treatment strategies and improve outcomes.

  • Journal: Personalized medicine
  • Publication date: Jul 11, 2025
  • Authors: Hadi Mostafaei + 5

Abstract B024: Artificial intelligence enables the ethical reconstruction and social value realization of global cancer research: From technological innovation to humanistic care

Abstract The technical breakthrough of artificial intelligence (AI) in the field of oncology has moved from the laboratory to the clinic, but the realization of its social value is still facing the "last mile" dilemma. According to the WHO, there are more than 19 million new cancer cases worldwide every year, but the algorithmic advantages of AI are in sharp contrast to the uneven distribution of resources: while high-income countries are using AI to optimize personalized treatment programs, low-income regions struggle to benefit from the technical dividends due to the lack of data. This work takes the "Technology-Ethics-Fairness" framework as the starting point to explore how to build a more inclusive AI oncology research ecology through interdisciplinary cooperation. Despite the outstanding performance of AI in the fields of tumor image recognition and genomics analysis, most studies focus on technical performance optimization and ignore the impact of social and cultural differences on the implementation of algorithms. For example, the driver gene mutation characteristics of lung cancer in Asian populations are significantly different from those in Europe and the United States, but the proportion of non-European ancestry samples in the public database is less than 10%, which leads to bias when the model is applied across regions. Furthermore, the inherent "black box" nature of AI decision-making exacerbates the crisis of trust between doctors and patients, especially in areas with limited medical resources, where technical authority may override clinical experience. To foster responsible and equitable AI in oncology, we propose three key pillars so that AI research can better serve society: (1) Data Equity: Establishing a global federated learning consortium for privacy-preserving, multi-omic data sharing to enable cross-regional model training.
(2) Interpretability & Trust: Developing "decision traceability" tools that dynamically link AI outputs to clinical guidelines and supporting evidence. (3) Proactive Ethics: Integrating ethical impact assessments, informed by frameworks like the EU AI Act, into clinical trial design, including explicit metrics for equity and bias. The ultimate value of AI should not stop at improving the efficiency of diagnosis and treatment but also reshape the global collaboration network of cancer research. It is recommended to establish an international certification standard of "AI for Oncology," covering the dimensions of data representativeness, algorithm transparency, and cross-cultural adaptability. At the same time, bridging the technology gap through immersive medical education can help doctors in underdeveloped countries or regions to practice AI-assisted decision-making on 3D tumor models. As AI evolves from "technology enabler" to "ecological builder," cancer research will break through the boundaries of regions and disciplines and realize exponential growth of social value. We look forward to seeing more solutions that integrate technological innovation and humanistic care in the future. Citation Format: Zhicheng Du, Lijin Lian, Wenji Xi, Yu Zheng, Gang Yu, Hui-Yan Luo, Peiwu Qin. Artificial intelligence enables the ethical reconstruction and social value realization of global cancer research: From technological innovation to humanistic care [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B024.

  • Journal: Clinical Cancer Research
  • Publication date: Jul 10, 2025
  • Authors: Zhicheng Du + 6

Adherence to World Cancer Research Fund/American Institute for Cancer Research Guidelines and Mortality Among Participants with Colorectal Cancer in the MEC Cohort.

Racial and ethnic minority patients with colorectal cancer (CRC) are underrepresented in studies on health behavior and mortality. We examined the association between post-diagnosis health behavior and mortality in the Multiethnic Cohort (MEC), a diverse group of 215,000 participants from Hawai'i and Los Angeles (recruited 1993-1996). Follow-up was through December 31, 2019. Post-diagnosis health behavior was assessed using a modified World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) score (excluding ultra-processed foods). The primary outcome was overall mortality; CRC-specific mortality was secondary. Among 1,079 eligible participants, 489 (45.3%) were women, and 850 (78.8%) self-identified as racially/ethnically minoritized people. Over a median follow-up of 12.2 years, there were 613 all-cause deaths and 105 CRC-related deaths. Median time from diagnosis to questionnaire completion was 5 years (interquartile range, IQR: 2-8). Higher WCRF/AICR scores (4.5-7) were associated with lower risk of overall mortality compared to lower scores (≤2.25) (HR: 0.63; 95% CI: 0.45, 0.87). Risk of CRC-specific mortality was also lower but not statistically significant. Among individual health behaviors, physical activity was associated with lower risk of all-cause and CRC-specific mortality (reference: <75 min/week), with HRs of 0.59 (95% CI: 0.43, 0.81) for 75-<150 min/week and 0.51 (95% CI: 0.41, 0.64) for ≥150 min/week. Higher adherence to WCRF/AICR guidelines, particularly engaging in moderate-to-vigorous physical activity, was associated with lower risk of mortality in long-term CRC survivors. These findings support the generalizability of prior studies examining adherence to WCRF/AICR guidelines to a broader group of patients with CRC.

  • Journal: Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology
  • Publication date: Jul 10, 2025
  • Authors: Edgar Asiimwe + 10

Abstract IA03: Accelerating oncology drug discovery with the power of microscopy & AI

Abstract Cell images contain a vast amount of quantifiable information about the status of the cell: for example, whether it is diseased, whether it is responding to a drug treatment, or whether a pathway has been disrupted by a genetic mutation. We aim to go beyond measuring individual cell phenotypes that biologists already know are relevant to a particular disease. Instead, in a strategy called image-based profiling, often using the Cell Painting assay, we extract hundreds of features of cells from microscopy images. Just like transcriptional or proteomic profiling, the similarities and differences in the patterns of extracted features reveal connections among diseases, drugs, and genes, with many applications in cancer research. In fact, these strategies underpin drug discovery platform companies such as Recursion and SyzOnc. Because images are inexpensive and high-throughput, we can carry out experiments at very large scale, yielding single-cell profiles for hundreds of thousands of samples through public-private consortia (JUMP, OASIS, VISTA, NIH IGVF) and pooled barcode-based optical screens. Cell morphology is therefore a powerful data source for cancer systems biology, alongside molecular omics. Citation Format: Anne E. Carpenter. Accelerating oncology drug discovery with the power of microscopy & AI [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr IA03.

  • Journal: Clinical Cancer Research
  • Publication date: Jul 10, 2025
  • Author: Anne E Carpenter

Abstract B057: DrBioRight: an AI chat assistant enabling scalable and flexible multi-omics analysis in cancer

Abstract Over the past decade, high-throughput omics technologies have generated massive datasets from patient tumors, cell lines, and animal models, offering deep insights into disease mechanisms. However, analyzing these data remains a major challenge for researchers lacking computational expertise. Existing tools and bioinformatics cores offer partial solutions but often fall short in flexibility, scalability, or accessibility. To address these challenges, we developed DrBioRight, an AI-powered assistant that enables natural language-based analysis of multi-omics data. By integrating large-scale omics datasets, advanced analytic/visualization tools, and large language model-based AI agents into a unified chat interface, DrBioRight allows users to ask biomedical questions and receive interpretable results in real time. Already adopted by thousands of users, the platform significantly enhances data analysis efficiency in biomedical research. DrBioRight consists of three main components: (i) a data portal featuring multi-omics datasets from thousands of clinical and preclinical samples across hundreds of patient cohorts; (ii) a tool store offering customizable analytic and visualization modules; and (iii) AI agents that interpret queries, generate code, and automate workflows. The platform also supports external data uploads and community-contributed tools through easy-to-use integration APIs. To demonstrate its utility, a user might request, “Generate a heatmap for gene expression data.” DrBioRight processes the input data, identifies the most appropriate tool, and returns an interactive heatmap with features such as gene selection, zooming, scatter views, and pathway mapping. Users can then issue follow-up requests through ongoing conversation, such as correlation analysis, survival analysis, or subgroup comparisons. 
The platform also supports result summarization, plot customization, and export of both data and code, enabling flexible, iterative, and reproducible biomedical research. Overall, DrBioRight enhances research efficiency by minimizing computational barriers; fosters collaboration through shared data and tools; improves the quality and interpretability of analyses; increases transparency and reproducibility; and serves as an open-access hub for integrating and disseminating omics data and bioinformatics tools. In summary, DrBioRight is a versatile, end-to-end platform that lowers barriers to omics data analysis while accelerating discovery and collaboration in cancer research. Citation Format: Jun Li, Wei Liu, Yitao Tang, Yining Zhao, Han Liang. DrBioRight: an AI chat assistant enabling scalable and flexible multi-omics analysis in cancer [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B057.

  • Journal: Clinical Cancer Research
  • Publication date: Jul 10, 2025
  • Authors: Jun Li + 4

Therapeutic potential of gold nanoparticles in cancer therapy: a comparative insight into synthesis overview and cellular mechanisms.

Cancer has become a global health issue that demands transformative and advanced therapeutic approaches. Recent advancements in cancer research and therapeutics have shown that gold nanoparticles (AuNPs) possess potential anticancer, apoptotic, phagocytic, and immune modulation properties. Their unique physicochemical properties at the nanoscale make AuNPs effective candidates in various cancer therapeutics. This review uniquely highlights recent insights into the molecular mechanisms of AuNP-induced cancer cell death, cell cycle arrest, and immune modulation with a multidimensional approach, an area often overlooked in earlier reviews. It also examines recent studies exploring how AuNPs exert their cancer-fighting properties, such as cell death initiation, DNA destruction, and immune system modulation. AuNPs initiate cell death by targeting mitochondrial enzymes along with producing reactive oxygen species (ROS) and activating caspase proteins. AuNPs harm DNA, leading to cell cycle arrest, which in turn triggers apoptosis (programmed cell death) or other forms of cell death. Studies report that AuNPs activate macrophages and regulate cytokine release, which helps suppress tumor growth and metastasis. Moreover, this review compares chemical and green synthesis approaches, emphasizing green synthesis for its enhanced biocompatibility and alignment with targeted cancer therapy. Green synthesis methods are not only devoid of toxic chemicals but also superior in controlling particle size, shape, and surface functionality. Consequently, these green-synthesized AuNPs have been utilized in cancer research with improved therapeutic efficacy and enhanced cancer-targeting capabilities.

  • Journal: Medical oncology (Northwood, London, England)
  • Publication date: Jul 10, 2025
  • Authors: Laveeza Bano + 4

Abstract B007: TheBlueScrubs-v1: A Large-Scale Curated Dataset with ∼11 Billion Oncology Tokens for AI-Driven Cancer Research

Abstract Large language models (LLMs) are increasingly pivotal in cancer research, yet current public datasets offer insufficient scale and diversity to capture the complexity of oncology. To address this gap, we created TheBlueScrubs-v1, a 25-billion-token corpus of medical texts curated from the SlimPajama dataset. Approximately one-third of these tokens (∼11 billion) are annotated as cancer-related, making this one of the largest public, domain-focused text collections available for training and benchmarking oncology LLMs. Our two-stage pipeline first applied a high-speed logistic regression classifier (trained on a balanced set of 60,000 medical vs. non-medical documents) to label texts by medical relevance. This process extracted ∼4% of SlimPajama, yielding documents with at least 0.8 probability of containing medical content. Next, a 70B-parameter open-source LLM (Llama 3.1) evaluated each text’s medical scope, factual precision, and safety on 1–5 scales. Validation by clinicians and GPT-4o found strong concordance, confirming the reliability of these automated assessments. We further developed a specialized cancer classifier using logistic regression with TF-IDF features, trained on 60,000 examples, to identify oncology-related texts. This yielded a high-quality oncology subset (∼11 billion tokens) spanning topics such as cancer diagnosis, therapeutics, and real-world clinical notes. Detailed safety metrics enable red-teaming to mitigate misinformation and promote ethical use in oncology research. Potential applications include (1) fine-tuning LLMs for oncology-focused tasks such as treatment recommendation, clinical trial matching, and patient education, (2) building safety classifiers to detect harmful or misleading content, and (3) synthetic data generation to expand training sets while preserving privacy. 
Early experiments demonstrate that LLMs fine-tuned on TheBlueScrubs-v1 achieve performance on par with or exceeding models trained on smaller, specialized medical corpora. By releasing this large-scale, annotated dataset under an open license, we aim to accelerate innovation in AI-driven cancer research and foster collaborative efforts toward safer, more accurate clinical language models. Citation Format: Luis Felipe, Gilmer Valdes. TheBlueScrubs-v1: A Large-Scale Curated Dataset with ∼11 Billion Oncology Tokens for AI-Driven Cancer Research [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B007.

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Luis Felipe + 1

Abstract A037: Survival-based subtyping of lung cancer through integrated genomic and methylation signatures

Abstract Background: Lung cancer prognosis varies significantly due to the interplay of clinical, genomic, and epigenomic factors. We developed an integrative analytic pipeline to cluster lung cancer patients based on clinical characteristics, somatic mutations, and methylation profiles, linking these clusters to survival outcomes. Methods: Using clinical data from GuardantINFORM, a clinico-genomics database, we defined a cohort of 75,000 line-of-therapy (LOT) records from advanced lung cancer patients with a Guardant360 test on the Infinity platform. The tests yielded genomic mutation data (3M variants) and targeted methylation data (top 1,000 variable regions), which were used to predict clusters of progression on the following line of therapy. Features included clinical factors (LOT, therapy regimens), binary mutation indicators (top-100 genes), and 20 principal components (PCs) derived from methylation data. Ensemble clustering using K-means and agglomerative methods identified patient subgroups. Survival was analyzed via Kaplan–Meier methods, and a penalized Cox proportional hazards model determined feature-level hazard ratios. Results: Four robust clusters emerged with distinct survival profiles (p<0.0001). Cox modeling (concordance=0.72) highlighted several prognostic markers: KRAS mutations (HR=1.24, p=0.01), STK11 mutations (HR=1.49, p<0.005), KEAP1 mutations (HR=1.33, p=0.01), and methylation PC5 (HR=0.03, p=0.01). Biological interpretation of PC5 revealed enrichment of hypermethylation in key developmental genes such as PAX6, SOX9, OTX2, WT1, and TBX20. This suggests that epigenetic suppression of differentiation-related transcription factors may contribute significantly to tumor aggressiveness. Conclusions: Integrating clinical, genetic, and methylation data via ensemble clustering effectively delineates lung cancer subgroups with clinically meaningful survival differences. 
Key mutations in KRAS, STK11, and KEAP1 and methylation-driven silencing of developmental transcription factors represent robust markers for patient risk stratification and potential therapeutic targeting. Citation Format: Aaron Hardin, Sheila Solomon, Amar Das. Survival-based subtyping of lung cancer through integrated genomic and methylation signatures [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr A037.
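The Kaplan–Meier analysis mentioned in this abstract estimates survival as a running product over event times, S(t) = Π (1 − d_i / n_i), where d_i is the number of events at time t_i and n_i the number still at risk. The sketch below is a generic textbook implementation on made-up follow-up data. The function name `kaplan_meier` and the toy values are assumptions, and nothing here reproduces the study's cohort or its penalized Cox model.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up durations
    events -- 1 if the event (e.g. progression) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all records sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored records leave the risk set.
        n_at_risk -= n_leaving
    return curve

# Hypothetical follow-up data: five patients, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

Censored patients reduce the number at risk without lowering the curve, which is why the last record (censored at time 4) adds no step.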

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Aaron Hardin + 2

Abstract B020: Automated classification of thymic epithelial tumors. A novel deep learning approach

Abstract Introduction: Accurate histological classification of thymic epithelial tumors (TETs), including subtypes A, AB, B1, B2, B3, and thymic carcinomas (TC), is essential for prognosis and treatment planning. However, expert-level classification remains challenging due to significant inter-observer variability and the rarity of these tumors. This study presents a novel deep learning approach for subtype classification using digital histopathology. Methods: Hematoxylin and eosin (H&E)-stained whole slide images (WSIs) from the Cancer Genome Atlas Program (TCGA) were used. The dataset included TET resections from 119 patients. WSIs were divided into 224×224 pixel patches. We used a foundation model called UNI to extract high-dimensional features. UNI was pretrained on over 100 million histopathology images across 20 major tissue types. We then employed an attention-based Multiple Instance Learning (MIL) model to aggregate patch-level information for slide-level classification. The key innovation in our approach is the implementation of a biologically informed hierarchical loss function with three components: a multiclass classifier to distinguish A/AB, B1–B3, and TC categories, a binary classifier to differentiate the A and AB subtypes, and an ordinal classifier to model the biological continuum among B1–B3 subtypes, implemented using a novel binary encoding scheme where each class receives one more "1" bit than the previous class (e.g., B1: [0,0], B2: [1,0], B3: [1,1]). This approach represents the biological continuum of B-subtypes based on increasing epithelial-to-lymphocyte ratios. Results: The model was evaluated using 3-fold cross-validation. Compared to random classification accuracy of only 17%, our model achieved an overall six-class classification accuracy of 59.4% (95% CI: 55.3–63.5) with a Cohen’s kappa of 0.485 (0.437–0.534). When collapsed into three high-level classes (A/AB vs. B1–B3 vs. 
TC), accuracy improved to 81.0% (76.5–85.5) with a Cohen’s kappa of 0.678 (0.598–0.757). Performance was exceptionally high for TC classification: Accuracy: 95.8% (92.5–99.0), Sensitivity: 94.4% (86.0–100), and Specificity: 96.0% (93.1–98.9). Conclusions: This deep learning approach demonstrates strong performance in classifying TET subtypes, with especially high accuracy for thymic carcinomas. Prior studies have shown substantial inter-observer variability, with expert reviews at specialized centers resulting in a different histological classification for up to 56% of referred thymic tumor cases, potentially altering treatment decisions in more than 40% of these instances. Our model offers a promising tool to augment diagnostic accuracy and reduce variability, particularly in settings lacking expert pathology review. Citation Format: Matteo Sacco, Erica Pietroluongo, James M. Dolezal, Anna Di Lello, Mirella Marino, Alessandra Esposito, Maha AT. Elsebaie, Marina C. Garassino. Automated classification of thymic epithelial tumors. A novel deep learning approach [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B020.
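The ordinal encoding quoted in this abstract (B1: [0,0], B2: [1,0], B3: [1,1]) can be written down in a few lines. The encoder below follows the abstract's example directly. The decoder is an assumption, since the abstract does not say how predicted bit probabilities are mapped back to a subtype: here the leading outputs above a 0.5 threshold are counted. The function names are hypothetical.

```python
B_SUBTYPES = ["B1", "B2", "B3"]  # ordered by increasing epithelial-to-lymphocyte ratio

def encode_ordinal(label):
    """Encode a B subtype so each class has one more leading 1 than the previous."""
    rank = B_SUBTYPES.index(label)
    n_bits = len(B_SUBTYPES) - 1
    return [1 if k < rank else 0 for k in range(n_bits)]

def decode_ordinal(bit_probs, threshold=0.5):
    """Map predicted bit probabilities to a subtype (assumed decoding rule).

    Counts consecutive leading outputs at or above the threshold, so an
    inconsistent prediction such as [0.1, 0.7] still decodes to B1.
    """
    rank = 0
    for p in bit_probs:
        if p < threshold:
            break
        rank += 1
    return B_SUBTYPES[rank]

targets = {label: encode_ordinal(label) for label in B_SUBTYPES}
predicted = decode_ordinal([0.9, 0.2])
```

One property of this encoding is that a B1/B3 confusion flips two bits while a B1/B2 confusion flips only one, so a per-bit loss penalizes distant mistakes more heavily than adjacent ones.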

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Matteo Sacco + 7

Abstract B033: Identifying triple-negative breast cancer patients at high risk of worse prognosis using molecular features derived from histology images

Abstract Triple-negative breast cancer (TNBC) is the most aggressive subtype of invasive breast cancer, characterized by the lack of estrogen receptor (ER), progesterone receptor (PR), and HER2 expression. TNBC is biologically and clinically heterogeneous, and a subset of these tumors exhibits markedly poor prognosis. Identification of this TNBC subset is an unmet clinical need. To identify patients with primary TNBC at high risk of worse prognosis, we developed a graph (network)-based analysis approach combined with an unbalanced optimal transport technique, utilizing molecular features derived from histology images. A total of 143 H&E-stained histology images from The Cancer Genome Atlas (TCGA) primary TNBC cases were analyzed. Tumor tissues were segmented on whole-slide images using a pre-trained ResNet18 model with a patch size of 512×512 and a processing resolution of 0.5 microns per pixel. A pre-trained ResNet34 model was then used to estimate four molecular features (microsatellite instability, hypermutation density, chromosomal instability, and TP53 mutation) on the segmented tumor tissues. In addition, the spatial fraction of tumor-infiltrating lymphocytes (TILs), derived from histology images (Saltz et al., Cell Reports, 2018), and four morphology features (epithelial area, tubule formation, nuclear pleomorphism, and mitosis; Thennavan et al., Cell Genomics, 2021) graded by the breast cancer pathology expert committee were analyzed. Following exclusion of cases with incomplete data, 113 cases were used for network analysis. A feature network was constructed using nine histology-derived features, based on Spearman’s correlation. K-means clustering, employing unbalanced optimal transport to calculate Wasserstein distance, was used to identify subgroups in the resulting feature network. The Wasserstein distance computed between samples and cluster centroids served as the cost function during the K-means clustering process. 
The two identified subgroups, categorized as a high-risk group (N=81) and a low-risk group (N=32) based on disease-specific survival (DSS) rates, showed a statistically significant difference in DSS (log-rank p=0.047). Estimated TILs were significantly different between the high- and low-risk groups (p<0.0001). CIBERSORT scores that quantify 22 immune cell types were assessed. The low-risk group showed significantly higher CD8 T cells (p=0.030), regulatory T cells (Tregs) (p=0.029), and M1 macrophages (p=0.006), whereas the high-risk group showed significantly higher M0 macrophages (p=0.026) and M2 macrophages (p=0.006). Restricting the analysis to TNBC cases with tumor stage ≥2 revealed a greater DSS difference between the high-risk (N=64) and low-risk (N=28) groups (p=0.026). CIBERSORT analysis revealed that the low-risk group had significantly higher levels of CD8 T cells (p=0.009) and activated CD4 memory T cells (p=0.027). Our analyses show that a cold immune milieu characterized by low TILs and a dominance of non-activated and anti-inflammatory macrophages is associated with poor prognosis in TNBC. Citation Format: Jung Hun Oh, Fresia Pareja, Rena Elkin, Larry Norton, Joseph Deasy. Identifying triple-negative breast cancer patients at high risk of worse prognosis using molecular features derived from histology images [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B033.
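The feature network in this abstract is built from Spearman's correlation, which is the Pearson correlation of the rank-transformed values. The sketch below shows only that step, with tied values given their average rank. The function names and the toy feature vectors are assumptions, and the unbalanced optimal transport clustering that follows in the abstract is not reproduced here.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-case values for two histology-derived features.
til_fraction = [0.05, 0.20, 0.35, 0.60]
tp53_score = [0.90, 0.70, 0.40, 0.10]
rho = spearman(til_fraction, tp53_score)
```

Computing `spearman` for every pair of the nine features would give the edge weights of a feature network. How the authors turned correlations into edges is not stated in the abstract.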

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Jung Hun Oh + 4

Abstract B022: Prediction of radiotherapy-Induced esophagitis in non-small cell lung cancer using a 3D vision foundation model

Abstract Introduction: Radiotherapy (RT)-induced acute esophagitis (AE) is a common side effect in lung cancer patients receiving RT, which can significantly impact their quality of life. This highlights the need for a predictive model that can estimate AE risk in advance using pretreatment imaging data. However, collecting sufficient data for model development is often resource-intensive and costly. Additionally, acquiring a homogeneous training dataset (e.g., from a single center or modality) is not always feasible. This study aimed to develop an automated artificial intelligence-based 3D vision foundation model (VFM) combining standard-of-care planning computed tomography (pCT) and planned radiation dose maps to predict grade II or higher AE. Materials and methods: This study included 246 patients with non-small cell lung cancer who underwent either intensity-modulated radiation therapy (IMRT) or proton-beam radiation therapy. The endpoint was grade II or higher AE (33% positive and 67% negative). The IMRT group consisted of 182 patients from a single center, whereas the proton therapy group included 64 patients from the Proton Collaborative Group trial across 11 institutions. For each patient, pCT, dose maps, and radiotherapy segmentations were available. The VFM was built on a transformer pretrained on a large number of unlabeled volumetric 3D CTs from patients with varied diseases through a self-supervised learning approach, which extracts useful features directly from images. Next, the pretrained encoder was combined with fully connected classification layers and fine-tuned using stratified 5-fold cross-validation, with 20 patients set aside for testing. We developed two models: a CT-only model and a CT+Dose model. Model performance was assessed using the area under the curve (AUC), specificity, and sensitivity. 
Results: In cross-validation, adding dose information to CT (CT+Dose) improved model performance, increasing AUC from 0.72 ± 0.08 to 0.76 ± 0.10 and specificity from 0.82 ± 0.09 to 0.87 ± 0.04, while sensitivity remained the same. On the independent test set, CT+Dose showed a marked improvement over CT only (AUC: 0.82 vs. 0.59; sensitivity: 0.57 vs. 0.14), with a slight decrease in specificity (0.77 vs. 1.00). Conclusion: A VFM model combining CT and radiation dose showed the capability to predict AE more accurately with higher specificity. Further studies on larger cohorts of testing patients are planned to assess model generalization. Citation Format: Chloe Min Seo Choi, Jue Jiang, Nikhil Mankuzhy, John Chang, Jing Zeng, Carlos Vargas, James Urbanic, Isabella Choi, Mark Mcdonald, James Gray, Joseph Deasy, Maria Thor, Charles Simone, Harini Veeraraghavan. Prediction of radiotherapy-Induced esophagitis in non-small cell lung cancer using a 3D vision foundation model [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B022.
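Sensitivity and specificity, two of the metrics reported in this abstract, follow directly from the confusion counts: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below uses made-up labels and predictions. It does not reproduce the study's test set, whose per-patient outcomes are not given in the abstract.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: 3 patients with esophagitis, 4 without.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

With only 20 test patients, each positive case moves sensitivity by a large step, which is one reason the abstract calls for evaluation on larger test cohorts.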

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Chloe Min Seo Choi + 13

Abstract B034: Decoding the Leukocyte Effect: How Cell Retention Shapes ML Outcomes in Platelet RNA-based Cancer Detection

Abstract Introduction: Liquid biopsy offers a promising, minimally invasive approach for cancer detection by providing valuable insights into tumor biology. Among the various sources used in liquid biopsy, platelets and their RNA stand out as a unique diagnostic tool, reflecting the body’s systemic response to cancer. However, during laboratory platelet extraction, known as platelet washing, white blood cell (WBC) retention can occur, potentially confounding platelet-derived RNA sequencing data. Materials and methods: To address this challenge, we developed a method to quantify WBC enrichment in platelet RNA-seq datasets. Using the largest publicly available dataset (GEO GSE183635), which includes 2,351 samples from cancer patients, healthy donors, and individuals with benign, non-cancer conditions, we identified 3 sample clusters based on leukocyte marker levels. This allowed us to differentiate two entirely distinct sets of samples, with the lowest and the highest WBC retention, containing 377 matched samples in each set. We then assessed how leukocyte presence influences the performance of machine learning models for cancer classification in the low- and high-leukocyte subgroups. Results: Across the full test set (n=453), our model achieved an area under the curve (AUC) of 0.94. Interestingly, in the low-leukocyte subset (n=229), the AUC slightly decreased to 0.93, whereas in the high-leukocyte subset (n=224), it increased to 0.95. Discussion: These results suggest that while leukocyte retention may slightly enhance classification performance, platelets alone provide substantial diagnostic value. Overall, our findings highlight the strong potential of platelet-based liquid biopsy and reveal how understanding leukocyte retention can further refine cancer detection strategies. Citation Format: Michał Sieczczynski, Krzysztof Pastuszak, Anna J. Zaczek, Matthew T. Rondina, Sjors GJG. in 't Veld, Myron G. Best, Anna Supernat. 
Decoding the Leukocyte Effect: How Cell Retention Shapes ML Outcomes in Platelet RNA-based Cancer Detection [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr B034.
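The subgroup comparison in this abstract amounts to computing the AUC once on the full test set and again within each leukocyte-retention subset. A minimal sketch follows, using the rank-based definition of AUC: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The labels, scores, and group assignments are made up, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(labels, scores, groups):
    """Compute the AUC separately within each subgroup label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = roc_auc([labels[i] for i in idx],
                            [scores[i] for i in idx])
    return result

# Hypothetical test samples: 1 = cancer, 0 = non-cancer.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]
groups = ["low"] * 4 + ["high"] * 4

overall = roc_auc(labels, scores)
per_group = auc_by_group(labels, scores, groups)
```

The pairwise loop is quadratic in the number of samples, which is fine for a few hundred test cases. A sort-based rank formula would be the usual choice for larger sets.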

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Michał Sieczczynski + 6

Abstract A043: Mutational profiling and machine learning for risk stratification and biomarker identification in intraductal papillary mucinous neoplasms progressing to pancreatic cancer

Abstract Intraductal papillary mucinous neoplasms (IPMNs) are common precursors to invasive pancreatic ductal adenocarcinoma (PDAC), with the risk of progression varying by IPMN type and anatomic location. Despite some IPMNs having a high risk of progression, there are limited non-surgical options for precision prevention in patients with IPMNs. One of the main challenges is that existing radiologic and molecular markers are insufficient for reliably assessing the risk of progression, mostly due to a lack of validated intervention targets. Currently, cancer prevention for patients with IPMN is centered on surgery or surveillance using a risk-based strategy; the clinical ability to stratify risk of cancer progression of individual IPMN tumors is poor and essentially no effective non-surgical interventions exist. Thus, the objective of this work is to apply statistical feature extraction and use latent features derived from sequencing data as input to machine learning (ML) models to identify unexplored markers driving the progression of IPMNs to invasive PDAC. Formalin-fixed, paraffin-embedded tissue cores sampled from 34 unique patients with the following pathological diagnoses: 8 non-IPMN-derived PDAC, 7 IPMN-derived PDAC, 7 high-grade IPMNs, and 12 low-grade IPMNs, were analyzed using the Moffitt STAR 2.0 Cancer Mutation and Molecular Biomarker Profiling panel. This STAR 2.0 next-generation sequencing method, performed using the TruSight Oncology 500 panel from Illumina, Inc., is designed to interpret sequence information for over 500 somatically altered genes. The analyses of sequencing data incorporated statistical and ML analyses of the mutational profiles of patients’ genomes followed by integration of trinucleotide sequence-derived mutational features. By applying non-negative matrix factorization to DNA trinucleotide motif mutational data, we identified 4 distinct mutational signatures. 
These signatures exhibited varying degrees of similarity to the Single Base Substitution Signatures from COSMIC (Sanger Institute). However, the contribution of these signatures to the mutational profile in each sample is more complex, as samples may carry a combination of signatures rather than a single, defining signature. Preliminary results from the discriminative ML models showed high performance in predicting type of malignancy (multiclass area under the curve > 0.8), using the mutational counts and each sample-to-signature contribution. We demonstrate that insights extracted from mutational profiles have the potential to enhance the interpretation of mutational patterns and improve the stratification of IPMNs. Further analyses are needed to fully understand the complex interplay of mutational processes across the samples, beyond the initial identification of signatures. We aim to uncover key gene patterns, focusing on codon mutations and their mapping to protein alterations, to better understand the molecular mechanisms driving the progression of IPMNs and identify potential therapeutic targets for early intervention. Citation Format: Aleksandra Karolak, Evan W. Davis, Mouktik Isukapalli, Rohit Veligeti, Margaret A. Park, Jamie K. Teer, Daniel K. Jeong, Kun Jiang, Dung-Tsa Chen, Jennifer B. Permuth, Ghulam Rasool. Mutational profiling and machine learning for risk stratification and biomarker identification in intraductal papillary mucinous neoplasms progressing to pancreatic cancer [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning; 2025 Jul 10-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(13_Suppl):Abstract nr A043.
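Non-negative matrix factorization, the technique this abstract applies to trinucleotide mutation counts, approximates a non-negative matrix V (mutation channels × samples) as a product W·H, where the columns of W act as signatures and H holds each sample's exposure to them. The sketch below uses the classic multiplicative update rules for the squared-error objective on a tiny made-up matrix. The abstract does not state which NMF algorithm or software was used, so the update rule, the fixed seed, and all names and values here are assumptions.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def squared_error(V, W, H):
    """Sum of squared differences between V and the reconstruction W·H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for v_row, r_row in zip(V, WH)
               for v, r in zip(v_row, r_row))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n × m) into W (n × k) and H (k × m) with non-negative entries."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

# Toy count matrix: 4 mutation channels × 3 samples, built from 2 patterns.
V = [[2, 0, 1],
     [0, 3, 3],
     [4, 0, 2],
     [0, 1, 1]]
W0, H0 = nmf(V, k=2, iters=0)   # random starting point
W, H = nmf(V, k=2, iters=200)   # same start, after the updates
```

In a real analysis V would have 96 rows (the trinucleotide contexts), and each column of W could then be compared with the COSMIC reference signatures. The rows of H correspond to the sample-to-signature contributions that the abstract feeds into its classifiers.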

  • Journal: Clinical Cancer Research
  • Publication Date: Jul 10, 2025
  • Authors: Aleksandra Karolak + 10
