Abstract

A consequence of progress in cancer genomics is the exponential growth in data produced by research and clinical projects. Because a scientific manuscript has limited capacity to share experimental data, various approaches have been developed: some data are simply excluded from the publication, some are moved to the supplementary materials, some are deposited in external repositories, and large research programmes present their integrated results on dedicated websites. This last approach has been extremely successful in generating comprehensive and unbiased datasets used by legions of scientists to discover new mechanisms driving cancer, biomarkers and drug targets. However, journal publications still deliver more data globally, especially from rare cancers and underrepresented populations. To realise the full potential of these data and to make all this knowledge fully available, several challenges need to be addressed.

COSMIC (Catalogue of Somatic Mutations in Cancer) has an in-house expert curation team that extracts information about genetic variants, their significance, patients and their response to therapy from publications and integrates it into a standardised, cross-referenced database available to the scientific community. Although curators can extract data presented in non-standard forms (e.g. manuscript figures) and interpret and translate between various data formats, they cannot extract data that are not included in the publication. Approximately one third of manuscripts that are curatable are discarded because data generated in the research process were not included in the publication. This clearly shows an urgent need to bring researchers, publishers and curators together to develop new publication standards that would allow communities to fully exploit the potential of published data and make them reusable for further discoveries.
We would like to propose several suggestions to start this discussion:

- Use accepted naming conventions and standards to make your data interoperable; think about your work as part of the world heritage.
- Include all data you generate; even if not directly relevant to the manuscript topic, they may be invaluable for somebody else’s research.
- Share full data, and use service providers that share all the data with you.
- When possible, share the data per patient or sample, not per cohort.
- Publishing a manuscript is not an endpoint; there is huge potential in your data to be further utilised to fight cancer.

Citation Format: Alexander Holmes, Sari A. Ward, Zbyslaw Sondka. The need for new publication standards in cancer genomics, lessons from curating COSMIC database. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 4319.