Abstract BACKGROUND: In recent years, AI and deep learning have transformed the ability to use large amounts of medical data to augment cancer diagnosis and prognosis. For developing AI methodology, histopathologic assessment provides the gold standard “labels”, enabling investigators to finely map (or annotate) biologically and clinically important features. Yet correlating high-dimensional data (radiomic, morphometric, genomic, metabolomic, etc.) with expert histopathologic diagnosis for dataset generation remains a major challenge. CHALLENGE: Traditionally, labels have been extracted through image processing techniques from a snapshot that contains all of the annotation layers overlaid on the original tissue. This requires a distinct color for each annotation, which severely constrains the number of possible labels. The limitation is most noticeable for heterogeneous tissues such as prostate that require complex annotation. Furthermore, the resolution at which the labels can be mapped is limited by the area of the extracted region. OBJECTIVE: Here we present a workflow for pathology-oriented dataset generation for AI studies that is compatible with standard annotation platforms and addresses these limitations. We introduce a detailed multi-grade and multi-scale annotation protocol for prostate biopsies. The proposed method can export labels as independent layers (each representing a specific grade of the pathology) and resample them to the desired resolution. METHODS: A collection of 38 prostate biopsy sections from 19 patients, fixed on slides, was used. The proposed grading annotation protocol is based on the spatial distribution of cancer cells. Nine annotation layers were considered, depicting stroma, benign tissue, low grade cancer (Gleason pattern 3), high grade cancer (Gleason patterns 4 and 5), two mixed cancer patterns, prostatic intraepithelial neoplasia (PIN), intraductal carcinoma (IDC), and artifact.
The coordinates of the annotation boundaries are post-processed and combined into a label image containing all 9 pathological classes. The metabolomic profiles of the prostate biopsies, acquired by desorption electrospray ionization (DESI), are used as the data features in this study. The generated label images are therefore spatially registered to the corresponding DESI data of each slide. RESULTS: The dataset generated by the proposed method is applied to prostate cancer detection. The dataset is validated through qualitative visualization and quantitative analysis. High correlation is observed between the label images of the slides and an unsupervised linear representation of the corresponding DESI spectra. Pixel-based supervised identification of tissue types from the DESI data also shows high accuracy. CONCLUSION: The proposed digitized pathology annotation protocol and dataset generation workflow are compatible with AI-oriented cancer research and can handle a large number of pathological classes and high-dimensional imaging modalities. Citation Format: Amoon Jamzad, Tamara Jamaspishvili, Rachael Iseman, Martin Kaufmann, David Berman, Parvin Mousavi. An efficient digitized annotation platform for pathology-oriented dataset generation in AI research [abstract]. In: Proceedings of the AACR Virtual Special Conference on Artificial Intelligence, Diagnosis, and Imaging; 2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-005.
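Illustrative sketch: the step of turning annotation boundary coordinates into independent layers and a combined label image at a chosen resolution could look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' implementation. The class order, the placeholder names `mixed_a` and `mixed_b`, the helper functions `polygon_mask` and `rasterize_layers`, the toy polygons, and the rule that later classes overwrite earlier ones where layers overlap are all hypothetical. Polygons are rasterized with a plain even-odd ray-casting test on pixel centres.

```python
import numpy as np

# Assumed class order; "mixed_a"/"mixed_b" are placeholder names for the two
# mixed cancer patterns, which the abstract does not name.
CLASSES = ["stroma", "benign", "gleason3", "gleason4_5",
           "mixed_a", "mixed_b", "PIN", "IDC", "artifact"]


def polygon_mask(vertices, height, width, pixel_size):
    """Boolean mask of pixel centres inside a polygon given in slide units."""
    ys, xs = np.mgrid[0:height, 0:width]
    px = (xs + 0.5) * pixel_size  # pixel-centre x in slide units
    py = (ys + 0.5) * pixel_size  # pixel-centre y in slide units
    v = np.asarray(vertices, dtype=float)
    x0, y0 = v[:, 0], v[:, 1]
    x1, y1 = np.roll(x0, -1), np.roll(y0, -1)
    inside = np.zeros((height, width), dtype=bool)
    for ax, ay, bx, by in zip(x0, y0, x1, y1):
        if ay == by:
            continue  # a horizontal edge never crosses a horizontal ray
        crosses = (ay > py) != (by > py)
        x_at = ax + (py - ay) * (bx - ax) / (by - ay)
        inside ^= crosses & (px < x_at)  # even-odd rule: toggle per crossing
    return inside


def rasterize_layers(layers, height, width, pixel_size):
    """Return (per-class boolean stack, combined label image).

    Label 0 means unannotated; class i in CLASSES gets label i + 1.
    """
    stack = np.zeros((len(CLASSES), height, width), dtype=bool)
    for i, name in enumerate(CLASSES):
        for poly in layers.get(name, []):
            stack[i] |= polygon_mask(poly, height, width, pixel_size)
    labels = np.zeros((height, width), dtype=np.uint8)
    for i in range(len(CLASSES)):
        labels[stack[i]] = i + 1  # assumption: later classes overwrite earlier
    return stack, labels


# Toy annotation: a benign region containing a smaller Gleason 3 region.
layers = {
    "benign": [[(0, 0), (4, 0), (4, 4), (0, 4)]],
    "gleason3": [[(1.2, 1.2), (3.2, 1.2), (3.2, 3.2), (1.2, 3.2)]],
}

# The same boundaries rasterized at two resolutions.
stack_fine, labels_fine = rasterize_layers(layers, 4, 4, pixel_size=1.0)
stack_coarse, labels_coarse = rasterize_layers(layers, 2, 2, pixel_size=2.0)
```

Because each layer is rasterized from its boundary coordinates rather than read back from a color overlay, the number of classes is not tied to a color palette, and changing `pixel_size` produces labels on a different grid, for example one matched to the DESI pixel spacing.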