This study aimed to develop an algorithm for identifying prostate cancer (PC) clinical states. Each state represents a key treatment milestone, and identifying it is essential for research using real-world data. Patients with PC from the Optum electronic health records (EHR) and claims databases were categorized into five states: (1) localized PC – newly diagnosed PC without evidence of metastases, castration, or other cancer diagnoses within 90 days of PC diagnosis; (2) non-metastatic hormone-sensitive PC (nmHSPC) – evidence of androgen deprivation therapy (ADT) after new PC diagnosis and no prostate-specific antigen (PSA) increase (EHR only); (3) metastatic HSPC (mHSPC) – evidence of metastases between PC diagnosis and initial indication of ADT; (4) non-metastatic castration-resistant PC (nmCRPC) – ≥1 of the following after evidence of surgical or medical castration: (a) diagnosis of hormone-resistant malignancy status, (b) new prescription for antiandrogens, or (c) PSA level increase after castration (EHR only); and (5) metastatic CRPC (mCRPC) – evidence of metastases between PC diagnosis and castration resistance. From 2007 to 2018, we identified 125,505 and 51,299 newly diagnosed PC patients in the Optum EHR and claims databases, respectively. Using our clinical algorithm, patients were categorized into prognostically relevant clinical states. During follow-up, of the 125,505 patients in the EHR database, 87.5% had localized PC, 3.4% nmHSPC, 1.2% mHSPC, 9.8% nmCRPC, and 3.1% mCRPC; of the 51,299 patients in the claims database, 83.5% had localized PC, 30% nmHSPC, 1.7% mHSPC, 34.4% nmCRPC, and 1.6% mCRPC. Complex disease state definitions, missing clinical information (eg, lab values), and limited follow-up complicate the identification of PC patients in retrospective data. Understanding these states is critical for effective disease management and for improving patient health outcomes, yet few data by clinical state have been published.
Initial application of this claims-EHR algorithm successfully identified hormone-sensitive and castration-resistant status. Future work should validate the algorithm by comparing its output with historical cohort sizes.
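
The five state definitions can be read as date-based rules over a patient's record. The following Python sketch is one possible reading, not the study's implementation. The `PatientRecord` fields, the function names, and the state labels are hypothetical. It also assumes that "within 90 days" means 90 days on either side of diagnosis, that "between" includes both endpoints, and that a record may qualify for more than one state during follow-up. For claims data, where PSA values are unavailable, `psa_rise_after_castration` would simply stay `None`.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Set


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary; each field is the first recorded date."""
    pc_diagnosis: date
    first_metastasis: Optional[date] = None
    first_castration: Optional[date] = None  # first ADT or surgical castration
    first_other_cancer: Optional[date] = None
    hormone_resistant_dx: Optional[date] = None
    first_antiandrogen_rx: Optional[date] = None
    psa_rise_after_castration: Optional[date] = None  # EHR only


def _between(event: Optional[date], start: date, end: date) -> bool:
    """True if the event exists and falls in [start, end]."""
    return event is not None and start <= event <= end


def castration_resistance_date(p: PatientRecord) -> Optional[date]:
    """Earliest of criteria (a)-(c) occurring after castration, if any."""
    if p.first_castration is None:
        return None
    criteria = (p.hormone_resistant_dx, p.first_antiandrogen_rx,
                p.psa_rise_after_castration)
    after = [d for d in criteria if d is not None and d > p.first_castration]
    return min(after) if after else None


def clinical_states(p: PatientRecord) -> Set[str]:
    """Return every state the record qualifies for under this reading."""
    states: Set[str] = set()
    dx = p.pc_diagnosis

    # (1) Localized: no metastasis, castration, or other cancer near diagnosis.
    early = (p.first_metastasis, p.first_castration, p.first_other_cancer)
    if not any(d is not None and abs((d - dx).days) <= 90 for d in early):
        states.add("localized")

    # (2)/(3) Hormone-sensitive states require ADT on or after diagnosis.
    adt = p.first_castration
    if adt is not None and adt >= dx:
        if _between(p.first_metastasis, dx, adt):
            states.add("mHSPC")
        elif p.psa_rise_after_castration is None:
            states.add("nmHSPC")

    # (4)/(5) Castration-resistant states require a resistance criterion.
    crpc = castration_resistance_date(p)
    if crpc is not None:
        if _between(p.first_metastasis, dx, crpc):
            states.add("mCRPC")
        else:
            states.add("nmCRPC")
    return states
```

Returning a set rather than a single label is a design choice of this sketch: it lets one record contribute to several state counts as the disease progresses during follow-up.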