Abstract

Background
At the moment, over 180,000 breast cancer survivors are living in the Netherlands, and survival after a breast cancer diagnosis and treatment is still increasing. Therefore, the number of survivors is rising, and these patients are at risk of metachronous metastases. Data at a national level on the development of metachronous metastases are limited, as these are not routinely registered in the Netherlands Cancer Registry. Because of the large number of survivors, it is not feasible to actively monitor all patients to signal and register metachronous metastases. Applying machine learning may be an option to overcome this issue. The aim of our study is to develop an M1 detection algorithm based on hospital data to signal patients who developed metachronous metastases after their primary breast cancer treatment.

Methods
The Netherlands Cancer Registry (NCR) has collected data since 1989 on all individuals newly diagnosed with cancer in one of the 76 hospitals in the Netherlands. Dutch Hospital Data (DHD) collects and processes data from hospitals, including data on diagnosis and treatment. DHD data from 2019-2020 were linked to the NCR using a probabilistic matching method. We matched on date of birth, gender, diagnosis, postal code, hospital and patient ID. Matching column values were weighted inversely proportional to the probability of that value in the respective column's distribution, so that a match on a rare column value (e.g. a postal code with a relatively small population) increased the probability that the match was correct. Scores for each column were combined, and patients with high matching scores were included in the algorithm development, validation and deployment. Actively signaled and manually registered data on metastases were available for subgroups of breast cancer patients included in previous studies (the 'gold standard'). First, 80% of these data were used to train the model and 20% to validate it.
Second, a pilot study was performed in which patient files were checked for 928 patients, sampled across a range of prediction probabilities, to evaluate a diverse range of cases.

Results
We included 4,395 patients. Variables included to predict metastases were, e.g., specific medication for metastatic disease (palbociclib), counselling for metastatic breast cancer, a carcinoembryonic antigen test, a confirmed diagnosis of metastases, and the number of patient visits. The first validation step (on the held-out 20% of known data) showed that the model had a precision of 0.91 for predicting metachronous metastases and 0.93 for predicting the absence of metastases. The pilot study confirmed that a prediction probability >0.8 corresponded to a higher chance that a patient has metachronous metastases. However, false positive predictions did occur.

Conclusion
We developed an M1 detection algorithm to signal patients with metachronous metastases after breast cancer treatment on a national level. With this algorithm we are one step closer to identifying all patients with metachronous metastases and to reaching a complete registration of all breast cancer metastases by reusing existing data sources. After review of patients with a high prediction probability, the model will be re-trained using these data and updated data from DHD.

Citation Format: Linda de Munck, Daan Knoors, Harm Buisman, Koen Scholman, Janneke Verloop, Sabine Siesling. M1 registration: Signaling patients who develop metachronous metastases after primary breast cancer [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P3-03-18.
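
Illustrative sketch of the linkage weighting described in the Methods. The abstract states only that matching column values were weighted inversely proportional to their probability and that column scores were combined. The sketch below is one possible reading, not the authors' implementation: the column names, the use of relative frequencies as probabilities, and the combination of column scores by summation are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical column names: the abstract lists the linkage fields,
# but not how they are encoded in the NCR or DHD data.
COLUMNS = ["date_of_birth", "gender", "diagnosis",
           "postal_code", "hospital", "patient_id"]


def value_probabilities(records, columns=COLUMNS):
    """Estimate, per column, the relative frequency of each value in `records`."""
    probs = {}
    for col in columns:
        counts = Counter(rec[col] for rec in records)
        total = sum(counts.values())
        probs[col] = {value: n / total for value, n in counts.items()}
    return probs


def match_score(rec_a, rec_b, probs, columns=COLUMNS):
    """Sum inverse-frequency weights over the columns on which two records agree.

    Summation is an assumption; the abstract says only that the
    per-column scores "were combined".
    """
    score = 0.0
    for col in columns:
        value = rec_a[col]
        # Only agreeing values with a known frequency contribute.
        if value == rec_b[col] and value in probs[col]:
            score += 1.0 / probs[col][value]  # rare value -> larger weight
    return score
```

Under these assumptions, agreement on a postal code shared by 1 in 4 records contributes a weight of 4, whereas agreement on a value shared by all records contributes only 1. Candidate pairs would then be kept when their score exceeds a threshold; the abstract does not report which threshold was used.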