Abstract

Background: Hormone receptors of breast cancer, such as estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (Her-2), are important prognostic factors for breast cancer.

Objective: The current study aimed to develop a method to retrieve the statistics of hormone receptor expression status documented in pathology reports, given their importance in research on primary and recurrent breast cancer and in quality management of pathology laboratories.

Method: A two-stage text mining approach based on regular expression word/phrase matching was developed to retrieve the data.

Results: The method achieved a sensitivity of 98.8%, 98.7%, and 98.4% for extraction of ER, PR, and Her-2 results, respectively. The hormone receptor expression status from 3679 primary and 44 recurrent breast cancer cases was successfully retrieved with the method. Statistical analysis of these data showed that recurrent disease had a significantly lower positivity rate for ER than primary breast cancer (54.5% vs 76.5%, p=0.001278) and a higher positivity rate for Her-2 (48.8% vs 16.2%, p=9.79e-8). These results corroborated the previous literature.

Conclusion: Text mining on pathology reports using the developed method may benefit research on primary and recurrent breast cancer.
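As a rough illustration of what two-stage regular expression matching could look like, the sketch below first locates the clause that mentions each marker and then searches that clause for a status word. The abstract does not specify the stages, patterns, or report wording the authors used, so everything here is an assumption made for illustration: the patterns, the status vocabulary, the function name `extract_receptor_status`, and the sample report text are all hypothetical.

```python
import re

# Stage 1 (assumed): find the clause that mentions each marker.
# These patterns are hypothetical; real reports would need many more variants.
MARKER_PATTERNS = {
    "ER": re.compile(r"\b(?:estrogen receptor|ER)\b[^.;\n]*", re.IGNORECASE),
    "PR": re.compile(r"\b(?:progesterone receptor|PR)\b[^.;\n]*", re.IGNORECASE),
    "Her-2": re.compile(r"\b(?:HER-?2(?:/neu)?|c-?erbB-?2)\b[^.;\n]*", re.IGNORECASE),
}

# Stage 2 (assumed): find a status word inside the clause from stage 1.
STATUS_PATTERN = re.compile(r"\b(positive|negative|equivocal)\b", re.IGNORECASE)


def extract_receptor_status(report):
    """Return a dict mapping each marker to its status word, or None if not found."""
    results = {}
    for marker, pattern in MARKER_PATTERNS.items():
        clause = pattern.search(report)
        if clause is None:
            results[marker] = None
            continue
        status = STATUS_PATTERN.search(clause.group(0))
        results[marker] = status.group(1).lower() if status else None
    return results


# Hypothetical report snippet, not taken from the study data.
sample = "ER: positive (90%, strong). PR: negative. Her-2/neu: equivocal (2+)."
print(extract_receptor_status(sample))
# {'ER': 'positive', 'PR': 'negative', 'Her-2': 'equivocal'}
```

Restricting the status search to the clause found in the first stage keeps a status word that belongs to one marker from being attributed to another. A production version would also need to handle negation, score-based Her-2 results, and report-specific phrasing, none of which this sketch attempts.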

Highlights

  • Electronic pathology reports, an important component of electronic health records [1], often document valuable data for research and quality control [2]

  • The data documented in pathology reports is especially important since the expression statuses of hormone receptors, such as the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (ErbB2 or Her-2), are immunohistochemically examined [3,4,5] and documented in pathology reports

  • Because hormone receptor expression status may help predict local recurrence [12], statistics on hormone receptor expression status are valuable for research on recurrent breast cancer


Summary

Introduction

Electronic pathology reports, an important component of electronic health records [1], often document valuable data for research and quality control [2]. The data documented in pathology reports are especially important because the expression statuses of hormone receptors, such as the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (ErbB2 or Her-2), are immunohistochemically examined [3,4,5] and recorded there. Expression of these markers affects prognosis [6, 7] and has implications for the choice of hormone therapy and chemotherapy [8, 9]. The hormone receptor expression status from 3679 primary and 44 recurrent breast cancer cases was successfully retrieved with the developed method. Statistical analysis of these data showed that recurrent disease had a significantly lower positivity rate for ER than primary breast cancer (54.5% vs 76.5%, p=0.001278) and a higher positivity rate for Her-2 (48.8% vs 16.2%, p=9.79e-8).


