It has recently been argued [1] that the effectiveness of a cure depends on the understanding of an illness and its treatment that doctor and patient share. Although better communication between doctor and patient can be pursued through dedicated training programs, or by collecting patients' experiences and symptoms by means of questionnaires, the impact of these actions is limited by time and resources. In this paper we suggest that a patient-centered view of a disease, as well as potential misalignment between the focus of patients and that of doctors, can be inferred at a larger scale through automated textual analysis of health-related forums. People generate an enormous amount of social data describing their health care experiences, and continuously search for information about diseases, symptoms, diagnoses, doctors, treatment options and medicines. By automatically collecting, analyzing and exploiting this information, it is possible to obtain a more detailed and nuanced view of patients' experience, which we call the "social phenotype" of a disease. As a use case for our analysis, we consider diabetes, a widespread disease in most industrialized countries. We create a high-quality data sample of messages written by diabetic patients in Italy, extracted from popular medical forums over a period of more than 10 years. Next, we use a state-of-the-art topic extraction technique, based on generative statistical models improved with word embeddings, to identify the main complications, the frequently reported symptoms and the common concerns of these patients. Finally, in order to detect differences in focus, we compare the results of our analysis with available quality of life (QoL) assessments obtained with standard methodologies, such as questionnaires and survey studies. We show that patients with diabetes, when accessing online forums, express a perception of their disease that may differ noticeably from what is inferred from published QoL assessments on diabetes.
In our study, we found that the issues reported to have a daily impact on these patients are diet, glycemic control, drugs and clinical tests. These problems are not commonly considered in QoL assessments, since doctors do not perceive them as severe limitations. Although our study is limited to the case of Italian diabetic patients, we suggest that the methodology described in this paper, which is language and disease agnostic, could be applied to other diseases and countries, since misalignment between doctors and patients, and the importance of collecting unbiased patient perceptions, have been emphasized in many studies ([2,3], inter alia). Extracting the social phenotype of a disease might help to acquire patient-centered information on health care experiences on a much wider scale.