BackgroundCancer is a critical disease that affects millions of people and families around the world. In 2012 about 14.1 million new cases of cancer occurred globally. Because of many reasons like the severity of some cases, the side effects of some treatments and death of other patients, cancer patients tend to be affected by serious emotional disorders, like depression, for instance. Thus, monitoring the mood of the patients is an important part of their treatment. Many cancer patients are users of online social networks and many of them take part in cancer virtual communities where they exchange messages commenting about their treatment or giving support to other patients in the community. Most of these communities are of public access and thus are useful sources of information about the mood of patients. Based on that, Sentiment Analysis methods can be useful to automatically detect positive or negative mood of cancer patients by analyzing their messages in these online communities. ObjectiveThe objective of this work is to present a Sentiment Analysis tool, named SentiHealth-Cancer (SHC-pt), that improves the detection of emotional state of patients in Brazilian online cancer communities, by inspecting their posts written in Portuguese language. The SHC-pt is a sentiment analysis tool which is tailored specifically to detect positive, negative or neutral messages of patients in online communities of cancer patients. We conducted a comparative study of the proposed method with a set of general-purpose sentiment analysis tools adapted to this context. MethodsDifferent collections of posts were obtained from two cancer communities in Facebook. Additionally, the posts were analyzed by sentiment analysis tools that support the Portuguese language (Semantria and SentiStrength) and by the tool SHC-pt, developed based on the method proposed in this paper called SentiHealth. Moreover, as a second alternative to analyze the texts in Portuguese, the collected texts were automatically translated into English, and submitted to sentiment analysis tools that do not support the Portuguese language (AlchemyAPI and Textalytics) and also to Semantria and SentiStrength, using the English option of these tools. Six experiments were conducted with some variations and different origins of the collected posts. The results were measured using the following metrics: precision, recall, F1-measure and accuracy ResultsThe proposed tool SHC-pt reached the best averages for accuracy and F1-measure (harmonic mean between recall and precision) in the three sentiment classes addressed (positive, negative and neutral) in all experimental settings. Moreover, the worst accuracy value (58%) achieved by SHC-pt in any experiment is 11.53% better than the greatest accuracy (52%) presented by other addressed tools. Finally, the worst average F1 (48.46%) reached by SHC-pt in any experiment is 4.14% better than the greatest average F1 (46.53%) achieved by other addressed tools. Thus, even when we compare the SHC-pt results in complex scenario versus others in easier scenario the SHC-pt is better. ConclusionsThis paper presents two contributions. First, it proposes the method SentiHealth to detect the mood of cancer patients that are also users of communities of patients in online social networks. Second, it presents an instantiated tool from the method, called SentiHealth-Cancer (SHC-pt), dedicated to automatically analyze posts in communities of cancer patients, based on SentiHealth. This context-tailored tool outperformed other general-purpose sentiment analysis tools at least in the cancer context. This suggests that the SentiHealth method could be instantiated as other disease-based tools during future works, for instance SentiHealth-HIV, SentiHealth-Stroke and SentiHealth-Sclerosis.