Abstract of the Thesis

Sideffective System to mine patient reviews: Side Effect Extraction

by Sangeetha Rajagopalan

Thesis Director: Prof. Tomasz Imielinski

Sideffective is a system to crawl, rank, and analyze patient testimonials about the side effects of common medications. Because the value of any mining model lies in its data corpus, the data collection phase involved extensive crawling of large medical websites that host user forums. The raw files were then run through site-specific parsing routines, yielding output that conforms to a well-defined data model. Currently, the system holds close to 400,000 user testimonials pertaining to more than 2,500 drugs. Sideffective aims to gather and aggregate this wealth of information, build useful associations, and present interesting observations and numeric validations, all in a user-friendly interface. The important issues we have tried to tackle are: extracting side effects without relying on pre-built lists, aggregating the distribution of different side effects for a given drug, site-specific search, ranking, and determining the negativity of reviews.

The main focus of this thesis is the extraction and discovery of side effects from a user's review of a drug. Apache Lucene's Shingle Analyzer, which extracts terms and their frequencies, was used to generate more than 7 million phrases, out of which the top 25,000 terms, with frequencies of more than 100, were chosen for discovering side effects. After eliminating the syntactically incorrect phrases, our method calculates the frequency of occurrence of each term in a medical website's domain versus a purely non-medical user website's domain, which proves
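The phrase-frequency step described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: it uses plain Python word n-grams as a stand-in for Lucene's shingle analysis, and the function names (`shingles`, `phrase_counts`, `domain_ratio`), the smoothed ratio used as a score, and the toy reviews are all assumptions made for illustration.

```python
from collections import Counter


def shingles(text, min_n=2, max_n=3):
    """Yield word n-grams (shingles) of length min_n..max_n from text."""
    words = text.lower().split()
    for n in range(min_n, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def phrase_counts(reviews, min_n=2, max_n=3):
    """Count how often each shingle occurs across a list of reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(shingles(review, min_n, max_n))
    return counts


def domain_ratio(phrase, medical, general, smoothing=1.0):
    """Relative frequency of phrase in the medical corpus divided by its
    relative frequency in the non-medical corpus.

    Additive smoothing (an assumption, not from the thesis) avoids
    division by zero for phrases absent from one corpus.
    """
    med_rate = (medical[phrase] + smoothing) / (sum(medical.values()) + smoothing)
    gen_rate = (general[phrase] + smoothing) / (sum(general.values()) + smoothing)
    return med_rate / gen_rate


# Toy data, invented for this example only.
medical = phrase_counts(["i had a dry mouth all day", "dry mouth and weight gain"])
general = phrase_counts(["i had a great day", "all day long"])

# Rank medical-corpus phrases by how specific they are to the medical domain.
ranked = sorted(medical, key=lambda p: domain_ratio(p, medical, general), reverse=True)
```

In this sketch, a phrase such as "dry mouth" that appears only in the medical reviews scores higher than an everyday phrase such as "all day" that appears in both corpora, which is the intuition behind comparing the two domains.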

