Abstract

Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasibility of having such a framework to provide efficient querying of unstructured data in unlimited ways. The feasibility study was conducted specifically in the epilepsy field. The proposed framework evaluates a query in two phases. In phase 1, structured data is used to filter the clinical data warehouse. In phase 2, feature extraction modules are executed on the unstructured data in a distributed manner via Hadoop to complete the query. Three modules have been created, volume comparer, surface to volume conversion and average intensity. The framework allows for user-defined modules to be imported to provide unlimited ways to process the unstructured data hence potentially extending the application of this framework beyond epilepsy field. Two types of criteria were used to validate the feasibility of the proposed framework – the ability/accuracy of fulfilling an advanced medical query and the efficiency that Hadoop provides. For the first criterion, the framework executed an advanced medical query that spanned both structured and unstructured data with accurate results. For the second criterion, different architectures were explored to evaluate the performance of various Hadoop configurations and were compared to a traditional Single Server Architecture (SSA). The surface to volume conversion module performed up to 40 times faster than the SSA (using a 20 node Hadoop cluster) and the average intensity module performed up to 85 times faster than the SSA (using a 40 node Hadoop cluster). Furthermore, the 40 node Hadoop cluster executed the average intensity module on 10,000 models in 3h which was not even practical for the SSA. The current study is limited to epilepsy field and further research and more feature extraction modules are required to show its applicability in other medical domains. The proposed framework advances data-driven medicine by unleashing the content of unstructured medical data in an efficient and unlimited way to be harnessed by medical experts.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.