Abstract

There is a strong need to systematically organize and comprehend the rapidly expanding stores of biomedical knowledge in order to formulate hypotheses on disease mechanisms. However, no available method automatically structures fragmentary knowledge, together with its domain-specific expressions, for large-scale integration. The method presented here, cross-subspace analysis (CSA), produces a holistic view of over 3,000 human genes as a two-dimensional (2D) arrangement. The genes are plotted according to functions determined by machine learning from the occurrence patterns of various biomedical terms in MEDLINE abstracts. By focusing on the 2D distributions of the plotted genes that share the same biomedical concept, as defined by databases such as Gene Ontology, relevant biomedical concepts can be computationally extracted. In an analysis that took myocardial infarction and ischemic stroke as examples, we found valid relations with lifestyle, diet-related metabolism, and host immune responses, all of which are known risk factors for these diseases. These results demonstrate that systematizing accumulated gene knowledge can lead to hypothesis generation and knowledge discovery, regardless of the area of inquiry or discipline.
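
The idea of judging a concept by the 2D distribution of the genes that share it can be illustrated with a small sketch. The abstract does not state which statistic CSA uses, so everything below is an assumption made for illustration only: the compactness measure (mean pairwise distance), the permutation test, the function names `mean_pairwise_distance` and `concept_compactness_pvalue`, and the toy coordinates are all hypothetical and are not the authors' method.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of 2D points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def concept_compactness_pvalue(coords, concept_genes, n_perm=1000, seed=0):
    """Empirical p-value that random gene sets are at least as compact.

    coords: dict mapping gene name -> (x, y) position in the 2D arrangement.
    concept_genes: genes annotated with one concept (e.g. one ontology term).
    Assumption: compactness is measured by mean pairwise distance.
    """
    rng = random.Random(seed)
    members = [g for g in concept_genes if g in coords]
    if len(members) < 2:
        raise ValueError("need at least two placed genes for a concept")
    observed = mean_pairwise_distance([coords[g] for g in members])
    all_genes = sorted(coords)
    hits = 0
    for _ in range(n_perm):
        # Draw a random gene set of the same size as the concept.
        sample = rng.sample(all_genes, len(members))
        if mean_pairwise_distance([coords[g] for g in sample]) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy layout: 5 tightly grouped genes plus 35 genes on a grid.
coords = {f"G{i}": (0.01 * i, 0.0) for i in range(5)}
for j in range(35):
    coords[f"G{j + 5}"] = (2.0 + j % 7, 2.0 + j // 7)

tight_concept = ["G0", "G1", "G2", "G3", "G4"]
scattered_concept = ["G5", "G12", "G20", "G30", "G39"]

p_tight = concept_compactness_pvalue(coords, tight_concept)
p_scattered = concept_compactness_pvalue(coords, scattered_concept)
print(f"tight concept p = {p_tight:.4f}, scattered concept p = {p_scattered:.4f}")
```

In this toy setting, a concept whose genes sit close together receives a small empirical p-value, while a concept whose genes are spread across the map does not. A real analysis would take coordinates from the learned 2D arrangement and gene sets from an annotation database, and would need correction for testing many concepts at once.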
