Abstract
With the vast amount of immunological data now available, immunology research is entering the big data era. These data vary in granularity, quality, and complexity and are stored in various formats, including publications, technical reports, and databases. The challenge is to move from data to actionable knowledge and wisdom, bridging both the knowledge gap and the application gap. We report a knowledge-based approach built on a framework called KB-builder, which facilitates data mining by enabling fast development and deployment of web-accessible immunological data knowledge warehouses. Immunological knowledge discovery relies heavily on the availability of accurate, up-to-date, and well-organized data and on proper analytics tools. We propose developing knowledgebases that combine well-annotated data with specialized analytical tools and integrating them into an analytical workflow. A set of well-defined workflow types with rich summarization and visualization capacity facilitates the transformation of data into critical information and knowledge. Using KB-builder, we streamlined the normally time-consuming process of database development. Knowledgebases built with KB-builder will speed up rational vaccine design by providing accurate, well-annotated data coupled with tailored computational analysis tools and workflows.
Highlights
Data represent the lowest level of abstraction and do not have meaning by themselves
The Human Papillomavirus T-cell Antigen Database (HPVdB) contains 2781 curated antigen entries of antigenic proteins derived from 18 genotypes of high-risk HPV and 18 genotypes of low-risk HPV
The data mining tools integrated in HPVdB support antigen and epitope/ligand search, sequence comparison using the Basic Local Alignment Search Tool (BLAST), multiple alignments of antigens, classification of HPV types based on cancer risk, T-cell epitope prediction, T-cell epitope/HLA ligand visualization, T-cell epitope/HLA ligand conservation analysis, and sequence variability analysis
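To make the sequence variability analysis mentioned above concrete, here is a minimal sketch in Python. It assumes per-position Shannon entropy over a multiple sequence alignment, a common measure of variability. The text does not say which measure HPVdB uses, so the function names and the toy alignment below are illustrative assumptions, not the database's actual implementation.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Sum of p * log2(1/p); a fully conserved column gives exactly 0.0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def variability_profile(aligned_seqs):
    """Per-position entropy for a list of equal-length aligned sequences."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    # zip(*seqs) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment (not real HPV data).
toy_alignment = ["MHGDT", "MHGDT", "MRGDA", "MHGNT"]
profile = variability_profile(toy_alignment)
```

Positions with entropy 0 are fully conserved in the alignment, and higher values indicate more variable positions. The same profile could in principle feed a conservation analysis of epitopes, for instance by checking that every position spanned by a candidate epitope stays below a chosen entropy threshold.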
Summary
Data represent the lowest level of abstraction and do not have meaning by themselves. Given the overwhelming amount of immunological data, moving from data to actionable knowledge and wisdom, and bridging the knowledge gap and application gap, presents several challenges. These include asking the “right questions,” handling unstructured data, controlling data quality (garbage in, garbage out), integrating data from various sources in various formats, and developing specialized analytics tools that can handle large volumes of data.