Abstract

Gaps exist, and continue to be propagated, between most data management systems in the biological domain, their potential users, and the ultimate goal of extracting critical knowledge in support of research in molecular and cell biology. This is, in part, because these systems have focused on the integration, annotation and rapid extraction of data from large, non-homogeneous data sets, which in itself is not a simple task. The fault lies not with the developers of data management systems but with the inability of research scientists to recognize the complexity of the questions that need to be addressed and to communicate these, rather than questions that can be solved through the implementation of new technology. Thus, it has become apparent that the development of new technology frequently drives the science rather than the science driving the technology development. There is no question that advances in computational algorithms (and grid computing) for string comparison, which drive solutions to problems such as sequence annotation, have greatly improved the ability to qualify and classify the large data sets coming from the genomic sequences of humans and other organisms. Analogous developments in hardware and software support the ability to solve large families of differential equations to approach the simulation of cell behavior and pathways. Both types of solutions, however, raise scientific questions: (1) Sequence homology is used to imply structural and functional homology, hence its use for annotation, but has this inference been substantiated mechanistically so that it can be used successfully in all the instances, that is, homology ranges, in which it is applied?
(2) Simulation of sets of differential equations requires a degree of accuracy and dependability in the data, as well as completeness and uniformity of conditions, that is not typically available in experimental results; what, then, is the sensitivity of the simulation to real data issues and constraints? The underlying biological hypotheses to which the technology is being applied sit on a slippery slope that the technology, by itself, cannot address. Many scientists study non-human organisms, but when we talk of bioinformatics applications in the context of cell and molecular biology, we are frequently using those organisms as surrogates for the human. This extension towards the human organism significantly increases the complexity of the research problem and confounds it because of the more limited ability to develop adequate experimental results. Clinical data, including biochemical, microarray and genetic analysis, tend to be less reliable, overall, than most cell and molecular biology data, particularly when disease diagnosis and clinical outcome are included. Thus, a major challenge to the translational researcher, that is, the research physician, is the dependency on the integration of data that range widely in both quality and quantity for use in research and evaluation. Such data range from quantitative laboratory results, to histological images, to written or transcribed comments entered into the patient’s chart. All of this adds a further level of technical complexity in the area of security because of the need to comply with HIPAA regulations and to protect patient confidentiality, both of which typically lead academic medical centers to maintain separate networks for clinical systems and for university systems, for example, those supporting cell and molecular biology.
