Abstract

It is vital to improve our ability to mine database information effectively in the pursuit of answers to biologically important questions. There is a deluge of DNA-sequence and expression-profile information, and the flood has only begun as technology improves to generate more data at a rapidly increasing pace. Valuable as these data are, they can only help to answer biological questions in a synergistic relationship with laboratory experimentation. The significant bonus is that fewer unnecessary, time-consuming and costly laboratory bench experiments will be done, and this phase of the research will begin at a far more advanced level. To many, the arrival of the genome (the complete DNA sequence of an organism) and the emerging transcriptome (the level of expression of transcribed sequences in particular tissues under specific conditions), proteome (all the proteins present in a cell at a particular period in time and their interactions with one another) and metabolome (an understanding of how proteins interact to effect metabolic pathways) is a frightening development, but it should be an exciting prospect to all.

Integrating the available data effectively is a challenge. A collection of conceptual models has been developed by Paton et al. (Bioinformatics 2000; 16: 548–557) to analyse information resources, to help assign gene function and to determine the pathways of gene action and interaction. With the effective use of flow charts, they describe the nature of genomic data (focusing on completed genomes) and of transcribed chromosome segments, as well as integration with protein-interaction and transcriptome data.
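The value of this kind of integration can be hinted at with a minimal sketch. The classes and the query below are illustrative assumptions of ours, not the schema of Paton et al.: they show how linking genomic, transcriptome and protein-interaction records lets one ask a question that no single resource can answer alone (for example, "which interaction partners of a gene are expressed in a given tissue?").

```python
from dataclasses import dataclass, field

@dataclass
class Gene:
    # Minimal genomic entity: a name and a chromosomal location.
    name: str
    chromosome: str

@dataclass
class ExpressionMeasurement:
    # Transcriptome datum: expression level of a gene in a tissue/condition.
    gene: Gene
    tissue: str
    condition: str
    level: float

@dataclass
class ProteinInteraction:
    # Proteome datum: an observed interaction between two gene products.
    gene_a: Gene
    gene_b: Gene

@dataclass
class IntegratedModel:
    # A toy integration layer over the three resource types.
    expression: list = field(default_factory=list)
    interactions: list = field(default_factory=list)

    def partners_expressed_in(self, gene, tissue):
        """Names of genes that interact with `gene` AND show non-zero
        expression in `tissue` -- a cross-resource query that requires
        the transcriptome and interaction data to be integrated."""
        partners = {i.gene_b.name for i in self.interactions if i.gene_a is gene}
        partners |= {i.gene_a.name for i in self.interactions if i.gene_b is gene}
        expressed = {m.gene.name for m in self.expression
                     if m.tissue == tissue and m.level > 0}
        return sorted(partners & expressed)
```

The gene names and measurements used with such a model would come from the underlying databases; here they are purely hypothetical placeholders.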
Their models provide an important starting point from which to develop additional computer algorithms for the search, analysis and visualization of data towards a particular biological focus.

Computational outputs that depend on the available data can easily change when new data are added. It is therefore essential that provisional suggestions resulting from computer analysis are validated by appropriate biological experimentation. Loging et al. (Genome Res. 2000; 10: 1393–1402) describe a good example of this process in their attempt to identify potential tumour markers and antigens. They mined several cancer expressed sequence tag (EST) databases to identify genes that are not expressed in normal neural tissue but are abundantly expressed in a particular type of brain tumour. After identifying 76 neural-specific genes that were overexpressed in these tumours, they chose 13 for further analysis using a rapid method for measuring gene expression, termed fluorescent-PCR expression comparison. Seven of these genes have been identified as potential tumour markers and merit further investigation. This is one of an increasing number of papers that underscore the value of database mining coupled with experimental validation.
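The core of such a screen, retaining genes with abundant EST tags in the tumour library but few or none in normal tissue, can be sketched as a simple filter. The gene names, tag counts and thresholds below are invented for illustration and are not the actual criteria or data of Loging et al.:

```python
def candidate_tumour_markers(tumour_counts, normal_counts,
                             min_tumour=5, max_normal=0):
    """Return genes with at least `min_tumour` EST tags in the tumour
    library and no more than `max_normal` tags in normal tissue.
    Both inputs map gene name -> EST tag count; thresholds are
    arbitrary illustrative defaults."""
    return sorted(
        gene for gene, n_tumour in tumour_counts.items()
        if n_tumour >= min_tumour
        and normal_counts.get(gene, 0) <= max_normal
    )

# Hypothetical EST tag counts per gene in each library.
tumour = {"GENE_A": 12, "GENE_B": 7, "GENE_C": 2, "GENE_D": 9}
normal = {"GENE_B": 4, "GENE_C": 0}

print(candidate_tumour_markers(tumour, normal))  # ['GENE_A', 'GENE_D']
```

As the text stresses, a list produced this way is provisional: it shifts as new ESTs are deposited, which is exactly why candidates must then be tested at the bench.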
