Abstract
Background: The systematic, complete and correct reconstruction of genome-scale metabolic networks or metabolic pathways is one of the most challenging tasks in systems biology research. An essential requirement is access to complete biochemical knowledge, especially on biochemical reactions. This knowledge is extracted from the scientific literature and collected in biological databases. Since the available databases differ in the number of biochemical reactions and in how the reactions are annotated, an integrated knowledge resource would be of great value.
Results: We developed a comprehensive non-redundant reaction database containing known enzyme-catalyzed and spontaneous reactions. Currently, it comprises 18,172 unique biochemical reactions. The biochemical databases BRENDA, KEGG, and MetaCyc served as source databases. Reactions from these databases were matched and integrated by aligning their substrates and products. Substrates and products were compared in two steps, using their structures (via InChIs) and their names. Each biochemical reaction given as a reaction equation in at least one of the databases was included.
Conclusions: An integrated non-redundant reaction database has been developed and is made available to users. The database can significantly facilitate and accelerate the construction of accurate biochemical models.
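The two-step compound comparison and the alignment of substrates and products described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Compound` class, the `normalize_name` rule, the greedy pairing in `same_side`, and the placeholder InChI strings are all assumptions made for illustration, and stoichiometry and reaction direction are ignored.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Compound:
    name: str
    inchi: Optional[str] = None  # structure identifier; may be missing in a source


def normalize_name(name: str) -> str:
    # Hypothetical normalization rule: case-fold and drop spaces and hyphens.
    return "".join(ch for ch in name.casefold() if ch not in " -")


def same_compound(a: Compound, b: Compound) -> bool:
    # Step 1: compare structures when both entries carry an InChI.
    if a.inchi and b.inchi:
        return a.inchi == b.inchi
    # Step 2: otherwise fall back to comparing the normalized names.
    return normalize_name(a.name) == normalize_name(b.name)


def same_side(xs: Sequence[Compound], ys: Sequence[Compound]) -> bool:
    # Two reaction sides match if their compounds pair up one-to-one.
    # Greedy pairing is a simplification that suffices for a sketch.
    remaining = list(ys)
    for x in xs:
        match = next((y for y in remaining if same_compound(x, y)), None)
        if match is None:
            return False
        remaining.remove(match)
    return not remaining


Reaction = Tuple[Sequence[Compound], Sequence[Compound]]  # (substrates, products)


def same_reaction(r1: Reaction, r2: Reaction) -> bool:
    # Reactions are treated as equivalent when both sides align.
    return same_side(r1[0], r2[0]) and same_side(r1[1], r2[1])


# Placeholder InChI strings ("X-…"), not real identifiers.
rxn_a = ([Compound("ATP", "X-atp"), Compound("D-Glucose", "X-glc")],
         [Compound("ADP", "X-adp"), Compound("D-Glucose 6-phosphate")])
rxn_b = ([Compound("d-glucose", "X-glc"), Compound("Adenosine triphosphate", "X-atp")],
         [Compound("d glucose 6 phosphate"), Compound("adp", "X-adp")])

is_match = same_reaction(rxn_a, rxn_b)
```

In this sketch the structure comparison takes precedence whenever both entries have an InChI, so two entries with the same name but different structures are kept apart; names are consulted only when structural information is missing on at least one side.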
Highlights
The systematic, complete and correct reconstruction of genome-scale metabolic networks or metabolic pathways is one of the most challenging tasks in systems biology research
The combined database contains a unique list of the reactions that occur in any of the compared databases, BRENDA [1], KEGG [2], and MetaCyc [3], together with the associations between equivalent reactions
These reactions are assigned to KEGG and MetaCyc pathways
Summary
The systematic, complete and correct reconstruction of genome-scale metabolic networks or metabolic pathways is one of the most challenging tasks in systems biology research. An essential requirement is access to complete biochemical knowledge, especially on biochemical reactions. This knowledge is extracted from the scientific literature and collected in biological databases. Since the available databases differ in the number of biochemical reactions and in how the reactions are annotated, an integrated knowledge resource would be of great value. A number of sources for biochemical reactions exist, such as the databases BRENDA [1], KEGG [2], and MetaCyc [3]. Because the completeness of reaction data differs between these databases, it is important to combine their reaction information in the form of an integrated reaction database. Because the source databases use different compound names, compound IDs, and reaction IDs for the biochemical reactions they describe, a comparison is far from straightforward.
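To make the idea of an integrated, non-redundant reaction list concrete, the following Python sketch groups reactions from several sources under one key and records which source IDs refer to the same reaction. It assumes that compounds have already been mapped to shared canonical keys, which is the hard part discussed above. The function name `merge_reactions`, the record layout, and all reaction IDs are hypothetical, and stoichiometric coefficients are ignored.

```python
from collections import defaultdict


def merge_reactions(entries):
    """Group source reactions that share the same substrates and products.

    entries: iterable of (source, reaction_id, substrates, products), where
    substrates and products are collections of canonical compound keys
    (an assumption of this sketch).
    Returns a dict mapping each unique reaction to {source: [reaction_ids]}.
    """
    merged = defaultdict(dict)
    for source, reaction_id, substrates, products in entries:
        # Order within a side does not matter, so use frozensets as the key.
        key = (frozenset(substrates), frozenset(products))
        merged[key].setdefault(source, []).append(reaction_id)
    return dict(merged)


# Hypothetical records; the IDs are made up for illustration.
entries = [
    ("KEGG", "kegg-rxn-1", ["atp", "glucose"], ["adp", "glucose-6-phosphate"]),
    ("MetaCyc", "metacyc-rxn-7", ["glucose", "atp"], ["glucose-6-phosphate", "adp"]),
    ("BRENDA", "brenda-rxn-3", ["urea", "water"], ["co2", "ammonia"]),
]

unique_reactions = merge_reactions(entries)
for (substrates, products), sources in unique_reactions.items():
    print(sorted(substrates), "->", sorted(products), sources)
```

Each key in the result is one entry of the unique reaction list, and the attached dictionary plays the role of the associations between equivalent reactions in the source databases.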