The advent of high-throughput technologies and the development of mathematical models in systems biology have led to finer and clearer observations in various domains of biology, especially proteomics, metabolomics, and genomics. Redox biology is a promising domain for these approaches. Researchers have made some successful initial attempts, including the development of a redox protein database and systems biology studies of thiol redox systems. However, an integrated database of redox parameters is still needed to accelerate redox biology research. The main challenges are the collection and curation of a large dataset of redox parameters, which remain scattered across several thousand research articles. We developed a community-based approach to gather the data and build a composite database. We adapted existing models of community-based data collection used in tuberculosis research and prepared an organized hierarchical system for data collection, validation, and manual curation. We have gathered more than 20 different parameters of redox systems in more than 200 datasets, which is likely to increase to >20,000 datasets by the second phase of the project. Google Forms and live Google Sheets were used for data collection, followed by data curation by specific team leaders. We observed a high degree of heterogeneity in the data in terms of the units used and the prevailing experimental conditions. Curators normalized and segregated these data, which can be beneficial for next-level integration with systems biology tools. The dataset will be transformed into a highly interactive web portal with machine learning and artificial intelligence tools to provide live simulations of redox biology in various model systems.