Abstract
Numerous genomes are sequenced and made available to the community through the NCBI portal. However, unlike gene function annotation, the annotation of promoter sequences and the underlying prediction of regulatory associations is mostly unavailable, which severely limits the ability to interpret genome sequences from a functional genomics perspective. Here we present an approach in which one can download a genome of interest from NCBI in the GenBank Flat File (.gbff) format and, with a minimal set of commands, have all the information parsed, organized and made available through the platform web interface. The new genomes are also compared with a given reference genome in search of homologous genes, shared regulatory elements and predicted transcription associations. We present this approach within the context of Community YEASTRACT of the YEASTRACT+ portal, thus benefiting from immediate access to all the comparative genomics queries offered in the YEASTRACT+ portal. Besides the yeast community, other communities can install the platform independently, without any constraints. In this work, we exemplify the usefulness of the presented tool, within Community YEASTRACT, in constructing a dedicated database and analysing the genome of the highly promising oleaginous red yeast species Rhodotorula toruloides, which is currently poorly studied at the genome and transcriptome levels and has limited genome editing tools. Regulatory prediction is based on the conservation of promoter sequences and available regulatory networks. The case study examined is focused on the Haa1 transcription factor, a key regulator of yeast resistance to acetic acid, an important inhibitor of industrial bioconversion of lignocellulosic hydrolysates. The new tool described here led to the prediction of an RtHaa1 regulon with expected impact on the optimization of R. toruloides robustness for lignocellulosic and pectin-rich residue biorefinery processes.
Highlights
The analysis of newly sequenced genomes of new species and/or strains remains hindered by the lack of biological tools and databases, which are currently available only for model organisms
This tool takes a pioneering approach to genome annotation by including functional promoter analysis, based on the evaluation of transcription factor consensus occurrence, and regulatory network prediction, based on the corresponding knowledge gathered for related species
The offered tool was primarily designed for yeast species, taking advantage of the support given by the YEASTRACT+ portal and the data included therein
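To make "transcription factor consensus occurrence" concrete, the following is a minimal sketch of how a promoter sequence could be scanned for a consensus written in IUPAC nucleotide codes, on both strands. It is not the platform's implementation; the function names, the example promoter and the motif string `SMGGSG` are illustrative assumptions and are not taken from the text above.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (IUPAC codes allowed)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that reports overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # The lookahead consumes no characters, so overlapping sites are all found.
    return re.compile("(?=(" + body + "))")


def find_sites(promoter, consensus):
    """Return sorted (position, strand, site) tuples.

    Positions are 0-based on the forward strand; minus-strand sites are
    reported as read 5'->3' on the minus strand.
    """
    promoter = promoter.upper()
    hits = [(m.start(), "+", m.group(1))
            for m in consensus_to_regex(consensus).finditer(promoter)]
    rc_consensus = reverse_complement(consensus)
    if rc_consensus != consensus.upper():  # palindromes would be counted twice
        hits += [(m.start(), "-", reverse_complement(m.group(1)))
                 for m in consensus_to_regex(rc_consensus).finditer(promoter)]
    return sorted(hits)


# Hypothetical promoter fragment and illustrative motif (assumptions).
example_sites = find_sites("TTGAGGGGAACGCCTCAT", "SMGGSG")
```

Scanning the reverse complement of the consensus, rather than of the promoter, keeps all reported positions in forward-strand coordinates, which makes hits from both strands directly comparable.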
Summary
The analysis of newly sequenced genomes of new species and/or strains remains hindered by the lack of biological tools and databases, which are currently available only for model organisms. Starting from a genome of interest downloaded from NCBI in the GenBank Flat File (.gbff) format, and with a minimal set of commands, the setup procedure parses all the information, organizes it in a local database and provides a local web interface with tools similar to those available in the YEASTRACT database. R. toruloides can use a wide range of carbon sources for growth and is tolerant to inhibitory compounds found in biomass hydrolysates [5]. It is a good example of a biotechnologically relevant yeast for which genome assemblies are available from NCBI, but for which there is currently no database or tools for the comprehensive study of regulatory networks. Increased R. toruloides acetic acid tolerance is needed for the improved use of the carbon sources present in sugar beet pulp hydrolysates [6], and increased expression of RtHaa1 and the RtHaa1 regulon may be useful to that end [7].
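Once gene coordinates have been parsed from a .gbff file, promoter analysis needs the sequence upstream of each gene. The sketch below shows one way this step could look, assuming 1-based inclusive coordinates as used in GenBank feature locations. The function name `upstream_region`, the toy chromosome and the default window of 1000 bp are assumptions made for illustration, not details given in the text.

```python
# Complement table for unambiguous bases plus N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chromosome, start, end, strand, length=1000):
    """Return up to `length` bases upstream of a gene, read 5'->3'.

    `start` and `end` are 1-based inclusive forward-strand coordinates.
    The region is truncated at the chromosome ends (assumed behaviour).
    """
    chromosome = chromosome.upper()
    if strand == "+":
        # Upstream lies before the gene start on the forward strand.
        low = max(0, start - 1 - length)
        return chromosome[low:start - 1]
    if strand == "-":
        # Upstream lies after the gene end on the forward strand and
        # must be reverse complemented to read in the gene's direction.
        high = min(len(chromosome), end + length)
        return reverse_complement(chromosome[end:high])
    raise ValueError("strand must be '+' or '-'")


# Toy 12-base chromosome (hypothetical data).
toy = "AACCTTGGACGT"
plus_promoter = upstream_region(toy, 7, 9, "+", length=4)
minus_promoter = upstream_region(toy, 4, 6, "-", length=4)
```

Returning every promoter in the gene's own 5'->3' orientation means downstream steps, such as consensus scanning, can treat genes on both strands identically.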