Abstract

A first clue to gene function can be obtained by examining whether a gene is required for life in certain standard conditions, that is, whether a gene is essential. In bacteria, essential genes are usually identified by high-density transposon mutagenesis followed by sequencing of insertion sites (Tn-seq). These studies assign the term “essential” to whole genes rather than the protein domain sequences that encode the essential functions. However, genes can code for multiple protein domains that evolve their functions independently. Therefore, when essential genes code for more than one protein domain, only one of them could be essential. In this study, we defined this subset of genes as “essential domain-containing” (EDC) genes. Using a Tn-seq data set built-in Burkholderia cenocepacia K56-2, we developed an in silico pipeline to identify EDC genes and the essential protein domains they encode. We found forty candidate EDC genes and demonstrated growth defect phenotypes using CRISPR interference (CRISPRi). This analysis included two knockdowns of genes encoding the protein domains of unknown function DUF2213 and DUF4148. These putative essential domains are conserved in more than two hundred bacterial species, including human and plant pathogens. Together, our study suggests that essentiality should be assigned to individual protein domains rather than genes, contributing to a first functional characterization of protein domains of unknown function.

Highlights

  • A first clue to gene function can be obtained by examining whether a gene is required for life in certain standard conditions, that is, whether a gene is essential

  • We operationally defined this subclass of essential genes as “essential domain-containing” (EDC) genes (Fig. 1c, d) and present a computational pipeline to identify them in a Tn-seq dataset built-in Burkholderia cenocepacia K56-217

  • Using a CRISPR Interference (CRISPRi)[18] platform we developed for Burkholderia[19], we experimentally confirmed growth defects, representing the loss of a putative essential function, in 27 EDC gene knockdowns

Read more

Summary

Introduction

A first clue to gene function can be obtained by examining whether a gene is required for life in certain standard conditions, that is, whether a gene is essential. Essential genes are usually identified by high-density transposon mutagenesis followed by sequencing of insertion sites (Tn-seq) These studies assign the term “essential” to whole genes rather than the protein domain sequences that encode the essential functions. Tn-seq analysis may classify a gene as “non-essential” due to the presence of transposon insertions in a non-essential coding region, despite the gene coding for a second domain not spanning through the whole gene length that might be e­ ssential[3,15,16] We operationally defined this subclass of essential genes as “essential domain-containing” (EDC) genes (Fig. 1c, d) and present a computational pipeline to identify them in a Tn-seq dataset built-in Burkholderia cenocepacia K56-217. This study highlights that gene essentiality depends on the function of individual protein domains rather than entire proteins

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call