Abstract

Motivation: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now.

Results: We systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. To a considerable extent, control strengths inferred using expression data from one growth condition carry over to other conditions, so the control strength matrices derived here can be used by others. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of yeast TFs Gcr2, Gln3, Gcn4 and Msn2.

Availability and implementation: Evaluation code and data are available at https://doi.org/10.5281/zenodo.4050573.

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential

  • E = CS ● TFA, where E is a gene expression matrix, CS is a matrix of control strengths augmented to incorporate baselines, TFA is a matrix of TF activity levels (TFs by samples), and ● indicates matrix multiplication (Fig. 1)

  • Fitting the CS and TFA matrices to expression data is equivalent to factoring the expression matrix, under the constraints that CS signs are predetermined, TFA is non-negative, and the activities of perturbed TFs are constrained according to the perturbation
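The constrained factorization described in these highlights can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal projected-gradient example on synthetic data, and the function name `factor_expression`, the `signs` and `knocked_out` inputs, and the choice to pin a deleted TF's activity to zero are all assumptions made for illustration. The baseline column of CS is kept as a separate vector multiplied by a row of ones.

```python
import numpy as np


def factor_expression(E, signs, knocked_out, n_iter=500, seed=0):
    """Illustrative fit of E ~ CS @ TFA + baseline (hypothetical helper).

    E           : genes x samples expression matrix
    signs       : genes x TFs matrix in {-1, 0, +1}; 0 means "no edge"
    knocked_out : TFs x samples boolean mask; True pins that activity to 0
                  (an assumed way of encoding a deletion perturbation)
    """
    rng = np.random.default_rng(seed)
    n_samples = E.shape[1]
    n_tfs = signs.shape[1]

    # Start from values that already satisfy every constraint.
    cs = signs * rng.uniform(0.1, 1.0, size=signs.shape)
    baseline = E.mean(axis=1, keepdims=True)
    tfa = rng.uniform(0.5, 1.5, size=(n_tfs, n_samples))
    tfa[knocked_out] = 0.0
    ones = np.ones((1, n_samples))

    history = []
    for _ in range(n_iter):
        # Gradient step on TFA, then project: non-negative, knockouts at 0.
        resid = cs @ tfa + baseline - E
        history.append(float(np.sum(resid ** 2)))
        step = 1.0 / (np.linalg.norm(cs, 2) ** 2 + 1e-12)
        tfa = np.maximum(tfa - step * (cs.T @ resid), 0.0)
        tfa[knocked_out] = 0.0

        # Gradient step on CS and baseline, then project CS onto its signs.
        resid = cs @ tfa + baseline - E
        augmented = np.vstack([tfa, ones])
        step = 1.0 / (np.linalg.norm(augmented, 2) ** 2 + 1e-12)
        cs = cs - step * (resid @ tfa.T)
        baseline = baseline - step * resid.sum(axis=1, keepdims=True)
        cs = np.where(signs > 0, np.maximum(cs, 0.0),
                      np.where(signs < 0, np.minimum(cs, 0.0), 0.0))

    resid = cs @ tfa + baseline - E
    history.append(float(np.sum(resid ** 2)))
    return cs, baseline, tfa, history


# Synthetic example: 30 genes, 4 TFs, 12 samples, two single-TF knockouts.
rng = np.random.default_rng(1)
signs = rng.choice([-1, 0, 1], size=(30, 4))
true_cs = signs * rng.uniform(0.5, 2.0, size=signs.shape)
true_tfa = rng.uniform(0.0, 2.0, size=(4, 12))
knocked_out = np.zeros((4, 12), dtype=bool)
knocked_out[0, 0] = True
knocked_out[1, 1] = True
true_tfa[knocked_out] = 0.0
E = true_cs @ true_tfa + 5.0

cs, baseline, tfa, history = factor_expression(E, signs, knocked_out)
print("squared error before and after fitting:", history[0], history[-1])
```

Each half-step uses a step size of one over the squared spectral norm of the fixed factor, so the squared error should not increase between iterations. Because the factorization is only determined up to rescaling between CS and TFA, the recovered values need not match `true_cs` and `true_tfa` entry by entry.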


Summary

Introduction

The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Results: Using a new dataset, we systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. These approaches require as input a TF network map, which specifies the target genes of each TF. Evaluation code and data are available at https://github.com/BrentLab/TFA-evaluation. Conclusions: When a high-quality network map, constraints, and perturbation-response data are available, inferring TF activity levels by factoring gene expression matrices is effective and provides insight into regulators of TF activity. Inferred activity levels could be used to improve TF network mapping [5, 12, 14, 16, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33].

