Abstract
Network component analysis (NCA) is a method to deduce transcription factor (TF) activities and TF-gene regulation control strengths from gene expression data and a TF-gene binding connectivity network. Previously, this method could analyze a maximum number of regulators equal to the total sample size because of the identifiability limit in data decomposition. As such, the total number of source signal components was limited to the total number of experiments rather than the total number of biological regulators. However, networks that have less transcriptome data points than the number of regulators are of interest. Thus it is imperative to develop a theoretical basis that allows realistic source signal extraction based on relatively few data points. On the other hand, such methods would inherently increase numerical challenges leading to multiple solutions. Therefore, solutions to both the problems are needed. We have improved NCA for transcription factor activity (TFA) estimation, based on the observation that most genes are regulated by only a few TFs. This observation leads to the derivation of a new identifiability criterion which is tested during numerical iteration that allows us to decompose data when the number of TFs is greater than the number of experiments. To show that our method works with real microarray data and has biological utility, we analyze Saccharomyces cerevisiae cell cycle microarray data (73 experiments) using a TF-gene connectivity network (96 TFs) derived from ChIP-chip binding data. We compare the results of NCA analysis with the results obtained from ChIP-chip regression methods, and we show that NCA and regression produce TFAs that are qualitatively similar, but the NCA TFAs outperform regression in statistical tests. We also show that NCA can extract subtle TFA signals that correlate with known cell cycle TF function and cell cycle phase. Overall we determined that 31 TFs have statistically periodic TFAs in one or more experiments, 75% of which are known cell cycle regulators. In addition, we find that the 12 TFAs that are periodic in two or more experiments correspond to well-known cell cycle regulators. We also investigated TFA sensitivity to the choice of connectivity network we constructed two networks using different ChIP-chip p-value cut-offs. The NCA Toolbox for MATLAB is available at http://www.seas.ucla.edu/~liaoj/download.htm.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.