Abstract

Scanning genomes for potential transcription factor binding sites (TFBSs) is becoming increasingly important in the post-genomic era. The position weight matrix (PWM) is the standard representation of TFBSs used when scanning sequences for potential binding sites. Many transcription factor (TF) motifs are short and highly degenerate, so PWM-based scanning methods are plagued by false positives. Furthermore, many important TFs do not have well-characterized PWMs, which makes identifying their potential binding sites even more difficult. One approach to identifying sites for these TFs has been to use the 3D structure of the TF to predict the DNA structure around the TF and then to generate a PWM from the predicted 3D complex structure. However, this approach depends on how closely the predicted structure resembles the native structure. We introduce here a novel structure-based approach to identifying TFBSs that can be applied to TFs without characterized PWMs, as long as a 3D structure of the TF/DNA complex exists. Our approach uses an energy function that is trained individually on each structure, and thus leads to increased prediction accuracy and robustness compared with approaches that use a more general energy function. The software is freely available upon request. Please see the supplementary material for details.
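For readers unfamiliar with PWM scanning, the baseline technique the abstract refers to can be sketched as follows. This is a minimal illustration of generic log-odds PWM scanning, not the structure-based method introduced here. The motif counts, the uniform background frequency, the pseudocount, and the function names `build_pwm` and `scan` are all assumptions made for the example.

```python
import math

# Hypothetical per-position base counts for a 4-bp motif.
# These numbers are illustrative only and do not come from the paper.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Slide the PWM along one strand and return (start, window, score) hits."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = build_pwm(COUNTS)
strict_hits = scan("TTACGTGG", pwm, threshold=4.0)
loose_hits = scan("TTACGTGG", pwm, threshold=-100.0)
```

In this sketch the threshold is the only guard against spurious matches. With a short, degenerate motif, lowering it quickly admits windows that merely resemble the motif, which is the false-positive problem described above.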
