Abstract
A promoter is a key DNA element located near the transcription start site that regulates gene transcription by binding RNA polymerase. The identification of promoters is therefore an important research area in synthetic biology. Nannochloropsis is an important unicellular, oleaginous industrial microalga. Several studies have identified Nannochloropsis promoters with specific functions by biological methods, but few have used computational methods. Here, we propose DNPPro (DenseNet-Predict-Promoter), a method based on densely connected convolutional neural networks, to predict promoters in Nannochloropsis. First, we collected promoter sequences from six Nannochloropsis strains and, for each strain, used CD-HIT to remove redundant sequences at an 80% similarity threshold, yielding a reliable positive dataset. Then, to construct a robust classifier, we generated the negative dataset with a within-group scrambling method, which overcomes the limitation of randomly selecting non-promoter regions from the same genome as negative samples. Finally, we constructed a densely connected convolutional neural network that takes the one-hot encoding of the sequence as input. Compared with commonly used sequence processing methods, DNPPro extracts features from long sequences more effectively. Cross-strain experiments on an independent dataset verify the generalization ability of our method, and t-SNE visualization shows that it can effectively distinguish promoters from non-promoters.
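To make the two data-preparation ideas above concrete, the following is a minimal Python sketch, not the authors' implementation. It shows one-hot encoding of a DNA sequence and one possible reading of within-group scrambling. The abstract does not define the scrambling procedure, so the sketch assumes it means splitting each promoter into fixed-length groups and shuffling the nucleotides inside each group. The function names, the `group_size` parameter, and the handling of ambiguous bases are all hypothetical.

```python
import random

ALPHABET = "ACGT"  # column order of the one-hot encoding (an assumption)


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order.

    Symbols outside ACGT (for example N) become all-zero rows; this is an
    illustrative choice, not something stated in the abstract.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def scramble_within_groups(seq, group_size=10, seed=None):
    """Build a negative sample by shuffling bases inside consecutive groups.

    Assumed interpretation of "within-group scrambling": the sequence is cut
    into chunks of `group_size` bases and each chunk is shuffled on its own,
    so local nucleotide composition is kept while local order is destroyed.
    """
    rng = random.Random(seed)
    pieces = []
    for start in range(0, len(seq), group_size):
        group = list(seq[start:start + group_size])
        rng.shuffle(group)
        pieces.append("".join(group))
    return "".join(pieces)


if __name__ == "__main__":
    promoter = "TATAAAGGCCGATTACAGGCTTAACG"  # made-up example sequence
    negative = scramble_within_groups(promoter, group_size=5, seed=0)
    print(promoter)
    print(negative)
    print(one_hot(promoter)[:3])
```

Under this assumption, a scrambled negative keeps the length and per-group base composition of its source promoter, so a classifier cannot separate the two classes by composition alone and must rely on sequence order.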