In bioinformatics, when mining genes that respond to osmotic stress, it is crucial to verify computationally the data obtained from complex experiments. This paper takes Arabidopsis thaliana as the experimental plant, designs a technology roadmap, and applies function and programming techniques to design the algorithms. A program first predicts the transcription start point; the promoter sequence is then extracted and simplified. Different alignment methods are also classified. The promoter sequence is then compared with the cis-elements, and a formula is applied for further processing. The final result is a probability value P, which experts and scholars can use to judge how strongly a promoter is correlated with osmotic stress. The chromosomal sequences used as experimental data come from GenBank database files, and the cis-element sequences associated with osmotic stress were collected from the TRANSFAC and TRRD databases. In addition to the Arabidopsis promoters, the authors used several other eukaryotic promoters for comparison, including the cotton GhNHX1 promoter and the rice OsNHX1 promoter. Of the data obtained in the biological laboratory, 70% were verified when running the program. When the P value is close to 0.8, this paper treats the promoter as containing osmotic stress cis-elements and the gene's expression as induced by osmotic stress. For Arabidopsis thaliana, cotton, and rice, the average running times of the program were 51 s, 72 s, and 114 s, respectively.
Two commonly used bioinformatics gene mining algorithms, MEME and BioProspector, were run on the same data. For all three methods, the average running time increases as the amount of data increases: the running time of MEME rises from 60 s to 198 s and that of BioProspector from 45 s to 150 s, while the model used in this paper takes 50 s, 75 s, 110 s, and 135 s. The authors conclude that, of the three algorithms, the model used here is better optimized than the other two: it maintains accuracy while offering higher speed and greater stability.
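The step of comparing a promoter sequence with cis-elements can be illustrated with a minimal Python sketch. It is not the authors' program: the abstract does not give the formula for P, so `placeholder_score` below is a hypothetical stand-in (the fraction of listed motifs found at least once). The motif table is also an assumption, using commonly cited consensus cores written in IUPAC codes rather than the paper's TRANSFAC/TRRD entries, and the function names are invented for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Illustrative consensus cores (assumed here, not taken from the paper).
EXAMPLE_MOTIFS = {
    "ABRE": "ACGTG",
    "DRE": "RCCGAC",
    "MYC": "CANNTG",
    "MYB": "YAACKG",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_hits(promoter, motif):
    """Return sorted (position, strand) hits; positions are 0-based on the + strand."""
    promoter = promoter.upper()
    pattern = motif_to_regex(motif)
    n, k = len(promoter), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(promoter)]
    # Scan the reverse complement and map coordinates back to the + strand.
    rc = reverse_complement(promoter)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


def placeholder_score(promoter, motifs):
    """Hypothetical score: fraction of motifs with at least one hit.

    This is NOT the paper's P value, only a stand-in for illustration.
    """
    found = sum(1 for motif in motifs.values() if find_hits(promoter, motif))
    return found / len(motifs)


if __name__ == "__main__":
    promoter = "TTACGTGGCAAACCGACTT"  # made-up sequence for demonstration
    for name, motif in EXAMPLE_MOTIFS.items():
        print(name, find_hits(promoter, motif))
    print("score:", placeholder_score(promoter, EXAMPLE_MOTIFS))
```

Scanning both strands matters because cis-elements can act in either orientation. A real pipeline would replace the placeholder score with the paper's own formula and then apply the threshold of P close to 0.8 mentioned in the abstract.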