Abstract

Esteves GH. Metodos estatisticos para a analise de dados de cDNA microarray em um ambiente computacional integrado [Statistical methods for cDNA microarray data analysis in an integrated computational environment]. Sao Paulo; 2007. [Tese de doutorado Universidade de Sao Paulo] High throughput gene expression analysis has a great importance to molecular biology nowadays because it can measure expression profiles for hundreds of genes, and this turn possible studies focused in systems biology. Between the main experimental techniques available in this direction, the microarray technology has been widely used. This experimental procedure to quantify gene expression profiles is very complex and the data obtained is frequently observational, what difficult the statistical modelling. There is not a standard protocol for the generation and evaluation of microarray data, therefore it is necessary to search by adequate analysis methods for each case. Thus, the main mathematical and statistical methods applied to microarray data analysis would have to be available in an organized, coherent and simple way in a computational environment that confer robustness, reliability and reproducibility to the data analysis. One way to guarantee these characteristics is through the representation (and documentation) of all used algorithms as a directed and acyclic graph that describes the set of transformations, or operations, applied sequentially to the dataset. According to this philosophy, an environment was implemented in this work aggregating several data analysis procedures already available in the literature, beyond other methods that were improved or proposed in this thesis. Between the procedures already available that were incorporated we can distinguish that ones for cluster analysis, differentially expressed genes and classifiers search, construction of relevance networks and functional classification of gene groups. Moreover, the method for construction of relevance networks was revised and improved and an statistical model was proposed and implemented for the functional classification of gene regulation networks. The last two procedures was born from biological problems for which adequate data analysis methods didn’t exist in the literature. Finally, we presented two datasets that were evaluated using several data analysis procedures available in this computational environment. Gustavo H. Esteves iv Bioinformatica-USP

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call