Abstract

Computational methods are becoming increasingly important for the analysis of large data sets in molecular biology. The data sets analyzed in this thesis are derived from experiments measuring changes in expression levels in response to the transcription factor CREM (cAMP Responsive Element Modulator) during mouse spermatogenesis. In the course of this analysis, new computational methods were developed and applied that will also be of value in other bioinformatics projects. CREM belongs to a family of cAMP-responsive nuclear factors. The activator splice isoform of CREM is expressed at high levels exclusively in post-meiotic germ cells during mouse spermiogenesis. Mutant male mice lacking CREM expression are sterile because their germ cells fail to mature. To find CREM target genes, the mRNA expression levels in testes of CREM-deficient and wild-type mice were compared using the suppression subtractive hybridization (SSH) technique as well as oligonucleotide DNA microarrays. SSH was used to selectively amplify the differentially expressed genes. A total of 12,000 clones, which contain sequence fragments of genes expressed more strongly in wild-type than in CREM (-/-) mutant mice, were analyzed by a combination of sequencing and hybridization. Sequence analysis methods were used to characterize 956 unique sequences. Homologies to 158 known mouse genes and 99 known genes from other organisms were detected. A total of 296 sequences show homology to expressed sequence tags (ESTs), and 199 novel sequences were found. Sequences that do not correspond to full-length genes of known function were characterized using publicly available EST data. To make EST databases useful for data analysis, all publicly available ESTs were grouped into clusters, and methods to analyze and visualize EST data were developed. Nylon cDNA microarrays containing the unique sequences from the CREM SSH library were constructed to determine the expression levels of those sequences.
Most of the sequences from the CREM SSH library are shown to be expressed in wild-type mice but down-regulated in CREM-deficient mice. Statistical methods to standardize microarray expression data were developed, and software was implemented to perform the comparisons. Further CREM-dependent genes were detected by comparing the mRNA expression levels in testes of CREM-deficient and wild-type mice using Affymetrix oligonucleotide microarrays containing 10,000 mouse sequences. A comparison of the different techniques (SSH, nylon cDNA arrays and Affymetrix oligonucleotide microarrays) shows that their results complement each other. The unique sequences from the CREM SSH library were further analyzed by determining their expression profiles across spermatogenic stages. cDNA from prepubertal mice at specific stages of spermatogenesis was hybridized to nylon cDNA arrays. Several important functional groups of genes, such as transcription factors, signal transduction proteins and metabolic enzymes, are shown to be co-expressed at the latest stages of spermatogenesis. Expression profiles were arranged to reveal similar profile shapes and co-regulation of functionally related genes. An algorithm to arrange the profiles in an optimal linear order was developed. The order is constructed so that similar expression profiles end up close together, i.e. the sum of the distances between neighboring profiles is minimized. This corresponds to solving a traveling salesman problem (TSP), a well-known problem in computer science. A fast algorithm that computes a heuristic solution to the TSP was adapted for use in expression profile analysis.
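
The idea of ordering expression profiles by solving a TSP can be illustrated with a minimal sketch. It is not the algorithm developed in the thesis, whose heuristic and distance measure are not specified here. The sketch assumes Euclidean distance between profiles, builds a starting order with a nearest-neighbor pass, and then improves it with 2-opt segment reversals on an open path. All function names and the toy profiles are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two profiles (an assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(profiles):
    """Pairwise distances between all profiles."""
    n = len(profiles)
    return [[euclidean(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def path_length(order, dist):
    """Sum of distances between neighboring profiles in a linear order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def nearest_neighbor_order(dist, start=0):
    """Greedy starting order: always move to the closest unvisited profile."""
    if not dist:
        return []
    order = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        last = order[-1]
        nxt = min(unvisited, key=lambda j: (dist[last][j], j))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def two_opt(order, dist):
    """Reverse segments of the order for as long as that shortens the path."""
    order = list(order)
    best = path_length(order, dist)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(order)), 2):
            candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
            length = path_length(candidate, dist)
            if length < best - 1e-12:
                order, best = candidate, length
                improved = True
    return order


def order_profiles(profiles):
    """Heuristic linear order in which similar profiles end up close together."""
    dist = distance_matrix(profiles)
    return two_opt(nearest_neighbor_order(dist), dist)


# Toy data (hypothetical): expression levels at four time points.
names = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"]
profiles = [
    [0, 0, 1, 5],
    [5, 1, 0, 0],
    [0, 0, 2, 6],
    [4, 1, 0, 0],
    [1, 3, 3, 1],
]
ordering = order_profiles(profiles)
print([names[i] for i in ordering])
```

The 2-opt step recomputes the full path length for every candidate reversal, which keeps the sketch short but is slow for thousands of profiles. A practical implementation would compute only the change in length at the two edges affected by a reversal.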
