Most publicly funded large-scale cDNA sequencing projects rely primarily on cDNA libraries that are not enriched for full-length cDNAs. Consequently, only a fraction of the resulting ESTs match the 5' end of the original transcript. The goal of the Genome Science Laboratory of RIKEN is to clone and sequence the largest possible number of full-length mouse cDNAs in two phases. The first phase is to classify the cDNAs, and the second is to complete full-length sequencing and functional annotation. We have developed two original methods to construct full-length cDNAs efficiently: the 'cap-trapper', which preferentially recognizes the cap site of mRNA, and the 'trehalose-thermoactivated reverse transcriptase', which allows the reverse transcriptase reaction to run at a higher temperature (60°C). We have constructed over 80 libraries from embryonic tissues of different developmental stages and from adult tissues to ensure the greatest possible coverage of the expressed mRNA. More than 200,000 successful sequencing passes have been performed using two tools developed in-house: a high-throughput plasmid preparation system and the RISA 384 capillary sequencer. Most of the sequencing was performed from the 3' end in order to select individual cDNAs, and we have selected more than 30,000 different cDNAs. Using these sets of RIKEN full-length cDNAs, we have established gene expression microarrays containing a 20 K set of unique mouse genes represented by RIKEN full-length cDNAs (http://genome.rtc.riken.go.jp). This set has been used to profile the expression patterns of various adult and embryonic tissues. Target DNAs were PCR-amplified and printed on poly-L-lysine-coated glass slides. Target DNAs were blocked with an excess of Cot-1 DNA. Probes were labelled with two-colour fluorescent dyes using random primers and reverse transcriptase. Normalization was achieved using a global normalization method, and we have also developed a program to filter out noise. The experiment was performed twice, and the reproducible results were extracted and clustered. We will present a large data set showing spatial and temporal gene expression patterns in mice. These mouse full-length 20 K cDNA microarrays are widely applicable to global expression profiling of mice in normal and diseased states.
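The global normalization step mentioned above can be illustrated with a minimal sketch. The abstract does not say which statistic was used, so this example assumes one common choice: the log2 ratios of the two dye channels are shifted so that their median across all spots is zero. The function name `global_normalize` and its arguments `red` and `green` are hypothetical and are not taken from the RIKEN software.

```python
import math
from statistics import median


def global_normalize(red, green):
    """Return log2(red/green) per spot, shifted so the median ratio is zero.

    Assumption: median centring of log2 ratios; the original method
    may have used a different global statistic.
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of spots")

    # Spots without positive signal in both channels have no defined
    # log ratio; they are kept as None so spot positions are preserved.
    log_ratios = [
        math.log2(r / g) if r > 0 and g > 0 else None
        for r, g in zip(red, green)
    ]

    valid = [x for x in log_ratios if x is not None]
    if not valid:
        raise ValueError("no spots with positive signal in both channels")

    # One global shift is applied to every spot on the array.
    shift = median(valid)
    return [None if x is None else x - shift for x in log_ratios]
```

Because a single shift is applied to the whole array, this approach assumes that most genes are not differentially expressed between the two labelled samples.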