Several abundant mRNAs are coordinately expressed specifically during the maturation stage of embryogenesis (6, 7). The cotton Mat mRNAs include those of the major vicilin and legumin storage protein genes (6), and representatives of these genes have been sequenced (2, 3). To help define the temporal and spatial regulation of the maturation program of gene expression, we isolated an additional cotton Mat gene, Mat5, which encodes mRNAs represented by cDNA clone C164. The mRNAs from the two Mat5 alloalleles in the allotetraploid genome of Gossypium hirsutum make up 2% of the mRNA in maturation stage embryos (6), and the sequences show that they encode methionine-rich 2S albumin storage proteins.
Three genomic fragments were recovered, but restriction site analysis and sequencing showed that they contain the same alloallele, Mat5-A, present in the A genome. One was extensively sequenced, as was all of cDNA C164 and extensive portions of five other cDNA clones (Table I). Figure 1 shows the sequence of clone GC164-24RC of Mat5-A. All six cDNAs are identical with each other in the regions sequenced, but they differ significantly from the sequence of Mat5-A. The gene and cDNA sequences are colinear, with eight nucleotide differences, until Mat5-A nucleotide 3661 and C164 nucleotide 514. Beyond this point there is no obvious similarity over the next 62 nucleotides, up to the site of polyadenylation of the cDNAs (Fig. 1). Because there are only two alloallelic genes in G. hirsutum (6), we presume that all the sequenced cDNAs are transcribed from the other alloallele, Mat5-D.
The two Mat5 alloalleles encode proteins that differ in only three of 139 amino acid residues. Their sequences are very similar to those of the 2S albumin storage protein family (1, 8, 9), and an alignment of the cotton sequences with other 2S albumin preproproteins is unambiguous in its important features (Fig. 2).
There is complete identity at the cysteine and leucine residues that are diagnostic of the family, and about 44% of the cotton residues are identical with those of the Bertholletia (1) and Arabidopsis (9) 2S albumins. The predicted mature form (9; Fig. 2) of the cotton 2S albumin shares with that of Bertholletia a high methionine content: 10% in cotton and 19% in Bertholletia, compared with 2 to 3% in other 2S albumins (1, 8, 9). Four of the 10 methionine residues in the cotton protein are at novel positions that may be sites at which other 2S albumins could be engineered for higher methionine content.
Because none of the six cDNAs that were examined is transcribed from Mat5-A, it is possible that this gene is not active. However, it is more likely that the apparent absence of Mat5-A cDNAs is due to failure to process or polyadenylate its transcripts. In the sequence in Figure 1 and in overlapping phage isolate GC164-27 (Table I), there is a class I-like retrotransposon (10) in reverse orientation to the Mat5-A transcription unit (data not shown). The retrotransposon's putative internal domain/3'-long terminal repeat boundary is at Mat5-A nucleotide 3791/92, and the 130 nucleotides between this boundary and the Mat5-A/Mat5-D cDNA divergence point can contain only a highly truncated long terminal repeat, if any part of it. We predict that the element, or element-induced rearrangements of Mat5-A sequence, interferes with the processing of the 3' end of Mat5-A transcripts. Rearrangement of Mat5-A in this region is suggested by the presence of long direct repeats, part of one of which is present in Mat5-D cDNAs (Fig. 1).
A sequence very similar to Mat5-A nucleotides 907 to 1765 has been found in reverse orientation in the upstream region of an unrelated cotton Lea4 gene (4). Portions of this element are also duplicated at Mat5-A nucleotides 2289 to 2416 (Fig. 1, repeats 1 and 2).
Consequently, the sequences that are responsible for transcription of Mat5-A are probably between nucleotide 2615 and the transcription start at nucleotide 3146. In this region the sequence of Mat5-A has some similarities with those of other 2S albumin storage protein genes, in particular two elements (double underlined in Fig. 1) present in similar locations in many of these genes (8, 9). Other elements suggested to be important in various storage protein genes (5) are not obvious in Mat5-A. Finally, Mat5-A has been compared with two other cotton Mat genes, vicilin (2) and legumin A (3), with which Mat5-A is coordinately expressed in cotton (6, 7), but no similarities are compelling in any sequence pair.
Supported by a grant from the National Institutes of Health.
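The coordinate arithmetic and residue statistics quoted in the text can be checked with a short sketch. This is an illustrative calculation only, not the analysis procedure used for this report. The function names `percent_identity` and `residue_fraction`, the choice of non-gap aligned columns as the identity denominator, and the two toy sequences are assumptions introduced here; the numeric inputs (139, 3, 3791, 3661, 3146, 2615) are taken from the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, gap-free columns at which two aligned sequences match.

    Assumption: both strings come from the same alignment (equal length,
    '-' marks a gap) and columns with a gap in either sequence are ignored.
    Other identity definitions use a different denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def residue_fraction(seq: str, residue: str = "M") -> float:
    """Percent of residues in seq (gaps removed) equal to `residue`."""
    ungapped = seq.replace("-", "")
    if not ungapped:
        raise ValueError("empty sequence")
    return 100.0 * ungapped.count(residue) / len(ungapped)


# Figures quoted in the text.
# Three differences in 139 residues between the Mat5-A and Mat5-D proteins.
alloallele_identity = 100.0 * (139 - 3) / 139

# Distance from the cDNA divergence point (nucleotide 3661) to the putative
# internal domain/3'-LTR boundary (nucleotide 3791/92).
ltr_gap = 3791 - 3661

# Derived here, not stated in the text: span of the presumed promoter region
# between nucleotide 2615 and the transcription start at nucleotide 3146.
promoter_window = 3146 - 2615

# Hypothetical toy peptides, NOT the cotton sequences.
toy_a = "MAKLC-QM"
toy_b = "MAKIC-QM"

if __name__ == "__main__":
    print(f"alloallele identity: {alloallele_identity:.1f}%")
    print(f"divergence point to LTR boundary: {ltr_gap} nt")
    print(f"presumed promoter window: {promoter_window} nt")
    print(f"toy identity: {percent_identity(toy_a, toy_b):.1f}%")
    print(f"toy methionine content: {residue_fraction(toy_a):.1f}%")
```

Applying `percent_identity` and `residue_fraction` to the actual alignment of Figure 2 would require the sequences themselves, which are not reproduced in the text.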