The messenger RNA for the protein silk fibroin has been isolated from the posterior silk gland of Bombyx mori and identified by partial sequence analysis. The sequence of the mRNA could be predicted because the protein has a simple repetitious primary structure in which glycine residues comprise 45% of all residues and alternate predominantly with alanine and serine. Codon assignments from the amino acid composition of fibroin predict that the fibroin mRNA should have a minimum G + C content of 57%, with a G content of about 40%. An RNA which sediments between 45 S and 65 S has been isolated from the posterior silk gland; its base composition is about 40% G and 19% C. Kinetics of synthesis show that fibroin mRNA accumulates slowly over a period of hours and is relatively stable compared to heterogeneous nuclear RNA or the precursor to ribosomal RNA. The mRNA comprises 0.8 to 1.4% of the total RNA in the posterior gland at the end of the larval life of the animal. From the known structure of a repetitious polypeptide comprising about 60 to 75% of the fibroin molecule, the nucleotide sequence of the mRNA was predicted to be GGX-GCY-GGX-GCY-GGX-[UCZ (or AG(C/U))-GGX-(GCY-GGX)₂]₈-UCZ (or AG(C/U))-GGX-GCY-GCY-GGX-UA(C/U), where X, Y and Z can be any ribonucleotide and (C/U) denotes either C or U. Digestion of this sequence with RNase T1 or pancreatic RNase should produce a simple pattern of oligonucleotides. Assuming that the serine codon is UCZ and that neither X, Y nor Z is a G residue, an RNase T1 digest of the predicted sequence should yield 16, 22, 34, 0, 28 and 0% of mono- (Gp), di- (XpGp), tri- (CpYpGp), tetra- (none), penta- (25% of XpUpCpZpGp and 3% of XpUpAp(C/U)pGp) and hexanucleotides (none), respectively. The actual percentages of these oligonucleotides found were 19, 19, 28, 4, 17 and 5, respectively. Subfractionation of the di-, tri- and pentanucleotides demonstrated that a single component comprises 50 to 60% of each class: UpGp, CpUpGp and UpUpCpApGp, respectively.
These results, together with the analysis of minor components of the RNase T1 digest and of the oligonucleotides produced by pancreatic RNase, have unequivocally identified this RNA as the fibroin mRNA. In addition, the sequence analyses permitted assignment of the major codons for the three principal amino acids of fibroin: GGU and GGA for glycine, GCU for alanine, and UCA for serine. From the base composition and the abundance of tetranucleotides obtained by RNase T1 digestion, the purity of the mRNA was estimated to be greater than 80%. Furthermore, the close agreement between the predicted and actual sequence of the mRNA suggests that few if any of its sequences are involved in functions other than coding for the amino acids of fibroin.
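The predicted RNase T1 percentages quoted above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the original work. It builds the predicted 59-codon sequence with the placeholders X, Y and Z, cleaves it after every G, and weights each oligonucleotide class by its nucleotide content. The function names (`predicted_unit`, `t1_fragments`, `class_percentages`) are hypothetical. The sketch assumes that the serine codon is UCZ, that X, Y and Z are not G, that the tyrosine codon can be written as UAC (choosing C or U does not change fragment lengths), and that the unit is followed by a G, which is modelled here by treating the sequence as circular.

```python
from collections import Counter

# Codons with placeholder bases; X, Y and Z are assumed not to be G.
GLY, ALA, SER, TYR = "GGX", "GCY", "UCZ", "UAC"


def predicted_unit():
    """Predicted sequence: GGX-GCY-GGX-GCY-GGX-[UCZ-GGX-(GCY-GGX)2]8-UCZ-GGX-GCY-GCY-GGX-UAC."""
    repeat = SER + GLY + (ALA + GLY) * 2
    return (GLY + ALA) * 2 + GLY + repeat * 8 + SER + GLY + ALA + ALA + GLY + TYR


def t1_fragments(seq):
    """Cleave on the 3' side of every G, as RNase T1 does."""
    fragments, current = [], ""
    for base in seq:
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:
        # Assumption: the unit is followed by G, modelled as a circular
        # sequence, so the 3'-terminal XUAC joins the leading G.
        fragments[0] = current + fragments[0]
    return fragments


def class_percentages(fragments):
    """Percentage of all nucleotides found in fragments of length 1 to 6."""
    total = sum(len(f) for f in fragments)
    by_length = Counter()
    for f in fragments:
        by_length[len(f)] += len(f)
    return {n: 100 * by_length[n] / total for n in range(1, 7)}


if __name__ == "__main__":
    pct = class_percentages(t1_fragments(predicted_unit()))
    for n in range(1, 7):
        print(f"{n}-mer: {pct[n]:.1f}%")
```

Under these assumptions the 177-nucleotide unit gives about 16.4, 21.5, 33.9, 0, 28.2 and 0% for the mono- through hexanucleotide classes, in line with the predicted values of 16, 22, 34, 0, 28 and 0% to within rounding.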