Synthetic 32P-labeled oligonucleotides have been used to identify the mRNA for the prostatic proline-rich polypeptide (PRP), and this mRNA has been partially characterized. The 14-mer d(G-G-T-T-C-T-G-C-A-T-A-A-T-G), complementary to the coding sequence for His-Tyr-Ala-Glu-Pro, a sequence element occurring in all 38-residue PRP variants, hybridizes specifically with a 12.5-kilobase mRNA that is clearly androgen-controlled. This oligonucleotide was used as an efficient primer for the construction of a PRP-specific lambda gt10 cDNA library. The nucleotide sequences of the inserts from several recombinant clones have been determined. This structural analysis revealed that the PRP mRNA encodes a large precursor containing a number of tandemly repeated units. Each repeat codes for a sequence of 100 amino acids in which the highly conserved PRP sequence is embedded. The large number of PRP variants must be generated from this polyprotein by a post-translational processing mechanism that is still unknown. The high degree of conservation of both nucleotide and amino acid sequence throughout the entire unit also indicates that the PRP gene(s) likely evolved by multiplication of a 300-base pair ancestral DNA sequence. This has resulted in an uninterrupted repetitive DNA coding segment, which is detected at the genomic level.
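
The relationship between the 14-mer probe and the His-Tyr-Ala-Glu-Pro element can be checked with a short sketch. This is an illustration only, not part of the original analysis: the function and variable names are hypothetical, and the codon table holds just the standard genetic code entries needed here. It takes the reverse complement of the probe to recover the coding-strand sequence, reads it in frame, and confirms that a 100-residue repeat corresponds to 300 base pairs.

```python
# Illustrative check; names are hypothetical, not from the source.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Subset of the standard genetic code (DNA codons) needed for this check.
CODON_TABLE = {
    "CAT": "His", "CAC": "His",
    "TAT": "Tyr", "TAC": "Tyr",
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
    "GAA": "Glu", "GAG": "Glu",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# The 14-mer from the text, written 5'->3' without the hyphens.
probe = "GGTTCTGCATAATG"

# The probe pairs with the mRNA, so its reverse complement is the coding sequence.
coding = reverse_complement(probe)

# Four complete codons (12 nt) plus a 2-nt partial codon at the 3' end.
full_codons = [coding[i:i + 3] for i in range(0, 12, 3)]
peptide = [CODON_TABLE[codon] for codon in full_codons]

# Every CCN codon encodes proline, so the trailing "CC" fixes the fifth residue
# regardless of the third base.
if coding[12:14] == "CC":
    peptide.append("Pro")

# A 100-residue repeat needs 100 codons of 3 nt each.
repeat_length_bp = 100 * 3

print(coding)
print("-".join(peptide))
print(repeat_length_bp)
```

Ending the probe on the first two bases of the proline codon avoids the degenerate third position, which is one plausible reason for choosing a 14-mer rather than a 15-mer; the source does not state the rationale.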