Methods for enhancing the external validity of role-plays were evaluated. The heterosocial performance of 81 male undergraduates in an unobtrusive criterion situation was compared with their performance on three content-identical role-plays. The typical role-play requested testees to simulate their behavior in a situation described by the tester. The replication and specification role-plays requested testees to replicate their behavior in the specified criterion situation. On the specification role-play, testees were also told which specific dependent measures were important to simulate accurately. Results indicated that the specification role-play yielded stronger test-criterion relationships than the other two tests, with the replication role-play tending to be superior to the typical procedure. Only the typical role-play elicited significantly higher levels of performance than the criterion situation, supporting assertions that such tests assess response capabilities as opposed to naturalistic performances.