Responses in personal interviews about education and career with 415 Swedish men and women (age 34) form the basis of a speech corpus of 1.8 million words. The vocabulary is described by means of two sets of variables. One is based on the number of tokens and types, word length, and sectioning of the running text. The other set divides the corpus into grammatical categories. Both sets of variables are related to a number of background variables such as gender, socioeconomic background, education, and indicators of verbal proficiency at ages 13 and 32. The possibility of studying the relationship between vocabulary and a broad set of respondent characteristics is a unique feature of this corpus.