An apparatus and method are provided for inserting punctuation marks at appropriate positions in a sentence. An acoustic processor processes input utterances to extract voice data and transforms the data into a feature vector. When automatic punctuation insertion is not performed, a language decoder processes the feature vector using only a general-purpose language model and inserts a comma at a location that the voice data marks explicitly, for example by the spoken entry “ten,” which clearly indicates a location at which a comma should be inserted. When automatic punctuation insertion is performed, the language decoder employs both the general-purpose language model and a punctuation mark language model to identify an unvoiced pause location at which a punctuation mark, such as a comma, should be inserted.
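The two decoding modes described above can be illustrated with a minimal sketch. It is not the claimed implementation: the token representation, the `PAUSE` marker, the `PUNCT_LM` score table, the `threshold` parameter, and the `decode` function are all hypothetical names introduced here for illustration, and the punctuation mark language model is reduced to a toy lookup of how likely a comma is after a given word. The sketch also assumes that the spoken entry "ten" still yields a comma when automatic insertion is enabled.

```python
PAUSE = "<pause>"        # hypothetical marker for an unvoiced pause in the decoded stream
SPOKEN_COMMA = "ten"     # spoken entry that explicitly marks a comma

# Hypothetical stand-in for the punctuation mark language model:
# a toy score for "a comma follows this word".
PUNCT_LM = {"however": 0.9, "the": 0.05}


def decode(tokens, auto_punctuation, threshold=0.5):
    """Return the output word sequence with commas inserted.

    tokens: recognized words, with PAUSE entries at unvoiced pauses.
    auto_punctuation: if False, only the spoken entry produces a comma;
        if True, pauses are also scored with the toy punctuation model.
    """
    out = []
    for tok in tokens:
        if tok == SPOKEN_COMMA:
            # Explicit spoken cue: handled in both modes (assumption).
            out.append(",")
        elif tok == PAUSE:
            # A pause becomes a comma only in automatic mode, and only
            # when the toy model scores the preceding word highly enough.
            if auto_punctuation and out and PUNCT_LM.get(out[-1], 0.0) >= threshold:
                out.append(",")
        else:
            out.append(tok)
    return out


tokens = ["however", PAUSE, "the", PAUSE, "result", "ten", "holds"]
manual = decode(tokens, auto_punctuation=False)
automatic = decode(tokens, auto_punctuation=True)
```

In this sketch a pause alone never forces a comma; the pause only creates a candidate location, and the punctuation model decides whether a mark is placed there.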