The problem of automatically placing punctuation marks in a text that has already been divided into sentences is addressed. The positions of commas, dashes, colons, exclamation marks, and question marks are determined. Two approaches are considered. In the first, the task is reduced to n-gram classification: the class of an n-gram is determined by the type of punctuation mark that follows its k-th token (n = 5, k = 3). A multilayer perceptron serves as the classifier; its input is a vector representation of the n-gram built with the word2vec model. In the second, a neural network with a transformer architecture receives an input sequence of tokens (IS) stripped of punctuation marks and is trained to generate a target sequence of tokens (TS) from which the punctuation marks of the original sentence can be restored. The TS is derived from the IS by replacing the tokens associated with punctuation marks with corresponding markers; IS tokens not associated with punctuation marks are carried over to the TS unchanged. To reduce the token vocabulary, word forms are replaced by lemmas, and text elements containing characters other than letters of the Russian alphabet are replaced by special tokens. For the same purpose, first names, patronymics, surnames, toponyms, and numerals are also replaced by special tokens. A word-form classifier is proposed as the tool for determining parts of speech and named entities. Two types of IS and TS are considered. The IS of the second type is obtained from the IS of the first type by adding part-of-speech designations to the lemmas. The two TS types differ in how tokens associated with commas are replaced by markers: in the first type, a token and the comma that follows it are replaced by a marker; in the second type, a token and the comma that precedes it are replaced by a marker.
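As an illustration of the first approach, the following minimal sketch shows one way the labeled n-grams could be built from a tokenized sentence. It is not the authors' implementation: the function name `label_ngrams`, the padding token `<pad>`, the "no mark" label `O`, and the padding of sentence edges are assumptions made for the example, and the word2vec vectorization and the perceptron are omitted.

```python
from typing import List, Tuple

# Marks named in the text; the ASCII hyphen stands in for the dash here.
PUNCT = {",", "-", ":", "!", "?"}
NO_MARK = "O"    # hypothetical label for "no punctuation mark follows"
PAD = "<pad>"    # hypothetical padding token for sentence edges


def label_ngrams(
    tokens: List[str], n: int = 5, k: int = 3
) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (n-gram, class) pairs; the class is the mark after the k-th token."""
    words: List[str] = []
    marks: List[str] = []
    for tok in tokens:
        if tok in PUNCT:
            if words:
                # Assumption: a mark is attached to the word it follows.
                marks[-1] = tok
        else:
            words.append(tok)
            marks.append(NO_MARK)

    # Pad so that every word appears exactly once as the k-th token of an n-gram.
    padded = [PAD] * (k - 1) + words + [PAD] * (n - k)
    samples = []
    for i in range(len(words)):
        ngram = tuple(padded[i : i + n])
        samples.append((ngram, marks[i]))
    return samples


if __name__ == "__main__":
    for ngram, label in label_ngrams(["well", ",", "you", "came", "!"]):
        print(ngram, label)
```

With n = 5 and k = 3, each word is the central token of exactly one n-gram, so a sentence of m words yields m training samples.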
Model performance is evaluated using precision and F1 score, which are computed for each class and then averaged. The averaged F1 is 0.77 for the n-gram classifier and 0.86 for the transformer.
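The per-class averaging described above can be sketched as follows. This is a minimal example that assumes an unweighted (macro) average over all classes that occur in the reference or predicted labels; the text does not state whether the average is weighted or whether a "no mark" class is included, and the function name `macro_precision_f1` is hypothetical.

```python
from typing import List, Tuple


def macro_precision_f1(y_true: List[str], y_pred: List[str]) -> Tuple[float, float]:
    """Compute precision and F1 per class, then take the unweighted mean."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions: List[float] = []
    f1_scores: List[float] = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # Assumption: undefined ratios (zero denominator) are counted as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        f1_scores.append(f1)
    return sum(precisions) / len(classes), sum(f1_scores) / len(classes)


if __name__ == "__main__":
    reference = [",", ",", "O", "O"]
    predicted = [",", "O", "O", "O"]
    print(macro_precision_f1(reference, predicted))
```

Macro averaging gives rare marks such as colons the same weight as frequent ones such as commas, so a model that ignores rare classes is penalized more than it would be under a frequency-weighted average.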