The article reflects current trends in working with the digital heritage of Russian literature and examines the formation of virtual archives as a gradual accumulation of the "big data" of scientific research, i.e., an unrecognized array of raster documents containing tens of thousands of images. The research analyzes the specifics of scientific work in the field of ego-documentary heritage that arose at the turn of the 20th and 21st centuries (a corpus of diary entries, workbooks, notebooks, and correspondence), as well as the principles of publication and modern standards for the digitization of archival heritage. Studying and working with the three most promising virtual resources on the history of Russian literature from the mid-19th to the first half of the 20th century makes it possible to formulate specific tasks and methods for visualizing a large corpus of raster images of archival documents, and to identify previously untapped possibilities for search automation. Particular attention is paid to the transition from the graphical elements of a manuscript's raster image to semantic ones, which makes it possible to apply elements of data mining to an unrecognized data array.