This study proposes a framework that summarizes the processing of existing and incoming sources on the life of the overseas Chinese, their historical heritage, and related documents. It addresses a significant problem: individual collections and sets of documents are fragmented and relatively inaccessible to interested researchers, who may not even know that the documents they are looking for exist. To solve this problem, the study proposes end-to-end processing of cached data, that is, the descriptions that researchers have already created for digitized or material sources and artifacts. Cached data records how a service, program, or dataset has been used, in order to structure and facilitate future individual use of that data. In particular, it includes the structure of catalog or directory topics, keywords, and database indexes. With the help of AI, human effort, and existing algorithmic approaches to processing various types of digital data (for example, images of artifacts or documents), these source descriptions are gradually generalized and unified through the practice of database queries, and access to them is simplified through tags and cloud data storage. The approach has great practical value because it requires no special agreements, data formats, or complex digital tools that would be difficult to implement in international research practice.