The primary task of museums is to preserve museum objects as physical objects. Although it may seem simple and easy to understand, damage to man-made objects (artefacts) is a complex field. Damage processes are grouped as physical, chemical, mechanical, and biological. In most cases, different processes work together, damaging the materials and structure of the artefacts. Damage processes are affected by a number of factors, the most important of which are the composition and structure of materials, environmental conditions, and human impacts. It is very difficult, and in most cases impossible, to take all these factors into account. At the same time, modelling the ageing of museum objects is especially important for their successful preservation. Modelling damage processes makes it possible to assess the extent of such processes (which objects have been damaged and to what degree), the speed of damage processes and thereby the change in the number of damaged objects over time, and finally the effectiveness of possible management measures.
In this article, we discuss the machine learning model Sälli, which predicts the durability of museum objects. For this purpose, the model uses data from MuIS (the Estonian Museum Information System). The condition of objects is assessed in MuIS with four values: ‘good’, ‘satisfactory’, ‘poor’, and ‘very poor’. Almost 3.7 million condition assessments have been entered into MuIS. Developing a condition prediction model from these data requires, at a minimum, pairs of consecutive condition assessments, so that one can attempt to determine what correlates with a change in condition: a particular event, a property of the museum object (nature, material, age, techniques), or some combination of such factors. There are more than 1.4 million such pairs among the museum objects with several condition assessments. Almost 32,000 of them, or a little over 2%, consist of two different condition assessments, i.e., they indicate a change in condition. According to the data entered in MuIS, almost 30,000 museum objects, i.e., less than one percent of all museum objects, have undergone a change in condition.
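To illustrate what a pair of consecutive condition assessments looks like in practice, here is a minimal sketch in Python. The record layout (object identifier, ISO date, condition) and the function names are assumptions made for this example and do not reflect the actual MuIS schema; only the four condition values come from the text.

```python
from itertools import groupby
from operator import itemgetter

# The four MuIS condition values, ordered from best to worst.
CONDITION_RANK = {"good": 0, "satisfactory": 1, "poor": 2, "very poor": 3}


def consecutive_pairs(records):
    """Yield (object_id, earlier, later) for consecutive assessments of an object.

    `records` is an iterable of hypothetical (object_id, iso_date, condition)
    tuples; ISO dates sort correctly as plain strings.
    """
    ordered = sorted(records, key=itemgetter(0, 1))
    for object_id, group in groupby(ordered, key=itemgetter(0)):
        history = list(group)
        # An object with a single assessment yields no pairs.
        for earlier, later in zip(history, history[1:]):
            yield object_id, earlier, later


def is_change(earlier, later):
    """True if the two assessments record different conditions."""
    return earlier[2] != later[2]


def is_deterioration(earlier, later):
    """True if the later condition is worse than the earlier one."""
    return CONDITION_RANK[later[2]] > CONDITION_RANK[earlier[2]]


# Made-up example records, not real MuIS data.
records = [
    ("A", "2005-03-01", "good"),
    ("A", "2012-06-15", "good"),
    ("A", "2019-09-30", "satisfactory"),
    ("B", "2010-01-10", "poor"),
    ("C", "2008-05-05", "satisfactory"),
    ("C", "2015-05-05", "good"),
]

pairs = list(consecutive_pairs(records))
changed = [p for p in pairs if is_change(p[1], p[2])]
deteriorated = [p for p in pairs if is_deterioration(p[1], p[2])]
```

In this toy data set, object B contributes no pair because it has only one assessment, and object C shows that a change in condition is not necessarily a deterioration.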
As data points, we used at least two condition assessments for each museum object, to which we added the characteristics of the respective museum object and other features that help to predict the deterioration of its condition. These data included static data related to the museum object: museum, museum collection, nature, material, material group, technology, exhibitability, and dating. As additional information, we used the history of the museum object, i.e., a summary of the events related to it, taking into account only the events that had taken place by the time of the condition assessment (because we do not have information on the future). The model finds the probability that the condition of the museum object will deteriorate in the next n years. If the probability of deterioration is greater than or equal to a set threshold, the model responds with ‘deterioration’. To find the optimal decision threshold, we used a 10-year forecast period, i.e., we trained the classifiers to predict deterioration over the next 10 years.
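The threshold rule described above can be sketched as follows. This is only an illustration: the label ‘no deterioration’, the function names, and the use of recall and precision to compare thresholds are assumptions for this example, since the text does not state which criterion was used to select the optimal threshold.

```python
def decide(probability, threshold):
    """Respond 'deterioration' when the predicted probability reaches the threshold."""
    # 'no deterioration' is a placeholder label assumed for this sketch.
    return "deterioration" if probability >= threshold else "no deterioration"


def recall_and_precision(probabilities, deteriorated, threshold):
    """Return (recall, precision) of the threshold rule on labelled examples.

    recall    = share of truly deteriorating objects that were flagged
    precision = share of flagged objects that truly deteriorated
    """
    flagged = [p >= threshold for p in probabilities]
    true_positives = sum(f and d for f, d in zip(flagged, deteriorated))
    n_flagged = sum(flagged)
    n_deteriorated = sum(deteriorated)
    recall = true_positives / n_deteriorated if n_deteriorated else 0.0
    precision = true_positives / n_flagged if n_flagged else 0.0
    return recall, precision


# Made-up probabilities and outcomes for four objects.
probabilities = [0.9, 0.7, 0.4, 0.2]
deteriorated = [True, False, True, False]

low = recall_and_precision(probabilities, deteriorated, threshold=0.4)
high = recall_and_precision(probabilities, deteriorated, threshold=0.8)
```

With these made-up numbers, the lower threshold flags every deteriorating object at the cost of a false alarm, while the higher threshold raises no false alarms but misses one deteriorating object. Choosing a threshold means settling on a point along this trade-off.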
The best results were obtained using the decision forest algorithm, which was able to identify 92% of deteriorating museum objects with 50% accuracy. This model was also used to create the Sälli prototype. The task of the Kratt Sälli prototype is to draw the attention of museum staff to museum objects whose condition may deteriorate in the next 10 years and which should therefore be reviewed. To test the usefulness and usability of the model's predictions, we created a simple web application that was tested in pilot museums. For testing, each test museum was given a prototype containing the 1,000 highest-risk museum objects from that museum. We found that the available data have the potential to predict deterioration, but the data still need to be improved, and the model trained on them is not yet mature enough.
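Selecting the highest-risk museum objects for each museum could look roughly like the sketch below. It assumes that the model has already produced a deterioration probability for every object. The tuple layout and the function name are hypothetical and are not taken from the Sälli prototype.

```python
import heapq
from collections import defaultdict


def highest_risk_per_museum(predictions, n=1000):
    """Return, for each museum, the ids of its n objects with the highest risk.

    `predictions` is an iterable of hypothetical
    (museum, object_id, probability) tuples.
    """
    by_museum = defaultdict(list)
    for museum, object_id, probability in predictions:
        by_museum[museum].append((probability, object_id))
    return {
        # nlargest returns the entries sorted from highest to lowest risk.
        museum: [object_id for _, object_id in heapq.nlargest(n, scored)]
        for museum, scored in by_museum.items()
    }


# Made-up predictions for two museums.
predictions = [
    ("M1", "a", 0.2),
    ("M1", "b", 0.9),
    ("M1", "c", 0.5),
    ("M2", "d", 0.7),
]

top = highest_risk_per_museum(predictions, n=2)
```

A museum with fewer than n objects simply gets all of its objects, ordered by risk.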