Abstract

Purpose

This paper describes an interdisciplinary and innovative research project conducted in Switzerland at the Geneva School of Business Administration HES-SO and supported by the State Archives of Neuchâtel (Office des archives de l'État de Neuchâtel, OAEN). The problem addressed is a classic one: how to extract and discriminate relevant data from a huge volume of diverse and complex record formats and contents. The goal of this study is to provide a framework and a proof of concept for software that helps archivists make defensible decisions on the retention and disposal of records and data proposed to the OAEN. For this purpose, the authors designed two axes: the archival axis, which proposes archival metrics for the appraisal of structured and unstructured data, and the data mining axis, which proposes algorithmic methods as complementary and/or additional metrics for the appraisal process.

Design/methodology/approach

Based on two axes, this exploratory study designs and tests the feasibility of archival metrics that are paired with data mining metrics, to advance the digital appraisal process, as far as possible, in a systematic or even automatic way. Under Axis 1, the authors carried out three steps. First, they designed a conceptual framework for the appraisal of records and data with a detailed three-dimensional approach (trustworthiness, exploitability, representativeness). In addition, they defined the main principles and postulates to guide the operationalization of the conceptual dimensions. Second, the operationalization produced metrics expressed as variables, supported by a quantitative method for their measurement and scoring. Third, the authors shared this conceptual framework, with its dimensions and operationalized variables (metrics), with experienced professionals to validate them. The experts' feedback gave the authors an indication of the relevance and the feasibility of these metrics. These two aspects may demonstrate the acceptability of such a method in real-life archival practice. In parallel, Axis 2 proposes functionalities that cover not only macro analysis of data but also the algorithmic methods needed to compute digital archival and data mining metrics. On that basis, three use cases were proposed as plausible and illustrative scenarios for the application of such a solution.

Findings

The main results demonstrate the feasibility of measuring the value of data and records with a reproducible method. More specifically, for Axis 1, the authors applied the metrics in a flexible and modular way. The authors also defined the main principles needed to enable a computational scoring method. The results of the expert consultation on the relevance of 42 metrics indicate an acceptance rate above 80%. In addition, the results show that 60% of all metrics can be automated. Regarding Axis 2, 33 functionalities were developed and proposed under six main types: macro analysis, micro analysis, statistics, retrieval, administration and, finally, decision modeling and machine learning. The relevance of the metrics and functionalities is based on the theoretical validity and computational character of their method. These results are largely satisfactory and promising.

Originality/value

This study offers a valuable aid to improve the validity and performance of archival appraisal processes and decision-making. The transferability and applicability of these archival and data mining metrics could be considered for other types of data. An adaptation of this method and its metrics could be tested on research data, medical data or banking data.
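As an illustration only, the idea of scoring metrics grouped under the three dimensions (trustworthiness, exploitability, representativeness) can be sketched in Python. The abstract does not specify the authors' scoring method, so everything below is an assumption: the metric names, the normalization of each score to the range 0 to 1, the optional weights and the use of a weighted mean are all hypothetical choices made for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three dimensions named in the abstract.
DIMENSIONS = ("trustworthiness", "exploitability", "representativeness")


@dataclass(frozen=True)
class MetricScore:
    """One measured metric. Names, scale and weights are hypothetical."""
    name: str
    dimension: str
    score: float          # assumed to be normalized to 0..1
    weight: float = 1.0   # assumed relative importance within its dimension


def dimension_scores(scores):
    """Weighted mean per dimension; dimensions with no metrics are omitted."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for m in scores:
        if m.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {m.dimension}")
        if not 0.0 <= m.score <= 1.0:
            raise ValueError(f"score out of range for {m.name}")
        if m.weight <= 0:
            raise ValueError(f"weight must be positive for {m.name}")
        totals[m.dimension] += m.score * m.weight
        weights[m.dimension] += m.weight
    return {d: totals[d] / weights[d] for d in DIMENSIONS if d in weights}


def overall_score(scores):
    """Unweighted mean of the dimension scores, or None if nothing was measured."""
    per_dim = dimension_scores(scores)
    if not per_dim:
        return None
    return sum(per_dim.values()) / len(per_dim)


if __name__ == "__main__":
    # Hypothetical metrics for one record set.
    sample = [
        MetricScore("has_checksum", "trustworthiness", 1.0),
        MetricScore("known_creator", "trustworthiness", 0.5),
        MetricScore("open_format", "exploitability", 0.25, weight=2.0),
    ]
    print(dimension_scores(sample))
    print(overall_score(sample))
```

Because dimensions without measured metrics are simply skipped, a subset of metrics can be applied without changing the code, which is one possible reading of the "flexible and modular" application mentioned in the findings.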

Highlights

  • This study presents the results of a one-year practical and theoretical research project guided by the needs of the Neuchâtel State Archives in Switzerland (OAEN) and led by a team of researchers in the Information Sciences Department at the Geneva School of Business Administration (HES-SO Geneva)

  • The overarching objective of this research was to develop a proof of concept for an archival appraisal tool that will assist in decision-making regarding the archiving or disposal of structured and/or unstructured corporate data sets, based on a valid and defensible argument and method

  • The program has hosted a workshop for UK practitioners on the topic of appraisal “where we explored how well traditional appraisal theory and practice can be applied to digital records and how we document these processes” (Penn, 2019)

Summary

Introduction

The challenge proposed by the Neuchâtel Archives was to handle an extreme case of data appraisal in which the archivist would have to manage, for example, several digital media (hard disks or optical media) without any indication of their nature or context. To meet this challenge, the idea explored was to make maximum use of data mining and artificial intelligence approaches to facilitate and prepare the archivist's appraisal phase. The challenge was to take into account all possible and available structured information in a variety of scenarios, ranging from creating organizations with low records management maturity through to organizations with well-developed records management maturity. On top of this approach, a model of archival metrics, as proposed by research on appraisal criteria and their metrics (Makhlouf Shabou, 2011a, 2011b, 2015a, 2015b), was studied to consider data mining in a business framework.
