The web and online information have become critically important. However, the short lifespan of online data (40% of content is removed after one year) poses serious challenges for preserving and safeguarding digital heritage and information. Hence, web and media historians, sociologists, and digital scholars must learn to "dig" in online sources such as the Internet Archive or national web archives to find relevant research material. In this paper, we use survey data (n=154) to explore the requirements of researchers working with web archives and outline how they perceive the limitations and possibilities of using the archived web as a data resource. We asked researchers with and without experience in working with web archives about, among other things, the search functionalities and the selection and access criteria they require. Because archived web content is relatively new research material, working with it requires new skills, which are not self-evident and which not every researcher is willing to acquire. Yakel & Torres (2003) point to three distinct forms of knowledge required to work effectively with these sources: (i) domain (subject) knowledge, (ii) artifactual literacy, and (iii) their own concept of archival intelligence. In addition to presenting significant findings that demonstrate the relationship between researchers' domain (subject) knowledge, archival intelligence, and frequency of web archive use, this study discusses the limitations of using the archived web as a data resource and concludes with actions to overcome these hurdles and fulfill the desiderata of scholars.