Unlocking web archives through metadata, seed lists and derived data
- Publication Year :
- 2022
Abstract
- This presentation addresses the use, re-use, access and dissemination of data related to web archives. Web archives (Brügger, 2018) have for several years occupied a hybrid position regarding access, depending on the institutions preserving them. While the Internet Archive has made its collections available online since 2001 through the Wayback Machine (though with limited features for scholars wishing to conduct distant reading based on data, WARC files, etc.), most national libraries allowed only onsite access because of authors' rights restrictions (and, in some cases, the legal deposit framework), while starting to provide useful metadata for research projects wishing to explore them. However, the situation is now evolving through several research projects that provide access to a vast amount of (international) metadata and datasets. Taking two research projects in progress as case studies, WARCnet and AWAC2, this paper presents the move towards the use of metadata and derived data related to very large collections of web archives of the COVID crisis. WARCnet (Web ARChive studies network researching web domains and events) is a network whose activities, funded by the Independent Research Fund Denmark | Humanities (grant no 9055-00005B), run from 2020 to 2023. The networking activities are guided by overarching research questions, one of them being “How transnational events developed on the European web?” (notably the COVID crisis, which is explored in WG2 (https://cc.au.dk/en/warcnet/working-groups)). AWAC2 (Analysing Web Archives of the COVID Crisis through the IIPC Novel Coronavirus dataset) is a project within the Archives Unleashed Cohort Program, which supports and facilitates research engagement with web archives.
It aims to explore a unique collection of web material (https://archive-it.org/collections/13529) related to the pandemic, with contributions from over 30 members of IIPC (International Internet Preservation Consortium) as well as
Details
- Database :
- OAIster
- Notes :
- English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1333446868
- Document Type :
- Electronic Resource