The project uses research conducted on the UK Web Archive’s Archive of Tomorrow (AoT) dataset as a gateway to interacting with web archives. While web archives might seem intimidating at first glance, they hold a wealth of knowledge for a variety of users, from professional researchers to everyday library visitors who wish to better understand our recent past. By designing new interfaces and resources, the project aims to make web archives more accessible to both researchers and broader audiences.
The Archive of Tomorrow – Talking about Health project ran from 2022 to 2023, collecting online health information. During this time, I collaborated with the web archivists at the University of Cambridge on a pilot machine learning study to understand the collection’s potential. Building on these preliminary results, I will use AoT data to make web archives easier to understand, explore their usability, and help users connect with the Library and with each other by discovering our shared discussion of health.
1. An interactive web app and display screen for exploring and playing with the dataset
Aimed at a broad audience, the interactive web platform will serve as a gamified interface to the collection, where users can explore and experiment with the data and the results.
2. Jupyter notebooks for the Data Foundry’s notebook collection
For those who would like to engage more deeply with the dataset through distant reading, the project offers Jupyter notebooks showing how to rehydrate the articles from metadata and how to apply basic natural language processing (NLP) to them.
3. Entry-level technical workshops
To bridge the potential digital literacy gap between users and dataset creators, the project offers beginner-level workshops on distant reading, with no coding required, using the collection as an example.
Would you like to know more about why web archives are exciting? Read about it here or watch the video below.