Growing interest has been directed toward an emerging orientation to collections referred to as “collections as data” (ex. 1, ex. 2, ex. 3). Generally speaking, work in this space explores what might be possible if cultural heritage organizations began to think about, prepare, describe, and provide access to digital collections in ways that make them more amenable to computational use. During this course, librarians, archivists, museum professionals and allied collection stewards will explore how to approach providing collections as data. The course will consider lessons learnt from earlier work on open cultural data, image releases and APIs in the digital cultural heritage sector, and discuss the implications of thinking of collections – metadata, digitized texts, images and objects – as data. User communities served by, or providing models for, this work include but are not limited to Digital Humanities, Data Science, Data Driven Journalism, and Digital Social Science.

Concretely, participants will (1) be exposed to case studies of established and emergent collections as data work from a variety of institutional contexts (libraries, archives, museums); (2) consider differences and commonalities across a range of user communities (Humanities, Social Science, STEM, and more); (3) build on lessons learnt from existing real-world implementations of collections as data; (4) consider how collections have been used by digital scholars and creatives; and (5) discuss the principles and ethical dimensions of collections as data work.

There’s no required preparation or reading for this course, but if you’d like to find out a little more about earlier work in collections as data, there’s a wealth of resources at I’ll be talking at times about a new-ish project called Living with Machines – there’s some basic background at for a taster of how we’re collaborating to bring data science into the library.


0110 University Library