‘Making such bargain’: Transcribe Bentham and the quality and cost-effectiveness of crowdsourced transcription1

AbstractIn recent years, important research on crowdsourcing in the cultural heritage sector has been published, dealing with topics such as the quantity of contributions made by volunteers, the motivations of those who participate in such projects, the design and establishment of crowdsourcing initiatives, and their public engagement value. This article addresses a gap in the literature, and seeks to answer two key questions in relation to crowdsourced transcription: (1) whether volunteers’ contributions are of a high enough standard for creating a publicly accessible database, and for use in scholarly research; and (2) if crowdsourced transcription makes economic sense, and if the investment in launching and running such a project can ever pay off. In doing so, this article takes the award-winning crowdsourced transcription initiative, Transcribe Bentham, which began in 2010, as its case study. It examines a large data set, namely, 4,364 checked and approved transcripts submitted by volunteers between 1 October 2012 and 27 June 2014. These data include metrics such as the time taken to check and approve each transcript, and the number of alterations made to the transcript by Transcribe Bentham staff. These data are then used to evaluate the long-term cost-effectiveness of the initiative, and its potential impact upon the ongoing production of The Collected Works of Jeremy Bentham at UCL. Finally, the article proposes more general points about successfully planning humanities crowdsourcing projects, and provides a framework in which both the quality of their outputs and the efficiencies of their cost structures can be evaluated.