Text & Data Mining: Support for your TDM

Where we can help you...

In the tabs below we offer brief overviews of the current state of play, as we experience it, in the use of publisher resources for TDM, and of the ways in which we may be able to assist you. Our knowledge is limited, but we will seek advice from colleagues in the University or from external bodies (such as Jisc or the IPO) that can offer expert advice. For information on research data management, help with related issues, and contacts, please visit the University's Research Data Management website.

Use of resources

UK law allows you to mine data and text, as well as other content types including sound, film/video, artistic works, tables and databases. Your access to the content you aim to mine must be lawful: either you have arranged personal access, or you have subscription access provided to you as a full member of the University. The resources provided by the University Library to which you have full subscription access are discoverable via iDiscover or via the eresources & ejournals website.

Publishers of content are entitled under law to "apply reasonable measures to maintain their network security or stability". When content is mined, some publishers do shut down subscription access because their platforms flag the TDM activity as warranting investigation. The practitioner is under no obligation to forewarn publishers of TDM activity on their content, and the law expressly states that the technical measures "should not prevent or unreasonably restrict researcher's ability to text and data mine".

The law is also clear that the practitioner should not attempt to circumvent the technical measures put in place by publishers to safeguard content. The recommendation from Jisc is that: "Researchers who need to access and process content which is restricted by technical protection measures should negotiate initially with the publisher to authorise access. Circumventing technical protection measures should be avoided".

In the event that access is lost because of lawful TDM activity, the ejournals@cambridge team, in liaison with Digital Services at the University Library and the University Information Services (UIS), restores access as quickly as possible. If you are planning to text or data mine any publisher's content and want certainty that access will be maintained, please contact the ejournals@cambridge team, which will liaise with the publisher(s) to ensure access is not shut down. If you are interested in this service please complete the form available at this link.

Unless otherwise indicated, licences for the following journal packages or online resources expressly permit TDM under the Jisc model licence. Where the licence does not expressly permit TDM, the journal packages licensed via Jisc include the general clause under which TDM should be permitted ("This Licence shall be deemed to complement and extend the rights of the Institution and Authorised Users under the Copyright, Designs and Patents Act 1988 and the Copyright (Visually Impaired Persons) Act 2002 and nothing in this Licence shall constitute a waiver of any statutory rights held by the Institution and Authorised Users from time to time under these Acts or any amending legislation"). Such packages are marked with an asterisk * below. Licences that are not via Jisc and that do not explicitly permit TDM are noted below; we are making enquiries with these publishers about the arrangements they have in place for libraries to support researchers who aim to copy their content for TDM. This page will be updated as these positions are clarified.

ACM Digital Library *

American Association for Cancer Research

not via Jisc and not explicitly permitting TDM

American Institute of Physics

not via Jisc and not explicitly permitting TDM

American Chemical Society *

Annual reviews *

ASCE Journals Collection *

ASME

not via Jisc and not explicitly permitting TDM

Bio-One Complete

Brill

Business Source Complete

Ebsco's current position is: "our licences with publishers do not allow for text and data mining of EBSCO databases"

Cambridge University Press

Cell Press

Duke

Ebsco

Ebsco's current position is: "our licences with publishers do not allow for text and data mining of EBSCO databases". We are currently in dialogue with Ebsco on this question. Ebsco has clarified that: "unlike publishers who own their content, we do not have rights to grant TDM. Using the APA as point in reference [for example], we licence their content and add their databases to EBSCOhost. If you were granted authority to text and data mine their content, this would be provided by the APA and most likely will not be achieved through EBSCOhost but using the raw data supplied by them."

ICE Current Engineering Journals

IEEE/IET Electronic Library

not via Jisc and not explicitly permitting TDM

IMechE Journals Collection

IOP Option 2

IOS Press

ISPG SMP

JSTOR

Nature Publishing Group

Oxford University Press

Project Muse

PsycARTICLES

"Text and data mining is considered on a case by case basis. Any requests for text and data mining can be sent to APA and they will forward to the Publisher." Please contact us for approaches to the APA.

Royal Society of Chemistry

Sage Premier

ScienceDirect

SPIE Digital Library

not via Jisc and not explicitly permitting TDM

Springer *

Taylor & Francis

Wiley *

Digital copies

In the digital humanities, TDM can be facilitated by supplying a copy of a digital archive's files on a flash drive. Usually this copy is available only when the University Library has purchased the archive and therefore owns it fully and in perpetuity. This condition holds for a number of digital archives but not for all; some major archives are subscribed to rather than licensed in perpetuity. Here we list the archives that can be supplied to you as a copy on a flash drive. For archives where our licence with the publisher prevents us from supplying a copy, some sites offer limited "TDM-lite" functionality, for example the Visualize Results tool on the Gale Primary Sources Platform.

The Economist Times Digital Archive

ECCO - Eighteenth Century Collections Online

EEBO - TCP - Early English Books Online - Text Creation Partnership

TDA - Times Digital Archive

The University Library holds a perpetual access licence agreement for the following digital archives; a request for a copy of one of these archives would entail ordering a copy for TDM use from the publisher:

Codices Vossiani Latini (Brill)

Illustrated London News (Gale Cengage)

Making of modern law legal treatises (Gale Cengage)

Nineteenth century US newspapers (Gale Cengage)

Times of India (ProQuest)

Please complete the form at this link with basic information on your needs and we will get back to you to progress the request. We will supply the disc copy in a reasonable time-frame (about 2 weeks, depending on whether we need to make a new copy). On handover of the disc we will ask you to complete a simple form. This form is an agreement between you and the University Library that you may make a copy from the copy we supply and that you will return the copy disc to the University Library within 2 weeks of the date of handover. We want to work with you, so if you need more flexibility we will try to accommodate it. If you have any questions relating to this service please use the link above to contact us. Thank you.