Text & Data Mining: Law on TDM

Our proposal

CILIP, the UK library sector's leading representative organisation, explains that: "The nature of UK copyright exceptions is that they are defences to accusations of infringement rather than rights, and this puts the onus of responsibility on the person doing or facilitating the copying to ensure it is legal. Similarly, the hard-won changes to the law that mean that contracts cannot override exceptions are fundamental to equal access to information in today’s digitally connected world". This guide aims to make a start on exposing the breadth of content (mostly library-subscribed) that Cambridge researchers may be able to use when applying text and data mining (TDM) techniques in their research. If you think your TDM could be facilitated by an initial step of clearance, confirming that it falls within the legal exception under UK law, we encourage you to get in touch first so that we can support your enquiries collaboratively.

Law on TDM


Since 2014 a statutory exception to UK copyright law has allowed TDM to be performed, provided the following two criteria are satisfied:

1. The TDM practitioner has, either as an individual (e.g. with a personal subscription) or as a member of the University (which holds subscription access rights), lawful access to the resource. 

2. The analysis is undertaken for the purposes of non-commercial research.

Note that criterion 2 requires that the research itself is not for commercial purposes: it may be funded by commercial partners, but its purpose must not be commercial.

A further criterion stipulates that any copy made under this exception must include an acknowledgement (e.g. in the form of a citation) unless making this acknowledgement would be impractical.


The law enabling TDM overrides any other agreement or contract. Provided both of the above criteria (and the third where possible) are satisfied, the legal right to perform TDM remains in force even where carrying out the TDM is made difficult or impossible in practice.

Where services or publishers have put technical protection measures in place to safeguard their platforms' security, and these measures impede or stop lawful TDM, the TDM practitioner should not try to circumvent them. Instead, the TDM practitioner is encouraged to negotiate with the publisher in the first instance to authorise access. The University Library can contact the publisher on behalf of the TDM practitioner and will endeavour to resolve such cases quickly. Please see the Support for your TDM page of this guide for this service.

Services and publishers have sought to provide tools to help with TDM, and these may meet the needs of researchers. Providers can introduce ‘added value’ measures, such as APIs that facilitate TDM activities and even enhance TDM functionality. Though a benefit in themselves, these tools may require a click-through licence to be accepted, and clauses in these licences may restrict what the TDM practitioner wishes to do with the text or data. Please contact us for assistance if this is your situation.

Even where a Cambridge researcher based outside the UK has access to the content to be mined through the University Library's licence, any copies made for TDM under the exception must be made in the UK, by Cambridge colleagues based in the UK. Jisc's guide states: "The TDM exception in UK law permits the act of copying copyright material. In order for the individual researcher to be covered by the TDM exception, the act of copying would need to take place in the UK."

The CDPA 1988 (Copyright, Designs and Patents Act 1988) determines how far "outputs" from TDM can be shared and clearly distinguishes between a "fact or a collection of facts" (e.g. a count of how often specific words appear in a text) and material that results from an intellectual process. Again, Jisc's guide explains that "there is no copyright in a fact or a collection of facts unless some intellectual rigour has applied in the interpretation or presentation of those facts", and thus such data can be shared regardless. Where copies made for TDM do contain material that would be protected, the "quotation exception" applies. This exception allows sharing with individuals not involved in the original TDM activity, provided that:

The work has been made available to the public (i.e. it is published);

The use of the quotation does not exceed what can be considered fair dealing;

The amount quoted is no more than is required for the specific purpose;

The quotation is accompanied by a sufficient acknowledgement, where possible.

Further, with respect to fair dealing, it is important to note that fair dealing only needs to be considered under the TDM exception where the outputs from TDM that are shared (or published) contain third-party works (i.e. works not covered directly by the licence that permits TDM). Wherever there is any doubt as to whether this situation applies, it is advisable to get advice, which we can seek to provide (please contact us via this link).

Again, UK researchers may benefit from the quotation exception, but researchers not based in the UK must comply with the laws governing the dissemination of the outputs abroad.

Finally, the quotation exception, like the TDM exception, overrides any licence terms that restrict the sharing of such outputs beyond the four conditions listed above.
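As an illustration of the kind of output described above as a "fact or a collection of facts" (a count of how often specific words appear in a text), the following is a minimal sketch in Python. The function name `word_frequencies` and the sample sentence are hypothetical and chosen only for this example; real TDM workflows will differ, and whether a particular output can be shared still depends on the conditions set out in this section.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word appears in the text, ignoring case.

    Hypothetical helper for illustration only: the result is a tally of
    words, not a copy of the passages in which they appear.
    """
    # Lower-case the text and pull out runs of letters (and apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Invented sample text, standing in for a lawfully accessed resource.
sample = "The law on text mining. The law on data mining."
counts = word_frequencies(sample)

# Show the tally for a few words of interest.
for word in ("law", "mining", "text"):
    print(word, counts[word])
```

The output here is only a set of word counts; it contains no extract of the source text itself, which is the distinction the section draws between a collection of facts and material that may need to rely on the quotation exception.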



Further sources of information

The text and data mining copyright exception: benefits and implications for UK higher education: Helping you understand the legal implications of the new text and data mining copyright exception.

This guide by Jisc lays out the law as currently pertaining to TDM, including advice on its application to sharing research outputs, research projects with partners outside the UK, the quotation exception on re-use of data, and the content of resources under a CC-BY Creative Commons licence.

Legal Guidelines for TDM Practitioners

These guidelines by the FutureTDM project highlight the areas you need to consider where you may need further legal advice. They provide an excellent context for the practitioner and summarise the main points from the more detailed guide produced by the project: FutureTDM Practitioner Guidelines.


The Law on TDM in Europe: an introduction

The aim of this UKSG webinar was to introduce the audience, in particular non-lawyers, to the legal framework of text and data mining, focusing on the main aspects of the law at the European level. (NB: though this webinar has passed, it is possible to get the recording by registering via the link "register for the recording..." at the top of the page.)

© Cambridge University Libraries