Generative Artificial Intelligence (genAI) and Copyright

Guidance on copyright when working with genAI systems.

Text and data mining and AI training

Training a large language model (LLM) or similar generative AI system usually involves exposing the system to a large body of input texts or other works. It is currently unclear whether such training infringes copyright in the UK. However, if the genAI system is non-commercial, it seems likely that the training may rely on the text and data mining exception in section 29A of the Copyright, Designs and Patents Act 1988, provided there is lawful access to the training data and no copies are shared further.

Using openly licensed works for genAI training, such as those available under Creative Commons or similar licences, would be permissible provided that the specific use made complies with the licence terms.


Inputs and prompting

When an end user enters a text prompt into a genAI system, the prompt may qualify as a copyright literary work in its own right if the choice of expression is the user's own intellectual creation. In that case the end user would own the rights to the creative expression in their prompt. Note that this does not necessarily mean they own any copyright in the output the genAI system generates in response to that prompt.


Outputs and licences

The terms and conditions of any third-party or commercially available genAI system should be checked carefully to clarify ownership of any rights that might exist in generated outputs.

Such outputs may qualify for copyright protection in the UK if all necessary conditions are met. Section 9(3) of the Copyright, Designs and Patents Act 1988 allows the possibility of copyright protection for “computer-generated” literary, dramatic, musical or artistic works where there is no human author.

Whether the owner of such rights would be the end user who submitted the prompt, or the system operator/provider, could be determined by any contractual conditions in the end user licence agreement. If the system owner claims copyright in any protectable outputs, it is important to check under what terms they license the output to the end user.

It is, of course, possible that no copyright will exist in many genAI outputs under UK law.


Attribution

If you are creating your own LLM or similar genAI system and using openly licensed works as part of the training inputs, remember to meet any applicable attribution requirements if any copies are shared.

Most open licences require any onward copies to attribute the original author, work and licence terms. Failure to do so may be a breach of the licence. Works which are out of copyright and in the public domain do not have to be attributed, and can be freely used.
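As an illustration only, the sketch below shows one way a project might record the author, work and licence for each training input, so that a credit line can be produced whenever a copy is shared. The class, field names and credit wording are hypothetical and are not prescribed by any licence; always check the attribution wording that the specific licence requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingWork:
    """Attribution details for one training input (hypothetical field names)."""
    title: str
    author: str
    licence: str                 # e.g. "CC BY 4.0"; empty if no licence applies
    public_domain: bool = False  # True for works that are out of copyright


def attribution_line(work: TrainingWork) -> Optional[str]:
    """Return a credit line for an onward copy, or None if none is required."""
    if work.public_domain:
        # Public domain works do not have to be attributed.
        return None
    return f'"{work.title}" by {work.author}, licensed under {work.licence}'


# Example records; the titles and authors are invented for illustration.
works = [
    TrainingWork("Example Essay", "A. Author", "CC BY 4.0"),
    TrainingWork("Old Poem", "B. Poet", "", public_domain=True),
]

# Collect credit lines only for works that need one.
credits = [line for w in works if (line := attribution_line(w)) is not None]
```

Keeping this information alongside each work from the outset makes it easier to comply later if copies are shared onward.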


Further information

For any further help and support with copyright questions please contact us.

Ask a question

Email: library@sheffield.ac.uk

Phone: +44 114 222 7200