AI and OER: How a legally compliant application can succeed

An article by Yulia Loose

October 28, 2024

Image by Sarah Brockmann, released under CC 0 (1.0)

Do artificial intelligence (AI) and open educational resources (OER) go together? Does their relationship have a future? In this blog post, we show why AI and OER are a match and how AI tools can be used for OER creation in a legally compliant way.

This is the second post in our blog series “AI in the university”. The first post dealt with AI detectors; in this post we look at AI and OER from a legal perspective. It will soon be followed by an article on prompt tips for high-quality OER. We will then conclude the series with a post on mandatory AI skills in higher education.

Recently, in the attention economy, the topic of “AI” has almost supplanted the topic of “OER” in the university context. Everyone is talking about AI: AI projects are being funded, AI workshops are fully booked. OER has moved down the list of priorities. These developments are understandable. And yet it should not be forgotten that not only AI, but also OER has great potential for teaching. So why not combine the two topics to make the most of their potential?

The interaction between AI and OER can bring considerable added value for teaching: AI can prepare open educational materials in a didactically clever and appealing way, structure them and much more. In turn, OER can help to ensure that AI programs are trained with high-quality materials and thus deliver more reliable results.

And what does the law say about this relationship? To what extent is a legally compliant use of AI programs in OER even possible? We will try to answer these questions here. We address only OER-relevant aspects and leave data protection aside. Please also note that the law on AI is developing rapidly. The explanations in this blog post reflect the legal situation as of October 2024.

Is AI output protected by copyright?

AI output (e.g. images, texts, infographics, etc.) is in the public domain under German copyright law, i.e. free from copyright. This is because, according to the clear wording of Section 2 (2) UrhG, only personal intellectual creations qualify as works, so only humans can be authors. Consequently, the AI cannot hold any copyright in its output.

Operators of AI programs, as legal entities, are not considered authors either. Nor does authorship of the AI output lie with the programmers of the AI program, because AI generally cannot be controlled: it functions autonomously. For the same reason, users of the program are not authors of the output. It is theoretically conceivable to steer AI output with very clever prompts (commands). However, it will generally not be possible to control the AI entirely in this way. As a result, although a clever prompt can itself be protected by copyright as the user’s work, the output remains in the public domain.

The situation may be different if the AI is used only as a tool in the creative process (i.e. only as a support) and the actual creative contribution can be objectively attributed to a human being. This is comparable to the use of an image editing program. In this case, the copyright lies with the creator of the work.

Copyright protection can also arise if the AI product is extensively and creatively reworked by a human. The result of the adaptation must be sufficiently individual, i.e. bear the personal stamp of the creator of the work. This is not the case if only colors or sizes are adjusted or additions are made. Whether the required individuality and creativity (the threshold of originality) are present always depends on the individual case. You can find out more about this in the article AI and OER: How well do they go together? by Georg Fischer.

Can AI-generated output be used in OER? What applies to prompts?

As pure AI output is in the public domain, it can generally be used in OER materials. However, this requires that the output does not infringe any third-party rights (e.g. copyright, personality rights, trademark rights, etc.).

Publishing infringing content under an open license can lead to a formal warning (cease-and-desist letter). This is because copyright law does not protect a good-faith belief that the content used does not infringe third-party rights. Rather, users have a duty to check the content they use for infringements, which is not easy with AI-generated content. Users cannot know which sources were used to train the AI or whether third-party rights were infringed in the process. Theoretically, the AI may even reproduce third-party works from its training data in its output, in whole or in part, but this is the exception.

Most AI programs do not copy other people’s works, but derive patterns from training data, similar to the human brain. The probability of receiving a warning is therefore not particularly high. Nevertheless, it is important not to accept the generated result without reflection, but to check it for legal infringements, e.g. with plagiarism software or a reverse search in the trademark register. If possible, the AI-generated result should be revised several times with additional prompts or other programs.

The European AI Regulation (AI Act), which entered into force on August 1, 2024, requires operators of generative AI programs to develop a copyright compliance strategy and publish detailed summaries of training data (recitals 106 and 107). However, it will take some time for AI operators to fulfill this obligation, as the regulation is being implemented gradually.

Caution is also required when formulating prompts. Do not publish prompts that contain copyrighted works or parts thereof, third-party brands, personal photos or other personal data. Doing so can also result in a formal warning.

Can the AI be trained with copyrighted works? Can I prevent the AI from being trained with my works?

In order for the AI to be able to generate content itself, it must first be trained with openly available sources (i.e. also with OER). This is probably legally permissible in Germany. The predominant view is that the copyright exception for text and data mining in Section 44b UrhG justifies the training.

Training AI with third-party content is not so easy to prevent. Many readers will remember the case involving Meta in the summer of 2024. The NRW consumer advice center had issued a warning to Meta because the company wanted to use user data on Instagram and Facebook to train AI without the users’ consent. Ultimately, the users were allowed to object to the training. However, the way to object was somewhat complex.

For content on websites and blogs, it is possible to declare a reservation of use against the training, the so-called opt-out from data mining. This must be done by storing a file in the root directory of the domain, i.e. in machine-readable form, so that it can be read by web crawlers. Web crawlers are programs that search the Internet and analyze websites. How exactly this works is explained, for example, in the tutorial How to block OpenAI’s ChatGPT, Google’s Gemini and other bots that want to use your texts for their AI by Kai Spriestersbach. However, an opt-out can also result in website content not being displayed as search results.
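One common form of such a machine-readable file is a robots.txt in the root directory. The following Python sketch generates one and checks it with the standard library's robots.txt parser. The crawler names (GPTBot, Google-Extended, CCBot) are assumptions for illustration and change over time, so check each operator's documentation. This is an illustration only, not a statement that a robots.txt file alone satisfies the legal requirements for a reservation of use.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens for illustration; verify the current names
# in each operator's documentation before relying on them.
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot"]


def build_robots_txt(ai_crawlers):
    """Return robots.txt text that blocks the given crawlers and allows all others."""
    blocks = [f"User-agent: {name}\nDisallow: /\n" for name in ai_crawlers]
    # All other crawlers (e.g. ordinary search engines) remain allowed.
    blocks.append("User-agent: *\nAllow: /\n")
    return "\n".join(blocks)


def is_allowed(robots_txt, user_agent, url):
    """Check whether a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(AI_CRAWLERS)
print(robots_txt)  # save this text as robots.txt in the root directory of the domain
```

Listing the AI crawlers individually, rather than blocking every crawler, is one way to reduce the risk of content disappearing from ordinary search results. Note that robots.txt is only a request: it works only for crawlers that choose to honor it.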

Do I have to label AI-generated content as such?

Since AI-generated content is in the public domain, there is currently no legal obligation to cite the source. Nevertheless, materials created with AI should always be labeled as such. This makes it easier for subsequent users to understand the creation process and the origin of the material. It is also made clear that the material is in the public domain, i.e. not subject to any copyright restrictions. The labeling could, for example, be as follows: “AI-generated. Public domain. Created with the program XY. Prompt: XY. Edits: XY”.

The OER material in which AI-generated content is used can be openly licensed. Choose licenses that restrict subsequent use as little as possible, i.e. CC0, CC BY and CC BY-SA. The public domain AI content must be excluded from the license: “This OER material is licensed under license XY. The marked AI-generated images are not covered by the license. These are in the public domain.”
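If you label many materials, the two notices quoted above can be generated from a small template. The following Python sketch is one possible way to do this. The class and function names and the example values are hypothetical; only the wording of the notices comes from this post.

```python
from dataclasses import dataclass


@dataclass
class AiContentLabel:
    """Hypothetical helper holding the details named in the suggested label."""
    tool: str    # program used to generate the content
    prompt: str  # prompt that produced the content
    edits: str   # short description of any manual edits

    def text(self) -> str:
        # Wording follows the label suggested in this post.
        return (
            "AI-generated. Public domain. "
            f"Created with the program {self.tool}. "
            f"Prompt: {self.prompt}. Edits: {self.edits}"
        )


def license_notice(license_name: str) -> str:
    """Return the notice that excludes public domain AI content from the license."""
    return (
        f"This OER material is licensed under {license_name}. "
        "The marked AI-generated images are not covered by the license. "
        "These are in the public domain."
    )


# Example values are placeholders, not recommendations.
label = AiContentLabel(
    tool="ExampleTool",
    prompt="A diagram of the water cycle",
    edits="none",
)
print(label.text())
print(license_notice("CC BY 4.0"))
```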

Something different applies to the so-called remixing (merging) of materials into a new work. If AI-generated content and the OER material are merged in such a way that a new work is created (e.g. an image collage, a video), the new overall work may be openly licensed. Till Kreutzer illustrates this with an example in the video Open Educational Resources, Copyright and AI.

Conclusion

AI programs can be used to create OER in compliance with the law if the above rules are observed. If you would like to find out more about the topic or explore it in greater depth, attend our workshop AI & OER in use: Enhancing OER in a diverse and legally compliant way with AI. For registration and further questions, please contact info@twillo.de.
