This is the final part of our four-part series on the EUIPO study on GenAI and copyright. Read parts 1, 2, and 3.
The EUIPO study provides detailed insights into the evolving relationship between GenAI and copyright law, highlighting both the complex challenges and emerging solutions in this rapidly developing field. As discussed in the previous parts of this series, the study addresses crucial issues at both the training (input) and deployment (output) stages of GenAI systems.
GenAI Input: Key Considerations
In its study, the EUIPO seems to advance a potentially flexible and broad interpretation of the requirements of Article 4 of the 2019 Copyright in the Digital Single Market Directive (CDSM Directive) for achieving valid opt-outs from text and data mining (TDM).
The EUIPO seems to regard the “expressly” requirement as not requiring references to specific works, explicit targeting of TDM use cases, or references to enabling legal provisions. The EUIPO’s approach would appear to afford greater flexibility to rightsholders, particularly benefiting smaller entities with limited resources (Takeaway 1). In addition, the EUIPO’s stance on the “by the rightsholder” requirement could extend to licensees and representatives acting on behalf of rightsholders, reflecting the practical realities of content management in the digital ecosystem. This interpretation acknowledges the complex structure of content distribution chains, where the entity with technical ability to implement an opt-out may be several steps removed from the original creator (Takeaway 2). The EUIPO’s report seems to regard natural language reservations as potentially satisfying the “appropriate means” requirement as AI capabilities advance and blur the line between “human-readable” and “machine-readable” content (Takeaway 3).
While various legal and technical opt-out measures exist, each has certain potential limitations. This suggests a pressing need for standardised opt-out solutions, potentially facilitated by IP offices to provide greater certainty and trust in the ecosystem (Takeaways 4 and 5).
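To make the idea of a technical opt-out concrete, the sketch below shows how a crawler might check a machine-readable reservation before collecting content. It assumes the reservation is expressed in a robots.txt file, which is only one of several possible measures, and it uses a hypothetical crawler name, `ExampleAIBot`, and a placeholder URL. It is an illustration only and says nothing about whether such a file satisfies Article 4 of the CDSM Directive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    print(may_crawl(ROBOTS_TXT, "ExampleAIBot", page))
    print(may_crawl(ROBOTS_TXT, "SearchBot", page))
```

A check of this kind works only if crawlers identify themselves honestly and choose to respect the file, which is one of the limitations of such measures.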
GenAI Output: Key Challenges
On the output side, several critical issues emerge. The EUIPO suggests that the legal status of retrieval augmented generation (RAG) remains unclear, creating uncertainty about whether it qualifies as text and data mining (Takeaway 1). Technical solutions for transparency, while evolving, remain incomplete, with no universal standard for labelling or detecting AI-generated content yet meeting all requirements of the AI Act (Takeaway 2).
Model retention and regurgitation of protected materials present a growing concern (Takeaway 3). Various technical mitigation approaches have been implemented, including data deduplication, content filtering systems, and post-training techniques such as model editing and “unlearning” (Takeaway 4).
The Path Forward
Moving forward, emerging technical solutions offer promising approaches to address copyright concerns. “Model unlearning” techniques enable the removal of protected content without complete retraining, while targeted editing methods allow for precise modifications to learned information. Data deduplication during training may reduce memorisation risks, addressing a recurring root cause of potential reproduction of copyrighted works.
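As a rough illustration of data deduplication, the sketch below removes exact duplicates from a list of training documents by hashing a normalised form of each text. The function names and the normalisation rule (collapsing whitespace and lowercasing) are assumptions made for this example. Production pipelines typically also look for near-duplicates, which this sketch does not attempt.

```python
import hashlib
import re


def normalise(text: str) -> str:
    """Collapse runs of whitespace and lowercase, so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(documents: list[str]) -> list[str]:
    """Keep the first occurrence of each document, comparing normalised content."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in documents:
        digest = hashlib.sha256(normalise(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


if __name__ == "__main__":
    corpus = ["A  protected poem", "a protected poem", "An unrelated text"]
    print(deduplicate(corpus))
```

Because a work repeated many times in a training set is more likely to be memorised, removing the repeats before training is one way to lower the risk of verbatim reproduction.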
Comprehensive risk management strategies are becoming essential, including corporate governance frameworks addressing the entire AI lifecycle, thorough documentation, output filters, and regular evaluation. IP offices could also provide valuable guidance on technical standards and potentially serve as neutral arbiters in establishing best practices.
As the GenAI landscape continues to evolve, stakeholders must balance innovation with copyright law through a combination of legal frameworks, technical solutions, and industry collaboration. The EUIPO study provides a foundation for this ongoing conversation.
