The main challenges for generative AI privacy compliance: the EDPB's new report on the ChatGPT Taskforce

27 June 2024

On April 13, 2023, the European Data Protection Board (“EDPB”) established a taskforce to foster cooperation and exchange information on possible enforcement actions on the processing of personal data in the context of ChatGPT (“ChatGPT Taskforce”). As is commonly known, ChatGPT is the generative artificial intelligence (“G-AI”) service provided by the company OpenAI OpCo, LLC.

The work of the ChatGPT Taskforce has led to a Report on the work of the ChatGPT Taskforce, which was recently adopted by the EDPB (“Report”). The Report provides the preliminary views of the ChatGPT Taskforce without prejudging the analysis that the Supervisory Authorities will have to conduct in their respective investigations.

The Report highlights several important privacy-related issues that could potentially impact all developers and deployers of G-AI solutions. Indeed, the Large Language Models (“LLMs”) underlying such solutions are trained on data gathered through so-called web scraping, a technique that enables the automated collection and extraction of various types of information (including personal data and even special categories of personal data) from different publicly available sources on the Internet.

Additionally, the taskforce members have developed a common questionnaire as a possible basis for their exchanges with OpenAI, which is published as an annex to the Report. This set of questions aims to promote a coordinated approach to the investigation and can also serve as a useful guide for other providers in developing G-AI systems that comply with data protection law.

But which data protection principles apply to ChatGPT?

  • Lawfulness

Generally speaking, the EDPB recalls that each processing of personal data must rely on one of the legal bases set forth in Article 6(1) of Regulation (EU) 2016/679 (“GDPR”) and, where applicable, meet the additional requirements for processing special categories of personal data pursuant to Article 9(2) of the GDPR.

The Report examines the use of legitimate interest for the collection and processing of personal data to train ChatGPT and sets out the limits within which, according to the EDPB, this could be considered acceptable. Indeed, the EDPB emphasises that while legitimate interest may potentially be used as a legal basis, it should be supported by a proper legitimate interest assessment (LIA). Adequate safeguards to reduce undue impact on data subjects should also be implemented, as these may shift the balancing test in favour of the controller. Such safeguards could include, for example:

  1. technical measures, defining precise collection criteria;
  2. ensuring some categories of personal data are not collected or some sources (e.g. public social media profiles) are excluded from data collection;
  3. erasure and anonymisation of personal data collected via web scraping before the training stage.
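
As a purely illustrative sketch of how the three safeguards above might be combined in a pre-training filter, consider the following Python example. Nothing in it comes from the Report or from OpenAI: the excluded domain, the keyword list, the e-mail pattern and all function names are hypothetical assumptions. In particular, keyword matching is only a crude proxy for detecting special categories of personal data, and masking e-mail addresses alone would not amount to anonymisation within the meaning of the GDPR.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical exclusion list, e.g. domains hosting public social media profiles.
EXCLUDED_DOMAINS = {"social.example"}

# Hypothetical keywords used as a crude proxy for special categories of data.
SENSITIVE_TERMS = {"diagnosis", "religion", "trade union"}

# Simplified e-mail pattern; real identifiers are far more varied.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class ScrapedPage:
    url: str
    text: str


def is_excluded_source(url: str) -> bool:
    """Safeguard 2: skip sources that are excluded from collection."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS)


def mentions_sensitive_terms(text: str) -> bool:
    """Safeguards 1 and 2: apply a collection criterion on content."""
    lowered = text.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)


def redact_emails(text: str) -> str:
    """Safeguard 3 (partial): erase one kind of direct identifier."""
    return EMAIL_RE.sub("[REDACTED]", text)


def prepare_for_training(pages: list[ScrapedPage]) -> list[ScrapedPage]:
    """Drop excluded or sensitive pages and redact e-mails in the rest."""
    kept = []
    for page in pages:
        if is_excluded_source(page.url):
            continue
        if mentions_sensitive_terms(page.text):
            continue
        kept.append(ScrapedPage(page.url, redact_emails(page.text)))
    return kept
```

In this sketch, filtering happens before any text reaches the training stage, which mirrors the Report's point that safeguards applied at collection time can weigh in the balancing test. A real pipeline would need far more robust detection and a documented assessment of what remains in the dataset.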

  • Fairness

The principle of fairness requires that personal data should not be processed in a manner that is unjustifiably detrimental, unlawfully discriminatory, unexpected or misleading to the data subject.

A crucial aspect of this principle is that the risks of the processing should not be transferred to data subjects. This means that ensuring compliance with the GDPR is the responsibility of OpenAI and not of the data subjects, even when individuals themselves input personal data.

  • Information obligations, transparency and data accuracy

The Report outlines that if personal data are collected via web scraping from publicly accessible sources or while directly interacting with ChatGPT, the controller should provide proper information to the data subjects.

Given the vast amount of data collected via web scraping, it is often not practicable or possible to inform each data subject individually. Thus, the controller shall provide all the information set forth in art. 14(1) and (2) of the GDPR, such as general information and contact details of the controller, categories of personal data being processed, data retention period, rights of the data subjects, etc.

Conversely, when personal data are collected directly from the data subject (i.e. pursuant to art. 13 of the GDPR), the controller shall inform them that the content (i.e. the prompt, the uploaded files and ChatGPT responses) is used to train and improve the LLM.

Furthermore, due to the probabilistic nature of ChatGPT, the EDPB highlights that the controller should not only provide proper information about the probabilistic output creation mechanism and its limited reliability, but also disclose that the generated text, although syntactically correct, may be biased or made up.

  • Rights of the data subjects

The Report stresses the importance of data subjects being able to effectively exercise their rights. While OpenAI, as controller, provides information on how to exercise these rights in its privacy policy[1], the EDPB – pursuant to art. 12(2) and Recital 59 of the GDPR – asserts that the controller shall continue to improve the modalities for facilitating the exercise of such rights. This is particularly relevant as OpenAI suggests shifting from rectification to erasure when rectification is not feasible due to the technical complexity of ChatGPT.

In line with the principles of privacy by design and by default, the controller shall adopt appropriate measures, both when determining the means of processing and at the time of the processing itself, to effectively implement data protection principles and integrate the necessary safeguards into the processing in order to meet GDPR requirements and protect the rights of data subjects.

[1] Europe privacy policy of OpenAI, paragraph 6.

2024 - Morri Rossetti
