Recently, more and more posts have been popping up on LinkedIn and elsewhere about how to optimize sales pipelines and other business processes using ChatGPT or one of its siblings. While the proposition is very tempting, it raises huge problems for privacy and the protection of personal data, in particular as required under the GDPR.
One of the primary reasons why generative AI is currently not well-suited for handling personal data is that the models are trained on vast amounts of text data, including both public and private sources and the data ‘fed’ to them in prompts. This means that any personal or sensitive information you feed into the model may be retained and potentially reused in future outputs, putting both the data subject and the data controller (the person or organization using the language model) at risk. Firstly, the data is transmitted to the provider of the AI, where it may be distributed and reused without any control on your part, let alone that the required Data Processing Agreement is not in place and the guarantees needed for a transfer of data to the US are not provided. Secondly, the data could resurface in an unpredictable manner when other users prompt the AI for something similar.
Using generative AI to analyse CVs or personal profiles, for example, constitutes a clear violation of the GDPR, the European Union’s General Data Protection Regulation. This regulation mandates that businesses and organizations protect personal data by limiting its processing and ensuring that it is only used for specific, legitimate purposes. By feeding personal data into a generative AI model, you are essentially using it for a purpose outside the scope of the original legal basis, which the GDPR does not permit.
Some data protection authorities, such as the Italian Garante Privacy, have already banned ChatGPT in their countries over privacy concerns. Others, including the German Datenschutzkonferenz, the French CNIL, the Privacy Commissioner of Canada and the European Data Protection Board, have launched investigations and are expected to take similar measures in the short term.
Clearly, if you wilfully feed personal data to a system like ChatGPT, there is little doubt that you are violating the GDPR in multiple ways.
Even though ChatGPT may become available in Italy again if certain requirements are met by April 30th, it’s unlikely to be a good idea to use it for personal data.
A somewhat adjacent consideration is that generative AI models are often trained using biased data, which can lead to biased outputs. If personal data is fed into the model, the biases inherent in the training data can be amplified, potentially leading to discriminatory or harmful outputs. This is particularly problematic in the context of personal data, where even unintentional discrimination or harm can have serious consequences for the data subject.
In short, using generative AI for anything involving personal data is a risky proposition that should be avoided until solid measures to protect this data are in place. There are plenty of business opportunities in this space for solutions that prioritize privacy and data protection, for example through differential privacy or federated learning. These approaches are designed to protect personal data by limiting its exposure and minimizing the risk of re-identification, while still allowing for meaningful analysis and insights. TechGDPR has developed a support program for ethical and compliant use of Artificial Intelligence, which may help remediate these issues.
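To make the idea of differential privacy a little more concrete, here is a minimal sketch of the Laplace mechanism applied to a simple counting query. It is an illustration under stated assumptions, not a compliance tool: the function name `dp_count`, the `epsilon` value and the sample data are all hypothetical, and a production system would need a vetted library and a proper privacy budget.

```python
import random


def dp_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of items matching `predicate`.

    Hypothetical helper for illustration. Adding or removing one person
    changes a count by at most 1 (sensitivity 1), so Laplace noise with
    scale 1/epsilon is added to the true count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    scale = 1.0 / epsilon
    # The difference of two exponential samples with mean `scale`
    # follows a Laplace distribution with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    # Made-up ages; no real personal data.
    ages = [23, 35, 41, 29, 52, 38]
    print(dp_count(ages, lambda age: age > 30, epsilon=1.0))
```

A smaller `epsilon` adds more noise and therefore gives stronger protection to any single individual in the data, at the cost of a less accurate result.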