The Italian Data Protection Authority last month banned ChatGPT after the OpenAI large language model (LLM)-based service suffered a major personal data breach.
The official statement from the agency, known in English as the Italian SA, pointed out the following specific reasons for the ban:
“OpenAI’s ChatGPT…suffered its first major personal data breach.
The breach came during a March 20 outage and exposed payment-related and other personal information of 1.2% of the ChatGPT Plus subscribers who were active during a specific nine-hour window, according to a blog post by OpenAI Friday, March 24.
‘In the hours before we took ChatGPT offline on Monday, it was possible for some users to see another active user’s first and last name, email address, payment address, the last four digits (only) of a credit card number, and credit card expiration date. Full credit card numbers were not exposed at any time,’ OpenAI officials wrote…” (1)
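In its postmortem, OpenAI traced the exposure to a bug in the redis-py open-source client library: a cancelled request could leave a shared connection’s request and reply streams misaligned, so cached data belonging to one subscriber was returned to another. The snippet below is a deliberately simplified, hypothetical model of that general failure class, not OpenAI’s actual code; every name in it is invented for illustration.

```python
# Hypothetical sketch of the failure class described above: a shared
# connection whose request and reply queues fall out of sync after a
# cancelled request, so the next caller receives someone else's data.
from collections import deque

class SharedConnection:
    def __init__(self, cache):
        self.cache = cache        # backing store: user -> billing details
        self.pending = deque()    # requests the client believes are in flight
        self.replies = deque()    # replies the server has already queued

    def request(self, user):
        self.pending.append(user)
        self.replies.append(self.cache[user])  # the server answers every request

    def cancel_last(self):
        # BUG (illustrative): the client forgets the request, but the server's
        # reply stays in the stream, shifting every later reply by one slot.
        self.pending.pop()

    def next_reply(self):
        self.pending.popleft()    # client assumes this reply answers its request
        return self.replies.popleft()

conn = SharedConnection({"alice": "alice@example.com, card ending 0042",
                         "bob": "bob@example.com, card ending 0017"})
conn.request("alice")
conn.cancel_last()            # Alice's reply is never consumed...
conn.request("bob")
print(conn.next_reply())      # ...so Bob is shown Alice's details
```

The point of the sketch is the invariant that breaks: once the two queues disagree by one entry, every subsequent reply on that connection is delivered to the wrong requester.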
Writ large, ChatGPT violates the Italian SA’s Personal Data Protection Code and the European Union’s General Data Protection Regulation (GDPR): failures to notify users of data collection, and to justify the hoovering of their information, run afoul of the GDPR, raising the possibility that other countries in the bloc may follow suit in cracking down on the program. (2)
Wired magazine also reports that the Italian ban “may just be the beginning of ChatGPT’s regulatory woes:
The action is the first taken against ChatGPT by a Western regulator and highlights privacy tensions around the creation of giant generative AI models, which are often trained on vast swathes of internet data. Just as artists and media companies have complained that generative AI developers have used their work without permission, the data regulator is now saying the same for people’s personal information.
Similar decisions could follow all across Europe. In the days since Italy announced its probe, data regulators in France, Germany, and Ireland have contacted the [Italian SA] to ask for more information on its findings. ‘If the business model has just been to scrape the internet for whatever you could find, then there might be a really significant issue here,’ says Tobias Judin, the head of international at Norway’s data protection authority, which is monitoring developments. Judin adds that if a model is built on data that may be unlawfully collected, it raises questions about whether anyone can use the tools legally.
Europe’s GDPR rules, which cover the way organizations collect, store, and use people’s personal data, protect the data of more than 400 million people across the continent. This personal data can be anything from a person’s name to their IP address—if it can be used to identify someone, it can count as their personal information. Unlike the patchwork of state-level privacy rules in the United States, GDPR’s protections apply even if people’s information is freely available online. In short: Just because someone’s information is public doesn’t mean you can vacuum it up and do anything you want with it.
The [Italian SA] believes ChatGPT has four problems under GDPR: users (and the people whose data was scraped) were never told their information was being collected; there is no legal basis for harvesting personal data at the scale needed to train the model; the system can generate inaccurate information about real people; and it lacks age controls to keep out children under 13.
‘The Italians have called their bluff,’ says Lilian Edwards, a professor of law, innovation, and society at Newcastle University in the UK. ‘It did seem pretty evident in the EU that this was a breach of data protection law.’” (3)
“OpenAI isn’t alone. Many of the issues raised by the Italian regulator are likely to cut to the core of all development of machine learning and generative AI systems, experts say. The EU is developing AI regulations, but so far there has been comparatively little action taken against the development of machine learning systems when it comes to privacy.
‘There is this rot at the very foundations of the building blocks of this technology—and I think that’s going to be very hard to cure,’ says Elizabeth Renieris, senior research associate at Oxford’s Institute for Ethics in AI and author on data practices. She points out that many data sets used for training machine learning systems have existed for years, and it is likely there were few privacy considerations when they were being put together.
‘There’s this layering and this complex supply chain of how that data ultimately makes its way into something like GPT-4,’ Renieris says. ‘There’s never really been any type of data protection by design or default.'” (4)
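Renieris’s phrase “data protection by design or default” has a concrete engineering reading: minimization and redaction built into the ingestion pipeline itself rather than retrofitted later. As an illustrative sketch only (a toy example, not how any LLM vendor’s pipeline actually works), a by-default ingestion step might redact obvious identifiers before text is ever persisted to a training corpus:

```python
# Illustrative sketch of "data protection by design or default" in a
# training-data pipeline: obvious identifiers are redacted before any
# document is persisted. Real pipelines need far more than regexes
# (NER-based PII detection, provenance records, a documented legal basis).
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")  # GDPR can treat IPs as personal data

def redact(text: str) -> str:
    """Replace identifiers with placeholders instead of storing them."""
    text = EMAIL.sub("[EMAIL]", text)
    return IPV4.sub("[IP]", text)

def ingest(documents, corpus):
    """Redaction happens by default, before anything reaches the corpus."""
    for doc in documents:
        corpus.append(redact(doc))

corpus = []
ingest(["Contact jane.doe@example.org from 192.168.0.12 about the invoice."], corpus)
print(corpus[0])  # "Contact [EMAIL] from [IP] about the invoice."
```

Renieris’s point is that the datasets underlying systems like GPT-4 were assembled long before anyone asked for a step like this, which is why the “rot at the foundations” is so hard to cure retroactively.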