Our hypothesis on AI hallucinations is based on a quote from the OODAcon 2024 panel "The Next Generative AI Surprise": "Artificial Intelligence hallucinations may sometimes provide an output that is a very creative interpretation of something or an edge case that proves useful." With that framing in mind, the following is the first installment in our survey of differing perspectives on the threats and opportunities created by AI hallucinations.
The following are working definitions of AI hallucinations, generated by LLMs in response to the prompt: "What is an LLM hallucination?"
An LLM hallucination refers to instances where a language model generates information or responses that, while coherent and plausible-sounding, do not align with factual reality. This phenomenon is a byproduct of the model’s design to predict and generate text based on statistical patterns recognized in its vast training data. Essentially, these models excel at crafting sentences that mimic human language use, drawing on the immense corpus of text they’ve been trained on to produce responses that seem contextually appropriate. However, this capability does not equate to understanding the world or the veracity of the information being generated.
The term “hallucination” in this context is apt because it captures the essence of the problem: the model is creating something out of nothing, much like a mirage. It’s a fascinating yet challenging aspect of working with generative AI, as it requires users to critically assess the output of these models, understanding that while the language may sound convincing, it may be entirely fabricated or misleading.
This characteristic of LLMs underscores a critical limitation in their current state of development. Understanding LLM hallucinations is crucial for anyone working with or relying on AI-generated content: it highlights the importance of maintaining a critical perspective on these models' output and reminds us that while AI can augment human capabilities in many ways, it also introduces new challenges that require careful navigation.
An “LLM hallucination” refers to instances where a large language model (LLM), like ChatGPT, generates plausible-sounding information that is incorrect or completely fabricated. This occurs due to how these models are trained and operated. Here are some key points about LLM hallucinations:
Training on Patterns: LLMs are trained on vast amounts of text data and learn to predict the next word in a sequence based on patterns observed in the training data. During operation, they do not consult external databases or the Internet, nor do they "understand" content as humans do. (A toy illustration of this next-word mechanic appears after this list.)
Lack of Factuality: Since LLMs generate responses based on statistical likelihoods rather than factual verification, they can sometimes create responses that seem logical and coherent but are factually incorrect. This is particularly common when the model is asked about niche topics or specific details that may not be well-represented in the training data.
Context and Complexity: The accuracy of an LLM’s output can also depend on the complexity of the query and the context in which information is presented. Misinterpretations or slight errors in understanding the context can lead to responses that deviate from the truth.
Misinformation and Data Quality: The training data for LLMs can include inaccuracies, biases, and misinformation, which can be reflected in the model's outputs. The model might also make confident assertions about topics for which its training data was low-quality or incorrect.
Due to these issues, while LLMs like ChatGPT are powerful tools for generating human-like text, their outputs must be critically evaluated, especially for tasks requiring high accuracy or reliance on up-to-date information. Users need to verify facts from reliable sources when accuracy is crucial.
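To make the "training on patterns" point above concrete, here is a minimal, purely illustrative sketch in Python using only the standard library. Real LLMs are neural networks operating over tokens at vastly greater scale; this toy bigram model only shows the shape of the problem: it extends text by following word-to-word statistics, with no notion of whether the resulting sentence is true.

```python
import random
from collections import Counter, defaultdict

# Toy "training data": the model only ever sees these strings.
corpus = [
    "the capital of france is paris",
    "the capital of spain is madrid",
    "the capital of france is a large city",
]

# Count which word follows which (a bigram table) -- a crude stand-in
# for the statistical patterns an LLM learns at far greater scale.
bigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        bigrams[current_word][next_word] += 1

def generate(prompt: str, max_words: int = 8) -> str:
    """Extend the prompt by repeatedly sampling a statistically likely next word.

    Nothing here checks facts; the output looks fluent only because it
    follows observed word-to-word frequencies.
    """
    words = prompt.split()
    for _ in range(max_words):
        candidates = bigrams.get(words[-1])
        if not candidates:
            break
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the capital of"))
# Possible output: "the capital of spain is paris" -- coherent in form,
# wrong in fact: a miniature illustration of a hallucination.
```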
According to the MIT Technology Review EmTech 2023, "In generative AI systems, 'hallucinations' occur when large language models fabricate information, presenting fiction as fact. Research is underway at MIT to mitigate and eliminate the manufactured misinformation by generative AI systems in hopes of developing more trustworthy AI systems."
In the interim, before AI hallucinations are solved (if they ever are), how might they serve as a creative interpretation of something or an edge case that proves useful? The following is a broad spectrum of research findings on LLM hallucinations, with an eye toward "creative interpretations and edge cases" of both benevolent and malevolent intent; at this point in the generative AI onslaught, we remain agnostic and open.
ChatGPT can offer coding solutions, but its tendency for hallucination presents attackers with an opportunity. Here’s what we learned.
In early 2023, Vulcan Cyber documented a creative edge case, for good or ill, in which LLM hallucinations give attackers a path into developer ecosystems:
We’ve seen ChatGPT generate URLs, references, and even code libraries and functions that do not exist. These LLM (large language model) hallucinations have been reported before and may be the result of old training data. If ChatGPT fabricates code libraries (packages), attackers could use these hallucinations to spread malicious packages without using familiar techniques like typosquatting or masquerading. Those techniques are suspicious and already detectable. But if an attacker can create a package to replace the “fake” packages recommended by ChatGPT, they might be able to get a victim to download and use it.
The impact of this issue becomes clear when considering that whereas developers previously searched for coding solutions online (for example, on Stack Overflow), many have now turned to ChatGPT for answers, creating a major opportunity for attackers.
We have identified a new malicious package spreading technique we call “AI package hallucination.” The technique relies on the fact that ChatGPT, and likely other generative AI platforms, sometimes answers questions with hallucinated sources, links, blogs, and statistics.
It will even generate questionable fixes to CVEs and – in this specific case – offer links to coding libraries that don’t exist. Using this technique, an attacker starts by formulating a question asking ChatGPT for a package that will solve a coding problem. ChatGPT then responds with multiple packages, some of which may not exist. Things get dangerous when ChatGPT recommends packages not published in a legitimate repository (e.g., npmjs, Pypi, etc.).
When the attacker finds a recommendation for an unpublished package, they can publish their malicious package in its place. The next time a user asks a similar question, they may receive a recommendation from ChatGPT to use the now-existing malicious package. We recreated this scenario in the proof of concept below using ChatGPT 3.5.
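As a defensive illustration of the scenario described above, the hedged sketch below checks whether chatbot-recommended package names actually resolve on PyPI before anyone runs an install. It assumes PyPI's public JSON metadata endpoint (https://pypi.org/pypi/<name>/json), and the package names are hypothetical stand-ins for whatever an LLM might suggest; this is a sanity check of our own devising, not Vulcan Cyber's tooling.

```python
import urllib.error
import urllib.request

# Hypothetical names an LLM might suggest; replace with the real suggestions.
suggested_packages = ["requests", "totally-made-up-http-helper"]

def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI currently serves metadata for this package name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no such package -- or not yet (see note below)
            return False
        raise

for pkg in suggested_packages:
    status = "exists on PyPI" if exists_on_pypi(pkg) else "NOT found on PyPI"
    print(f"{pkg}: {status}")
```

Note that a "not found" result is not reassurance: as the research describes, an unregistered name is exactly what an attacker could claim later, so an unrecognized recommendation should be treated as suspect rather than simply installed once it appears.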
Steven Zurier covered the Vulcan Cyber research findings and tracked down various industry respondents for some perspective:
Bar Lanyado, Security Researcher, Voyager18 Research Team at Vulcan Cyber: “It’s very concerning how repetitive its answers are and how easily it responds with hallucinations,” said Lanyado. “We should expect to continue to see risks like this associated with generative AI and that similar attack techniques could be used in the wild. It’s just the beginning; generative AI tech is still pretty new. From a research perspective, we’ll likely see many new security findings in the coming months and years. That said, virtually all generative AI providers are working hard to decrease hallucinations and ensure that their products do not create cyber risks, and that’s reassuring.”
Melissa Bischoping, Director of Endpoint Security Research at Tanium, said companies should never download and execute code they don't understand and haven't tested, such as open-source GitHub repos or ChatGPT recommendations. Bischoping said teams should evaluate any code they intend to run for security and keep private copies of it. "Do not import directly from public repositories such as those used in the example attack," said Bischoping. "In this case, attackers are using ChatGPT as a delivery mechanism. However, compromising the supply chain through shared/imported third-party libraries is not novel. Use of this strategy will continue, and the best defense is to employ secure coding practices and thoroughly test and review code intended for use in production environments," she continued. "Don't blindly trust every library or package you find on the internet or in a chat with an AI."
Bud Broomhead, Chief Executive Officer at Viakoo, added that this case is yet another chapter in the arms race between threat actors and defenders. "Ideally, security researchers and software publishers can also leverage generative AI to make software distribution more secure," said Broomhead. The industry is in the early stages of generative AI being used for cyber offense and defense, Broomhead continued, crediting Vulcan Cyber and other organizations with detecting new threats in time to prevent similar exploits. "Remember, only a few months ago, I could ask ChatGPT to create a new piece of malware, and it would," he said. "It takes very specific and directed guidance to create it inadvertently, and hopefully even that approach will be prevented by the AI engines soon."
For the full research analysis from the Vulcan Cyber team, go to this link.
The Vulcan Cyber research suggests: "It can be difficult to tell if a package is malicious if the threat actor effectively obfuscates their work or uses additional techniques such as making a functional trojan package. Given how these actors pull off supply chain attacks by deploying malicious libraries to known repositories, developers need to vet the libraries they use to ensure they are legitimate. This is even more important with suggestions from tools like ChatGPT, which may recommend packages that don't exist or didn't before a threat actor created them. There are multiple ways to do it, including checking the creation date, number of downloads, comments (or a lack of comments and stars), and looking at any of the library's attached notes. If anything looks suspicious, think twice before you install it."
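A rough sketch of how part of that vetting might be scripted against the same PyPI JSON endpoint follows. This is our illustration rather than anything from the Vulcan Cyber write-up: the 90-day threshold is an arbitrary example, download counts are not available from this endpoint and are omitted, and the field names reflect the API as we understand it.

```python
import json
import urllib.request
from datetime import datetime, timezone

def package_report(name: str) -> dict:
    """Summarize basic PyPI metadata that helps flag suspicious packages."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)

    # The earliest upload time across all releases approximates the creation date.
    upload_times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data["releases"].values()
        for f in files
    ]
    first_seen = min(upload_times) if upload_times else None
    age_days = (datetime.now(timezone.utc) - first_seen).days if first_seen else None
    return {
        "name": name,
        "first_seen": first_seen.date().isoformat() if first_seen else "unknown",
        "age_days": age_days,
        "release_count": len(data["releases"]),
        "summary": data["info"].get("summary") or "(no description)",
    }

report = package_report("requests")
print(report)
if report["age_days"] is not None and report["age_days"] < 90:
    print("Warning: very new package -- review it carefully before installing.")
```

Automated checks like this only narrow the field; a clean-looking report does not rule out a functional trojan, so the manual review the researchers describe still applies.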
Trusting ChatGPT for specific software code packages or code repository recommendations requires careful consideration due to several inherent limitations and risks:
Accuracy and Hallucination: As discussed, ChatGPT, like other large language models, can generate plausible-sounding but incorrect or fabricated information, a phenomenon often called "hallucination." In the context of software packages or repositories, it might suggest non-existent libraries, misstate functionalities, or provide incorrect configuration instructions.
Out-of-Date Information: ChatGPT's training data extends only to a fixed cutoff date and does not include real-time updates. This can be particularly problematic in the fast-evolving world of software development, where new versions, patches, and vulnerabilities are continuously emerging.
Security Implications: Using software recommendations without verification can be risky, as they can include deprecated or vulnerable packages. An attacker could potentially exploit commonly recommended but outdated or insecure packages. Furthermore, if attackers know the datasets used for training models like ChatGPT, they could influence or poison these datasets to promote specific malicious packages.
Verification and Due Diligence: Relying solely on AI-generated recommendations is risky for critical tasks, particularly in professional or production environments. Due diligence involves consulting official documentation, trusted community sources, and verified user reviews. This is especially important for security-sensitive applications, where the cost of an error can be high.
Therefore, while ChatGPT can offer general guidance and be a helpful tool for brainstorming or learning, its recommendations for software packages or code repositories should not be the sole source of truth. Always cross-reference with up-to-date, authoritative sources and perform thorough security checks before integrating new software into your projects. This approach helps mitigate risks associated with potential inaccuracies or outdated information.
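One concrete due-diligence step along these lines, sketched here under the assumption that you have already downloaded and reviewed a specific artifact, is to record its cryptographic hash and refuse to proceed if a later copy does not match. The file path and the all-zeros placeholder hash below are hypothetical; pip supports the same idea natively through hash-pinned requirements files installed with --require-hashes.

```python
import hashlib
from pathlib import Path

# Hypothetical artifact and the hash recorded when it was first reviewed.
artifact = Path("downloads/somepackage-1.2.3-py3-none-any.whl")
expected_sha256 = "0" * 64  # placeholder; substitute the hash of the reviewed file

# Recompute the digest and compare before the package goes anywhere near production.
digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
if digest != expected_sha256:
    raise SystemExit(f"Hash mismatch for {artifact.name}; refusing to proceed.")
print(f"{artifact.name}: hash verified.")
```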
For more OODA Loop News Briefs and Original Analysis, see: OODA Loop | Generative AI and OODA Loop | LLM.
The Next Generative AI Surprise: At the OODAcon 2022 conference, we predicted that ChatGPT would take the business world by storm and included an interview with OpenAI Board Member and former Congressman Will Hurd. Today, thousands of businesses are being disrupted or displaced by generative AI. This topic was further examined at length at OODAcon 2023, taking a closer look at this innovation and its impact on business, society, and international politics. The following are insights from an OODAcon 2023 discussion between Pulkit Jaiswal, Co-Founder of NWO.ai, and Bob Flores, former CTO of the CIA.
What Can Your Organization Learn from the Use Cases of Large Language Models in Medicine and Healthcare?: It has become conventional wisdom that biotech and healthcare are the pace cars in implementing AI use cases with innovative business models and value-creation mechanisms. Other industry sectors should keep a close eye on the critical milestones and pitfalls of the biotech/healthcare space, with an eye toward which platform, product, and service innovations and architectures may have a portable value proposition within your industry. The Stanford Institute for Human-Centered AI (HAI) is doing great work fielding research in medicine and healthcare environments with quantifiable results that offer a window into AI as a general applied technology during this vast but shallow early implementation phase of "AI for the enterprise" across all industry sectors. Details here.

The Origins Story and the Future Now of Generative AI: This book explores generative artificial intelligence's fast-moving impacts and exponential capabilities over just one year.
Generative AI – Socio-Technological Risks, Potential Impacts, Market Dynamics, and Cybersecurity Implications: The risks, potential positive and negative impacts, market dynamics, and security implications of generative AI emerged slowly, then rapidly, throughout 2023, as the unprecedented hype cycle around artificial intelligence settled into a more pragmatic stoicism and actual project deployments.
In the Era of Code, Generative AI Represents National Security Risks and Opportunities for "Innovation Power": We are entering the Era of Code. Code that writes code and code that breaks code. Code that talks to us and code that talks for us. Code that predicts and code that decides. Code that rewrites us. Organizations and individuals prioritizing understanding how the Code Era impacts them will develop increasing advantages in the future. At OODAcon 2023, we took a closer look at Generative AI innovation and its impact on business, society, and international politics. IQT and the Special Competitive Studies Project (SCSP) recently weighed in on this Generative AI "spark" of innovation that will "enhance all elements of our innovation power" – and the potential cybersecurity conflagrations that that same spark may also light. Details here.
Corporate Board Accountability for Cyber Risks: With a combination of market forces, regulatory changes, and strategic shifts, corporate boards and directors are now accountable for cyber risks in their firms. See: Corporate Directors and Risk
Geopolitical-Cyber Risk Nexus: The interconnectivity brought by the Internet has caused regional issues that affect global cyberspace. Now, every significant event has cyber implications, making it imperative for leaders to recognize and act upon the symbiosis between geopolitical and cyber risks. See The Cyber Threat
Ransomware’s Rapid Evolution: Ransomware technology and its associated criminal business models have seen significant advancements. This has culminated in a heightened threat level, resembling a pandemic’s reach and impact. Yet, there are strategies available for threat mitigation. See: Ransomware, and update.
Challenges in Cyber “Net Assessment”: While leaders have long tried to gauge both cyber risk and security, actionable metrics remain elusive. Current metrics mainly determine if a system can be compromised without guaranteeing its invulnerability. It’s imperative not just to develop action plans against risks but to contextualize the state of cybersecurity concerning cyber threats. Despite its importance, achieving a reliable net assessment is increasingly challenging due to the pervasive nature of modern technology. See: Cyber Threat
Decision Intelligence for Optimal Choices: Numerous disruptions complicate situational awareness and can inhibit effective decision-making. Every enterprise should evaluate its data collection methods, assessment, and decision-making processes for more insights: Decision Intelligence.
Proactive Mitigation of Cyber Threats: The relentless nature of cyber adversaries, whether they are criminals or nation-states, necessitates proactive measures. It’s crucial to remember that cybersecurity isn’t solely the IT department’s or the CISO’s responsibility – it’s a collective effort involving the entire leadership. Relying solely on governmental actions isn’t advised given its inconsistent approach towards aiding industries in risk reduction. See: Cyber Defenses
The Necessity of Continuous Vigilance in Cybersecurity: The consistent warnings from the FBI and CISA concerning cybersecurity signal potential large-scale threats. Cybersecurity demands 24/7 attention, even on holidays. Ensuring team endurance and preventing burnout by allocating rest periods are imperative. See: Continuous Vigilance
Embracing Corporate Intelligence and Scenario Planning in an Uncertain Age: Apart from traditional competitive challenges, businesses also confront unpredictable external threats. This environment amplifies the significance of Scenario Planning. It enables leaders to envision varied futures, thereby identifying potential risks and opportunities. Regardless of their size, all organizations should allocate time to refine their understanding of the current risk landscape and adapt their strategies. See: Scenario Planning