The Center for Security and Emerging Technology (CSET), in its July 2021 policy brief “AI Accidents: An Emerging Threat – What Could Happen and What to Do,” makes a noteworthy contribution to current efforts by governmental entities, industry, AI think tanks, and academia to “name and frame” the critical issues surrounding the probability and impact of AI risk. For the current enterprise, as we pointed out as early as 2019 in Securing AI – Four Areas to Focus on Right Now, the fact remains that “having a robust AI security strategy is a precursor that positions the enterprise to address these critical AI issues.” In addition, enterprises that have adopted and deployed AI systems need to commit to the systematic logging and analysis of AI-related accidents and incidents.
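As a concrete starting point, the sketch below shows what a minimal internal AI incident log could look like, assuming a simple append-only JSONL store; the field names and severity labels are our own illustrative assumptions, loosely inspired by public incident-database taxonomies rather than any official schema.

```python
# A minimal sketch of an internal AI incident log. The schema is illustrative,
# not an official taxonomy; adapt the fields to your own reporting process.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AIIncident:
    system_name: str      # the AI system or model involved
    failure_type: str     # e.g., "robustness", "specification", "assurance"
    severity: str         # e.g., "near-miss", "incident", "accident"
    description: str      # free-text account of what happened
    detected_at: str = "" # ISO-8601 timestamp, filled in when logged

def log_incident(incident: AIIncident, path: str = "ai_incidents.jsonl") -> None:
    """Append one incident record to a local JSONL file."""
    incident.detected_at = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(incident)) + "\n")

# Hypothetical example record
log_incident(AIIncident(
    system_name="loan-approval-model-v3",
    failure_type="robustness",
    severity="incident",
    description="Approval rate dropped sharply after an upstream data format change.",
))
```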
Look no further than this CSET policy brief as a blueprint for such AI accident notification, reporting, and analysis efforts. To foster open collaboration within your organization, the opportunity also exists to build your reliability and safety engineering activities around the open innovation and crowdsourcing database (with a very compelling open-source taxonomy architecture) on which CSET authors Zachary Arnold and Helen Toner based their AI accidents framework and policy recommendations: the Artificial Intelligence Incident Database (http://incidentdatabase.ai), a project housed at the Partnership on AI.
“An accident…involves damage to a defined system that disrupts the ongoing future output of that system. Not all such disruptions should be classified as accidents; the damage must be reasonably substantial.” Incidents are at the part or unit level; Accidents are at the subsystem or system level.
– Normal Accidents: Living with High-Risk Technologies by Charles Perrow
Based on their research, CSET’s Arnold and Toner provide three classifications of AI accidents: robustness failures, specification failures, and assurance failures:
“Failures of robustness – the system receives abnormal or unexpected inputs that cause it to malfunction.”
Reliability engineering usually accounts for this type of assessment in high-risk industrial or automated systems, but new areas of research within systems engineering will need to emerge for AI accidents caused by robustness failures. As the CSET authors point out, distributional shift (a change in the type of data the system receives relative to the data on which it was trained, a particular vulnerability for reinforcement learning and other machine learning approaches) is a high-probability risk for AI systems. What is difficult to determine for deployed AI systems is the impact of such high-probability, data-input-related accidents.
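To make the concept concrete, here is a minimal sketch of distributional-shift monitoring: it compares a single feature’s training-time distribution against recent production inputs with a two-sample Kolmogorov-Smirnov test. The synthetic data, feature choice, and alert threshold are illustrative assumptions on our part, not recommendations from the brief.

```python
# A minimal sketch of distributional-shift monitoring for one input feature.
# In practice this check would run per feature, on real training and
# production data, with thresholds tuned to the system's risk tolerance.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)    # what the model saw in training
production_feature = rng.normal(loc=0.8, scale=1.3, size=1_000)  # what it sees in deployment

result = ks_2samp(training_feature, production_feature)
if result.pvalue < 0.01:
    print(f"Possible distributional shift (KS={result.statistic:.3f}, p={result.pvalue:.1e})")
else:
    print("No significant shift detected in this feature.")
```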
Robustness failures are the types of AI accidents reported most often by mainstream media, such as implicit racial bias in facial recognition systems or medical misdiagnosis by AI-based diagnostics in healthcare systems. A classic false positive occurred in 1983, when a Soviet early-warning system falsely reported five American ICBMs heading toward the Soviet Union. Human intervention on the Soviet side resolved the incident, which was only made public in 1998. The future of the mobility ecosystem is another instructive example, with several autonomous vehicle crashes attributed to issues with the vehicles’ AI software.
Source: Artificial Intelligence Incident Database, https://incidentdatabase.ai/cite/74
Data inputs can also be adversarial in nature: systems can be tricked, in pursuit of offensive cyber capabilities, through distributional shift or other manipulations of deep learning models. Cybersecurity strategies to protect AI systems need to take these types of AI accidents into account, as even small perturbations of the data fed into sensitive AI systems can alter the results generated by machine learning algorithms. The risk is that robustness failures in high-risk, AI-based societal systems, or their component subsystems, will have catastrophic human, economic, or geopolitical impacts.
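As a toy illustration of how little it can take to move a model’s output, the sketch below perturbs the input to a hand-built logistic classifier in the direction of its weights, an FGSM-style step toward the decision boundary. The model and numbers are invented for illustration and bear no relation to any real deployed system; real adversarial attacks on deep learning models are far more sophisticated.

```python
# A toy illustration of an adversarial perturbation flipping a prediction.
import numpy as np

w = np.array([1.5, -2.0, 0.5])  # weights of an assumed, already-trained linear classifier
b = 0.1

def predict(x: np.ndarray) -> float:
    """Return P(class=1) from a simple logistic model."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

x = np.array([0.2, 0.3, -0.1])       # a legitimate input (prediction below 0.5)
epsilon = 0.25                       # small perturbation budget
x_adv = x + epsilon * np.sign(w)     # FGSM-style step that pushes the logit upward

print(f"original prediction:  {predict(x):.3f}")      # ~0.44 -> class 0
print(f"perturbed prediction: {predict(x_adv):.3f}")  # ~0.68 -> class 1
print(f"max input change:     {np.max(np.abs(x_adv - x)):.3f}")
```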
“Failures of specification – the system is trying to achieve something subtly different from what the designer or operator intended, leading to unexpected behaviors or side effects.”
Some of the most worrisome AI accidents and incidents fall under this category and have already had severe impacts in the real world: disinformation, political unrest, and harmful content amplified by recommendation and personalization algorithms on social media platforms such as Facebook, YouTube, and Twitter. According to the policy brief, “left to their own devices, algorithms built into popular social media platforms have unexpectedly boosted disturbing and harmful content, contributing to violence and other serious harm in the real world.” Broad machine learning objectives, such as “maximizing engagement” on social media platforms, are the usual culprit in these AI accidents. “Reward hacking” (think of Joshua in WarGames continuing to march toward global thermonuclear war) is highlighted as a specification failure as well: the system meets its objective as specified while entirely missing the intended goal of the system.
Source: Artificial Intelligence Incident Database, https://incidentdatabase.ai/cite/1.
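The dynamic is easy to caricature in code. In the toy sketch below, a recommender told only to “maximize engagement” selects whatever maximizes its proxy reward (clicks), because harm never appears in the objective. All content items and numbers are invented for illustration.

```python
# A toy sketch of a specification failure: the proxy reward ("engagement")
# contains no term for harm, so the most harmful item can also be the
# "optimal" recommendation. All values are invented.
content_pool = [
    {"title": "local news update",       "click_rate": 0.04, "harm_score": 0.0},
    {"title": "cute animal video",       "click_rate": 0.12, "harm_score": 0.0},
    {"title": "outrage-bait conspiracy", "click_rate": 0.31, "harm_score": 0.9},
]

def proxy_reward(item: dict) -> float:
    """The specified objective: engagement only. Harm never enters the reward."""
    return item["click_rate"]

recommended = max(content_pool, key=proxy_reward)
print("recommended:", recommended["title"])                    # outrage-bait conspiracy
print("harm score the objective never saw:", recommended["harm_score"])
```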
Identification and testing are discussed at various points in the CSET policy brief. It is hard to imagine the scale and scope of the testbed capacity that industry and government would have to create for this challenge, although the National Institute of Standards and Technology (NIST) has at least begun outreach for an effort that may become testbed capacity at the federal level.
CSET illustrates the potential of specification failures by way of a brilliant future scenario, telling the story of a “microelectronic meltdown,” in which a semiconductor factory’s optimization software interprets the broad instruction “improve energy efficiency.” Calamities and other negative outcomes that would be instantly clear from the ‘common sense’ perspective of a human operator are value neutral to the AI system.
The most logical path to energy efficiency, as the software reasons, is to overheat the six lithography machines on the production line (new $100m machines just installed in the factory). As far as the value-neutral decision-making of the software is concerned, “if the lithography machines were destroyed by overheating, they would not produce any chips to package, and the packaging line would never start up—eliminating any chance of an unplanned outage” – thus improving energy efficiency. The optimization software found a pathway within an AI component system that was not visible to human operators (see failures of assurance below) and overheated the lithography machines: clever, value-neutral reasoning by the software in pursuit of a broadly specified objective, with a disastrous operational result for the organization.
“Failures of assurance – the system cannot be adequately monitored or controlled during operation.”
Traditional validation by way of testing or analysis does not transfer cleanly to AI, given the billions of calculations that can sit behind every AI function. Validation can be approximated through a small manual sampling of decisions, but complete, 100% assurance is far more complicated, if not impossible.
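A back-of-the-envelope sketch shows why sampling offers only partial assurance: under the standard “rule of three,” if reviewers inspect n randomly sampled decisions and find zero failures, the true failure rate could still be roughly as high as 3/n at 95% confidence. The sample sizes below are illustrative.

```python
# Why manual sampling gives only partial assurance: the "rule of three"
# approximates the 95% upper bound on the failure rate when a random
# sample of n decisions shows zero failures.
def rule_of_three_upper_bound(n_reviewed: int) -> float:
    """Approximate 95% upper bound on the true failure rate given 0 observed failures."""
    return 3.0 / n_reviewed

for n in (100, 1_000, 10_000):
    bound = rule_of_three_upper_bound(n)
    print(f"reviewed {n:>6} decisions, 0 failures -> failure rate could still be ~{bound:.2%}")
```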
An area of AI research known as the ‘black box’ problem is classified as an assurance failure, with the potential to cause severe AI accidents or, at the very least, to remove any ability to troubleshoot an AI incident or intervene to stop it from becoming an accident. The black box problem refers to the inability of human operators to interpret or explain the reasoning and logic of a machine learning model. AI interpretability and explainability are of serious concern to AI researchers, prompting some to call for systems development based solely on “inherently interpretable” AI frameworks. The argument is that reverse-engineering assurance failures caused by black box models is more difficult than developing fully interpretable models in the first place: “The way forward is to design models that are inherently interpretable.”
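For a sense of what “inherently interpretable” can mean in practice, the minimal sketch below trains a shallow decision tree whose entire decision logic can be printed and audited, in contrast to a black box model whose reasoning operators cannot inspect. The dataset and tree depth are illustrative choices, not proposals from the brief.

```python
# A minimal sketch of an inherently interpretable model: a shallow decision
# tree whose full rule set can be printed and reviewed by a human operator.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(data.data, data.target)

# Every decision the model will ever make can be read directly from these rules.
print(export_text(model, feature_names=list(data.feature_names)))
```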
Quite simply, no human-in-the-loop design can keep pace with the speed, complexity, and computational power of the AI systems on the horizon. In future systems, reinforcement learning will be resistant to notions of ‘right’ and ‘wrong’ human intervention based on human logic, human confirmation of a problem, or plain common sense. Monitoring and intervention mechanisms remain a fledgling area of AI research, and CSET points out that user interface and user experience failures contributed to the operational disasters involving the Navy destroyer USS John S. McCain and the Boeing 737 Max. Some would argue that machine learning is, by its very nature, a ‘human out of the loop’ design methodology.
For more, see: AI Accidents: An Emerging Threat – What Could Happen and What to Do
OODA Loop – Securing AI – Four Areas to Focus on Right Now
OODA Loop – Pentagon Experiments with Self-Driving Shuttles at San Diego Military Base
OODA Loop – NIST Prioritizes External Input in Development of AI Risk Management Framework
Special Series on Artificial Intelligence
AI, machine learning, and data science will be used to create some of the most compelling technological advancements of the next decade. The OODA team will continue to expand our reporting on AI issues. Please check out the AI reports listed below.
A Decision-Maker’s Guide to Artificial Intelligence – This plain English overview will give you the insights you need to drive corporate decisions
When Artificial Intelligence Goes Wrong – By studying issues we can help mitigate them
Artificial Intelligence for Business Advantage – The reason we use AI in business is to accomplish goals. Here are best practices for doing just that
The Future of AI Policy is Largely Unwritten – Congressman Will Hurd provides insight on the emerging technologies of AI and Machine Learning.
Artificial Intelligence Sensemaking: Bringing together our special reports, daily AI news and references to AI from the most reliable sources we know.