AI governance challenges part 3: Proliferation

Bengüsu Özcan & Eva Behrens | October 2024

This article is part three of a series on key regulatory challenges of advanced AI systems. The first and second articles focused on unexpected capabilities and deployment safety.

Over the past year, interest in governing highly capable general-purpose AI has surged both nationally and internationally. One challenge decision-makers face in the governance and regulation of advanced AI is the risk of proliferation. We define proliferation in this context as the rapid spread of advanced AI models with cutting-edge capabilities, whether through free public access to these models or through their leakage or theft, leading to both intentional and unintentional misuse.

Risks from unauthorized access to software are not unique to AI, as threats such as 3D-printed firearms show. However, the scale and impact of the potential dangers of advanced AI can be far greater than those of other technologies. Future advanced AI models could enable the development of bioweapons or automated cyberattacks that pose large-scale risks. Unlike more conventional technologies, AI capabilities advance very rapidly or emerge only after models have been deployed to the public, potentially catching regulators and developers off guard. In this blog post, we focus on proliferation through open-source models and model theft, as the governance challenges they raise, and the potential measures to address them, are similar.

What does 'open source' mean in the context of AI?

Open source generally refers to software with a public codebase, allowing anyone to inspect, modify, and distribute the code. In the context of AI, the term “open source” is often used more broadly to describe models for which only some components are public, such as the weights. For example, Llama 3, the LLM introduced by Meta as an open-source model, provides access to its weights but not to its entire codebase or training data. A model’s weights are the result of a costly and lengthy training process and are crucial for generating outputs. Open-weight models are therefore considerably easier to modify than models that share other components but withhold the weights. However, since a dedicated actor with sufficient resources can still exploit any available component, our arguments in this blog post apply to all levels of open sourcing.

Open-source advanced AI models have their architecture, code, weights and data, or a combination of these, freely and publicly accessible. While beneficial for scientific research and product innovation, open sourcing also makes it easier to embed models’ capabilities into downstream applications and to modify models, potentially for malicious purposes, without the long and costly model training process.
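To illustrate how low this barrier is, the minimal sketch below (our own illustration, assuming the Hugging Face transformers library and an illustrative open-weight checkpoint ID) shows that a handful of lines of Python is enough to pull openly published weights into a downstream application or a fine-tuning pipeline:

```python
# Minimal sketch (illustrative, not from the original article): loading openly
# published model weights with the Hugging Face `transformers` library.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # example open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Once the weights are downloaded, generating text, or fine-tuning the model
# with a standard training loop, requires no access to the original training
# data or training infrastructure.
inputs = tokenizer("Open-weight models can be embedded in any application:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```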

Even when models are open sourced with certain safety guardrails or restrictions, these can be bypassed or violated, as researchers have shown, particularly in relation to Meta’s LLaMA language model. Sophisticated misinformation, manipulation and scamming tools are already widely available. Even though no major incident has occurred yet, experts expect future models to be capable of assisting in the development of bioweapons or the design of sophisticated cyberattacks, posing significant misuse risks if such models are widely accessible.

As advanced AI models become more strategically important, they also become more attractive targets for intellectual property theft and leaks. The labs developing advanced AI, such as OpenAI and Anthropic, recognize this risk and publicly commit to rigorous security measures; however, it is unclear whether these commitments are genuine or whether the measures would be sufficient.

The challenge

Once publicly accessible, advanced AI capabilities can proliferate, i.e. spread rapidly, within time frames ranging from days to a few weeks or months, as the examples in Table 1 show. This is very likely to catch governance bodies off guard, making it difficult for them to address, or even detect, these issues in a timely manner.

The challenge is even greater when the risk is international. For instance, if a series of sophisticated AI-assisted cyberattacks were to target public infrastructure across multiple countries, those countries may not have sufficient time to collaborate and pool resources to combat the attacks, given that international coordination can be slow.

Proliferation case | Time to proliferate
StyleGAN, NVIDIA’s realistic image generation model, was open sourced in 2019. Images generated through this model went viral through sites such as thispersondoesnotexist.com, and fake social media accounts using such pictures were discovered later that year. | Days
Meta AI allowed researchers to apply for the model weights of LLaMA, their LLM launched in February 2023. Within a week, various users had posted these weights on multiple websites, violating the terms under which the weights were distributed. | 1 week
In March 2023, Stanford researchers created a low-cost AI model called Alpaca by fine-tuning Meta’s LLaMA model with text completion data from OpenAI, spending under $600. Although they took the model offline due to safety concerns, the instructions for recreating it remain available on GitHub. | 3 months

Table 1: Past incidents show that proliferation can happen very quickly, requiring proactive governance and preventative measures to address such incidents in a timely manner. This table is simplified and rephrased from Table 2 of Frontier AI Regulation: Managing Emerging Risks to Public Safety.

Another governance challenge with proliferation is that it is irreversible, i.e. once a model is out there, it is likely out there forever. Given that software can easily be downloaded, copied, stored offline on multiple devices, and shared widely and rapidly, tracing the spread of a model and its components may be practically impossible once it has been shared, stolen or leaked.

The Solution: Anticipatory Governance

Given the rapid nature of AI proliferation, reactive governance measures are likely to be ineffective. Anticipatory measures, on the other hand, could address the proliferation challenges by preventing the scenarios that raise the risks in the first place.

A key governance intervention to address proliferation through open sourcing could be to restrict it for such models, or to introduce a staged open-sourcing regime. To avoid stifling overall AI innovation and research, such a measure should explicitly target high-risk models. Currently, high-risk general-purpose models require significantly more computing resources to train than other models, which makes training compute a common benchmark in existing governance frameworks for advanced AI, such as the EU AI Act and the U.S. Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. However, similar capabilities might be achieved with less compute over time, so such regulations will require frequent expert updates to keep pace with AI developments.
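As a rough illustration of how such a compute benchmark is applied, the sketch below compares an estimated training-compute figure against the EU AI Act’s 10^25 FLOP presumption of systemic risk and the Executive Order’s 10^26 FLOP reporting threshold; the 6 × parameters × tokens rule of thumb and the example model size are illustrative assumptions, not figures from this article.

```python
# Minimal sketch (illustrative): checking an estimated training-compute figure
# against the compute thresholds used as proxies for high-risk models.
EU_AI_ACT_THRESHOLD_FLOP = 1e25  # EU AI Act: presumption of systemic risk
US_EO_THRESHOLD_FLOP = 1e26      # US Executive Order: reporting requirement


def estimated_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Common rule of thumb: roughly 6 FLOP per parameter per training token."""
    return 6 * n_parameters * n_tokens


# Hypothetical model: 400B parameters trained on 15T tokens.
flop = estimated_training_flop(400e9, 15e12)
print(f"Estimated training compute: {flop:.2e} FLOP")
print("Above EU AI Act threshold:", flop > EU_AI_ACT_THRESHOLD_FLOP)  # True
print("Above US EO threshold:", flop > US_EO_THRESHOLD_FLOP)          # False
```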

Another intervention, addressing proliferation through model theft and leaks, could be to enforce appropriate security standards for advanced AI developers based on a risk classification and to audit their compliance, analogous to international safety standards for nuclear power or national cybersecurity standards. Existing security standards could provide a foundational framework; however, they would need to be tailored to advanced AI developers to protect the strategically important components of AI models, in particular the model weights.

An open question for implementing anticipatory governance for the security of advanced AI models is whether these measures should be set at the national or the international level. Currently, the framing of advanced AI security is heavily influenced by national security concerns and the AI competition between the US and China. Recent decisions and statements, such as OpenAI appointing a former US Army general to its board and Anthropic treating national security as a priority in its red-teaming efforts, suggest that leading AI companies are increasingly attentive to national security issues. Yet even if a model theft occurs as part of a targeted attack by one state, the impacts may still be felt at the global scale if the model is leaked widely after the incident. Therefore, even though national interventions could increase the overall security of advanced AI, adherence to an internationally recognized security standard for advanced AI would be ideal.

Conclusion

Proliferation of AI capabilities can catch AI governance off guard, necessitating proactive measures that address the root causes of the issue. Anticipatory governance measures to address these risks might include limiting the open sourcing of models that are identified as high-risk and imposing stringent security standards on AI developers. Given that the consequences of AI proliferation can be global once models are widely shared, these measures should ideally be built through international consensus and cooperation, and apply internationally.
