There's a lot of confusion and unknowns regarding AI application risks, and a lot of vendors are trying to offer solutions to problems that aren't clearly defined. In this blog we explain why a smart approach is to start by focusing on basic, foundational cyber hygiene, adopt well-established best practices and enforce common-sense usage policies.
Introduction
With all of the excitement around large language models (LLMs) and artificial intelligence (AI) in general, there has been an inevitable rush to build the next great “powered by LLM” applications. As is frequently the case, that rush has been followed by a lot of questions about the related risks and about how to secure these applications. Some aspects of securing LLM-powered applications are novel and present serious complexities that require consideration. Meanwhile, other aspects are not that different from the risks that traditional vulnerability management programs have been dealing with all along.
OWASP provides a Top 10 list which identifies the following risks for LLM applications:
- Prompt Injection
- Insecure Output Handling
- Training Data Poisoning
- Model DoS
- Supply Chain Vulnerabilities
- Sensitive Information Disclosure
- Insecure Plugin Design
- Excessive Agency
- Overreliance
- Model Theft
Novel challenges
Risks like prompt injection, sensitive-information disclosure, and training-data poisoning are not in and of themselves new. However, the practicality of exploitation and the challenges of securing against those risks are still an area of extensive research. We’ve seen reports of prompt-injection attacks on language models where an attacker convinces the model to ignore its instructions and disclose information that it shouldn’t have disclosed. These types of attacks are technically achievable but because of the nature of LLMs, the practicality of these attacks is uncertain. Unlike a SQL injection attack where you can develop some degree of control of the data you get back, an LLM prompt-injection attack is always going to be subject to the behavior of the underlying model.
Securing an LLM application against prompt injection similarly presents unique challenges. With traditional command injection / SQL injection attacks, we have a high degree of control of the input and can significantly constrain what a user inputs, as well as filter any unwanted input in pre-processing. With LLMs, the ability to have more open-ended questions and responses, and potentially follow-up questions, means that any pre-filtering or processing risks reducing the functionality of the feature. This inevitably becomes a delicate balancing act between security and functionality. We will likely see further developments in both improved attacks against prompts as well as better tooling to protect those same prompts.
Similar challenges for both the attacker and defender also exist in output handling and training-data poisoning. These risks should not be ignored and teams should be as defensive as possible, given the degree of uncertainty and the complex challenges. However, it can be easy to get bogged down in trying to address these risks while overlooking equally dangerous vulnerabilities that are also less complex to exploit and more familiar to defenders.
What to do - start with the basics
Two core risks associated with AI and LLM applications are supply-chain vulnerabilities and the inherent privacy risks. Both of these should be top of mind for applications developed in-house and for third-party applications, such as those from software vendors and from open-source projects. . Fortunately these risks are much better understood and can be incorporated into existing vulnerability management programs without significant complexity.
Looking at some of the major libraries that support AI and LLM applications, such as Transformers, NLTK, Tensorflow, and Langchain, we can see that many of them have at least some high and critical severity vulnerabilities that have been disclosed. When developing applications, it is critical that teams have visibility on the libraries being used as well as on the vulnerabilities in those libraries. We’ve recently seen vulnerability databases such as https://avidml.org/database/ as well as bug bounty programs such as https://huntr.com/ that are focused on AI-specific vulnerabilities. These can be valuable resources for teams that want to focus specifically on ensuring they mitigate any vulnerabilities in those libraries.
Privacy risks can be mitigated to some degree by establishing strong policies on the language models that engineering teams are permitted to use when building in-house applications and monitoring to ensure that only those language models are used. This is particularly important for any language models that are hosted by a third party. Additionally, it’s vital to have an inventory of all applications that employees are using which have LLM-powered features, as these can become uncontrollable channels for data leakage. A search for the term “AI” in the Firefox browser-extension catalog returns over 2,000 results , while a search for the term “LLM” returns over 300 results. Many employees may choose to install these types of extensions without fully understanding the security and privacy implications, nor how any data they input may end up being passed to a third party or added to an LLM training dataset. Organizations should establish approved applications and browser extensions, as well as establish policies prohibiting the use of any extensions that have not been evaluated and approved for usage.
Conclusion
LLMs and AI in general have created a lot of excitement across organizations globally but have also left many security teams with unanswered questions about how they can manage the risks. We will inevitably see a rush of new products and features aimed at securing models and prompts, and at blocking any malicious activity. Given the uncertainty around both the reliability and practicality of attacks on LLMs and the ability of tools to block those attacks without negating the value of LLMs, it may be best to watch that space before rushing in. In the meantime, security teams can take concrete, well-understood and well-defined steps to reduce their risk and secure the attack surface. They can also establish strong corporate policies on the usage of LLMs and LLM-powered applications, and conduct monitor to ensure those policies are followed.