Large language models: 6 pitfalls to avoid

From security and privacy concerns to misinformation and bias, large language models bring risk along with rewards.

There has recently been incredible progress in artificial intelligence (AI), due mainly to advances in developing large language models (LLMs). These are the beating heart of text and code generation tools such as ChatGPT, Bard, and GitHub’s Copilot.

These models are on course for adoption across all sectors. But serious concerns remain about how they are created and used—and how they can be abused. Some countries have decided to take a radical approach and temporarily ban specific LLMs until proper regulations are in place.

Let’s look at some real-world adverse implications of LLM-based tools and some strategies to mitigate them.

1. Malicious content

LLMs can improve productivity in many ways. Their ability to interpret our requests and solve fairly complex problems means we can offload mundane, time-consuming tasks to our favorite chatbot and simply sanity-check the results.

But of course, with great power comes great responsibility. While LLMs can create helpful material and speed up software development, they can also enable rapid access to harmful information, accelerate the workflow of the bad guys, and even generate malicious content such as phishing emails and malware. The term “script kiddie” takes on a whole new meaning when the barrier to entry is as low as writing a well-constructed chatbot prompt.

While there are ways to restrict access to objectively dangerous content, they’re not always feasible or effective. In terms of hosted services like chatbots, content filtering can help to at least slow down an inexperienced user. Implementing strong content filters should be imperative, but they’re not bulletproof.

2. Prompt injection

Specially crafted prompts can coerce an LLM into ignoring content filters and producing illicit output. This issue affects all LLMs, and its impact grows as these models connect with the outside world, for example through plugins for ChatGPT. Such integrations could enable chatbots to ‘eval’ user-generated code, which can lead to arbitrary code execution. From a security perspective, equipping a chatbot with such functionality is very problematic.

To help mitigate this, it’s important that you understand the capabilities of your LLM-based solution and how it interacts with external endpoints. Determine if it has been connected to an API, is running a social media account, or is interacting with your customers without supervision, and evaluate your threat model accordingly.

While prompt injection may have appeared inconsequential in the past, these attacks can have very real consequences now as they begin to execute generated code, integrate into external APIs, and even read your browser tabs.
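
As a rough illustration of limiting what an injected prompt can trigger, the sketch below treats model output as untrusted data: it is parsed as JSON and dispatched only to an allow-list of functions, never passed to eval(). The tool names (get_weather, get_time), the JSON request shape, and the dispatch helper are all assumptions made up for this example, not part of any real chatbot or plugin API.

```python
import json

# Hypothetical tools the chatbot may call; names are illustrative only.
def get_weather(city: str) -> str:
    return f"Weather lookup requested for {city}"

def get_time(timezone: str) -> str:
    return f"Time lookup requested for {timezone}"

ALLOWED_TOOLS = {"get_weather": get_weather, "get_time": get_time}

def dispatch(model_output: str) -> str:
    """Run a tool call requested by the model only if it is allow-listed.

    The model output is untrusted: it is parsed as JSON and is never
    handed to eval() or exec().
    """
    try:
        request = json.loads(model_output)
    except json.JSONDecodeError:
        return "rejected: output is not valid JSON"
    if not isinstance(request, dict):
        return "rejected: expected a JSON object"

    name = request.get("tool")
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        return "rejected: tool is not on the allow-list"

    args = request.get("args", {})
    if not isinstance(args, dict) or not all(isinstance(v, str) for v in args.values()):
        return "rejected: arguments must be a flat object of strings"

    try:
        return ALLOWED_TOOLS[name](**args)
    except TypeError:
        # Raised when the argument names do not match the tool's signature.
        return "rejected: unexpected arguments"

if __name__ == "__main__":
    print(dispatch('{"tool": "get_weather", "args": {"city": "Dublin"}}'))
    print(dispatch('__import__("os").system("id")'))
```

An allow-list does not stop prompt injection itself. It only narrows what a successful injection can reach, so each permitted tool still needs its own review against the threat model.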

3. Data privacy/Copyright violation

Training large language models requires tremendous amounts of data, and some models now exceed half a trillion parameters. At this scale, understanding provenance, authorship, and copyright status is a gargantuan, if not impossible, task. An unvetted training set can result in a model that leaks private data, misattributes citations, or plagiarizes copyrighted content.

Data privacy laws around the usage of LLMs are also very murky. As we’ve learned with social media, if something is free, chances are that users are the product. It’s worth remembering that if we ask a chatbot to find the bug in our code or to write a sensitive document, we’re sending that data to a third party, who may end up using it for model training, advertising, or competitive advantage. Data leakage through AI prompts can be especially damaging in business settings.

As LLM-based services become integrated with workplace productivity tools such as Slack and Teams, it’s crucial to carefully read providers’ privacy policies, understand how the AI prompts can be used, and regulate the use of LLMs at the workplace accordingly. Regarding copyright protections, we need to regulate data acquisition and use through opt-ins or special licensing without hindering the open and largely free internet as we have it today.
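
One practical way to reduce leakage through prompts is to scrub obvious secrets before any text leaves the organization. The sketch below is a minimal example of that idea, assuming a simple regex-based approach. The patterns and the redact helper are hypothetical and far from exhaustive, so they should be read as a starting point rather than a complete data loss prevention control.

```python
import re

# Illustrative patterns only; a real deployment would need a much broader set.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(
    r"\b(api[_-]?key|token|password)\b(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)

def redact(prompt: str) -> str:
    """Mask secret-looking assignments and email addresses in a prompt."""
    # Keep the key name and separator, replace only the value.
    prompt = SECRET.sub(
        lambda m: f"{m.group(1)}{m.group(2)}[REDACTED_SECRET]", prompt
    )
    prompt = EMAIL.sub("[REDACTED_EMAIL]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Why does login fail? password = hunter2 and the owner is bob@example.com"
    print(redact(raw))
```

Pattern matching of this kind catches only well-formed secrets. It complements, and does not replace, reading the provider's privacy policy and setting workplace rules on what may be pasted into a chatbot.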

4. Misinformation

While they can convincingly feign intelligence, LLMs don’t really “understand” what they produce. Instead, their currency is the probabilistic relationships between words. They can’t distinguish between fact and fiction—some output might appear very believable but turn out to be a confidently worded untruth. An example of this is ChatGPT falsifying citations and even entire papers, as one Twitter user recently discovered first-hand.

The output of LLM tools should always be taken with a pinch of salt. These tools can prove incredibly useful in a vast array of tasks, but human beings must be involved in verifying the accuracy, benefits, and overall sanity of their responses. Otherwise, we are in for some disappointment.

5. Harmful advice

When chatting online, it’s becoming increasingly difficult to tell if you’re speaking to a human or a machine, and some entities might be tempted to take advantage of this. Earlier this year, for example, a mental health tech company admitted that some of its users who sought online counseling were unknowingly interacting with a GPT3-based bot instead of a human volunteer. This raised ethical concerns about using LLMs in mental healthcare and any other setting that relies on interpreting human emotions.

Currently, there’s little to no regulatory oversight in place to ensure that companies cannot utilize AI in this manner with or without the end user’s explicit consent. Moreover, adversaries can use convincing AI bots in espionage, scams, and other illegal activities.

AI doesn’t have emotions, but its responses can hurt people’s feelings or even lead to more tragic consequences. It’s irresponsible to assume that an AI solution can adequately interpret and respond to the emotional needs of a person responsibly and safely.

The use of LLMs in healthcare and other sensitive applications should be closely regulated to prevent any risk of harm to users. Providers of LLM-based services should always inform users of the scope of AI’s contribution to the service, and interacting with a bot should always be a choice rather than the default.

6. Bias

AI solutions are only as good as the data they’re trained on. This data often reflects our human biases toward political parties, ethnic groups, genders, or other demographics. A biased model can make unfair decisions that harm the affected groups, and the bias itself can be both subtle and difficult to address. Models trained on unvetted data from the Internet will always mirror human biases; models that constantly learn from user interaction are also prone to intentional manipulation.

To mitigate the risk of discrimination, LLM service providers must carefully evaluate their training datasets for any imbalances that may result in negative consequences. Machine learning models should also be periodically checked to ensure the predictions remain fair and accurate.
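
A periodic fairness check can start very simply. The sketch below computes the rate of positive predictions per group and the gap between the highest and lowest rate, a measure often called the demographic parity difference. The article does not name a specific metric, so this choice, the function names, and the toy data are all assumptions for illustration.

```python
from collections import defaultdict

def positive_rate_by_group(groups, predictions):
    """Return the share of positive (== 1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, prediction in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(prediction == 1)
    return {group: positives[group] / totals[group] for group in totals}

def demographic_parity_gap(groups, predictions):
    """Return the difference between the highest and lowest positive rate."""
    rates = positive_rate_by_group(groups, predictions)
    return max(rates.values()) - min(rates.values())

if __name__ == "__main__":
    # Toy data: group labels and binary model decisions (hypothetical).
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    predictions = [1, 1, 1, 0, 1, 0, 0, 0]
    print(positive_rate_by_group(groups, predictions))
    print(demographic_parity_gap(groups, predictions))
```

A gap near zero means the groups receive positive decisions at similar rates. A large gap is a prompt for investigation rather than proof of discrimination, since a single metric cannot capture every form of bias.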

Large language models are completely redefining how we interact with software, bringing countless improvements to our workflows. However, with the current lack of meaningful regulations around AI and the scarcity of security measures aimed at machine learning models, the widespread and hasty implementation of LLMs will likely have significant downsides. It’s therefore imperative to swiftly regulate and secure this precious technology.

Eoin Wickens
Eoin Wickens is a Senior Researcher at HiddenLayer, where he researches security for artificial intelligence and machine learning. He has previously worked in threat research, threat intelligence & malware reverse engineering and has been published over a dozen times, including co-authoring a book on defense against Cobalt Strike.
Marta Janus
Marta Janus is a Principal Researcher at HiddenLayer, where she focuses on investigating adversarial machine learning attacks and the overall security of AI-based solutions. Before joining HiddenLayer, Marta spent over a decade working as a researcher for leading anti-virus vendors.
