Guard rails for LLMs - Anima Blog

Guard rails for LLMs


LLMs have made a profound leap over the last few years. With each iteration, companies like OpenAI, Meta, Anthropic and Mistral have been leapfrogging one another in general usability and, more recently, in the ability of their models to produce useful code.

 
However, because they are trained on a wide variety of coding techniques, libraries and frameworks, getting them to produce a specific piece of code that runs as expected is still quite hard. Our first attempt at this was with our Anima Figma plugin, which has multiple AI features. In some cases, we wanted to extend our support to new language variations and new styling mechanisms without writing heuristic conversions that simply would not scale. Additionally, we wanted users to be able to personalize the code we produce from Figma designs by adding state, logic and other capabilities to it.

This proved much more difficult than originally anticipated. LLMs hallucinate, a lot. Fine-tuning helps, but only to some degree: it reinforces the languages, frameworks and techniques that the LLM is already familiar with, but it does not stop the LLM from suddenly turning “lazy” (leaving comments with /* todo */ instructions rather than implementing the code, or even repeating the code that we wanted it to mutate or augment). It is also difficult to avoid plain hallucinations, where the LLM invents its own instructions and alters the developer’s original intent.
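To make the “lazy” failure mode concrete, here is a minimal sketch of a check that flags placeholder comments in a reply. The function name and the patterns are assumptions chosen for illustration; they are not taken from Anima's code, and a real check would likely need a broader pattern list.

```typescript
// Hypothetical heuristic: flag lines that contain placeholder comments
// instead of real code. The patterns are illustrative, not exhaustive.
const PLACEHOLDER_PATTERNS: RegExp[] = [
  /\/\*\s*todo\b[^*]*\*\//i, // block comments such as /* todo */
  /\/\/\s*todo\b/i, // line comments such as // TODO
  /\/\/\s*\.\.\.\s*(rest|existing|remaining)\b/i, // "// ... rest of the code"
];

// Returns the trimmed lines that look like placeholders (empty if none).
function findPlaceholders(code: string): string[] {
  const hits: string[] = [];
  for (const line of code.split("\n")) {
    if (PLACEHOLDER_PATTERNS.some((pattern) => pattern.test(line))) {
      hits.push(line.trim());
    }
  }
  return hits;
}
```

A non-empty result can be treated as a rejected reply, with the offending lines quoted back to the LLM.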
 
As the industry progresses, LLM laziness goes up and down, and we can use techniques like multi-shot prompting and emotional blackmail to keep the LLM on the original plan. In our case, however, we are measured by how usable the code we produce is and how well it visually represents the original design. We had to create a build tool that evaluated the differences and fed any build errors, and even visual errors, back to the LLM. If the LLM hallucinates a file or instructions, the build process catches it and the error is fed back to the LLM to correct, just like the normal “feedback loop” that a human developer would follow. By setting this as a target, we could also measure how well we had optimized our prompt engineering and RAG operations, and which model is best suited to each task.
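The feedback loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual build tool: `generate` and `build` are hypothetical injected functions, the prompt wording is invented, and the calls are synchronous for brevity where a real LLM call would be asynchronous.

```typescript
// All names here are hypothetical. The generator and the build step are
// injected so the loop stays independent of any particular LLM or toolchain.
interface BuildResult {
  ok: boolean;
  errors: string[]; // compiler, bundler, or visual-diff messages
}

type GenerateFn = (prompt: string) => string;
type BuildFn = (code: string) => BuildResult;

interface LoopOutcome {
  code: string;
  rounds: number; // how many generations were needed
  succeeded: boolean;
}

function generateWithBuildFeedback(
  task: string,
  generate: GenerateFn,
  build: BuildFn,
  maxRounds = 3
): LoopOutcome {
  let prompt = task;
  let code = "";
  for (let round = 1; round <= maxRounds; round++) {
    code = generate(prompt);
    const result = build(code);
    if (result.ok) {
      return { code, rounds: round, succeeded: true };
    }
    // Feed the errors back, much as a developer would read the build log.
    prompt =
      task +
      "\n\nYour previous attempt:\n" +
      code +
      "\n\nIt failed with these errors:\n" +
      result.errors.map((e) => "- " + e).join("\n") +
      "\n\nReturn a corrected version.";
  }
  return { code, rounds: maxRounds, succeeded: false };
}
```

Recording `rounds` and `succeeded` per task is one simple way to compare prompts, retrieval setups and models against the same target.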
 
This problem arose again when we approached our newest offering: Frontier, the VS Code extension that uses your design system and code components when it converts Figma designs to code. In this case, a single code segment could have multiple implementations, each of which could take additional code sections as child components or props, so the LLM needed much tighter guardrails. Not only did we need all the previous tools, we also needed to validate that the results it produced were valid code. This had to happen very quickly, which meant that a “self-healing” approach wouldn’t work. Instead, we identify props and values using the existing codebase and parse the TypeScript of the generated code to ensure that it makes sense and is valid against the code component that we have chosen to embed in a particular area of the codebase. Interestingly, despite generating very small function calls and receiving a fair amount of context and multi-shot examples, the LLMs hallucinate more often than expected. Fine-tuning might help with that, but we assumed that this is an inherent part of the technology and requires tight guardrails.
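As a simplified illustration of checking a generated component usage against what the codebase allows, the sketch below validates props against a hand-written specification. Everything here is an assumption made for the example: the `PropSpec`, `ComponentSpec` and `GeneratedUsage` shapes are hypothetical, and the step that parses the generated TypeScript into a `GeneratedUsage` is assumed to have already happened.

```typescript
// Hypothetical shapes: a real pipeline would derive ComponentSpec from the
// codebase's component definitions and GeneratedUsage from the parsed output.
type PropSpec =
  | { kind: "string" | "number" | "boolean"; required: boolean }
  | { kind: "enum"; values: string[]; required: boolean };

interface ComponentSpec {
  componentName: string;
  props: { [propName: string]: PropSpec };
}

interface GeneratedUsage {
  componentName: string;
  props: { [propName: string]: unknown };
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Returns a list of human-readable problems; an empty list means valid.
function validateUsage(spec: ComponentSpec, usage: GeneratedUsage): string[] {
  const problems: string[] = [];
  if (usage.componentName !== spec.componentName) {
    problems.push(`Expected <${spec.componentName}>, got <${usage.componentName}>`);
    return problems;
  }
  for (const propName of Object.keys(usage.props)) {
    const propSpec = hasOwn(spec.props, propName) ? spec.props[propName] : undefined;
    const value = usage.props[propName];
    if (propSpec === undefined) {
      problems.push(`Unknown prop "${propName}"`); // likely hallucinated
    } else if (propSpec.kind === "enum") {
      if (typeof value !== "string" || propSpec.values.indexOf(value) === -1) {
        problems.push(`Prop "${propName}" must be one of: ${propSpec.values.join(", ")}`);
      }
    } else if (typeof value !== propSpec.kind) {
      problems.push(`Prop "${propName}" must be a ${propSpec.kind}`);
    }
  }
  for (const propName of Object.keys(spec.props)) {
    const propSpec = spec.props[propName];
    if (propSpec !== undefined && propSpec.required && !hasOwn(usage.props, propName)) {
      problems.push(`Missing required prop "${propName}"`);
    }
  }
  return problems;
}
```

Because a check like this is a plain in-memory comparison, it avoids a round trip to the LLM or a full build, which fits the speed constraint mentioned above.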
 
That means that for each reply from the LLM, we first validate that it is a valid response. If it is invalid, we explain to the LLM what is wrong with it and ask it to correct the reply. In our experience, a single retry often does the trick, and if it fails, it will likely fail in subsequent rounds as well. Once the initial validation passes, we go through the reply and validate that it makes sense; a few simple validation heuristics improve the success rate dramatically.
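The two stages above can be sketched like this. It is a minimal illustration with hypothetical names (`ask`, `parse`, `Heuristic`), not the actual implementation. It assumes that only the first stage triggers the single retry and that the heuristics simply accept or reject the parsed reply. The LLM call is synchronous here for brevity.

```typescript
// Hypothetical names throughout. `ask` stands in for the LLM call.
type AskFn = (prompt: string) => string;

// A heuristic returns null when the value is fine, otherwise an explanation.
type Heuristic<T> = (value: T) => string | null;

type Parsed<T> = { ok: true; value: T } | { ok: false; error: string };

interface GuardedReply<T> {
  accepted: boolean;
  value: T | null;
  attempts: number;
  problems: string[];
}

function tryParse<T>(raw: string, parse: (raw: string) => T): Parsed<T> {
  try {
    return { ok: true, value: parse(raw) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

function askWithGuardRails<T>(
  prompt: string,
  ask: AskFn,
  parse: (raw: string) => T, // stage 1: throws on a malformed reply
  heuristics: Heuristic<T>[] // stage 2: does the reply make sense?
): GuardedReply<T> {
  // Stage 1: structural validity, with a single corrective retry.
  let parsed = tryParse(ask(prompt), parse);
  let attempts = 1;
  if (!parsed.ok) {
    const retryPrompt =
      prompt +
      "\n\nYour previous reply was invalid: " +
      parsed.error +
      "\nReply again with a corrected response.";
    parsed = tryParse(ask(retryPrompt), parse);
    attempts = 2;
  }
  if (!parsed.ok) {
    return { accepted: false, value: null, attempts, problems: [parsed.error] };
  }
  // Stage 2: sanity heuristics on the parsed reply.
  const value = parsed.value;
  const problems: string[] = [];
  for (const check of heuristics) {
    const problem = check(value);
    if (problem !== null) {
      problems.push(problem);
    }
  }
  return { accepted: problems.length === 0, value, attempts, problems };
}
```

Capping the retries at one reflects the observation that a reply which fails twice rarely recovers on later rounds.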
 
The conclusion is that you cannot ignore hallucinations. They are an inherent part of LLMs and require dedicated code to overcome. In our case, we give the user a way to supply even more context to the LLM, in which case we explicitly ask it to be more creative in its responses. This is an opt-in solution for users and often generates better placeholder code for components based on existing usage patterns. Interestingly, when we apply this to component libraries that the LLM was trained on (MUI, for example, is quite popular), the hallucinations increase, because the LLM has a prior bias towards those component implementations; the guard rails are particularly useful there.
 


VP Engineering

A seasoned industry veteran with a background in ML, machine vision and every kind of software development, experienced in managing large and small teams, and with a severe addiction to mountain biking and home theaters.
