Today I heard from MIT researcher Andrei Barbu about working with LLMs and how to prevent certain kinds of problems related to data leaks.
“We’re interested in studying language specifically,” he said of his work, pointing to some of the research groups’ goals in this area.
Reflecting on the duality of human and computer cognition, he pointed out some differences. People, he said, teach each other, for example. Another thing people can do is keep secrets. This can be difficult for our digital counterparts.
“There is a problem with LLMs,” he noted, explaining their potential for leaks. “They can’t keep a secret.”
Describing how to identify the problem, Barbu spoke of direct injection attacks as a prime example. We heard something like this from Adam Chipala a while back where he mentioned verifiable software principles as a possible solution.
Barbu mentioned something I found interesting, though: a sort of catch-22 for systems that aren’t very good at data sealing.
“Models are only as sensitive as the most sensitive piece of data within that model,” he explained. “People can question the model. … (on the other hand, models are) only as weak and vulnerable to attack as the least sensitive part you put in.”
People can poison your model, he suggested, in some cases quite easily.
The solution: Barbu talked about a custom, fine-tuned model and how it might work.
Specifically, he was talking about something called low-level customization, or LORA, a fine-tuning method first introduced at Microsoft in 2021.
When I looked into it, the experts mention two things LORA does in a proprietary way: it tracks weight changes instead of directly updating the weights, and makes the large matrix of weight changes into smaller arrays to isolate the parameters.
Describing some related methods, Barbu talked about extracting what is needed from a component library and explained that there are many ways to approach this. Using a Venn diagram, note the difference between adaptive and selective types of methods and others.
You can think, he suggested, of strategies like converting English to SQL to solve problems and speed up solutions. At the end of the day, though, the challenge remains.
“Security is binary,” he noted. “You make it or you fail.”
Now, this part of the presentation caught my ear: Barbu talked about some kinds of potential AI tools that could end up taking a lot of the work out of information security. He described a situation where tools are at the top of a network, looking for sensitive information and actually providing information.
This, he suggested, could solve some rampant HIPAA problems with leaks of protected health data.
“We just don’t have a good solution for this yet,” he said. But with the right tools, we can!
Another good idea I heard from his talk was to label “informed embarrassment” and “uninformed embarrassment” to see where the problem lies.
Then, he said, we can also improve the secret-keeping capabilities of our LLMs.
“We can create secure LLMs,” he added, “models that are completely immune to any kind of attack because we simply don’t bind the parameters that the user should have access to, so there’s literally nothing they can do…”
Look, there was a lot more to this demo – for example, Barbu talked a bit about the scenario of keeping data on alien landings in a database – just as an example! So this part was also remarkable. You’ll have to drill down into the video to really see it in detail. But I will continue to bring you the highlights from this conference, almost in real time!