Investor, Board Member and CEO of AISERA, Inc., an AI Service Management (AISM) company.
On Microsoft’s most recent earnings call, CEO Satya Nadella offered an optimistic assessment of the company’s progress with generative AI. The strategic partnership with OpenAI has proven to be a highly successful venture, potentially giving Microsoft a significant advantage over rivals such as Google and Amazon.
But Nadella’s message went beyond just large language models (LLMs). He also emphasized the importance of small language models (SLMs) in Microsoft’s development strategy. He pointed to adoption by companies such as AT&T, EY and Thomson Reuters.
SLMs are generally five to 10 times smaller than LLMs. However, reduced size is not necessarily a disadvantage. SLMs still have significant potential and, in some cases, can perform on par with their larger LLM counterparts.
The SLM category is relatively nascent and subject to rapid innovation. Consequently, most businesses are currently experimenting with these models in pilot phases. But this technology has significant potential.
The problems with LLMs
While LLMs are a new technology, they have already become a major force in the business sector. They excel at processing, summarizing and analyzing large amounts of data and provide valuable insights for decision making. They also offer advanced capabilities for creating compelling content and translating between languages.
However, LLMs have significant disadvantages for business. One is the accuracy and quality of the model outputs. This includes not only bias within the models but also the issue of “hallucinations.” These are cases where the model produces plausible but factually incorrect or nonsensical information.
Next, LLMs can be too general, because their training data comes mostly from the public internet. This lack of customization can lead to a gap in how effectively these models understand and respond to industry-specific terminology, processes and data nuances.
Then there are the security and privacy concerns. When a business uses a hosted LLM, it typically transmits data through an API, which carries the risk of exposing sensitive information.
SLMs, however, can help alleviate these problems.
SLMs
Often, LLMs offer capabilities that are irrelevant to business needs. After all, an energy company doesn’t need detailed information on the Middle Ages, classic novels or anthropology.
An SLM offers a more focused training set. The model is tailored to the unique data sets of a specific business. These can range from product descriptions and customer feedback to internal communications such as Slack messages. The narrower focus of an SLM, as opposed to the vast knowledge base of an LLM, greatly reduces the chances of inaccuracies and hallucinations.
A smaller model is also more economical to build, develop and manage. This is important given the heavy costs of infrastructure such as GPUs (graphics processing units). In fact, an SLM can run on inexpensive commodity hardware (say, a CPU), or it can be hosted on a cloud platform.
The benefits of SLMs go beyond cost-effectiveness. They are more customizable, allowing for easier adjustments based on user feedback. These models also have lower latency. This is a critical feature for applications where responsiveness is key, such as in chatbot interactions. This combination of adaptability and speed enhances overall performance and user experience.
When it comes to security, a major advantage of many SLMs is that they are open source. This enables deployment in a private data center, offering enhanced control and security measures tailored to an organization’s specific needs.
Granted, this is not to say that SLMs are without drawbacks. Again, the technology is quite new, and there are still areas that require refinement and improvement.
One of the challenges is selecting the appropriate SLM. There are plenty available on sites like Hugging Face, and new ones seem to hit the market every day. While there are metrics for comparing them, they are not infallible and can be misleading.
There are also several ways to customize an SLM, all of which require specialized data science expertise. For example, fine-tuning involves adjusting a model’s weights and biases. Then there is retrieval-augmented generation (RAG). This is an advanced technique that improves the functionality of an SLM by integrating external documents, usually retrieved from vector databases. This method optimizes the model’s outputs, making them more relevant, accurate and useful in various contexts.
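To make the RAG idea above more concrete, here is a minimal sketch in Python. It is an illustration only, not a production approach: the "embedding" is a toy bag-of-words count standing in for a real embedding model, the in-memory list stands in for a vector database, and the function names (`embed`, `retrieve`, `build_prompt`) and sample documents are hypothetical. The resulting prompt would then be passed to whichever SLM a business has chosen.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (an assumption; real systems use a trained embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query (stand-in for a vector database lookup)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Augment the user's question with retrieved company documents before it reaches the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical internal documents for an energy company.
DOCS = [
    "Refunds are issued within 14 days of a returned order.",
    "The turbine maintenance schedule requires inspection every 6 months.",
    "Support hours are 9am to 5pm on weekdays.",
]

if __name__ == "__main__":
    print(build_prompt("How often is turbine inspection required?", DOCS, k=1))
```

The point of the design is that the model’s weights never change. Only the prompt does, which is why RAG is often a lighter-weight way than fine-tuning to ground a model in a company’s own data.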
Conclusion
While LLMs are powerful, they often generate answers that are too generalized and can be inaccurate. These models are also prone to security and privacy risks.
But SLMs can deal with these problems effectively. Some of the main benefits include:
• Grounding in a company’s proprietary data, which greatly improves accuracy.
• Lower costs, as SLMs can run on commodity hardware.
• Easier customization of the model.
• Lower latency, which can improve chatbot performance.
• Deployment in private data centers, which enhances security and privacy.
SLMs are still evolving rapidly. However, so far, they have proven to be an effective way for businesses to leverage generative AI.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.