AWS re:Invent is just around the corner, and Amazon has already started announcing some of the new services and features.
In the tradition of releasing predictions and wishlists before re:Invent, here is my list of the top 5 AI announcements to expect from this massive conference:
1. AWS users will get an official AI assistant for cloud operations
Microsoft has Copilot and Google has Duet AI, but what about AWS? Amazon CodeWhisperer might be the answer, but it’s aimed only at developers, focusing on code suggestions within IDEs. How about an AWS Ops Whisperer?
I expect AWS to unveil a comprehensive AI assistant strategy at re:Invent this year, allowing users to interact with the cloud platform through a conversational chatbot interface. Amazon could also reveal a platform and set of tools for building custom AI assistants that integrate with external data and software.
Consider a chatbot in the AWS Console that accepts prompts such as “launch an EC2 instance in Singapore, optimized to run my NGINX web server, based on the same configuration used in the Ireland region yesterday”. This would be a highly contextual and efficient way of doing DevOps, CloudOps, and even FinOps on AWS.
Imagine performing a post-mortem and basic incident analysis simply by asking the AI assistant the right questions and letting it gather logs and metrics from CloudWatch and CloudTrail. Users could ask the assistant which region ran the most expensive service over the past month. While these questions may seem simple, the concept genuinely revolutionizes operations, and the opportunities are endless. AWS could even open a marketplace of assistants that users can publish and monetize.
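To make this concrete, here is a minimal sketch, using boto3 and models already available on Amazon Bedrock, of the kind of plumbing such an assistant could hide behind a single question: pull the last hour of error logs from CloudWatch Logs and ask Claude 2 for a first-pass incident summary. The log group name and prompt are assumptions made for illustration; nothing here is an announced AWS feature.

```python
# Hypothetical sketch: summarize recent errors from a CloudWatch log group
# with an LLM on Amazon Bedrock. The log group name and model choice are
# assumptions, not part of any announced AWS assistant.
import json
import time

import boto3

logs = boto3.client("logs")
bedrock = boto3.client("bedrock-runtime")

# Pull the last hour of ERROR-level events from an assumed log group.
now_ms = int(time.time() * 1000)
events = logs.filter_log_events(
    logGroupName="/my-app/production",  # assumed log group
    startTime=now_ms - 3600 * 1000,
    endTime=now_ms,
    filterPattern="ERROR",
)["events"]
log_text = "\n".join(e["message"] for e in events[:200])

# Ask Claude 2 (available on Bedrock today) for a first-pass post-mortem.
prompt = (
    "\n\nHuman: Summarize the likely root cause of this incident "
    f"from these logs:\n{log_text}\n\nAssistant:"
)
response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 500}),
)
print(json.loads(response["body"].read())["completion"])
```

An AWS-built assistant would presumably do the same orchestration behind the Console chat box, with IAM handling the permissions.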
The same concept can easily be extended to the AWS CLI and CloudFormation to make automation intelligent. The AWS CLI could automatically recommend additional parameters and configurations based on best practices for cost, security, and performance optimization. You could interact with the AWS CLI through simple prompts, and an AI model running in the cloud would turn them into a complex AWS CLI command, CloudFormation template, or CDK script, saving DevOps engineers hours of effort.
Ultimately, AWS could create dedicated AI assistants for each job function, such as infrastructure provisioning, storage, databases, security, and finance. These assistants would have a deep understanding of customer environments and historical data, allowing them to recommend the best way to use AWS services.
An AWS AI assistant could be the ultimate solution for dealing with the most complex and ever-evolving cloud services platform of our time.
2. A new class of database management services based on vector databases
The success of LLM-based applications depends on vector databases. They give LLMs long-term memory by recalling conversation history, and they provide contextual cues that help avoid hallucinations.
AWS has already added vector support for PostgreSQL running on Amazon RDS and Amazon Aurora. It also introduced a vector engine for Amazon OpenSearch Serverless to index, search, and retrieve embeddings.
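For readers who have not tried it yet, here is a minimal sketch of what that pgvector support looks like from Python against PostgreSQL on RDS or Aurora; the endpoint, credentials, table name, and embedding dimension are placeholders.

```python
# Minimal pgvector example against PostgreSQL on Amazon RDS/Aurora.
# Endpoint, credentials, table name, and dimension are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="mydb.cluster-xyz.eu-west-1.rds.amazonaws.com",  # assumed endpoint
    dbname="docs",
    user="app",
    password="...",
)
cur = conn.cursor()

# One-time setup: enable the extension and create a table of embeddings.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(
    "CREATE TABLE IF NOT EXISTS documents ("
    " id bigserial PRIMARY KEY,"
    " content text,"
    " embedding vector(1536));"  # dimension must match the embedding model
)
conn.commit()

# Retrieve the 5 documents closest to a query embedding (cosine distance).
query_embedding = [0.01] * 1536  # placeholder; normally from an embedding model
vector_literal = "[" + ",".join(map(str, query_embedding)) + "]"
cur.execute(
    "SELECT content FROM documents ORDER BY embedding <=> %s::vector LIMIT 5;",
    (vector_literal,),
)
for (content,) in cur.fetchall():
    print(content)
```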
While it makes sense to add vector capabilities to existing databases, customers need a dedicated, cost-effective vector database that serves as a single source of truth for storing and retrieving embeddings. When structured and unstructured customer data is distributed across object storage, NoSQL and relational databases, and data warehouses, a centralized vector database is preferable to a co-located vector store attached to each source.
Amazon also has the opportunity to add value to such a vector database with features such as efficient similarity search algorithms, built-in text embedding models, and predictable performance.
Finally, I expect AWS to add vector support to Amazon Neptune, its graph database, which could bring knowledge graphs to search and make the retrieved context richer and more relevant.
3. Serverless RAG pipelines connecting various AWS data services to LLMs
Amazon Bedrock has a feature called knowledge bases that connects data sources to vector databases to help you build agents. However, the feature, which is still in beta, is very basic and leaves a lot up to the developer. For example, it does not provide enough options for selecting different embedding models, vector databases, and LLMs.
One of the main drawbacks of the knowledge base feature is its inability to keep the vector database in sync with the data source: when a PDF is deleted from an S3 bucket, it is not clear whether the associated vectors are also deleted. Additionally, the user experience of building a knowledge base leaves a lot to be desired; configuring the required IAM role, data sources, and LLM targets is clumsy and time-consuming.
Amazon is likely to develop a serverless RAG pipeline that combines the best of Amazon Bedrock, AWS Glue, and AWS Step Functions. Customers should be able to start with a blank canvas, then add one or more data sources whose state is monitored so that the vector database stays up to date. The same user interface should let them select embedding models, vector databases, semantic search algorithms, prompt templates, and finally the target LLM. Behind the scenes, the retrieval-augmented generation pipeline would be implemented using AWS IAM, AWS Lambda, Amazon Bedrock, and other services. The service’s knowledge base and agent capabilities should be combined into a single, unified serverless infrastructure and developer experience.
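Until such a service exists, the retrieval-augmented generation flow it would automate looks roughly like the sketch below, using models already available on Bedrock; the search_vector_store() helper is a stand-in for whatever vector database the pipeline wires in and is not a real AWS API.

```python
# Rough sketch of the RAG flow a serverless pipeline would automate.
# search_vector_store() is a placeholder, not a real AWS API.
import json

import boto3

bedrock = boto3.client("bedrock-runtime")


def embed(text: str) -> list[float]:
    """Embed a string with Amazon Titan Embeddings (available on Bedrock)."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


def search_vector_store(embedding: list[float], k: int = 3) -> list[str]:
    """Placeholder for the vector-database lookup the pipeline would manage,
    e.g. OpenSearch Serverless or pgvector."""
    raise NotImplementedError


def answer(question: str) -> str:
    """Retrieve context for the question and ask Claude 2 to answer with it."""
    context = "\n".join(search_vector_store(embed(question)))
    prompt = (
        f"\n\nHuman: Using only this context:\n{context}\n\n"
        f"Answer the question: {question}\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 300}),
    )
    return json.loads(response["body"].read())["completion"]
```

A managed pipeline would replace all of this, including keeping the vector store in sync, with configuration.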
By connecting agents to AWS Lambda and API Gateway, this serverless platform could be extended to expose agents as REST endpoints. By extending observability to CloudWatch, customers could gain insight into how agents interact with LLMs and the identity associated with each call. This could be modeled after LangChain’s LangServe and LangSmith, which offer similar capabilities.
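As a sketch of that idea, a Lambda function behind an API Gateway proxy integration could wrap the answer() helper above; the event shape and route are assumptions, not an announced feature.

```python
# Hypothetical Lambda handler exposing an agent as a REST endpoint behind
# API Gateway (proxy integration). answer() is the RAG helper sketched above.
import json


def handler(event, context):
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": answer(question)}),
    }
```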
This service could become the foundation for a no-code or low-code, enterprise-grade tool for building AI agents and assistants in the future.
4. A new and improved large language model to succeed Titan
Although Amazon has Titan, its own LLM, available as a Bedrock foundation model, its performance is not comparable to proven models such as GPT-4.
Instead of Titan, the majority of AWS generative AI documentation, tutorials, and reference architectures use Claude 2 from Anthropic. Even PartyRock, the latest fun no-code tool, is based on Claude (Anthropic), Jurassic (AI21), and Command (Cohere) LLMs instead of Titan, which indicates the level of confidence in using the home-grown model in production.
Amazon is rumored to be working on a better LLM, codenamed ‘Olympus’, with 2 trillion parameters. The group responsible for shipping Amazon’s foundation models is led by Rohit Prasad, a former Alexa executive who now reports directly to CEO Andy Jassy. In his capacity as Amazon’s chief scientist for artificial general intelligence (AGI), Prasad has brought together scientists from the Amazon Science team and those working on Alexa AI to focus on model training, providing dedicated resources to unify the company’s AI initiatives.
The new LLM may include a text embedding model that provides efficient text vectorization while preserving context and semantic meaning.
The rumored LLM could become the de facto standard for all LLM-powered applications developed by Amazon and its ecosystem, such as agents and chatbots.
5. Multimodal AI on Amazon Bedrock
Finally, Amazon is expected to enhance Bedrock’s multimodal capabilities by integrating LLMs and diffusion models. It could either develop its own models or bring in existing multimodal models such as the Large Language and Vision Assistant (LLaVA). Such a model would be similar to OpenAI’s GPT-4V, which accepts image and text prompts to answer a user query.
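For a sense of what this might look like to developers, here is a purely speculative sketch of a multimodal call through the existing Bedrock runtime API; the model ID and request fields are invented for illustration and do not correspond to any announced model.

```python
# Purely speculative: what a multimodal (image + text) invocation on Bedrock
# might look like. The model ID and request fields below are invented.
import base64
import json

import boto3

bedrock = boto3.client("bedrock-runtime")

with open("architecture-diagram.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = bedrock.invoke_model(
    modelId="amazon.hypothetical-multimodal-v1",  # invented model ID
    body=json.dumps({
        "inputText": "What single point of failure does this diagram show?",
        "inputImage": image_b64,  # invented field name
    }),
)
print(json.loads(response["body"].read()))
```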
Amazon could also develop tools and wrappers to help with multimodal AI, which could become an important feature of Amazon Bedrock.
I plan to publish a detailed breakdown of AWS re:Invent 2023 news and announcements. Stay tuned.