A rendering of a modern AI factory
Nvidia
At the AI Infra Summit '25, Nvidia announced a new member of its upcoming Rubin data center AI product family. The Rubin CPX will complement the standard Rubin AI graphics processing unit (GPU) by providing high-value content generation at a more efficient price point. Most importantly, it fits into the data center infrastructure Nvidia has designed for multi-GPU data centers.
Tirias Research has consulted for Nvidia and other AI companies mentioned in this article.
Tirias Research has long predicted the need for a variety of AI inference solutions from AMD, Intel, Nvidia, and everyone else developing AI semiconductor solutions. As with any other data center workload, no two AI models are alike. As consumers and businesses adopt AI, and as AI models continue to evolve, there will be opportunities to optimize hardware around an AI model or group of models. However, GPUs will remain one of the best solutions for both AI training and AI inference for two main reasons, a position Nvidia reinforced with the Rubin CPX announcement.
The value of GPUs for AI
The first reason is the nature of the semiconductor industry. The technology industry swings like a pendulum. When a new technology is introduced, there is a period of rapid innovation, or in the case of AI, daily innovation. When the rate of innovation slows, standards emerge. At that point, it becomes reasonable to consider optimizing a function in a special-purpose chip known as an application-specific integrated circuit (ASIC). In many cases, that function can eventually be incorporated into a host processor such as a central processing unit (CPU) or GPU. However, developing a custom chip or functional block can take three or more years. With new models, and new ways of processing those models, changing so rapidly, a GPU is a more practical solution than an ASIC for most AI applications.
The technology pendulum swings with every new technology
Tirias Research
The second reason is the ability of GPUs to be partitioned to handle multiple AI models at the same time. There is a myth that a transition from AI training to AI inference is coming in the near future. With the deployment of models such as OpenAI's ChatGPT models, Google's Gemini, Microsoft Copilot, DeepSeek's R- and V-series models, Anthropic's Claude, Perplexity AI, and countless others, the overwhelming majority of AI processing across the industry is already inference. If there were such a crossover point, it would have occurred several years ago. Thanks to the programmable flexibility of AI GPUs and the way GPU data centers are built, the overwhelming majority of AI workloads, especially generative AI and agentic AI, run on GPUs because they are the most efficient option.
Nvidia's AI GPU buildout
At GTC 2025, Nvidia introduced several foundational technologies for building AI-centric data centers. These included the NVL144 rack design, KV cache, Dynamo, data center blueprints, and improvements to the NVLink, Spectrum-X, and Quantum-X networking technologies. KV cache enables the storage of previously computed key and value tensors so they can be reused in subsequent AI generation steps and shared between GPUs. Dynamo is an open-source inference framework for scheduling and executing AI workloads across the data center, essentially a data-center-level orchestrator. The NVL144 rack and Nvidia's networking technologies form the data center infrastructure. And the data center blueprints, executed in Omniverse, provide a digital twin for designing, building, and operating an AI data center, or AI factory, as Nvidia refers to it. Now, Nvidia has introduced the Rubin CPX, an AI accelerator optimized to perform specific functions extremely well. With the Rubin CPX, Nvidia is taking another step toward an AI factory design that can be optimized for specific AI functions.
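To make the KV cache concept concrete, the minimal Python sketch below shows the basic idea: key and value tensors computed for earlier tokens are stored so that each new token attends over the cached history instead of recomputing the full context. The class, shapes, and names are illustrative assumptions, not Nvidia's implementation.

    import numpy as np

    class KVCache:
        # Stores previously computed key/value vectors so each new token
        # attends over cached history instead of recomputing the context.
        def __init__(self):
            self.keys = []    # one d-dimensional key vector per token seen
            self.values = []  # one d-dimensional value vector per token seen

        def append(self, k, v):
            self.keys.append(k)
            self.values.append(v)

        def attend(self, q):
            # Single-head attention over the cached history: softmax(K.q) . V
            K = np.stack(self.keys)            # shape (t, d)
            V = np.stack(self.values)          # shape (t, d)
            scores = K @ q / np.sqrt(q.size)   # shape (t,)
            w = np.exp(scores - scores.max())
            w /= w.sum()
            return w @ V                       # shape (d,) context vector

    rng = np.random.default_rng(0)
    d = 64
    cache = KVCache()
    for _ in range(10):                        # ingest 10 tokens of context once
        cache.append(rng.normal(size=d), rng.normal(size=d))
    out = cache.attend(rng.normal(size=d))     # a new token reuses all cached K/V

Sharing this cache between GPUs, as Nvidia describes, avoids repeating the expensive context computation every time a model generates another token.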
Using disaggregated processing for inference workloads
Nvidia
Nvidia refers to the Rubin CPX as an inference platform designed for massive-context tasks, such as software development spanning millions of lines of code, hours of video generation, and deep research. The Rubin CPX works in conjunction with the Vera CPU and the Rubin AI GPU. The context phase of inference ingests large volumes of input data and requires high compute; the Rubin CPX processes this context input to begin creating the output, or content. The generation phase that follows depends more on memory and networking bandwidth and remains on the Rubin AI GPU. As a result, the Rubin CPX, while built on the same Rubin AI GPU architecture, is configured differently from the Rubin AI GPU, with 128GB of GDDR7 memory plus hardware encoders and decoders to support video generation. The Rubin CPX is capable of 30 petaflops of performance using the NVFP4 data format, a 3x increase in attention acceleration compared to the GB300 NVL72, and can process a one-million-token context. The memory and architecture changes result in a reduction of roughly 20 petaflops of overall performance, but an increase in efficiency for processing contextual inputs.
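A rough back-of-the-envelope sketch illustrates why the two phases stress different resources and therefore suit different silicon; all model numbers below are illustrative assumptions, not Nvidia figures.

    # Why the two inference phases stress different resources.
    # All values are illustrative assumptions, not Nvidia figures.

    params = 100e9                 # assume a 100B-parameter model
    bytes_per_param = 0.5          # NVFP4: 4-bit weights = half a byte
    context_tokens = 1_000_000     # the million-token context cited above
    flops_per_token = 2 * params   # ~2 FLOPs per parameter per token

    # Context (prefill) phase: every prompt token is processed up front,
    # so total compute dominates -> a compute-optimized part like the CPX.
    prefill_flops = context_tokens * flops_per_token
    print(f"context phase: {prefill_flops / 1e18:.1f} exaflops of raw work")

    # Generation (decode) phase: each output token re-reads the model weights
    # (and the growing KV cache), so memory bandwidth dominates -> the
    # HBM-equipped Rubin AI GPU.
    bytes_per_output_token = params * bytes_per_param
    print(f"generation phase: {bytes_per_output_token / 1e9:.0f} GB read per token")

Under these assumptions, the context phase is a massive one-time burst of compute, while generation is a steady stream of memory reads, which is why lower-bandwidth GDDR7 is acceptable on the CPX but not on the generation GPUs.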
Nvidia plans to offer the Rubin CPX integrated into a single rack with the Vera CPUs and Rubin AI GPUs, called the Vera Rubin NVL144 CPX, and as a separate accelerator rack that pairs with the Vera Rubin NVL144 rack. The Vera Rubin NVL144 CPX rack will be configured with 36 Vera CPUs, 144 Rubin AI GPUs, and 144 Rubin CPXs, with 100 TB of high-speed memory and 1.7 PB/s of memory bandwidth. The result is eight exaflops of NVFP4 performance, a 7.5x increase over the GB300 NVL72 rack. According to Nvidia, a $100 million capex investment could generate up to $5 billion in revenue, a 30x to 50x return on investment (ROI). The dual-rack solution will provide the same performance with an additional 50 TB of memory.
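These rack-level figures can be sanity-checked with a little arithmetic; the sketch below uses only the numbers cited in this article, and the per-unit breakdown is an inference, not an Nvidia disclosure.

    # Sanity-checking the rack-level figures cited above using only the
    # numbers reported in this article.

    cpx_petaflops = 30        # NVFP4 petaflops per Rubin CPX
    cpx_per_rack = 144
    rack_exaflops = 8         # total NVL144 CPX rack performance

    cpx_share = cpx_petaflops * cpx_per_rack / 1000   # convert PF -> EF
    print(f"CPX contribution: {cpx_share:.2f} of {rack_exaflops} exaflops")
    # -> roughly 4.3 exaflops, with the Rubin AI GPUs supplying the remainder

    capex = 100e6             # $100M investment
    revenue = 5e9             # up to $5B in revenue, per Nvidia
    print(f"implied ROI: up to {revenue / capex:.0f}x")   # the 50x upper bound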
Vera Rubin NVL144 CPX and Vera Rubin CPX + Vera Rubin NVL144
Nvidia
Expect more
The Rubin CPX is an AI GPU inference platform focused on high-value generation applications. We will likely see other versions of Nvidia's AI GPU architectures that target different segments of AI processing, such as smaller AI models, in the future. We could even see versions of the CPX optimized for even more specific applications. AI is not a single, monolithic workload, and optimizing the accelerator is only one step in the process. Most importantly, Nvidia continues to focus on the entire data center as a single system to ensure that every potential performance bottleneck is addressed, resulting in the highest possible efficiency and return on investment (ROI).
A common question is whether the industry needs an annual cadence of new AI GPUs. At this point, the answer is yes: new AI GPUs are needed each year just to keep pace with AI innovation. In addition, the industry requires GPUs optimized for the various types of AI workloads.
