Watch All the Highlights from Nvidia's Keynote at Computex

Jun 2, 2024

Speaker 1: Just last week, Google announced that they've put cuDF in the cloud and accelerated pandas. Pandas is the most popular data science library in the world. Many of you in here probably already use pandas. It's used by 10 million data scientists in the world, downloaded 170 million times each month. It is the Excel, that is, the spreadsheet of data scientists. Well, with just one click, you can now use pandas [00:00:30] in Colab, which is Google's cloud data centers platform, accelerated by cuDF. The speed up is really incredible. Let's take a look.
Speaker 1: That was a great demo, right? It didn't take long. This is Earth-2. The idea that we [00:01:00] would create a digital twin of the Earth, that we would go and simulate the Earth so that we could predict the future of our planet to better avert disasters or better understand the impact of climate change, so that we can adapt better, so that we could change our habits. Now, this digital twin of Earth is probably one of the most ambitious projects that the world's ever undertaken, and we're taking [00:01:30] large steps every single year, and I'll show you results every single year. But this year we made some great breakthroughs. Let's take a look.
Speaker 2: On Monday, the storm will veer north again and approach Taiwan. There are big uncertainties regarding its path. Different paths will have different levels of impact on Taiwan.
Speaker 1: Someday in the near future, [00:02:00] we will have continuous weather prediction at every square kilometer on the planet. You will always know what the climate's going to be. You will always know, and this will run continuously because we've trained the AI and the AI requires so little energy. In the late 1890s, Nikola Tesla invented an AC generator. We invented an AI [00:02:30] generator. The AC generator generated electrons; Nvidia's AI generator generates tokens. Both of these things have large market opportunities.
It's completely fungible in almost every industry, and that's why it's a new industrial revolution. And now we have a new factory, a new computer, and what we will run on top of this is a new type of software, [00:03:00] and we call it NIMs, Nvidia Inference Microservices. Now, what happens is the NIM runs inside this factory, and this NIM is a pre-trained model. It's an AI.
Speaker 1: Well, this AI is of course quite complex in itself, but the computing stack that runs AI is insanely complex. When you go and use ChatGPT, underneath their stack is [00:03:30] a whole bunch of software. Underneath that prompt is a ton of software, and it's incredibly complex because the models are large: billions to trillions of parameters. It doesn't run on just one computer. It runs on multiple computers. It has to distribute the workload across multiple GPUs: tensor parallelism, pipeline parallelism, data parallelism, all kinds of parallelism, expert parallelism, all kinds of parallelism, distributing the workload across multiple GPUs, processing [00:04:00] it as fast as possible. Because if you are in a factory, if you run a factory, your throughput directly correlates to your revenues. Your throughput directly correlates to quality of service, and your throughput directly correlates to the number of people who can use your service. We are now in a world where data center throughput utilization is vitally important.
Speaker 1: It was important in the past, but not vitally important. It was important in the past, but people don't measure it. [00:04:30] Today, every parameter is measured: start time, uptime, utilization, throughput, idle time, you name it, because it's a factory. When something is a factory, its operations directly correlate to the financial performance of the company. And so we realize that this is incredibly complex for most companies to do. So what we did was we created this AI in a box, and the container has [00:05:00] an incredible amount of software.
Inside this container is CUDA, cuDNN, TensorRT, and Triton for inference services. It is cloud native so that you could autoscale in a Kubernetes environment. It has management services and hooks so that you can monitor your AIs. It has common APIs, standard APIs, so that you could literally chat with this box.
Speaker 1: [00:05:30] We now have the ability to create large language models and pre-trained models of all kinds. And we have all of these various versions, whether it's language-based or vision-based or imaging-based, or we have versions that are available for healthcare, digital biology. We have versions that are digital humans that I'll talk to you about. And the way you use this: just come to ai.nvidia.com. And today we just posted up in Hugging Face [00:06:00] the Llama 3 NIM, fully optimized. It's available there for you to try, and you can even take it with you. It's available to you for free. And finally, AI models that reproduce lifelike appearances, enabling real-time path-traced subsurface scattering to simulate the way light penetrates the skin, scatters, and exits at various points, giving skin its soft and translucent appearance. Nvidia [00:06:30] ACE is a suite of digital human technologies packaged as easy-to-deploy, fully optimized microservices, or NIMs.
Speaker 1: Developers can integrate ACE NIMs into their existing frameworks, engines, and digital human experiences: Nemotron SLM and LLM NIMs to understand our intent and orchestrate other models, Riva speech NIMs for interactive speech and translation, Audio2Face and gesture NIMs for [00:07:00] facial and body animation, and Omniverse RTX with DLSS for neural rendering of skin and hair. And so we installed every single RTX GPU with Tensor Core processing. And now we have a hundred million GeForce RTX AI PCs in the world. And we're shipping 200. And this Computex, we're featuring four new amazing laptops. All of them are able to run AI.
Your future laptop, [00:07:30] your future PC will become an AI. It'll be constantly helping you, assisting you in the background. Ladies and gentlemen, this is Blackwell. Blackwell is in production, incredible amounts of technology.
Speaker 1: This is our production board. This is the most complex, [00:08:00] highest performance computer the world's ever made. This is the Grace CPU. And these are, you could see, each one of these Blackwell dies, two of them connected together. You see that it is the largest die, the largest chip the world makes. And then we connect two of them together with a 10 terabyte per second link. So this is a DGX Blackwell. This is air cooled, [00:08:30] has eight of these GPUs inside. Look at the size of the heat sinks on these GPUs: about 15 kilowatts, 15,000 watts, and completely air cooled. This version supports x86, and it goes into the infrastructure that we've been shipping Hoppers into. However, if you would like to have liquid cooling, we have a new system. And this new system [00:09:00] is based on this board, and we call it MGX for modular. And this modular system, you won't be able to see this. Can they see this? Can you see this? You can. Are you okay?
Speaker 1: And so this is the MGX [00:09:30] system, and here's the two Blackwell boards. So this one node has four Blackwell chips. These four Blackwell chips, this is liquid cooled. Nine of them. Nine of them. Well, 72 of these, 72 of these GPUs, 72 of these GPUs are then connected together with a new NVLink. This is NVLink Switch, [00:10:00] fifth generation. And the NVLink Switch is a technology miracle. This is the most advanced switch the world's ever made. The data rate is insane. And these switches connect every single one of these Blackwells to each other so that we have one giant 72-GPU Blackwell. Well, the benefit of this is that in one domain, one GPU domain, [00:10:30] this now looks like one GPU. This one GPU has 72 versus the last generation of eight.
So we increased it by nine times. The amount of bandwidth we've increased by 18 times. The AI flops we've increased by 45 times, and yet the amount of power is only 10 times. This is a hundred kilowatts, and that is 10 kilowatts. This is one GPU, ladies and gentlemen, DGX GPU. [00:11:00] The back of this GPU is the NVLink spine. The NVLink spine is 5,000 wires, two miles.
Speaker 1: And it's right here. This is an NVLink spine.
Speaker 1: And [00:11:30] it connects 72 GPUs to each other. This is an electrical mechanical miracle. The transceivers make it possible for us to drive the entire length in copper. And as a result, this switch, the NVSwitch, the NVLink switch driving the NVLink spine in copper, makes it possible for us to save 20 kilowatts in one rack. 20 kilowatts could now be used for [00:12:00] processing. Just an incredible achievement. So we have code names in our company and we try to keep 'em very secret. Oftentimes, most of the employees don't even know. But our next generation platform is called Rubin, the Rubin platform, the Rubin platform. I'm not going to spend much time on it. I know what's going to happen. You're going to take pictures of it and you're going to go look at the fine print, and feel free to do that. So we have the Rubin platform, and one year later we'd have the Rubin [00:12:30] Ultra platform.
Speaker 1: All of these chips that I'm showing you here are all in full development, a hundred percent of them. And the rhythm is one year at the limits of technology, all a hundred percent architecturally compatible. So this is basically what Nvidia is building. A robotic factory is designed with three computers: train the AI on Nvidia AI. You have the robot running on the PLC systems for orchestrating the factories. And [00:13:00] then you, of course, simulate everything inside Omniverse. Well, the robotic arm and the robotic AMRs are also the same way. Three computer systems. The difference is the two Omniverses will come together.
So they'll share one virtual space. When they share one virtual space, that robotic arm will become inside the robotic factory. And again, three computers. And we provide the computer, [00:13:30] the acceleration layers, and pre-trained AI models. Well, I think we have some robots that we'd like to welcome. There we go. About my size.
Speaker 1: And we have some friends to join us. So the future of robotics is [00:14:00] here, the next wave of AI. And of course, Taiwan builds computers with keyboards. You build computers for your pocket. You build computers for data centers in the cloud. In the future, you're going to build computers that walk and computers that roll around. And so these are all just computers. And as it turns out, the technology [00:14:30] is very similar to the technology of building all of the other computers that you already built today. So this is going to be a really extraordinary journey for us.