Serving DNNs in Real Time at Datacenter Scale with Project Brainwave

Intel

To meet the computational demands required of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance. Project Brainwave, Microsoft’s principle infrastructure for AI serving in real time, accelerates deep neural network (DNN) inferencing in major services such as Bing’s intelligent search features and Azure. Exploiting distributed model parallelism and pinning over low-latency hardware microservices, Project Brainwave serves state-of-the-art, pre-trained DNN models with high efficiencies at low batch sizes.

Published:
Lang:
Type: White Paper
Length: