Workshop “Accelerating and Scaling Inference with NVIDIA GPUs” (archived event)

Took place
01 December 2022 (Thursday)

Registration for this workshop is available only to attendees of the Transform 2022 conference.

Completing the application form does not guarantee a place: your registration is subject to approval. Priority will be given to applicants from Eastern Europe. You will receive an email shortly afterwards about the status of your application and course registration. If you do not receive anything, please check your spam and junk folders.

The workshop will be conducted in English.

Workshop description:

Learn how to use GPUs to deploy machine learning models at production scale with the Triton Inference Server. At scale, machine learning models can interact with up to millions of users in a day. As usage grows, the costs in both money and engineering time can prevent models from reaching their full potential. These are the kinds of challenges that inspired the creation of Machine Learning Operations (MLOps).

Practice Machine Learning Operations by:

- Deploying neural networks from a variety of frameworks onto a live Triton Server
- Measuring GPU usage and other metrics with Prometheus
- Sending asynchronous requests to maximize throughput

Upon completion, learners will be able to deploy their own machine learning models on a GPU server.
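As a rough illustration of why sending requests asynchronously can raise throughput, the sketch below compares issuing requests one at a time with issuing them concurrently. It is not workshop material and does not use the Triton client: `fake_infer` is a hypothetical stand-in that only sleeps to simulate network and GPU latency, and the latency value is an assumption chosen for the demo.

```python
import asyncio
import time


async def fake_infer(request_id: int, latency_s: float = 0.05) -> dict:
    """Hypothetical stand-in for one inference request to a model server."""
    # The sleep simulates network plus GPU time; no real server is contacted.
    await asyncio.sleep(latency_s)
    return {"id": request_id, "output": request_id * 2}


async def run_sequential(n: int) -> list:
    # Each request waits for the previous one to finish.
    return [await fake_infer(i) for i in range(n)]


async def run_concurrent(n: int) -> list:
    # All requests are in flight at once; gather preserves input order.
    return await asyncio.gather(*(fake_infer(i) for i in range(n)))


def timed(coro_fn, n: int):
    """Run an async function to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(n))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    seq_results, seq_time = timed(run_sequential, 8)
    con_results, con_time = timed(run_concurrent, 8)
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With a real server the gain depends on how well the server batches and schedules concurrent requests, so the simulated numbers here only show the client-side effect of overlapping waits.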

Prerequisites: familiarity with at least one machine learning framework or runtime, such as PyTorch, TensorFlow, ONNX, or TensorRT. Familiarity with Docker is recommended but not required.

