University of Texas at Austin

Past Event: CSEM Student Forum

Distributed training: how to effectively scale your deep learning models in a distributed computing environment

Jon Wittmer, CSEM PhD Student, Oden Institute, UT Austin

12 – 1PM
Friday Oct 21, 2022

POB 6.304

Abstract

Deeper, wider, and more powerful. These words describe many of the leading machine learning models in use today, models that impact our everyday lives. From your virtual assistant to search engines to social media ad recommendations, large machine learning models are everywhere. Similarly, we live in a world of BIG DATA, where managing and using that data is a far harder challenge than gathering it.

While it is easy to conceptualize large models (just add more layers) and big data (buy more hard drives), how to actually train large models and effectively use the vast amounts of data at our disposal is discussed far less often. State-of-the-art models require state-of-the-art training techniques. This talk will cover several approaches to scaling both models and data in a distributed computing environment, from the traditional distribution strategies I use to train my research models to the state of the art used on industrial problems, hopefully giving an appreciation for the sheer magnitude of compute behind the products we use every day.
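As a minimal, hypothetical illustration of one such distribution strategy (data parallelism, a common approach and not necessarily the exact method covered in the talk): each worker holds a full copy of the model, computes gradients on its own shard of the data, and an all-reduce averages those gradients so every replica applies the same update. The one-parameter model, shard layout, and function names below are invented for illustration.

```python
def grad(w, x, y):
    # Gradient of the squared error 0.5 * (w*x - y)**2 for a
    # one-parameter linear model (illustrative toy model).
    return (w * x - y) * x

def allreduce_mean(values):
    # Stand-in for a collective all-reduce: average across workers.
    # In a real framework this would be a communication primitive.
    return sum(values) / len(values)

def data_parallel_step(w, shards, lr=0.1):
    # Each worker computes the average gradient on its local shard...
    local_grads = [
        sum(grad(w, x, y) for x, y in shard) / len(shard)
        for shard in shards
    ]
    # ...then gradients are averaged across workers (the all-reduce),
    # and every replica applies the identical update, keeping the
    # model copies in sync.
    g = allreduce_mean(local_grads)
    return w - lr * g

# Data consistent with the target w = 2, split across two "workers".
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
# w has now converged to approximately 2.0
```

The key property is that after each step all workers hold the same weights, so adding workers scales the data throughput without changing the computed update (up to floating-point effects).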

Biography

Jon is a PhD student in Dr. Tan Bui's group studying the interface of machine learning and scientific computing. He is particularly interested in applications of machine learning to high performance computing and optimization. He recently interned at Meta Reality Labs working on camera calibration.
