Upcoming Event: Babuška Forum
Kevin Miller, Postdoctoral Fellow, Oden Institute, UT Austin
10 – 11AM
Friday Sep 29, 2023
Active learning is a machine learning paradigm that judiciously selects which inputs to label in order to increase the accuracy of downstream models. Supervised learning methods rely on (potentially) large amounts of labeled data to produce accurate classifiers, yet in real-world applications obtaining labels (e.g., classifications) for inputs is often expensive. In applications where hand-labeling requires significant domain expertise, it can be prohibitively costly to ask a human to label an entire pool of input data before training a supervised classifier. Active learning alleviates this burden by iteratively selecting useful and informative subsets of the input data pool (called "query points") for a domain expert to label in a feedback loop.
Two keys to successful active learning are ensuring that the selected query points (1) explore the extent of the clustering structure of the dataset and (2) exploit classification decision boundaries. In particular, it is crucial that an active learning method select query points that properly explore the dataset before it selects query points that exploit. Graph-based active learning, an important area of current research, leverages graph structures that model the pairwise similarities between inputs to capture the clustering structure inherent in the dataset and thereby ensure exploration prior to exploitation.
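To make the query-and-label feedback loop concrete, here is a minimal sketch in Python. It is an illustration only and not the method presented in the talk: the names `active_learning_loop`, `oracle`, and `margin`, the one-dimensional pool, the nearest-centroid classifier, and the distance-based exploration rule are all assumptions chosen for brevity, and no graph structure is used. The loop explores (picks the point farthest from everything labeled so far) while only one class has been seen, then exploits (picks the point with the smallest margin between the two closest class centroids).

```python
def margin(x, centroids):
    """Gap between the distances to the two closest class centroids.

    A small gap means x lies near the current decision boundary.
    """
    dists = sorted(abs(x - c) for c in centroids.values())
    return dists[1] - dists[0]


def active_learning_loop(pool, oracle, seed_indices, budget):
    """Toy pool-based active learning on a list of 1-D inputs.

    pool         : list of floats (the unlabeled input pool)
    oracle       : function standing in for the human expert, x -> label
    seed_indices : indices labeled before the loop starts
    budget       : number of additional queries allowed
    Returns a dict mapping pool index -> label.
    """
    labeled = {i: oracle(pool[i]) for i in seed_indices}

    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break

        # "Train" the toy model: one centroid (mean input) per observed class.
        sums = {}
        for i, y in labeled.items():
            total, count = sums.get(y, (0.0, 0))
            sums[y] = (total + pool[i], count + 1)
        centroids = {y: total / count for y, (total, count) in sums.items()}

        if len(centroids) < 2:
            # Explore: query the point farthest from every labeled point.
            query = max(
                unlabeled,
                key=lambda i: min(abs(pool[i] - pool[j]) for j in labeled),
            )
        else:
            # Exploit: query the point closest to the decision boundary.
            query = min(unlabeled, key=lambda i: margin(pool[i], centroids))

        # Ask the (simulated) expert for the label and add it to the set.
        labeled[query] = oracle(pool[query])

    return labeled


if __name__ == "__main__":
    pool = [i / 20 for i in range(21)]        # inputs 0.0, 0.05, ..., 1.0
    expert = lambda x: 1 if x > 0.5 else 0    # hypothetical ground truth
    result = active_learning_loop(pool, expert, seed_indices=[0], budget=5)
    for idx in sorted(result):
        print(pool[idx], result[idx])
```

In this toy setting the labeling budget is spent on one point at the far end of the pool and then on points near the class boundary, rather than on the whole pool. Graph-based methods replace the raw distances used here with similarity-graph structure.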
This talk will highlight insights gained and methods developed in my recent work on this exploration versus exploitation tradeoff in graph-based active learning. Time permitting, it will also introduce quantitative mathematical guarantees for achieving exploration and exploitation with these methods.
Kevin Miller is currently a Peter J. O'Donnell Postdoctoral Fellow at the Oden Institute for Computational Engineering and Sciences at the University of Texas at Austin. He earned his Ph.D. in Mathematics from the University of California, Los Angeles, under the supervision of Dr. Andrea Bertozzi. During his doctoral work, he was supported by the Department of Defense's National Defense Science and Engineering Graduate (NDSEG) Fellowship. His research focuses on the mathematics of data science and statistical learning theory, with particular emphasis on subset selection, active learning, and graph-based semi-supervised learning methods.