Past Event: Center for Scientific Machine Learning
Yuling Yao, Assistant Professor in the Department of Statistics and Data Sciences at the University of Texas at Austin
1 – 2PM
Wednesday Feb 12, 2025
POB 4.304 and Zoom
Simulation is a powerful way to specify models in modern scientific computing, but the likelihood-free setting poses challenges for inference and calibration. To start, I present a cosmology example of galaxy clustering analysis using simulation-based inference and normalizing flows. I then present three recent advances in simulation-based inference. (1) “Discriminative calibration” is a general classifier-based approach to checking Bayesian computation, including simulation-based inference and Markov chain Monte Carlo. The classifier performance is a consistent estimate of a family of divergence measures, with the classical classifier two-sample test as a special case. (2) To incorporate posterior approximations from different inference algorithms or flow architectures and improve the final inference quality, I present “simulation-based stacking”, a general framework for combining probabilistic inferences. (3) Yet even when the inference is perfect, the simulation model is often only an approximation of nature. I present the “simulation-based posterior predictive check”, a framework for checking whether the simulation model does a good job of capturing relevant aspects of the data, such as means, standard deviations, and quantiles. The p-value of this new predictive check is guaranteed to be frequentist-calibrated under the null, making it particularly suitable for rigorously testing scientific hypotheses.
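For readers unfamiliar with the classical classifier two-sample test mentioned in the abstract, the idea can be sketched in a few lines. This is only an illustration of that textbook special case, not the speaker's discriminative calibration method: the function names, the logistic-regression classifier with squared features, the 50/50 train/test split, and the permutation null are all assumptions chosen for brevity.

```python
import numpy as np

def _fit_logistic(X, t, steps=200):
    """Fit logistic regression by plain gradient descent; returns the weights."""
    w = np.zeros(X.shape[1])
    lr = 4.0 / X.shape[1]  # conservative step size for standardized features
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - t) / len(t)
    return w

def classifier_two_sample_test(x, y, n_perm=200, seed=0):
    """Hypothetical helper: returns (held-out accuracy, permutation p-value).

    Accuracy near 0.5 means the classifier cannot tell the samples apart;
    accuracy well above 0.5 is evidence that the distributions differ.
    """
    rng = np.random.default_rng(seed)
    data = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x)), np.ones(len(y))])

    # Linear and squared features, standardized, plus an intercept column.
    feats = np.hstack([data, data ** 2])
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)
    feats = np.hstack([np.ones((len(feats), 1)), feats])

    # Random 50/50 split into training and held-out points.
    idx = rng.permutation(len(feats))
    train, test = idx[: len(idx) // 2], idx[len(idx) // 2:]

    def heldout_accuracy(lab):
        w = _fit_logistic(feats[train], lab[train])
        pred = (feats[test] @ w > 0).astype(float)
        return np.mean(pred == lab[test])

    acc = heldout_accuracy(labels)
    # Null distribution: refit after shuffling labels, which breaks any
    # association between a point and the sample it came from.
    null = np.array([heldout_accuracy(rng.permutation(labels))
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= acc)) / (n_perm + 1)
    return acc, p_value

# Toy usage with synthetic Gaussian samples (illustrative data only).
rng = np.random.default_rng(1)
same = classifier_two_sample_test(rng.normal(size=(500, 2)),
                                  rng.normal(size=(500, 2)))
shifted = classifier_two_sample_test(rng.normal(size=(500, 2)),
                                     rng.normal(loc=1.0, size=(500, 2)))
print("same distribution:    accuracy=%.2f, p=%.3f" % same)
print("shifted distribution: accuracy=%.2f, p=%.3f" % shifted)
```

In a calibration setting, the two samples could be draws from an approximate posterior and draws from a trusted reference; that pairing is an assumption made here for illustration.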
Yuling Yao joined UT Austin in 2024 as an Assistant Professor in the Department of Statistics and Data Sciences. Before coming to UT, he spent three years as a Flatiron Research Fellow at the Flatiron Institute's Center for Computational Mathematics.
Dr. Yao develops scalable Bayesian methods for applied data problems, with a strong emphasis on probabilistic modeling and uncertainty quantification. His recent applications span diverse areas, including lead fallout in Paris, arsenic contamination in Bangladeshi groundwater, the evolution of the Universe after the Big Bang, and bottom quark tagging in particle collider experiments.
Better applied statistics needs better methodology. To that end, he designs statistical and machine learning methods, with a focus on model evaluation, aggregation, causal inference, and inference under misspecification.