Past Event: Center for Autonomy Seminar
Octavia Camps, Robust Systems Lab, Northeastern University
11 – 12PM
Thursday Mar 27, 2025
POB 6.304
A long-standing goal of Machine Learning is to enable machines to structure and interpret the world as humans do. This challenge is particularly complex in time-series data, such as video sequences, where seemingly different observations can arise from the same underlying dynamics. In this talk, I will explore how leveraging these inherent dynamics can lead to frugal and interpretable architectures for sequence analysis, classification, prediction, and manipulation.
I will illustrate these ideas with two key examples. First, I will introduce CVAR, a dynamics-based architecture for cross-view action recognition. By exploiting temporal coherence in sequential data, CVAR extracts dynamics-based features and invariant representations through an information-autoencoding unsupervised learning paradigm. This flexible framework accommodates various input modalities, including RGB and 3D skeleton data. Experimental results on four benchmark datasets demonstrate that CVAR not only outperforms state-of-the-art methods across all modalities but also significantly narrows the performance gap between RGB and 3D skeleton-based approaches.
Next, I will present JPDVT, a framework designed to solve "Set to Sequence" problems, where unordered, incomplete sets must be assembled into meaningful sequences. By employing conditional denoising diffusion probabilistic models, JPDVT learns the probability distribution of all possible permutations in training data, enabling sequence reconstruction through distribution sampling. Our approach achieves state-of-the-art performance in both quantitative and qualitative evaluations, demonstrating its ability to handle missing data and solve larger, more complex image and video puzzles than previous methods.
These examples highlight the power of ordering-aware architectures in structured sequence learning and their broad applicability across domains.
Octavia Camps received a B.S. degree in computer science and a B.S. degree in electrical engineering from the Universidad de la Republica (Uruguay), and an M.S. and a Ph.D. degree in electrical engineering from the University of Washington. Since 2006, she has been a Professor in the Electrical and Computer Engineering Department at Northeastern University. From 1991 to 2006, she was a faculty member in Electrical Engineering and in Computer Science and Engineering at The Pennsylvania State University. Prof. Camps was a visiting researcher at the Computer Science Department at Boston University during Spring 2013, and in 2000 she was a visiting faculty member at the California Institute of Technology and at the University of Southern California. She is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and was a General Chair for IEEE/CVF Computer Vision and Pattern Recognition (CVPR) 2024. Her main research interests include dynamics-based computer vision, machine learning, and image processing. In particular, her work seeks data-driven dynamic representations for high-dimensional temporal sequences that are compact, physically meaningful, and capture causal relationships. Combining recent deep learning developments with concepts from dynamic systems identification, she has developed models and algorithms for a range of video analytics applications, including human re-identification, visual tracking, action recognition, video generation, and medical imaging.