Kevin Carlberg is an AI Research Science Manager at Meta Reality Labs Research and an Affiliate Associate Professor of Applied Mathematics and Mechanical Engineering at the University of Washington. He leads a multidisciplinary research team focused on enabling the future of interaction in augmented and virtual reality by developing AI breakthroughs across the domains of embodied AI, multimodal machine learning, conversational AI, and AI-accelerated computational physics. His individual research combines concepts from machine learning, computational physics, and high-performance computing to drastically reduce the cost of simulating nonlinear dynamical systems at extreme scale.
Previously, Kevin was a Distinguished Member of Technical Staff at Sandia National Laboratories in Livermore, California, where he led a research group of PhD students, postdocs, and technical staff in applying these techniques to a range of national-security applications in mechanical and aerospace engineering.
His recent plenary talk at the ICERM Workshop on Scientific Machine Learning summarizes his work.
PhD in Aeronautics and Astronautics, 2011
Stanford University
MS in Aeronautics and Astronautics, 2006
Stanford University
BS in Mechanical Engineering, 2005
Washington University in St. Louis
In many applications, projection-based reduced-order models (ROMs) have demonstrated the ability to provide rapid approximate solutions to high-fidelity full-order models (FOMs). However, there is no a priori assurance that these approximate solutions are accurate; their accuracy depends on the ability of the low-dimensional trial basis to represent the FOM solution. As a result, ROMs can generate inaccurate approximate solutions, e.g., when the FOM solution at the online prediction point is not well represented by training data used to construct the trial basis. To address this fundamental deficiency of standard model-reduction approaches, this work proposes a novel online-adaptive mechanism for efficiently enriching the trial basis in a manner that ensures convergence of the ROM to the FOM, yet does not incur any FOM solves. The mechanism is based on the previously proposed adaptive $h$-refinement method for ROMs [12], but improves upon this work in two crucial ways. First, the proposed method enables basis refinement with respect to any orthogonal basis (not just the Kronecker basis), thereby generalizing the refinement mechanism and enabling it to be tailored to the physics characterizing the problem at hand. Second, the proposed method provides a fast online algorithm for periodically compressing the enriched basis via an efficient proper orthogonal decomposition (POD) method, which does not incur any operations that scale with the FOM dimension. These two features allow the proposed method to serve as (1) a failsafe mechanism for ROMs, as the method enables the ROM to satisfy any prescribed error tolerance online (even in the case of inadequate training), and (2) an efficient online basis-adaptation mechanism, as the combination of basis enrichment and compression enables the basis to adapt online while controlling its dimension.
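To illustrate the basis-compression ingredient, the sketch below shows how an enriched trial basis might be periodically compressed with a standard thin-SVD-based POD. This is only a minimal, generic version: the function name and energy tolerance are illustrative, and unlike the online algorithm proposed in the paper, this sketch performs operations that scale with the FOM dimension.

```python
import numpy as np

def compress_basis(V_enriched, energy_tol=1e-6):
    """Compress an enriched trial basis via POD (thin SVD).

    Note: this generic sketch operates on full-dimensional vectors; the
    paper's online algorithm avoids operations that scale with the FOM
    dimension.
    """
    # Thin SVD of the enriched basis (columns = basis vectors)
    U, s, _ = np.linalg.svd(V_enriched, full_matrices=False)
    # Retain the smallest number of modes capturing the prescribed energy
    energy = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(energy, 1.0 - energy_tol) + 1)
    return U[:, :k]

# Hypothetical usage: enrich the basis online, then periodically compress it
# V = compress_basis(np.hstack([V, new_refinement_vectors]))
```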
This work proposes a machine-learning framework for modeling the error incurred by approximate solutions to parameterized dynamical systems. In particular, we extend the machine-learning error models (MLEM) framework proposed in [Freno, Carlberg, 2019] to dynamical systems. The proposed Time-Series Machine-Learning Error Modeling (T-MLEM) method constructs a regression model that maps features—which comprise error indicators that are derived from standard a posteriori error-quantification techniques—to a random variable for the approximate-solution error at each time instance. The proposed framework considers a wide range of candidate features, regression methods, and additive noise models. We consider primarily recursive regression techniques developed for time-series modeling, including both classical time-series models (e.g., autoregressive models) and recurrent neural networks (RNNs), but also analyze standard non-recursive regression techniques (e.g., feed-forward neural networks) for comparative purposes. Numerical experiments conducted on multiple benchmark problems illustrate that the long short-term memory (LSTM) neural network, which is a type of RNN, outperforms other methods and yields substantial improvements in error predictions over traditional approaches.
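As a rough illustration of the regression structure, the following PyTorch sketch maps a time series of error-indicator features to a predicted error at each time instance using an LSTM. The layer sizes, feature count, and training details are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class LSTMErrorModel(nn.Module):
    """Sketch: map a sequence of error-indicator features to a predicted
    approximate-solution error at each time instance."""
    def __init__(self, n_features, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)   # predicted error at each step

    def forward(self, x):                 # x: (batch, time, n_features)
        h, _ = self.lstm(x)               # h: (batch, time, hidden_size)
        return self.head(h).squeeze(-1)   # (batch, time)

# Example forward pass: 8 trajectories, 100 time steps, 4 error indicators.
# Training against measured errors with an MSE loss would correspond to a
# Gaussian additive-noise model.
model = LSTMErrorModel(n_features=4)
pred = model(torch.randn(8, 100, 4))
```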
Nearly all model-reduction techniques project the governing equations onto a linear subspace of the original state space. Such subspaces are typically computed using methods such as balanced truncation, rational interpolation, the reduced-basis method, and (balanced) POD. Unfortunately, restricting the state to evolve in a linear subspace imposes a fundamental limitation on the accuracy of the resulting reduced-order model (ROM). In particular, linear-subspace ROMs can be expected to produce low-dimensional models with high accuracy only if the problem admits a fast-decaying Kolmogorov $n$-width (e.g., diffusion-dominated problems). However, many problems of interest exhibit a slowly decaying Kolmogorov $n$-width (e.g., advection-dominated problems). To address this, we propose a novel framework for projecting dynamical systems onto nonlinear manifolds using minimum-residual formulations at the time-continuous and time-discrete levels; the former leads to *manifold Galerkin* projection, while the latter leads to *manifold least-squares Petrov–Galerkin* (LSPG) projection. We perform analyses that provide insight into the relationship between these proposed approaches and classical linear-subspace reduced-order models. We also propose a computationally practical approach for computing the nonlinear manifold, which is based on convolutional autoencoders from deep learning. Finally, we demonstrate the ability of the method to significantly outperform even the optimal linear-subspace ROM on benchmark advection-dominated problems, thereby demonstrating the method's ability to overcome the intrinsic $n$-width limitations of linear subspaces.
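To make the manifold-projection idea concrete, here is a minimal sketch of the time-continuous (Galerkin-type) projection, assuming a decoder that maps latent coordinates to the full state; the function names are illustrative, and the paper's convolutional-autoencoder architecture and further analysis are omitted.

```python
import torch

def manifold_galerkin_velocity(decoder, q, f):
    """Sketch of time-continuous manifold Galerkin projection.

    The state is approximated as x ~ decoder(q); the latent velocity is
    obtained by projecting the FOM velocity f(x) onto the decoder's tangent
    space, i.e., dq/dt = pinv(J(q)) f(decoder(q)) with J the decoder Jacobian.
    """
    x = decoder(q)
    J = torch.autograd.functional.jacobian(decoder, q)   # (n_FOM, n_latent)
    # Least-squares solve: J dq/dt ≈ f(x)
    dqdt = torch.linalg.lstsq(J, f(x).unsqueeze(-1)).solution.squeeze(-1)
    return dqdt
```

A time integrator would then advance the latent coordinates q with this velocity, and the full-state approximation is recovered as decoder(q).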
This work introduces the network uncertainty quantification (NetUQ) method for performing uncertainty propagation in systems composed of interconnected components. The method assumes the existence of a collection of components, each of which is characterized by exogenous-input random variables (e.g., material properties), endogenous-input random variables (e.g., boundary conditions defined by another component), output random variables (e.g., quantities of interest), and a local uncertainty-propagation operator (e.g., provided by stochastic collocation) that computes output random variables from input random variables. The method assembles the full-system network by connecting components, which is achieved simply by associating endogenous-input random variables for each component with output random variables from other components; no other inter-component compatibility conditions are required. The network uncertainty-propagation problem is: Compute output random variables for all components given all exogenous-input random variables. To solve this problem, the method applies classical relaxation methods (i.e., Jacobi and Gauss–Seidel iteration with Anderson acceleration), which require only blackbox evaluations of component uncertainty-propagation operators. Compared with other available methods, this approach is applicable to any network topology (e.g., no restriction to feed-forward or two-component networks), promotes component independence by enabling components to employ tailored uncertainty-propagation operators, supports general functional representations of random variables, and requires no offline preprocessing stage. Also, because the method propagates functional representations of random variables throughout the network (and not, e.g., probability density functions), the joint distribution of any set of random variables throughout the network can be estimated a posteriori in a straightforward manner. We perform supporting convergence and error analysis and execute numerical experiments that demonstrate the weak- and strong-scaling performance of the method.
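The sketch below illustrates the relaxation idea, assuming each component exposes a blackbox uncertainty-propagation operator and random variables are stored as plain arrays (e.g., samples or expansion coefficients). The data structures and names are hypothetical, and Anderson acceleration is omitted.

```python
import numpy as np

def netuq_jacobi(components, connections, exogenous, outputs0, n_iter=50, tol=1e-8):
    """Sketch of NetUQ-style Jacobi relaxation over a component network.

    components : dict name -> blackbox operator op(exo, endo) -> dict of output
                 random variables (here, arrays of samples or coefficients)
    connections: dict (name, endogenous_key) -> (source_name, source_output_key)
    exogenous  : dict name -> exogenous-input random variables for that component
    outputs0   : dict name -> initial guess for that component's outputs
    """
    outputs = outputs0
    for _ in range(n_iter):
        new_outputs = {}
        for name, op in components.items():
            # Endogenous inputs come from the previous iterate (Jacobi sweep);
            # a Gauss-Seidel variant would read from new_outputs when available.
            endo = {key: outputs[src][src_key]
                    for (comp, key), (src, src_key) in connections.items()
                    if comp == name}
            new_outputs[name] = op(exogenous[name], endo)
        # Convergence check over all output random variables
        diff = max(np.linalg.norm(new_outputs[n][k] - outputs[n][k])
                   for n in outputs for k in outputs[n])
        outputs = new_outputs
        if diff < tol:
            break
    return outputs
```

Because each component is queried only through its propagation operator, the sweep treats components as interchangeable blackboxes, which is the property that allows arbitrary network topologies and tailored per-component solvers.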