Optimal transport and Mean field games Seminar, Spring 2022.

Organizer: Wuchen Li.

Regular seminar talk time and location: Wednesday 5:00 p.m.-6:00 p.m., LA time @ Zoom.

 Presenter Thomas Hou California Institute of Technology Date: Jan 5, 2022, 5:00-6:00 p.m., LA time. Title: Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference Abstract: We propose a Multiscale Invertible Generative Network (MsIGN) and associated training algorithm that leverages multiscale structure to solve high-dimensional Bayesian inference. To address the curse of dimensionality, MsIGN exploits the low-dimensional nature of the posterior, and generates samples from coarse to fine scale (low to high dimension) by iteratively upsampling and refining samples. MsIGN is trained in a multi-stage manner to minimize the Jeffreys divergence, which avoids mode dropping in high-dimensional cases. On two high-dimensional Bayesian inverse problems, we show superior performance of MsIGN over previous approaches in posterior approximation and multiple mode capture. On the natural image synthesis task, MsIGN achieves superior performance in bits-per-dimension over baseline models and yields great interpretability of its neurons in intermediate layers. This is a joint work with Shumao Zhang and Pengchuan Zhang.
 Presenter Chao Ma Stanford University Date: Jan 19, 2022, 5:00-6:00 p.m., LA time. Title: Provably convergent quasistatic dynamics for mean-field two-player zero-sum games Abstract: We study the minimax problem arising from finding the mixed Nash equilibrium for mean-field two-player zero-sum games. Solving this problem requires optimizing over two probability distributions. We consider a quasistatic Wasserstein gradient flow dynamics in which one probability distribution follows the Wasserstein gradient flow, while the other one is always at the equilibrium. The convergence of the quasistatic dynamics to the mixed Nash equilibrium is shown under mild conditions. Inspired by the continuous dynamics of probability distributions, we derive a quasistatic Langevin gradient descent method with inner-outer iterations, and test the method on different problems, including training mixtures of Generative Adversarial Networks (GANs).
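The inner-outer structure of a quasistatic Langevin method can be sketched with a particle discretization. This is an illustrative toy, not the speakers' implementation; the payoff gradients, step size, and temperature are placeholder assumptions:

```python
import numpy as np

def quasistatic_langevin(f_grad_x, f_grad_y, x, y, n_outer=100, n_inner=50,
                         lr=0.05, temp=0.01, rng=None):
    """Illustrative inner-outer quasistatic Langevin dynamics for a particle
    approximation of a mean-field two-player zero-sum game.

    x: particles for the minimizing player's distribution.
    y: particles for the maximizing player's distribution.
    f_grad_x(x, y): averaged payoff gradient in x (over the y-particles).
    f_grad_y(y, x): averaged payoff gradient in y (over the x-particles).
    """
    rng = rng or np.random.default_rng(0)
    for _ in range(n_outer):
        # Inner loop: drive the second player (approximately) to equilibrium.
        for _ in range(n_inner):
            noise = rng.standard_normal(y.shape)
            y = y + lr * f_grad_y(y, x) + np.sqrt(2 * lr * temp) * noise
        # Outer loop: one noisy descent step for the first player,
        # taken against the (near-)equilibrated opponent.
        noise = rng.standard_normal(x.shape)
        x = x - lr * f_grad_x(x, y) + np.sqrt(2 * lr * temp) * noise
    return x, y
```

On a strongly convex-concave toy payoff such as f(x, y) = x²/2 - y²/2, both particle clouds contract toward the equilibrium at the origin.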
 Presenter Abhishek Halder University of California, Santa Cruz Date: Jan 26, 2022, 5:00-6:00 p.m. LA time. Title: Generalized Gradient Flows for Stochastic Prediction, Estimation, Learning and Control Abstract: This talk will outline a recent and fast-moving development in systems-control research, where new geometric interpretations for the stochastic prediction, filtering, learning and control problems are emerging. At the heart of this development lies the Wasserstein metric and the theory of optimal mass transport, which induces a Riemannian-like structure on the manifold of joint probability density functions supported over the state space. It turns out that the equations of prediction and filtering can be viewed as the gradient flows of certain Lyapunov functionals with respect to suitable notions of distance on such infinite dimensional manifolds. These ideas lead to infinite dimensional proximal recursions. The well-known exact filters, such as the Kalman-Bucy and the Wonham filters, have been explicitly recovered in this setting. Interestingly, the same framework can be used to design gradient descent algorithms that numerically implement the proximal recursions over probability-weighted scattered point clouds, avoid function approximation, and hence have extremely fast runtimes. These techniques also enable a computational approach for mean-field learning of neural networks from data. The same line of ideas appears naturally in the finite horizon optimal density control (a.k.a. Schrödinger bridge) problems, and there too, the Wasserstein proximal algorithms help solve certain Schrödinger bridge problems with nonlinear prior dynamics. The latter can be seen as the continuum limit of decentralized stochastic optimal control, and is of contemporary engineering interest for shaping a distribution over time via feedback, with applications in automated driving, power systems, and process control.
 Presenter Yiwei Wang Illinois Institute of Technology Date: Feb 9, 2022. Title: TBA Abstract: TBA
 Presenter Lulu Kang Illinois Institute of Technology Date: Feb 16, 2022. Title: TBA Abstract: TBA
 Presenter Lexing Ying Stanford University Date: TBA Title: TBA Abstract: TBA
 Presenter Franca Hoffmann California Institute of Technology Date: Feb, 2022. Title: TBA Abstract: TBA
 Presenter Flavien Leger INRIA Paris Date: Feb 23, 2022. Title: TBA Abstract: TBA
 Presenter Yifei Wang Stanford University Date: March 2nd, 2022. Title: TBA Abstract: TBA
 Presenter Yuan Gao Purdue University Date: April, 2022. Title: TBA Abstract: TBA
 Presenter Jiacheng Zhang University of California, Berkeley Date: May 4th, 5:00-6:00 p.m., LA time, 2022. Title: TBA Abstract: TBA
 Presenter Haizhao Yang Purdue University Date: May, 2022. Title: TBA Abstract: TBA
 Presenter Tryphon Georgiou University of California, Irvine Date: Spring, 2022. Title: TBA Abstract: TBA

Optimal transport and Mean field games Seminar, Fall 2021.

Organizer: Wuchen Li.

Regular seminar talk time and location: Wednesday 5:00 p.m.-6:00 p.m., LA time @ Zoom.

 Presenter Lauro Langosco di Langosco ETH Zurich Date: Sep, 8th, 11:00 a.m.-11:50 a.m. Title: Neural Variational Gradient Descent Abstract: Particle-based approximate Bayesian inference approaches such as Stein Variational Gradient Descent (SVGD) combine the flexibility and convergence guarantees of sampling methods with the computational benefits of variational inference. In practice, SVGD relies on the choice of an appropriate kernel function, which impacts its ability to model the target distribution -- a challenging problem with only heuristic solutions. We propose Neural Variational Gradient Descent (NVGD), which is based on parameterizing the witness function of the Stein discrepancy by a deep neural network whose parameters are learned in parallel to the inference, mitigating the necessity to make any kernel choices whatsoever.
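For context, the fixed-kernel SVGD update that NVGD generalizes can be written in a few lines. This is an illustrative sketch with an RBF kernel; the bandwidth and step size are assumptions, and NVGD replaces the fixed kernel with a learned witness function:

```python
import numpy as np

def svgd_step(x, grad_logp, eps=0.05, h=1.0):
    """One Stein Variational Gradient Descent step with an RBF kernel.

    x: (n, d) array of particles; grad_logp: (n, d) score of the target
    density evaluated at the particles.
    """
    diff = x[:, None, :] - x[None, :, :]          # (n, n, d): x_i - x_j
    sq = (diff ** 2).sum(-1)                      # squared pairwise distances
    k = np.exp(-sq / (2 * h))                     # RBF kernel matrix
    # Driving term: kernel-weighted average of the scores.
    drive = k @ grad_logp                         # (n, d)
    # Repulsive term: kernel gradient keeps the particles spread out.
    repulse = (k[:, :, None] * diff).sum(1) / h   # (n, d)
    return x + eps * (drive + repulse) / x.shape[0]
```

Iterating this step on the score of a standard Gaussian, grad_logp(x) = -x, moves an arbitrary initial particle cloud toward an approximation of N(0, 1).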
 Presenter Wumaier Maimaitiyiming UCLA Date: Sep 15, 5:00 p.m.-6:00 p.m. Title: A dynamic mass-transport method for Poisson-Nernst-Planck systems Abstract: In this talk, I will present some recent work on developing structure-preserving (i.e., positivity preserving, mass conservative, and energy dissipating) methods for numerically simulating Poisson-Nernst-Planck (PNP) systems. Motivated by Benamou-Brenier's dynamic formulation of the quadratic Wasserstein metric, we introduce a Wasserstein-type distance suitable for our problem setting; we then construct a variational scheme that falls into the Jordan-Kinderlehrer-Otto framework. The variational scheme is a first-order (in time) approximation of the original PNP system. To reduce the computational cost, we further approximate the constraints and the objective function in the underlying Wasserstein-type distance; these approximations do not destroy the first-order accuracy. With a standard spatial discretization, we obtain a finite-dimensional strictly convex minimization problem with linear constraints. The admissible set in the variational problem is a subset of the probability space and the Wasserstein-type distance is nonnegative; therefore our scheme is a positivity preserving, mass conservative, and energy dissipating scheme.
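The Jordan-Kinderlehrer-Otto idea behind such variational schemes can be illustrated in the simplest setting: a pure potential energy in a particle discretization, where the Wasserstein proximal step decouples particle-by-particle. This is a toy sketch of the general framework, not the PNP scheme of the talk:

```python
import numpy as np
from scipy.optimize import minimize

def jko_step(x_prev, V, tau=0.1):
    """One JKO (minimizing-movement) step for the potential energy
    E(rho) = integral of V d(rho), in a particle discretization.

    For this energy the Wasserstein proximal step decouples into an
    ordinary proximal step per particle:
        x^{k+1} = argmin_x |x - x^k|^2 / (2 tau) + V(x).
    """
    def objective(x):
        return np.sum((x - x_prev) ** 2) / (2 * tau) + np.sum(V(x))
    return minimize(objective, x_prev).x
```

For the quadratic potential V(x) = x²/2 each step is the implicit Euler map x/(1 + tau), so repeated JKO steps drive the particles to the energy minimizer, illustrating the energy-dissipation property that the full PNP scheme preserves.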
 Presenter Arnulf Jentzen University of Münster Date: Sep 22, 5:00-6:00 p.m., LA time. Title: On neural network approximations for partial differential equations and convergence results for gradient descent optimization methods Abstract: In the first part of this talk we show that artificial neural networks (ANNs) with rectified linear unit (ReLU) activation have the fundamental capacity to overcome the curse of dimensionality in the numerical approximation of semilinear heat partial differential equations with Lipschitz continuous nonlinearities. In the second part of this talk we present recent convergence analysis results for gradient descent (GD) optimization methods in the training of ANNs with ReLU activation. Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains -- even in the simplest situation of the plain vanilla GD optimization method with random initializations -- an open problem to prove (or disprove) the conjecture that the risk of the GD optimization method converges in the training of ANNs with ReLU activation to zero as the width of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity. In the second part of this talk we, in particular, present the affirmative answer of this conjecture in the special situation where the probability distribution of the input data is absolutely continuous with respect to the continuous uniform distribution on a compact interval and where the target function under consideration is piecewise linear.
 Presenter Jose A. Carrillo University of Oxford Date: Oct 5th, 10:00 a.m.-10:55 a.m., LA time. Title: Consensus-Based Interacting Particle Systems and Mean-field PDEs for Optimization and Sampling Abstract: We will start with a quick review of consensus models for swarming. Stability of patterns in these models will be briefly discussed. Then we provide an analytical framework for investigating the efficiency of a consensus-based model for tackling global optimization problems. We justify the optimization algorithm in the mean-field sense, showing the convergence to the global minimizer for a large class of functions. An efficient algorithm for large dimensional problems is introduced. Theoretical results on consensus estimates will be illustrated by numerical simulations. We then develop these ideas to propose a novel method for sampling and also optimization tasks based on a stochastic interacting particle system. We explain how this method can be used for the following two goals: (i) generating approximate samples from a given target distribution, and (ii) optimizing a given objective function. This approach is derivative-free and affine invariant, and is therefore well-suited for solving complex inverse problems, allowing (i) to sample from the Bayesian posterior and (ii) to find the maximum a posteriori estimator. We investigate the properties of this family of methods in terms of various parameter choices, both analytically and by means of numerical simulations. This talk is a summary of works in collaboration with Y.-P. Choi, O. Tse, C. Totzeck, F. Hoffmann, A. Stuart and U. Vaes.
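The consensus-based optimization dynamics described above admit a compact particle sketch: particles drift toward a Gibbs-weighted consensus point, with a multiplicative noise that vanishes once consensus is reached. The parameter values below are illustrative assumptions; the talk's analysis concerns the mean-field (continuum) limit:

```python
import numpy as np

def cbo(f, x, n_steps=1000, lam=1.0, sigma=0.5, beta=50.0, dt=0.01, rng=None):
    """Consensus-based optimization (illustrative sketch).

    f: objective, mapping an (n, d) particle array to n values.
    x: (n, d) initial particles. Returns the final consensus point.
    """
    rng = rng or np.random.default_rng(0)
    for _ in range(n_steps):
        fx = f(x)
        w = np.exp(-beta * (fx - fx.min()))       # stabilized Gibbs weights
        xbar = (w[:, None] * x).sum(0) / w.sum()  # weighted consensus point
        drift = x - xbar
        # Drift toward consensus; noise scales with distance to consensus,
        # so exploration dies out as the ensemble agrees.
        x = x - lam * dt * drift + sigma * np.sqrt(dt) * drift * rng.standard_normal(x.shape)
    return xbar
```

By the Laplace principle, for large beta the consensus point concentrates near the global minimizer of f for a broad class of objectives.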
 Presenter Dmitry Vorotnikov Universidade de Coimbra Date: Oct 6, 11:00 a.m.-11:50 a.m., LA time. Title: On some submanifolds of the Otto-Wasserstein space and related curvature-driven flows Abstract: I will describe hidden dynamical similarities between incompressible fluids and inextensible strings/networks, and the link of the latter ones to optimal transport. After that, I will speak about well-posedness and long-time asymptotics of some gradient flows on the related submanifolds of the Otto-Wasserstein space.
 Presenter Hong Duong University of Birmingham Date: Oct 13, 1:00 p.m.-1:50 p.m., LA time. Title: GENERIC and Hypocoercivity Abstract: In this talk, we consider two frameworks to study non-reversible Markov processes, namely Hypocoercivity Theory and GENERIC (General Equations for Non-Equilibrium Reversible-Irreversible Coupling). Hypocoercivity Theory provides a robust functional-analytic framework for the study of exponentially fast convergence to equilibrium for non-reversible processes. On the other hand, the GENERIC framework, which has been used widely in physics and engineering, provides a systematic method to derive thermodynamically consistent (reversible-irreversible) evolution equations. We will compare the two theories, discuss their connections to the large deviation principle, and discuss applications to Piecewise Deterministic Markov Processes. This talk is based on my collaborative works [1,2]. References: [1] M. H. Duong, M. A. Peletier, J. Zimmer. GENERIC formalism of a Vlasov-Fokker-Planck equation and connection to Large Deviation Principle. Nonlinearity 26, 2951-2971, 2013. [2] M. H. Duong and M. Ottobre. GENERIC and Hypocoercivity. In preparation.
 Presenter Alpár Richárd Mészáros Durham University Date: Oct 20, 2:00 p.m.-3:00 p.m. Title: Mean Field Games systems under displacement monotonicity Abstract: In this talk we show how to obtain global in time well-posedness of mean field games systems under the so-called displacement monotonicity condition on the data. We consider a general class of non-separable Hamiltonians and final data that possess this monotonicity condition. In particular, our uniqueness result seems to be the first one in the literature beyond the well-known one in the so-called Lasry-Lions monotonicity regime. This monotonicity yields in addition the regularity of solutions independently of the intensity of the individual noise, which can be taken to be degenerate. In the talk we aim to use only elementary arguments, and in particular we do not pass through the master equation. The talk will be based on several works in collaboration with W. Gangbo, C. Mou and J. Zhang.
 Presenter Aaron Palmer UCLA Date: Oct 27, 5:00 p.m.-6:00 p.m. Title: Information in Mean Field Games and Stochastic Control Abstract: In economics and many disciplines of engineering, the 'available information' (i.e., consumer/market data, sensor measurements) plays an essential role in the design of models and control mechanisms. The rapidly developing field of mean field games and control theory builds a framework to analyze models consisting of many interacting agents that are relevant to these fields. 'Information structures' in these models can dictate how the agents cooperate as well as how they determine their strategies from available data. In the 'mean field limit' as the number of agents goes to infinity, the limiting mean field game is known to not distinguish between information structures when the interactions are sufficiently 'long range'. We investigate further this phenomenon to better understand the role that information does play in these problems. The 'Dyson game' is a stochastic differential game of many players, where the distinction of information structures persists in the mean field limit. This is a result of 'short range' or singular interactions, and the limiting mean field game incorporates the microscopic fluctuations in the mean field cost, analogous to a thermodynamic limit towards a free energy density. A future goal of this line of research is to better understand such limits and how information plays a role. We will then discuss high-dimensional Markov decision processes with 'partial observation' of the state. Under assumptions of 'long range' interactions, the 'partial observation' vanishes in the mean field limit, which effectively becomes fully observable (as a nonlinear, deterministic, optimal control problem). We investigate the role that information plays in the fluctuations about the mean field limit, which are determined by the solution of a linear-quadratic-Gaussian stochastic control problem. In this way, an improved control policy may be derived that incorporates the 'partial observations' using a form of extended Kalman filter. We finish with a discussion of how understanding the role of information in the fluctuations can help us determine, at least approximately, the effect of information on the mean field limit with 'short range' interactions.
 Presenter Hongkang Yang Princeton University Date: Nov 3, 5:00-6:00 p.m., LA time. Title: Generalization Error of GAN from the Discriminator's Perspective Abstract: The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood. In particular, GAN is vulnerable to the memorization phenomenon, the eventual convergence to the empirical distribution. We consider a simplified GAN model with the generator replaced by a density, and analyze how the discriminator contributes to generalization. We show that with early stopping, the generalization error measured by the Wasserstein metric escapes from the curse of dimensionality, even though, in the long term, memorization is inevitable. Additionally, we present a hardness of learning result for the Wasserstein GAN. This talk will be based on works in collaboration with W. E.
 Presenter Daniel Matthes Technische Universität München Date: Nov 10, 11:00-11:50 a.m., LA time. Title: Two Diffusion Equations in Lagrangian Discretization Abstract: The porous medium and thin film equations are, respectively, second and fourth order degenerate parabolic equations with finite speed of propagation. We present fully Lagrangian discretizations for both, in one space dimension. Despite being quite straightforward in their definitions, these discretizations have remarkable qualitative properties. Specifically, we prove inheritance of asymptotic self-similarity (at optimal rates), and of the waiting time phenomenon from the PDEs. The proofs are based on the equations' gradient flow structure, and on the dissipation of weighted entropy functionals, respectively. We close by showcasing several failed attempts to generalize the discretization concept to higher space dimensions. This talk is based on joint results with Horst Osberger, and with Julian Fischer.
 Presenter Jack Xin University of California, Irvine Date: Dec 1, 2021, 5:00 p.m.-6:00 p.m. LA time. Title: DeepParticle: deep-learning invariant measure by minimizing Wasserstein distance on data generated from an interacting particle method Abstract: High dimensional partial differential equations (PDE) are challenging to compute by traditional mesh based methods, especially when their solutions have large gradients or concentrations at unknown locations. Mesh-free methods are more appealing; however, they remain slow and expensive when long-time, well-resolved computations are necessary. We present DeepParticle, an integrated deep learning (DL), optimal transport (OT), and interacting particle (IP) approach through a case study of Fisher-Kolmogorov-Petrovsky-Piskunov front speeds in incompressible flows. PDE analysis reduces the problem to a computation of the principal eigenvalue of an advection-diffusion operator. Stochastic representation via the Feynman-Kac formula makes possible a genetic interacting particle algorithm that evolves the particle distribution to a large time invariant measure, from which the front speed is extracted. The invariant measure is parameterized by a physical parameter (the Peclet number). We learn this family of invariant measures by training a physically parameterized deep neural network on affordable data from IP computation at moderate Peclet numbers, then predict at a larger Peclet number when IP computation is expensive. The network is trained by minimizing a discrete Wasserstein distance from OT theory. The DL prediction serves as a warm start to accelerate IP computation, especially for a 3-dimensional time dependent Kolmogorov flow with chaotic streamlines. Our methodology extends to a more general context of deep-learning stochastic particle dynamics. This is joint work with Zhongjian Wang (University of Chicago) and Zhiwen Zhang (University of Hong Kong).
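For equal-size samples in one dimension, the discrete 2-Wasserstein distance used as a training loss has a closed form via sorting, since the optimal coupling is monotone. This is a sketch of the general idea only; in higher dimensions, computing the discrete distance requires solving a transport/assignment problem:

```python
import numpy as np

def wasserstein2_1d(x, y):
    """2-Wasserstein distance between two equal-size 1D empirical measures.

    In one dimension the optimal coupling is the monotone rearrangement,
    so the distance reduces to matching sorted samples.
    """
    x, y = np.sort(np.asarray(x)), np.sort(np.asarray(y))
    return np.sqrt(np.mean((x - y) ** 2))
```

For example, two samples that are translates of each other are at distance equal to the size of the shift.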
 Presenter Shu Liu Georgia Institute of Technology Date: Dec 8, 2021, 5:00-6:00 p.m. LA time. Title: Neural Parametric Fokker-Planck equations Abstract: We develop and analyze a numerical method proposed for solving high-dimensional Fokker-Planck equations by leveraging the generative models from deep learning. Our starting point is a formulation of the Fokker-Planck equation as a system of ordinary differential equations (ODEs) on finite-dimensional parameter space with the parameters inherited from generative models such as normalizing flows. We call such ODEs "neural parametric Fokker-Planck equations". The fact that the Fokker-Planck equation can be viewed as the 2-Wasserstein gradient flow of the relative entropy (also known as KL divergence) allows us to derive the ODE as the 2-Wasserstein gradient flow of the relative entropy constrained on the manifold of probability densities generated by neural networks. For numerical computation, we design a bi-level minimization scheme for the time discretization of the proposed ODE. Such an algorithm is sampling-based, which can readily handle computations in higher-dimensional space. Moreover, we establish bounds for the asymptotic convergence analysis as well as the error analysis for both the continuous and discrete schemes of the neural parametric Fokker-Planck equation. Several numerical examples are provided to illustrate the performance of the proposed algorithms and analysis. ArXiv: https://arxiv.org/abs/2002.11309
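The gradient-flow fact quoted above, that the Fokker-Planck equation is the 2-Wasserstein gradient flow of the KL divergence, also underlies standard particle samplers such as the unadjusted Langevin algorithm. The sketch below is this classical sampler, not the paper's parametric method; the step size is an assumption:

```python
import numpy as np

def unadjusted_langevin(grad_V, x0, n_steps=2000, step=0.01, rng=None):
    """Unadjusted Langevin algorithm: a particle discretization of the
    Fokker-Planck equation, i.e. of the Wasserstein-2 gradient flow of
    KL(rho || exp(-V)). Each particle follows
        x <- x - step * grad_V(x) + sqrt(2 * step) * noise.
    """
    rng = rng or np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_V(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x
```

With V(x) = x²/2 the particles equilibrate to (approximately) the standard Gaussian, the minimizer of the relative entropy.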
 Presenter Wuchen Li University of South Carolina Date: Dec 15, 2021, 5:00-6:00 p.m. LA time. Title: Controlling regularized conservation laws via entropy-entropy flux pairs Abstract: In this talk, we study a class of variational problems for regularized conservation laws with Lax's entropy-entropy flux pairs. We first introduce a modified optimal transport space based on conservation laws with diffusion. Using this space, we demonstrate that conservation laws with diffusion are "flux-gradient flows". We next construct variational problems for these flows, for which we derive dual PDE systems for regularized conservation laws. Several examples, including traffic flow and Burgers' equation, are presented. Incorporating both primal-dual algorithms and monotone schemes, we successfully compute the control of conservation laws. This is based on a joint work with Siting Liu and Stanley Osher.
 Presenter Francesca Boso Stanford University Date: TBA Title: TBA Abstract: TBA
 Presenter Frank Nielsen Sony CSL, Japan, Ecole Polytechnique Date: TBA Title: TBA Abstract: TBA
 Presenter Marie-Therese Wolfram University of Warwick Date: TBA, 2022 Title: On structure preserving schemes for Wasserstein gradient flows Abstract: TBA

Optimal transport and Mean field games Seminar, Summer 2021.

Organizer: Wuchen Li.

Regular seminar talk time and location: Wednesday 5:00 p.m.-6:00 p.m., LA time @ Zoom.

 Presenter Leonard Wong University of Toronto Date: June 2, 5:00-6:00 p.m. LA time. Title: Logarithmic divergences and statistical applications Abstract: Divergences, such as the KL-divergence, play important roles in statistics, information theory, and data science. In the first part of the talk, we give a geometric interpretation of divergence via optimal transport maps. The Bregman divergence, for example, corresponds to the Brenier map of the Wasserstein transport. Next, we consider a family of logarithmic divergences which is a nonlinear extension of the Bregman divergence and arises from the Dirichlet transport. A distinguished example is the logarithmic divergence of the Dirichlet perturbation model which is a multiplicative analogue of the additive normal model. Using this divergence, we formulate statistical applications including clustering and nonlinear principal component analysis. Based on ongoing joint work with Zhixu Tao, Jiaowen Yang and Jun Zhang.
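The Bregman divergence referred to above has a one-line definition; as a sanity check, the negative-entropy generator recovers the KL divergence and the squared-norm generator recovers squared Euclidean distance. This is a generic sketch, unrelated to the Dirichlet-transport specifics of the talk:

```python
import numpy as np

def bregman(phi, grad_phi, p, q):
    """Bregman divergence D_phi(p, q) = phi(p) - phi(q) - <grad_phi(q), p - q>,
    i.e. the gap between phi at p and its linearization around q."""
    return phi(p) - phi(q) - np.dot(grad_phi(q), p - q)
```

With phi(v) = sum(v log v) on the simplex this yields KL(p || q); with phi(v) = |v|² it yields |p - q|².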
 Presenter Levon Nurbekyan UCLA Date: June 9, 5:00-6:00 p.m. LA time. Title: Optimal Transport for Parameter Identification of Chaotic Dynamics via Invariant Measures Abstract: Parameter identification determines the essential system parameters required to build real-world dynamical systems by fusing crucial physical relationships and experimental data. However, the data-driven approach faces many difficulties, such as discontinuous or inconsistent time trajectories and noisy measurements. The ill-posedness of the inverse problem comes from the chaotic divergence of the forward dynamics. Motivated by the challenges, we shift from the Lagrangian particle perspective to the state space flow field's Eulerian description. Instead of using pure time trajectories as the inference data, we treat statistics accumulated from the Direct Numerical Simulation (DNS) as the observable. The continuous analog of the latter is the physical invariant probability measure which is a distributional solution of the stationary continuity equation. Thus, we reformulate the original parameter identification problem as a data-fitting, PDE-constrained optimization problem. A finite-volume upwind scheme and the so-called teleportation regularization are used to discretize and regularize the forward problem. We present theoretical regularity analysis for evaluating gradients of optimal transport costs and introduce two different formulations for efficient gradient calculation. Numerical results using the quadratic Wasserstein metric from optimal transport demonstrate the robustness of the novel approach for chaotic system parameter identification.
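The Eulerian idea, fitting parameters by matching invariant measures rather than time trajectories, can be illustrated on a one-dimensional Ornstein-Uhlenbeck toy problem, with the mismatch measured by the sorted-sample 2-Wasserstein distance. All function names and parameter values here are illustrative assumptions; the talk treats chaotic dynamics with a PDE-constrained optimization formulation rather than a grid search:

```python
import numpy as np

def invariant_samples(theta, n=4000, dt=0.01, burn=2000, rng=None):
    """Approximate samples from the invariant measure of the OU process
    dX = -theta * X dt + dW, obtained by long-time Euler-Maruyama simulation."""
    rng = rng or np.random.default_rng(0)
    x = np.zeros(n)
    for _ in range(burn):
        x = x - theta * x * dt + np.sqrt(dt) * rng.standard_normal(n)
    return x

def fit_theta(data, candidates):
    """Grid-search the drift parameter by matching invariant measures in
    the 1D 2-Wasserstein distance (monotone coupling of sorted samples)."""
    d_sorted = np.sort(data)
    best, best_cost = None, np.inf
    for th in candidates:
        s = np.sort(invariant_samples(th, n=len(data)))
        cost = np.mean((s - d_sorted) ** 2)  # squared W2 between empiricals
        if cost < best_cost:
            best, best_cost = th, cost
    return best
```

The invariant law of this OU process is N(0, 1/(2 theta)), so data drawn from N(0, 1/4) should select theta = 2.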
 Presenter Wuchen Li U of SC Date: June 16, 5:00-6:00 p.m. LA time. Title: Entropy dissipation via information Gamma calculus Abstract: In this talk, we present convergence behaviors of some non-gradient degenerate stochastic differential equations towards their invariant distributions. Our method extends the connection between Gamma calculus and Hessian operators in the Wasserstein space. In detail, we apply Lyapunov methods in the space of probabilities, where the Lyapunov functional is chosen as the relative Fisher information. We derive the Fisher information induced Gamma calculus and Lyapunov constant to handle non-gradient drift vector fields and degenerate diffusion matrices. Several examples are provided for non-reversible Langevin dynamics, sub-Riemannian diffusion processes, and variable-dependent underdamped Langevin dynamics. This is based on joint works with Qi Feng.
 Presenter Daniel McKenzie UCLA Date: June 23rd, 5:00-6:00 p.m. Title: Learning to predict Nash equilibria from data using fixed point networks Abstract: We study the problem of predicting the outcome of a contextual game, given only the context, and assuming that the players' cost functions are unknown. We use the recently introduced Fixed Point Network (FPN) framework to phrase this as a learning problem using historical data consisting of pairs of context and game outcomes. Using several "tricks" (e.g. Davis-Yin operator splitting, constraint decoupling) we improve the efficiency of this scheme to the extent that it can be readily applied to large games with complicated constraint sets. Finally, we demonstrate the efficacy of our proposed scheme (dubbed Nash-FPNs) on a collection of real-world traffic routing problems.
 Presenter Yann Brenier CNRS, DMA-ENS, 45 rue d'Ulm, Paris, France Date: June 30th, 11:00-11:50 a.m. Title: Initial value problems viewed as generalized MFG/OT problems with matrix-valued density fields Abstract: The initial value problem for many important PDEs (Burgers, Euler, Hamilton-Jacobi, Navier-Stokes equations, systems of conservation laws with convex entropy, etc.) can often be reduced to a convex minimization problem that looks like a generalized optimal transport problem involving matrix-valued density fields. As a matter of fact, the time boundary conditions enjoy a backward-forward structure just as in mean-field game theory.
 Presenter Stanley Osher UCLA Date: July 8th, 5:00 p.m.-6:00 p.m. Title: Mean field games with applications Year 3
 Presenter Stanley Osher, Han Zhu, Dana Nau UCLA, University of Houston, University of Maryland Date: July 14th, August 6th Title: Mean field games with applications Year 3

Optimal transport and Mean field games Seminar, Spring 2021.

Organizer: Wuchen Li.

Regular seminar talk time and location: Wednesday 5:00 p.m.-6:00 p.m., LA time @ Zoom.

 Presenter Wuchen Li University of South Carolina Date: Jan 13, 5:00 p.m. to 6:00 p.m., LA time. Title: Transport Information Bregman Divergence Abstract: I will tell a tale of transport information geometry. We study Bregman divergences in probability density space embedded with the Wasserstein-2 metric. Several properties and dualities of transport Bregman divergences are provided. In particular, we derive the transport Kullback-Leibler (KL) divergence by a Bregman divergence of negative Boltzmann-Shannon entropy in Wasserstein-2 space. We also derive analytical formulas of transport KL divergence for one-dimensional probability densities and Gaussian families.
 Presenter Yifei Wang Stanford University Date: Jan 20, 5:00 p.m. to 6:00 p.m., LA time. Title: Information Newton’s flows Abstract: We introduce a framework for Newton's flows in probability space with information metrics, named information Newton's flows. Extending the relationship between overdamped Langevin dynamics and Wasserstein gradient flows of Kullback-Leibler (KL) divergence, we derive Newton's Langevin dynamics from Wasserstein Newton's flows. We design sampling efficient variational methods in affine models and reproducing kernel Hilbert space (RKHS) to approximate Wasserstein Newton's directions. Convergence results of the proposed information Newton's method with approximated directions are established. Several numerical examples from Bayesian sampling problems are shown to demonstrate the effectiveness of the proposed method.
 Presenter Yuhan Kang University of Houston Date: Jan 20, 6:00 p.m. to 6:30 p.m., LA time. Title: Task Selection and Route Planning for Mobile CrowdSensing Using Multi-Population Mean-Field Games Abstract: With the increasing deployment of mobile vehicles, such as mobile robots and unmanned aerial vehicles (UAVs), it is foreseen that they will play an important role in mobile crowd sensing (MCS). Specifically, mobile vehicles equipped with sensors and communication devices are able to collect massive data due to their fast and flexible mobility in MCS systems. On the other hand, energy efficiency is a vital metric for mobile vehicle systems, especially for MCS, where a large number of vehicles are needed to collect enough sensing data. In this paper, we consider a UAV-assisted MCS system where UAVs are owned by different operators or individuals who compete against others for limited sensing resources. We investigate the joint task selection and route planning problem for such an MCS system. However, since the structural complexity and computational complexity of the original problem are very high due to the large number of UAVs, we propose a multi-population Mean-Field Game (MFG) problem by simplifying the interaction between UAVs as a distribution over their strategy space, known as the mean-field term. To solve the multi-population MFG problem efficiently, we propose a G-prox primal-dual hybrid gradient (PDHG) algorithm whose computational complexity is independent of the number of UAVs. Numerical results show that the proposed multi-population MFG scheme and algorithm are highly effective and efficient.
 Presenter Hao Gao University of Houston Date: Jan 20, 6:30 p.m. to 7:00 p.m., LA time. Title: Mean Field Game Approach in Social Network and UAVs Abstract: In this presentation, I will talk about the applications of mean field games in the Social Network (SN) and UAVs. For SNs, we study the belief and opinion evolution which can aid in understanding how people influence others’ decisions through social relationships as well as provide a solid foundation for many valuable social applications. We formulate the opinion evolution in SN as a high-dimensional mean field game and then apply the GAN-liked neural network, named APAC-net to tractably solve it. For UAVs, we consider the scenario of emergent communication after a disaster in a metropolitan area. Teams of low altitude rotary-wing UAVs are going to provide quality communication services with line-of-sight links. We formulate the velocity control problem of a large number of UAVs as a mean field game and then apply the G-prox primal dual hybrid gradient (PDHG) method to solve it.
 Presenter Jiaxi Zhao Stony Brook University Date: Jan 27, 5:00 p.m. to 6:00 p.m., LA time. Title: Scaling limits of Wasserstein Gaussian mixture models Abstract: We define 'canonical' Wasserstein metrics on probability simplices over one-dimensional bounded homogeneous lattices via a scaling limit of the Wasserstein metric on Gaussian mixture models (GMMs). Next, we construct several generalizations of this metric in different models, including inhomogeneous lattices, second-order metrics, and metrics with extra degrees of freedom on the mean (location) parameters. These models are of special interest in various settings, such as numerical schemes for PDEs. To illustrate this point, we further study the gradient flows on these models for three typical functionals, namely potential, internal, and interaction energy functionals. We prove results such as conservation laws and long-time existence for these gradient flows and study their relation to counterparts in the continuous case.
 Presenter Goffredo Chirco INFN - Istituto Nazionale di Fisica Nucleare Date: Feb 3rd, 11:00 a.m. to 11:50 a.m. Title: Lagrangian and Hamiltonian Dynamics for Probabilities on the Statistical Bundle. Abstract: In a non-parametric formalism, we consider the full set of positive probability functions on a finite sample space, and we provide a specific expression for the tangent and cotangent spaces over the statistical manifold, in terms of a Hilbert bundle structure that we call the Statistical Bundle. In this setting, we compute velocities and accelerations of a one-dimensional statistical model using the canonical dual pair of parallel transports and define a coherent formalism for Lagrangian and Hamiltonian mechanics on the bundle. Finally, we show how our formalism provides a consistent framework for accelerated natural gradient dynamics on the probability simplex.
 Presenter Nicolas Garcia Trillos UW Madison Date: Feb 10th, 5:00 p.m. to 6:00 p.m. Title: On Wasserstein gradient flows and the search of neural network architectures Abstract: Neural networks have revolutionized machine learning and artificial intelligence in unprecedented ways, establishing new benchmarks in performance in applications such as image recognition and language processing. Such success has motivated researchers and practitioners in multiple fields to develop further applications. This environment has driven several novel research directions. In particular, one crucial question that has received increased recent attention is related to the design of good neural architectures using data-driven approaches and minimal human intervention. In this talk I will discuss a framework in which ideas from optimal transport can be used to motivate algorithms for the exploration of the architecture space. In the first part of the talk I will abstract the problem of neural architecture search slightly and discuss how optimal transport can motivate first order and second order gradient descent schemes for the optimization of a semi-discrete objective function. I will then return to the original neural architecture search problem, and using the ideas discussed during the first part of the talk, I will motivate two algorithms for neural architecture search called NASGD and NASAGD. I will wrap up by discussing the performance of our algorithms when searching an architecture for a classification problem with the CIFAR-10 data set, and providing some perspective on future research directions. This talk is based on joint work with Felix Morales and Javier Morales.
 Presenter Simon Becker University of Cambridge Date: Feb 17th, 5:00 p.m. to 6:00 p.m. Title: Quantum statistical learning via Quantum Wasserstein natural gradient Abstract: We introduce a new approach towards the statistical learning problem argmin_{ρ(θ)} W_Q(ρ⋆, ρ(θ))², approximating a target quantum state ρ⋆ by a set of parametrized quantum states ρ(θ) in a quantum L²-Wasserstein metric. We solve this estimation problem by considering Wasserstein natural gradient flows for density operators on finite-dimensional C*-algebras. For continuous parametric models of density operators, we pull back the quantum Wasserstein metric such that the parameter space becomes a Riemannian manifold with the quantum Wasserstein information matrix. Using a quantum analogue of the Benamou-Brenier formula, we derive a natural gradient flow on the parameter space. We also discuss certain continuous-variable quantum states by studying the transport of the associated Wigner probability distributions. This is joint work with Wuchen Li.
 Presenter Pierre Monmarche Sorbonne University Date: Feb 24th, 11:00 a.m. to 11:50 a.m. Title: Wasserstein contraction for non-elliptic diffusions on R^d Abstract: A classical result of Sturm and von Renesse is the equivalence between a lower bound on the Ricci curvature and certain properties of the heat semi-group, such as contraction of the Wasserstein metrics or gradient bounds. We will see a similar result for diffusion processes on R^d with constant diffusion matrix, without any assumption of reversibility or hypoellipticity. The case of the (generalized) Langevin diffusion, and the link with Villani's modified entropy method for hypocoercivity, will be discussed.
 Presenter Jiajia Yu RPI Date: March 3rd, 5:00 p.m. to 6:00 p.m., LA time. Title: An efficient and flexible algorithm for dynamic mean field planning and convergence analysis Abstract: Dynamic mean-field planning is a special case of variational mean field games and a generalization of optimal transport. In this talk, I will present an efficient and flexible algorithm to solve dynamic mean-field planning problems based on an accelerated proximal gradient method. Then I will briefly discuss how we generalize the algorithm to mean-field game problems and accelerate it by multilevel and multigrid strategies. Some numerical results will follow to illustrate the efficiency and flexibility of our algorithm. At the end of this talk, I will show that the proposed discrete solution converges to the underlying continuous solution as the grid size increases, by induction on iterations of our algorithm, and confirm the convergence analysis with a numerical experiment.
 Presenter Zebang Shen U Penn Date: March 10th, 5:00 p.m. to 6:00 p.m., LA time. Title: Sinkhorn Natural Gradient for Generative Models Abstract: We consider the problem of minimizing a functional over a parametric family of probability measures, where the parameterization is characterized via a push-forward structure. An important application of this problem is in training generative adversarial networks. In this regard, we propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence. We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically with respect to the desired accuracy. This is in sharp contrast to existing natural gradient methods that can only be carried out approximately. Moreover, in practical applications when only Monte-Carlo type integration is available, we design an empirical estimator for SIM and provide the stability analysis. In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
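For readers unfamiliar with the Sinkhorn divergence that SiNG descends on, the underlying Sinkhorn iterations can be sketched in a few lines. The following is an illustrative NumPy implementation of entropy-regularized optimal transport between two discrete measures; it is a minimal sketch of the classical matrix-scaling scheme, not the SiNG algorithm itself.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=500):
    """Entropy-regularized OT cost between discrete measures a, b with cost matrix C."""
    K = np.exp(-C / eps)             # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # scale columns to match marginal b
        u = a / (K @ v)              # scale rows to match marginal a
    P = u[:, None] * K * v[None, :]  # entropic transport plan
    return np.sum(P * C)             # transport cost <P, C>

# two uniform point clouds on the line, shifted by 0.1
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.1, 1.1, 2.1])
C = (x[:, None] - y[None, :]) ** 2
a = np.full(3, 1 / 3)
b = np.full(3, 1 / 3)
cost = sinkhorn(a, b, C, eps=0.1)
```

With well-separated points and a small regularization, the plan concentrates on the monotone matching, so the cost is close to the unregularized value 3 × (1/3) × 0.1² = 0.01.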
 Presenter Giovanni Conforti École Polytechnique Date: March 17th, 11:00 a.m. to 11:59 a.m., LA time Title: Construction and long time behavior of mean field Schrödinger bridges Abstract: The first goal of this talk is to introduce the mean field Schrödinger problem, which can be cast as the problem of finding the most likely evolution of a system of interacting particles conditionally to the observation of their initial and final configuration. Then, I shall discuss the ergodic behavior of solutions, which are called mean field Schrödinger bridges, and illustrate connections with the turnpike phenomenon in optimal control. Two model examples will be treated in more detail: the case of the simple exclusion process and weakly interacting diffusion processes.
 Presenter Derek Onken Emory University Date: March 24th, 5:00 p.m.-6:00 p.m., LA time Title: A Neural Network Approach for High-Dimensional Optimal Control Abstract: Optimal control (OC) problems aim to find an optimal policy that controls given dynamics over a period of time. For systems with high-dimensional state (for example, systems with many centrally controlled agents), OC problems can be difficult to solve globally. We propose a neural network approach for solving such problems. When trained offline in a semi-global manner, the model is robust to shocks or disturbances that may occur in real-time deployment (e.g., wind interference). Our unsupervised approach is grid-free and scales efficiently to dimensions where grids become impractical or infeasible. We demonstrate the effectiveness of our approach on several multi-agent collision-avoidance problems in up to 150 dimensions.
 Presenter Renyuan Xu University of Oxford Date: March 31st, 11:00 a.m.-11:59 a.m., LA time Title: Interbank lending with benchmark rates: a singular control game Abstract: The market for interbank lending offers an interesting example of strategic interaction among financial institutions in which players react to an average of the actions from other players. One of the widely commented features of the interbank market is the fixing mechanism for interbank benchmark interest rates, the most well-known example of which is the London Interbank Offered Rate (LIBOR), which plays a central role in financial markets. However, this mechanism has been criticized and extensively documented for its vulnerability to manipulations. One of the lessons from the manipulation of LIBOR and other benchmarks is that insufficient attention had been paid to incentives, strategic interactions, mechanism design, and the role of the regulator in such markets. In this talk, we study the interbank lending market via a class of N-player stochastic games with singular controls. We describe Pareto optima for this game and show how they may be achieved through the intervention of a regulator, whose policy is a solution to a high-dimensional control problem. Pareto optima are characterized in terms of the solution to a new class of Skorokhod problems with piecewise-continuous free boundary. Pareto optimal policies are shown to correspond to the enforcement of endogenous bounds on interbank lending rates. Analytical comparison between Pareto optima and Nash equilibria for the case of two players allows us to quantify the impact of regulatory intervention on the stability of the interbank rate. If time allows, we will also discuss the challenges in numerical solutions of such games and the mean-field limit. This is based on joint work with Rama Cont (Oxford) and Xin Guo (UC Berkeley).
 Presenter Wenkai Xu UCL Date: April 7th, 11:00-11:50 a.m. LA time. Title: Learning Deep Kernels for Non-Parametric Two-Sample Tests Abstract: We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power. These tests adapt to variations in distribution smoothness and shape over space, and are especially suited to high dimensions and complex data. By contrast, the simpler kernels used in prior kernel testing work are spatially homogeneous and adaptive only in length-scale. We explain how this scheme includes popular classifier-based two-sample tests as a special case but improves on them in general. We provide the first proof of consistency for the proposed adaptation method, which applies both to kernels on deep features and to simpler radial basis kernels or multiple kernel learning. In experiments, we establish the superior performance of our deep kernels in hypothesis testing on benchmark and real-world data.
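As a point of reference for the "simpler kernels used in prior kernel testing work," here is a minimal NumPy sketch of the unbiased quadratic-time estimator of squared maximum mean discrepancy (MMD) with a fixed Gaussian kernel; the talk's contribution is to replace this fixed kernel with a kernel on learned deep features, trained to maximize test power.

```python
import numpy as np

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples X, Y with a Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
        return np.exp(-d2 / (2 * bandwidth ** 2))
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    # diagonal terms are excluded to make the estimator unbiased
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2 * Kxy.mean()

rng = np.random.default_rng(0)
# H0: both samples from N(0, I); H1: means differ by (1, 1)
same = mmd2_unbiased(rng.normal(0, 1, (500, 2)), rng.normal(0, 1, (500, 2)))
diff = mmd2_unbiased(rng.normal(0, 1, (500, 2)), rng.normal(1, 1, (500, 2)))
```

Under the null the unbiased estimate fluctuates around zero, while a mean shift produces a clearly positive value; a practical test compares the statistic against a permutation-based threshold.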
 Presenter Andrew Duncan Imperial College London Date: April 14th, 2:00-3:00 p.m. LA time. Title: On the geometry of Stein variational gradient descent and related ensemble sampling methods Abstract: Bayesian inference problems require sampling or approximating high-dimensional probability distributions. The focus of this talk is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm. This construction leads to interacting particle systems, the mean-field limit of which is a gradient flow on the space of probability distributions equipped with a certain geometrical structure. This viewpoint is leveraged to shed some light on the convergence properties of the algorithm, in particular addressing the problem of choosing a suitable kernel function. In the last part of the talk, I will discuss some new work where similar analysis is applied to a wider class of ensemble particle-based methods to analyse their performance and robustness under noisy likelihoods. Related work can be found at https://arxiv.org/abs/1912.00894 and https://arxiv.org/abs/2104.03384.
 Presenter Robert Martin Air Force Research Laboratory Date: April 21st, 2:00-3:00 p.m. LA time. Title: How to Train your Digital Twin Abstract: The role of science is the systematic distillation of data into knowledge. Within the current ‘Big Data’ environment, science has struggled to keep pace with the explosion in available data from a variety of sources. While recent developments in scientific machine learning have aimed at addressing this challenge, the existing machine learning theory needs further improvements to provide the necessary confidence required for scientific and engineering applications. Recently, the notion of constructing a ‘Digital Twin’ that mirrors a physical system by incorporating, as closely as possible, all available information and models related to the system has been proposed as a conceptual framework for addressing this challenge. While the benefits of such a modeling paradigm are clear, the practical implementation for specific problems of interest often remains more nebulous. In this talk, a novel framework for unifying dynamical systems, control, and information theoretic concepts in the context of learning, of recent interest within the Air Force Research Laboratory's In-Space Propulsion branch, will be presented. The proposed methodology constitutes a practical approach for the construction and training of a Digital Twin for in-space propulsion applications. The outlook for this framework for a broad class of physical systems that exhibit emergent coherent dynamics, along with suggested future directions for expanding the applicability of the approach towards ever more complex problems, will also be explored.
 Presenter Jianfeng Zhang University of Southern California Date: April 21st, 5:00-6:00 p.m. LA time. Title: Set Values of Mean Field Games Abstract: When a mean field game satisfies certain monotonicity conditions, the mean field equilibrium is unique and the corresponding value function satisfies the so-called master equation. In general, however, there can be multiple equilibria, and in the literature one typically studies the asymptotic behaviors of individual equilibria of the corresponding $N$-player game. We instead study the set of values over all (mean field) equilibria, which we call the set value of the game. We shall establish two crucial properties of the set value: (i) the dynamic programming principle; (ii) the convergence of the set values from the $N$-player game to the mean field game. We emphasize that the set value is very sensitive to the choice of the admissible controls. For the dynamic programming principle, one needs to use closed loop controls (not open loop controls), and it involves a very subtle path dependence issue. For the convergence, one has to restrict to the same type of equilibria for the $N$-player game and for the mean field game. The talk is based on an ongoing joint work with Melih Iseri.
 Presenter Chenchen Mou City University of Hong Kong Date: April 28th, 5:00-6:00 p.m. LA time. Title: Mean field games master equations with non-separable Hamiltonians Abstract: In this talk, we give a structural condition on non-separable Hamiltonians, which we term the displacement monotonicity condition, to study second order mean field games master equations. A rate of dissipation of a bilinear form is brought to bear on a global (in time) well-posedness theory, based on a priori uniform Lipschitz estimates in the measure variable. Displacement monotonicity being sometimes in dichotomy with the widely used Lasry-Lions monotonicity condition, the novelties of this work persist even when restricted to separable Hamiltonians. This is based on joint work with W. Gangbo, A. Meszaros, and J. Zhang.
 Presenter Qin Li UW-Madison Date: May 5th, 5:00-6:00 p.m. LA time. Title: Ensemble Kalman Inversion, Ensemble Kalman Sampler, and mean-field limit Abstract: How to sample from a target distribution is one of the core problems in Bayesian inference. It is used widely in machine learning, data assimilation and inverse problems. During the past decade, ensemble-type algorithms have been very popular, among which Ensemble Kalman Inversion (EKI) [1] and Ensemble Kalman Sampler (EKS) [2] may have garnered the most attention. While numerical experiments suggest consistency and fast convergence, rigorous mathematical justification has been lacking. To prove the validity of the two methods, we utilize the mean-field limit argument. This translates the algorithms into a set of coupled stochastic differential equations, whose mean-field limit (as the number of particles N goes to infinity) is a Fokker-Planck equation that reconstructs the target distribution exponentially fast. We prove the convergence rate is optimal in the Wasserstein sense, meaning the ensemble distribution converges to the target distribution at rate N^{-1/2}. This is joint work with Zhiyan Ding, based on two papers [3, 4]. 1. Ensemble Kalman methods for inverse problems, M. A. Iglesias, K. Law, A. M. Stuart, Inverse Problems 29(4), 2013. 2. Interacting Langevin Diffusions: Gradient Structure and Ensemble Kalman Sampler, A. Garbuno-Inigo, F. Hoffmann, W. Li and A. M. Stuart, SIAM J. Appl. Dyn. Syst. 19(1), 2019. 3. Ensemble Kalman inversion: mean-field limit and convergence analysis, Z. Ding, Q. Li, Stat. Comp. 31(9), 2021. 4. Ensemble Kalman Sampler: mean-field limit and convergence analysis, Z. Ding, Q. Li, SIAM J. Math. Anal. 53(2), 2021.
 Presenter Qiang Liu UT Austin Date: May 12th, 5:00-6:00 p.m. LA time. Title: Recent Applications of Stein's Method in Machine Learning Abstract: Stein's method is a powerful technique for deriving fundamental theoretical results on approximating and bounding distances between probability measures, such as the central limit theorem. Recently, it was found that the key ideas in Stein's method, despite being originally designed as a pure theoretical technique, can be repurposed to provide a basis for developing practical and scalable computational methods for learning and using large scale, intractable probabilistic models. We will give an overview of these developments of Stein's method in machine learning, focusing on two important tools: 1) kernel Stein discrepancy (KSD), which provides a computational tool for approximating and evaluating (via goodness-of-fit tests) distributions with intractable normalization constants, and 2) Stein variational gradient descent (SVGD), which is a deterministic sampling algorithm for finding concise particle-based approximations to intractable distributions that combines the advantages of Monte Carlo, variational inference and numerical quadrature methods. Bio: Qiang Liu is an assistant professor of computer science at UT Austin. He works on the intersection of statistical inference and machine learning & artificial intelligence.
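The SVGD update mentioned above combines a kernel-smoothed gradient of the log-density with a repulsive term that keeps the particles spread out. The following is a minimal NumPy sketch for an RBF kernel with a fixed bandwidth; the practical algorithm typically uses an adaptive (e.g. median-heuristic) bandwidth.

```python
import numpy as np

def svgd_step(x, grad_logp, bandwidth=1.0, step=0.1):
    """One SVGD update for particles x of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]                  # pairwise x_i - x_j, shape (n, n, d)
    K = np.exp(-(diff ** 2).sum(-1) / (2 * bandwidth ** 2))
    # phi_i = (1/n) * sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    repulsion = (K[:, :, None] * diff).sum(axis=1) / bandwidth ** 2
    phi = (K @ grad_logp(x) + repulsion) / len(x)
    return x + step * phi

# demo: transport particles initialized near 3 toward a standard normal target
x = np.linspace(2.0, 4.0, 100)[:, None]
for _ in range(500):
    x = svgd_step(x, lambda u: -u)    # score function of N(0, 1)
```

After enough iterations the particle cloud centers near 0 with spread close to the target's unit variance; the repulsion term is what prevents all particles from collapsing onto the mode.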
 Presenter Nick Alger UT Austin Date: May 19th, 5:00-6:00 p.m. LA time. Title: High Order Tensor Train Taylor Series for Uncertainty Quantification in Distributed Parameter Inverse Problems Governed by PDEs Abstract: Low rank approximation of the Hessian (second derivative) is a popular tool for dimension reduction in statistical inverse problems governed by partial differential equations (PDEs). Compression of higher order derivatives was previously considered intractable, but we show that it is not. We present a new "tensor-free" tensor train compression method, which allows us to efficiently construct low-rank approximations of third, fourth, fifth, and higher order derivatives of the negative log posterior in distributed parameter inverse problems governed by PDEs. We use the resulting truncated Taylor series formed from these derivative tensors, in combination with the Riemannian manifold Hamiltonian Monte Carlo method, to efficiently sample from the posterior.
 Presenter Wonjun Lee and Siting Liu UCLA Date: May 26th, 5:00-6:00 p.m., LA time. Title: Mean field control for Vaccine distribution Abstract: With the invention of the COVID-19 vaccine, shipping and distribution are crucial in controlling the pandemic. In this paper, we build a mean-field variational problem in a spatial domain, which controls the propagation of the pandemic through an optimal transportation strategy for vaccine distribution. Here we integrate the vaccine distribution into the mean-field SIR model designed in our previous paper (https://arxiv.org/abs/2006.01249). Numerical examples demonstrate that the proposed model provides practical strategies for vaccine distribution on a spatial domain.

Optimal transport and Mean field games Seminar, Fall 2020.

Organizer: Wuchen Li.

Regular seminar talk time and location: Wednesday 5:00 p.m.-6:00 p.m., LA time @ Zoom.

 Presenter Samy Wu Fung UCLA Date: Oct 14, 5:00 to 6:00 p.m. LA time. Title: Adversarial Projections for Inverse Problems Abstract: We present a new mechanism, called adversarial projection, that projects a given signal onto the intrinsically low dimensional manifold of true data. This operation can be used for solving inverse problems, which consist of recovering a signal from a collection of noisy measurements. Rather than attempt to encode prior knowledge via an analytic regularizer, we leverage available data to project signals directly onto the (possibly nonlinear) manifold of true data (i.e., regularize via an indicator function of the manifold). Our approach avoids the difficult task of forming a direct representation of the manifold. Instead, we directly learn the projection operator by solving a sequence of unsupervised learning problems, and we prove our method converges in probability to the desired projection. This operator can then be directly incorporated into optimization algorithms in the same manner as Plug and Play methods, but now with added theoretical guarantees. Numerical examples are provided.
 Presenter Wonjun Lee UCLA Date: Oct 21, 5:00 to 6:00 p.m., LA time. Title: The back-and-forth method for Wasserstein gradient flows. Abstract: We present a method to efficiently compute Wasserstein gradient flows. Our approach is based on a generalization of the back-and-forth method (BFM) introduced in [JL19] to solve optimal transport problems. We evolve the gradient flow by solving the dual problem to the JKO scheme. In general, the dual problem is much better behaved than the primal problem. This allows us to efficiently run large scale gradient flow simulations for a large class of internal energies, including singular and non-convex energies.
 Presenter Karthik Elamvazhuthi UCLA Date: Oct 28, 5:00--6:00 p.m., LA time. Title: Probability density stabilization using non-local regularization of PDEs Abstract: In this talk, I will present some recent work on developing particle methods for numerically simulating a class of PDEs that provide mean-field feedback laws for probability density stabilization. These have important applications in problems such as sampling and multi-agent control. We will consider a class of nonlinear PDE models with the velocity and reaction parameters as the control inputs. Since the control inputs for these PDEs depend on the local density, they are not suitable for implementation on a finite number of particles. We construct a particle method by regularizing the local dependence to construct a non-local control law. While the nonlocal approximations make numerical implementation easier, their local limits have good analytical properties from the point of view of understanding long-term behavior.
 Presenter Gabriel Khan Iowa State University Date: Nov 4, 5:00--6:00 p.m., LA time. Title: Complex Geometry and Optimal Transport Abstract: In this talk, we consider the Monge problem of optimal transport, which seeks to find the most cost-efficient movement of resources. In particular, we study the regularity (i.e. continuity/smoothness) of this transport. In recent work (joint with J. Zhang), we show that there is a connection between this question and the "anti-bisectional curvature" of certain Kähler metrics. In this talk, we'll discuss several applications of these results. First, we will answer a question in mathematical finance about the regularity of pseudo-arbitrages, which are investment strategies that beat the market almost surely in the long run. Second, by studying the behavior of anti-bisectional curvature along Kähler-Ricci flow, we will be able to show new results in complex geometry and optimal transport.
 Presenter Tianyi Lin UC Berkeley Date: Nov 11, 5:00 p.m. to 6:00 p.m., LA time. Title: Projection Robust Wasserstein Distance and Riemannian Optimization. Abstract: Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance. Recent work suggests that this quantity is more robust than the standard Wasserstein distance, in particular when comparing probability measures in high dimensions. However, it has been ruled out for practical application because the optimization model is essentially non-convex and non-smooth, which makes the computation intractable. Our contribution in this paper is to revisit the original motivation behind WPP/PRW, but take the hard route of showing that, despite its non-convexity and non-smoothness, and even despite some hardness results proved by Niles-Weed and Rigollet (2019) in a minimax sense, the original formulation for PRW/WPP can be efficiently computed in practice using Riemannian optimization, yielding in relevant cases better behavior than its convex relaxation. More specifically, we provide three simple algorithms with solid theoretical guarantees on their complexity bounds (one in the appendix), and demonstrate their effectiveness and efficiency by conducting extensive experiments on synthetic and real data. This paper provides a first step into a computational theory of the PRW distance and establishes links between optimal transport and Riemannian optimization.
 Presenter Klas Modin Chalmers University of Technology and the University of Gothenburg Date: Nov 18, 11:00 a.m. to 11:59 a.m., LA time. Title: Optimal transport, information geometry, and factorisations of matrices Abstract: In this talk I’m going to outline the geometric approach to optimal transport and analogous constructions in information geometry. Furthermore, I will show how these frameworks give rise to classical matrix factorisations, such as the polar, QR, Cholesky, and spectral factorisations.
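As a concrete instance of the factorisations named in the abstract, the matrix polar factorisation writes A = QP with Q orthogonal and P symmetric positive semi-definite, the linear-algebra analogue of Brenier's polar factorisation in optimal transport. A short sketch via the SVD:

```python
import numpy as np

def polar_factorization(A):
    """Polar factorisation A = Q @ P: Q orthogonal, P symmetric positive semi-definite."""
    U, s, Vt = np.linalg.svd(A)
    Q = U @ Vt                   # the closest orthogonal matrix to A
    P = Vt.T @ np.diag(s) @ Vt   # the symmetric PSD factor, P = (A^T A)^{1/2}
    return Q, P

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
Q, P = polar_factorization(A)
```

Since A = U diag(s) Vt, one checks Q P = U Vt Vt.T diag(s) Vt = A, with Q orthogonal because U and Vt are.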
 Presenter Nicolò De Ponti SISSA, Trieste, Italy Date: Nov 23, 11:00 a.m. to 11:59 a.m., LA time. Title: Entropy-Transport distances between measures and metric measure spaces Abstract: In the first part of the talk we introduce the class of optimal Entropy-Transport problems, a recent generalization of optimal transport in which creation and destruction of mass are also taken into account. We focus in particular on the metric properties of these problems, showing how this theory can produce meaningful distances between nonnegative and finite measures. Inspired by previous work of Gromov and Sturm, we then use these metrics to construct a new class of distances between unbalanced metric measure spaces. This talk is based on a joint collaboration with Andrea Mondino.
 Presenter Xin Guo UC Berkeley Date: Dec 2, 3:00 p.m. to 4:00 p.m., LA time. Title: A mean-field game approach for multi-agent reinforcement learning: convergence and complexity analysis Abstract: Multi-agent reinforcement learning (MARL), despite its popularity and empirical success, suffers from the curse of dimensionality. We propose to approximate cooperative MARL by a mean-field control framework and show that the approximation error is of $O(\frac{1}{\sqrt{N}})$. By establishing appropriate forms of the dynamic programming principle for both the value function and the Q function, we propose a model-free kernel-based Q-learning algorithm (MFC-K-Q), with a linear convergence rate and sample complexity independent of the number of agents N. Empirical studies for the network traffic congestion problem demonstrate that MFC-K-Q outperforms existing MARL algorithms when N is large, for example when N is larger than 50.
 Presenter Ted Moskovitz UCL Date: Dec 9, 11:00 a.m. to 11:59 a.m., LA time. Title: Efficient Wasserstein Natural Gradients for Reinforcement Learning Abstract: A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL). The procedure uses a computationally efficient Wasserstein natural gradient descent that takes advantage of the geometry induced by a Wasserstein penalty to speed optimization. This method follows the recent theme in RL of including divergence penalties in the objective to establish trust regions. Experiments on challenging tasks demonstrate improvements in both computational cost and performance over advanced baselines.
 Presenter Alex Lin UCLA Date: Dec 16, 5:00 p.m. to 6:00 p.m., LA time. Title: Solving Mean-Field Games by APAC-Net Abstract: We present APAC-Net, an alternating population and agent control neural network for solving stochastic mean field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex-concave saddle point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems.