Quantifying Uncertainties in Weight-Parameterized Residual Neural Networks

Abstract

Neural networks (NNs) have been employed as surrogates to replicate input-output maps in complex physical models, accelerating sample-intensive studies such as model calibration and sensitivity analysis, although in most instances the trained NNs are treated deterministically. In the context of probabilistic estimation, Bayesian methods provide an ideal path to infer NN weights while incorporating various sources of uncertainty in a consistent fashion. However, exact Bayesian posterior distributions are extremely difficult to compute or sample from. Variational approaches, as well as ensembling methods, provide viable alternatives that accept varying degrees of approximation and empiricism.
This work focuses on a special class of NN architectures: residual NNs (ResNets). Inspired by the continuous neural ODE analogy, we develop an approach that parameterizes ResNet weight matrices as functions of depth. The choice of parameterization affects the capacity of the network, leading to regularization and improved generalization. More importantly, weight-parameterized ResNets become more amenable to Bayesian treatment because the number of parameters is reduced and the loss, or log-posterior, surface is regularized overall. We will highlight the improvements in training and generalization gained by using weight-parameterized ResNet architectures in the context of various Bayesian NN learning methods.
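As an illustrative sketch only (not the authors' exact formulation), one way to realize a weight-parameterized ResNet is to express each layer's weight matrix as a low-order polynomial in the normalized depth, so that only a few coefficient matrices are learned rather than one matrix per layer. The PyTorch code below assumes hypothetical names (ParamResNet, n_layers, basis_order) and a forward-Euler-style residual update echoing the neural ODE analogy.

```python
import torch
import torch.nn as nn

class ParamResNet(nn.Module):
    """Sketch of a ResNet whose layer weights are a polynomial in depth.

    Instead of one weight matrix per layer, the network learns K coefficient
    matrices C_0, ..., C_{K-1}; the weight at normalized depth t in [0, 1] is
        W(t) = sum_k C_k * t**k,
    reducing the trainable matrices from n_layers to K.
    """

    def __init__(self, dim, n_layers=16, basis_order=3):
        super().__init__()
        self.n_layers = n_layers
        # Coefficient matrices (the only trainable weight matrices).
        self.coeffs = nn.Parameter(0.01 * torch.randn(basis_order, dim, dim))
        self.bias = nn.Parameter(torch.zeros(dim))

    def weight_at(self, t):
        # Evaluate W(t) = sum_k C_k * t**k for a scalar depth t.
        powers = torch.tensor([t ** k for k in range(self.coeffs.shape[0])])
        return torch.einsum("k,kij->ij", powers, self.coeffs)

    def forward(self, x):
        # Residual (forward-Euler-like) update over the depth variable:
        #   x_{l+1} = x_l + (1/L) * tanh(W(t_l) x_l + b)
        h = 1.0 / self.n_layers
        for layer in range(self.n_layers):
            t = layer / max(self.n_layers - 1, 1)
            x = x + h * torch.tanh(x @ self.weight_at(t).T + self.bias)
        return x


if __name__ == "__main__":
    net = ParamResNet(dim=4)
    y = net(torch.randn(8, 4))
    print(y.shape)  # torch.Size([8, 4])
```

In such a sketch, a Bayesian treatment would place the posterior over the few coefficient matrices rather than over every layer's weights, which is the parameter reduction the abstract refers to; the specific basis (polynomial here) is an assumption for illustration.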

Date
Mar 1, 2024
Location
Trieste, Italy