Surrogates

A surrogate or metamodel is an approximation of the actual model function or simulator over the parameter space. Running a surrogate is computationally much cheaper than running the actual simulator, a characteristic that makes surrogates indispensable in many UQ problems. At the same time, the surrogate predictions must remain sufficiently accurate compared to the actual predictions by the simulator. Using an additive error model, the following relation can be considered between the model function (simulator) and its observations:

\[r=f(\chi,\mathbf{q})+\mathbf{\varepsilon}\,,\]

where \(\mathbf{\varepsilon}\) expresses the bias and random noise. Our task is to construct a surrogate \(\tilde{f}(\chi,\mathbf{q})\) for the actual model function (simulator).

Treating the simulator as a black box, a set of training data \(\mathcal{D}=\{(\mathbf{q}^{(i)},r^{(i)})\}_{i=1}^n\) is obtained. There are different techniques for constructing a surrogate; see, for instance, [Smith13], [Ghanem17], and [Gramacy20]. Some of the techniques relevant to CFD applications have been implemented in UQit. Here, we provide a short overview of the theory behind these techniques, explain their implementation in UQit, and provide examples showing how to use them.

Non-intrusive Polynomial Chaos Expansion

As a type of stochastic collocation (SC) method, see e.g. Chapter 20 in [Ghanem17], non-intrusive PCE [Xiu05], [Xiu07] can be used to construct a surrogate:

\[\tilde{f}(\chi,\mathbf{q}) = \sum_{k=0}^K \hat{f}_k(\chi) \Psi_{k}(\xi) \,.\]

There is a one-to-one correspondence between any sample of \(\mathbf{q}\in \mathbb{Q}\) and \(\xi\in\Gamma\), where \(\mathbb{Q}=\bigotimes_{i=1}^p \mathbb{Q}_i\) and \(\Gamma=\bigotimes_{i=1}^p \Gamma_i\). Note that \(\mathbb{Q}_i\) is the admissible space of the i-th parameter, which can be mapped onto \(\Gamma_i\) based on the gPCE rules, see [Xiu02], [Eldred09]. For the details of the non-intrusive PCE method, refer to Standard Polynomial Chaos Expansion.
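
As a simple illustration of this mapping, the following minimal sketch (plain numpy, not part of UQit's API) maps samples of a uniformly distributed parameter q with admissible range [a,b] onto \(\xi\in[-1,1]\), which is the range associated with the Legendre polynomials used by gPCE for uniform parameters:

import numpy as np

def map_to_gamma(q, a, b):
    # Linear gPCE mapping of a uniform parameter q in [a,b] onto xi in [-1,1]
    return 2.0 * (np.asarray(q) - a) / (b - a) - 1.0

q = np.array([1.5, 2.75, 4.0])    # samples of q with admissible range [1.5,4.0]
xi = map_to_gamma(q, 1.5, 4.0)    # -> [-1., 0., 1.]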

Lagrange Interpolation

Theory

As another form of SC-based surrogates, Lagrange interpolation can be considered:

\[\tilde{f}(\chi,\mathbf{q}) = \sum_{k=1}^n \hat{f}_k(\chi) L_k(\mathbf{q}) \,,\]

where \(\hat{f}_k(\chi)=f(\chi,\mathbf{q}^{(k)})=r^{(k)}\) are the training model outputs. If the \(n_i\) samples taken from the \(i\)-th parameter space are represented by \(Q_{n_i}\), then the tensor product of these one-dimensional sets leads to the nodal set \(Q_n\) of size \(n=\prod_{i=1}^p n_i\), where,

\[Q_n= Q_{n_1} \bigotimes Q_{n_2}\bigotimes \ldots \bigotimes Q_{n_p} \,.\]

Correspondingly, the Lagrange bases \(L_k(\mathbf{q})\) are constructed using the tensor-product of the bases in each of the parameter spaces:

\[L_k(\mathbf{q})=L_{k_1}(q_1) \bigotimes L_{k_2}(q_2) \bigotimes \ldots \bigotimes L_{k_p}(q_p) \,,\]

where,

\[\begin{split}L_{k_i}(q_i) = \prod_{\substack{{j=1}\\{j\neq k_i}}}^{n_i} \frac{q_i - q_i^{(j)}}{q_i^{(k_i)}-q_i^{(j)}} \,,\quad i=1,2,\ldots,p \,.\end{split}\]

Note that the Lagrange basis satisfies \(L_{k_i}(q_i^{(j)})=\delta_{k_{i}j}\), where \(\delta\) represents the Kronecker delta.
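
To make the construction concrete, the following minimal numpy sketch (for illustration only, not part of UQit) evaluates the single-variate Lagrange bases on a small nodal set and checks the Kronecker-delta property at the nodes:

import numpy as np

def lagrange_basis_1d(qNodes, k, q):
    # Evaluate the k-th single-variate Lagrange basis built on qNodes at the points q
    L = np.ones_like(np.asarray(q, dtype=float))
    for j, qj in enumerate(qNodes):
        if j != k:
            L *= (q - qj) / (qNodes[k] - qj)
    return L

qNodes = np.array([-1.0, 0.0, 0.5, 1.0])   # n=4 training nodes
for k in range(len(qNodes)):
    # L_k evaluated at the nodes: 1 at its own node, 0 at the others
    print(lagrange_basis_1d(qNodes, k, qNodes))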

Example

  • For \(p=1\) (one-dimensional parameter \(q\)):

fInterp=lagInt(fNodes=fNodes,qNodes=[qNodes],qTest=[qTest]).val
  • For \(p>1\) (multi-dimensional parameter \(\mathbf{q}\)):

fInterp=lagInt(fNodes=fNodes,qNodes=qNodes,qTest=qTestList,liDict={'testRule':'tensorProd'}).val
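
A complete, self-contained example for p=2 could look as follows; this is a sketch that assumes UQit is installed and that lagInt is importable from UQit.lagInt, with Chebyshev-type (non-uniform) training nodes generated by hand:

import numpy as np
from UQit.lagInt import lagInt   # assumed import path for the installed UQit package

# Non-uniform (Chebyshev-extrema) training nodes in each of the two dimensions
n1, n2 = 6, 5
qNodes = [np.cos(np.pi * np.arange(n1) / (n1 - 1)),          # nodes in [-1,1]
          2.0 + np.cos(np.pi * np.arange(n2) / (n2 - 1))]    # nodes in [1,3]

# Model responses at the tensor-product training grid, shape (n1,n2)
Q1, Q2 = np.meshgrid(qNodes[0], qNodes[1], indexing='ij')
fNodes = np.sin(Q1) + 0.5 * Q2**2

# Test samples per dimension; interpolate on their tensor-product grid
qTestList = [np.linspace(-1, 1, 30), np.linspace(1.2, 2.8, 25)]
fInterp = lagInt(fNodes=fNodes, qNodes=qNodes, qTest=qTestList,
                 liDict={'testRule': 'tensorProd'}).val   # expected shape (30,25)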

Implementation

Note:
  • To avoid Runge’s phenomenon, the training nodes should not be uniformly spaced; in each dimension of the parameter space, a non-uniform nodal set such as Gauss-quadrature or Clenshaw-Curtis (Chebyshev) points should be used.

class lagInt.lagInt(qNodes, fNodes, qTest, liDict=[])

Lagrange interpolation over a p-D parameter space, where p=1,2,… The interpolation order is \((n_1-1)*(n_2-1)*...*(n_p-1)\), where \(n_k, k=1,2,...,p\) refers to the number of training nodes in the k-th dimension of the parameter space.

\[F(\mathbf{Q})=\sum_{k_1=1}^{n_1} ... \sum_{k_p=1}^{n_p} [fNodes(k_1,k_2,...,k_p) L_{k_1}(Q_1) L_{k_2}(Q_2) ... L_{k_p}(Q_p)]\]

where, \(L_{k_i}(Q_i)\) is the single-variate Lagrange basis in the i-th dimension.

Args:
qNodes: A list of length p

qNodes=[qNodes_1,qNodes_2,…,qNodes_p], where qNodes_k is a 1D numpy array of n_k nodes in the k-th parameter dimension.

fNodes: A p-D numpy array of shape (n_1,n_2,…,n_p) (or a 1D array of size n=n_1*n_2*…*n_p if p>1)

Response values at the training nodes.

qTest: A list of length p

Containing the test samples from the p-D parameter space, i.e. qTest=[Q_1,Q_2,…,Q_p], where Q_i is a 1D numpy array of size m_i. Note that Q_i must lie in [min(qNodes_i),max(qNodes_i)] for i=1,2,…,p.

liDict: dict (optional)
To set different options for Lagrange interpolation over the p-D space when p>1, with the following key:
  • ‘testRule’: The rule for treating the multi-dimensional test points, with the following values:
    • ‘tensorProd’: A tensor-product grid is constructed from the Q_i’s in the list qTest. Hence, the total number of test points is m=m_1*m_2*…*m_p.

    • ‘set’: All m_i for i=1,2,…,p must be equal. As a result, a set of m=m_i test points is considered.

Returns:
val: f(qTest), interpolated values of f(q) at the test points qTest
  • If p==1: 1D numpy array of size m_1

  • If p>1 and ‘testRule’==’tensorProd’: p-D numpy array of shape (m_1,m_2,…,m_p)

  • If p>1 and ‘testRule’==’set’: 1D numpy array of size m=m_i

basis1d(qNodes_, k, Q_)

Constructs the single-variate Lagrange bases \(L_k(q)\) from the n nodes in qNodes_, for k=0,1,…,n-1. The bases are evaluated at the m samples Q_ of the single-variate parameter.

Args:
qNodes_: 1D numpy array of size n

The single-variate training nodal set

Q_: 1D numpy array of size m

Test samples

k: 1D numpy array of ints

Indices k=0,1,…,n-1 of the Lagrange bases to be evaluated

Returns:
prod: n-by-m numpy array

Values of \(L_k(Q)\) for k=0,1,…,n-1 evaluated at Q_

interp()

Lagrange interpolation, for p=1,2, …

interp_1d()

Lagrange interpolation of order (n-1) constructed over a 1D parameter space. The bases are constructed from the n nodes in qNodes and are evaluated at the test points Q in [min(qNodes),max(qNodes)].

\[F(Q)=\sum_{k=0}^{n-1} fNodes_{k} L_k(Q)\]

where the Lagrange bases \(L_k(Q)\) are constructed from the nodal set.

interp_pd()

Lagrange interpolation of order \((n_1-1)*(n_2-1)*...*(n_p-1)\) constructed over a p-D parameter space. Here, \(n_k, k=1,2,...,p\) refers to the number of training nodes in the k-th dimension of the parameter space.

\[F(\mathbf{Q})=\sum_{k_1=1}^{n_1} ... \sum_{k_p=1}^{n_p} [fNodes(k_1,k_2,...,k_p) L_{k_1}(Q_1) L_{k_2}(Q_2) ... L_{k_p}(Q_p)]\]

where \(L_{k_i}(Q_i)\) is the single-variate Lagrange basis in the i-th dimension.

lagInt.lagInt_Quads2Line(fNodes, qNodes, lineDef)

Constructs a Lagrange interpolation from a tensor-product nodal set in a 2D parameter space and evaluates it at test points located on a straight line lying in the same plane.

Args:
qNodes: A list of length 2

qNodes=[Q1,Q2], a list of the training nodes, where Qi is a 1D numpy array of length ni, for i=1,2.

fNodes: 1D numpy array of length n1*n2

Values of the response at qNodes

lineDef: dict
Defines the line over which the test samples are to be taken. The keys are:
  • ‘lineStart’:[q1Start,q2Start]; line’s starting point

  • ‘lineEnd’:[q1End,q2End]; line’s end point

  • ‘noPtsLine’: int; number of test points on the line, m1

Returns:
val: f(qTest), Interpolated values of f(q) at qTest

1D numpy array of size m1
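
Based on the arguments documented above, a usage sketch could be as follows (the import path, the nodal sets, the response values, and the assumption that the 1D fNodes follows the C-ordering of the (n1,n2) grid are all illustrative assumptions):

import numpy as np
from UQit.lagInt import lagInt_Quads2Line   # assumed import path

# Chebyshev-type training nodes in the two parameter dimensions
n1, n2 = 7, 6
Q1 = np.cos(np.pi * np.arange(n1) / (n1 - 1))                # nodes in [-1,1]
Q2 = 5.0 + 2.0 * np.cos(np.pi * np.arange(n2) / (n2 - 1))    # nodes in [3,7]
qNodes = [Q1, Q2]

# Response at the tensor-product nodes, flattened to a 1D array of length n1*n2
q1Grid, q2Grid = np.meshgrid(Q1, Q2, indexing='ij')
fNodes = (np.exp(q1Grid) * np.sin(q2Grid)).flatten()

# Straight line in the (q1,q2)-plane along which the interpolant is evaluated
lineDef = {'lineStart': [-0.8, 3.5], 'lineEnd': [0.8, 6.5], 'noPtsLine': 100}
val = lagInt_Quads2Line(fNodes, qNodes, lineDef)   # 1D array of 100 interpolated values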

Notebook

Try this LagInt notebook to see how to use UQit for Lagrange interpolation over a parameter space.

Gaussian Process Regression

Theory

Consider the simulator \(f(\mathbf{q})\) where \(\mathbf{q}\in \mathbb{Q}\). The observations are assumed to be generated from the following model,

\[y = f(\mathbf{q}) + \varepsilon \,.\]

Since the exact simulator \(f(\mathbf{q})\) is not known, we can put a prior on it in the form of a Gaussian process, see [Rasmussen05], [Gramacy20]. Based on the training data \(\mathcal{D}\), the posterior of \(f(\mathbf{q})\), denoted by \(\tilde{f}(\mathbf{q})\), is inferred. Without loss of generality, we assume \(\varepsilon\sim\mathcal{N}(0,\sigma^2)\). Contrary to the common use of Gaussian process regression (GPR), where \(\sigma\) is assumed to be fixed for all observations (homoscedastic noise), we are interested in cases where \(\sigma\) is observation-dependent (heteroscedastic noise). In the latter case, an additional Gaussian process is needed to infer the noise parameters, see [Goldberg98]. Eventually, the posteriors of \(\tilde{f}(\mathbf{q})\) and the response \(y\) can be sampled over the parameter space; see [Rezaeiravesh20] and the references therein for the details.
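
For reference, with a zero-mean GP prior on \(f\), training inputs \(Q\), training outputs \(\mathbf{y}\), and homoscedastic noise variance \(\sigma^2\), the standard GPR posterior at test inputs \(Q_*\) is Gaussian with, see [Rasmussen05],

\[\mathbb{E}[\tilde{f}(Q_*)] = K(Q_*,Q)\left[K(Q,Q)+\sigma^2 I\right]^{-1}\mathbf{y}\,,\]

\[\mathrm{Cov}[\tilde{f}(Q_*)] = K(Q_*,Q_*) - K(Q_*,Q)\left[K(Q,Q)+\sigma^2 I\right]^{-1}K(Q,Q_*)\,,\]

where \(K(\cdot,\cdot)\) denotes the matrix of kernel evaluations. In the heteroscedastic case, \(\sigma^2 I\) is replaced by the diagonal matrix of the observation-dependent noise variances.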

Example

Given the training data and the associated observational noise, a GPR is constructed in UQit as:

from UQit.gpr_torch import gpr   # assumed import path for the installed UQit package
gpr_=gpr(xTrain,yTrain[:,None],noiseSdev,xTest,gprOpts)
post_f=gpr_.post_f    # posterior of the model function f(q)
post_obs=gpr_.post_y  # posterior predictive of the observations y
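
Here, xTrain, yTrain, noiseSdev, and xTest are numpy arrays and gprOpts is the options dictionary documented below. A more complete, self-contained sketch (using the assumed UQit.gpr_torch import path and synthetic heteroscedastic data purely for illustration) would be:

import numpy as np
from UQit.gpr_torch import gpr   # assumed import path for the installed UQit package

# Synthetic 1D training data with observation-dependent (heteroscedastic) noise
n = 60
xTrain = np.linspace(0., 1., n)[:, None]               # shape (n,p), p=1
noiseSdev = 0.05 + 0.1 * xTrain[:, 0]                  # noise sdev per observation
yTrain = np.sin(2 * np.pi * xTrain[:, 0]) + noiseSdev * np.random.randn(n)

# Test samples and GPR options (keys as documented for gprOpts below)
xTest = np.linspace(0., 1., 200)[:, None]
gprOpts = {'nIter': 100,               # iterations for the hyperparameter optimization
           'lr': 0.1,                  # learning rate
           'convPlot': False,          # do not plot the optimization convergence
           'standardizeYTrain': True}  # shift and scale yTrain

gpr_ = gpr(xTrain, yTrain[:, None], noiseSdev, xTest, gprOpts)
post_f = gpr_.post_f     # posterior of the model function f(q)
post_obs = gpr_.post_y   # posterior predictive of the observations y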

Implementation

In UQit, the GPR is implemented using the existing Python library GPyTorch [Gardner18]. The user can similarly use any other available library for GPR as long as the code structure is kept consistent with UQit.

Notes:
  1. GPyTorch: The size of the training data set cannot exceed 128. Otherwise, split the data into batches of at most 128 samples.

  2. GPyTorch: If the standard deviation of the observation noise is exactly zero for all observations, the Cholesky decomposition may fail. Therefore, use a very small value instead of zero for the noise standard deviations.

  3. In a p-D parameter space, a length-scale has to be defined per dimension. In practice, if the ranges of the parameters are very different from each other or are too large, the optimization of the length-scales can become problematic. To rectify this, the original parameter space can be mapped, for instance, onto the hypercube [-1,1]^p, and the GPR can then be constructed on the mapped space; see the sketch after this list.

  4. Shifting and scaling (standardization) of the training data can result in a better GPR fit. This can be activated by setting ‘standardizeYTrain’:True in gprOpts.

  5. Different kernel functions are available in GPyTorch. The default kernel can be changed manually in self.covar_module(…) (see below) by modifying the source code of UQit.
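
Regarding note 3, the linear map onto the hypercube [-1,1]^p and its inverse can be implemented as in the following sketch (plain numpy, not part of UQit):

import numpy as np

def map_to_unit_hypercube(x, lb, ub):
    # Linearly map samples x with per-dimension bounds [lb,ub] onto [-1,1]^p
    lb, ub = np.asarray(lb), np.asarray(ub)
    return 2.0 * (x - lb) / (ub - lb) - 1.0

def map_from_unit_hypercube(z, lb, ub):
    # Inverse map from [-1,1]^p back to the original parameter ranges
    lb, ub = np.asarray(lb), np.asarray(ub)
    return lb + 0.5 * (z + 1.0) * (ub - lb)

# Example: two parameters with very different ranges
x = np.array([[150.0, 0.002], [900.0, 0.009]])
z = map_to_unit_hypercube(x, lb=[100.0, 0.001], ub=[1000.0, 0.01])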

class gpr_torch.SingletaskGPModel(train_x, train_y, likelihood)

GPR for single-task output using GPyTorch, 1D input: y=f(x) in R, x in R

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class gpr_torch.SingletaskGPModel_mIn(train_x, train_y, likelihood)

GPR for single-task output using GPyTorch, p-D input: y=f(x) in R, x in R^p, p>1

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class gpr_torch.gpr(xTrain, yTrain, noiseV, xTest, gprOpts)

Constructs and evaluates a GPR over the space of uncertain parameters/inputs. The GPR model is

\[y=f(x)+\epsilon\]

where \(\epsilon\sim N(M,V)\)

  • The parameter (input) space is p-dimensional, where p=1,2,…

  • The response y is 1-D (single-task model).

  • The observations are assumed to be uncorrelated, but the noise uncertainty can be observation-dependent (heteroscedastic). Therefore, only the diagonal elements of V are (currently) assumed to be non-zero. If the noise uncertainty is the same for all observations, the \(\epsilon_i\) are iid (homoscedastic).

Args:
xTrain: 2D numpy array of size nxp

Training samples (size n) from the p-D input (parameter) space

yTrain: 2D numpy array of size nxm (m: dimensionality of y)

Training model output (response)

noiseV: A 1D numpy vector of size n

Standard deviation of the Gaussian noise (diagonal elements of V)

xTest: 2D numpy array of size nTestxp

Test samples from the input (parameter) space

gprOpts: dict
Options for constructing GPR with the following keys:
  • ‘nIter’: int

    Number of iterations in the optimization of hyperparameters

  • ‘lr’: float

    Learning rate in the optimization of hyperparameters

  • ‘standardizeYTrain’: bool (optional, default:False)

    If true, the training data are standardized by shifting by mean and scaling by sdev:

    \(yStnd = (yTrain-mean(yTrain))/sdev(yTrain)\)

  • ‘convPlot’: bool

    If true, the optimized values of the hyperparameters are plotted vs. iteration.

Attributes:

post_f: Posterior of f(x) at xTest.

post_obs: Predictive posterior (likelihood) at xTest.

gprTorch_1d()

GPR for a 1D uncertain parameter. Observations \(\{(x_i,y_i)\}_{i=1}^n\) are assumed to be independent, but their noise variance can be either identical (iid, homoscedastic) or non-identical (heteroscedastic).

gprTorch_1d_singleTask()

GPR for 1D uncertain parameter and single-variate response y.

gprTorch_pd()

GPR for p-D (p>1) uncertain parameter. Observations (X_i,Y_i) are assumed to be independent but their noise variance can be either the same (iid=homoscedastic) or different (heteroscedastic).

gprTorch_pd_singleTask()

GPR for p>1 uncertain parameter and single-variate response y

optim_conv_plot()

Plot convergence of loss and length-scale during the optimization

train_pred()

Constructor of the GPR (training and predicting at test samples)

class gpr_torch.gprPlot(pltOpts={})

Plotters for GPR

Args:
pltOpts: dict (optional)
Options for planar plots (p=1 or 2) with the following keys:
  • ‘title’: string, plot title

  • ‘legFS’: float, legend fontsize

  • ‘labFS’: [float,float], x,y-axes label fontsize

  • ‘ticksFS’: [float,float], x,y-ticks fontsize

  • ‘save’: bool
    If ‘save’==True, then
    • ‘figName’: string, figure name

    • ‘figDir’: string, directory to save the figure

    • ‘figSize’: [float,float], figure size

Methods:
torch1d():

Plots the GPR constructed for a 1D input.

torch2d_2dcont():

Planar contour plot of a GPR constructed over a 2D input space.

torch2d_3dSurf():

3D plot of the GPR surface (mean+CI) constructed for a 2D input.

torch1d(post_f, post_obs, xTrain, yTrain, xTest, fExTest, shift=0.0, scale=1.0)

Plots the GPR constructed by GPyTorch for a 1D input.

Args:
post_f: GpyTorch object

Posterior density of the model function f(q)

post_obs: GpyTorch object

Posterior density of the response y

xTrain: 1D numpy array of size nTrain

GPR training samples taken from the input space

yTrain: 1D numpy array of size nTrain

Response values at xTrain

xTest: 1D numpy array of size nTest

Test samples taken from the input space

fExTest: 1D numpy array of size nTest

Exact response values at xTest

shift: Float scalar (optional)

Value by which the original training data ‘yTrain’ were shifted for standardization

scale: Float scalar (optional)

Value by which the original training data ‘yTrain’ were scaled for standardization

torch2d_2dcont(xTrain, qTest, fTestGrid)

Planar contour plots of a GPR constructed over a 2D input space.

Args:
xTrain: 2D numpy array of shape (nTrain,2)

GPR training samples taken from the input space

yTrain: 1D numpy array of size nTrain

Response values at xTrain

qTest: List of length 2

=[qTest_1,qTest_2], where qTest_i is a 1D array of size nTest_i of the test samples taken from the space of the i-th input

fTestGrid: 2D numpy array of shape (nTest_1,nTest_2)

Response values at a tensor-product grid constructed from qTest

torch2d_3dSurf(xTrain, yTrain, qTest, post_, shift=0.0, scale=1.0)

3D plot of the GPR surface (mean+CI) constructed for a 2D input (parameter).

Args:
xTrain: 2D numpy array of shape (nTrain,2)

GPR training samples taken from the input space

yTrain: 1D numpy array of size nTrain

Response values at xTrain

qTest: List of length 2

=[qTest_1,qTest_2], where qTest_i is a 1D array of size nTest_i of the test samples taken from the space of the i-th input

post_: GpyTorch object

Posterior density of model function f(q) or observations y

shift: Float scalar (optional)

Value by which the original training data ‘yTrain’ were shifted for standardization

scale: Float scalar (optional)

Value by which the original training data ‘yTrain’ were scaled for standardization

class gpr_torch.gprPost(gpPost, nTest, shift=0.0, scale=1.0)

Post-processing a constructed GPR

Args:
gpPost: GpyTorch object

GPR posterior density created by GPyTorch

nTest: A list of size p

Containing number of test points in each parameter dimension: [nTest1,nTest2,…nTestp]

shift: Float scalar (optional)

Value by which the original training data ‘yTrain’ were shifted for standardization

scale: Float scalar (optional)

Value by which the original training data ‘yTrain’ were scaled for standardization

Methods:

torchPost(): Computing mean, standard-deviation, and CI of the GPR-posterior

Attributes:
mean: p-D numpy array of size (nTest1,nTest2,…,nTestp)

Mean of the GP-posterior

sdev: pD numpy array of size (nTest1,nTest2,…,nTestp)

Standard-deviation of the GP-posterior

ciL: pD numpy array of size (nTest1,nTest2,…,nTestp)

Lower 95% CI of the GP-posterior

ciU: pD numpy array of size (nTest1,nTest2,…,nTestp)

Upper 95% CI of the GP-posterior

torchPost()

Computes mean, standard-deviation, and CI of the GPR-posterior created by GPyTorch

Notebook

Try the GPR notebook to see how to use UQit for Gaussian process regression over a parameter space.