Surrogates¶
A surrogate or metamodel is an approximation of the actual model function or simulator over the parameter space. Running a surrogate is computationally much less expensive than running the actual simulator, a characteristic that makes the use of surrogates indispensable in many UQ problems. However, the predictions of the surrogate should be sufficiently accurate compared to the predictions of the actual simulator. Using an additive error model, the following relation can be considered between the model function (simulator) and its observations:
\[r = f(\chi,\mathbf{q}) + \mathbf{\varepsilon}\,,\]
where \(\mathbf{\varepsilon}\) expresses the bias and random noise. Our task is to construct a surrogate \(\tilde{f}(\chi,\mathbf{q})\) for the actual model function (simulator).
Treating the simulator as a black box, a set of training data \(\mathcal{D}=\{(\mathbf{q}^{(i)},r^{(i)})\}_{i=1}^n\) is obtained.
There are different techniques to construct a surrogate, for instance see [Smith13], [Ghanem17], and [Gramacy20].
Some of the techniques relevant to CFD applications have been implemented in UQit. Here, we provide a short overview of the theory behind these techniques, explain their implementation in UQit, and provide examples to show how to use them.
Non-intrusive Polynomial Chaos Expansion¶
As a type of stochastic collocation (SC) method, see e.g. Chapter 20 in [Ghanem17], non-intrusive PCE [Xiu05], [Xiu07] can be used to construct a surrogate.
There is a one-to-one correspondence between any sample of \(\mathbf{q}\in \mathbb{Q}\) and \(\xi\in\Gamma\), where \(\mathbb{Q}=\bigotimes_{i=1}^p \mathbb{Q}_i\) and \(\Gamma=\bigotimes_{i=1}^p \Gamma_i\). Note that \(\mathbb{Q}_i\) is the admissible space of the i-th parameter, which can be mapped onto \(\Gamma_i\) based on the gPCE rules, see [Xiu02], [Eldred09]. For the details of the non-intrusive PCE method refer to Standard Polynomial Chaos Expansion.
Lagrange Interpolation¶
Theory¶
As another form of SC-based surrogates, Lagrange interpolation can be considered:
\[\tilde{f}(\chi,\mathbf{q})=\sum_{k=1}^{n} \hat{f}_k(\chi,\mathbf{q}) L_k(\mathbf{q})\,,\]
where \(\hat{f}_k(\chi,\mathbf{q})=f(\chi,\mathbf{q}^{(k)})=r^{(k)}\) are the training model outputs. If the \(n_i\) samples taken from the \(i\)-th parameter space are represented by \(Q_{n_i}\), then the use of the tensor product leads to the nodal set \(Q_n\) of size \(n=\prod_{i=1}^p n_i\), where,
\[Q_n = Q_{n_1}\otimes Q_{n_2}\otimes \ldots \otimes Q_{n_p} = \bigotimes_{i=1}^p Q_{n_i}\,.\]
Correspondingly, the Lagrange bases \(L_k(\mathbf{q})\) are constructed using the tensor product of the bases in each of the parameter spaces:
\[L_k(\mathbf{q})=\prod_{i=1}^{p} L_{k_i}(q_i)\,,\]
where,
\[L_{k_i}(q_i)=\prod_{\substack{j=1 \\ j\neq k_i}}^{n_i} \frac{q_i-q_i^{(j)}}{q_i^{(k_i)}-q_i^{(j)}}\,.\]
Note that the Lagrange basis satisfies \(L_{k_i}(q_i^{(j)})=\delta_{k_{i}j}\), where \(\delta\) represents the Kronecker delta.
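For illustration, consider a single parameter with the three training nodes \(q^{(1)}=-1\), \(q^{(2)}=0\), and \(q^{(3)}=1\). The associated Lagrange bases are
\[L_1(q)=\frac{q(q-1)}{2}\,,\quad L_2(q)=1-q^2\,,\quad L_3(q)=\frac{q(q+1)}{2}\,,\]
so that each basis equals one at its own node and zero at the other two, and the interpolant \(\tilde{f}(q)=\sum_{k=1}^{3} r^{(k)} L_k(q)\) exactly reproduces the training data at the nodes.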
Example¶
For \(p=1\) (one-dimensional parameter \(q\)):
fInterp=lagInt(fNodes=fNodes,qNodes=[qNodes],qTest=[qTest]).val
For \(p>1\) (multi-dimensional parameter \(\mathbf{q}\)):
fInterp=lagInt(fNodes=fNodes,qNodes=qNodes,qTest=qTestList,liDict={'testRule':'tensorProd'}).val
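A more complete, self-contained sketch of the 1D case is given below; the import path UQit.lagInt and the test function are assumptions for illustration only.
# Hedged sketch: 1D Lagrange interpolation of f(q)=sin(pi*q) with lagInt
import numpy as np
from UQit.lagInt import lagInt   # assumed import path

n = 9
# non-uniformly distributed (Gauss-Lobatto-type) training nodes in [-1,1], cf. the note below on Runge's phenomenon
qNodes_ = np.cos(np.pi*np.arange(n)/(n - 1))
fNodes = np.sin(np.pi*qNodes_)               # training responses at the nodes
qTest_ = np.linspace(-1., 1., 50)            # test samples within [min(qNodes_), max(qNodes_)]
fInterp = lagInt(fNodes=fNodes, qNodes=[qNodes_], qTest=[qTest_]).val   # 1D array of size 50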
Implementation¶
- Note:
To avoid Runge’s phenomenon, the training nodal set should be non-uniformly distributed in each dimension of the parameter space.
class lagInt.lagInt(qNodes, fNodes, qTest, liDict=[])¶
Lagrange interpolation over a p-D parameter space, where p=1,2,… The interpolation order is \((n_1-1)*(n_2-1)*...*(n_p-1)\), where \(n_k, k=1,2,...,p\) refers to the number of training nodes in the k-th dimension of the parameter space.
\[F(\mathbf{Q})=\sum_{k_1=1}^{n_1} ... \sum_{k_p=1}^{n_p} [fNodes(k_1,k_2,...,k_p) L_{k_1}(Q_1) L_{k_2}(Q_2) ... L_{k_p}(Q_p)]\]
where \(L_{k_i}(Q_i)\) is the single-variate Lagrange basis in the i-th dimension.
- Args:
- qNodes: A list of length p
qNodes=[qNodes_1,qNodes_2,…,qNodes_p], where qNodes_k is a 1D numpy array of n_k nodes in the k-th parameter dimension.
- fNodes: A p-D numpy array of shape (n_1,n_2,…,n_p) (or a 1D array of size n=n_1*n_2*…*n_p if p>1)
Response values at the training nodes.
- qTest: A list of length p
Containing test samples from the p-D parameter space, i.e. qTest=[Q_0,Q_1,…,Q_{p-1}] where Q_i is a 1D numpy array of size m_i, i=0,1,…,p-1. Note that each Q_i must lie within the range of the training nodes in the corresponding dimension.
- liDict: dict (optional)
- To set different options for Lagrange interpolation over the p-D space when p>1, with the following key:
- ‘testRule’: The rule for treating the multi-dimensional test points with the values:
‘tensorProd’: A tensor-product grid is constructed from the Q_i's in the list qTest. Hence, the total number of sample points is m=m_0*m_1*…*m_{p-1}.
‘set’: All m_i for i=0,1,…,p-1 should be equal. As a result, a set of m=m_i test points is considered.
- Returns:
- val: f(qTest), Interpolated values of f(q) at qTest
If p==1: 1D numpy array of size m1
If p>1 and ‘testRule’==’tensorProd’ => val: p-D numpy array of shape (m1,m2,…,mp)
If p>1 and ‘testRule’==’set’ => val: 1D numpy array of size m (=m1=m2=…=mp)
basis1d(qNodes_, k, Q_)¶ Constructs the single-variate Lagrange bases \(L_k(q)\) using the n nodes qNodes_ for k=0,1,…,n-1. The bases are evaluated at the m samples Q_ of a single-variate parameter.
- Args:
- qNodes_: 1D numpy array of size n
The single-variate training nodal set
- Q_: 1D numpy array of size m
Test samples
- k: 1D numpy array of int values
Order of the polynomial bases
- Returns:
- prod: n-by-m numpy array
Values of \(L_k(Q)\) for k=0,1,…,n-1 evaluated at Q_
interp()¶ Lagrange interpolation for p=1,2,…
interp_1d()¶ Lagrange interpolation of order (n-1) constructed over a 1D parameter space. The bases are constructed from the n nodes qNodes_ and are evaluated at test points Q in [min(qNodes_),max(qNodes_)].
\[F(Q)=\sum_{k=0}^{n-1} fNodes_{k} L_k(Q)\]
where the Lagrange bases \(L_k(Q)\) are constructed from the nodal set.
interp_pd()¶ Lagrange interpolation of order \((n_1-1)*(n_2-1)*...*(n_p-1)\) constructed over a p-D parameter space. Here, \(n_k, k=1,2,...,p\) refers to the number of training nodes in the k-th dimension of the parameter space.
\[F(\mathbf{Q})=\sum_{k_1=1}^{n_1} ... \sum_{k_p=1}^{n_p} [fNodes(k_1,k_2,...,k_p) L_{k_1}(Q_1) L_{k_2}(Q_2) ... L_{k_p}(Q_p)]\]
where \(L_{k_i}(Q_i)\) is the single-variate Lagrange basis in the i-th dimension.
lagInt.lagInt_Quads2Line(fNodes, qNodes, lineDef)¶ Constructs a Lagrange interpolation from a tensor-product nodal set in a 2D parameter space and evaluates it at test points located on a straight line lying in the same parameter plane (a usage sketch follows the argument list below).
- Args:
- qNodes: A list of length 2
qNodes=[Q1,Q2], where Qi is a 1D numpy array of ni training nodes, i=1,2.
- fNodes: 1D numpy array of length n1*n2
Values of the response at qNodes
- lineDef: dict
- Defines the line over which the test samples are to be taken. The keys are:
‘lineStart’:[q1Start,q2Start]; line’s starting point
‘lineEnd’:[q1End,q2End]; line’s end point
‘noPtsLine’: int; number of test points on the line, m1
- Returns:
- val: f(qTest), Interpolated values of f(q) at qTest
1D numpy array of size m1
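A usage sketch for lagInt_Quads2Line is shown below; the import path, the test data, and the assumed ordering of fNodes are illustrative assumptions based on the documented signature.
# Hedged sketch: interpolate a response given on a 5x4 tensor-product grid in a 2D
# parameter plane onto 20 test points along a straight line in that plane.
import numpy as np
from UQit.lagInt import lagInt_Quads2Line   # assumed import path

Q1 = np.linspace(-1., 1., 5)                 # training nodes in the 1st parameter dimension
Q2 = np.linspace(0., 2., 4)                  # training nodes in the 2nd parameter dimension
G1, G2 = np.meshgrid(Q1, Q2, indexing='ij')
fNodes = (G1**2 + np.sin(G2)).flatten()      # 1D array of length n1*n2 (assumed ordering: 2nd parameter varies fastest)
lineDef = {'lineStart': [-0.8, 0.2], 'lineEnd': [0.9, 1.8], 'noPtsLine': 20}
val = lagInt_Quads2Line(fNodes, [Q1, Q2], lineDef)   # interpolated values at the 20 points on the line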
Notebook¶
Try this LagInt notebook to see how to use UQit for Lagrange interpolation over a parameter space.
Gaussian Process Regression¶
Theory¶
Consider the simulator \(f(\mathbf{q})\) where \(\mathbf{q}\in \mathbb{Q}\). The observations are assumed to be generated from the following model,
\[y=f(\mathbf{q})+\varepsilon\,.\]
Since the exact simulator \(f(\mathbf{q})\) is not known, we can put a prior on it in the form of a Gaussian process, see [Rasmussen05], [Gramacy20]. Based on the training data \(\mathcal{D}\), the posterior of \(f(\mathbf{q})\), denoted by \(\tilde{f}(\mathbf{q})\), is inferred. Without loss of generality, we assume \(\varepsilon\sim\mathcal{N}(0,\sigma^2)\). Contrary to the common use of Gaussian process regression (GPR), where \(\sigma\) is assumed to be fixed for all observations (homoscedastic noise), we are interested in cases where \(\sigma\) is observation-dependent (heteroscedastic noise). In the latter case, an additional Gaussian process is needed to infer the noise parameters, see [Goldberg98]. Eventually, the posteriors of \(\tilde{f}(\mathbf{q})\) and the response \(y\) can be sampled over the parameter space; see [Rezaeiravesh20] and the references therein for the details.
Example¶
Given the training data and the associated observational noise, a GPR is constructed in UQit as,
gpr_=gpr(xTrain,yTrain[:,None],noiseSdev,xTest,gprOpts)
post_f=gpr_.post_f
post_obs=gpr_.post_y
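A self-contained sketch of the above for a 1D parameter with heteroscedastic noise is given below; the import path UQit.gpr_torch and the synthetic data are assumptions for illustration.
# Hedged sketch: GPR for a 1D parameter with observation-dependent noise
import numpy as np
from UQit.gpr_torch import gpr   # assumed import path

n, nTest = 30, 100
xTrain = np.linspace(-1., 1., n)[:, None]            # training samples, shape (n,p) with p=1
noiseSdev = 0.05 + 0.05*np.abs(xTrain[:, 0])         # heteroscedastic noise standard deviations (size n)
yTrain = np.sin(np.pi*xTrain[:, 0]) + noiseSdev*np.random.randn(n)   # noisy training responses
xTest = np.linspace(-1., 1., nTest)[:, None]         # test samples, shape (nTest,p)
gprOpts = {'nIter': 200, 'lr': 0.1, 'convPlot': False}   # settings for the hyperparameter optimization
gpr_ = gpr(xTrain, yTrain[:, None], noiseSdev, xTest, gprOpts)
post_f = gpr_.post_f      # posterior of f(q) at xTest
post_obs = gpr_.post_y    # predictive posterior of y at xTest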
Implementation¶
In UQit, the GPR is implemented using the existing Python library GPyTorch [Gardner18]. The user can similarly use any other available library for GPR as long as the code structure is kept consistent with UQit.
- Notes:
1. GPyTorch: The size of the training data set cannot exceed 128; otherwise, make batches of maximum size 128.
2. GPyTorch: If the standard deviation of the observation noise is exactly zero for all observations, there may be issues with the Cholesky decomposition. Therefore, instead of zero, use a very small value for the noise standard deviations.
3. In a p-D parameter space, a length-scale has to be defined per dimension. In our experience, if the ranges of the parameters are very different from each other or are too large, the optimization of the length-scales can be problematic. To rectify this, the original parameter space can be mapped, for instance, onto the hypercube [-1,1]^p, and the GPR is then constructed on this mapped space (see the sketch after this list).
4. Shifting and scaling (standardization) of the training data can result in a better GPR fit. This can be activated by setting 'standardizeYTrain':True in gprOpts.
5. Different options for the kernel function are available in GPyTorch. One can change the default kernel in self.covar_module(…) (see below) by modifying the source code of UQit.
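Regarding note 3, a small sketch of such a mapping is given below; mapToHypercube is a plain helper written here for illustration, not a UQit function.
# Hedged sketch: linearly map parameter samples from their admissible range onto [-1,1]
import numpy as np

def mapToHypercube(q, qBound):
    """Map samples q in [qBound[0], qBound[1]] linearly onto [-1, 1]."""
    q = np.asarray(q, dtype=float)
    return 2.0*(q - qBound[0])/(qBound[1] - qBound[0]) - 1.0

# example: a parameter admissible over [500, 4000] (applied per dimension for a p-D space)
qMapped = mapToHypercube([800., 2000., 3500.], qBound=[500., 4000.])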
class gpr_torch.SingletaskGPModel(train_x, train_y, likelihood)¶
GPR for single-task output using GPyTorch, 1D input: y=f(x) in R, x in R
forward(x)¶ Defines the computation performed at every call. Should be overridden by all subclasses.
Note: although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class gpr_torch.SingletaskGPModel_mIn(train_x, train_y, likelihood)¶
GPR for single-task output using GPyTorch, p-D input: y=f(x) in R, x in R^p, p>1
forward(x)¶ Defines the computation performed at every call. Should be overridden by all subclasses.
Note: although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class gpr_torch.gpr(xTrain, yTrain, noiseV, xTest, gprOpts)¶
Constructs and evaluates a GPR over the space of uncertain parameters/inputs. The GPR model is
\[y=f(x)+\epsilon\]
where \(\epsilon\sim N(M,V)\).
The parameter (input) space is p-dimensional, where p=1,2,…
The response y is 1-D (single-task model).
The observations are assumed to be uncorrelated, but the noise uncertainty can be observation-dependent (heteroscedastic). Therefore, only the diagonal elements of V are (currently) assumed to be non-zero. If the noise uncertainty is fixed for all observations, then the \(\epsilon_i\) are iid (homoscedastic).
- Args:
- xTrain: 2D numpy array of size nxp
Training samples (size n) from the p-D input (parameter) space
- yTrain: 2D numpy array of size nxm (m: dimensionality of y)
Training model output (response)
- noiseV: A 1D numpy vector of size n
Standard deviation of the Gaussian noise (diagonal elements of V)
- xTest: 2D numpy array of size nTestxp
Test samples from the input (parameter) space
- gprOpts: dict
- Options for constructing GPR with the following keys:
- ‘nIter’: int
Number of iterations in the optimization of hyperparameters
- ‘lr’: float
Learning rate in the optimization of hyperparameters
- ‘standardizeYTrain’: bool (optional, default:False)
If true, the training data are standardized by shifting by mean and scaling by sdev:
\(yStnd = (yTrain-mean(yTrain))/sdev(yTrain)\)
- ‘convPlot’: bool
If true, the optimized values of the hyperparameters are plotted vs. iteration.
- Attributes:
post_f: Posterior of f(x) at xTest.
post_obs: Predictive posterior (likelihood) at xTest.
gprTorch_1d()¶ GPR for a 1D uncertain parameter. Observations \({(x_i,y_i)}_{i=1}^n\) are assumed to be independent, but their noise variance can be either the same (iid, homoscedastic) or non-identical (heteroscedastic).
gprTorch_1d_singleTask()¶ GPR for a 1D uncertain parameter and a single-variate response y.
gprTorch_pd()¶ GPR for a p-D (p>1) uncertain parameter. Observations (X_i,Y_i) are assumed to be independent, but their noise variance can be either the same (iid, homoscedastic) or different (heteroscedastic).
gprTorch_pd_singleTask()¶ GPR for a p-D (p>1) uncertain parameter and a single-variate response y.
optim_conv_plot()¶ Plots the convergence of the loss and the length-scales during the optimization.
train_pred()¶ Constructor of the GPR (training and prediction at test samples).
class gpr_torch.gprPlot(pltOpts={})¶ Plotters for GPR (a usage sketch follows the method descriptions below).
- Args:
- pltOpts: dict (optional)
- Options for planar plots (p=1 or 2) with the following keys:
‘title’: string, plot title
‘legFS’: float, legend fontsize
‘labFS’: [float,float], x,y-axes label fontsize
‘ticksFS’: [float,float], x,y-ticks fontsize
- ‘save’: bool
- If ‘save’==True, then
‘figName’: string, figure name
‘figDir’: string, directory to save the figure
‘figSize’: [float,float], figure size
- Methods:
- torch1d():
Plots the GPR constructed for a 1D input.
- torch2d_2dcont():
Planar contour plot of a GPR constructed over a 2D input space.
- torch2d_3dSurf():
3D plot of the GPR surface (mean+CI) constructed for a 2D input.
torch1d(post_f, post_obs, xTrain, yTrain, xTest, fExTest, shift=0.0, scale=1.0)¶ Plots the GPR constructed by GPyTorch for a 1D input.
- Args:
- post_f: GpyTorch object
Posterior density of the model function f(q)
- post_obs: GpyTorch object
Posterior density of the response y
- xTrain: 1D numpy array of size nTrain
GPR training samples taken from the input space
- yTrain: 1D numpy array of size nTrain
Response values at xTrain
- xTest: 1D numpy array of size nTest
Test samples taken from the input space
- fExTest: 1D numpy array of size nTest
Exact response values at xTest
- shift: Float scalar (optional)
Value by which the original training data ‘yTrain’ were shifted for standardization
- scale: Float scalar (optional)
Value by which the original training data ‘yTrain’ were scaled for standardization
torch2d_2dcont(xTrain, qTest, fTestGrid)¶ Planar contour plots of a GPR constructed over a 2D input space.
- Args:
- xTrain: 2D numpy array of shape (nTrain,2)
GPR training samples taken from the input space
- yTrain: 1D numpy array of size nTrain
Response values at xTrain
- qTest: List of length 2
=[qTest_1,qTest_2], where qTest_i is a 1D array of size nTest_i of test samples taken from the space of the i-th input
- fTestGrid: 2D numpy array of shape (nTest_1,nTest_2)
Response values at a tensor-product grid constructed from qTest
torch2d_3dSurf(xTrain, yTrain, qTest, post_, shift=0.0, scale=1.0)¶ 3D plot of the GPR surface (mean+CI) constructed for a 2D input (parameter).
- Args:
- xTrain: 2D numpy array of shape (nTrain,2)
GPR training samples taken from the input space
- yTrain: 1D numpy array of size nTrain
Response values at xTrain
- qTest: List of length 2
=[qTest_1,qTest_2], where qTest_i is a 1D array of size nTest_i of test samples taken from the space of the i-th input
- post_: GpyTorch object
Posterior density of model function f(q) or observations y
- shift: Float scalar (optional)
Value by which the original training data ‘yTrain’ were shifted for standardization
- scale: Float scalar (optional)
Value by which the original training data ‘yTrain’ were scaled for standardization
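Continuing the GPR example above, a hedged sketch of plotting the 1D posterior with gprPlot is given below; post_f, post_obs, xTrain, yTrain, and xTest are assumed to be available from that example, and 1D arrays are passed for the samples as listed in the Args of torch1d.
# Hedged sketch: plot the 1D GPR posterior together with the exact response at xTest
import numpy as np
from UQit.gpr_torch import gprPlot   # assumed import path

fExTest = np.sin(np.pi*xTest[:, 0])            # exact response at xTest (known by construction here)
pltOpts = {'title': 'GPR for a 1D parameter'}  # optional plot settings
gprPlot(pltOpts).torch1d(post_f, post_obs, xTrain[:, 0], yTrain, xTest[:, 0], fExTest)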
class gpr_torch.gprPost(gpPost, nTest, shift=0.0, scale=1.0)¶ Post-processing of a constructed GPR (a usage sketch follows below).
- Args:
- gpPost: GpyTorch object
GPR posterior density created by GPyTorch
- nTest: A list of size p
Containing number of test points in each parameter dimension: [nTest1,nTest2,…nTestp]
- shift: Float scalar (optional)
Value by which the original training data ‘yTrain’ were shifted for standardization
- scale: Float scalar (optional)
Value by which the original training data ‘yTrain’ were scaled for standardization
- Methods:
torchPost(): Computing mean, standard-deviation, and CI of the GPR-posterior
- Attributes:
- mean: p-D numpy array of shape (nTest1,nTest2,…,nTestp)
Mean of the GP-posterior
- sdev: p-D numpy array of shape (nTest1,nTest2,…,nTestp)
Standard-deviation of the GP-posterior
- ciL: p-D numpy array of shape (nTest1,nTest2,…,nTestp)
Lower 95% CI of the GP-posterior
- ciU: p-D numpy array of shape (nTest1,nTest2,…,nTestp)
Upper 95% CI of the GP-posterior
torchPost()¶ Computes the mean, standard-deviation, and CI of the GPR-posterior created by GPyTorch.
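A hedged sketch of using gprPost on the posterior from the GPR example above is given below; the import path and the need to call torchPost() before reading the attributes are assumptions based on the listed API.
# Hedged sketch: extract mean, standard deviation, and 95% CI of the GPR posterior
from UQit.gpr_torch import gprPost   # assumed import path

pp = gprPost(gpPost=post_f, nTest=[nTest])   # p=1, hence a single-entry list of test-point counts
pp.torchPost()                               # computes mean, sdev, and CI
fMean, ciL, ciU = pp.mean, pp.ciL, pp.ciU    # each a 1D array of size nTest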
Notebook¶
Try the GPR notebook to see how to use UQit for Gaussian process regression over a parameter space.