Ex. 1: Surrogate Modeling¶
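Let’s construct a 2D function $f(x, y) = 2.5\, e^{-(x^2 + y^2)} \left(x + y^2 + \varepsilon\right)$, where $\varepsilon$ is normally distributed noise with mean 0 and standard deviation 0.1.
In the following steps, we’ll use Nodeworks to sample a relevant region of the $(x, y)$ phase space.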
Sampling¶

Following the steps below, we’ll create a simple factorial design for sampling the model:
Launch Nodeworks and right click on the empty sheet to bring up the menu. Scroll down to Add Node, then down to the Surrogate Modeling and Analysis collection of nodes, and (left) click on Design of Experiments to populate the sheet with the Design of Experiments node.
On the Variables tab, click the add button twice to create the $x$ and $y$ variables. By selecting the default entries in the table, set the variable names to x and y for clarity. (The new entries default to type Double Precision without arg(s) or units and link None, so no change is needed for these properties.)
We’ll look at a phase space of $-2 \le x, y \le 2$ with a simple interval spacing of 0.2. In order to cover the edges of the domain of interest, set the from and to of both variables to -2.2 and 2.2, respectively. (You can set the levels of each to 23, but we’ll use the same interval for both $x$ and $y$, so variable-specific levels won’t be necessary.)
Click over to the Design tab, set the Method to factorial and the Levels to 23, and click on the Build button to generate the sample. The uniformly spaced 2D factorial design can be viewed on the Plot tab, as shown below.
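For readers who want to check the design outside Nodeworks, the same uniformly spaced factorial sample can be sketched with NumPy. This is only an illustrative stand-in for the Design of Experiments node, not its implementation; the variable names (`x_levels`, `y_levels`, `xy`) are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the Design of Experiments node:
# 23 evenly spaced levels per variable from -2.2 to 2.2.
levels = 23
x_levels = np.linspace(-2.2, 2.2, levels)
y_levels = np.linspace(-2.2, 2.2, levels)

# Full factorial design: every x level is paired with every y level.
xx, yy = np.meshgrid(x_levels, y_levels, indexing="ij")
xy = np.column_stack([xx.ravel(), yy.ravel()])

print(xy.shape)               # (529, 2)
print(np.diff(x_levels)[:3])  # spacing between adjacent levels
```

With 23 levels spanning 4.4 units there are 22 intervals of width 0.2, and the two-variable factorial design contains 23 × 23 = 529 points.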

Model Evaluation¶
Now we need to evaluate the model at the sampling points. In this case, we are using the simple function $f(x, y)$ implemented in the code below.
Right click or type to access the node menu and add a Code node.
Set the arguments to xy (since we will pass into the Code node an array of the $(x, y)$ values at the 529 design points). Evaluate the model function $f(x, y)$ with Python. Non-Python users can simply copy and paste the code below into the function entry.
import numpy as np
# xy is the (n_points, 2) DOE array; returnOut is the Code node output
x, y = xy[:, 0], xy[:, 1]
returnOut = 2.5*np.exp(-(x**2 + y**2))*(x + y**2 + np.random.normal(0, 0.1, x.size))
Then set the Output Selection in the DOE node to DOE Array (our $(x, y)$ array). Add a connection from the Selected Output terminal of the Design of Experiments node to the xy terminal of the Code node.

At this point, pressing the run button will send the data from the DOE node to the Code node for evaluation. We could connect the Code node output to a print node to verify that the function is being evaluated correctly. However, in this case (with previously tested code), we will direct the Code node output to a Response Surface (RSM) node in the section below.
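The data flow described in this section can also be checked with a standalone script. The sketch below wraps the example function in a hypothetical helper, `model`, and uses a seeded random generator so the noisy response is reproducible; neither the helper name nor the seed comes from Nodeworks.

```python
import numpy as np

def model(xy, rng, noise_sd=0.1):
    """Noisy example function evaluated on an (n, 2) array of (x, y) points."""
    x, y = xy[:, 0], xy[:, 1]
    noise = rng.normal(0.0, noise_sd, size=x.size)
    return 2.5 * np.exp(-(x**2 + y**2)) * (x + y**2 + noise)

# Rebuild the 23 x 23 factorial design used in this example.
levels = np.linspace(-2.2, 2.2, 23)
xx, yy = np.meshgrid(levels, levels, indexing="ij")
xy = np.column_stack([xx.ravel(), yy.ravel()])

rng = np.random.default_rng(0)  # fixed seed, an assumption for repeatability
response = model(xy, rng)
print(response.shape)  # (529,)
```

Setting `noise_sd=0.0` gives the noise-free function, which is a convenient way to confirm the formula at a known point before adding noise.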
Surrogate Modeling¶
Now, we will take the function values computed at the sample points and fit a surrogate model to them.
Right click or type to access the node menu and add a Response Surface node, found in the Surrogate Modeling and Analysis node collection.
Connect both the DOE Matrix terminal on the Design of Experiments node and the returnOut terminal of the Code node to the matrix/response terminal of the Response Surface node.
Click on the Model tab of the Response Surface node and check (under the fit column of the table) the radial basis function model. In the options, set the epsilon value to 0.3. For now, the Function and Smooth fields can be left at multiquadric and 0, respectively.
Now run the sheet by pressing the run button. This will send the data from the DOE node to the Code node, calculating the response of the “model” at the sample points, and from there on to the RSM node, constructing the radial basis function surrogate model. The response data can be viewed in the Data tab in raw table format and in the Plot tab as a 3D surface with response data points from the “actual model” superimposed.
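To make the fitting step concrete, here is a minimal NumPy sketch of a multiquadric radial basis function fit with epsilon = 0.3 and no smoothing. The kernel form, sqrt((r/epsilon)^2 + 1), and the way the smoothing term enters (subtracted from the diagonal) are assumptions modeled on the convention of SciPy's legacy `Rbf` interpolator; the tutorial does not show what the Response Surface node does internally, and the helper names are hypothetical.

```python
import numpy as np

def multiquadric(r, epsilon):
    # Assumed kernel form: sqrt((r / epsilon)**2 + 1)
    return np.sqrt((r / epsilon) ** 2 + 1.0)

def pairwise_dist(a, b):
    # Euclidean distance between every row of a and every row of b.
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def fit_rbf(xy, z, epsilon=0.3, smooth=0.0):
    # Solve (Phi - smooth * I) w = z for the weights (assumed convention).
    A = multiquadric(pairwise_dist(xy, xy), epsilon) - smooth * np.eye(len(xy))
    return np.linalg.solve(A, z)

def predict_rbf(xy_new, xy, weights, epsilon=0.3):
    return multiquadric(pairwise_dist(xy_new, xy), epsilon) @ weights

# Sample the noisy example function on the 23 x 23 factorial design.
levels = np.linspace(-2.2, 2.2, 23)
xx, yy = np.meshgrid(levels, levels, indexing="ij")
xy = np.column_stack([xx.ravel(), yy.ravel()])
rng = np.random.default_rng(0)
x, y = xy[:, 0], xy[:, 1]
z = 2.5 * np.exp(-(x**2 + y**2)) * (x + y**2 + rng.normal(0.0, 0.1, x.size))

weights = fit_rbf(xy, z)
resid = np.max(np.abs(predict_rbf(xy, xy, weights) - z))
print(f"max residual at the sample points: {resid:.2e}")
```

With `smooth=0.0` the fit passes through every sample point up to round-off, so the residual printed here should be very small.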

Although we now have a surrogate model, we may want to know:
- How good is this surrogate?
- What is the error in the surrogate relative to the full model?
- Could we build a better surrogate with different model settings or a different model option?
We’ll explore these questions in the following section.
(Surrogate) Model Error¶
The surrogate constructed in the previous section used a multiquadric radial basis function without smoothing, i.e., we left the Smooth parameter at the default value of 0. In this case, the RBF is a pure interpolator. In other words, each sampled point is matched exactly by the surrogate. If you click over to the Error tab, you may see a little scatter, but this is only from the (Nodeworks default) 10% holdout for cross validation. If you return to the Model tab, set the cross validation points to 0, and refit the model, the parity plot now shows a perfect 1:1 correlation and the error plot is near single precision numerical zero.

However, it is unlikely that this essentially error-free surrogate is the best candidate. Although analysis will not be covered in this example, typically the surrogate will be evaluated at locations in the domain other than the exact sample locations. So, how much error does the surrogate model incur in those instances? One way to answer this is through cross validation. A random subset of the data is withheld, the surrogate is fit to the remaining data, and its error is evaluated on the held-out data. To assess the current surrogate model, hold out 25% of the dataset for cross validation and refit the model. Repeat this procedure while incrementing the Smooth value and you should notice that the cross validation metrics change. A different epsilon, a different radial basis Function, or a different surrogate modeling choice altogether may produce an even better fit than the model presented here. Users can test any of the different model options and compare against the surrogate constructed here using the cross validation metrics.
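The holdout procedure described above can be sketched in a few lines of NumPy. As before, this is an illustration under stated assumptions rather than the Nodeworks implementation: the multiquadric kernel form and the diagonal smoothing convention are modeled on SciPy's legacy `Rbf` interpolator, and the helper names, the seed, and the list of Smooth values are hypothetical.

```python
import numpy as np

def kernel(a, b, epsilon):
    # Assumed multiquadric form: sqrt((r / epsilon)**2 + 1)
    r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.sqrt((r / epsilon) ** 2 + 1.0)

def holdout_rmse(xy, z, smooth, epsilon=0.3, frac=0.25, seed=0):
    """Fit on a random (1 - frac) share of the data, score on the rest."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(xy))
    n_test = int(frac * len(xy))
    test, train = idx[:n_test], idx[n_test:]
    # Smoothing enters as a diagonal shift (assumed convention).
    A = kernel(xy[train], xy[train], epsilon) - smooth * np.eye(len(train))
    w = np.linalg.solve(A, z[train])
    pred = kernel(xy[test], xy[train], epsilon) @ w
    return np.sqrt(np.mean((pred - z[test]) ** 2))

# Noisy example data on the 23 x 23 factorial design.
levels = np.linspace(-2.2, 2.2, 23)
xx, yy = np.meshgrid(levels, levels, indexing="ij")
xy = np.column_stack([xx.ravel(), yy.ravel()])
x, y = xy[:, 0], xy[:, 1]
z = 2.5 * np.exp(-(x**2 + y**2)) * (
    x + y**2 + np.random.default_rng(0).normal(0.0, 0.1, x.size))

# Compare the holdout error for a few illustrative Smooth values.
results = {s: holdout_rmse(xy, z, s) for s in (0.0, 0.01, 0.1, 1.0)}
for smooth, rmse in results.items():
    print(f"smooth={smooth:<5} holdout RMSE={rmse:.4f}")
```

Using the same seed for every Smooth value keeps the train/holdout split fixed, so differences in the reported error come from the model setting rather than from the split.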
