Design of Experiments¶

The Design of Experiments is used to choose the sampling points to be used as input parameters for full model evaluation. It is important in this stage to consider the potential range of interest and ensure that the sampling space completely covers this range so that the resulting surrogate model is analyzed within its range of support.

Variables¶

Variables are added in the first Variables tab by clicking on the add symbol above the table. Variables can be removed from the table by highlighting them in the table and clicking on the symbol. Information about the variable properties are entered and edited in the entry spaces below the table. Some of the properties are type specific, discussed below. All entered variables are listed and summarized in the table.

Variable properties:

variable: used to name and identify the variables.

arg(s): defines the variable argument, i.e., the ineger index if the variable is an array or vector.

units: specify variable units for plotting.

type: Double Precision real variables.

link used to specify the variable as a function of another variable. NOTE that the variable is no longer an independent variable, which can be important when sampling randomly. However, it can also be useful when dependent variables need to be set consistently with one independent random variable. The plot below shows a two variable array which must sum to one like the volume fraction of a two-phase mixture, for example.

from: lower bound of parameter space of this variable dimension.

to: upper bound of parameter space of this variable dimension.

levels: defines the number of evenly spaced sampling points between from and to inclusively. NOTE that this is only used in the specific instance when the factorial design is specified on the Design tab and the Use variable specific levels option is checked.

type: Integer integer valued variables have the same properties as Double Precision variables. NOTE that integers are also treated the same as reals during the calculation of the sampling points and then rounded. Care should be taken to avoid unwanted repeated samples.

type: String text based variables can be added and removed with the + and - symbols, activated by checking the box and named by double clicking on the entry space to the right of the checkbox. NOTE that the functionality of String variables is not currently fully implemented. Specifically, downstream nodes can not handle string variables. Currently, it is recommended that users convert strings into integer or real variables.

type: Logical variables only take integer values of 1 or 0 for True or False.

../_images/doe_variables_linked.png — A design where two variables are linked.¶

Design¶

The Design tab is where the samples are actually constructed. Generally speaking, there are four ways to construct a sampling scheme: ordered, sequentially (sub-random), pseudo-randomly and simply importing a design constructed previously or from a different code. Available Method’s include

Ordered designs:

Factorial

Covary

Central Composite

Sequential designs

Sobol

Hammersly

Halton

Random designs

Monte Carlo

Latin Hypercube

Previously generated designs

The Import button is used to load a design saved locally in csv, comma separated variable format

Additional sampling properties:

Samples specifies the total number of samples drawn

Repeat specifies how many samples are repeated a specified number of times–this can be useful when generating designs for actual experiments and for non-deterministic simulations

Randomize sample order–this applies to the table as well as the order in which the samples are transferred to downstream nodes

Build generates the samples, also used to re-generate the design if properties are adjusted

Export exports the samples to a comma separated variable file

Some other method-specific properties:

Randomize toggles between setting the seed of the pseudorandom number generator from the given seed or from the clock NOTE that randomize should be used with caution and the accepted design should be saved with Export as the design will be re-generated and changed when the sheet is run

Levels a constant number number of sampling intervals to be used with all variables with the Factorial design

Alternatively, users may check Use variable specific levels and the intervals for each variable are taken from the levels value entered previously in the Variables tab

Face in the central composite design specifies circumscribed, inscribed or faced

Alpha in the central composite design specifies orthogonal or rotatable

Optimize in the latin hypercube design allows selection of an optimization technique to improve the space filling of the design. See Latin Hypercube for more.

Iterations in the latin hypercube design allows for the specification of iterations used by the Optimize technique.

../_images/doe_methods.png — Example of designs with 30 samples.¶

Plot¶

The samples are plotted in a 2D scatter plot. The variables can be set by selecting the Y Axis and X Axis variables from the dropdown list. The plot can be customized and saved using the buttons below the plot. By default, all variables are shown with included points in blue and excluded points in red. Excluded and/or included points can be alternatively turned off by right clicking on the plot and unchecking.

Quality¶

The Quality tab analyzes the spatial quality of the design. The Minimum Distance, Maximum Distance and their Ratio (Max/Min) are reported and calculated using scipy.spatial.distance.pdist. Several Distance Metric options are available from the scipy library:

Bray-Curtis

Canberra

Chebyshev

City Block (Manhattan)

Correlation

Cosine

Dice

Euclidean

Hamming

Jaccard

Kulsinski

Mahalanobis

matching - Synonym for Hamming

Minkowski

Rogers-Tanimoto

Russell-Rao

Standardized Euclidean

Sokal-Michener

Sokal-Sneath

Squared Euclidean

Yule

The L2-Discrepancy from Eq.5 of [Fang2001b] is also calculated for the design.

W D^{2} (D) = - (4 / 3)^{K} + 1 / N^{2} Σ_{i, j = 1}^{N} P i_{k = 1}^{K} [3 / 2 - | x_{k}^{1} - x_{k}^{2} | * (1 - | x_{k}^{1} - x_{k}^{2} |)]

References¶

[Fang2001b]

K.T. Fang and C.X. Ma, “Wrap-Around L2-Discrepancy of Random Sampling, Latin Hypercube, and Uniform Designs,” Journal of Complexity, vol. 17, pp. 608-624, 2001.