12.1.17. Parallelization Control
Parallel performance depends on several factors, and different options should
be evaluated before settling on a strategy for a given problem.
For example, if the J-direction is the most strongly coupled direction, the
preconditioning for the linear solver will be poor if the domain is decomposed in
that direction. However, because decomposing in all directions reduces the
surface area of each processor's subdomain, the communication cost is lower for
the same computational grid. The preconditioner is chosen with the keyword LEQ_PC.
In addition to LINE relaxation, one can choose the “DIAG” or “NONE”
preconditioner. These reduce inter-processor communication but increase
the number of linear equation solver iterations. The DIAG and NONE
preconditioners may be appropriate for all equations except the continuity (or
pressure and volume fraction correction) equations. Parallel performance
depends strongly on these choices, and some trial and error may be
required to find the combination of decomposition direction and
preconditioner that gives the best performance in production runs.
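The trade-off between decomposition direction and communication cost can be estimated before running anything. The following is a minimal sketch, not part of the solver: the function name `shared_faces` is hypothetical, and it assumes a uniform grid that divides evenly among blocks and a block that has neighbors on both sides in every decomposed direction.

```python
def shared_faces(imax, jmax, kmax, nodesi, nodesj, nodesk):
    """Approximate number of cell faces one block shares with its neighbors.

    Assumptions (for illustration only): the grid divides evenly, and the
    block has a neighbor on both sides in each decomposed direction.
    """
    # Cells per block along each direction.
    bi, bj, bk = imax / nodesi, jmax / nodesj, kmax / nodesk
    faces = 0.0
    # A direction adds two shared faces only if the domain is split along it.
    if nodesi > 1:
        faces += 2 * bj * bk
    if nodesj > 1:
        faces += 2 * bi * bk
    if nodesk > 1:
        faces += 2 * bi * bj
    return faces


# 64 processes on a 128 x 128 x 128 grid: slab, pencil, and block layouts.
for layout in [(64, 1, 1), (8, 8, 1), (4, 4, 4)]:
    print(layout, shared_faces(128, 128, 128, *layout))
# (64, 1, 1) 32768.0
# (8, 8, 1) 8192.0
# (4, 4, 4) 6144.0
```

Under these assumptions the three-direction layout exchanges the least data per block. The estimate says nothing about preconditioner quality, which still has to be weighed against it.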
The product NODESI * NODESJ * NODESK must be the same as the number of
processors specified using mpirun (or an equivalent command). Otherwise, the
code will return with an error.
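When planning a run, it can help to list every decomposition that satisfies this constraint. The sketch below is a hypothetical helper, not part of the solver. The `fixed_j` argument is an assumption added for illustration: it keeps the J-direction at a chosen block count, for example 1 to avoid decomposing a strongly coupled direction.

```python
def candidate_layouts(nprocs, fixed_j=None):
    """Return all (nodesi, nodesj, nodesk) with nodesi * nodesj * nodesk == nprocs."""
    found = []
    for i in range(1, nprocs + 1):
        if nprocs % i:
            continue  # i does not divide the process count
        rest = nprocs // i
        for j in range(1, rest + 1):
            if rest % j:
                continue  # j does not divide what is left
            k = rest // j
            if fixed_j is None or j == fixed_j:
                found.append((i, j, k))
    return found


# Layouts for 8 processes with no decomposition in the J-direction.
print(candidate_layouts(8, fixed_j=1))
# [(1, 1, 8), (2, 1, 4), (4, 1, 2), (8, 1, 1)]
```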
12.1.17.1. NODESI
Data Type: INTEGER
Number of grid blocks in the x-direction.
12.1.17.2. NODESJ
Data Type: INTEGER
Number of grid blocks in the y-direction.
12.1.17.3. NODESK
Data Type: INTEGER
Number of grid blocks in the z-direction.
12.1.17.4. DLB_NODESI(LAYOUT)
Data Type: INTEGER
\(1 \le Layout \le 100\)
List of grid blocks in the x-direction used in Dynamic Load Balance (DLB). The DLB tests each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and chooses the one providing the best load balance. For each layout, the product DLB_NODESI x DLB_NODESJ x DLB_NODESK must match the number of cores used in the DMP run.
Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.
12.1.17.5. DLB_NODESJ(LAYOUT)
Data Type: INTEGER
\(1 \le Layout \le 100\)
List of grid blocks in the y-direction used in Dynamic Load Balance (DLB). The DLB tests each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and chooses the one providing the best load balance. For each layout, the product DLB_NODESI x DLB_NODESJ x DLB_NODESK must match the number of cores used in the DMP run.
Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.
12.1.17.6. DLB_NODESK(LAYOUT)
Data Type: INTEGER
\(1 \le Layout \le 100\)
List of grid blocks in the z-direction used in Dynamic Load Balance (DLB). The DLB tests each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and chooses the one providing the best load balance. For each layout, the product DLB_NODESI x DLB_NODESJ x DLB_NODESK must match the number of cores used in the DMP run.
Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.
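Because every DLB layout must multiply out to the core count, a quick check before submitting a job can catch a mistyped entry. The following is a minimal sketch using a hypothetical helper, `invalid_dlb_layouts`. It takes the three lists as they would be entered for DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and it is not part of the solver.

```python
def invalid_dlb_layouts(dlb_nodesi, dlb_nodesj, dlb_nodesk, ncores):
    """Return the 1-based indices of layouts whose product differs from ncores."""
    bad = []
    layouts = zip(dlb_nodesi, dlb_nodesj, dlb_nodesk)
    for index, (i, j, k) in enumerate(layouts, start=1):
        if i * j * k != ncores:
            bad.append(index)
    return bad


# The two 80-core layouts from the example: 4x5x4 and 2x20x2.
print(invalid_dlb_layouts([4, 2], [5, 20], [4, 2], 80))
# []

# A mistyped second layout (2x10x2 = 40) is reported by its index.
print(invalid_dlb_layouts([4, 2], [5, 10], [4, 2], 80))
# [2]
```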
12.1.17.7. SOLVER_STATISTICS
Data Type: LOGICAL
Print out additional statistics for parallel runs.
12.1.17.8. DEBUG_RESID
Data Type: LOGICAL
Group residuals to reduce global collectives.
12.1.17.9. ENABLE_DMP_LOG
Data Type: LOGICAL
All ranks write error messages.
12.1.17.10. DBGPRN_LAYOUT
Data Type: LOGICAL
Print the index layout for debugging.