11.1.17. Parallelization Control

Parallel performance depends on several factors, and different options should be evaluated before choosing a decomposition strategy for a given problem. For example, if the J-direction is the most strongly coupled direction, the preconditioning for the linear solver will be poor if the domain is decomposed in that direction. On the other hand, decomposing in all directions reduces the surface area of the processor grid blocks, so the communication cost is lower for the same computational grid. The preconditioner is chosen with the keyword LEQ_PC. In addition to LINE relaxation, one can choose the “DIAG” or “NONE” preconditioner, which reduces inter-processor communication but increases the number of linear equation solver iterations. The DIAG and NONE preconditioners may be appropriate for all equations except the continuity (or pressure and volume fraction correction) equations. Because parallel performance depends strongly on these choices, some trial and error may be required to find the combination of decomposition direction and preconditioner that gives the best performance in production runs.
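The surface-area argument can be illustrated with a rough count of the cell faces that lie on internal partition boundaries. The sketch below is not part of the solver; the function name interface_faces and the 64x64x64 grid are hypothetical, and the count assumes a uniform Cartesian grid split by planar cuts. It is only a proxy for communication volume and ignores message latency, ghost-layer thickness, and the effect of the decomposition on preconditioning.

```python
def interface_faces(imax, jmax, kmax, nodesi, nodesj, nodesk):
    """Count cell faces lying on internal partition boundaries.

    Assumes a uniform Cartesian grid of imax x jmax x kmax cells that is
    split by planar cuts into nodesi x nodesj x nodesk grid blocks.
    """
    cuts_i = (nodesi - 1) * jmax * kmax  # faces on cuts normal to x
    cuts_j = (nodesj - 1) * imax * kmax  # faces on cuts normal to y
    cuts_k = (nodesk - 1) * imax * jmax  # faces on cuts normal to z
    return cuts_i + cuts_j + cuts_k


# Hypothetical 64^3 grid on 8 ranks: slabs in x versus a 2x2x2 layout.
grid = (64, 64, 64)
slab = interface_faces(*grid, 8, 1, 1)
cube = interface_faces(*grid, 2, 2, 2)
print(slab, cube)
```

Under these assumptions the 2x2x2 layout exposes fewer interface faces than the 8x1x1 layout, which is consistent with the lower communication cost described above. Whether it is faster overall still depends on the coupling direction and the preconditioner.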

The product NODESI * NODESJ * NODESK must equal the number of processors specified using mpirun (or an equivalent command). Otherwise, the code will stop with an error.
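When experimenting with decompositions, it can help to list every layout that satisfies this constraint for a given processor count before editing the input file. The following is a minimal sketch; the helper names valid_layouts and layout_matches are hypothetical and are not provided by the code.

```python
def valid_layouts(nprocs):
    """Return every (NODESI, NODESJ, NODESK) triple whose product is nprocs."""
    layouts = []
    for ni in range(1, nprocs + 1):
        if nprocs % ni:
            continue  # ni does not divide the processor count
        rest = nprocs // ni
        for nj in range(1, rest + 1):
            if rest % nj == 0:
                layouts.append((ni, nj, rest // nj))
    return layouts


def layout_matches(nodesi, nodesj, nodesk, nprocs):
    """Check the constraint NODESI * NODESJ * NODESK == nprocs."""
    return nodesi * nodesj * nodesk == nprocs


for layout in valid_layouts(8):
    print(layout)
```

Each triple printed is a candidate for NODESI, NODESJ, and NODESK on 8 processors. Choosing among them is still subject to the coupling and preconditioner considerations discussed above.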

11.1.17.1. NODESI

Data Type: INTEGER

Number of grid blocks in the x-direction.

11.1.17.2. NODESJ

Data Type: INTEGER

Number of grid blocks in the y-direction.

11.1.17.3. NODESK

Data Type: INTEGER

Number of grid blocks in the z-direction.

11.1.17.4. DLB_NODESI(LAYOUT)

Data Type: INTEGER

  • \(1 \le Layout \le 100\)

List of grid blocks in the x-direction used in Dynamic Load Balance (DLB). The DLB will test each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and choose the one providing the best load balance. For each layout, the product DLB_NODESI * DLB_NODESJ * DLB_NODESK must match the number of cores used in the DMP run.

Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.
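The example above can be generated and checked programmatically. The sketch below is a hypothetical helper (dlb_keyword_lines is not part of the code) that verifies each layout against the core count and the layout index range before formatting the keyword assignments. The KEYWORD(index)=value form simply mirrors the example above and is otherwise an assumption.

```python
def dlb_keyword_lines(layouts, ncores):
    """Format DLB_NODESI/J/K assignments for a list of (I, J, K) layouts.

    Raises ValueError if the number of layouts is outside 1..100 or if a
    layout's product does not match ncores.
    """
    if not 1 <= len(layouts) <= 100:
        raise ValueError("the layout index must lie between 1 and 100")
    lines = []
    for index, (ni, nj, nk) in enumerate(layouts, start=1):
        if ni * nj * nk != ncores:
            raise ValueError(
                f"layout {index}: {ni}x{nj}x{nk} does not give {ncores} cores"
            )
        lines.append(f"DLB_NODESI({index})={ni}")
        lines.append(f"DLB_NODESJ({index})={nj}")
        lines.append(f"DLB_NODESK({index})={nk}")
    return lines


# The two 80-core layouts from the example: 4x5x4 and 2x20x2.
lines = dlb_keyword_lines([(4, 5, 4), (2, 20, 2)], ncores=80)
print("\n".join(lines))
```

A layout such as 4x5x5 would be rejected here for an 80-core run, since its product is 100.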

11.1.17.5. DLB_NODESJ(LAYOUT)

Data Type: INTEGER

  • \(1 \le Layout \le 100\)

List of grid blocks in the y-direction used in Dynamic Load Balance (DLB). The DLB will test each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and choose the one providing the best load balance. For each layout, the product DLB_NODESI * DLB_NODESJ * DLB_NODESK must match the number of cores used in the DMP run.

Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.

11.1.17.6. DLB_NODESK(LAYOUT)

Data Type: INTEGER

  • \(1 \le Layout \le 100\)

List of grid blocks in the z-direction used in Dynamic Load Balance (DLB). The DLB will test each partition layout defined by DLB_NODESI, DLB_NODESJ, and DLB_NODESK, and choose the one providing the best load balance. For each layout, the product DLB_NODESI * DLB_NODESJ * DLB_NODESK must match the number of cores used in the DMP run.

Example: To test two 80-core layouts with 4x5x4 and 2x20x2 partitions, define DLB_NODESI(1)=4, DLB_NODESJ(1)=5, DLB_NODESK(1)=4, and DLB_NODESI(2)=2, DLB_NODESJ(2)=20, DLB_NODESK(2)=2.

11.1.17.7. SOLVER_STATISTICS

Data Type: LOGICAL

Print out additional statistics for parallel runs.

11.1.17.8. DEBUG_RESID

Data Type: LOGICAL

Group residuals to reduce global collectives.

11.1.17.9. ENABLE_DMP_LOG

Data Type: LOGICAL

All ranks write error messages.

11.1.17.10. DBGPRN_LAYOUT

Data Type: LOGICAL

Print the index layout for debugging.