Hi, developers. I installed a Linux virtual machine on the cluster, but I found that DMP parallel runs become slower and slower as the number of cores increases. MFIX version: 22.4.3; case: cyclone_3d_flush_bc.
Parallel computing is hard. More CPUs mean more communication overhead between cores, and if you don't have a high-speed interconnect between the nodes, this can be a significant burden. Also, you are running on a VM, which adds overhead of its own. There can also be configuration issues with the VM that keep it from reaching peak performance. It would be good if you could get help from a local expert, as we really do not know enough about your computing environment. Look at network usage, try running perf top on the worker nodes, etc., and see if you can find the bottleneck.
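Before hunting for the bottleneck, it can help to put numbers on "slower and slower". The sketch below computes speedup and parallel efficiency from wall-clock times of the same case at different core counts. The function name `scaling_table` and the timings are hypothetical, made up for illustration; they are not measurements from this thread.

```python
def scaling_table(timings):
    """Return (cores, speedup, efficiency) rows for a strong-scaling test.

    timings maps core count -> wall-clock seconds for the same case.
    Speedup is relative to the smallest core count in the table.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Efficiency of 1.0 means perfect scaling from the baseline run.
        efficiency = speedup * base_cores / cores
        rows.append((cores, speedup, efficiency))
    return rows


# Hypothetical wall-clock times in seconds (assumed values, for illustration only).
example_timings = {1: 1000.0, 2: 600.0, 4: 500.0, 8: 800.0}

for cores, speedup, efficiency in scaling_table(example_timings):
    print(f"{cores:>3} cores  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

If the efficiency collapses as soon as the run spans more than one node, that points toward the interconnect or VM networking rather than the solver itself.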
OK, thanks a lot. I understand.
Hi,
Open MPI advises against “oversubscribing”, i.e., launching more processes than the number of physical cores (not hardware threads) available on the machine. If oversubscribed, the program may still run, but in “degraded” mode, and performance may suffer. Please see item 13.4.21 at the link below.
13.4. Running MPI applications — Open MPI 5.0.x documentation (open-mpi.org)
Please note: I am not an expert, and “oversubscribe” and “degraded” are just buzzwords to me!
Thank you.
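One rough way to check the physical-core count mentioned above on Linux is to count the distinct (physical id, core id) pairs in /proc/cpuinfo. The sketch below does this on a text sample; the function name `count_physical_cores` and the sample text are assumptions made for illustration. Note that inside a VM the reported cores are virtual CPUs and may not correspond to dedicated physical cores on the host.

```python
def count_physical_cores(cpuinfo_text):
    """Estimate physical cores from text in the format of /proc/cpuinfo.

    Counts unique (physical id, core id) pairs. If those fields are missing
    (as can happen on some VMs), falls back to the number of logical CPUs.
    """
    cores = set()
    logical = 0
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return len(cores) if cores else logical


# Made-up sample: 4 logical CPUs on 2 physical cores (2 hardware threads each).
sample = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

requested_ranks = 4  # hypothetical number of MPI processes
physical = count_physical_cores(sample)
print(f"physical cores: {physical}, oversubscribed: {requested_ranks > physical}")
```

On a real machine, the same function can be given the contents of /proc/cpuinfo instead of the sample string.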
OK, thank you very much!
Note that running jobs from the GUI will not allow you to oversubscribe, but if you modify the mpirun command you can add the --oversubscribe flag. This is, as Prabu pointed out, not recommended!
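For reference, here is a minimal sketch of how such a command line might be assembled. It assumes a solver executable named `./mfixsolver`, a project file named `case.mfx`, and the NODESI/NODESJ/NODESK decomposition keywords passed on the command line; the helper `build_mpirun_command` and both file names are hypothetical, so adjust them to your own setup. The number of MPI ranks is taken as the product of the three decomposition values.

```python
def build_mpirun_command(nodes_i, nodes_j, nodes_k,
                         solver="./mfixsolver", case_file="case.mfx",
                         oversubscribe=False):
    """Build an mpirun argument list for a DMP run (illustrative only).

    solver and case_file are assumed names, not taken from this thread.
    """
    nranks = nodes_i * nodes_j * nodes_k
    cmd = ["mpirun", "-np", str(nranks)]
    if oversubscribe:
        # Open MPI flag that permits more ranks than available slots.
        # Not recommended: performance may suffer.
        cmd.append("--oversubscribe")
    cmd += [solver, "-f", case_file,
            f"NODESI={nodes_i}", f"NODESJ={nodes_j}", f"NODESK={nodes_k}"]
    return cmd


print(" ".join(build_mpirun_command(2, 2, 1)))
print(" ".join(build_mpirun_command(2, 2, 2, oversubscribe=True)))
```

Returning a list rather than a single string makes it straightforward to pass the command to `subprocess.run` without shell quoting issues.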