MFIX parallel version slower when using multiple nodes on an HPC cluster

Fabio.Dioguardi · May 15, 2020, 9:16am

Dear community,

I am actually not 100% sure this is a problem related to MFIX itself.
I have noticed that parallel simulations on our HPC cluster run slower when they span multiple nodes. For example, I ran the same MFIX simulation (with the latest version) twice: once on 16 cores (the number of cores available in each node, hence one node) and once on 96 cores, hence 6 nodes.
The first one is always significantly faster than the second one (169 vs. 379 minutes).

At first I thought it might be a problem with our cluster, but then I ran a similar test with another model, launched with mpirun as for MFIX. For that model, the simulation with 96 cores was significantly faster than the one with 16 cores. So I am now wondering whether the slowdown is related to the parallel version of MFIX.

I used the same compiler and MPI library for both models (gfortran and Open MPI).

Thanks
Fabio

gaoxi · May 19, 2020, 2:59pm

For your first case, you may be using more cores than the simulation needs. More cores do not always give a faster run. For example, when you simulate a small-scale problem on many cores, the communication overhead becomes the limiting factor and can even reduce the speed.
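
To illustrate the trade-off, here is a minimal Python sketch of a toy cost model. It is not MFIX's performance model: the function name `step_time`, the cost constants, the cubic-subdomain assumption, and the log2 scaling of the synchronization term are all hypothetical, chosen only to show how the time per step can rise with the core count once communication outweighs the per-core computation. The model also ignores the difference between communication inside a node and between nodes.

```python
import math


def step_time(n_cells, n_cores, t_cell=1.0e-6, t_halo=1.0e-6, t_sync=5.0e-3):
    """Toy estimate of wall time per step, in seconds.

    All constants are illustrative assumptions, not measured MFIX values:
      t_cell  cost to update one cell
      t_halo  cost to exchange one ghost (halo) cell
      t_sync  cost per level of a tree-like global synchronization
    """
    cells_per_core = n_cells / n_cores
    compute = t_cell * cells_per_core  # shrinks as cores are added
    if n_cores == 1:
        return compute  # serial run: no communication at all
    # Assume each core owns a cubic subdomain and exchanges its 6 faces.
    halo = t_halo * 6.0 * cells_per_core ** (2.0 / 3.0)
    # Assume global synchronization grows with log2 of the core count.
    sync = t_sync * math.log2(n_cores)
    return compute + halo + sync


if __name__ == "__main__":
    for n_cells in (50_000, 50_000_000):
        for n_cores in (16, 96):
            t = step_time(n_cells, n_cores)
            print(f"{n_cells:>10d} cells, {n_cores:>2d} cores: {t:.4f} s per step")
```

With these assumed constants, the small grid is slower on 96 cores than on 16, while the large grid is faster on 96. The crossover point depends entirely on the constants, so on a real cluster it is better found by timing the actual case at several core counts.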