When attempting to restart the simulation using ‘restart_1’, I get the following error:
Initial DES Particle array size: 6146
Message 1010: Read in data from .RES file for TIME = 5.1000
Time step number (NSTEP) = 52403
Compressible: IJK_P_g remaining undefined.
Resizing DES MPI buffers: 1.5 MB (+202.9%)
[ins006:4083590] *** An error occurred in MPI_Wait
[ins006:4083590] *** reported by process [4088528897,25]
[ins006:4083590] *** on communicator MPI_COMM_WORLD
[ins006:4083590] *** MPI_ERR_TRUNCATE: message truncated
[ins006:4083590] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[ins006:4083590] *** and potentially your MPI job)
Rerunning the restart sometimes gives a different error instead:
A request was made to bind that would require binding
processes to more cpus than are available in your allocation:
Application: ./mfixsolver
#processes: 80
Mapping policy: BYCORE
Binding policy: CORE
You can override this protection by adding the "overload-allowed"
option to your binding directive.
The simulation uses DEM with air as the fluid. UDFs usr0_des.f and usr3_des.f are used.
Other info:
This simulation runs until the job times out, but it then fails on restart. I also run other, otherwise identical simulations whose only difference is the use of variable particle densities. Interestingly, about half of those restart with no issue, while the other half fail with one of the two errors above. The files used are attached (.zip).
I tried:
- Rebuilding the solver - did not help
- Excluding partially occupied nodes - did not help
- Using a different MPI module - did not help
- A similar post suggested the issue is with DLB_DT, but that keyword does not exist in my mfix file
- Using ‘restart_2’ - did not help
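One further check that might help narrow down the second error (the binding message for 80 processes) is to confirm that the domain decomposition in the mfix file, the number of MPI ranks, and the CPUs in the allocation all agree. The following is only a minimal sketch, not part of MFiX: it assumes the decomposition is given by `nodesi`, `nodesj` and `nodesk` written as `key = value` lines, that a missing keyword means 1, and that `#` and `!` start comments. The function names and the sample values are hypothetical.

```python
import re
from math import prod


def decomposition_ranks(mfx_text: str) -> int:
    """Return nodesi * nodesj * nodesk; a missing keyword is assumed to be 1."""
    counts = {"nodesi": 1, "nodesj": 1, "nodesk": 1}
    for line in mfx_text.splitlines():
        # Assumption: '#' and '!' both start a comment in the mfix file.
        line = line.split("#", 1)[0].split("!", 1)[0]
        m = re.match(r"\s*(nodes[ijk])\s*=\s*(\d+)", line, re.IGNORECASE)
        if m:
            counts[m.group(1).lower()] = int(m.group(2))
    return prod(counts.values())


def check_allocation(mfx_text: str, mpi_ranks: int, allocated_cpus: int) -> list:
    """Return a list of human-readable mismatches (empty if everything agrees)."""
    problems = []
    needed = decomposition_ranks(mfx_text)
    if needed != mpi_ranks:
        problems.append(
            f"decomposition needs {needed} ranks but {mpi_ranks} were launched"
        )
    if mpi_ranks > allocated_cpus:
        problems.append(
            f"{mpi_ranks} ranks requested but only {allocated_cpus} CPUs allocated"
        )
    return problems


# Hypothetical decomposition, chosen only so that the product is 80.
sample = """
nodesi = 4
nodesj = 5   # comment after a value
nodesk = 4
"""

for problem in check_allocation(sample, mpi_ranks=80, allocated_cpus=72):
    print(problem)
```

Inside a running SLURM job, the allocated CPU count could probably be taken from the `SLURM_NTASKS` environment variable (set when the job requests a task count), and the rank count from whatever is passed to the MPI launcher in mfixdmp.slurm. A mismatch here would only explain the binding error, not necessarily the MPI_ERR_TRUNCATE one.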
Edit: updated mfixdmp.slurm file in .zip
MFiX_forum 2.zip (45.5 MB)