MFiX runs fine in serial mode, but stalls after a while in DMP/SMP mode without error

I’m seeking assistance with an issue I’m encountering when running the MFIX solver on an HPC cluster. The solver works fine in serial mode, but stalls after a while when executed in DMP/SMP mode, with no error in the log or .out file.

Steps I have taken to troubleshoot the problem:
I have recompiled the MFIX solver using different compilers, without success.

usr_mod.f (880 Bytes)
usr0.f (2.7 KB)
usr1_des.f (3.1 KB)
12_3_Replica.mfx (11.7 KB)
des_thermo_cond_mod.f (28.5 KB)
calc_collision_wall_mod.f (53.9 KB)
des_allocate_mod.f (33.3 KB)
allocate_dummy_cut_cell_arrays.f (3.1 KB)
geometry.stl (14.8 MB)

Thanks for the report, but there is not really enough information here to identify the problem. It would be helpful to see any log files (.LOG, .OUT, etc.) and any files generated by the batch system (slurm.* or whatever your batch system produces), as well as information about the HPC cluster you are using. Please run the command mfixversioninfo and attach the output. Also, it looks like you are using MFiX 21.4, which is 3 years old at this point; can you try with the current version (24.1.1)?

HEATER.txt (1.7 MB)
Here’s the log file. It usually stalls until the compute node’s time allocation runs out.

I can’t compile your code with 21.4. Did you mean you are using MFiX 24.1? Your calc_collision_wall_mod.f has glued-sphere code in it, so it won’t work with any version prior to 24.1. You also need to attach geometry_0001.stl and geometry_0004.stl if you want assistance. Better yet, use the GUI and save all the project files (Submit bug report).

When you write UDFs or modify the code, I recommend building the solver in debug mode; this should catch some issues. When troubleshooting, comment out all of your code and make sure it runs, then gradually re-enable your new code until it fails.

Hello Jeff, I’ll try commenting out the code until I find the issue. In the meantime, the project files are attached here. Thank you.
heater_sim.zip (43.1 MB)

The issue is that you have added a call to CALC_avgTs inside a particle loop, and CALC_avgTs itself loops over all cells, so you are spending so much time in it that the code looks like it is hanging. You need to call CALC_avgTs outside the particle loop.
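To illustrate the scaling problem, here is a toy sketch (not actual MFiX code; all names are invented): a routine that sweeps every cell, called from inside a particle loop, costs O(particles × cells), while hoisting it out of the loop costs O(particles + cells).

```fortran
! Toy illustration of hoisting an expensive cell sweep out of a
! particle loop. Names are invented; this is not MFiX code.
program hoist_demo
   implicit none
   integer, parameter :: n_particles = 10000, n_cells = 100000
   real :: avg_t(n_cells)   ! stand-in for a cell-averaged field
   integer :: ll

   ! Bad pattern: calling the sweep per particle would visit
   ! n_particles * n_cells = 10**9 cells and look like a hang:
   !   do ll = 1, n_particles
   !      call cell_sweep(avg_t)   ! O(n_cells) every iteration
   !   end do

   ! Hoisted: one O(n_cells) sweep, then the particle loop reads it.
   call cell_sweep(avg_t)
   do ll = 1, n_particles
      ! ... use avg_t(...) for particle ll ...
   end do

   print *, 'done, avg_t(1) =', avg_t(1)

contains
   subroutine cell_sweep(t)
      real, intent(out) :: t(:)
      t = 300.0   ! placeholder for the real per-cell averaging
   end subroutine cell_sweep
end program hoist_demo
```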

Thank you Jeff; calling CALC_avgTs outside the loop seems to have solved the problem, but I’m now running into a divergence issue: “MAX_INLET_VEL_FAC. DT < DT_MIN. Recovery not possible!”. From posts about similar issues on the forum, it seems it could be a mesh problem, so I’m trying to see if I can fix that. I’ll report back on the progress soon.

OK. Please try increasing the small cell tolerance and the normal distance tolerance to see if that helps. You can also increase the maximum inlet velocity factor (Numerics > Advanced pane).
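For reference, these settings correspond to project-file keywords. A hedged sketch of the relevant entries in the .mfx file (values are illustrative, not recommendations; check the MFiX keyword reference for exact names, defaults, and valid ranges):

```text
tol_small_cell    = 0.05    ! small cell tolerance (cut-cell meshing)
max_inlet_vel_fac = 5.0     ! relaxes the maximum-velocity divergence check
dt_min            = 1.0e-8  ! smallest allowed time step before the run aborts
```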

Hello Jeff, I increased the small cell tolerance and reduced dt_min, and I could simulate past the diverging time step. Now I’ll hold one of the two parameters constant while varying the other, to pinpoint which parameter actually needed tuning. Thank you!

I also noticed that when I call CALC_avgTs outside the loop, like this:

               IF(FLUID_AT(IJK_FACET) .AND. AREA>ZERO) THEN
                  DES_QW_Cond(IJK_FACET,phase_LL) = &
                     DES_QW_Cond(IJK_FACET,phase_LL) + QSWALL/AREA
                  ! I removed CALL CALC_avgTs from here
                  DES_HW_Cond(IJK_FACET,phase_LL) = &
                     DES_QW_Cond(IJK_FACET,phase_LL)/(TWALL - avgDES_T_s(IJK_FACET))
                  WRITE (*,*) DES_QW_Cond(IJK_FACET,phase_LL), DES_HW_Cond(IJK_FACET,phase_LL)
               ENDIF

            ENDIF ! WALL BDRY
         ENDDO                  ! CELL_COUNT (facets)
      ENDDO  ! LL

      ! Now calling CALC_avgTs after the particle loop
      CALL CALC_avgTs

      RETURN

   END SUBROUTINE CALC_DEM_THERMO_WITH_WALL_STL

END MODULE CALC_COLLISION_WALL

the updates to avgDES_T_s no longer happen within the subroutine where CALC_avgTs was formerly called, even though that subroutine still reads avgDES_T_s. Why is that?

What do you mean by that?

What I mean is that moving CALL CALC_avgTs outside the loop means the values of avgDES_T_s are not refreshed inside the subroutine CALC_DEM_THERMO_WITH_WALL_STL. But I just realized that it won’t matter here, because the simulation doesn’t consider radiation heat transfer.
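For anyone else puzzled by this: if avgDES_T_s is a module-level array (as it appears to be), its values inside CALC_DEM_THERMO_WITH_WALL_STL are simply whatever the most recent call to CALC_avgTs left behind, so moving the call after the loop means the loop reads last-call values. Calling it once before the loop keeps the values fresh without the per-particle cost. A minimal toy sketch of that staleness, with invented names:

```fortran
! Toy sketch: a module-level array only reflects the last call to its
! update routine. Names are invented; this is not MFiX code.
module avg_mod
   implicit none
   real :: avg_t(4) = 0.0        ! stand-in for avgDES_T_s
contains
   subroutine calc_avg()         ! stand-in for CALC_avgTs
      avg_t = avg_t + 1.0        ! pretend this is the cell sweep
   end subroutine calc_avg
end module avg_mod

program stale_demo
   use avg_mod
   implicit none

   ! Reading before any update: stale (initial) values.
   print *, 'before:', avg_t(1)

   ! Calling the update BEFORE the reader keeps the values fresh.
   call calc_avg()
   print *, 'after: ', avg_t(1)
end program stale_demo
```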