Hello,
The code runs successfully in serial, but when it runs with DMP, the run stops and reports an error. The stop time varies with the number of cores; for example, the run stops at t = 0.91 s with NodesX=1, NodesY=16, NodesZ=1.
The error is as follows:
Thanks for the bug report. I just replicated this crash with the 22.3 version after about an hour of running with 1x16x1 DMP. There is a message at startup:

```
The preconditioner for the linear solver MIGHT NOT be
very efficient with DMP partitions in the y-axis
```

but this should not lead to a crash. I will let you know what we find. In the meantime, can you work around this problem by using a different domain decomposition?
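In case it helps, the decomposition is controlled by the NODESI/NODESJ/NODESK keywords in the project file. The values below are only an illustration of moving the partitions from the y-axis to the x-axis (keep NODESI x NODESJ x NODESK equal to the number of MPI ranks):

```
! project file / mfix.dat: partition along x instead of y
NODESI = 16   ! number of partitions in the x-direction
NODESJ = 1    ! number of partitions in the y-direction
NODESK = 1    ! number of partitions in the z-direction
```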
When there is a crash, it is always better to include more details; otherwise it is very time consuming to debug, and you may not get any response. Useful information includes (this should already be included in the post draft):

- Description: details on how to reproduce the issue. If MFiX crashed or failed, how long does it take?
- Attempts to fix the issue: include what you have tried, what helped and what didn't help.

Did you:

- Select the relevant category (Installation / How to / Bug report / Share)?
- Attach project files?
- Attach the zip file generated from main menu > Submit bug report?
I see you have modified the code. Does it crash if you use the original code? It often takes more effort to write correct parallel code.

That being said, the error message usually means a particle travelled too fast and cannot be matched across processors. Please try increasing the spring stiffness to see if this helps (this would prevent excessive overlap and thus huge contact forces).
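For the DEM linear spring-dashpot model, the normal stiffness is set with the KN and KN_W keywords in the project file (the values below are only illustrative):

```
! project file / mfix.dat: stiffer normal springs reduce particle overlap
KN   = 5000.0   ! particle-particle normal spring constant
KN_W = 5000.0   ! particle-wall normal spring constant
```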
Thanks for your reply. I have tried different domain decompositions, such as 1x8x1, 1x4x1, 1x2x1, and 2x4x1. With different decompositions the crash occurs at different times; some cases may run for a day, but eventually the crash occurs.
Thanks for your reply. Indeed, I have modified the code in two files, calc_force_dem.f and usr3_des.f, to implement particle deletion and insertion. The crash does not occur if we use the original code. I have tried increasing the spring stiffness (from 1000 to 5000), but the code crashes even more quickly. I have also tried different domain decompositions: all cases crash, although the time to crash differs, from about 1 hour at the shortest to about 1 day at the longest. It seems that the code is prone to crashing when a new particle is inserted.
If you are inserting new particles, you need to make sure there is no overlap with any of the existing particles. If there is a large overlap, it will create very large forces and the particle will shoot out of the domain, crossing processors. I think this is the source of your issue here.
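As a rough illustration of the kind of check that helps, here is a minimal, self-contained sketch (not MFiX API; in a UDF you would loop over the DES position and radius arrays instead of the plain arrays used here):

```fortran
! Hypothetical sketch: accept a candidate insertion point only if it does
! not overlap any existing particle.
logical function no_overlap(xc, rc, np, pos, rad)
   implicit none
   double precision, intent(in) :: xc(3)      ! candidate position
   double precision, intent(in) :: rc         ! candidate radius
   integer, intent(in)          :: np         ! number of existing particles
   double precision, intent(in) :: pos(3,np)  ! existing positions
   double precision, intent(in) :: rad(np)    ! existing radii
   integer :: i
   double precision :: d2, rmin
   no_overlap = .true.
   do i = 1, np
      d2   = sum((pos(:,i) - xc)**2)  ! squared center-to-center distance
      rmin = rad(i) + rc              ! contact distance
      if (d2 < rmin*rmin) then        ! overlap detected, reject candidate
         no_overlap = .false.
         return
      end if
   end do
end function no_overlap
```

Note that with DMP the check must also see ghost particles from neighboring partitions; otherwise an insertion near a partition boundary can still overlap a particle owned by another processor.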
I have added code to decrease or freeze the velocity of the newly inserted particles, so they should not create large forces or high velocities. Different speed limits were also tested. This code runs successfully in serial, but it crashes with DMP. The time to crash seems irregular; one hour or ten hours are both possible.
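The limiter is essentially of this form (simplified sketch):

```fortran
! Simplified sketch of the speed cap applied to a freshly inserted particle.
subroutine cap_speed(vel, vmax)
   implicit none
   double precision, intent(inout) :: vel(3)  ! particle velocity
   double precision, intent(in)    :: vmax    ! speed limit
   double precision :: speed
   speed = sqrt(sum(vel**2))
   if (speed > vmax) vel = vel*(vmax/speed)   ! rescale, keep direction
end subroutine cap_speed
```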
The modified code implements particle replacement: one original particle is deleted, and two new particles are inserted at its position. The new particles lie within the sphere of the original particle's diameter.
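For clarity, the replacement geometry is roughly as follows (simplified sketch, assuming an equal-volume split, so each child radius is rp/2^(1/3) and the two centers are offset so both children stay inside the parent sphere):

```fortran
! Simplified sketch of the replacement geometry: split one parent particle
! into two equal-volume children placed inside the parent sphere.
subroutine split_particle(xp, rp, dir, x1, x2, rc)
   implicit none
   double precision, intent(in)  :: xp(3)        ! parent center
   double precision, intent(in)  :: rp           ! parent radius
   double precision, intent(in)  :: dir(3)       ! unit vector, split axis
   double precision, intent(out) :: x1(3), x2(3) ! child centers
   double precision, intent(out) :: rc           ! child radius
   double precision :: off
   rc  = rp/2.0d0**(1.0d0/3.0d0)  ! equal-volume split: rc = rp/2^(1/3)
   off = rp - rc                  ! offset keeps each child inside the parent
   x1  = xp + off*dir
   x2  = xp - off*dir
end subroutine split_particle
```

Note that two equal-volume children placed inside the parent sphere necessarily overlap each other, so the child-child contact at the insertion step needs to be handled carefully (it is another possible source of large initial forces).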