Solver stops immediately without showing any error

mariam.yehia33 · June 5, 2021, 4:25pm

Dears,

I have attached my .mfx file. The build succeeds, but the run stops by itself after several seconds without showing any error. I need help understanding why.

It only says “Mfix process has stopped” in the console pane. Can you try running the attached file and tell me what’s wrong? Thanks in advance. I’m using version 21.1.4.

wang_new_2021-06-05T181632.626867.zip (61.2 KB)

mariam.yehia33 · June 6, 2021, 11:51am

Update: When I switched the thermal conductivity model for the solid phases (sand and biomass) from “Musser” to “No conductive heat flux” in the solids pane, the model ran. So what is wrong with adding the thermal conductivities?

cgw · June 7, 2021, 4:31pm

Hi Mariam

When I run the wang_new case I get this error in the console:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f5b347936b0 in ???
#1  0x7f5b34792865 in ???
#2  0x7f5b8120b78f in ???
#3  0x7f5b34e703f1 in __calc_collision_wall_MOD_calc_dem_thermo_with_wall_stl
	at /home/cgw/Work/NETL/mfix/model/des/calc_collision_wall_mod.f:994
#4  0x7f5b34e25f24 in __des_time_march_MOD_des_time_step
	at /home/cgw/Work/NETL/mfix/model/des/des_time_march.f:186

These solver backtraces are still not triggering error popups, so they can be easy to miss, but they are printed in red to make them stand out. You may have to scroll up in the console to find the message. There are also a lot of lines of “junk”. The next release will improve the reporting of solver backtraces.

Now the particular error is at this line, and it’s an invalid memory reference, so it looks like an array indexing problem:

                  AREA=AREA_CUT(IJK_FACET)

Adding some print statements:

                  write(*,*) "I, J, K=", I_FACET, J_FACET, K_FACET
                  IJK_FACET=funijk(I_facet,J_facet,K_facet)
                  write(*,*) "IJK_FACET=", IJK_FACET
                  AREA=AREA_CUT(IJK_FACET)

When the error occurs, I am seeing:

I, J, K=          47           2           1
IJK_FACET=        6994

The I index is beyond the size of the grid (IMAX=46). I’m not sure why this is happening - we will look into this and get back to you.

– Charles

cgw · June 7, 2021, 6:56pm

@mariam.yehia33 - This is a bug in MFiX. Since this is a rectangular geometry (no STL) the keyword cartesian_grid is set to False, which skips the initialization of certain arrays, including AREA_CUT and BLOCKED_CELL_AT. Any attempt to access these arrays will cause a segfault (as you have seen).

This will be fixed in the 21.2 release due later this month. If you can’t wait, or would like to help us with testing, you can apply this patch:

patch.txt (1.4 KB)

Hope this helps,
– Charles

mariam.yehia33 · June 10, 2021, 8:23am

Hello Charles,

I have applied the patch you sent, and this time the model ran. However, it diverged at a run time of about 0.5 sec.

This time it gave the error:

ERROR time_step.f:193
DT < DT_MIN. Recovery not possible!

I have attached the file updated with the patch applied in the calc_collision_wall_mod.f subroutine.
I’m trying to figure out what is wrong with the model this time. I hope you can help too.

wang_new_2021-06-10T102542.700772.zip (99.8 KB)

Thanks a lot,
Mariam

cgw · June 14, 2021, 3:59pm

@mariam.yehia33 - I ran this case for about 15 minutes with no errors. It is proceeding pretty slowly though, at this rate it will take 2-3 days to complete. Here’s a plot of dt:

mariam.yehia33 · June 15, 2021, 8:01am

Hello Charles,

There is a misunderstanding. What I mean is that the run diverged after 0.5 sec of simulation time. It took 3 to 4 hours of running to reach 0.5 sec of simulation time, and then it diverged.

Can you tell me how to find the reason behind this error, "DT < DT_MIN. Recovery not possible!", and how to deal with it so I can continue with the simulation?

Thanks a lot,
Mariam

cgw · June 15, 2021, 4:31pm

@mariam.yehia33 - Have you seen this document?
https://mfix.netl.doe.gov/doc/mfix/21.1.4/html/reference/faq.html#what-do-i-do-if-a-run-does-not-converge

Another suggestion that came up recently on the user forum is to use the ideal gas law for the fluid phase density, as remarked in the thread “Simulation is not running showing some meshing error!” (post #6, by jeff.dietiker).

– Charles

jeff.dietiker · June 16, 2021, 4:36pm

The issue is the biomass species do not have valid specific heat coefficients. You can plot Cp vs Temperature in the species popup, near the bottom left corner (Plot CP(T) button) to verify that. Char is fine, Ash and Wood coefficients are zero and thus have a zero Cp, H2O_L is only defined till T=600K, then set to zero.

When the biomass particles heat up above 600K, their Cp becomes zero and this creates a division by zero. This somehow doesn’t crash the simulation but instead generates a NaN and it spirals down from there.

We can try to put safeguards in the GUI so the Cp coefficients are not all zeros, and maybe try to catch a zero Cp in the solver. In the meantime I recommend plotting Cp vs Temperature as a sanity check.
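As a scripted version of that sanity check, here is a minimal Python sketch (not MFiX code). It assumes the species uses the usual Burcat-style polynomial, Cp/R = a1 + a2*T + a3*T^2 + a4*T^3 + a5*T^4, with one coefficient set below a common temperature and another at or above it. The function names and the coefficient values are placeholders for illustration, not real species data.

```python
def cp_over_r(T, low, high, t_common):
    """Evaluate Cp/R = a1 + a2*T + a3*T^2 + a4*T^3 + a5*T^4.

    `low` is used below `t_common`, `high` at or above it.
    """
    a = low if T < t_common else high
    return a[0] + a[1] * T + a[2] * T**2 + a[3] * T**3 + a[4] * T**4


def find_bad_cp(low, high, t_common, t_min, t_max, step):
    """Return the sampled temperatures where Cp/R is zero or negative."""
    bad = []
    n = int(round((t_max - t_min) / step))
    for i in range(n + 1):
        T = t_min + i * step
        if cp_over_r(T, low, high, t_common) <= 0.0:
            bad.append(T)
    return bad


# Placeholder coefficients (NOT real data): a constant Cp/R below 600 K
# and an all-zero high-temperature set, mimicking the situation described above.
LOW = (9.0, 0.0, 0.0, 0.0, 0.0)
HIGH = (0.0, 0.0, 0.0, 0.0, 0.0)

bad_temperatures = find_bad_cp(LOW, HIGH, t_common=600.0,
                               t_min=300.0, t_max=1000.0, step=100.0)
print(bad_temperatures)  # temperatures at which a division by Cp would fail
```

An empty list over the whole temperature range the particles can reach is what you want to see before starting a long run.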

mariam.yehia33 · June 28, 2021, 9:53pm

Hello all,

Update: after adding appropriate cp coefficients for wood and ash from the available literature, and also adding approximate cp values for liquid H2O beyond 600K, the model has run to simulation time = 9 sec so far without divergence. Thank you all a lot for your help. I really appreciate it.

Many thanks,
Mariam

mariam.yehia33 · December 30, 2021, 8:27am

Hello @jeff.dietiker and @cgw ,

It has been a while since this topic was active, but I still need to know what assumption is appropriate to prevent divergence when the biomass particles are heated to 600 K, given the lack of H2O_L cp data. In other words, how can I extend the water cp values to a temperature of 1000 K?

Thanks,
Mariam

cgw · December 30, 2021, 7:01pm

There’s another species definition in the Burcat database for H2O that covers a wider temperature range. This is gas phase rather than liquid, but I don’t imagine you are dealing with liquid H2O at temperatures above 600K. It’s up to you to determine if this is correct for your applications. You can also input your own coefficients for the thermal polynomials, based on your own research.

– Charles
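One possible way to build your own coefficients, shown here only as a sketch: hold Cp constant above the last valid temperature, at the value the existing polynomial gives there. This is an assumption for illustration, not a recommendation from the MFiX developers, and whether it is physically acceptable is for you to judge. For the polynomial Cp/R = a1 + a2*T + a3*T^2 + a4*T^3 + a5*T^4, a constant Cp means a1 equals Cp/R at the break temperature and a2 to a5 are zero. The function names and the numbers are hypothetical, and the sketch covers Cp only; the enthalpy coefficient is not addressed.

```python
def cp_over_r(T, a):
    """Evaluate Cp/R = a1 + a2*T + a3*T^2 + a4*T^3 + a5*T^4."""
    return a[0] + a[1] * T + a[2] * T**2 + a[3] * T**3 + a[4] * T**4


def extend_with_constant_cp(low, t_break):
    """Return five Cp coefficients that hold Cp/R at its value at t_break.

    Assumption for illustration: Cp stays constant above t_break.
    """
    return (cp_over_r(t_break, low), 0.0, 0.0, 0.0, 0.0)


# Placeholder low-temperature coefficients (NOT real H2O_L data).
LOW = (2.0, 0.01, 0.0, 0.0, 0.0)

HIGH = extend_with_constant_cp(LOW, t_break=600.0)
print(HIGH)
print(cp_over_r(1000.0, HIGH))  # same Cp/R as at the 600 K break point
```

With this choice Cp is continuous at the break temperature and never drops to zero above it.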

mariam.yehia33 · December 31, 2021, 3:19pm

Thanks, Charles.

The liquid water shouldn’t reach 600K in the liquid phase. However, the biomass particle includes H2O_L among its species, and when the particle temperature approaches 600K, the case diverges. I have a drying chemical reaction implemented as well. What else can I do?

Here is the file:
wang_new_2021-12-31T121850.521748.zip (15.9 MB)

It diverges at 8.13 sec when Tp=566 K, so I think the problem is with the water cp.

cgw · January 6, 2022, 5:16pm

I ran this case on 8 cores and got a divergence at t=7.963695s with

DT < DT_MIN.  Recovery not possible!

Are you sure this is due to the temperature being too high? I’m not sure that’s the cause. This was not an FPE like you originally reported, it’s a different error.

I’m seeing a lot of warnings related to inlet velocity - 35 of these:

Velocity exceeds limit:   502.99
in cell: I =   21   J =    7   K =    1
Epg =  0.94458     Ug =   244.68     Vg =   506.42     Wg =   0.0000
To change the limit, adjust the scale factor MAX_INLET_VEL_FAC.

with a big flood of them right before the divergence. There are also a lot of instances of this warning (which is currently showing up as a “Message” rather than “Warning”):

Message from check_data_30.f:372
Time =   7.9628
Warning: The sum of mass fractions is not equal to one.

I suspect that these may be part of the problem.

– Charles
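Since these messages are easy to lose in a long console log, a small script can tally them. The sketch below is a plain-text scan in Python, not an MFiX tool; the search strings are copied from the messages quoted above, while the function name and the dictionary keys are made up for this example.

```python
import re

# Search strings taken from the solver output quoted in this thread.
PATTERNS = {
    "velocity_limit": re.compile(r"Velocity exceeds limit"),
    "mass_fraction_sum": re.compile(r"sum of mass fractions is not equal to one"),
    "dt_min": re.compile(r"DT < DT_MIN"),
}


def count_messages(log_text):
    """Count how many lines of the log match each known message."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = """Velocity exceeds limit:   502.99
in cell: I =   21   J =    7   K =    1
Message from check_data_30.f:372
Warning: The sum of mass fractions is not equal to one.
DT < DT_MIN.  Recovery not possible!
"""

print(count_messages(sample))
```

To use it on a real run, read the saved solver output into a string and pass it to `count_messages`; where that output is saved depends on how you launched the run.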

cgw · January 6, 2022, 5:58pm

“Just for fun”, here are some animations I made from your case. The sand particles are colored by y-velocity, and the gas phase is colored by temperature. (Currently you can show the sand or the biomass but not both, a future version of MFiX will allow this).

I’m not sure if it’s meaningful but just before the divergence, at t=7.93, toward the left side of the domain, a large number of particles suddenly drop (large negative y-velocity) - this looks unusual to me. The video flickers at this point. You can see it better in the second “detail” video.

I’ll let you know if I find anything more about this case.

– Charles

jeff.dietiker · January 11, 2022, 3:24pm

One thing I noticed is you have some very small particles just before it fails (around 60 microns, see below). You also do not have any limiters in the reaction rates. My suggestion is to limit the reactions based on the amount of species (by mass, not mass fraction) and/or particle size.
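To make the suggestion concrete, here is a minimal sketch of such a limiter, written in Python for readability rather than as MFiX user code. It assumes the rate is a mass consumption rate in kg/s. The function name, the cut-off diameter `d_min`, and the `max_fraction` cap are hypothetical values chosen for illustration.

```python
def limited_rate(raw_rate, species_mass, dt, diameter,
                 d_min=50e-6, max_fraction=0.5):
    """Limit a species consumption rate (kg/s) for one particle.

    - Returns zero when the particle is at or below d_min, or when no
      reactant mass is left.
    - Otherwise caps the rate so that at most max_fraction of the
      remaining species mass (kg) is consumed in one time step dt (s).
    """
    if diameter <= d_min or species_mass <= 0.0:
        return 0.0
    cap = max_fraction * species_mass / dt
    return max(0.0, min(raw_rate, cap))


# A large raw rate is capped by the mass that is actually available.
print(limited_rate(raw_rate=1.0, species_mass=1e-9, dt=1e-4, diameter=100e-6))
# A very small particle does not react at all.
print(limited_rate(raw_rate=1.0, species_mass=1e-9, dt=1e-4, diameter=40e-6))
```

The cap is based on species mass rather than mass fraction, so a nearly consumed particle cannot lose more mass than it holds in a single step.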