SQP method using particle_input.dat to import particle positions, but the file cannot be successfully read

With the correct diameter value in the file (0.012787997374897409, as computed by the superquadric designer) the simulation crashes almost immediately with:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 sq_calc_force_mod_MOD_calc_force_superdem
        at des/sq_calc_force_superdem.f:377
#1 des_time_march_MOD_des_time_step
        at des/des_time_march.f:192
#2 run_dem
        at mfix.f:211
#3 run_mfix
        at mfix.f:146
#4 main_MOD_run_mfix0
        at main.f:79

I’ve been experimenting with a version of xpow which temporarily disables overflow checking, so that instead of a value like 1e240 ^ 2 leading to an overflow, it returns inf.
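
The underlying IEEE 754 behavior is easy to demonstrate with NumPy - the sketch below is just an illustration of trapped vs. untrapped overflow, not MFiX's actual xpow code:

import numpy as np

x = np.float64(1e240)
with np.errstate(over='ignore'):   # untrapped: the overflow quietly saturates to inf
    print(x * x)                   # prints inf
with np.errstate(over='raise'):    # trapped: the overflow aborts, like gfortran's FPE trap
    try:
        x * x
    except FloatingPointError as err:
        print('trapped:', err)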

With that modification, the simulation starts off by printing a huge number (191742) of "overlap between two superquadric particles is too large!" messages, and after that it proceeds extremely slowly:

[image: plot2]

But it’s running. I’ll let it go overnight and see what happens.

I’ve made a little progress on this case and found some interesting things.

First, thanks again to @gaoxi for pointing out the radius/diameter mixup.

Comment: I think it would be best for this input to be optional, since it can be computed from the semiaxes - indeed, for non-particle_input.dat cases (automatic particle generation), that is exactly what happens when the diameter is not specified. So why make users specify it here?
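
For illustration, here is a rough numerical sketch of how a bounding-sphere diameter can be derived from the semiaxes and exponents. This is not MFiX's routine, and it assumes the standard superellipsoid surface (|x/a|^m + |y/b|^m)^(n/m) + |z/c|^n = 1; the exact exponent convention should be checked against the MFiX documentation:

import numpy as np

def surface_f(p, a, b, c, m, n):
    # Assumed superellipsoid surface function: F = 0 on the surface, F < 0 inside
    x, y, z = p
    return (abs(x/a)**m + abs(y/b)**m)**(n/m) + abs(z/c)**n - 1.0

def bounding_diameter(a, b, c, m, n, n_dirs=5000, seed=0):
    # Estimate the bounding-sphere diameter: for many random directions, find the
    # distance from the center to the surface by bisection, then take the maximum
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_dirs, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    r_upper = np.sqrt(a*a + b*b + c*c)   # the surface always lies inside this radius
    r_max = 0.0
    for u in dirs:
        lo, hi = 0.0, r_upper
        for _ in range(50):              # bisection; F is monotone along a ray from the center
            mid = 0.5*(lo + hi)
            if surface_f(mid*u, a, b, c, m, n) < 0.0:
                lo = mid
            else:
                hi = mid
        r_max = max(r_max, lo)
    return 2.0*r_max

Sampling directions is crude, but it is good enough for a sanity check against the value reported by the superquadric designer.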

Once I corrected the value in the file, I noticed something interesting in the VTK window:

[image: close-pack]

When we look closely, the particles do seem to be overlapping a bit.

This is quite strange because yesterday when I looked at the particles they were not touching at all:

I think the only thing that has changed is the diameter specified in particle_input.dat. But that should not affect how the particles are displayed in the VTK window because, again, the superquadrics are completely described by the semiaxes and the m, n exponents. Perhaps there is a bug in the visualization code; I will follow up on this.

So I decided to take seriously the thousands of "overlap between two superquadric particles is too large!" messages on the screen.

Looking at your data file, the minimum spacing between the particles in the y-direction is 0.0131989, which is very close to the bounding sphere diameter - close enough to cause trouble:

0.0131989 - 0.012787997374897409 = 0.0004109

Too close for comfort.
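
This kind of check is easy to script - here is a quick sketch, assuming a 38-line header (as in the script below) and that the second numeric column is the y-coordinate:

import numpy as np

# Read the numeric block of particle_input.dat, skipping the (assumed) 38-line header
data = np.loadtxt('particle_input.dat', skiprows=38)
# Distinct y positions (np.unique returns them sorted; assumes lattice values repeat exactly)
y_levels = np.unique(data[:, 1])
print('minimum y spacing:', np.diff(y_levels).min())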

Here's a small Python script (respace.py) that adds a little breathing room:

#!/usr/bin/env python
import numpy as np

fin = open('particle_input.dat')
fout = open('particle_input.dat.new', 'w')
# copy the 38-line header unchanged
for lno in range(38):
    fout.write(fin.readline())
# load the numeric data block (assumed columns: x, y, z)
data = np.loadtxt(fin)
# stretch the y-coordinates by 10% to add spacing between particles
data *= [1, 1.1, 1]
# write the rescaled data
for row in data:
    row.tofile(fout, sep=' ')
    fout.write('\n')
fout.close()

(Rename particle_input.dat.new to particle_input.dat after running this.)

After doing this, the "overlap is too large!" messages went away, at least for a while. I am still getting a crash, but it runs a little further before crashing, and goes a lot faster (I am also running 8-thread SMP, which helps a lot).

Here are the particles with a little more space:

and here’s a little movie showing how far we’ve come:

More to come …

Disabling the FPE trap in xpow and allowing it to return inf:

Now we’ve got a new backtrace:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 sq_rotation_mod_MOD_qmatrixrotatevector
  at des/sq_rotation.f:345
#1 sq_equivalent_radius_mod_MOD_sq_gradient
  at des/sq_equivalent_radius.f:80
#2 sq_contact_newton_dpmethod_mod_MOD_func_dp_a
  at des/sq_contact_detection_newton.f:949
#3 sq_contact_newton_dpmethod_mod_MOD_sq_contact_newton_dp_a
  at des/sq_contact_detection_newton.f:267
#4 sq_calc_force_mod_MOD_calc_force_superdem._omp_fn.0
  at des/sq_calc_force_superdem.f:341
Core was generated by `/home/cgw/mambaforge/envs/mfix-git/bin/python -m mfix_solver.pymfix -d /tmp/n2d'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0  0x00007f4b2a8fcd3d in sq_rotation_mod::qmatrixrotatevector (qrot=..., x_frame_1=..., x_frame_2=..., ixg=2)
    at /home/cgw/mambaforge/envs/mfix-git/share/mfix/src/model/des/sq_rotation.f:345
345	                      & *X_frame_2(2)+Qrot(3,1)*X_frame_2(3))
[Current thread is 1 (Thread 0x7f4b26b046c0 (LWP 17346))]
(gdb) list
340	!         its transpose
341	!
342	! Rotate the vector "X_frame_2" from the local frame (2) to the
343	!         global frame (1)
344	         X_frame_1(1) = 2.D0*(Qrot(1,1)*X_frame_2(1)+Qrot(2,1)          &
345	                      & *X_frame_2(2)+Qrot(3,1)*X_frame_2(3))
346	         X_frame_1(2) = 2.D0*(Qrot(1,2)*X_frame_2(1)+Qrot(2,2)          &
347	                      & *X_frame_2(2)+Qrot(3,2)*X_frame_2(3))
348	         X_frame_1(3) = 2.D0*(Qrot(1,3)*X_frame_2(1)+Qrot(2,3)          &
349	                      & *X_frame_2(2)+Qrot(3,3)*X_frame_2(3))
(gdb) p qrot
$1 = ((0.49903154015346318, -0.0007951998096093725, 0.031094848276286433) (-0.028471787774220762, 0.18959288376215444, 0.46178338615382208) (-0.012525144031196185, -0.46265960066123457, 0.18918037605149429))
(gdb) p x_frame_2
$2 = (-inf, -inf, inf)
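
For what it's worth, the failure mechanism is easy to reproduce outside MFiX: once the vector contains infinities of both signs, the rotation's dot products end up adding +inf and -inf, which is an IEEE "invalid operation" (the result is NaN), and with the invalid trap enabled that raises SIGFPE. A small NumPy sketch using values copied (and rounded) from the gdb dump above:

import numpy as np

Qrot = np.array([[ 0.499032, -0.000795,  0.031095],
                 [-0.028472,  0.189593,  0.461783],
                 [-0.012525, -0.462660,  0.189180]])
x_frame_2 = np.array([-np.inf, -np.inf, np.inf])

terms = Qrot[:, 0] * x_frame_2      # one of the dot products; the multiplies are fine: [-inf, +inf, -inf]
with np.errstate(invalid='raise'):  # emulate gfortran's trap on invalid operations
    try:
        x1 = 2.0 * np.sum(terms)    # (+inf) + (-inf) is an invalid operation -> NaN
    except FloatingPointError as err:
        print('trapped:', err)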

I reduced the number of particles from 4500 to 45 - this is probably a good idea for debugging because it speeds things up considerably. Here's another movie, with the colors corrected (negative y-velocity, downward, is blue/purple; positive y-velocity is yellow). I used a shorter VTK write interval to capture more detail. It starts off slowly but gets pretty wild when the particles hit the bottom and start spinning around like crazy!

The error now is in a new place:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7fc530f8f5cf in ???
#1  0x7fc4e4688e64 in __sq_calc_force_mod_MOD_calc_force_superdem._omp_fn.0
        at model/des/sq_calc_force_superdem.f:377

Next things to try:

  1. disable ALL overflow checks (not just in xpow.c)
  2. try n=6 as Xi suggests.

With all FPE (overflow/zero-division) checks disabled, invalid operations now result in NaN or +/-Inf as appropriate, per IEEE 754:

This is quite interesting - it took about 20 minutes to simulate 2 seconds, and there were no crashes, just some "overlap is too big!" messages.

These disc-shaped particles are spinning very fast. As entertaining as it is to watch, I'm not sure how physical it is - it might be correct. Particles tend to find their happy place and hover there, spinning madly, but eventually they drift off or are hit by another particle.

If you look closely, going frame-by-frame, you can find a few things like this:

Two particles have completely intersected in a physically impossible manner. This is perhaps to be expected, given the high spin rate - we might have just missed the collision between them. dt is 0.01 sec, so it's possible that reducing max_dt would help with these overlaps (at the cost of slowing down the simulation). But since FPE checks are disabled, this was non-fatal - the simulation continued regardless.
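
To put rough numbers on that intuition (the spin rate below is a made-up illustrative value, not something measured from the run):

# How far does a point on the rim of a disc move in one step?  If this is comparable
# to the particle size, a contact can be "stepped over".  All values are assumptions.
omega = 100.0    # hypothetical spin rate, rad/s
r_rim = 0.0064   # roughly half the bounding diameter, m
dt = 0.01        # step size quoted above, s
print(omega * r_rim * dt)   # 0.0064 m per step - about half the bounding diameter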

This tells me that perhaps we need to reconsider the way FPEs are handled in MFiX. Disabling them altogether seems unwise, but for sections of code like the SQP code, where we know large values are common (due to corners and other singularities), it might be easier to disable FPE checks than to rewrite the code to avoid doing things like (huge_number) ** 2 - checking all the inputs to all calculations would be a lot of work and could also slow down the code substantially.

Or perhaps we could add a control to the Numerics pane to enable/disable the various types of FPE check.

Next things to try:

  1. Change shape exponents
  2. Add more particles

Here’s a plot of simulation time vs. real time. Note that when particles are colliding, the slope is much lower. With fewer collisions to check, it speeds up.

Also, I found that unfortunately there was a mistake in my previous runs - I had inadvertently put in the bounding diameter for m=2, n=6 rather than m=2, n=8. Oops. (As I said, having to specify this is error-prone.) The numbers are very close, 0.012851 vs. 0.012788, so perhaps this was not too big of an error. Running again with the correct value gives fairly similar results:


(note that the range of the colorbar for y-velocity has been expanded, so that small velocities are no longer over-emphasized)
and there are still a few overlaps:

Next: change exponent from 8 to 6

OK - changing n=8 to n=6 and updating the bounding diameter did not make any difference, other than changing the shape of the particles slightly (more rounded corners, flat faces a little less flat). It still dies with an FPE when overflow checks are enabled, and runs when they are disabled. And we still have particle overlaps. So I don't think restricting the shape exponent to values <= 6 is the solution, at least not in this case.

Thank you very much for your patient reply, @cgw!
I will study this carefully to see if I can solve the problem of excessive particle overlap. Thank you again for your help!

Thank you for your reply, Mr. Gao. I will give it a try.

I’m not sure the occasional overlaps are something you can solve … I believe this is due to the high rotational speed.

For reference - to turn off FPE checks, currently you must edit the file $CONDA_PREFIX/share/mfix/src/model/gfortran_init.c
and comment out or remove the line

   _gfortran_set_fpe(13);

and recompile.

Note that you must NOT use "Copy the file to source directory for editing" - you must edit the file in place. This means that ALL MFiX simulations will pick up the change; be advised. Currently the override-source-files-in-project-dir feature only works for Fortran files, not C - I will fix this for the next release.

Taking this one step further, I tried simulating all 4500 particles using 100 cores on our cluster, with FPE checks disabled. I increased the particle spacing with the above respace.py script, and changed the grid slightly (6x150x3) so I could use a 2x50 DMP decomposition.

The simulation starts, but diverges with DT < DT_MIN

Setting DT_MIN to 0 just to see what happens … DT decreases until it is on the order of 1e-11, and after that the simulation time just slows to a crawl.

Turning off the fluid solver (pure granular flow) led to a similar result, so the problem is not in the fluid phase.

So I'm afraid the conclusion is that, at this point, it's not feasible to simulate 4500 non-spherical particles - the limit is not CPU speed but dt becoming vanishingly small.

I will follow up here if I find any solution to this.

Thanks for the report, SQP is a new feature and your feedback helps us improve it.

– Charles

Thank you for your help, @cgw!
I will give it a try. I am trying to change the bounding diameter to that of a sphere with the same volume as the non-spherical particle, since this ensures that the volume of the non-spherical particle equals the volume of the equivalent sphere. However, the particles then visibly overlap with each other, and I am not sure whether changing this parameter will affect the calculation results.

Go ahead and try it, but I think that will give a bounding sphere that is too small, and collisions will be missed. The bounding sphere is used as a first step in collision checking: if two particles are separated by more than the bounding diameter, they cannot possibly touch, and the more expensive contact check can be skipped. If you deliberately make the bounding sphere too small, many collisions will be missed. It's important for the bounding diameter to be accurate, which is why I think it should be calculated by MFiX rather than specified by users.
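
To make the role of the bounding sphere concrete, the broad-phase test amounts to something like this (a minimal sketch of the idea, not MFiX's actual code):

import numpy as np

def may_collide(pos1, pos2, bounding_d1, bounding_d2):
    # Broad phase: if the centers are farther apart than the sum of the bounding radii,
    # the particles cannot possibly touch, and the expensive superquadric contact check
    # can be skipped.  A bounding diameter that is too small makes this return False
    # for particles that are actually in contact - a missed collision.
    dist = np.linalg.norm(np.asarray(pos1) - np.asarray(pos2))
    return dist <= 0.5 * (bounding_d1 + bounding_d2)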

Thank you for your advice, @cgw.
If so, are the bounding spheres or the non-spherical particles themselves actually calculated during the MFiX run?

I noticed several issues with the settings:

des_epg_clip = 0.42 ! Set this to 0.1. Since these are non-spherical particles, the packing is much denser, and this might be the cause of the divergence with DT < DT_MIN.

neighbor_search_n = 25 ! Reduce this to 2; it might help convergence.
neighbor_search_rad_ratio = 1.2 ! Add this to avoid missing collision neighbors.

Use a coarser grid; I would recommend a mesh size of about 2 bounding sphere diameters.


Thank you for your guidance, Mr. Gao. I will give it a try.

Thanks @gaoxi - that helps a lot!

With the changes you suggested, dt is now on the order of 1e-3 instead of 1e-11, and the simulation proceeds, albeit a bit slowly (estimated time remaining: 8 hours to reach t=5.0).

DT:

Simulation time:

Since I am using a coarser grid as you suggested, I can only use 40 DMP cores. Perhaps a combination of SMP and DMP would help? (Is that supported? I suppose I'll try it.)

Here’s the updated project file:
N2D-LSFB.mfx (14.1 KB)

I'm going to let this run a bit longer and then upload a movie. Film at 11 🙂

The bounding sphere will be calculated if it is not specified. But due to limitations of the particle_input.dat format, it is not possible to leave it unspecified 🙁

If you are using automatic particle generation you can leave it blank. This is why the control in the Solids/Materials pane is labeled "Optional".