Hello, developers. I tried to modify the code in des_thermo_newvalues.f to implement the granular source term input I wanted. I divided the particles into 10 regions based on their distance from the center of the circle. The regions are shown in Figure 1.
Figure 1
I counted the total number of particles in each region at the initial moment, but the results of the DMP run differ from those of the single-core run.
The result is:
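| run | k1 | k2 | k3 | k4 | k5 | k6 | k7 | k8 | k9 | k10 | total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| dmp, core 1 | 93 | 265 | 490 | 697 | 967 | 1144 | 1289 | 1528 | 1725 | 1240 | |
| dmp, core 2 | 128 | 368 | 544 | 688 | 992 | 1200 | 1448 | 1584 | 1695 | 1194 | |
| dmp, core 3 | 112 | 320 | 560 | 752 | 976 | 1168 | 1341 | 1617 | 1749 | 1205 | |
| dmp, core 4 | 93 | 294 | 504 | 728 | 888 | 1160 | 1336 | 1563 | 1741 | 1172 | |
| dmp, sum over cores | 426 | 1247 | 2098 | 2865 | 3823 | 4672 | 5414 | 6292 | 6910 | 4811 | 38558 |
| single-run | 432 | 1264 | 2112 | 2870 | 3811 | 4643 | 5382 | 6265 | 6895 | 4884 | 38558 |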
In the DMP run, my cores are set to 2,1,2. k1 through k10 are the numbers of particles in the different regions on the current core. I counted them at line 141 of des_thermo_newvalues.f. As you can see, the results are different. Is this a bug in the DMP run? 2_umf_2022-11-14T041121.280165.zip (14.5 MB)
Also, after running to 0.73 s, my calculation gave the following error.
Thank you very much if someone can answer my questions.
“It is possible you will get different particle position over time due to the difference in order of operation.”
Since the totals match, and the other numbers are fairly close, I think that this does not indicate a problem, just normal variation.
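The comparison behind that statement can be reproduced with a few lines of Fortran. This is only a sketch: the numbers are copied from the table in the question, while the program and variable names (`compare_ring_counts`, `counts`, `single_run`, `dmp_sum`) are made up for illustration and are not part of MFiX.

```fortran
program compare_ring_counts
  implicit none
  integer, parameter :: nrings = 10, ncores = 4
  integer :: counts(nrings, ncores)   ! counts(k, c): particles in ring k on core c
  integer :: single_run(nrings)       ! counts from the single-core run
  integer :: dmp_sum(nrings)          ! per-ring sum over all cores
  integer :: k

  ! Per-core counts k1..k10 from the DMP run (one core per line).
  ! reshape fills column by column, so each line below becomes counts(:, c).
  counts = reshape([ &
        93, 265, 490, 697, 967, 1144, 1289, 1528, 1725, 1240, &
       128, 368, 544, 688, 992, 1200, 1448, 1584, 1695, 1194, &
       112, 320, 560, 752, 976, 1168, 1341, 1617, 1749, 1205, &
        93, 294, 504, 728, 888, 1160, 1336, 1563, 1741, 1172], &
       [nrings, ncores])

  single_run = [432, 1264, 2112, 2870, 3811, 4643, 5382, 6265, 6895, 4884]

  ! Sum over the core dimension to get one DMP count per ring.
  dmp_sum = sum(counts, dim=2)

  print '(a)', ' ring  dmp_sum   single     diff'
  do k = 1, nrings
     print '(i5, 3i9)', k, dmp_sum(k), single_run(k), dmp_sum(k) - single_run(k)
  end do
  print '(a, i0, a, i0)', 'totals: dmp = ', sum(dmp_sum), ', single = ', sum(single_run)
end program compare_ring_counts
```

The per-ring differences are small compared with the counts themselves and cancel in the total, which is what you would expect if particles are only shifted between neighboring rings rather than lost or counted twice.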
The core dump, on the other hand, indicates that something has gone wrong. Can you copy/paste the last part of the output from the console, including the whole stack trace (not just a screenshot)? Thanks.
@zxc I am able to reproduce the failure here, although it takes several hours of running. The error I’m seeing is a bit unusual:
Error: Solver crash!
munmap_chunk(): invalid pointer
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 des_thermo_newvalues_mod_MOD_des_thermo_newvalues
at 2_umf_2022-11-14T041121.280165/des_thermo_newvalues.f:654
#1 des_time_march_MOD_des_time_step
at des/des_time_march.f:216
#2 run_dem
at mfix.f:211
#3 run_mfix
at mfix.f:146
#4 main_MOD_run_mfix0
at main.f:79
Usually, when MFiX crashes, it is with SIGFPE (floating point error, i.e. zero division or math overflow) or SIGSEGV (invalid pointer access, typically due to particles leaving the domain). In contrast, SIGABRT is relatively rare.
The error is reported on line 654 of des_thermo_newvalues.f in the project directory, that is, in your code, which is your job to debug. But this is somewhat unusual:
652 RETURN
653
654 END SUBROUTINE DES_THERMO_NEWVALUES
655
656 END MODULE DES_THERMO_NEWVALUES_MOD
Note that nothing is really happening on line 654 … hmm …
Going back to the original error message, recall that it said munmap_chunk(): invalid pointer. A little bit of searching for that term reveals that this is typically due to an error in freeing allocated memory (“munmap” is a clue, it means “unmap memory”). It seems some allocated memory is getting deallocated twice, or deallocate is getting called with a bogus pointer value. This may be happening due to some automatic cleanup which occurs on exiting the subroutine (?), or it may be a bug in Open MPI itself (??). I’ll let you know if I get any further debugging this! And please let me know if you figure it out.
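On the "automatic cleanup" idea: in Fortran, a local allocatable array without the SAVE attribute is deallocated automatically when the subroutine returns, so a write past the end of such an array earlier in the routine can show up as a glibc heap error at END SUBROUTINE. Whether that is what happens in this case is an assumption; nothing in the thread confirms it. The sketch below uses hypothetical names (`ring_index`, `count_rings`, `work`, `r_max`) to show a ring-binning loop that clamps the index so that a particle at or beyond the outer radius cannot write outside the array.

```fortran
module ring_count_mod
  implicit none
contains

  ! Map a radial distance r to a ring index in 1..nrings.
  ! Without the clamp, r >= r_max would give an index > nrings.
  pure integer function ring_index(r, r_max, nrings) result(k)
    real, intent(in)    :: r, r_max
    integer, intent(in) :: nrings
    k = int(r / r_max * real(nrings)) + 1
    k = max(1, min(k, nrings))
  end function ring_index

  subroutine count_rings(r, r_max, nrings, counts)
    real, intent(in)     :: r(:)          ! radial distance of each particle
    real, intent(in)     :: r_max         ! outer radius of the last ring
    integer, intent(in)  :: nrings
    integer, intent(out) :: counts(nrings)
    integer, allocatable :: work(:)       ! local allocatable, no SAVE
    integer :: i, k

    allocate(work(nrings))
    work = 0
    do i = 1, size(r)
       k = ring_index(r(i), r_max, nrings)
       work(k) = work(k) + 1              ! always within 1..nrings
    end do
    counts = work
    ! work is deallocated automatically when the subroutine returns
  end subroutine count_rings

end module ring_count_mod

program ring_count_demo
  use ring_count_mod
  implicit none
  ! Made-up test distances; the last two lie at and beyond r_max.
  real :: r(5) = [0.05, 0.55, 0.999, 1.0, 1.3]
  integer :: counts(10)

  call count_rings(r, 1.0, 10, counts)
  print '(10i4)', counts
end program ring_count_demo
```

With gfortran, compiling with `-fcheck=bounds` makes an out-of-range index stop the run at the offending line instead of corrupting the heap silently, which is usually a quicker way to locate this kind of error than reading the glibc message.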
Hello, charles.
I reran the code, but it output a new error.
corrupted size vs. prev_size
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7fa258eea08f in ???
at /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
#1 0x7fa258eea00b in __GI_raise
at ../sysdeps/unix/sysv/linux/raise.c:51
#2 0x7fa258ec9858 in __GI_abort
at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:79
#3 0x7fa258f3426d in __libc_message
at ../sysdeps/posix/libc_fatal.c:155
#4 0x7fa258f3c2fb in malloc_printerr
at /build/glibc-SzIz7B/glibc-2.31/malloc/malloc.c:5347
#5 0x7fa258f3c96a in unlink_chunk
at /build/glibc-SzIz7B/glibc-2.31/malloc/malloc.c:1454
#6 0x7fa258f3de8a in _int_free
at /build/glibc-SzIz7B/glibc-2.31/malloc/malloc.c:4342
#7 0x7fa20c1ebce3 in __des_thermo_newvalues_mod_MOD_des_thermo_newvalues
at /home/u/1_5_umf/des_thermo_newvalues.f:359
#8 0x7fa20c7b2d42 in __des_time_march_MOD_des_time_step
at /home/u/anaconda3/envs/mfix-21.4/share/mfix/src/model/des/des_time_march.f:201
#9 0x7fa20c4e9bc0 in run_dem
at /home/u/anaconda3/envs/mfix-21.4/share/mfix/src/model/mfix.f:211
#10 0x7fa20c4e9aae in run_mfix_
at /home/u/anaconda3/envs/mfix-21.4/share/mfix/src/model/mfix.f:146
It seems to be an error when the program deallocates the array y_pos1. I uploaded the new des_thermo_newvalues.f. My next step will be to change the y_pos dynamic array to a static array and perform the calculation, but the result seems to be different. des_thermo_newvalues.f (38.5 KB)
Simplify the problem as much as possible: decrease the number of particles and decrease the number of rings.
Visualize the data. Look at where the particles are located with different partitions and if the number of particles reported by the code match the visual inspection. This is doable if you can get down to a small number of particles (see above).
Sometimes it is better to start from scratch and add new pieces of code one at a time, rather than trying to debug overly complex code. Add one or two new arrays (allocate/deallocate) at a time until it crashes. That way it will be easy to find out if you forgot to deallocate one.
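When adding arrays one at a time, it can help to guard every allocate and deallocate so that a mismatch is reported right where it happens. The following is a minimal sketch of that pattern, not the code from the attached file: the array name `y_pos1` is borrowed from the thread, but its type, rank, and size here are assumptions.

```fortran
program alloc_guard_demo
  implicit none
  real, allocatable :: y_pos1(:)   ! type and rank are assumed for this sketch
  integer :: ierr, n

  n = 1000                         ! arbitrary size for the demonstration

  ! Allocate only if not already allocated, and check the status.
  if (.not. allocated(y_pos1)) then
     allocate(y_pos1(n), stat=ierr)
     if (ierr /= 0) error stop 'allocate(y_pos1) failed'
  end if

  y_pos1 = 0.0

  ! Deallocate only if currently allocated, and check the status.
  if (allocated(y_pos1)) then
     deallocate(y_pos1, stat=ierr)
     if (ierr /= 0) error stop 'deallocate(y_pos1) failed'
  end if

  ! A second guarded deallocate is a harmless no-op instead of an error.
  if (allocated(y_pos1)) deallocate(y_pos1)

  print '(a, l1)', 'y_pos1 still allocated? ', allocated(y_pos1)
end program alloc_guard_demo
```

The guards only catch a missing or repeated allocate/deallocate. They do not detect writes outside the array bounds, which can produce the same kind of glibc abort.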