Hi, may I ask why the software suddenly quit in the middle of the calculation? Thanks for your reply!
tansuanhua_xiao_2022-11-20T211856.196979.zip (1.3 MB)
Hi, welcome to the MFiX forum.
This one is a little tricky to debug:
$ cd /tmp/tansuanhua_xiao_2022-11-20T211856.196979/
$ file core
core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style,
from '/usr/bin/mpirun --use-hwthread-cpus -mca mpi_warn_on_fork 0 -np 6 /tmp/tansuanh', real uid: 103, effective uid: 103, real gid: 1000, effective gid: 1000, execfn: '/usr/bin/mpirun', platform: 'x86_64'
$ gdb /usr/bin/mpirun core
GNU gdb (Gentoo 12.1 vanilla) 12.1
Core was generated by `/usr/bin/mpirun --use-hwthread-cpus -mca mpi_warn_on_fork 0 -np 6 /tmp/tansuanh'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f26f5cb6c93 in PMIx_Finalize () from /usr/lib64/openmpi/mca_pmix_pmix3x.so
[Current thread is 1 (Thread 0x7f26f5edafc0 (LWP 2149))]
(gdb) where
#0 0x00007f26f5cb6c93 in PMIx_Finalize () from /usr/lib64/openmpi/mca_pmix_pmix3x.so
#1 0x00007f26f5c822ba in pmix3x_client_finalize () from /usr/lib64/openmpi/mca_pmix_pmix3x.so
#2 0x00007f26f5ec285d in ?? () from /usr/lib64/openmpi/mca_ess_hnp.so
#3 0x00007f26f6224cf6 in ?? () from /usr/lib64/libevent_core-2.1.so.7
#4 0x00007f26f62259d7 in event_base_loop () from /usr/lib64/libevent_core-2.1.so.7
#5 0x000055c8fa8223c1 in ?? ()
#6 0x00007f26f605b2ca in __libc_start_call_main () from /lib64/libc.so.6
#7 0x00007f26f605b385 in __libc_start_main () from /lib64/libc.so.6
#8 0x000055c8fa822131 in ?? ()
(gdb) quit
The crash is in mpirun, not mfix, so I’m not quite sure what to do next. The routine PMIx_Finalize is part of OpenMPI.
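As an aside, the `file` output above already names the program that dumped core, which is worth checking before choosing the executable to hand to gdb. Here is a minimal Python sketch that pulls the command line and `execfn` out of that text. The function name `core_origin` is just an illustration, and it assumes the `from '...'` and `execfn: '...'` fields appear in the format shown above.

```python
import re

def core_origin(file_output: str) -> dict:
    """Extract the generating command and execfn from `file core` output.

    Hypothetical helper: assumes the fields are single-quoted, as in the
    transcript above. Missing fields are returned as None.
    """
    cmd = re.search(r"from '([^']*)'", file_output)
    execfn = re.search(r"execfn: '([^']*)'", file_output)
    return {
        "command": cmd.group(1) if cmd else None,
        "execfn": execfn.group(1) if execfn else None,
    }

# Output of `file core` copied from the session above (path is truncated by `file`).
sample = (
    "core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, "
    "from '/usr/bin/mpirun --use-hwthread-cpus -mca mpi_warn_on_fork 0 -np 6 /tmp/tansuanh', "
    "real uid: 103, effective uid: 103, real gid: 1000, effective gid: 1000, "
    "execfn: '/usr/bin/mpirun', platform: 'x86_64'"
)

info = core_origin(sample)
print(info["execfn"])
```

If `execfn` points at mpirun rather than the solver, as it does here, the backtrace describes the launcher and not the MFiX code itself.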
You should try running in serial (no DMP) and without your usr_rates_des.f to see if that changes things. Also experiment with different domain decompositions. Perhaps someone else has seen this type of OpenMPI crash before and can comment.
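To make "different domain decompositions" concrete: the run above used `-np 6`, so the number of partitions in the three directions has to multiply to 6. The sketch below simply enumerates those combinations. It assumes the decomposition is set through the `nodesi`, `nodesj` and `nodesk` keywords, so please check the names against your project file; the function name `decompositions` is made up for this example.

```python
def decompositions(nprocs: int) -> list:
    """Return every (nodesi, nodesj, nodesk) triple whose product is nprocs."""
    return [
        (i, j, nprocs // (i * j))
        for i in range(1, nprocs + 1) if nprocs % i == 0            # i divides nprocs
        for j in range(1, nprocs // i + 1) if (nprocs // i) % j == 0  # j divides the rest
    ]

# Candidate layouts for the 6-process run from the core file.
for nodesi, nodesj, nodesk in decompositions(6):
    print(f"nodesi={nodesi} nodesj={nodesj} nodesk={nodesk}")
```

Not every layout will suit a given geometry, so treat the list as candidates to try, not recommendations. The serial case corresponds to `decompositions(1)`, which yields only `(1, 1, 1)`.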
– Charles