FPE at calc_resid.f:982

I encountered a similar error recently when I tried to continue a run in “RESTART_1” mode. According to the Slurm .out file, the program ran normally for a while and then failed with the following error:


How can I fix that? Thank you in advance.

Ju - this is a different error than was originally reported, so it’s OK to create a new topic, especially since the original problem is reported as “solved”.

Here’s the code that’s triggering the FPE.

978      IJK_RESID = 1
979      MAX_RESID = RESID_IJK( IJK_RESID )
980      DO IJK = ijkstart3, ijkend3
981      IF(.NOT.IS_ON_myPE_wobnd(I_OF(IJK),J_OF(IJK), K_OF(IJK))) CYCLE
982          IF (RESID_IJK( IJK ) > MAX_RESID) THEN
983              IJK_RESID = IJK
984              MAX_RESID = RESID_IJK( IJK_RESID )
985          ENDIF
986      ENDDO

The exception at 982 indicates that RESID_IJK(IJK) must be NaN, otherwise a simple comparison would not trigger an invalid floating-point error.

We need to find out how the NaN value got in there, but in the meantime you can try changing the code in calc_resid.f as follows:

      IJK_RESID = 1
      MAX_RESID = RESID_IJK( IJK_RESID )
      DO IJK = ijkstart3, ijkend3
      IF(.NOT.IS_ON_myPE_wobnd(I_OF(IJK),J_OF(IJK), K_OF(IJK))) CYCLE
          IF (ISNAN(RESID_IJK(IJK))) CYCLE
          IF (RESID_IJK( IJK ) > MAX_RESID) THEN
              IJK_RESID = IJK
              MAX_RESID = RESID_IJK( IJK_RESID )
          ENDIF
      ENDDO

This is simply checking for the NaN value and skipping the test if it occurs. That should get you past this crash, but you might run into other problems further down the line, due to the NaN. Try this out and let us know how it goes.
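
One caveat with the patch above: MAX_RESID is still seeded from RESID_IJK(1), so if that first entry happens to be NaN, the comparison could presumably still trap. Also, ISNAN is a compiler extension (gfortran provides it), while ieee_is_nan from the intrinsic ieee_arithmetic module is the standard spelling. Here is a minimal standalone sketch of a NaN-safe maximum search that also reports where the first NaN sits, which may help with tracking down its origin. The module and subroutine names are made up for illustration and are not part of MFiX, and the sketch works on a plain array rather than the MFiX IJK indexing and IS_ON_myPE_wobnd check.

```fortran
! Sketch only: nan_safe_max_mod and nan_safe_max_loc are hypothetical names,
! not part of MFiX. It searches a plain 1-D array, not the MFiX IJK range.
module nan_safe_max_mod
   use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
   implicit none
   private
   public :: nan_safe_max_loc
contains
   ! imax      : index of the largest non-NaN entry (0 if every entry is NaN)
   ! n_nan     : number of NaN entries found
   ! first_nan : index of the first NaN entry (0 if there is none)
   subroutine nan_safe_max_loc(resid, imax, n_nan, first_nan)
      double precision, intent(in) :: resid(:)
      integer, intent(out) :: imax, n_nan, first_nan
      integer :: i

      imax = 0
      n_nan = 0
      first_nan = 0
      do i = 1, size(resid)
         ! Test for NaN before any ordered comparison touches the value.
         if (ieee_is_nan(resid(i))) then
            n_nan = n_nan + 1
            if (first_nan == 0) first_nan = i
            cycle
         end if
         ! Seed the maximum from the first valid entry, not from index 1.
         if (imax == 0) then
            imax = i
         else if (resid(i) > resid(imax)) then
            imax = i
         end if
      end do
   end subroutine nan_safe_max_loc
end module nan_safe_max_mod
```

If first_nan comes back nonzero, printing that index (and, in MFiX, the corresponding I, J, K) at the first failing time step should show which cell the NaN first appears in.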

– Charles

Hi Charles, this time I started a new run from the same project that caused the problem. The following similar “floating point” error occurred again, this time at around 0.09 s instead of 0.02 s.



How can I fix that? Thank you.