11.1.18. Batch Queue

MFiX can be used on systems where code execution is controlled through a batch queue submission system rather than run interactively or as a background job, as shown in the previous section. Usually, the user specifies the wall clock duration of the job, and the batch queuing system prioritizes incoming jobs based on their resource allocation requests. To keep MFiX from terminating abruptly and abnormally at the end of the batch job session, several keywords need to be entered in mfix.dat. Controlled, clean termination matters in batch queue environments because the system may terminate the batch job while MFiX is writing out *.SP files, which may corrupt the files or cause loss of data.

For this purpose, MFiX checks at the beginning of each time step whether the user-specified termination criterion has been reached. However, to avoid performance bottlenecks on small systems where the user is running jobs without a batch queue, this feature is disabled by default. To enable it, enter the following block of keywords into mfix.dat.

CHK_BATCHQ_END = .TRUE.  ! Enable the controlled termination feature
BATCH_WALLCLOCK = 3600.0 ! Specify the total wall clock duration
                         ! of your job in seconds
TERM_BUFFER = 300.0      ! Specify a buffer time to start
                         ! clean termination of MFiX

Setting CHK_BATCHQ_END = .TRUE. in mfix.dat enables checking of the termination criterion at the beginning of each time step. In the above example, the user has set the total wall clock time for the batch session to 1 hour (specified in seconds in mfix.dat) and a buffer of 300 seconds, so that MFiX has sufficient time to terminate cleanly by writing out all *.SP and *.RES files before the batch session ends. The duration of the buffer is critical for simulations with large files. MFiX starts clean termination once elapsed time >= (BATCH_WALLCLOCK - TERM_BUFFER).
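As a worked example of this check: with BATCH_WALLCLOCK = 3600.0 and TERM_BUFFER = 300.0, clean termination begins once the elapsed time reaches 3300 seconds. The shell sketch below only mirrors that arithmetic; it is not MFiX source code, the function names are hypothetical, and whole seconds are assumed.

```shell
# Illustrative only: mirrors the check described above, not MFiX source code.
# Function names are hypothetical; whole (integer) seconds are assumed.

# Elapsed time, in seconds, at which clean termination would begin.
termination_threshold() {
    # $1 = BATCH_WALLCLOCK, $2 = TERM_BUFFER
    echo $(( $1 - $2 ))
}

# Succeeds (exit status 0) when elapsed >= BATCH_WALLCLOCK - TERM_BUFFER.
should_terminate() {
    # $1 = elapsed time, $2 = BATCH_WALLCLOCK, $3 = TERM_BUFFER
    [ "$1" -ge "$(termination_threshold "$2" "$3")" ]
}

termination_threshold 3600 300   # prints 3300
```

A larger TERM_BUFFER lowers the threshold, which leaves more time for writing large output files at the cost of less simulated time per batch session.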

Another way to gracefully terminate MFiX as soon as possible is to create an empty file named MFIX.STOP (filename all uppercase) in the working directory where MFiX runs. If the MFIX.STOP file is detected at the beginning of a time step, MFiX terminates gracefully by saving *.RES files. The CHK_BATCHQ_END flag must be set to .TRUE. to activate this feature.

The following terminal command can be used to gracefully terminate MFiX:

> touch MFIX.STOP

Remember to erase the file once MFiX terminates; otherwise, the next time MFiX is run in the same directory it will terminate immediately.

> rm -f -r ./MFIX.STOP
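
Because a leftover MFIX.STOP stops the next run immediately, a job script can clear it before launching MFiX. The following is a minimal sketch, assuming a POSIX shell; the helper names and the optional run-directory argument are hypothetical and not part of MFiX, and the command that launches MFiX is omitted because it depends on the installation.

```shell
# Hypothetical helpers, not part of MFiX.

# Remove a stale stop file from a run directory (default: current directory).
clear_stop_file() {
    run_dir="${1:-.}"
    if [ -e "$run_dir/MFIX.STOP" ]; then
        rm -f "$run_dir/MFIX.STOP"
        echo "removed stale MFIX.STOP from $run_dir"
    fi
}

# Request a graceful stop by creating the empty stop file.
request_stop() {
    run_dir="${1:-.}"
    touch "$run_dir/MFIX.STOP"
}

clear_stop_file .   # call before launching MFiX in this directory
```

Keep in mind that the stop file is only checked when CHK_BATCHQ_END is set to .TRUE. in mfix.dat.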

11.1.18.1. CHK_BATCHQ_END

Data Type: LOGICAL

Enables the controlled termination feature when running under a batch queue system, forcing MFiX to terminate cleanly before the end of the wall clock time allocated to the batch session.

11.1.18.2. BATCH_WALLCLOCK

Data Type: DOUBLE PRECISION

Total wall-clock duration of the job, in seconds.
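
Batch systems often express the requested wall clock time as HH:MM:SS, while BATCH_WALLCLOCK expects seconds. The conversion can be sketched as follows; the helper name is hypothetical, and the HH:MM:SS input format is an assumption about the scheduler, not something MFiX defines.

```shell
# Hypothetical helper, not part of MFiX: convert HH:MM:SS to seconds.
walltime_to_seconds() {
    # awk reads each colon-separated field as a decimal number,
    # so zero-padded values such as "08" are handled correctly.
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

walltime_to_seconds 01:00:00   # prints 3600
```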

11.1.18.3. TERM_BUFFER

Data Type: DOUBLE PRECISION

Buffer time, in seconds, that allows MFiX to write out its files and terminate cleanly before the queue wall clock time limit is reached. Choose it such that (BATCH_WALLCLOCK - TERM_BUFFER) is less than the batch queue wall clock time limit.