.. _sma-ex6:

Ex. 6: Generic model submission
-------------------------------

This example demonstrates the use of the :ref:`sma-gmc` and :ref:`hpc-queue`
nodes to generate and submit model runs of any model with text-based input
files on an HPC system.

.. note::

   This example was run on NETL's `Joule`_ HPC, which uses a `Slurm`_ based
   queueing system. It should work on other systems as well, but the queue
   commands may need to be changed to match the queue manager in use.

Step 1: Set up the base directory
+++++++++++++++++++++++++++++++++

For this example, we will create a simple Python script that reads two input
values (:math:`x` and :math:`y`) from a text file, evaluates the quadratic bowl
function (:math:`z = x^2 + y^2`), waits a random amount of time, and finally
writes the resulting value (:math:`z`) to another text file.

First, create a new directory, such as ``nodeworks_ex6``, to contain the files.
Next, create and open a file called ``run.py``, which will be our "model". Copy
and paste the following Python code:

.. code-block:: python

   import numpy as np
   import time
   import random

   # read inputs
   matrix = np.atleast_2d(np.loadtxt('./sample.txt'))

   # evaluate function
   rsp = np.sum(matrix**2, axis=1)

   # wait random amount of time
   time.sleep(random.randint(10, 30))  # seconds

   # write result
   np.savetxt('response.txt', rsp)
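
As a quick sanity check of the model logic, the following sketch evaluates the
same expression on a hard-coded sample instead of reading ``sample.txt``. The
input values are arbitrary and chosen only for illustration.

.. code-block:: python

   import numpy as np

   # hypothetical input values, standing in for the contents of sample.txt
   x, y = 0.5, -0.25

   # same steps as run.py, minus the file I/O and the sleep
   matrix = np.atleast_2d([x, y])
   rsp = np.sum(matrix**2, axis=1)

   print(rsp)  # [0.3125]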

Next, create a file named ``sample.txt`` in the same directory. This file is
our text-based model input file. The :ref:`sma-gmc` node will replace the
variables ``${x}`` and ``${y}`` in it with the sample values generated by the
:ref:`sma-doe` node. Copy and paste the following text:

.. code-block::

   ${x} ${y}
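
To illustrate what the replacement produces, here is a minimal sketch using
Python's standard ``string.Template``, which happens to use the same
``${name}`` placeholder syntax. This is only a stand-in, not the mechanism used
internally by the :ref:`sma-gmc` node, and the sample values are illustrative.

.. code-block:: python

   from string import Template

   # contents of the template file from this example
   template_text = "${x} ${y}\n"

   # illustrative sample values for one row of the DOE matrix
   sample = {"x": -0.4622, "y": 0.9629}

   # substitute the placeholders with the sample values
   filled = Template(template_text).substitute(sample)

   print(filled)  # -0.4622 0.9629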

Make sure you save the files and have a directory structure that looks like
this:

::

    nodeworks_ex6
    ├── run.py
    └── sample.txt

Step 2: Set up the nodes
++++++++++++++++++++++++

Open ``Nodeworks``, create a new sheet, and add a :ref:`sma-doe` node. On the
variables tab, create a new variable by pressing the |add| button. Change the
variable name from ``x1`` to just ``x``, matching the first variable in the
``sample.txt`` file. Next, change the range of the variable, replacing the
``0`` in the ``from`` field with ``-1``. Follow the same process to add another
variable: press the |add| button, change the variable name from ``x2`` to
``y``, and replace the ``0`` in the ``from`` field with ``-1``.


.. figure:: ./images/ex6_doe.png
   :align: center

Next, make the samples by going to the ``Design`` tab, selecting
``latin hypercube`` as the ``Method``, and changing the number of ``Samples``
from the default ``10`` to ``20``. Finally, generate the samples by pressing
the ``Build`` button.
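
For readers unfamiliar with Latin hypercube sampling, the idea can be sketched
in plain NumPy as follows. This is not the :ref:`sma-doe` node's implementation;
the function name ``latin_hypercube`` is hypothetical, and the upper bound of
``1`` for both variables is an assumption.

.. code-block:: python

   import numpy as np

   def latin_hypercube(n_samples, bounds, seed=None):
       """Return an (n_samples, n_vars) Latin hypercube design within bounds."""
       rng = np.random.default_rng(seed)
       bounds = np.asarray(bounds, dtype=float)
       n_vars = bounds.shape[0]
       # one random point inside each of n_samples equal-width strata on [0, 1)
       strata = np.arange(n_samples)[:, None]
       unit = (strata + rng.random((n_samples, n_vars))) / n_samples
       # shuffle the strata independently for each variable
       for j in range(n_vars):
           unit[:, j] = rng.permutation(unit[:, j])
       # scale from [0, 1) to [low, high)
       low, high = bounds[:, 0], bounds[:, 1]
       return low + unit * (high - low)

   # x and y from -1 to 1 (upper bound assumed), 20 samples
   samples = latin_hypercube(20, [(-1, 1), (-1, 1)], seed=0)
   print(samples.shape)  # (20, 2)

Each variable's range is divided into 20 equal intervals, and every interval
receives exactly one sample.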

Now, add a :ref:`sma-gmc` node to the sheet and connect the ``DOE Matrix``
terminal from the :ref:`sma-doe` node to the ``DOE Matrix`` terminal on the
:ref:`sma-gmc` node. Select the ``Source directory`` by clicking the |open|
button and browsing to the directory created in Step 1 (``nodeworks_ex6``). The
``File extensions to copy`` and ``File extensions to replace`` lists will be
populated with all the file extensions in the ``Source directory``. In this
example, you should only see ``.py`` and ``.txt``. In the
``File extensions to copy`` list, check the ``.py`` extension. In the
``File extensions to replace`` list, check the ``.txt`` extension. The
``Export directory`` will automatically be set to the ``Source directory`` and
we will leave the default ``Directory prefix`` as ``sim_``.

.. figure:: ./images/ex6_gmc.png
   :align: center

The :ref:`sma-gmc` node is now set up to create a new directory for each sample
in the DOE matrix. It copies the ``run.py`` file into each new directory and
copies the ``sample.txt`` file as well, replacing the ``${x}`` and ``${y}``
variables with the corresponding sample values. Press the
``Create directories`` button to actually create the directories. The project
directory should now look like:

::

    nodeworks_ex6
    ├── run.py
    ├── sample.txt
    ├── sim_000000
    │   ├── run.py
    │   ├── sample_dict.json
    │   └── sample.txt
    ├── sim_000001
    │   ├── run.py
    │   ├── sample_dict.json
    │   └── sample.txt
    etc.
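
The resulting layout can be reproduced by hand with a short script, which may
help when debugging a template. The sketch below is an approximation of what
the node does, not its actual code. It works in a temporary directory, and the
sample values and the placeholder ``run.py`` contents are made up.

.. code-block:: python

   import json
   import shutil
   import tempfile
   from pathlib import Path
   from string import Template

   # build a throwaway source directory so the sketch is self-contained
   source = Path(tempfile.mkdtemp())
   (source / "run.py").write_text("# model script placeholder\n")
   (source / "sample.txt").write_text("${x} ${y}\n")

   # hypothetical sample values, one dict per row of the DOE matrix
   samples = [{"x": -0.4622, "y": 0.9629}, {"x": 0.1, "y": -0.7}]

   for i, sample in enumerate(samples):
       run_dir = source / f"sim_{i:06d}"
       run_dir.mkdir()
       # copy files that need no changes
       shutil.copy(source / "run.py", run_dir / "run.py")
       # fill in the placeholders of the template file
       text = Template((source / "sample.txt").read_text()).substitute(sample)
       (run_dir / "sample.txt").write_text(text)
       # record the values that were used
       (run_dir / "sample_dict.json").write_text(json.dumps(sample))

   print(sorted(p.name for p in (source / "sim_000000").iterdir()))
   # ['run.py', 'sample.txt', 'sample_dict.json']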

.. note::

   The ``sample_dict.json`` file contains the variables and values used to
   replace the variable names in files with the selected extensions. It is a
   JSON file that will look something like:

   .. code-block:: json

      {"x": -0.4622, "y": 0.9629}

Now add a :ref:`hpc-queue` node to the sheet and connect the ``directories``
terminal of the :ref:`sma-gmc` node to the ``directories`` terminal of the
:ref:`hpc-queue` node. To auto-populate the fields with Joule-specific
commands, regular expressions, and queue script, open the ``Load template``
drop-down and select the ``Joule (slurm)`` template. Select the ``general``
partition from the ``Queue`` list. The ``Run CMD`` field is where we enter the
actual command to run our model, ``run.py``. For this specific case, we will
use the ``srun`` command and set the working directory so we can run multiple
simulations on the same node. Enter the following in the ``Run CMD`` field:

.. code-block::

   srun --chdir=${cwd} python run.py

Next, change the ``Runs per job`` from the default of ``1`` to ``5`` and select
the ``concurrent`` check box. This will copy the ``Run CMD`` 5 times into the
same queue submission script. The ``concurrent`` check box tells the node to
append an ``&`` to each run command and add a ``wait`` after the run commands,
allowing the simulations to run concurrently on the same node:

.. code-block:: shell

   srun --chdir=sim_000000 python run.py &
   srun --chdir=sim_000001 python run.py &
   srun --chdir=sim_000002 python run.py &
   srun --chdir=sim_000003 python run.py &
   srun --chdir=sim_000004 python run.py &
   wait
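
The expansion described above can be sketched in a few lines of Python. The
function ``build_run_block`` is hypothetical and not part of Nodeworks. It
assumes that ``${cwd}`` is replaced with each run directory, as the expanded
commands suggest.

.. code-block:: python

   def build_run_block(run_cmd, directories, concurrent=True):
       """Expand run_cmd once per directory into a block of script lines."""
       lines = []
       for d in directories:
           line = run_cmd.replace("${cwd}", d)
           if concurrent:
               line += " &"  # send the command to the background
           lines.append(line)
       if concurrent:
           lines.append("wait")  # block until all background commands finish
       return "\n".join(lines)

   dirs = [f"sim_{i:06d}" for i in range(5)]
   print(build_run_block("srun --chdir=${cwd} python run.py", dirs))

Without the trailing ``wait``, the job script would exit as soon as the last
command was sent to the background, which would typically end the job before
the simulations finish.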

.. figure:: ./images/ex6_queue_1.png
   :align: center

To actually submit the job scripts to the queue, go to the ``Jobs`` tab and
press the ``Submit`` button. The ``Jobs`` table will be populated with
information about the job for each individual simulation (run directory) even
though some of the simulations were submitted together in the same queue script.
Toggle the |refresh| button to enable the queue node to check the status of the
jobs.

.. figure:: ./images/ex6_queue_2.png
   :align: center

As jobs are finished, they will be added to the ``finished directories``
terminal of the :ref:`hpc-queue` node. To read the resulting ``response.txt``
written by the simulation, add a ``Code`` node. Enter ``dirs`` in the
``arguments`` field and hit ``Enter`` on the keyboard to generate a new
terminal. Connect the ``finished directories`` terminal on the :ref:`hpc-queue`
node to the ``dirs`` terminal on the ``Code`` node. Finally, copy and paste the
following Python code into the ``Code`` node:

.. code-block:: python

   import numpy as np
   import os

   rsp = []
   for d in dirs:
       rsp_f = os.path.join(d, 'response.txt')
       rsp.append(np.loadtxt(rsp_f))

   returnOut = rsp

.. figure:: ./images/ex6_code.png
   :align: center

We now have everything in place to connect the samples generated by the
:ref:`sma-doe` node and the response from the ``Code`` node into the
:ref:`sma-rsm` node. However, since some of the simulations may not have
finished yet, we need to filter out the samples that do not have a response.
The :ref:`hpc-queue` node has a ``finished mask`` terminal that provides a list
of booleans that can be used by the ``Sample Filter`` node.

Add a ``Sample Filter`` node to the sheet and connect the :ref:`sma-doe` node's
``DOE Matrix`` terminal to the ``Sample Filter`` node's ``samples`` terminal.
Next, connect the :ref:`hpc-queue` node's ``finished mask`` terminal to the
``Sample Filter`` node's ``mask`` terminal.
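
Conceptually, the filtering amounts to boolean-mask indexing. The sketch below
shows the idea with NumPy. The arrays are made-up stand-ins for the DOE matrix
and the ``finished mask``, and the ``Sample Filter`` node's internals may
differ.

.. code-block:: python

   import numpy as np

   # hypothetical DOE matrix with four samples of (x, y)
   samples = np.array([[-0.5, 0.2], [0.1, 0.9], [0.7, -0.3], [-0.8, -0.6]])

   # hypothetical mask: True where the corresponding run has finished
   finished_mask = np.array([True, False, True, False])

   # keep only the rows whose run has produced a response
   filtered = samples[finished_mask]

   print(filtered)

The filtered samples then have the same number of rows as the list of responses
read from the finished directories.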

Finally, add a :ref:`sma-rsm` node to the sheet. Connect both the ``returnOut``
terminal from the ``Code`` node and the ``filtered samples`` terminal of the
``Sample Filter`` node to the :ref:`sma-rsm` node's ``matrix/response``
terminal. Run the sheet to populate the :ref:`sma-rsm` node with the samples
and response.

.. figure:: ./images/ex6_complete.png
   :align: center
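
To give a sense of what a response surface fitted to this data looks like, the
sketch below fits a full quadratic polynomial by ordinary least squares. It is
a generic stand-in, since the fitting methods offered by the :ref:`sma-rsm`
node are not covered in this example. The samples are randomly generated rather
than taken from the sheet.

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(0)

   # hypothetical filtered samples and the matching bowl-function responses
   samples = rng.uniform(-1, 1, size=(20, 2))
   response = np.sum(samples**2, axis=1)

   # design matrix for a full quadratic in x and y
   x, y = samples[:, 0], samples[:, 1]
   A = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

   # ordinary least-squares fit of the coefficients
   coef, *_ = np.linalg.lstsq(A, response, rcond=None)

   # expect roughly [0, 0, 0, 0, 1, 1] since z = x^2 + y^2
   print(np.round(coef, 6))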

.. _Slurm: https://slurm.schedmd.com/overview.html
.. _Joule: https://hpc.netl.doe.gov/
.. |add| image:: ../../../nodeworks/images/add.svg
.. |open| image:: ../../../nodeworks/images/open.svg
.. |refresh| image:: ../../../nodeworks/images/refresh.svg