Small-scale validation tests for MFIX-Exa CFD-DEM

I have created and released a report that collects our previous small-scale validation cases for MFIX-Exa. (Thanks to Kuipers’ group, from which all of the data in this version come.)

In addition to collecting and refreshing the inputs, I built the cases into a regression test harness which produces the PDF (if successful). I intend to run this for every release; this month’s 25.10 is the first version, which I just added to the release notes:

If you have any suggestions for improvement or additional cases that should be considered, please suggest them here. Keep in mind that I would like to keep these tests “small scale” in the sense that they can be run “overnight” and can be post-processed easily for automation.

thanks

Thanks for sharing Will! A couple of questions/suggestions:

  1. What are the criteria to decide if the regression test is successful? Do you compare numerical results between the upcoming release and the previous one, or do you have a global measure of the error against experimental data?
  2. Are all of these tests run in serial on a CPU? If so, would it make sense to compare with parallel runs and GPU runs? It may be good to include the time to solution for each case.
  3. Are you planning to distribute the input files in the tarball? I see a bunch of benchmarks in mfix-exa/benchmarks, but they look different (and there are more than are listed in the README file).
  1. There are no criteria; it’s left to the user to decide if the accuracy is sufficient. Not ideal, but I think it should be viewed in the wider picture of our framework, where we have CI tests that @jmusser (and now @cgw) maintain which ensure consistency down to a given tolerance between serial/MPI/OMP (possibly deprecated)/GPU (at least NVIDIA, possibly AMD?). Those benchmarks have to be updated periodically when changes are large enough to exceed that threshold, and that is expected. This report would (hopefully) show that “roughly” similar results are obtained over a longer transient, possibly even statistically the same results. But as you point out, I would have to save the processed data from one run to the next to make that comparison. That could be done; maybe it should be. This is just version 1. I think we’d have a hard time deciding whether, for instance, a volume fraction profile in the Mueller test that shifted slightly outside the error bars, while still being very far from the data, should signal a “failed” test, or is just a slightly different result still within the realm of CFD-DEM predictive accuracy.
  2. These are all MPI CPU runs. Adding the ability to run on GPU would be simple enough, but I hesitated to do so because our GPU queue is so small. I would like to hit run and come back to results in the morning, not be stuck waiting for one of @onlyjus’s ollamas to end.
  3. No. These are all in my personal repo. I can give you access, but it’s not polished enough for public release and I don’t intend to make it so. Coming up (after a few items) on my TODO list is a set of tutorials of some sort. I was thinking about putting those in a separate repo, actually a separate git instance entirely, on the DOE CODE GitLab, so that I could keep it “sanitized” and hopefully get supervisory approval to make it entirely public, so that the inputs, Ascent YAML files, OpenSCAD files, exported CSG files, etc., would all be there with the text of the tutorials. I’m kind of hemming and hawing about doing that because I would be creating a lot of work for myself making sure everything continues to work for a code that’s constantly evolving. But I realize it should be done.
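The run-to-run comparison mentioned in point 1 could be sketched as follows. This is a minimal, hypothetical example (not part of the actual harness), assuming a processed profile from the previous release has been saved as an array; the function name `profiles_match` and the tolerances are placeholders:

```python
import numpy as np

def profiles_match(previous, current, rtol=0.05, atol=1e-3):
    """Compare a processed profile (e.g. volume fraction vs. height)
    saved from the previous release against the current run.

    Returns True when every point agrees within the given tolerances.
    """
    prev = np.asarray(previous, dtype=float)
    curr = np.asarray(current, dtype=float)
    if prev.shape != curr.shape:
        return False
    return bool(np.allclose(curr, prev, rtol=rtol, atol=atol))

# Example: a profile that drifts by ~1% still passes a 5% tolerance.
baseline = np.array([0.30, 0.42, 0.55, 0.61])
new_run = baseline * 1.01
print(profiles_match(baseline, new_run))  # True
```

The open question from the discussion remains: a pointwise tolerance like this flags small shifts, but it cannot say whether a shift matters relative to the distance from the experimental data.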

Thanks for the replies Will,

  1. I guess I just got tripped up by “it was built into a regression test harness which produces the PDF (if successful)”. The “if successful” must apply to generating the PDF, not to whether the test harness passes. I think manual observation, once per release, is sufficient.
  2. OK, I would be interested to see (at some point) how the GPU run compares with MPI CPU, both in terms of accuracy and speed.
  3. That’s fine, it is not urgent. Thanks for documenting these validation cases!

Correct on 1. The biggest thing I’m missing in this v1 is error checking; there is none. If everything runs correctly, then it should post-process fine and spit out this PDF. If something goes wrong, I have no way of knowing other than checking whether there are any figures in the generated doc. And actually, as I found out this week, I think most if not all cases will generate the figures anyway; they will just be missing simulation data. This happened on the first run-through for the Goldschmidt case, because 2 of the 10 cases failed to run at all due to an MPI failure on Joule. I almost didn’t even notice it, because the data in that figure is a little busy anyway.
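A cheap guard against that failure mode could be a pre-plot check that each case’s expected output files exist and are non-empty. This is only a sketch, not the actual harness; the file names and the function `check_case_outputs` are hypothetical:

```python
import tempfile
from pathlib import Path

def check_case_outputs(case_dir, expected):
    """Return the names of expected post-processing outputs that are
    missing or empty in case_dir, so a silently failed run is flagged
    before a figure with no simulation data gets generated."""
    missing = []
    for name in expected:
        path = Path(case_dir) / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

# Demo with a throwaway directory: one output present, one absent.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "bed_height.csv").write_text("t,h\n0,0.1\n")
    bad = check_case_outputs(d, ["bed_height.csv", "pressure_drop.csv"])
    print(bad)  # ['pressure_drop.csv']
```

Running this per case before building the PDF would have caught the two Goldschmidt runs that never produced data.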