Hi Jingjing:
As an experiment I set up a 100 gigabyte swap file. I did this on my laptop, which has a fast NVMe SSD. I put the file in /home because I had space there; you can adjust the path as needed. These operations must be done as the ‘root’ user.
## Create 100GB file
$ dd if=/dev/zero of=/home/SWAP bs=1G count=100
100+0 records in
100+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 297.116 s, 361 MB/s
## Set permissions
$ chmod 0600 /home/SWAP
## Initialize file for use as swap space
$ mkswap /home/SWAP
Setting up swapspace version 1, size = 100 GiB (107374178304 bytes)
no label, UUID=ae8b662d-ad7e-4bfc-ac87-40427cbb6ce4
## Enable swapping on the new file
$ swapon /home/SWAP
## Show free memory and swap in gigabytes
$ free -g
              total        used        free      shared  buff/cache   available
Mem:             15           4           0           0          10           9
Swap:           100           0         100
After doing this I’m able to start your job. While it’s running I check top:
# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2302 cgw       23   3   58.7g   1.6g  21464 S 102.3 10.1   4:16.92 python
This is very interesting. Normally when a job is using virtual memory (swap space) it runs very slowly, because the disk is pretending to be RAM, but it is much slower (with a modern SSD this is not as bad as it used to be, but still orders of magnitude slower than actual RAM). However this job appears to be running normally, and the swap space is not actually being used. Note the columns VIRT and RES in the top output. VIRT is the virtual size of your process, that is, the total amount of memory the process has allocated, whether or not it is currently backed by physical RAM. RES is the “resident set size”, the amount of code and data that is currently “swapped in”, that is, sitting in actual RAM rather than in the swap file. And that number is fairly low - your process is demanding about 60 GB to start, but only using about 1.6 GB of that. Since this fits comfortably in system RAM, no actual swapping is happening and the job proceeds efficiently.
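To make the VIRT/RES distinction concrete, here is a minimal sketch (Linux-only, and just an illustration, not your mfix job) of how a process can show a huge VIRT while RES stays small: an anonymous mmap only reserves address space, and physical pages are assigned only when they are actually written.

```python
import mmap

GiB = 1024 ** 3

def virt_and_res_mib():
    """Return this process's VmSize (VIRT) and VmRSS (RES) in MiB, from /proc/self/status."""
    vals = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                name, kb = line.split(":")
                vals[name] = int(kb.split()[0]) // 1024   # values are reported in kB
    return vals["VmSize"], vals["VmRSS"]

print("at start:             VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())

# Reserve 8 GiB of anonymous memory.  This allocates address space only,
# so VIRT jumps by ~8 GiB while RES barely moves.
big = mmap.mmap(-1, 8 * GiB)
print("after mmap of 8 GiB:  VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())

# Touch one byte in every page of the first 1 GiB.  Each write faults in a
# real page, so RES grows by about 1 GiB - this is the part that needs RAM.
for offset in range(0, GiB, mmap.PAGESIZE):
    big[offset] = 1
print("after touching 1 GiB: VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())
```

Your job’s 58.7g VIRT corresponds to the first step; only the pages it actually writes ever show up in RES.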
But if this memory in fact is never being accessed, maybe we don’t have to create the huge SWAP file at all.
In Linux there’s an adjustable kernel parameter called overcommit_memory. According to the system documentation (man 5 proc):
This contains the kernel virtual memory accounting mode.
Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
(More at: “How does vm.overcommit_memory work?” on Server Fault)
Setting this to 1 means that all memory allocations will succeed even if there is not enough RAM+swap space to honor the request. This option seems a bit odd, but it exists precisely because many programs allocate a large amount of memory that is never actually used. Of course, as soon as a program tries to access more memory than is actually available, the kernel will kill it with an out-of-memory error.
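As a quick way to see the difference this setting makes, here is another small sketch (the 200 GiB figure is just an arbitrary number larger than your RAM plus swap): with overcommit_memory set to 1 the reservation succeeds, while under the default heuristic mode a request this far beyond RAM+swap is typically refused up front.

```python
import mmap

GiB = 1024 ** 3

try:
    # Ask for far more memory than the machine has; nothing is written to it.
    big = mmap.mmap(-1, 200 * GiB)
    print("reserved 200 GiB of address space (overcommit allowed it)")
except OSError as e:
    # With overcommit_memory=0 (heuristic) or 2 (strict), the kernel refuses
    # an allocation this much larger than available RAM+swap.
    print("allocation refused:", e)
```

The catch, as noted above, is that a program which actually wrote to all of that memory would eventually be killed once real RAM and swap ran out.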
So, as root, I did this:
$ swapoff /home/SWAP
$ rm /home/SWAP
$ echo 1 > /proc/sys/vm/overcommit_memory
and I was also able to run your job, without requiring the giant SWAP file.
The downside of this is that it affects all jobs running on the system, not just mfix, so you might see some unusual behavior. On the other hand, it will probably be just fine.
In conclusion - you are allocating a large array but only using a small fraction of it. You can try to understand why this is happening and fix your code. Or you can use some system-level hacks to make the allocation succeed, either by enabling swap space or memory overcommit.
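For what it’s worth, and without having looked at where your code allocates, one common pattern that produces exactly this symptom is preallocating an array far larger than the run ever fills. Purely as a made-up illustration (assuming numpy; the sizes are invented):

```python
import numpy as np

# Hypothetical example only - not taken from your script.
# Preallocating a billion float64 cells "just in case" adds ~8 GB to VIRT,
# but because the zeroed pages are never touched, RES stays tiny...
buffer = np.zeros(1_000_000_000)

# ...until the run writes into it, and it only ever writes a small slice.
n_used = 50_000_000                      # ~0.4 GB actually touched
buffer[:n_used] = np.random.random(n_used)
```

If something like this is going on, sizing the array to what is actually needed would remove the problem entirely.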
Hope this is helpful,
– Charles