Hi Jingjing:
As an experiment I set up a 100 gigabyte swap file. I did this on my laptop, which has a fast NVMe SSD. I put the file in /home because I had space there; you can adjust the path as needed. These operations must be done as the ‘root’ user.
## Create 100GB file
$ dd if=/dev/zero of=/home/SWAP bs=1G count=100
100+0 records in
100+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 297.116 s, 361 MB/s
## Set permissions
$ chmod 0600 /home/SWAP
## Initialize file for use as swap space
$ mkswap /home/SWAP
Setting up swapspace version 1, size = 100 GiB (107374178304 bytes)
no label, UUID=ae8b662d-ad7e-4bfc-ac87-40427cbb6ce4
## Enable swapping on the new file
$ swapon /home/SWAP
## Show free memory and swap in gigabytes
$ free -g
              total        used        free      shared  buff/cache   available
Mem:             15           4           0           0          10           9
Swap:           100           0         100
After doing this I’m able to start your job. While it’s running I check top:
# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2302 cgw       23   3   58.7g   1.6g  21464 S 102.3 10.1   4:16.92 python
This is very interesting. Normally when a job is using virtual memory (swap space) it runs very slowly, because the disk is pretending to be RAM, but it is much slower (with a modern SSD this is not as bad as it used to be, but still orders of magnitude slower than actual RAM). However this job appears to be running normally, and the swap space is not actually being used. Note the columns VIRT and RES in the top output. VIRT is the virtual size of your process, that is, the total amount of memory the process has allocated, whether or not it is currently backed by physical RAM. RES is the “resident set size”, the amount of code and data that is currently “swapped in”, that is, sitting in actual RAM rather than in the swap file. And that number is fairly low - your process is demanding about 60 GB to start, but only using about 1.6 GB of that. Since this fits comfortably in system RAM, no actual swapping is happening and the job proceeds efficiently.
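To make the VIRT/RES distinction concrete, here is a minimal sketch (Linux-only, and just an illustration, not your mfix job) of how a process can show a huge VIRT while RES stays small: an anonymous mmap only reserves address space, and physical pages are assigned only when they are actually written.

```python
import mmap

GiB = 1024 ** 3

def virt_and_res_mib():
    """Return this process's VmSize (VIRT) and VmRSS (RES) in MiB, from /proc/self/status."""
    vals = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                name, kb = line.split(":")
                vals[name] = int(kb.split()[0]) // 1024   # values are reported in kB
    return vals["VmSize"], vals["VmRSS"]

print("at start:             VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())

# Reserve 8 GiB of anonymous memory.  This allocates address space only,
# so VIRT jumps by ~8 GiB while RES barely moves.
big = mmap.mmap(-1, 8 * GiB)
print("after mmap of 8 GiB:  VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())

# Touch one byte in every page of the first 1 GiB.  Each write faults in a
# real page, so RES grows by about 1 GiB - this is the part that needs RAM.
for offset in range(0, GiB, mmap.PAGESIZE):
    big[offset] = 1
print("after touching 1 GiB: VIRT=%6d MiB  RES=%6d MiB" % virt_and_res_mib())
```

Your job’s 58.7g VIRT corresponds to the first step; only the pages it actually writes ever show up in RES.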
But if this memory in fact is never being accessed, maybe we don’t have to create the huge SWAP file at all.
In Linux there’s an adjustable kernel parameter called overcommit_memory. According to the system documentation (man 5 proc):
This contains the kernel virtual memory accounting mode.
Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
(More at: “How does vm.overcommit_memory work?” on Server Fault)
Setting this to 1 means that all memory allocations will succeed even if there is not enough RAM+swap space to honor the request. This option seems a bit odd, but it exists precisely because many programs allocate a large amount of memory that is never actually used. Of course, as soon as a program tries to access more memory than is actually available, the kernel will kill it with an out-of-memory error.
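As a quick way to see the difference this setting makes, here is another small sketch (the 200 GiB figure is just an arbitrary number larger than your RAM plus swap): with overcommit_memory set to 1 the reservation succeeds, while under the default heuristic mode a request this far beyond RAM+swap is typically refused up front.

```python
import mmap

GiB = 1024 ** 3

try:
    # Ask for far more memory than the machine has; nothing is written to it.
    big = mmap.mmap(-1, 200 * GiB)
    print("reserved 200 GiB of address space (overcommit allowed it)")
except OSError as e:
    # With overcommit_memory=0 (heuristic) or 2 (strict), the kernel refuses
    # an allocation this much larger than available RAM+swap.
    print("allocation refused:", e)
```

The catch, as noted above, is that a program which actually wrote to all of that memory would eventually be killed once real RAM and swap ran out.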
So, as root, I did this:
$ swapoff /home/SWAP
$ rm /home/SWAP
$ echo 1 > /proc/sys/vm/overcommit_memory
and I was also able to run your job, without requiring the giant SWAP file.
The downside of this is that it affects all jobs running on the system, not just mfix, so you might see some unusual behavior. On the other hand, it will probably be just fine.
In conclusion - you are allocating a large array but only using a small fraction of it. You can try to understand why this is happening and fix your code. Or you can use some system-level hacks to make the allocation succeed, either by enabling swap space or memory overcommit.
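For what it’s worth, and without having looked at where your code allocates, one common pattern that produces exactly this symptom is preallocating an array far larger than the run ever fills. Purely as a made-up illustration (assuming numpy; the sizes are invented):

```python
import numpy as np

# Hypothetical example only - not taken from your script.
# Preallocating a billion float64 cells "just in case" adds ~8 GB to VIRT,
# but because the zeroed pages are never touched, RES stays tiny...
buffer = np.zeros(1_000_000_000)

# ...until the run writes into it, and it only ever writes a small slice.
n_used = 50_000_000                      # ~0.4 GB actually touched
buffer[:n_used] = np.random.random(n_used)
```

If something like this is going on, sizing the array to what is actually needed would remove the problem entirely.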
Hope this is helpful,
– Charles