Particle generation in a small box

DrMin · November 2, 2023, 12:11pm

Hi, I want to generate randomly located particles with a volume fraction of 30% in a square box, with no overlap between particles. Do you have a good method or some literature?

cgw · November 8, 2023, 2:21pm

Here’s the most simple-minded way to do it, this algorithm just tries adding particles randomly and checks for overlaps. This is quadratic (or worse) in the number of particles and will get slow for large number of particles. It might be sufficient for you. More refined approaches involve spatially-organized data structures, where a coarse grid or a 3-d tree is used to simplify the overlap checks.

#!/usr/bin/env python

import numpy as np

box_size = 10
radius = 0.5
volume_fraction = 0.3

n_particles = int( (volume_fraction * box_size**3) /
                   (4/3 * np.pi * radius**3))
print(n_particles)

coords = np.zeros((n_particles,3), float)
d2 = 4 * radius**2 # square of diameter.  avoid square roots
n = 0
while n < n_particles:
    while True:
        xyz = radius + (box_size - 2*radius) * np.random.random(3)
        if n==0 or min(np.sum((coords[:n]-xyz)**2, axis=1)) > d2:
            coords[n] = xyz
            n += 1
            break

print(coords)

cgw · November 8, 2023, 6:40pm

Here’s a much better version, illustrating the concept of a grid structure.
np.empty((n_cells,n_cells,n_cells), object)
creates a 3-dimensional array, each element of which is a Python object (initially set to None). We use these to hold lists of the points in each cell.

The cell size is exactly the diameter of the particles. A particle in a given cell can only overlap with another particle if the second particle is within a 27-cell (3x3x3) region surrounding the given cell. So instead of comparing each candidate particle to ALL the particles we’ve placed before, we only have to check the ones in these adjacent cells. We also bail out of the check as soon as we find the first overlap, rather than computing all the distances and then using min.

This causes the algorithm runtime to be roughly linear in the number of particles placed. Some additional speed could be gained, perhaps by using Cython. But this is pretty good for Python.

#!/usr/bin/env python

import numpy as np
import sys

box_size = int(sys.argv[1])
radius = float(sys.argv[2])
volume_fraction = float(sys.argv[3])

diameter = 2*radius
d2 = 4*radius**2

n_particles = int( (volume_fraction * box_size**3) / (np.pi * 4 * radius**3 / 3))
print(n_particles)

n_cells = int(0.5 + box_size/diameter)

grid = np.empty((n_cells,n_cells,n_cells), object) #Create object array

def check_neighboring_cells(xyz):
    i,j,k = map(int, xyz/diameter)
    for ii in range(max(i-1,0), min(i+2,n_cells)):
        for jj in range(max(j-1,0), min(j+2,n_cells)):
            for kk in range(max(k-1,0), min(k+2,n_cells)):
                neighbors = grid[ii,jj,kk]
                if neighbors is not None:
                    for neighbor in neighbors:
                        if np.sum((neighbor-xyz)**2) < d2: # Too close
                            return False
    return True

n = 0
while n < n_particles:
    xyz = radius + (box_size - 2*radius) * np.random.random(3)
    if check_neighboring_cells(xyz):
        i,j,k = map(int, xyz/diameter)
        if grid[i,j,k] is None:
            grid[i,j,k] = []
        grid[i,j,k].append(xyz)
        n += 1

#print(grid)

cgw · November 8, 2023, 6:47pm

Here’s a plot of the runtime for the improved algorithm, using box sizes of 10, 15, 20, 25, 30, 35, and 40.

plot2

Note that when we use number of particles for the x-axis the scaling is almost linear, which is good. The naive algorithm I posted initally is much worse (I’d post a plot to compare, but it’s too slow!)

cgw · November 8, 2023, 8:12pm

For comparison:
plot3

DrMin · November 9, 2023, 1:25am

Thank you for your reply. Using a grid to generate particles is a really good idea. I’m looking into it. I’ve tried a few things.
The diameter of the particles I want to generate is not uniform, so it seems that the grid should be divided by the size of the largest particle diameter, but this is almost 20 times the size of the smallest particle. Maybe I should use more than one grid? I stopped at this step.
In addition, I also tried the non-grid method. My method is to randomly generate all the particles, then find the overlap of any particle with all the other particles, sum these overlaps (the summation weight depends on the particle diameter), and then move the particles according to the summation result. At present, this method can achieve the goal, but the number of cycles is not very stable.
Thanks again.

cgw · November 16, 2023, 7:29pm

I wanted to go a little further with this sample code and see if I could speed up the particle generation. I tried a few things in pure Python such as pre-allocating the grid array to avoid run-time memory allocations for each cell, but this only slowed things down - I suspect because it increased the overall memory footprint and caused more cache misses (CPUs are much faster than RAM these days so time spent waiting for non-cached data is significant).

My test case is a 50x50x50 box, particle radius 0.5, packing density 0.3. The Python code I posted above took 4 minutes and 42 seconds to place 71619 particles. (About 250 particles/sec)

Next I tried cython which is a tool that turns Python into C which you then compile. You can also add type annotations which allow Cython to generate more efficient C code - if a certain variable, say a loop counter, will only ever be an integer, avoiding Python’s dynamic-typing behavior is a big savings.

After doing some iterations of adding type definitions, I got the Cython version to run the same size box in 30 seconds - a speedup of about 10X

I got a little frustrated with Cython after a while (I suspect one could get further speedups) and decided - if I want it to run at the speed of compiled C, I should just write it in C. And this gave me an opportunity to see if I still remembered how to write efficient C :-).

The code attached (compiled with -O3 -march=haswell) took only 0.393 sec to place 71619 particles - a speedup of over 700x compared with the Python version. It is also capable of handling larger domains (or smaller particles) due to more efficient use of RAM. It places 182000 particles / sec which is the best I’ve been able to do so far. Note that none of these test programs output the particle coordinates because that adds significanly to the run time and I am only trying to compare the placement algorithm implementations.

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

inline
float cube(float x) {
    return x*x*x;
}

inline
float square(float x) {
    return x*x;
}

float cvt(char *arg) {
    char *end;
    end = arg;
    float d = strtod(arg, &end);
    if (end==arg) {
        fprintf(stderr, "Error: can't convert %s\n", arg);
        exit(2);
    }
    return d;
}

int main(int argc, char **argv)
{
    float box_size, radius, volume_fraction, diameter, d2, x, y, z;
    int n, n_cells, n_particles, i, j, k, ii, jj, kk;
    int imin, imax, jmin, jmax, kmin, kmax, n1, n2;
    int idx0, idx1, idx2, idx3;
    size_t size;

    if (argc != 4) {
        fprintf(stderr,"Usage: %s box_size radius volume_fraction\n", argv[0]);
        exit(1);
    }
    box_size = cvt(argv[1]);
    radius = cvt(argv[2]);
    volume_fraction = cvt(argv[3]);

    diameter = 2*radius;
    d2 = diameter*diameter;
    n_cells = (int)(0.5 + box_size/diameter);

    // Each grid cell contains up to 4 particle centers (x,y,z)
    // We use float to save memory & reduce cache misses
    float *grid = (float*)calloc(size=cube(n_cells) * 4 * 3,
                                 sizeof(float));
    if (!grid) {
        fprintf(stderr, "Cannot allocate %ld bytes\n", size);
        exit(3);
    }

    int *count = (int*)calloc(size=cube(n_cells), sizeof(int));
    if (!count) {
        fprintf(stderr, "Cannot allocate %ld bytes\n", size);
        exit(3);
    }

    n_particles = (volume_fraction * cube(box_size)) / (M_PI * 4 * cube(radius) / 3);
    printf("%d\n", n_particles);

    n1 = n_cells;
    n2 = square(n_cells);
#define to_index(i,j,k) (i*n2 + j*n1 + k)  // index into 3-d array

    for (n=0; n<n_particles;) {
      try_again:
        x = radius + (box_size - 2*radius) * drand48();
        y = radius + (box_size - 2*radius) * drand48();
        z = radius + (box_size - 2*radius) * drand48();
        i = x/diameter;
        j = y/diameter;
        k = z/diameter;
        idx0 = to_index(i,j,k);
        if (count[idx0] == 4) //cell is full
            goto try_again;

        imin = i? i-1: 0;
        imax = i>=n_cells-1?  n_cells-1: i+1;
        jmin = j? j-1: 0;
        jmax = j>=n_cells-1?  n_cells-1: j+1;
        kmin = k? k-1: 0;
        kmax = k>=n_cells-1?  n_cells-1: k+1;

        for (ii=imin; ii<=imax; ++ii)
            for (jj=jmin; jj<=jmax; ++jj)
                for (kk=kmin; kk<=kmax; ++kk)
                    for(idx1=to_index(ii,jj,kk),
                            idx2=idx1*4*3,
                            idx3=idx2 + 3*count[idx1];
                        idx2 < idx3;)
                        if (square(grid[idx2++] - x) +
                            square(grid[idx2++] - y) +
                            square(grid[idx2++] - z) < d2)
                            goto try_again;

        idx1 = idx0*4*3 + 3*count[idx0];
        grid[idx1++] = x;
        grid[idx1++] = y;
        grid[idx1++] = z;
//        printf("%d %f %f %f\n", n, x, y, z);
        count[idx0]++;
        ++n;
    }
}

(file link - remove .txt extension - cannot upload .c file here)
particles6.c.txt (2.9 KB)