After various preliminaries, the program reads information about the simulation space and then calls DOMAIN to figure out the spatial (domain) decomposition. To determine which processors control adjacent domains we identify each processor uniquely by considering each processor in the network as a cell in a link-cell structure. We then use the link-cell algorithm to determine the addresses of a processor's neighbors. Particles are then mapped onto processors on the basis of their x, y, and z coordinates. For 3D, we denote the number of processors allocated in the X, Y, and Z dimensions by Px, Py, and Pz, respectively, so
![]()
For a particle at position ri = (xi, yi, zi) in a simulation box with sides of length Lx, Ly, and Lz, with 0 ≤ xi < Lx, 0 ≤ yi < Lx, 0 ≤ zi < Lz the processor coordinates are given by

where INT is the function returning the integer part of the argument in brackets. The mapping from processor coordinates (Ix(i), yi, Iz(i)) to processor index is given by
![]()
Coordinates of the center of each processor's simulation box can be calculated from

Particles are allocated to processors on the basis of Ii at the start (subroutines INITPR (for particles) and INITSNEW (for ellipsoids)) and whenever particles are moved. While figuring out the domain decomposition, a processor's north (+y direction), south (−y direction), east (+x direction), west (−x direction), up (+z direction), and down (−z direction) neighboring processors are tabulated.
The simulation box size for each processor is given by

In the domain decomposition molecular dynamics cycle (subroutine CYCLE), we now have, on each processor,
Compute the new particle positions from
![]()
Compute the midpoint velocity (velocity at the midpoint of the time step) from
![]()
Calculate r3s to make sure particles remain in the QDPD box.
Move particles (MOVPAR) to their new home processor based on r3s.
Construct an extended volume consisting of owned cells plus ghost cells (EXTVOL) based on r3s.
EXTVOL calls a subroutine (LEBC) to apply Lees-Edwards shear boundary conditions.
Construct the link-cell list (LINKS) based on r3 coordinates.
Calculate new forces (FORCES), including a call to THIRDLAW, which transfers pair forces back to their home processor and adds them to forces there.
Compute new velocities from the new forces
![]()
The way the Newton's third law forces are handled in spatial (domain) decomposition is the following. A table is kept of edge particles that are sent in all directions. Then after forces are calculated, THIRDLAW loops over just the "other" particles looking for force contributions that have to be sent back to the processor that "owns" the particle and added to the forces there. THIRDLAW then communicates these Newton's third law force additions back to the "home" processors of the "other" particles and adds them to the forces there.
Some of these steps require additional explanation. In MOVPAR r3i coordinates are transformed, by subtracting the coordinates of the center of each processor's simulation box, so that they are in the range
![]()
where rprosl(k) is the size of that processor's domain in the k direction. Particles which don't meet this criterion have moved out of the processor and are sent to their new home processor. A subtle point is that this is relatively slow motion so we know that the move is to the nearest neighbor in the k dimension, the one in the negative or positive direction, depending on whether r3(k, i) − rorigin(k) < −rprosl(k) or r3(k, i) − rorigin(k) ≥ rprosl(k). This is true for k = 2 or 3, but because of the shear boundary condition at the topmost layer, particles may have moved more than one processor away in X in a single time step. We handle this by finding the maximum number of swaps in X on each processor, then do a global MAX of the values of each processor to determine how many swaps to do.
Next comes the formation of extended volumes using "ghost" cells in EXTVOL. To accommodate the "ghost" cells, the number of cells in each direction is increased by 2. So, for example, for a division of the central processor into 100 cells as in Fig. 5, the X and Y cell dimensions are 10. Mx and My, the cell X and Y cell dimensions for this processor are 12 (10 + 2) to accommodate the left and right ghost cells. We use the following to define a cell index for particle i (ICELLi)
![]()
where Ix(i), Iy(i), and Iz(i) are now given by

Sx, Sy, and Sz are scale factors whose purpose is to transform coordinates so that a processor's "own" particles in a domain will have values in the range

In Fig. 6 we show the central processor from Fig. 5 again, with its "own" and "ghost" cells renumbered according to the above. Using these scale factors, it is straightforward to identify which particles need to be passed in all 4 (or 6) directions. For example, particles whose Ix(i) value is 2 are left edge particles and need to be passed to the processor to the left; particles whose Ix(i) value is 11 (Mx − 1) are right edge particles and need to be passed to the processor on the right. It is important subsequent swaps, so for example particles whose Iy(i) value is 2 are passed down, and that includes particles in the "ghost" cells with Iy(i) = 1 and 12, and particles whose Iy(i) value is 1 are passed up. This is the way the particles in corner cells are made available to adjacent processors. As processor 4 communicates information about particles in its edge cells with Ix = 11 to processor 5, processor 5 in turn communicates information about particles in its left edge cells to processor 4, which become the right edge ghost cells on processor 4. So after swapping with processors to its left, right, north, and south, the complete "extended volume" exists on processor 4, and this can be followed by the link-cell list construction (Ix = {1,12}, Iy = {1,12}) and computation of forces (for particles owned by this processor, which are those in cells with Ix = {2,11}, Iy = {2,11}).

Fig. 6. Enlargement of the central region of Figure 5. Link-cell periodic boundaries become processor domain boundaries. Dashed lines correspond to ghost cells.
Now consider Fig. 5 again, and imagine calculating the forces using a single processor and the link-cell algorithm, and subdividing the simulation box into 30 cells in X and Y. The force calculation on particles in cells with Ix, Iy = {11,20} in Fig. 5 would be calculated exactly the same way as the particles owned by processor 4 in Fig. 6, for which Ix, Iy = {2,11}. This is the essence of the parallel link-cell method.
Similar conditions apply for the other processors, except for processors containing cells on the edge of the simulation box, such as processor 8 in Fig. 7. Cells interior to the processor, for which Ix, Iy are {2,10} are just like the cells on processor 4. At issue are the cells for which Ix = 11 and those for which Iy = 11, i.e, edge cells on the processor which are also edge cells for the whole simulation box (Fig. 5). But the right edge ghost cells (Iy = 12) for processor 8 are Ix = 1 cells for processor 6 and would be sent to processor 8 during the swap between these two processors (8 is the processor to the west (−x direction) of 6 and 6 is the processor to the east (+y direction) of 8). Similarly, processors 2 and 8 pair up to create the Iy = 12 ghost cells on 8. The net result of this is that the force calculation on particles in the domain of processor 8 will be calculated exactly the same way as the force on particles in the cells with Ix = {21,30}, Iy = {21,30} in a sequential simulation of the whole box with 30 cells in X and Y. Similar conditions pertain to other processors containing cells on the edge of the simulation box.

Fig. 7. A 2-D example of the parallel link-cell algorithm showing a processor containing cells on the edge of the simulation box.

Fig. 8. A nine processor 2-D domain decomposition and neighboring layers resulting from application of an applied strain consistent with the Lees-Edwards boundary condition.

Fig. 9. A more detailed nine processor 2-D domain decomposition including shear.
One point that was skipped in the above discussion is the treatment of the shear boundary conditions. In Fig. 8 we show the Fig. 5 simulation box again, and three boxes above the simulation box, moving to the right, as well as three boxes below the simulation box, moving to the left. 0′, 1′, and 2′ are images of 0, 1, and 2 which have moved to the right because of the shear. 6′, 7′, and 8′ have moved left. In Fig. 9 we redraw Fig. 8, showing the sheared upper boundary and the extended volume we have to build prior to computing forces. Cells that must be considered for edge cells (2,31) and (31,31) are shown with arrows. Note that because of Newton's third law, the extended volume we need includes left, right, and up layers, but not down (Iy = 1). Also care must be taken to include the shear shown in the figure. Subroutine EXTVOL handles this by forming the Y "ghost" layer before X (for 3D, the order is Z, Y, X). The Iy = 32 layer is formed by processors 0, 1, and 2 sending their Iy = 2 cells to processors 6, 7, and 8 respectively, and adding the simulator box distance in Y. In addition, movement to the right coming from the shear is computed from

where rmax(1) is the simulation box dimension in X and

Now subroutine LEBC is called to relocate particle properties to the processor that needs the information. This is done using the same technique as in MOVPAR, but care must be taken to keep track of the relocations so they can be reversed in the THIRDLAW transfer of forces back to their home processor. With these maneuvers, the Lees-edwards boundary condition is accomplished in our parallel program. Basically the program implements particles leaving at the bottom of the simulation box and entering at the top "ghost" layer (the mirror image) but with its X coordinate shifted to account for the strain.