In the finite element programs, each voxel, of type INTEGER*2, must know the positions of its 27 nearest neighbors in a cubic array since that is a mathematical requirement of the calculation. In the serial versions, this is accomplished by using an array, ib, which is dimensioned as ib(ns, 27). This array is of type INTEGER*4 and uses 54 (27 x 2) times the amount of memory as the original dataset. In fact, it serves no purpose as a calculating device, but is only used as a look-up (hash) table since it stores the 1-D positions of the 27 nearest neighbors for a given voxel.
In the parallel program, vox is dimensioned as a rank 3 array, vox(nx, ny, d1 − 1: d2 + 1). With this arrangement, it is trivial to find the indices of 27 nearest neighbors for a given voxel, vox(i, j, k). The three nearest neighbors (including the voxel itself) in the z-direction have indices of (i, j, k − 1), (i, j, k) and (i, j, k + 1). Therefore the set of 27 nearest neighbors for this element is generated by adding ± 1 or 0 to any or all of the indices of the (i, j, k) triplet. The lowest neighbor has indices of (i − 1, j − 1, k − 1) and the highest has (i + 1, j +1, k + 1). These values can be calculated on the fly or generated by using an adequately defined set of triply nested do-loops.
Special allowances have to be made when the current voxel is on the outside edges of the data cube (i.e. i = 1 or nx or j = 1 or ny). At these extremes, the value of i or j is interrogated and the values of i − 1, i + 1 are compared to 0 and nx. For example, if i − i (or j − 1) equals 0, a modification takes place and the (i − 1)th (or (j − 1)th) neighbor is replaced by the voxel with i = nx (or j = ny). A similar modification takes place when the voxel having i = nx (j = ny). In this instance, the voxel with i = 1 (or j = 1) is used. This procedure is justified due to the periodic nature of the data.Therefore, by switching over to a parallel implementation and this new indexing scheme, one has a dramatic improvement in memory savings since the additional storage of particle positions is no longer needed. This memory is now free to be put to better use. Some small arrays (dk, cmod, sigma, etc...) that appear throughout the calculations have dimensions that are determined by the number of phases nphase one has in the original dataset; this number is known a-priori, like nx, ny and nz. Arrays which need this value are defined as allocatable as well. This increases the flexibility of the program and contributes to a saving of memory by implementing dynamic allocation of additional arrays. In the serial version of ELAS3D, dk is dimensioned to dk(100; 8; 3; 8; 3); the first index represents the phase number. Therefore, the new code allocates dk as dk(nphase, 8, 3, 8, 3).