Optimal transverse grid sizes

FFT performs best for grid sizes that factorize into small primes. Any size will work, but the performance may vary dramatically.

The FFTW documentation quotes the optimal sizes for its algorithms as \(2^a 3^b 5^c 7^d 11^e 13^f\), where \(e+f\) is either \(0\) or \(1\), and the other exponents are arbitrary.
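To get a feel for which sizes the formula admits, they can be enumerated directly by multiplying out the allowed factors. This is a minimal sketch; `fftw_friendly_sizes` is a hypothetical helper written for this illustration and is not part of FFTW or LCODE 3D.

```python
def fftw_friendly_sizes(limit):
    """Return all sizes 2^a 3^b 5^c 7^d 11^e 13^f <= limit with e + f <= 1."""
    # Allowed 11/13 part: 1 (e = f = 0), 11 (e = 1) or 13 (f = 1).
    sizes = {1, 11, 13}
    # Multiply in arbitrary powers of 2, 3, 5 and 7, dropping anything above limit.
    for p in (2, 3, 5, 7):
        extended = set()
        for s in sizes:
            while s <= limit:
                extended.add(s)
                s *= p
        sizes = extended
    return sorted(sizes)

print(fftw_friendly_sizes(30))
```

Sizes such as \(121 = 11^2\) or \(143 = 11 \cdot 13\) never appear in the result, because at most one factor of \(11\) or \(13\) is allowed.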

While LCODE 3D does not use FFTW (it uses cuFFT instead, wrapped by CuPy), the formula is still quite a good rule of thumb for choosing performance-friendly config_example.grid_steps values.

The employed FFT sizes for a grid sized \(N\) are \(2N-2\) for both DST (dst2d(), mix2d()) and DCT transforms (dct2d(), mix2d()) when we take padding and perimeter cell stripping into account.

Since \(2N-2 = 2(N-1)\) and a factor of \(2\) is always allowed, it remains to find such \(N\) that \(N-1\) satisfies the small-factor conditions.
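As a quick illustration of the arithmetic, the sketch below prints the FFT size \(2N-2\) and its prime factors for a few grid sizes. The helper names `prime_factors` and `fft_size` are made up for this example and do not exist in LCODE 3D.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repetition."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors

def fft_size(grid_steps):
    """FFT size quoted above for a grid of N = grid_steps points."""
    return 2 * grid_steps - 2

for n in (200, 201, 257):
    print(n, fft_size(n), prime_factors(fft_size(n)))
# 200 398 [2, 199]
# 201 400 [2, 2, 2, 2, 5, 5]
# 257 512 [2, 2, 2, 2, 2, 2, 2, 2, 2]
```

Moving from \(N=200\) to \(N=201\) replaces the large prime factor \(199\) with powers of \(2\) and \(5\), which is why odd sizes of this kind are the ones to look for.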

If you don’t mind arbitrary grid sizes, we suggest using

  1. \(N=2^K + 1\), which always performs the best, or
  2. one of the roundish \(201\), \(301\), \(401\), \(501\), \(601\), \(701\), \(801\), \(901\), \(1001\), \(1101\), \(1201\), \(1301\), \(1401\), \(1501\), \(1601\), \(1801\), \(2001\), \(2101\), \(2201\), \(2401\), \(2501\), \(2601\), \(2701\), \(2801\), \(3001\), \(3201\), \(3301\), \(3501\), \(3601\), \(3901\), \(4001\).

The code to check for the FFTW criteria above and some of the matching numbers are listed below.

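def factorize(n, a=None):
    """Return the prime factors of n in ascending order, with repetition."""
    a = [] if a is None else a
    if n <= 1:
        return a
    # The smallest divisor found is always prime; recurse on the quotient.
    for i in range(2, n + 1):
        if n % i == 0:
            return factorize(n // i, a + [i])

def good_size(n):
    """Check that n is odd and n - 1 meets the FFTW small-factor criteria."""
    factors = factorize(n - 1)
    return (all(f in (2, 3, 5, 7, 11, 13) for f in factors)
            and factors.count(11) + factors.count(13) < 2
            and n % 2 == 1)

print(', '.join(str(a) for a in range(20, 4100) if good_size(a)))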

\(21\), \(23\), \(25\), \(27\), \(29\), \(31\), \(33\), \(37\), \(41\), \(43\), \(45\), \(49\), \(51\), \(53\), \(55\), \(57\), \(61\), \(65\), \(67\), \(71\), \(73\), \(79\), \(81\), \(85\), \(89\), \(91\), \(97\), \(99\), \(101\), \(105\), \(109\), \(111\), \(113\), \(121\), \(127\), \(129\), \(131\), \(133\), \(141\), \(145\), \(151\), \(155\), \(157\), \(161\), \(163\), \(169\), \(177\), \(181\), \(183\), \(193\), \(197\), \(199\), \(201\), \(209\), \(211\), \(217\), \(221\), \(225\), \(235\), \(241\), \(251\), \(253\), \(257\), \(261\), \(265\), \(271\), \(281\), \(289\), \(295\), \(301\), \(309\), \(313\), \(321\), \(325\), \(331\), \(337\), \(351\), \(353\), \(361\), \(365\), \(379\), \(385\), \(391\), \(393\), \(397\), \(401\), \(417\), \(421\), \(433\), \(441\), \(449\), \(451\), \(463\), \(469\), \(481\), \(487\), \(491\), \(501\), \(505\), \(513\), \(521\), \(529\), \(541\), \(547\), \(551\), \(561\), \(577\), \(589\), \(595\), \(601\), \(617\), \(625\), \(631\), \(641\), \(649\), \(651\), \(661\), \(673\), \(687\), \(701\), \(703\), \(705\), \(721\), \(729\), \(751\), \(757\), \(769\), \(771\), \(781\), \(785\), \(793\), \(801\), \(811\), \(833\), \(841\), \(865\), \(881\), \(883\), \(897\), \(901\), \(911\), \(925\), \(937\), \(961\), \(973\), \(981\), \(991\), \(1001\), \(1009\), \(1025\), \(1041\), \(1051\), \(1057\), \(1079\), \(1081\), \(1093\), \(1101\), \(1121\), \(1135\), \(1153\), \(1171\), \(1177\), \(1189\), \(1201\), \(1233\), \(1249\), \(1251\), \(1261\), \(1275\), \(1281\), \(1297\), \(1301\), \(1321\), \(1345\), \(1351\), \(1373\), \(1387\), \(1401\), \(1405\), \(1409\), \(1441\), \(1457\), \(1459\), \(1471\), \(1501\), \(1513\), \(1537\), \(1541\), \(1561\), \(1569\), \(1585\), \(1601\), \(1621\), \(1639\), \(1651\), \(1665\), \(1681\), \(1729\), \(1751\), \(1761\), \(1765\), \(1783\), \(1793\), \(1801\), \(1821\), \(1849\), \(1873\), \(1891\), \(1921\), \(1945\), \(1951\), \(1961\), \(1981\), \(2001\), \(2017\), \(2049\), \(2059\), 
\(2081\), \(2101\), \(2107\), \(2113\), \(2157\), \(2161\), \(2185\), \(2201\), \(2241\), \(2251\), \(2269\), \(2305\), \(2311\), \(2341\), \(2353\), \(2377\), \(2401\), \(2431\), \(2451\), \(2465\), \(2497\), \(2501\), \(2521\), \(2549\), \(2561\), \(2593\), \(2601\), \(2641\), \(2647\), \(2689\), \(2701\), \(2731\), \(2745\), \(2751\), \(2773\), \(2801\), \(2809\), \(2817\), \(2881\), \(2913\), \(2917\), \(2941\), \(2971\), \(3001\), \(3025\), \(3073\), \(3081\), \(3121\), \(3137\), \(3151\), \(3169\), \(3201\), \(3235\), \(3241\), \(3251\), \(3277\), \(3301\), \(3329\), \(3361\), \(3403\), \(3431\), \(3457\), \(3501\), \(3511\), \(3521\), \(3529\), \(3565\), \(3585\), \(3601\), \(3641\), \(3697\), \(3745\), \(3751\), \(3781\), \(3823\), \(3841\), \(3851\), \(3889\), \(3901\), \(3921\), \(3961\), \(4001\), \(4033\), \(4051\), \(4097\)
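If you already have a target resolution in mind, one option is to round it up to the nearest size from the list above. The following is a minimal sketch of that idea; `is_good_size` and `next_good_size` are hypothetical helpers, not LCODE 3D functions, and they apply the same criteria as the check above (odd \(N\), with \(N-1\) built from small factors).

```python
def is_good_size(n):
    """True if n is odd and n - 1 = 2^a 3^b 5^c 7^d 11^e 13^f with e + f <= 1."""
    if n < 3 or n % 2 == 0:
        return False
    m = n - 1
    # Strip all factors of 2, 3, 5 and 7.
    for p in (2, 3, 5, 7):
        while m % p == 0:
            m //= p
    # What remains must be 1, or a single factor of 11 or 13.
    return m in (1, 11, 13)

def next_good_size(n):
    """Smallest performance-friendly grid size that is >= n."""
    while not is_good_size(n):
        n += 1
    return n

print(next_good_size(200))   # 201
print(next_good_size(1700))  # 1729
```

Rounding up rather than down keeps the resolution at least as fine as requested, at the cost of a somewhat larger grid.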