I wonder whether common ND filters, placed over a flash head, behave the same as they do over a lens, that is, whether they reduce the flash output by a known log factor. I will check that at higher power levels with my meter.
They do, with one caveat. Wratten No. 96 ND filters are measurably (though not normally significantly) yellow, which can explain the differences you found between different brands or kinds of film: the films may have differing spectral sensitivities. In this application, sensitometry, the yellowness is significant.

Using the flash at a power setting well above your planned final setting, you can find the difference through any mechanical filter you create. Measure through the filter, then remove the filter and measure again. (Label the "filter" with the effective density.) Reduce the flash and measure again, then replace the filter and go to work testing.
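To put a number on that label, here is a rough sketch of the arithmetic in Python. It assumes your flash meter reports f-numbers at a fixed ISO and that both readings are taken at the same flash power and distance. The function names are made up for illustration. Density is the base-10 log of the light reduction, and one stop is a density of about 0.30.

```python
import math

def density_from_f_numbers(f_without, f_with):
    """Effective density of a filter from two flash-meter readings.

    f_without: metered f-number with the filter removed
    f_with:    metered f-number through the filter (same flash power)
    Light reaching the meter scales with the square of the f-number,
    so the density is 2 * log10(f_without / f_with).
    """
    return 2 * math.log10(f_without / f_with)

def density_to_stops(density):
    """Convert an optical density to stops (one stop = log10(2), about 0.301)."""
    return density / math.log10(2)

# Example: meter reads f/16 bare and f/8 through the filter.
d = density_from_f_numbers(16, 8)
print(round(d, 2), "density,", round(density_to_stops(d), 2), "stops")
```

Nominal f-numbers such as 5.6 or 11 are rounded, so expect small errors unless your meter reads in tenths of a stop.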