Available DOF for a given perspective depends on the physical aperture diameter; e.g. a 50mm f/2 lens has a 25mm aperture (I'm assuming a simple lens here with no pupil magnification). This diameter defines the cone of light coming from each point in the plane of focus, and therefore how blurred things away from the plane of focus appear.

You can take the same (FOV, DOF, duration) photo on different formats of course, assuming a lens can be opened up or stopped down far enough. For simplicity of this example, consider a square crop from whatever size film you're using; all of the following are equivalent images:
35mm = 24x24mm, 50mm f/2, ISO100, 1/1000s
6x6 = 55x55mm, 114.5mm f/4.6, ISO525, 1/1000s
4x5" = 100x100mm, 208mm f/8.3, ISO1736, 1/1000s

Note that they all have the same field of view (37.5 deg diagonal), the same DOF in the final print, and the same exposure time, and therefore the same visible motion. Also note that the ISO in each case is proportional to the film area in use, which is to be expected from conservation of energy (photon count): in each case, you have the same physical aperture collecting the same number of photons from the same field of view. If you make an image k times larger in each dimension, you spread the photons k^2 thinner on the film and require a k^2 higher sensitivity to get the same exposure. That fits with the f-number being k times larger (a k times smaller relative aperture), which is how photographers normally think of exposure.
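The scaling above can be checked with a few lines of Python. This is only a sketch under the same assumptions as the example (simple lens, focus at infinity, square crop); the function names `equivalent_settings` and `diagonal_fov_deg` are mine, made up for illustration.

```python
import math

def equivalent_settings(side_mm, ref_side_mm=24.0, ref_focal_mm=50.0,
                        ref_f_number=2.0, ref_iso=100.0):
    """Scale the 24x24mm reference setup to a square frame of side_mm.

    k is the linear scale factor between formats: focal length and
    f-number scale with k, ISO with k^2, and the physical aperture
    diameter (focal length / f-number) stays constant.
    """
    k = side_mm / ref_side_mm
    focal_mm = ref_focal_mm * k
    f_number = ref_f_number * k
    return {
        "focal_mm": focal_mm,
        "f_number": f_number,
        "iso": ref_iso * k ** 2,
        "aperture_mm": focal_mm / f_number,
    }

def diagonal_fov_deg(side_mm, focal_mm):
    """Diagonal field of view of a square frame, assuming infinity focus."""
    diagonal_mm = side_mm * math.sqrt(2)
    return math.degrees(2 * math.atan(diagonal_mm / (2 * focal_mm)))

for name, side_mm in [("35mm", 24.0), ("6x6", 55.0), ('4x5"', 100.0)]:
    s = equivalent_settings(side_mm)
    fov = diagonal_fov_deg(side_mm, s["focal_mm"])
    print(f"{name}: {s['focal_mm']:.1f}mm f/{s['f_number']:.1f}, "
          f"ISO{s['iso']:.0f}, aperture {s['aperture_mm']:.0f}mm, "
          f"FOV {fov:.1f} deg")
```

The aperture diameter and the field of view come out the same on every row, which is the whole point: only focal length, f-number and ISO change with the format.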

If film technology were such that grain size (area of each granule) was proportional to ISO, and I think that this is very, very approximately true with some significant exceptions, then choice of format would be nearly moot. You could do anything with any format you want, within the limits of lens design... and there's the rub. If someone invented a stupidly-fine-grained (10x finer than Tech Pan!) film of ISO0.25 then we could (according to the above theory of equivalence) shoot with a 5mm f/0.2 lens onto 3mm film. The problem is that we can't really build a lens with a relative aperture bigger than f/0.75, and your final camera won't be that much smaller anyway - the film might be only 3x3mm per frame, but you still need that 25mm hole in the front to collect the light. And then it's a bitch to build an enlarging lens!

The same scaling issues occur when going to larger formats. ISO400 is bearable on 35mm, but there isn't really a viable ISO7000 film available in 4x5", even though it's very easy to build a 200mm f/8 lens. So despite the geometric ability of the formats to be equivalent optically, the technology (lens construction and emulsion too) limits us in each direction. Though you can often pick any two systems and be able to take some equivalent images as per the above definition, the differences in format mean that the two systems have different DOF/speed/resolution envelopes.

If you ignore the time-vs-ISO thing and stick to film, not digital, then the distinction becomes more significant. While the best 35mm lenses are over 100lp/mm, films don't get much better than that and I don't see the emulsion research occurring to improve it. 35mm film isn't really going to get better even if the lenses improve, because the lenses are not the limiting factor.

However, we already have easily-available MF lenses capable of 50-80lp/mm. And even more so for LF: 40-50lp/mm is quite achievable with more than 4x5" coverage; see the GigaPxl project, which is using 9x18" film at about 30lp/mm, a custom lens and scanning to achieve massive resolution. You're not going to see that sort of resolution from 35mm in my lifetime, unless it's by stitching (I've done 500MP stitches from a 12MP DSLR), and stitching is basically just a longhand way of using the same sensor area repeatedly to effectively form a much larger sensor. A stitch is merely a larger-format photograph, taken piecemeal.
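To put rough numbers on that, here's a small sketch that multiplies the lp/mm figures quoted above by the frame size to get line pairs across the frame. It's deliberately crude: it treats the quoted lp/mm as the resolution of the whole system and ignores how lens and film losses combine. The frame sides are the square crops from my earlier example plus the 9" side of the 9x18" film; `line_pairs_across` is just a name I picked.

```python
def line_pairs_across(side_mm, lp_per_mm):
    """Total line pairs resolved across one side of the frame."""
    return side_mm * lp_per_mm

INCH_MM = 25.4

# (label, frame side in mm, low lp/mm, high lp/mm) using figures from the text
cases = [
    ("35mm, 24mm side", 24.0, 100.0, 100.0),
    ("6x6, 55mm side", 55.0, 50.0, 80.0),
    ('4x5", 100mm side', 100.0, 40.0, 50.0),
    ('9x18", 9in side', 9 * INCH_MM, 30.0, 30.0),
]

for label, side_mm, low, high in cases:
    lo_lp = line_pairs_across(side_mm, low)
    hi_lp = line_pairs_across(side_mm, high)
    if low == high:
        print(f"{label}: about {lo_lp:.0f} line pairs")
    else:
        print(f"{label}: about {lo_lp:.0f} to {hi_lp:.0f} line pairs")
```

Even though the lp/mm figure drops as the format grows, the frame grows faster, so the total detail across the picture still goes up.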

So if you're using film, your choice of format does, effectively, constrain you because of the limits on emulsion and lens construction technologies. You can use smaller formats, get fast shutter speeds, more DOF and less resolution, or you can use larger formats, get longer shutter speed, less DOF and more resolution.

If you go digital, the constraints do go away a bit. Because of the way electronic sensors work (CCD and CMOS), they have a specific sensitivity so using them at a higher EI just reduces your signal to noise ratio, which means your noise amplitude in the print increases. It turns out that the noise magnitude in a final print does follow the EI in the same manner as defined above for the optical equivalence of different format sizes. This means that the equivalence rule between format sizes does actually hold for digital if you use the same sensor technology in all sensor sizes.
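A toy model shows why the noise follows the equivalence rule. I'm assuming here that photon shot noise dominates (Poisson statistics, so SNR is the square root of the photon count) and ignoring read noise and other sensor-specific effects; the total photon count is an arbitrary made-up number, and the function names are mine.

```python
import math

def photons_per_mm2(total_photons, side_mm):
    """Photon density on a square sensor of the given side."""
    return total_photons / side_mm ** 2

def shot_noise_snr(photon_count):
    """SNR for pure shot noise: signal N over noise sqrt(N)."""
    return math.sqrt(photon_count)

# Same aperture diameter, field of view and exposure time on every format,
# so the same total number of photons reaches the sensor (hypothetical value).
TOTAL_PHOTONS = 1e9

# A patch covering a fixed fraction of the frame is the same size in the print
# regardless of format, so it receives the same photons on every format.
PATCH_FRACTION = 1e-4

for name, side_mm in [("24mm", 24.0), ("55mm", 55.0), ("100mm", 100.0)]:
    density = photons_per_mm2(TOTAL_PHOTONS, side_mm)
    patch_photons = TOTAL_PHOTONS * PATCH_FRACTION
    print(f"{name} sensor: {density:.3g} photons/mm^2, "
          f"print-patch SNR {shot_noise_snr(patch_photons):.0f}")
```

The photons per mm^2 fall with the sensor area (hence the higher EI on the bigger sensor), but the photons landing in any given patch of the final print, and so the noise you see there, don't change.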

IMHO, MF digital exists at the moment only to take advantage of existing MF lenses[1]. Because of the economics of digital sensor manufacture (yield reduces dramatically with larger chips), it makes more sense to build smaller chips and higher resolution lenses with less coverage. For this reason and the huge market share held by 24x16mm and 36x24mm digital sensors, I think most of the development will occur in the 35mm space - in other words, I suspect I partially agree with the OP in that there will be better lenses available and you'll be able to use them on film bodies.

However, I don't think films will improve to the point where you can make any use of that additional performance. Edge sharpness wide-open will get better on the latest lenses but even if you're using the finest 35mm films available, you're not ever going to match the resolution you get from MF or LF film or 35mm digital. In other words, 35mm film photography has basically reached its peak of quality - if you want better, you need to change format and/or capture technology.

If MF digital becomes mainstream and some development occurs on lenses that cover the larger MF formats, IMHO that's where the gains for film users will be. We have 6x6 and 6x7 lenses now that can resolve about as much as Pan-F but not necessarily wide-open. That means there are some small improvements to be made to the lenses that will give a quality improvement on existing films, at least with the lenses wide-open. I think the likelihood of a mainstream 6x7 digital sensor is kind of low though, at least for a while, but I can hope I'm wrong!

[1] Ignore the Leica S2. Just because it exists does not mean it's a good place to be in terms of optimal or even sustainable price/performance - that's Leica for you.