Actually, I shoot a lot of stereo using a Manfrotto slide bar and a (gasp) digital camera. Good stereo requires that both images are taken aimed straight ahead, with the film plane for both photos parallel to the scene. Your camera, as drawn, looks fine. I would not tilt the lenses, but you can certainly tilt (aim) the camera in any direction or elevation.
The Manfrotto slide bar keeps the camera parallel to the scene while it is moved laterally between the two shots. Typically the shift is the same as the distance between your two eyes; I think that's about 65 mm.
It takes a pretty steady hand to take stereo photos manually (using one camera) without a slide bar. With your camera, it would be a snap (pun intended). You would need to ensure that both imaging units are pointed exactly straight ahead; otherwise there will be a modest change in scene perspective between the two photos.
Small irregularities are compensated for by the eye/brain combo, but only up to a point: your eye/brain combo is definitely NOT used to seeing two images that differ in content. I have a stereo pair where the wind blew (moved) a branch in one photo and not in the other. It literally makes my head hurt to view this pair.
I print 7 x 7 photos and mount them, with the proper spacing, on opposite pages of a scrapbook (or whatever) and view them with a Geoscope. See:
There is a formula for the correct spacing vs. distance to the object being photographed. We humans (unlike owls, etc.) have evolved to perceive stereo at pretty modest distances.
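I did not spell out which formula I mean, so take this as a rough sketch only. One common rule of thumb (often called the "1/30 rule") sets the spacing between the two shots to about 1/30 of the distance to the nearest object in the scene. The function name and the default ratio below are just my assumptions for illustration, not an exact formula:

```python
def stereo_base_mm(nearest_subject_mm, ratio=30.0):
    """Rule-of-thumb spacing between the two shots, in millimetres.

    Assumes the "1/30 rule": spacing = nearest subject distance / 30.
    The ratio is a convention, not a law; adjust it to taste.
    """
    if nearest_subject_mm <= 0 or ratio <= 0:
        raise ValueError("distance and ratio must be positive")
    return nearest_subject_mm / ratio


if __name__ == "__main__":
    # Print the suggested slide-bar shift for a few nearest-subject distances.
    for distance_m in (1.0, 1.95, 3.0, 6.0):
        base = stereo_base_mm(distance_m * 1000.0)
        print(f"nearest subject at {distance_m} m: shift about {base:.0f} mm")
```

By this rule, the 65 mm eye spacing works out to a nearest subject just under 2 m away; closer subjects call for a smaller shift on the slide bar.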
There is an absolute TON of stuff out there on stereo photography. Google stereoscopic.