How to manually pick correspondence pixels from ZED cam?

I find the ZED cam depth map / point cloud is very noisy. For a new project I want to point the camera at a group of light bulbs placed in a room and get their XYZ data. I hope that manually picking correspondence pixels in the left and right camera images will give me a more stable and accurate depth measurement for those hand-picked light bulb locations.

I like using the ZED because its cameras are already aligned, so all potentially corresponding pixels already lie on the same horizontal epipolar line.
Also, some of the calibration data is already available, such as the intrinsic and extrinsic parameters.

I managed to connect a Stereolabs ZED cam to my macOS computer with OF 0.11.0, and thanks to the ZED-open-capture API I can use OpenCV to rectify the camera images and get a depth map. But the depth data is noisy, probably because of the disparity matching that is being performed.
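For context, the depth map comes from the usual OpenCV block-matching route; roughly, the pipeline looks like this (a sketch only, with placeholder frames and placeholder calibration values, not my actual numbers):

```python
import cv2
import numpy as np

# Placeholder rectified grayscale frames; in the real app these come from the
# ZED-open-capture stream after rectification.
left_rect = np.zeros((720, 1280), dtype=np.uint8)
right_rect = np.zeros((720, 1280), dtype=np.uint8)

# Semi-global block matching; this step is the likely source of the noise.
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,   # must be a multiple of 16
    blockSize=5,
)

# StereoSGBM returns fixed-point disparities scaled by 16
disparity = matcher.compute(left_rect, right_rect).astype(np.float32) / 16.0

fx = 700.0          # focal length in pixels (placeholder; from the ZED calibration)
baseline_m = 0.12   # ZED baseline is 120 mm

# Depth in metres wherever the disparity is valid
depth = np.where(disparity > 0, fx * baseline_m / disparity, 0.0)
```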

Do I need “disparity matching” as performed by cv::Ptr<cv::StereoSGBM> left_matcher?
Is there a simpler way to calculate the depth of a point seen by the left and right camera?
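For a single hand-picked pair it seems the whole matcher could be skipped: on rectified images both points share a row, the disparity is just the difference of the x coordinates, and depth follows from the focal length and baseline. A minimal sketch (fx and the 120 mm baseline would come from the calibration data; the numbers below are placeholders):

```python
def point_depth(x_left, x_right, fx, baseline_m):
    """Depth of one manually picked correspondence on rectified images.

    Assumes both x values are in their own image's coordinates (not the
    side-by-side frame) and that the pair lies on the same row.
    """
    disparity = x_left - x_right            # in pixels; should be > 0
    if disparity <= 0:
        return None                         # bad pick / point at infinity
    return fx * baseline_m / disparity      # Z in metres

# Example with placeholder values (fx in pixels, ZED baseline = 120 mm):
z = point_depth(x_left=812.0, x_right=760.0, fx=700.0, baseline_m=0.12)
# X and Y then follow from the pinhole model:
#   X = (x_left - cx) * z / fx
#   Y = (y - cy) * z / fy
```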

Here are my tests so far:
https://github.com/stephanschulz/ZED-open-capture-in-OF

I took a detour and am trying it with Python for now.

I am using cv2.triangulatePoints().
Here is a short video from the above Python app, in which I show how I select the pixel pairs with the mouse.
The XYZ text printout shows that z is around -65 most of the time, which must be wrong.
(screen recording: Apr-13-2022 10-51-492)
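Roughly, the call looks like this (a sketch with placeholder pixel values; P1 and P2 are the 3x4 rectified projection matrices from cv2.stereoRectify):

```python
import cv2
import numpy as np

def triangulate_pair(P1, P2, left_px, right_px):
    # cv2.triangulatePoints expects 2xN float arrays of pixel coordinates
    pts_l = np.array([[left_px[0]], [left_px[1]]], dtype=np.float64)
    pts_r = np.array([[right_px[0]], [right_px[1]]], dtype=np.float64)
    point_4d = cv2.triangulatePoints(P1, P2, pts_l, pts_r)   # homogeneous, 4x1
    return (point_4d[:3] / point_4d[3]).ravel()               # divide by w -> XYZ

# xyz = triangulate_pair(P1, P2, left_px=(812.0, 340.0), right_px=(760.0, 340.0))
```

The constant z of around -65 might come from skipping the divide by w, or from the picked coordinates still carrying the side-by-side offset (more on that below).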

My openframeworks.cc app is based on these examples, none of which need the ZED SDK.

Here is a video of my C++ app. In this app the rectified camera images are shown in full, unlike in the Python app, which cuts off the corners.
(screen recording: Apr-13-2022 18-18-572)

In this version of the Python code I got better results.

I made the mistake of thinking that, since the left and right images sit beside each other, separated in x by half the width of the combined video stream, this offset needed to be maintained when providing the left and right pixel pair coordinates.
For example, I thought that if the left point is (left_xValue, 10), the right point would need to be (right_xValue + 1/2 width, 10). But, at least in the Python code, I get more sensible results when I treat the two images as more or less sitting on top of each other, so that their x difference stays relatively small.
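In other words, a pick in the right half of the combined frame first has to have the half-frame offset removed, so that both points are expressed in their own image's coordinates. A small helper sketch (the width here is that of a single rectified image, assumed to be 1280 px):

```python
def split_pick(pick_x, pick_y, single_image_width):
    """Map a mouse pick on the side-by-side ZED frame to per-image coordinates."""
    if pick_x < single_image_width:
        return "left", pick_x, pick_y
    # Right-image pick: remove the half-frame offset so both points share a frame of reference
    return "right", pick_x - single_image_width, pick_y

# e.g. a pick at x=2000 on a 2 x 1280 wide stream becomes x=720 in the right image
side, x, y = split_pick(2000, 10, 1280)
```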

The resulting depth values are still not very accurate. At a 2 m distance the code returns more like 2.6 m, and at 5 m more like 6.5 m. Since both readings are off by roughly the same factor of about 1.3, this could be a scale error (for example in the baseline or focal length values) rather than noise, or the stereo camera may no longer be perfectly aligned and needs to be recalibrated.

The strange thing is that if I do the same offset handling in C++ as in Python, I still get bad results.

Also, remap() in Python produces images that fill the rectangle completely, but in C++ it still has those bowed edges. Why is that?
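One thing worth comparing between the two apps is the alpha value passed to stereoRectify: alpha=0 crops to the valid pixels so the rectified image fills the frame, while alpha=1 keeps every source pixel and leaves the bowed black borders. A Python sketch with placeholder calibration values (the real numbers would come from the ZED calibration file):

```python
import cv2
import numpy as np

image_size = (1280, 720)                       # assumed per-camera resolution

# Placeholder calibration; in practice these come from the ZED calibration file
K1 = K2 = np.array([[700.0, 0.0, 640.0],
                    [0.0, 700.0, 360.0],
                    [0.0, 0.0, 1.0]])
D1 = D2 = np.zeros(5)
R = np.eye(3)
T = np.array([-0.12, 0.0, 0.0])                # 120 mm baseline, in metres

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
    K1, D1, K2, D2, image_size, R, T,
    alpha=0,   # 0 = crop to the valid area (fills the frame), 1 = keep all pixels (bowed edges)
)

map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)

# left_rect  = cv2.remap(left_raw,  map1x, map1y, cv2.INTER_LINEAR)
# right_rect = cv2.remap(right_raw, map2x, map2y, cv2.INTER_LINEAR)
```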