Align multiple kinect point clouds with ArUco markers


I’m working on an OF project with the ofxAzureKinect and ofxAruco addons.
As I’m not very advanced in computer vision and math, I’m stuck on one point and need some help.

My goal is to align the point clouds provided by multiple Kinects so I can cover a wider area without losing any information while someone is walking through the space.

I’m currently struggling to align my point clouds, and it seems that my calculations are obviously wrong. It would be nice to hear from someone who has already experimented with this.

Here’s the result, saved in a JSON file (I’m currently saving 4 ofVec4f):


Thanks in advance!

So what I’m doing, in order, is:

  • Get the calibration of the kinects
    void Kinect::initializeCameraCalibration()
	{
		// Store the calibration by value (get_calibration() returns a temporary).
		m_calibration = get_k4a_device().get_calibration(m_settings.depthMode, m_settings.colorResolution);
		auto* intrinsics = &m_calibration.color_camera_calibration.intrinsics.parameters;

		// Build the camera matrix from the intrinsic parameters.
		// The data is float, so the cv::Mat type must be CV_32F, and the Mat
		// must own its data (clone) since the vector is local.
		std::vector<float> cameraVector = {
			intrinsics->param.fx, 0.f, intrinsics->param.cx,
			0.f, intrinsics->param.fy, intrinsics->param.cy,
			0.f, 0.f, 1.f
		};
		m_cameraMat_color = cv::Mat(3, 3, CV_32F, cameraVector.data()).clone();

		// Build the distortion coefficient matrix from the intrinsic parameters
		// (8 coefficients: k1 k2 p1 p2 k3 k4 k5 k6, OpenCV's rational model).
		std::vector<float> distCoeffsVector = {
			intrinsics->param.k1, intrinsics->param.k2, intrinsics->param.p1,
			intrinsics->param.p2, intrinsics->param.k3, intrinsics->param.k4,
			intrinsics->param.k5, intrinsics->param.k6
		};
		m_distCoeffs_color = cv::Mat(8, 1, CV_32F, distCoeffsVector.data()).clone();
	}
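For intuition, the camera matrix built above encodes the pinhole model: fx/fy scale by depth and cx/cy shift to the principal point. A minimal standalone sketch (the function name and the numeric values below are just for illustration, not part of the project):

```cpp
#include <array>

// Project a 3D point in camera space to pixel coordinates using the
// pinhole model encoded by the 3x3 camera matrix:
//   | fx  0  cx |
//   |  0 fy  cy |
//   |  0  0   1 |
std::array<double, 2> projectPoint(double fx, double fy, double cx, double cy,
                                   double x, double y, double z)
{
    // u = fx * x / z + cx,  v = fy * y / z + cy
    return { fx * x / z + cx, fy * y / z + cy };
}
```

A point on the optical axis (x = y = 0) lands exactly on the principal point (cx, cy), which is a quick sanity check for the matrix layout.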
  • Capture images from a video feed
	void Kinect::processCapture()
	{
		const ofPixels pixels = this->colorPix;
		const cv::Mat image = ofxCv::toCv(pixels);
		// Store a deep copy for calibration later (toCv() only wraps the pixel buffer).
		m_calibrationImages.push_back(image.clone());
	}
  • Detect ArUco markers in these images
	Eigen::Affine3d Kinect::GetDetectedMarkersTransform()
	{
		// Setup aruco (marker detection). Note: ofxAruco must be configured
		// with the camera calibration computed earlier before detecting.
		ofxAruco aruco;

		Eigen::Affine3d frameMarker = Eigen::Affine3d::Identity();
		int count = 0;
		// Process each image.
		for (auto& image : m_calibrationImages)
		{
			// Convert image to grayscale.
			cv::Mat gray;
			cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);

			ofxCvGrayscaleImage grayImage;
			grayImage.allocate(gray.cols, gray.rows);
			grayImage.setFromPixels(gray.data, gray.cols, gray.rows);

			// Detect the markers in each image.
			aruco.detectMarkers(grayImage.getPixels());
			auto markers = aruco.getMarkers();

			Console::LogWarning("Processing image %d. %d marker(s) detected.", count, (int)markers.size());
			for (int i = 0; i < (int)markers.size(); i++)
			{
				if (markers[i].isValid())
				{
					// Get the marker pose (OpenGL modelview matrix) from ArUco.
					double mvMat[16];
					markers[i].glGetModelViewMatrix(mvMat);
					ofMatrix4x4 recompMat;
					recompMat.set(mvMat);

					glm::mat4 tempMat(
						recompMat._mat[0].x, recompMat._mat[0].y, recompMat._mat[0].z, recompMat._mat[0].w,
						recompMat._mat[1].x, recompMat._mat[1].y, recompMat._mat[1].z, recompMat._mat[1].w,
						recompMat._mat[2].x, recompMat._mat[2].y, recompMat._mat[2].z, recompMat._mat[2].w,
						recompMat._mat[3].x, recompMat._mat[3].y, recompMat._mat[3].z, recompMat._mat[3].w);

					cv::Mat transform = cv::Mat::eye(4, 4, CV_32F);
					memcpy(transform.data, glm::value_ptr(tempMat), 16 * sizeof(float));
					// glm is column-major and cv::Mat row-major, hence the transpose.
					transform = transform.t();

					// Note: this keeps only the pose of the last valid marker.
					cv::cv2eigen(transform, frameMarker.matrix());
				}
			}
			count++;
		}

		return frameMarker;
	}
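The transpose before cv2eigen matters because glm stores matrices column-major while cv::Mat is row-major: the same 16 floats mean different matrices depending on the convention. The index swap can be sketched with plain arrays (helper name hypothetical):

```cpp
#include <array>

// Transpose a 4x4 matrix stored as a flat 16-element array.
// Moving a matrix between a row-major store (cv::Mat) and a column-major
// store (glm/OpenGL) amounts to swapping element (r, c) with (c, r).
std::array<float, 16> transpose4x4(const std::array<float, 16>& m)
{
    std::array<float, 16> t{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            t[c * 4 + r] = m[r * 4 + c];
    return t;
}
```

Applying it twice returns the original layout, which is an easy way to verify a round-trip between the two libraries.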
  • Calibrate / align the point clouds.
void KinectUI::AlignPointClouds(std::vector<Kinect*> _devices, std::vector<Eigen::Affine3d> _transforms)
{
	// Get the first kinect as reference.
	const Eigen::Affine3d mainTransform = _transforms[0];

	for (int i = 0; i < (int)_devices.size(); i++)
	{
		// The reference kinect keeps the identity transform.
		if (i == 0)
			continue;

		// Get the marker pose seen by this kinect (sub_T_marker).
		const Eigen::Affine3d currTransform = _transforms[i];

		// Get the transform of the sub kinect in the master coordinate
		// system: master_T_sub = master_T_marker * (sub_T_marker)^-1.
		const Eigen::Affine3d sub_in_master = mainTransform * currTransform.inverse();

		// Copy the relative transform into a glm::mat4. Eigen and glm are
		// both column-major by default, so no transpose is needed, but the
		// double data must be cast to float first.
		const Eigen::Matrix4f transform_matrix = sub_in_master.matrix().cast<float>();
		memcpy(glm::value_ptr(_devices[i]->m_relativeTransform), transform_matrix.data(), 16 * sizeof(float));
	}
}
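The relative-transform math can be checked in isolation. Assuming each pose is the marker pose in that camera's frame (camera_T_marker), points move from a sub kinect into the master frame via master_T_marker * (sub_T_marker)^-1. A standalone sketch with plain row-major 4x4 arrays (all names hypothetical, not Eigen's API):

```cpp
#include <array>

using Mat4 = std::array<double, 16>; // row-major 4x4

// Multiply two row-major 4x4 matrices: out = a * b.
Mat4 mul(const Mat4& a, const Mat4& b)
{
    Mat4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            for (int k = 0; k < 4; ++k)
                out[r * 4 + c] += a[r * 4 + k] * b[k * 4 + c];
    return out;
}

// Invert a rigid transform [R|t]: the inverse is [R^T | -R^T t].
Mat4 rigidInverse(const Mat4& m)
{
    Mat4 inv{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            inv[r * 4 + c] = m[c * 4 + r]; // R^T
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            inv[r * 4 + 3] -= inv[r * 4 + c] * m[c * 4 + 3]; // -R^T t
    inv[15] = 1.0;
    return inv;
}

// Transform mapping points from the sub camera frame into the master frame,
// given the marker pose seen by each camera (camera_T_marker):
//   master_T_sub = master_T_marker * (sub_T_marker)^-1
Mat4 relativeTransform(const Mat4& master_T_marker, const Mat4& sub_T_marker)
{
    return mul(master_T_marker, rigidInverse(sub_T_marker));
}
```

For example, if the master sees the marker 1 m along +x and the sub sees it 2 m along +y, the resulting relative transform translates sub-frame points by (1, -2, 0), which is easy to verify by hand.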
  • Update each point with the calculated transform.
					// Check if the point is inside the bounding box.
					if (point.x >= -m_VBOBoxPrimitive.getWidth() / 2
						&& point.x <= m_VBOBoxPrimitive.getWidth() / 2
						&& point.y >= -m_VBOBoxPrimitive.getHeight() / 2
						&& point.y <= m_VBOBoxPrimitive.getHeight() / 2
						&& point.z >= -m_VBOBoxPrimitive.getDepth() / 2
						&& point.z <= m_VBOBoxPrimitive.getDepth() / 2)
					{
						// The point is colliding with the box.
						this->positionCache[count] = glm::vec3(point.x, point.y, point.z);

						// Multiply the point by the transform of the kinect
						// in the marker coordinate space.
						glm::vec4 transformed(this->positionCache[count], 1.f);
						transformed = m_relativeTransform * transformed;
						this->positionCache[count] = glm::vec3(transformed);

						this->uvCache[count] = glm::vec2(x, y);
					}
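The bounding check above reduces to comparing each absolute coordinate against the half-extent of the box on that axis. A compact standalone version (function and parameter names hypothetical, with w/h/d standing in for the VBO box dimensions):

```cpp
#include <cmath>

// Axis-aligned bounding-box test matching the check above: the box is
// centred on the origin, so a point is inside when each coordinate's
// magnitude is at most half the box extent on that axis.
bool insideBox(float x, float y, float z, float w, float h, float d)
{
    return std::fabs(x) <= w / 2 && std::fabs(y) <= h / 2 && std::fabs(z) <= d / 2;
}
```

Note the test is inclusive on the faces of the box, matching the `>=`/`<=` comparisons in the snippet above.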