Optimize ofxCv::toOf (Fastest way to render cv::Mats)

Hello all,

I have a thread running some heavy OpenCV code. The images are stored as RGB cv::Mat class members. These images are 1280x720 and I want to render a few hundred of them. Here is the approach I’m using to render the cv::Mat members in a simple example:

renderTest.zip (1.0 MB)

Excerpt:

ofSetColor(255,255,255,255);
ofEnableAlphaBlending();
for(int i = 0; i < 400; i++) {
  ofImage renderImage;
  ofxCv::toOf(cvImage, renderImage);
  renderImage.setImageType(OF_IMAGE_COLOR_ALPHA);
  renderImage.draw(0,0);
  //ofxCv::drawMat(*imageIter, 0, 0);
}
ofDisableAlphaBlending();
  1. I did notice that performance is slightly improved when I wrap the drawing code in ofEnableAlphaBlending() / ofDisableAlphaBlending(), rather than calling ofEnableAlphaBlending() in setup(). What is the best practice here?

  2. I tried adding an ofImage class member with setUseTexture(false) that was constructed in a child thread as follows:

    ofxCv::toOf(alphaImage, this->renderImage);
    this->renderImage.setImageType(OF_IMAGE_COLOR_ALPHA);

And then drawing in the main thread with:

renderImage.setUseTexture(true); // let the ofImage upload texture to GPU.
renderImage.reloadTexture(); // update texture copy
renderImage.draw(0,0);

That caused a lot of headaches and I gave up on it. There seem to be issues around the way I’m using the class instances, which get moved, copied, merged, and manipulated in many ways in the child thread, and having a member with OpenGL hooks seems to be causing many issues.

How can I improve performance? I don’t mind letting the thread do more work, as it’s already slow (over 1s). Maybe by using ofPixels as the class member?
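One way to picture the "CPU pixels in the thread, upload in the main thread" idea is a small hand-off slot guarded by a mutex. This is only a sketch in plain C++: `PixelMailbox`, `publish`, and `take` are hypothetical names, and a `std::vector` of bytes stands in for ofPixels. The worker never touches anything with OpenGL state; the main thread takes the finished bytes and would do the texture upload itself.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical hand-off slot (not an openFrameworks class).
// The worker thread publishes finished CPU pixels; the main (GL) thread
// takes them and is the only place where a texture upload would happen.
class PixelMailbox {
public:
    // Called from the worker thread when a frame is ready.
    void publish(std::vector<std::uint8_t> pixels) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = std::move(pixels);
        fresh_ = true;
    }

    // Called from the main thread. Returns true if new pixels were taken.
    bool take(std::vector<std::uint8_t>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) {
            return false;  // nothing new since the last take
        }
        out.swap(pending_);  // swap buffers instead of copying the bytes
        fresh_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<std::uint8_t> pending_;
    bool fresh_ = false;
};
```

The swap keeps the time spent under the lock short, so a slow worker does not stall drawing.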

Thanks.


i think the slowest part is the call to ofxCv::toOf(), as that makes a copy. You can avoid it by wrapping an ofPixels with toCv(), passing that to opencv, and then uploading the pixels into a texture. toCv() does not copy the pixels into a Mat but instead wraps the memory in the Mat structure, so it should be faster, especially if you are doing it lots of times.
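The difference between copying and wrapping can be sketched in plain C++. `PixelView`, `wrapPixels`, and `copyPixels` are hypothetical names for illustration, not part of ofxCv or OpenCV; the point is only that a wrapper shares the original buffer while a copy duplicates every byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning view over pixel memory (not an ofxCv type).
struct PixelView {
    std::uint8_t* data;  // points into a buffer owned by someone else
    std::size_t size;    // number of bytes visible through the view
};

// Wrapping: constant time, no pixel bytes are touched.
PixelView wrapPixels(std::vector<std::uint8_t>& buffer) {
    return PixelView{buffer.data(), buffer.size()};
}

// Copying: linear time, every byte goes into a new allocation.
std::vector<std::uint8_t> copyPixels(const std::vector<std::uint8_t>& buffer) {
    return std::vector<std::uint8_t>(buffer.begin(), buffer.end());
}
```

For scale, one 1280x720 RGB frame is 1280 * 720 * 3 = 2,764,800 bytes, so copying 400 of them moves roughly 1.1 GB per pass. The view stays valid only as long as the underlying buffer is alive and not reallocated.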

Thanks Arturo:

I only need to go from opencv to ofx, not the other way around. So how do I “upload the pixels into a texture” and avoid the copy?

hey!

ofxCv::toOf() only copies things when it absolutely can’t avoid it (like cv::Point2f, cv::Rect).

it doesn’t have to make a copy here. in this case, it uses setFromExternalPixels() internally to avoid copying. the code you wrote should be just as fast as drawMat().

the slowest part is your setImageType() and draw(). depending on the image format, setImageType() will add unnecessary overhead. also, you can use alpha blending without making an alpha channel on the image (just use ofSetColor(255, 128) for example to set everything at 50% opacity).
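To see why a constant alpha needs no per-pixel alpha channel, here is the arithmetic for one colour channel, assuming the usual "source over" blend (source times alpha plus destination times one minus alpha). `blendChannel` is a hypothetical helper; the GPU performs the equivalent per fragment when blending is enabled.

```cpp
#include <cstdint>

// "Source over" blend of one 8-bit channel with a constant alpha (0..255),
// rounded to the nearest integer. This models the arithmetic only.
std::uint8_t blendChannel(std::uint8_t src, std::uint8_t dst, std::uint8_t alpha) {
    const unsigned a = alpha;
    const unsigned mixed = src * a + dst * (255u - a);
    return static_cast<std::uint8_t>((mixed + 127u) / 255u);
}
```

With an alpha of 128, a white source over a black background comes out at about half intensity, which matches the 50% opacity mentioned above.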

the fastest thing you can do is to create a vector of ofImages in your setup(), and then in draw() you can loop through and draw them.


Thanks Kyle!

Could you elaborate on why setImageType() is adding overhead? If I call that before running toOf() would that help? Or somehow set up ofImages to expect a certain format to reduce overhead?

The alpha channel is the result of some OpenCV code, and is certainly needed (they mask meanShift segmented regions from live frames).

The images to be rendered are all constructed on the fly from OpenCV and processed in a thread. They are changed for every thread iteration (~2s). The above example is a simple proxy for my real code.

Is there a way I could do more work on the OpenCV side (I expect the thread to run slow, the more I can do in the thread the less load on rendering) to make rendering faster? i.e. change the pixel format somehow to make conversion faster and limit/mitigate the overhead of setImageType()? Adding ofImage members to my thread opened up a bit of a nightmare.

I am also using opencv_gpu; if I upload my images to the GPU as GpuMats is there a way to render those?

Thanks for your help; I’m open to any solution that still involves rendering large RGBA Mats stored in RAM in ofx. Slowing down the thread by prepping for rendering in OpenCV is expected.

Thanks again, I hope you have more ideas…

setImageType adds overhead because it does an image type conversion for you. the name is slightly misleading, it should be “convertImageType” or something.
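As a rough picture of what such a conversion involves, here is an illustrative RGB to RGBA conversion in plain C++. `rgbToRgba` is a hypothetical function and not openFrameworks' actual implementation; it only shows that every pixel has to be read and rewritten into a new buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative RGB -> RGBA conversion (not openFrameworks' actual code).
// Each pixel is read and rewritten into a new, larger buffer, which is why
// converting large images every frame is costly.
std::vector<std::uint8_t> rgbToRgba(const std::vector<std::uint8_t>& rgb,
                                    std::uint8_t alpha = 255) {
    const std::size_t pixelCount = rgb.size() / 3;
    std::vector<std::uint8_t> rgba(pixelCount * 4);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        rgba[i * 4 + 0] = rgb[i * 3 + 0];  // red
        rgba[i * 4 + 1] = rgb[i * 3 + 1];  // green
        rgba[i * 4 + 2] = rgb[i * 3 + 2];  // blue
        rgba[i * 4 + 3] = alpha;           // constant alpha
    }
    return rgba;
}
```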

it might be possible to drop the alpha channel faster with opencv than with OF. or, it might be fastest to not convert the image type at all and just use a shader (or disable alpha blending) to draw the image without alpha.

i haven’t worked with opencv_gpu at all so i can’t advise there.

Thanks Kyle.

To reiterate, I need the alpha channels. Since my cv::Mats are RGBA, I should be able to convert them to ofImage without any conversion, should I not? Actually, I am internally storing Alpha and RGB as separate Mats because they are generated by different algorithms. So if it’s faster to upload them separately, apply the alpha channels with a shader, and render the result to an FBO, that sounds promising.

Any sample code?

i see. in that case, yes, it will be faster to upload the rgb and alpha channels separately and then render them together on the gpu than it will be to combine them on the cpu. from your code above, it looks like you don’t need sample code, but if you are lost there are lots of examples of using depth maps in the way you are trying to use the alpha channels.

I’ll look at the ofx shader examples. As I understand it, the plan is:

  1. use toOf to convert the RGB cv::Mat to a ofImage (or ofPixels?)
  2. use toOf to convert the Alpha cv::Mat to a ofImage (or ofPixels?)
  3. Upload ofImage/ofPixels to GPU
  4. Use shader to combine RGB and A and render to FBO.

Thanks!

I’ve managed an approach (shaderRender.tar.bz2 (7.6 KB)) that seems to work, and in my small test it looks to be about twice as fast as the method above:

maskIter = masks.begin();
for (imageIter = images.begin(); imageIter != images.end() and maskIter != masks.end(); imageIter++) {
    // Convert Mats to ofImages for uploading to GPU
    ofImage uploadImage, uploadMask; // or ofPixels? seems this will do the GPU upload, what we want?
    ofxCv::toOf(*imageIter, uploadImage);
    uploadImage.setImageType(OF_IMAGE_COLOR); // should not need this? TODO any over-head / conversion?
    ofxCv::toOf(*maskIter, uploadMask);
    uploadMask.setImageType(OF_IMAGE_GRAYSCALE);
    // Render to FBO
    perceptsFBO.begin(); // render to FBO
    shader.begin();
    shader.setUniformTexture("image", uploadImage.getTextureReference(), 0);
    shader.setUniformTexture("mask", uploadMask.getTextureReference(), 1);
    ofTranslate(640, 360); // 1280/2, 720/2, specific to drawing on the plane.
    plane.draw();
    shader.end();
    perceptsFBO.end();
    maskIter++;
}

The gist is a simple shader that just uses the value of one texture as an alpha value to mask the RGB from another texture:

uniform sampler2DRect image;
uniform sampler2DRect mask;

in vec2 texCoordVarying;

out vec4 outputColor;

void main()
{
    vec4 texel0 = texture(image, texCoordVarying);
    // texture() returns a vec4; the single-channel mask value is in .r
    float alpha = texture(mask, texCoordVarying).r;
    outputColor = vec4(texel0.rgb, alpha);
}
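A CPU reference for what the shader computes can be handy when checking its output. The sketch below is plain C++ with a hypothetical `applyMask` function: it takes the colour from an interleaved RGB buffer and the alpha from a single-channel mask of the same pixel count, which is the same per-pixel rule the shader applies on the GPU.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// CPU reference of the masking rule: colour from `rgb`, alpha from `mask`.
// `rgb` holds 3 bytes per pixel, `mask` holds 1 byte per pixel.
std::vector<std::uint8_t> applyMask(const std::vector<std::uint8_t>& rgb,
                                    const std::vector<std::uint8_t>& mask) {
    if (rgb.size() != mask.size() * 3) {
        throw std::invalid_argument("rgb and mask must cover the same pixels");
    }
    std::vector<std::uint8_t> rgba(mask.size() * 4);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        rgba[i * 4 + 0] = rgb[i * 3 + 0];
        rgba[i * 4 + 1] = rgb[i * 3 + 1];
        rgba[i * 4 + 2] = rgb[i * 3 + 2];
        rgba[i * 4 + 3] = mask[i];  // mask value becomes the alpha
    }
    return rgba;
}
```

Running this on a small test image and comparing against pixels read back from the FBO is one way to confirm the shader path is doing what is intended.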

When applied in my main program, the improvement is not nearly as large as in this simple test case. Looking at callgrind output, I noticed that setImageType() accounts for 42% of CPU time (inclusive) with the shader version and only 2% with the old method, although in real time the shader version is much faster.

How can I avoid the setImageType() calls? The cv::Mats are RGB and greyscale; no conversion should be needed.

i think you can avoid the conversion by commenting out the setImageType() calls.

Mat colorMat(512, 512, CV_8UC3);
Mat alphaMat(512, 512, CV_8UC1);
ofImage colorImg, alphaImg;
toOf(colorMat, colorImg);
toOf(alphaMat, alphaImg);
if(colorImg.getPixelsRef().getImageType() == OF_IMAGE_COLOR &&
   alphaImg.getPixelsRef().getImageType() == OF_IMAGE_GRAYSCALE) {
    ofLog() << "good";
}

this prints “good”, which means the image type is already set, and you do not need to set it yourself.

if you are getting something with the wrong image type from toOf, let me know – that’s a bug. but as far as i can tell, you can just comment out setImageType().

Thanks for taking the time Kyle.

There is certainly something going wonky here. If I comment out the setImageType() calls, then on the first frame the alpha channel is not processed:
[image: first frame, rendered without the alpha channel applied]

On the second frame (and subsequent frames) the pixel contents are garbled:

All frames should look like the following, not like the images above:

For each of the messed up frames, getPixelsRef().getImageType() returns the proper values. If I keep the setImageType() calls, then all frames are rendered properly, as in the final image above. I tried pulling the latest ofxCv from git, and the same thing happens.

i wish i could help here, but i can’t see a simple enough example of what’s going wrong to narrow down what the bug is.

it sounds like what you’re saying is “without calling setImageType() sometimes my images aren’t rendered correctly”. but if i go through the process myself i don’t have any problems:

Mat colorMat = Mat::eye(512, 512, CV_8UC3) * 255;
Mat alphaMat = Mat::eye(512, 512, CV_8UC1) * 255;
ofImage colorImg, alphaImg;
toOf(colorMat, colorImg);
toOf(alphaMat, alphaImg);
colorImg.update();
alphaImg.update();
colorImg.draw(0, 0);
alphaImg.draw(512, 0);

this draws a red diagonal line and a white diagonal line, as expected.

if you could post similarly short code that shows something simple that isn’t working, i would be glad to take another look!

Kyle: your little sample code above solved my problems! I did not have ofImage::update() in my code! After replacing setImageType() with update(), all is well (and FAST!).
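Why the missing update() mattered can be pictured with a toy model, assuming (as the fix suggests) that the texture is a separate copy of the pixels that is only refreshed on update(). `ToyImage` and its members are hypothetical and do not reflect ofImage's real implementation.

```cpp
#include <cstdint>
#include <vector>

// Toy model of an image with CPU pixels and a separate "GPU" copy.
// Hypothetical names; this is not how ofImage is actually written.
class ToyImage {
public:
    // Changes the CPU-side pixels only.
    void setPixels(const std::vector<std::uint8_t>& pixels) { pixels_ = pixels; }

    // "Uploads" the CPU pixels to the texture copy.
    void update() { texture_ = pixels_; }

    // What a draw call would show: the texture copy, not the CPU pixels.
    const std::vector<std::uint8_t>& drawn() const { return texture_; }

private:
    std::vector<std::uint8_t> pixels_;
    std::vector<std::uint8_t> texture_;
};
```

In this model, writing new pixels without calling update() leaves the drawn content stale, which is consistent with the wrong first frame described earlier in the thread.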

After all this work, some updates seem to have messed up my performance! See this question.