Optimization: how to reduce data copying with fbo and ofThreadChannel


#1

Hi! I have a class which receives an fbo containing a shader-modified webcam image

void videoGenerator::addFrame(ofFbo & fbo) {
    fbo.readToPixels(exportPixels);
    frameSaveChannel.send(exportPixels);
}

I send this image to a thread channel for further processing and saving.

The addFrame() function is only called once every 5 seconds, so maybe it’s pointless to optimize this, but I would like to know what the best approach is :slight_smile:

As I see it, I’m receiving a reference to an fbo, which is light, then copying all the pixels from the GPU into an ofPixels (not so nice), then making another copy of those pixels when sending them to the channel. All this on a mobile device.

Is there something you would do different?


#2

the slowest part there really is downloading the pixels from the graphics card. the copy of the pixels is also a relatively expensive operation, but not as expensive as the download.

you can copy the texture to an ofBufferObject and then download that asynchronously, which takes the same total time as readToPixels but works in the background, and you can even do it on a different thread

to avoid the copy of the pixels you can have a pool of ofPixels and, instead of sending the pixels themselves, send the index of the pixels in the pool (or just a pointer), then return it through another channel to the main thread once you are done with it.

this addon: https://github.com/arturoc/ofxTextureRecorder does pretty much all of this


#3

Thank you, very useful tips! I measured the time when running on Desktop (similar for debug and release):

readToPixels() ~10ms 
send()          <1ms

Then I followed this ofBufferObject example.

Replacing readToPixels() with fbo.getTexture().copyTo(bufferCopy);, the time goes down to 0ms, which is great. But my frames are now black :slight_smile: No crash though. I wonder if it’s a GL image format issue.

This is how I use the buffer:

    // On a thread...
    texture[count].loadData(bufferCopy, GL_RGBA, GL_UNSIGNED_BYTE);
    texture[count].readToPixels(exportPixels);

    // Strip alpha, mirror
    exportPixels.setImageType(ofImageType::OF_IMAGE_COLOR);
    exportPixels.mirror(false, true);

    // Add frame to gif
    gif.addFrame(exportPixels);

    // Save image for mp4
    string exportFileName = "img" + index + ".jpg";
    ofSaveImage(exportPixels, exportFileName);

I think it goes wrong already in the first line, because when I draw texture[count] they’re all empty.

// addFrame
fbo.getTexture().copyTo(bufferCopy);
frameSaveChannel.send(true); // bool just for testing

//setup
for(size_t i = 0; i < totalFrames; i++) {
    ofTexture t;
    t.allocate(w, h, GL_RGBA);
    frames.push_back(t);
}
bufferCopy.allocate(w * h * 4, GL_STATIC_DRAW);
exportPixels.allocate(w, h, OF_PIXELS_RGBA);

Any ideas?

I’ll add an example to the ofBufferObject documentation, which is empty.


#4

mmh not sure what you are doing, but you can’t call any of the texture methods on a thread, and when using ofBufferObject you shouldn’t be calling readToPixels at all. take a look at the addon i posted. there’s also a couple of examples on ofBufferObject and ofThreadChannel in the examples folder


#5

I made it work! :slight_smile: I followed the threadedPixelBufferExample (thanks!), although the performance seems to be similar: 10ms are now spent in:

unsigned char * p = bufferCopy.map<unsigned char>(GL_READ_ONLY);

I imagine this depends on the graphics card.
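That 10ms is most likely the GPU→CPU transfer itself: copyTo() only issues the copy (hence the apparent 0ms), and map() has to wait for that copy to finish. A common way to hide the latency is to double-buffer: map the buffer whose copy was issued on the previous frame, while the current frame's copy runs in the background. A rough, untested sketch of the idea, assuming an ofApp with the same w, h, fbo and exportPixels members as above (not a drop-in):

```cpp
// Sketch only (needs an OF app and GL context): double-buffered readback,
// so map() waits on a copy issued a frame earlier, not the one just made.
ofBufferObject buffers[2];
int flip = 0;

// in setup():
for (auto & b : buffers) b.allocate(w * h * 4, GL_DYNAMIC_READ);

// for each captured frame:
fbo.getTexture().copyTo(buffers[flip]);      // async GPU -> buffer copy
auto & ready = buffers[1 - flip];            // last frame's copy is done by now
unsigned char * p = ready.map<unsigned char>(GL_READ_ONLY);
if (p) {
    exportPixels.setFromPixels(p, w, h, OF_PIXELS_RGBA);
    ready.unmap();
}
flip = 1 - flip;
```

The very first mapped frame will be stale or empty, so skip it; the same idea extends to a ring of N buffers if one frame of latency isn't enough to hide the transfer.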

How to figure out if a method should not be called in a thread? Study the OF source code?


#6

any gl call, or any call to an object in the OF gl folder, can’t be made from a thread other than the main one