Performance Issues with Pixel Operations

Hello openFrameworks-Community.

I have a sketch which is supposed to sort pixels by brightness within an image, frame by frame.

This is my code:

#include "ofApp.h"
ofImage image;

int w;
int h;

//--------------------------------------------------------------
void ofApp::setup(){
    
    w = ofGetWidth();
    h = ofGetHeight();
    
    unsigned char *pixels = new unsigned char[w * h * 3];
    for(int y = 0;y<h;y++){
        for(int x = 0;x<w;x++){
            
            int red = ofRandom(255);
            int green = 0;
            int blue = ofRandom(255);
            
            int index = 3 * (x + w * y);
            pixels[index] = red;
            pixels[index + 1] = green;
            pixels[index + 2] = blue;
        }
    }
    
    image.setFromPixels(pixels,w,h,OF_IMAGE_COLOR);
    delete[] pixels;
    
    
    
}

//--------------------------------------------------------------
void ofApp::update(){
    ofImage next;
    next.clone(image);
    
    for(int x = 1;x<w-1;x++){
        for(int y = 1;y<h-1;y++){
            ofColor c = next.getColor( x, y );
            ofColor n = next.getColor( x+1, y );
            int b1 = c.getBrightness();
           int b2 = n.getBrightness();
            
            if(b1<b2){
                ofColor temp = c;
                next.setColor(x,y,n);
                next.setColor(x+1,y,c);
                
            }
                   }
    }
    
    
    image.clone(next);
    cout << ofGetFrameRate() << endl;
    
}

It seems to me that it could be much much faster due to the fact that I have a similar program in Processing, which runs at about 18 FPS. In openFrameworks I have about 9 FPS.

Could you give me any hints for improving the performance of such an algorithm? I am kind of new to openFrameworks and switched from Processing because of the fact that it should run faster. It seems I did not find the best way to reach my goal.

Thank you very much and sorry for my poor english, I hope you got my point.

Greets!

it looks like you are creating a new image every frame with the clone operation. Maybe if you put it in the .h file (ie, move ofImage next to ofApp.h) it will be faster? I get worried whenever I see loading or allocating of memory on a per frame basis.

1 Like

I see the same frame rate as you. The biggest slowdowns here are the getColor() / setColor() pair, the image copies zach mentioned and getBrightness(). The main one is the getColor() / setColor() pairs.

Basically, the issue is that those methods recalculate the position in the image every time. Usually it’s fast enough, but if you do it for every pixel in the image it becomes slow. An analogy would be like picking up and dropping a hammer for every swing, instead of just holding on to it :smile:

The trick is to use an iterator or pointer into the image, and move it along pixel by pixel as you do your sorting. This code also uses the image directly, instead of making a copy. This code runs at about 35 FPS for me:

void ofApp::update(){
    
    int channelCount = image.getPixels().getNumChannels(); // 3 channels
    
    for(int y = 0; y < h; y++) {
        
        // get a pointer starting at the first pixel in this row
        unsigned char * cursor = image.getPixels().getData() + ((w * channelCount) * y);
        
        for(int x = 0; x < w - 1; x++) {
            
            // create "current" and "next" colors based on cursor position
            ofColor current(cursor[0], cursor[1], cursor[2]);
            ofColor next(cursor[3], cursor[4], cursor[5]);

            int b1 = current.getBrightness();
            int b2 = next.getBrightness();
            
            if(b1 < b2) {
                // swap pixels
                cursor[0] = next.r;
                cursor[1] = next.g;
                cursor[2] = next.b;
                cursor[3] = current.r;
                cursor[4] = current.g;
                cursor[5] = current.b;
            }
            
            // move cursor to the next pixel
            cursor += channelCount;
        }
    }
    
    image.update();
    cout << ofGetFrameRate() << endl;
}

After this, the next biggest slowdowns are creating the two colors every loop (current and next), and the getBrightness() call. You can get around the color slowdown by doing a little pointer trick, since an ofColor is basically an [r, g, b, a] array in memory:

void ofApp::update(){

    int channelCount = image.getPixels().getNumChannels(); // 3 channels
    
    for(int y = 0; y < h; y++) {
        unsigned char * cursor = image.getPixels().getData() + ((w * channelCount) * y);

        for(int x = 0; x < w - 1; x++) {
            
            // create "current" and "next" pointers based on cursor position
            ofColor * c = (ofColor *)cursor;
            ofColor * n = (ofColor *)(cursor + channelCount);

            int b1 = c->getBrightness();
            int b2 = n->getBrightness();

            if(b1 < b2) {
                // swap pixels. We need to make a temp copy this time, since we're
                // using the pixel memory directly
                ofColor temp = *c;
                cursor[0] = n->r;
                cursor[1] = n->g;
                cursor[2] = n->b;
                cursor[3] = temp.r;
                cursor[4] = temp.g;
                cursor[5] = temp.b;
            }
            
            // move cursor to the next pixel
            cursor += channelCount;
        }
    }
    
    image.update();
    cout << ofGetFrameRate() << endl;
}

This runs at about 45 FPS for me in debug, and a full 60 FPS in Release mode.

It’s not exactly “beginner”, but I hope that helps! :slight_smile:

2 Likes

you can also avoid a lot of multiplications and additions in those lines by changing them with:

    unsigned char * cursor = image.getPixels();
    for(int i=0;i<image.getPixelsRef().size();i++,cursor++) {
2 Likes

Nice work! This really boosts the performance.

@admsyn: I think you can further improve the performance by storing the brightness of the next pixel (b1 in case of swap and b2 in case of non-swap) in a bNext variable. Then you can use the bNext value for b1, except when x == 0. This cuts down on the brightness calculations, so gives some more fps. :smile:

@arturo: I’m having some trouble trying out your suggestion, just replacing the lines doesn’t seem to work as expected. With such a for loop, how can you distinguish between rows as needed for pixel sorting? And how does cursor++ relate to cursor += channelCount?

oh sure, didn’t realize you needed the row, also that @admsyn code already does the calculation for the cursor only once per row so it shouldn’t be much more gain but this should work:

unsigned char * cursor = image.getPixels();
for(int y = 0; y < h; y++,cursor+=channelCount) {
        for(int x = 0; x < w - 1; x++,cursor+=channelCount) {
           ...
        }
}

if you are using OF from github or from 0.9 you can do:

// setup
img.setImageType(OF_IMAGE_COLOR_ALPHA);

//update
for(auto line: img.getPixels().getLines()) {
    auto prevPixel = line.getPixels().begin();
    auto lineFromSecondPixel = ofPixels::Pixels(prevPixel+1, line.getPixels().end());
    for(auto pixel: lineFromSecondPixel) {
        ofColor * c = (ofColor *)&prevPixel[0];
        ofColor * n = (ofColor *)&pixel[0];

        int b1 = c->getBrightness();
        int b2 = n->getBrightness();

        if(b1 < b2) {
            swap(*n,*c);
        }
        prevPixel = pixel;
   }
}

that avoids the pointer math which is usually tricky and prone to errors and should be at least as fast

2 Likes

Ok, yeah that works. These examples have been very helpful for me to learn more about efficient pixel operations in OF and also as a catalyst to research memory and pointers (this morning I read the ofBook chapter on it). So thanks a lot @admsyn and @arturo! And thanks @MACIEJ for asking the question in the first place! :smile:

The new for loop style reminds me of enhanced for loops in java. Looks great! I just downloaded 0.8.4. I did also fork/clone the whole repo from github, but at the moment I am running into some errors when building it. However, sooner or later I will definitely try out this alternative option.

Hello there.
Thank you very much for replying. Really interesting posts!

@admsyn:
First of all: from 9 FPS to 45 FPS : wow! :scream:
…Although I cannot run your code. I get an error at the channelCount-Integer and the cursor-Pointer.

It says “> Member reference base type ‚unsigned char*‘ is not a structure or union”

@arturo:
your code looks pretty unfamiliar to me, but also very elegant :smile: I have to analyze it first, but thanks in advance!

I have an additional question:
What would be if i was going to compare my “current” pixel not with the very next one (in the sense of "n+1), but with for example the pixel over or under my current pixel(n+width or n-width).
I mean something like a pixel neighbourhood similar to cellular automata.

In my code I would just change

ofColor c = next.getColor( x, y )

to

ofColor c = next.getColor( x, y+1 );

It seems to me that @admsyn’s code is particullarly geared to the very next indicated pixel. Is there a chance to generalize it for every “cardinal direction”?
Or did I miss something?

Yeah, I got those errors as well at first. I think it is because we use OF 0.8.4. and some changes were made since that version. However to make it work in 0.8.4. you can change these lines to the following respectively:

// change to getPixelsRef()
int channelCount = image.getPixelsRef().getNumChannels();

// remove getData()
unsigned char * cursor = image.getPixels() + ((w * channelCount) * y);

@amnon: Thanks mate - worked for me! :wink:

…but at 31 FPS “only” :slight_smile:

Like I said above, if you store one of the brightness values in a variable, you can get some extra fps as well.

Adapted Code

void ofApp::update(){

    int channelCount = image.getPixelsRef().getNumChannels(); // 3 channels

    // variable to store the number of times pixels are swapped (for information only)
    int pixelSwaps = 0;

    // get a pointer starting at the first pixel of the image
    unsigned char * cursor = image.getPixels();

    for(int y = 0; y < h; y++, cursor += channelCount){

        // variable to store the brightness of the 'next' pixel
        int bNext = 0;

        for(int x = 0; x < w - 1; x++, cursor += channelCount){

            // create "current" and "next" pointers based on cursor position
            ofColor * c = (ofColor *)cursor;
            ofColor * n = (ofColor *)(cursor + channelCount);

            int b1 = bNext;
            if(x == 0) b1 = c->getBrightness();
            int b2 = n->getBrightness();

            if(b1 < b2){
                // swap pixels. We need to make a temp copy this time,
                // since we're using the pixel memory directly
                ofColor temp = *c;
                cursor[0] = n->r;
                cursor[1] = n->g;
                cursor[2] = n->b;
                cursor[3] = temp.r;
                cursor[4] = temp.g;
                cursor[5] = temp.b;
                pixelSwaps++;
                bNext = b1;
            }    else{
                bNext = b2;
            }
        }
    }

    image.update();

    ofSetWindowTitle(ofToString(ofGetFrameRate(), 2) + " fps | numPixelSwaps: " + ofToString(pixelSwaps));
}

I suppose to be even faster the alternative solution is to use a fragment shader, but then it doesn’t really matter if you use P5 or OF speed-wise, because the shader runs on the graphics card anyway.

@MACIEJ are you running the application in release mode? with this kind of things (loops over all the pixels with lots of operations, memory accesses…) it usually makes a huge difference to run in release or debug

@arturo Okay, I see. Thank you. I always forget to think about the release mode. This is kind of new to me. Anyway, I get about 37-39 FPS in release mode - nice!