I’m a beginner openFrameworks user and this is my first post here, so first I would like to say hello to everyone!
I’ve been writing a typical CPU-based raycasting engine for learning purposes, and I noticed that I get really poor performance when filling a 1920x1080 image buffer with calculated pixel colors.
I’m using ofImage as a buffer to draw to and I’m accessing ofPixels directly using ofPixels.setColor(). Once all pixels are set I’m calling ofImage.update() to update the texture and then ofImage.draw() to put the image on screen.
From what I see, nearly all the time is spent on setting the calculated pixel colors. If I skip that line (doing all the calculations but not writing to ofPixels), performance skyrockets from 10fps to 250-300fps or so.
Would you have any suggestions on how to get the image on screen faster, considering it is generated on the CPU in every frame? Should I get the GPU involved in the process somehow?
I will be grateful for any pointers so I can dig deeper and find a better solution.
Edit: If I set a single pixel byte (the green component, for example) by accessing the pixels directly using array syntax, performance is a bit better. So setColor() doesn’t seem to be efficient. Perhaps it would be faster if I had the image buffer as RGBA and wrote each pixel as a 4-byte word?
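A minimal sketch of the 4-byte-word idea from the edit above, in plain C++ without any openFrameworks calls. `packRGBA` and `setPixelRGBA` are hypothetical helpers, not library functions, and whether this is actually faster than per-channel writes would need to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Pack four 8-bit channels into one 32-bit word whose bytes sit in
// memory in R, G, B, A order regardless of the machine's endianness.
inline std::uint32_t packRGBA(std::uint8_t r, std::uint8_t g,
                              std::uint8_t b, std::uint8_t a) {
    const std::uint8_t bytes[4] = {r, g, b, a};
    std::uint32_t word = 0;
    std::memcpy(&word, bytes, sizeof(word));
    return word;
}

// Write one whole pixel with a single 32-bit store.
// buffer holds width * height words, one per pixel.
inline void setPixelRGBA(std::vector<std::uint32_t>& buffer, std::size_t width,
                         std::size_t x, std::size_t y, std::uint32_t word) {
    assert(x < width && y * width + x < buffer.size());
    buffer[y * width + x] = word;
}
```

For this layout to match an ofImage, the image would presumably have to be allocated with OF_IMAGE_COLOR_ALPHA rather than OF_IMAGE_COLOR.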
If you are sure your color object matches the number of channels and the channel order of your ofPixels object, you can use memcpy instead to write an entire pixel at once:
memcpy(&pixels[index], &color, 3);
If you don’t need alpha, use only RGB colors.
Cheers
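A small standalone sketch of the memcpy suggestion, for anyone who wants to try it outside openFrameworks. `Rgb8` and `writePixel` are hypothetical names; the struct stands in for a color object whose first three bytes are R, G, B, which is the assumption the advice above depends on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical 3-byte color. Its layout must match the buffer's
// channel count and order (here: interleaved R, G, B).
struct Rgb8 { unsigned char r, g, b; };
static_assert(sizeof(Rgb8) == 3, "Rgb8 must be tightly packed");

// Copy one whole pixel into an interleaved RGB buffer in a single call.
inline void writePixel(std::vector<unsigned char>& pixels, std::size_t width,
                       std::size_t x, std::size_t y, const Rgb8& color) {
    const std::size_t index = (y * width + x) * 3;
    assert(x < width && index + 3 <= pixels.size());
    std::memcpy(&pixels[index], &color, sizeof(Rgb8));
}
```

The static_assert is the important part: if the color type carried padding or a fourth channel in a different position, copying 3 bytes would silently write the wrong data.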
Thank you @dimitre !
Writing to pixels directly didn’t seem to make much difference compared to .setColor(). memcpy does look a bit faster, but it’s not a dramatic difference. Still, a nice improvement.
Hi, this is the usual bottleneck, as memory transfer from CPU to GPU tends to be slow, although it shouldn’t be that slow.
Can you post the code you are using? Sometimes there are tiny tricks that help.
Ironically, I think I am doing a similar thing, and I talked about it in my recent post about a crash when uploading to an ofFbo.
So I do the following, which I think is pretty quick. You’d need to keep the pixel float data in a separate array ready to upload.
ofFbo showPixels; //allocate elsewhere
//set up an array of zeroes
int b1 = width;
int b2 = height;
int max = b1 * b2 * 4; //allowing for RGBA data
float* data = new float[max];
memset(data, 0, sizeof(float) * max);
//replace with data where we have it
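for(int i = 0; i < max; i++){ //cycle through and set the pixel RGBA values
    // write the value for data[i] here; without braces the upload below would become the loop body
}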
//send the array to the fbo texture at the right channel
if(showPixels.isAllocated()) showPixels.getTexture(0).loadData(data, b1, b2, GL_RGBA, GL_FLOAT);
//clear up after
delete [] data;
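A possible variation on the snippet above, sketched in plain C++: keep the float buffer alive between frames instead of calling new and delete each time. `FloatFrame` is a hypothetical helper, not an openFrameworks type, and the upload itself is left out so the block stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder for an RGBA float image that is allocated once
// and reused every frame.
struct FloatFrame {
    int width;
    int height;
    std::vector<float> data;  // width * height * 4 floats, zero-initialised

    FloatFrame(int w, int h)
        : width(w), height(h),
          data(static_cast<std::size_t>(w) * h * 4, 0.0f) {}

    // Set the four channels of the pixel at (x, y).
    void set(int x, int y, float r, float g, float b, float a) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        float* px = &data[(static_cast<std::size_t>(y) * width + x) * 4];
        px[0] = r; px[1] = g; px[2] = b; px[3] = a;
    }
};
```

In the snippet above, `frame.data.data()` would then take the place of `data` in the loadData call, with `frame.width` and `frame.height` as `b1` and `b2`.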
@roymacdonald The rendering function is a bit of spaghetti now and hard to quote as it’s long.
I’ve just started refactoring it so I can post it later once it’s more readable.
Setting pixels itself is done like this right now:
int index = (y * _resX + x) * 3;
memcpy(&_buffer.getPixels()[index], &pixelColor, 3);
Thank you @Sam_McElhinney_io !
If I understand correctly, you are bypassing ofPixels altogether here and just allocating the data on the heap, and once you’re done you load that data straight into the GPU texture?
So in my case I could load the data to the ofTexture that is bound to the ofImage that I want to put on screen?
Hi,
That seems quite inefficient, as you are still doing one memcpy per pixel.
What @Sam_McElhinney_io suggests is a better approach. You can still do it with an ofImage and it would work fine.
The following code runs without the fps being affected:
ofApp.h
#pragma once
#include "ofMain.h"
class ofApp : public ofBaseApp{
public:
    void setup();
    void update();
    void draw();
    ofImage img;
};
ofApp.cpp
#include "ofApp.h"
//--------------------------------------------------------------
void ofApp::setup(){
    img.allocate(1920, 1080, OF_IMAGE_COLOR);
}
//--------------------------------------------------------------
void ofApp::update(){
}
//--------------------------------------------------------------
void ofApp::draw(){
    auto p = img.getPixels().getData();
    auto s = img.getPixels().size();
    // just use a single value, copied to all pixels, so the calculation of it does not affect performance
    float f = ofMap(ofGetElapsedTimeMillis()%3000, 0, 2999, 0, 255 );
    for(size_t i = 0; i < s; ++i){
        p[i] = (unsigned char)f;
    }
    ofSetColor(255);
    img.update();
    img.draw(0,0);
    ofDrawBitmapStringHighlight(ofToString(ofGetFrameRate()), 20, 20);
}
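The example above writes the same value to every byte. As a rough sketch of how the same raw-pointer loop might look with a different color per pixel, here is a standalone version. `Rgb`, `shade` and `renderInto` are hypothetical names, and `shade` is only a placeholder for whatever the raycaster computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rgb { unsigned char r, g, b; };

// Placeholder for the per-pixel result of the raycaster (an assumption).
inline Rgb shade(std::size_t x, std::size_t y) {
    return { static_cast<unsigned char>(x & 0xFF),
             static_cast<unsigned char>(y & 0xFF),
             128 };
}

// Walk an interleaved RGB buffer once with a running pointer, so there is
// no index multiplication and no function call into the image per pixel.
inline void renderInto(unsigned char* p, std::size_t width, std::size_t height) {
    assert(p != nullptr);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const Rgb c = shade(x, y);
            *p++ = c.r;
            *p++ = c.g;
            *p++ = c.b;
        }
    }
}
```

In the ofApp above, the pointer passed in would be the one returned by img.getPixels().getData(), fetched once per frame rather than once per pixel.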
Yeah, I just do a standard ofFbo allocation in setup, and then in the update loop set a timer switch to do the push of data onto the ofFbo every other second. I use fbos because I then use shaders to do other things to the pixels.
You would need some kind of array in the main thread to separately calculate and keep track of all the pixel values, but that could just be a vector of glm::vec4, or whatever. What I actually do with that is have four threads, all calculating pixel values and updating separate parts of that array in parallel, so the calculation doesn’t kill the frame rate either. That part isn’t strictly necessary, though.
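One way the "separate parts of the array" idea could be sketched with std::thread is shown below. This is not the setup described above, which keeps its threads running; here the workers are started and joined on every call, which is simpler but pays the thread start-up cost each frame. `shadeValue` and `renderParallel` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder for the real per-pixel computation (an assumption).
inline float shadeValue(std::size_t x, std::size_t y) {
    return static_cast<float>(x + y);
}

// Fill an RGBA float buffer using numThreads workers. Each worker owns a
// contiguous band of rows, so no two threads write to the same element
// and no locking is needed.
inline void renderParallel(std::vector<float>& data, std::size_t width,
                           std::size_t height, unsigned numThreads) {
    assert(data.size() >= width * height * 4);
    if (numThreads == 0) numThreads = 1;
    const std::size_t rowsPerThread = (height + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t yBegin = std::min(height, t * rowsPerThread);
        const std::size_t yEnd = std::min(height, yBegin + rowsPerThread);
        workers.emplace_back([&data, width, yBegin, yEnd] {
            for (std::size_t y = yBegin; y < yEnd; ++y) {
                for (std::size_t x = 0; x < width; ++x) {
                    float* px = &data[(y * width + x) * 4];
                    const float v = shadeValue(x, y);
                    px[0] = v; px[1] = v; px[2] = v; px[3] = 1.0f;
                }
            }
        });
    }
    // Wait for every band to finish before the buffer is uploaded.
    for (std::thread& w : workers) w.join();
}
```

The join at the end matters: the upload to the texture should only happen once all bands are complete, otherwise a half-written frame could end up on screen.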