i’ve been switching between ofGLRenderer and ofGLProgrammableRenderer on a project and have noticed huge slowdowns when drawing lots of separate ofMeshes with the ofGLProgrammableRenderer. with the ofGLRenderer, the same drawing is lightning fast.
ofGLRenderer - 4000+ mesh strips ~ 60fps
ofGLProgrammableRenderer - 2000 mesh strips ~ 40fps
i usually give in by this stage and switch back to the old ofGLRenderer where i know things run faster, but i want to use some fancy new shaders, so i’m trying to work out how i can improve my performance using ofGLProgrammableRenderer.
one thought i had was to somehow add my meshes into a single ofVbo and draw them all at once… but i’m using OF_PRIMITIVE_TRIANGLE_STRIP and i’m not sure how to keep the meshes separate inside one ofVbo. i suppose i could use OF_PRIMITIVE_TRIANGLES, but this wouldn’t be as fast.
if anyone has any other ideas of how to improve performance in this case using ofGLProgrammableRenderer, please let me know.
found one possible solution using “degenerate triangles”, which i’ll try tomorrow.
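for anyone reading later, here is a rough sketch of the degenerate triangle idea in plain C++, not tied to any openFrameworks type. `stitchStrips` is a made-up helper name and `T` stands in for whatever vertex type you use. the trick is to repeat the last vertex of one strip and the first vertex of the next, so the triangles connecting them have zero area and the GPU skips them. the extra repeat for odd counts is there so each strip keeps its original winding.

```cpp
#include <cstddef>
#include <vector>

// joins several triangle strips into one strip by inserting repeated
// ("degenerate") vertices between them. T is any copyable vertex type.
// stitchStrips is a hypothetical helper, not an openFrameworks function.
template <typename T>
std::vector<T> stitchStrips(const std::vector<std::vector<T>>& strips) {
    std::vector<T> out;
    for (const auto& strip : strips) {
        if (strip.size() < 3) continue;  // too short to form a triangle, skip it
        if (!out.empty()) {
            out.push_back(out.back());     // repeat last vertex of the previous strip
            out.push_back(strip.front());  // repeat first vertex of the next strip
            // a strip has to start at an even position to keep its winding,
            // so pad with one more repeat when the running count is odd
            if (out.size() % 2 != 0) out.push_back(strip.front());
        }
        out.insert(out.end(), strip.begin(), strip.end());
    }
    return out;
}
```

the resulting vertex list can then go into one mesh in OF_PRIMITIVE_TRIANGLE_STRIP mode and be drawn with a single call. colors, normals and tex coords would need the same repeats so all attributes stay aligned.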
using degenerate triangles works well, but plain triangles are usually more flexible, and since the bottleneck is usually uploading lots of vbos rather than uploading lots of vertices, that should be ok too.
also, if you can somehow profile your program to check which calls are slowing things down compared to the fixed pipeline renderer, that would help a lot in optimizing the programmable renderer.
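if you go the plain triangles route, converting an existing strip is mechanical. a minimal sketch, assuming 32-bit indices and a hypothetical helper called `stripToTriangles` (not part of openFrameworks): every consecutive triple of strip indices becomes one triangle, and every second triangle gets two indices swapped so the winding stays consistent.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// expands triangle strip indices into a plain triangle list.
// stripToTriangles is a hypothetical helper name.
std::vector<std::uint32_t> stripToTriangles(const std::vector<std::uint32_t>& strip) {
    std::vector<std::uint32_t> tris;
    if (strip.size() < 3) return tris;
    tris.reserve((strip.size() - 2) * 3);
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        std::uint32_t a = strip[i];
        std::uint32_t b = strip[i + 1];
        std::uint32_t c = strip[i + 2];
        if (a == b || b == c || a == c) continue;  // drop degenerate triangles
        if (i % 2 == 1) std::swap(a, b);           // odd triangles are wound the other way
        tris.push_back(a);
        tris.push_back(b);
        tris.push_back(c);
    }
    return tris;
}
```

the output is meant for a mesh in OF_PRIMITIVE_TRIANGLES mode. a triangle list has no implicit connectivity, so meshes can be concatenated without any stitching vertices.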
below is a time profile from ofGLProgrammableRenderer.
it looks like when an ofMesh is drawn, all its data is copied into an ofVbo first and then pushed to the graphics card. this process takes almost as much time as the actual rendering. i guess using ofVboMesh would get around this particular bottleneck… but perhaps when using the ofGLProgrammableRenderer, ofMesh should default to an ofVboMesh somehow?
can confirm that putting all meshes into a single ofVboMesh and drawing it all at once gives a massive speed increase in the ofGLProgrammableRenderer. it seems the overhead of swapping out vertex arrays in the ofGLRenderer is minimal, but in the ofGLProgrammableRenderer, things need to be batched. it’s a whole new way of thinking!
yeah, when an ofMesh is drawn in the programmable renderer it’s always uploaded to a vbo, and the previous buffer is discarded as well. there’s no way around that in GL 3+, and updating the vbo every time only makes things worse. batching things is the most correct way to draw in any case, but were you getting better performance just by putting things in a vboMesh?
at some point i added a deprecation message when drawing an ofMesh with the programmable renderer, but there are a couple of cases where it makes sense, like drawing text to the screen. there are ways to get the mesh and store it in a vbo now, but if you just want to quickly draw some text, the fastest way is to just draw an ofMesh internally.
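to show what the batching step boils down to, here is a sketch in plain C++. `SimpleMesh` and `mergeMeshes` are stand-ins i made up for illustration, they are not openFrameworks types. the one detail that is easy to get wrong is that each mesh’s indices have to be shifted by the number of vertices already in the batch.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// stand-in for an indexed triangle mesh (hypothetical, not an OF type)
struct SimpleMesh {
    std::vector<std::array<float, 3>> vertices;
    std::vector<std::uint32_t> indices;
};

// concatenates many indexed meshes into one so they can be uploaded
// and drawn in a single call
SimpleMesh mergeMeshes(const std::vector<SimpleMesh>& meshes) {
    SimpleMesh batch;
    for (const auto& m : meshes) {
        // indices of this mesh must point past the vertices already batched
        const auto offset = static_cast<std::uint32_t>(batch.vertices.size());
        batch.vertices.insert(batch.vertices.end(), m.vertices.begin(), m.vertices.end());
        for (std::uint32_t idx : m.indices) {
            batch.indices.push_back(idx + offset);
        }
    }
    return batch;
}
```

in openFrameworks i believe `ofMesh::append` handles this offset for you, so the equivalent would be appending every mesh into one ofVboMesh in OF_PRIMITIVE_TRIANGLES mode and drawing that once per frame, but treat that as an assumption and check it against your OF version.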
yeah performance was much better using ofVboMesh over ofMesh.
thanks for the explanation.