Deferred rendering example

Hey guys, I posted a quick OF deferred rendering example to github. It’s very much WIP and verbose/unoptimized, but thought I’d post a quick link to it here in case anyone is interested!

https://github.com/jacres/of-DeferredRendering

cheers,
James

wow - awesome!
Thanks for posting this - Theo

Yes, it seems very promising, I will test it for sure! Thanks for sharing it, @james.

Just a few notes: on 64-bit Linux (Ubuntu 12.04) I got an error in gBuffer.cpp at line 8:

  
#include "gbuffer.h"  

it should instead be

  
#include "gBuffer.h"  

And because I have an Nvidia GT540M with Optimus, I have to launch the app with

optirun ./of-DeferredRendering_debug

I receive

X Error of failed request: GLXUnsupportedPrivateRequest
Major opcode of failed request: 154 (GLX)
Minor opcode of failed request: 16 (X_GLXVendorPrivate)
Serial number of failed request: 63
Current serial number in output stream: 67

Commenting out this line in testApp.cpp:

  
ofSetVerticalSync(false);  

solved the problem. I get about 20 fps with 25 lights.

Thanks Theo - still very WIP and not running super well yet, but it’s a start.

Thanks for letting me know about that Linux problem, kalwalt - I commented the vsync line out.

I found a lot of stupid mistakes in the code and ended up fixing a whole bunch of things which should improve performance. The newest version is up on github

Kalwalt - would you mind trying it again if you get a chance sometime? I’m curious to see if it runs better for you now (I get 30fps with 25 lights on a Macbook Pro with an nvidia GT330M). I’m thinking the 540M should do a bit better.

I’m going to look at putting light volumes in tomorrow - this should give a good little boost.

As a next step I’m also going to try getting rid of the position texture and reconstructing the position in the shader from the depth value, and merging the depth value into the w component of the normals texture. That gets rid of 2 textures, which saves bandwidth/writes/lookups… not sure if it will actually be any faster, but it might be on fill-rate limited cards?

Hopefully that will make it into a more usable state :wink:

I updated the github version with some changes. Added in the light volumes technique - basically you draw bounding spheres around each point light at a size of the extent of where they’d affect lighting.

I was doing a pretty crappy brute force way before this - lighting calculations for each light were being calculated for every pixel. What the light volumes do is make sure that the lighting calculation for each light is only applied to the pixels that particular light affects, speeding things up. Seemed to work well! The framerate went up from 25 lights @ 30 fps to 250 lights @ 30 fps.
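A minimal sketch of how the size of such a bounding sphere could be chosen, not the actual code from the repo. It assumes a constant/linear/quadratic attenuation model and a 1/256 brightness cutoff; the `PointLight` struct and `lightVolumeRadius` are hypothetical names made up for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical light description; the repo's own light class may differ.
struct PointLight {
    float r, g, b;        // light colour
    float intensity;      // scalar brightness multiplier
    float attConstant;    // attenuation = c + l*d + q*d*d
    float attLinear;
    float attQuadratic;   // must be > 0 for this formula
};

// Distance at which the brightest colour channel drops to `cutoff`
// (1/256 is roughly one 8-bit step). Beyond it the light adds nothing visible,
// so it is a reasonable radius for the sphere that gets drawn.
float lightVolumeRadius(const PointLight& light, float cutoff = 1.0f / 256.0f) {
    float maxChannel = std::max({light.r, light.g, light.b}) * light.intensity;
    // Solve maxChannel / (c + l*d + q*d^2) = cutoff for d, i.e.
    // q*d^2 + l*d + (c - maxChannel / cutoff) = 0
    float c = light.attConstant - maxChannel / cutoff;
    float disc = light.attLinear * light.attLinear - 4.0f * light.attQuadratic * c;
    if (disc < 0.0f) return 0.0f;  // light is always dimmer than the cutoff
    float d = (-light.attLinear + std::sqrt(disc)) / (2.0f * light.attQuadratic);
    return std::max(d, 0.0f);
}
```

With pure quadratic falloff and a white light of intensity 1, this gives a radius of 16 units, since 1 / 16² = 1/256.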

I have tried the new version and I get 47-50 fps with 100 lights. But I noticed a problem in fullscreen mode: it seems that the mesh is rendered twice in the main window, and shifted. Did you notice this?
See the screenshot and the red arrow.

![](http://forum.openframeworks.cc/uploads/default/2734/Schermata del 2013-01-18 15:38:16.png)

Hey Kalwalt - thanks for testing it out.

Unfortunately the FBO textures need to be recreated at the new window size when the dimensions change (currently they’re being set to the window width and height when setup is called). It still thinks everything is at the original 800x600 so it’s throwing the texture coordinates off that are based on the screen frag coords.
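A small CPU-side sketch of why a stale size breaks the lookup, assuming the shaders derive their texture coordinates by dividing the fragment coordinate by a width/height value (an assumption about the shaders, which are not shown here). `Uv` and `screenUv` are made-up names.

```cpp
// Mirrors the kind of divide a fragment shader would do with
// gl_FragCoord.xy and a screen-size uniform.
struct Uv { float u, v; };

Uv screenUv(float fragX, float fragY, float bufferWidth, float bufferHeight) {
    return { fragX / bufferWidth, fragY / bufferHeight };
}
```

For a fragment at (1200, 900) in a 1600x1200 window, dividing by the stale 800x600 size gives (1.5, 1.5), which is outside the texture, while dividing by the real size gives the expected (0.75, 0.75).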

I’ll add that in - should be pretty quick to do!

That sounds good! Also a question: I saw some screenshots on your Flickr page of oF and subsurface scattering. Is it a real subsurface scattering implementation? It seems well done… I have made a fake one with GLSL, but yours seems to use a depth comparison framebuffer, am I right?

That was a horrible hack that would really only work properly for spheres. With SSS you need to know the thickness of the object that the light is shining through. I just used the radius of the bounding sphere around the object, which was fast and looked ok for things like the metaballs and cubes. Actually, for the metaballs, it was a hardcoded thickness value that was the same for everything (but basically acted as a sphere).

There’s a great SSS technique here that I’m looking to try out - http://publications.dice.se/attachments/Colin-BarreBrisebois-Programming-ApproximatingTranslucency.pdf

It’s quite easy to implement, but works really well apparently. The only thing is you have to bake the inverse AO map so it’s not something that’s realistic to do for realtime generative meshes or things that are deforming. But, hey, it’ll probably be possible in a few years with new hardware :slight_smile:

Hey Kalwalt - I updated the github repo. Full screen should work now, but the app has to start full screen using OF_FULLSCREEN in main.cpp (you can’t toggle between the two during runtime). To toggle, the fbo textures would have to be recreated whenever the window size changes and I’m thinking that’s probably not worth the code mess since it’s not commonly done in a production setting.

I also removed the position texture from the gbuffer and view space positions are now recreated in the shaders from the linear depth values - took a bit of work, but it’s working well. I think this made the SSAO run a lot faster in the process - turning SSAO on and off has a very minimal effect on framerate, whereas it used to drop it quite a bit.
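A rough sketch of one way to do this kind of reconstruction, written as plain C++ rather than GLSL so it can be run on its own. It is not the repo's shader: it assumes the stored depth is the positive distance along the view axis and that the camera uses a symmetric perspective projection. `Vec3` and `viewPosFromDepth` are hypothetical names.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// u, v:        screen-space texture coordinate in [0, 1]
// linearDepth: positive distance along the view axis (assumed G-buffer content)
// fovY:        vertical field of view in radians
// aspect:      width / height
Vec3 viewPosFromDepth(float u, float v, float linearDepth, float fovY, float aspect) {
    // Map [0, 1] to [-1, 1]
    float ndcX = u * 2.0f - 1.0f;
    float ndcY = v * 2.0f - 1.0f;
    float tanHalfFov = std::tan(fovY * 0.5f);
    // At depth d the frustum spans +-d*tan(fovY/2) vertically,
    // and aspect times that horizontally.
    return { ndcX * tanHalfFov * aspect * linearDepth,
             ndcY * tanHalfFov * linearDepth,
             -linearDepth };  // OpenGL view space looks down -Z
}
```

The point is that only one float per pixel (the depth) has to be stored, and the x and y components come for free from the pixel's screen position.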

The linear depth buffer texture is also removed and the depth values are now stored in the alpha component of the normal texture (basically 2 float textures were removed - so this will save on bandwidth and make a difference on cards that are fillrate limited).

Now it also works well with my Intel integrated video card! I get slightly lower framerates in both cases (Intel or Nvidia video card), but it is a good compromise. Well done!

Hey Kalwalt - I think I’m finally done working on this… have spent way too long with it :wink:

Resizing of the app window while running as well as toggling fullscreen now works (regenerating the fbos and textures when it’s resized now).

I also added the stencil trick to cut down on the number of pixels that are affected by the lighting (this also fixes a problem where the light would disappear if you put the camera within the area it affected). The drawing of the spheres and boxes is now using VBOs and it’s much faster on my cards - it should be a bit faster for you now (this example is now hitting 60 fps with 1000 lights on Linux!)

Should have mentioned that I implemented the techniques described here:
http://ogldev.atspace.co.uk/www/tutorial35/tutorial35.html (plus tutorial 36 and 37)

cheers,
James

I haven’t tried your new improvements yet, but I’m very curious to see them! Also, thanks for the reference.

Resizing and toggling to fullscreen now work well on both my video cards (Intel and Nvidia). I only receive these warnings:

[warning] ofVbo: bad texCoord data!

[warning] ofVbo: bad index data!

[warning] ofVbo: bad color data!

[warning] ofVbo: bad index data!

But this is not a big problem.
I get 13 fps with 1000 lights on my Nvidia card. Maybe using Bumblebee for Optimus (sigh! I have no choice…) slows the framerate down a lot…

Hi @james!

Thanks a lot for this addon… I’ve been able to run it and it’s really useful for rendering more than 8 lights…

But I’m facing something strange… a picture to try to explain it:

On the left is the classic pipeline I was used to… as you can see (not very clearly, sorry) the walls under the stairs are visible… but in your deferred shading, these walls and many others are “gone”…?

In both cases the model comes from Arturo’s ofxFBX, which lets me import geometry and lights exported from Cinema4D…

Do you have any idea why some walls are not being drawn in the deferred shading version? Is it something related to normals? Or to double-sided shading (a concept in 3D software where a face is always rendered from both sides, ignoring its normal)…

Any thoughts on how to fix this?

thanks !

Hi:

I noticed that in
ssao.frag:

vec4 pos = vec4( (tex_coord.x-0.5)*2, (tex_coord.y-0.5)*2, 1, 1 );
vec4 ray = u_inverseProjMatrix * pos;
return ray.xyz * depth;

Since pos is in NDC with z=1, ray will be inverse-projected back onto the far plane. So I think ray.xyz should be normalized by abs(ray.z) before multiplying by depth to recover the position in view space. Is this a typo, or is there some other reason behind it?

Thanks!
Eric
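
A quick numeric check of the snippet above, under two assumptions that the thread does not confirm: that u_inverseProjMatrix is the inverse of a standard symmetric OpenGL perspective matrix, and that depth holds the positive view-space distance. `Perspective`, `makePerspective`, `project` and `inverseProject` are made-up helpers for this sketch, not names from the repo.

```cpp
#include <cmath>

struct Vec4 { float x, y, z, w; };

// Non-zero entries of a standard symmetric OpenGL perspective matrix:
// [a 0 0 0; 0 b 0 0; 0 0 c d; 0 0 -1 0]
struct Perspective { float a, b, c, d; };

Perspective makePerspective(float fovY, float aspect, float nearZ, float farZ) {
    float b = 1.0f / std::tan(fovY * 0.5f);
    return { b / aspect, b,
             -(farZ + nearZ) / (farZ - nearZ),
             -2.0f * farZ * nearZ / (farZ - nearZ) };
}

// View space -> clip space.
Vec4 project(const Perspective& p, Vec4 v) {
    return { p.a * v.x, p.b * v.y, p.c * v.z + p.d * v.w, -v.z };
}

// Clip space -> view space: analytic inverse of the matrix above,
// with no perspective divide, like the shader snippet.
Vec4 inverseProject(const Perspective& p, Vec4 clip) {
    return { clip.x / p.a, clip.y / p.b, -clip.w, (clip.z + p.c * clip.w) / p.d };
}
```

Under those assumptions, unprojecting any point (x, y, 1, 1) gives ray.z = -1 and ray.w = 1 / far, so abs(ray.z) is already 1 and dividing by it would change nothing: ray.xyz * depth lands at view-space depth -depth. The actual far-plane point would be ray.xyz / ray.w, and that is the one to scale if depth were stored normalized by the far distance instead.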