Local rotation in a vertex array

Hi,

I am trying to render a particle system which is filled with locally rotating discs.

Right now I am going the somewhat inefficient route of the following for drawing each particle:

glPushMatrix
glTranslatef(x,y) <-- Move to the X,Y position
glRotatef… <-- To rotate Locally.
glBegin(GL_TRIANGLE_FAN)
glVertex3f <-- All vertex definitions are local.
glEnd()
glPopMatrix

I need better performance from the rendering and hence am coding it into a Vertex Array. However I have hit an issue. I can’t seem to do local rotations to items in a vertex array (which makes sense).

The only approach I can see round this is to manually rotate my vertex points before adding into the vertex array (using some pure math).
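A minimal sketch of that manual approach, assuming the disc lies in its local XY plane and is rotated about the X axis and then the Y axis before being translated. `Vec3` and `transformLocalPoint` are hypothetical names for illustration, not part of OpenGL or openFrameworks, and angles are in radians rather than the degrees glRotatef takes:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical helper: applies the same transform as
//   glTranslatef(px, py, 0); glRotatef(rotX, 1,0,0); glRotatef(rotY, 0,1,0);
// to a local disc point (lx, ly, 0), on the CPU.
Vec3 transformLocalPoint(float lx, float ly,
                         float rotXRad, float rotYRad,
                         float px, float py) {
    // The last-specified rotation (about Y) reaches the vertex first.
    const float x1 = lx * std::cos(rotYRad);
    const float z1 = -lx * std::sin(rotYRad);
    // Then the rotation about X.
    const float y2 = ly * std::cos(rotXRad) - z1 * std::sin(rotXRad);
    const float z2 = ly * std::sin(rotXRad) + z1 * std::cos(rotXRad);
    // Finally the translation to the particle position.
    return Vec3{ x1 + px, y2 + py, z2 };
}
```

Each transformed point can then be appended to the float array that is later handed to glVertexPointer, so the whole particle system is drawn without per-particle matrix calls.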

Does anyone have any guidance on this or any other approaches?

Thanks in advance,

Luke

hi luke,

actually your original method isn’t so inefficient at all, and i’m not sure what advantage vertex arrays will be giving you here, unless you’ve got tens of thousands of particles (and since you need to rotate them, i’m guessing not)… probably your bottleneck is somewhere else, eg the pixel fill rate of your video card, or texture uploading/texture cache misses.

easy way to test whether fill rate is the bottleneck is to make the output window smaller and if necessary scale down all the geometry (so that everything still fits in the window, and you aren’t getting benefits from viewport clipping). if your fps improves, then the bottleneck is pixel fill rate and changing the geometry generation code won’t help.

have you profiled your code (with Shark or OpenGL Profiler (OSX)/vTune (Win32)/oprofile (Linux); or gDEBugger) to figure out where the bottleneck actually is?

Thanks Damian.

You’re quite correct. I used gDEBugger to establish that OpenGL drawing speed is not the issue.

The issue lies in my code. When I slowed it down I could see that my framerate is jumping up and down sporadically, while the number of OpenGL calls per frame stays constant.

One of my functions must be demanding only occasionally. Now to isolate which one. oprofile can tell me the average CPU time of each function, but my issue is sporadic spikes. I’ll have to dig some more.
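Since an average hides sporadic spikes, one option is to record the worst case for each code section alongside the mean. A rough sketch; `TimingStats` and its members are hypothetical names, and the timing source is left to the caller:

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical per-section statistics: keeps the worst case as well as the mean.
struct TimingStats {
    double maxSec = 0.0;
    double totalSec = 0.0;
    std::size_t samples = 0;

    // Record one measured duration, in seconds.
    void add(double sec) {
        maxSec = std::max(maxSec, sec);
        totalSec += sec;
        ++samples;
    }

    double mean() const { return samples ? totalSec / samples : 0.0; }

    // True when the slowest sample is more than `factor` times the average.
    bool hasSpike(double factor) const {
        return samples > 0 && maxSec > mean() * factor;
    }
};
```

One instance per function of interest, fed once per frame, shows which section's maximum jumps when the jitter occurs.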

Thanks again.

OK, so I spent an hour finding the source of the variance in frame rate.

It does appear to be my OpenGL calls. Total time for calling the particle draw occasionally spikes from 0.02 secs to 0.15 secs, causing a jitter. The draw function is drawing a constant number of triangles, so there should be no variance in the time required to execute the draw.

If I turn off the drawing procedure it’s smooth sailing.

However, the variance occurs regardless of whether I use a Vertex Array (of simple lines) or the following code. This might suggest a wider bottleneck. Will keep digging.

NVidia GeForce 9500GT on Ubuntu 9.04

  
  
    rotationX += vy;
    rotationY += vx;

    glRotatef(rotationX, 1.0f, 0.0f, 0.0f);
    glRotatef(rotationY, 0.0f, 1.0f, 0.0f);

    glBegin(GL_TRIANGLE_FAN);

    // make disc flash dark or light occasionally.
    if (ofRandom(0.0, 1.0f) >= 0.1f) {
        glColor4f(r * alpha, g * alpha, b * alpha, 0.7f);
    }
    else {
        if (ofRandom(0.0, 1.0f) >= 0.5f) {
            glColor4f(1.0f, 1.0f, 1.0f, alpha);
        }
        else {
            glColor4f(0.4627f, 0.2941f, 0.0f, alpha);
        }
    }

    glVertex3f(0.0f, 0.0f, 0.0f);
    for (int angle = 0; angle <= 360; angle += 60) {
        glVertex3f(sin(angle*DEG_TO_RAD) * radius, cos(angle*DEG_TO_RAD) * radius, 0.0f);
    }

    glEnd();

    glPopMatrix();
  

P.S. Particle count is about 20,000.

hmm, i can’t see anything in there that would cause spiking.

however, here’s something else that might help a tiny bit with performance. direct calls to cos() and sin() are slow. instead, you can use cosf() and sinf(). or, better still, since your angle values are constant ( 0, 60, 120, 180, 240, 300, 360 ) you can make a pair of arrays and precalculate the cos and sin values for each angle.
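A minimal sketch of the precalculated tables, assuming the seven fixed fan angles from the loop above. `kFanPoints`, `fanSin`, `fanCos` and `initFanTables` are hypothetical names chosen for illustration:

```cpp
#include <cmath>

// The fan rim angles are fixed: 0, 60, ..., 360 degrees (7 points).
const int kFanPoints = 7;
float fanSin[kFanPoints];
float fanCos[kFanPoints];

// Fill the tables once at startup instead of calling sin/cos per particle.
void initFanTables() {
    const float degToRad = 3.14159265358979f / 180.0f;
    for (int i = 0; i < kFanPoints; ++i) {
        const float a = i * 60.0f * degToRad;
        fanSin[i] = std::sin(a);  // float overload, equivalent to sinf
        fanCos[i] = std::cos(a);  // float overload, equivalent to cosf
    }
}
```

The draw loop then becomes an index loop over the table, e.g. `glVertex3f(fanSin[i] * radius, fanCos[i] * radius, 0.0f);` for `i` from 0 to 6, with no trig calls per particle.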

chur
d

Some other things to try on Ubuntu: make sure Desktop settings are off, and perhaps try another driver (either older or more recent) just in case it’s driver related.

Some more recent NVidia drivers can be found here:
http://avenard.com/media/Ubuntu-Reposit-…-itory.html

damian: thanks for the tip. Have implemented. Still jittering, but it did reduce the CPU overhead slightly.

grimus: Have given a bunch of drivers a whirl. No joy.

Confession time… my diagnosis that the OpenGL calls were the cause appears to be wrong.

I set up a bunch of hi-res timers throughout the code and found that all areas of the code are slowing down. Really bizarre.

So I tried putting the start and end of the timer back to back, so that realistically only a minuscule number of operations take place between them.

  
    clock_gettime(CLOCK_REALTIME, &ts); // Works on Linux  
    hr_time_start = ts.tv_nsec;  
    clock_gettime(CLOCK_REALTIME, &ts); // Works on Linux  
    hr_time_end = ts.tv_nsec;  
    hr_time_diff = hr_time_end - hr_time_start;  
    if (ofSign(hr_time_diff) == 1) printf("Hi Res %ld\n", hr_time_diff);  
  

I found a really odd thing:

The timer gives about ~200 nanoseconds consistently between start and end. However, when the jitter occurs the time elapsed spikes to ~700 nanoseconds.
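One caveat with this kind of measurement: tv_nsec on its own wraps around every second, and CLOCK_REALTIME can be adjusted while the program runs. A sketch of a variant that uses both timespec fields and the POSIX CLOCK_MONOTONIC clock; `elapsedNs` and `timerOverheadNs` are hypothetical helper names:

```cpp
#include <time.h>

// Hypothetical helper: full elapsed time in nanoseconds between two timespecs.
// Using tv_sec as well as tv_nsec avoids a negative result across a second boundary.
long long elapsedNs(const timespec& start, const timespec& end) {
    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

// Measures the back-to-back overhead of the timer itself.
long long timerOverheadNs() {
    timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);  // monotonic: not affected by clock adjustments
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsedNs(t0, t1);
}
```

With this form there is no need to discard negative differences, so readings taken across a second boundary are kept too.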

I think this is now getting a bit off topic for the openframeworks forum so I’ll wrap it up here but keep investigating.

Thanks grimus and damian for your assistance.

ohhhh… you might be having a denormal issue. are you doing feedback anywhere? for example, an array buffer full of floats, where each frame you multiply it by 0.99 or something…

eg buffer[i] = buffer[i] * 0.99f + newValue;

if buffer[i] starts with non-0, and newValue is 0, then buffer[i] will get smaller and smaller, eventually getting into tiny, tiny values (denormals) which the CPU handles on a much slower path than normal floats. google ‘denormal bug’.
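One common workaround is to snap very small values to zero before they reach the denormal range. A minimal sketch of the feedback step above with such a cutoff; `decayStep` and the threshold `kTiny` are assumptions for illustration, and the right cutoff depends on the data:

```cpp
#include <cmath>

// Hypothetical feedback step: the same decay as buffer[i] * 0.99f + newValue,
// but results with a magnitude below a small threshold are set to exactly zero
// so they never decay into the denormal range.
float decayStep(float current, float newValue) {
    const float kTiny = 1e-15f;  // assumed cutoff, far above float denormals (~1e-38)
    float next = current * 0.99f + newValue;
    if (std::fabs(next) < kTiny) {
        next = 0.0f;
    }
    return next;
}
```

Used as `buffer[i] = decayStep(buffer[i], newValue);`, a value that is no longer being fed settles at exactly 0 after a few thousand frames instead of lingering as a denormal.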

also, you should be able to run oprofile in a way that will spit out OS overhead as distinct from your app overhead… and i’m sure there’ll be a way to access the individual reference frames snapped by the oprofile probe, which will allow you to hunt for the spikes. oprofile is amazing, it’s just very poorly documented and tricky to get your head round.

Thanks Damian,

Yes my code was full of these feedback loops. I usually cut them off when they reach certain thresholds, but there are a good number without.

I tried compiling it with the Intel Compiler with the DAZ and FTZ flags but got the same results.

In the end I ran out of time to further pursue it as the install went in on Thursday.

I reduced the size of the various particle systems, which in turn reduced the jitter to an acceptable level for the event.

Thanks for your help. It is good to be well aware of the denormal bug. I’m going to spend a bit more time with oprofile over the next week.