GPUparticleSystemExample weird behaviors

Hi all,

I am trying to make a simple particle system with basic physics computed on the GPU. Each particle has 2 forces: a spring force to an anchor position and a repulsive force from the mouse position.
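
For reference, the two forces described above can be sketched on the CPU as follows. This is only a minimal illustration: the stiffness, radius, strength and damping values, as well as the names `springForce`, `mouseRepulsion` and `stepParticle`, are assumptions and not taken from the original OpenCL code.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Spring force pulling a particle back to its anchor (Hooke's law).
// springK is a hypothetical stiffness constant.
Vec2 springForce(Vec2 pos, Vec2 anchor, float springK) {
    return { springK * (anchor.x - pos.x), springK * (anchor.y - pos.y) };
}

// Repulsive force pushing a particle away from the mouse.
// It falls off linearly and vanishes beyond `radius` (a hypothetical choice).
Vec2 mouseRepulsion(Vec2 pos, Vec2 mouse, float radius, float strength) {
    assert(radius > 0.0f);
    float dx = pos.x - mouse.x;
    float dy = pos.y - mouse.y;
    float dist = std::sqrt(dx * dx + dy * dy);
    if (dist >= radius || dist == 0.0f) return { 0.0f, 0.0f };
    float scale = strength * (1.0f - dist / radius) / dist;
    return { dx * scale, dy * scale };
}

// One explicit Euler step with unit mass and simple velocity damping.
void stepParticle(Vec2& pos, Vec2& vel, Vec2 anchor, Vec2 mouse, float dt) {
    Vec2 fs = springForce(pos, anchor, 4.0f);
    Vec2 fm = mouseRepulsion(pos, mouse, 0.2f, 1.0f);
    vel.x = (vel.x + (fs.x + fm.x) * dt) * 0.98f;
    vel.y = (vel.y + (fs.y + fm.y) * dt) * 0.98f;
    pos.x += vel.x * dt;
    pos.y += vel.y * dt;
}
```

In a GPU version the same arithmetic would run once per texel in a fragment shader, with the anchor stored in a texture of its own.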

I managed to code that using the ofxMSAOpenCL add-on and it works just fine with up to 2 million particles on my iMac. For this project, however, I can’t use OpenCL as the app will run on a Mac mini.

I looked at the GPUparticleSystemExample, which seems to do what I need. It uses an FBO to store the position data, sends it to a fragment shader, and writes the result to another FBO. That FBO is then read in a vertex shader to set the new particle positions.
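
The read-from-one-buffer, write-to-the-other pattern can be sketched on the CPU like this. It is a stand-in for the FBO pair, not the example's actual class: `PingPong` and `updatePositions` are hypothetical names, and plain `std::vector<float>` buffers replace the textures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for a pair of FBOs: read from src, write to dst, then swap.
struct PingPong {
    std::vector<float> buffers[2];
    int flag = 0;  // index of the buffer currently used as the source

    explicit PingPong(std::size_t size) {
        buffers[0].assign(size, 0.0f);
        buffers[1].assign(size, 0.0f);
    }
    std::vector<float>& src() { return buffers[flag]; }
    std::vector<float>& dst() { return buffers[1 - flag]; }
    void swap() { flag = 1 - flag; }
};

// "Shader pass": every output element depends only on the source buffer.
void updatePositions(PingPong& pos, const std::vector<float>& vel, float dt) {
    assert(pos.src().size() == vel.size());
    for (std::size_t i = 0; i < vel.size(); ++i) {
        pos.dst()[i] = pos.src()[i] + vel[i] * dt;
    }
    pos.swap();  // the freshly written buffer becomes the next source
}
```

Keeping the source and destination separate matters because a shader cannot safely read from the texture it is rendering into.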

I tried to modify the example to match my needs, but weird things happened. Particles are not where they are supposed to be. Forces are applied correctly to some particles, but not to others. Let me explain with a simple example, starting from the unmodified GPUparticleSystemExample:

I set the velocity to 0.0 to have a static system.

  
// 2. Making arrays of float pixels with velocity information and then loading them into a texture
    float * vel = new float[numParticles*3];  
    for (int i = 0; i < numParticles; i++){  
        //vel[i*3 + 0] = ofRandom(-1.0,1.0);  
        //vel[i*3 + 1] = ofRandom(-1.0,1.0);  
        // set velocity to 0.0  
        vel[i*3 + 0] = 0.0;  
        vel[i*3 + 1] = 0.0;  
        vel[i*3 + 2] = 1.0;  
    }  

Then I put the points on a grid (reducing the number of particles to 200 for visibility). The particles don’t occupy the whole space!

  
numParticles = 200;  

  
 // 1. Making arrays of float pixels with position information  
    float * pos = new float[numParticles*3];  
      
    // add on a grid  
    float pX = 0.0, pY = 0.0;  
    float delta = 1.0/(float)textureRes;  
      
    for (int x = 0; x < textureRes; x++){  
        for (int y = 0; y < textureRes; y++){  
            int i = textureRes * y + x;  
              
            //pos[i*3 + 0] = ofRandom(1.0); //x*offset;  
            //pos[i*3 + 1] = ofRandom(1.0); //y*offset;  
            pos[i*3 + 0] = pX;  
            pos[i*3 + 1] = pY;  
            pos[i*3 + 2] = 0.0;  
            pY += delta;  
        }  
        pY = 0.0;  
        pX += delta;  
    }  
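
Independently of the rendering problem, it helps to size the array from the texture resolution rather than from the requested particle count, since 200 is not a perfect square. The sketch below assumes the resolution is the integer square root of the requested count (so 200 gives a 14 x 14 texture with 196 texels) and spreads the grid over the full 0..1 range. `makeGridPositions` is a hypothetical helper, not part of the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns RGB float triplets, one per texel,
// laid out row by row (texel index = textureRes * y + x).
std::vector<float> makeGridPositions(int requestedParticles, int& textureRes) {
    assert(requestedParticles >= 4);
    // Assumption: the texture is square, side = floor(sqrt(requested count)).
    textureRes = static_cast<int>(std::sqrt(static_cast<float>(requestedParticles)));
    std::vector<float> pos(static_cast<std::size_t>(textureRes) * textureRes * 3, 0.0f);

    // Dividing by (textureRes - 1) makes the last row/column land exactly on 1.0.
    float step = 1.0f / static_cast<float>(textureRes - 1);
    for (int y = 0; y < textureRes; ++y) {
        for (int x = 0; x < textureRes; ++x) {
            std::size_t i = static_cast<std::size_t>(textureRes) * y + x;
            pos[i * 3 + 0] = static_cast<float>(x) * step;
            pos[i * 3 + 1] = static_cast<float>(y) * step;
            pos[i * 3 + 2] = 0.0f;
        }
    }
    return pos;
}
```

The returned vector has exactly `textureRes * textureRes * 3` floats, which is what an RGB float upload of that texture size reads.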

If I manually set the position of one point afterwards, other points move as well!

  
pos[0] = 0.9;  
pos[1] = 0.9;  
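
On the CPU side, `pos[0]` and `pos[1]` are the x and y of the single particle stored in texel (0, 0), so the array itself changes for one particle only. A small sketch of that mapping, assuming the row-by-row layout used in the loop above (`texelOfParticle` and `setParticlePosition` are hypothetical helpers):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Texel { int x, y; };

// Texel that holds particle `index` when the array is uploaded row by row.
Texel texelOfParticle(int index, int textureRes) {
    assert(textureRes > 0 && index >= 0);
    return { index % textureRes, index / textureRes };
}

// Overwrite the x/y of exactly one particle; z and all other particles stay untouched.
void setParticlePosition(std::vector<float>& pos, int index, float px, float py) {
    assert(index >= 0 && static_cast<std::size_t>(index) * 3 + 2 < pos.size());
    pos[static_cast<std::size_t>(index) * 3 + 0] = px;
    pos[static_cast<std::size_t>(index) * 3 + 1] = py;
}
```

If several points move on screen after such a change, the data is being read by more than one vertex, which points at the lookup in the shader rather than at the array.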

I really don’t get what is going wrong. I am fairly new to OpenGL and shaders. Any help would be appreciated!

Thanks!

Hello @dantam,

This is strange for sure. Maybe it’s because of how oF wraps OpenGL textures. The workaround seems to be doubling the normalized delta: instead of 1.0/resolution, use 2.0/resolution.

  
  
  float delta = 2.0/(float)textureRes;  
  

Now the question we should ask ourselves is: why?

Hi Patricio,

Thanks for your reply. Unfortunately, changing the delta doesn’t solve the problem. First, the grid doesn’t fill the whole view. Second, when I modify pos[0] and pos[1] afterwards, I still get the same behavior: several points move, not just one.

I also looked at your example in ofxFX and I can reproduce the same problem. Could it be a problem with the projection matrix? In one of the references you provide, they do this:

http://www.seas.upenn.edu/~cis565/fbo.htm#setupgl4

  
  
// viewport transform for 1:1 pixel=texel=data mapping  
    glMatrixMode(GL_PROJECTION);  
    glLoadIdentity();  
    gluOrtho2D(0.0,texSize,0.0,texSize);  
    glMatrixMode(GL_MODELVIEW);  
    glLoadIdentity();  
    glViewport(0,0,texSize,texSize);  

I tried to apply this but then I get no points at all…
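
For what it's worth, the arithmetic behind that 1:1 setup can be checked on the CPU. The sketch below reproduces, for one axis, the linear mappings that `gluOrtho2D` and `glViewport` define. The function names are hypothetical, and it says nothing about why no points appear, which may depend on how oF sets up its own matrices and FBO binding.

```cpp
#include <cassert>
#include <cmath>

// What gluOrtho2D(left, right, ...) does along one axis:
// maps [left, right] linearly onto normalized device coordinates [-1, 1].
float orthoToNdc(float v, float left, float right) {
    assert(right != left);
    return 2.0f * (v - left) / (right - left) - 1.0f;
}

// What glViewport(origin, ..., size, ...) does along one axis:
// maps NDC [-1, 1] onto window coordinates [origin, origin + size].
float ndcToWindow(float ndc, float origin, float size) {
    return (ndc + 1.0f) * 0.5f * size + origin;
}

// With both set to the texture size, a data coordinate maps to the same pixel.
float dataToPixel(float v, float texSize) {
    return ndcToWindow(orthoToNdc(v, 0.0f, texSize), 0.0f, texSize);
}
```

With a projection of `[0, texSize]` and a viewport of the same size, a coordinate such as 3.5 lands on pixel position 3.5, which is the "pixel = texel = data" property the comment refers to.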

dantam,
This is indeed very strange. I tried your code, and it seems that somehow it is ignoring everything where x > textureRes/2 or y > textureRes/2.

That’s why you only get to see particles on 1/4 of the screen.

The even more bizarre part is that if you change the for loop to only set values for 1/4 of the particles, you still see a grid with the total number of particles!
For example, try this:

  
  
numParticles = 100;  
  

  
  
// 1. Making arrays of float pixels with position information  
    float * pos = new float[numParticles*3];  
    for(int i=0;i<numParticles*3;i++) pos[i] = 0.0; //initialize positions to (0,0)  
      
// should only set the positions of 1/4 of the total numPars  
    for (int x = 0; x < textureRes/2; x++){  
        for (int y = 0; y < textureRes/2; y++){  
            int i = textureRes * y + x;  
              
            pos[i*3 + 0] = (textureRes-x)/(float)textureRes + ofRandomf()*0.01;  
              
            pos[i*3 + 1] = (textureRes-y)/(float)textureRes + ofRandomf()*0.01;  
        }  
    }  
  

The for loop only goes up to textureRes/2 and should only set positions for 25 particles in total, yet when you run it, 100 particles appear in the lower right corner, and none at (0,0), where the remaining ones should be since they were all initialized there.

I’ve been digging into the source, but everything looks like it should be working correctly. I don’t understand why this is happening: it’s ignoring some data and making up the rest?

I am wondering if it has something to do with the format when we load the pos data into the texture.

  
  
// Load this information into the FBO's texture
posPingPong.allocate(textureRes, textureRes,GL_RGB32F);  
posPingPong.src->getTextureReference().loadData(pos, textureRes, textureRes, GL_RGB);  
posPingPong.dst->getTextureReference().loadData(pos, textureRes, textureRes, GL_RGB);  

Since pos is float, shouldn’t it be GL_RGB32F or GL_RGB32F_ARB instead of GL_RGB?
If I change that all the points seem to be at (or around) (0,0). Still investigating…

I thought that initially too, but in fact GL_RGB and GL_FLOAT are correct, one defines the number of color components and the other defines the data type.
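
To make the distinction concrete: in the underlying `glTexImage2D` call, the internal format (here `GL_RGB32F`) describes how the GPU stores the texture, while the format (`GL_RGB`) and type (`GL_FLOAT`) describe the client-side array. The sketch below only computes how many floats such an upload reads, assuming tightly packed data. `floatsRead` and `arrayCoversUpload` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Number of floats a tightly packed float upload reads from the client array.
std::size_t floatsRead(int width, int height, int componentsPerPixel) {
    assert(width > 0 && height > 0 && componentsPerPixel > 0);
    return static_cast<std::size_t>(width) * height * componentsPerPixel;
}

// True when an array of `arrayLength` floats is large enough for an RGB upload
// of a square textureRes x textureRes texture.
bool arrayCoversUpload(std::size_t arrayLength, int textureRes) {
    const int rgbComponents = 3;  // GL_RGB: three components per texel
    return arrayLength >= floatsRead(textureRes, textureRes, rgbComponents);
}
```

A check like `arrayCoversUpload(numParticles * 3, textureRes)` before calling `loadData` would catch a mismatch between the particle count and the texture size.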

I think it has to do with the rendering shader, actually. If you look at the render.vert code:

  
  
    // Moves the position of each vertex (that are from -1.0 to 1.0)   
    // to the right one on the texture (that are from 0.0 to 1.0)  
      
    verPos.x = abs(verPos.x * 0.5 + resolution * 0.5);  
    verPos.y = abs(verPos.y * 0.5 + resolution * 0.5);  
  

I believe these two lines are causing the strange behavior. The inclusion of abs() in there explains why it seems to ignore some positions and not others. I find it’s easier to use texture coordinates instead of the vertex positions.
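
The lookup arithmetic can be evaluated on the CPU to see which texels it reaches. The sketch below makes several assumptions: vertices sit at integer coordinates 0 to textureRes - 1 (as in the `glVertex2d(x,y)` loop), `floor` stands in for the texel fetch, and the runtime value of the `resolution` uniform is left as a parameter because the thread does not confirm it. `lookupCoord` and `sampledTexels` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <set>

// The lookup arithmetic from render.vert, evaluated on the CPU for one axis.
float lookupCoord(float vertexCoord, float resolution) {
    return std::fabs(vertexCoord * 0.5f + resolution * 0.5f);
}

// Which texel columns get read when vertices sit at 0, 1, ..., textureRes - 1?
std::set<int> sampledTexels(int textureRes, float resolution) {
    assert(textureRes > 0);
    std::set<int> texels;
    for (int v = 0; v < textureRes; ++v) {
        float coord = lookupCoord(static_cast<float>(v), resolution);
        texels.insert(static_cast<int>(std::floor(coord)));
    }
    return texels;
}
```

Under these assumptions the 0.5 scale alone means that only about half of the texels per axis are ever read, whatever the offset is. That is a quarter of the texture, with each texel shared by four points, which would be consistent with 25 stored positions showing up as 100 particles.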

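So in testApp::update() we can modify the loop that draws the points and add a glTexCoord2d() call before each glVertex2d():

glBegin( GL_POINTS );
    for(int x = 0; x < textureRes; x++){
        for(int y = 0; y < textureRes; y++){
            // set the texture coordinate first: glVertex2d() emits the point
            // with whatever texture coordinate is current at that moment
            glTexCoord2d(x, y);
            glVertex2d(x,y);
        }
    }
glEnd();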
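
A note on call order in immediate mode: `glTexCoord*` only updates the current texture coordinate, and `glVertex*` emits a vertex that captures whatever is current at that moment. The following CPU model of that state machine is a simplified sketch (`ImmediateMode` and `emitGrid` are hypothetical names) showing what each point receives for both orderings.

```cpp
#include <cassert>
#include <vector>

struct Point { int vx, vy; int tx, ty; };

// Minimal model of immediate-mode state: texCoord() only updates the current
// value, vertex() emits a point that captures whatever is current.
struct ImmediateMode {
    int curTx = 0, curTy = 0;
    std::vector<Point> emitted;
    void texCoord(int x, int y) { curTx = x; curTy = y; }
    void vertex(int x, int y) { emitted.push_back({x, y, curTx, curTy}); }
};

// Emit a textureRes x textureRes grid of points with either call order.
std::vector<Point> emitGrid(int textureRes, bool texCoordFirst) {
    assert(textureRes > 0);
    ImmediateMode gl;
    for (int x = 0; x < textureRes; ++x) {
        for (int y = 0; y < textureRes; ++y) {
            if (texCoordFirst) {
                gl.texCoord(x, y);
                gl.vertex(x, y);
            } else {
                gl.vertex(x, y);   // captures the previous point's coordinate
                gl.texCoord(x, y);
            }
        }
    }
    return gl.emitted;
}
```

When the texture coordinate is set after the vertex, every point looks up the previous point's texel, the first point uses whatever coordinate an earlier draw call left behind, and the last coordinate is never used.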
  

Now in the shader we don’t have to do any arithmetic on the texture coordinates; they can be used directly:

  
  
    vec2 verPos = (gl_MultiTexCoord0.xy);  
  

Now it should draw again, but if you put some velocities back in, you will see that it still has some strange effects: it looks like the particles are interacting or bouncing, yet by default this functionality is not supposed to exist. The problem is that the GPU is using linear texture filtering, so adjacent particles “bleed” velocity/position data into one another.
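
The bleeding effect is easy to reproduce on the CPU with a one-dimensional "texture". This is a simplified sketch of the two filter modes, assuming coordinates in texel units with texel centers at i + 0.5 and clamping at the edges. `sampleNearest` and `sampleLinear` are hypothetical names, not GL calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

static int clampIndex(int i, int n) { return std::max(0, std::min(i, n - 1)); }

// Nearest-neighbour lookup: u is in texel units, texel i covers [i, i + 1).
float sampleNearest(const std::vector<float>& tex, float u) {
    assert(!tex.empty());
    int n = static_cast<int>(tex.size());
    return tex[clampIndex(static_cast<int>(std::floor(u)), n)];
}

// Linear lookup: blends the two texels whose centers (i + 0.5) surround u.
float sampleLinear(const std::vector<float>& tex, float u) {
    assert(!tex.empty());
    int n = static_cast<int>(tex.size());
    float x = u - 0.5f;
    int i0 = static_cast<int>(std::floor(x));
    float f = x - static_cast<float>(i0);
    float a = tex[clampIndex(i0, n)];
    float b = tex[clampIndex(i0 + 1, n)];
    return a * (1.0f - f) + b * f;
}
```

With a texture holding the values 0 and 10, a lookup at coordinate 1.0 returns 10 with nearest filtering but 5 with linear filtering, so a particle would read a mix of its own data and its neighbour's.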

I fixed it by adding this line to the pingPongBuffer allocate routine:
FBOs[i].getTextureReference().setTextureMinMagFilter(GL_NEAREST, GL_NEAREST);
This forces the GPU to use nearest-neighbor filtering, which is a must for GPGPU applications like this one, where each texel holds the data of exactly one particle.

  
  
void allocate( int _width, int _height, int _internalformat){  
        // Allocate  
        for(int i = 0; i < 2; i++){  
            FBOs[i].allocate(_width,_height, _internalformat );  
            FBOs[i].getTextureReference().setTextureMinMagFilter(GL_NEAREST, GL_NEAREST); // <---- New Line  
        }  
          
        // Clean  
        clear();  
          
        // Set everything to 0  
        flag = 0;  
        swap();  
        flag = 0;  
    }  
  

I hope this helps out! It seems like it should solve your problem, but let me know if there’s anything else. I’ll submit a bug report for this, as it is obviously preventing anyone from modifying this example in useful ways.

Found another issue. With my fixes described above, everything works except that the last particle overlaps another particle. See the attached screenshot, where the particle in the bottom right is missing and the one in the bottom left is brighter than all the rest.

If I comment out the entire position update block, then we get a perfect grid, so something about the position update is causing this overlap. I’m looking into it.

![](http://forum.openframeworks.cc/uploads/default/2608/Screen Shot 2012-11-02 at 2.40.57 PM copy.png)

Ok fixed that issue too.

I’m not sure why, but if I replace the code in the velocity update that draws the quad:

  
  
// Just a frame to put pixels on  
    ofSetColor(255,255,255,255);  
    glBegin(GL_QUADS);  
    glTexCoord2f(0, 0); glVertex3f(0, 0, 0);  
    glTexCoord2f(textureRes, 0); glVertex3f(textureRes, 0, 0);  
    glTexCoord2f(textureRes, textureRes); glVertex3f( textureRes, textureRes, 0);  
    glTexCoord2f(0, textureRes);  glVertex3f(0, textureRes, 0);  
    glEnd();  
  

with

  
  
velPingPong.src->draw(0, 0);  
  

then the problem is solved: no overlapping particles. It must be something like a rounding error in converting the texture coordinates? I have no idea.

Wow Tim,

Thanks for that. I would also appreciate it if you could take a look at ofxFX and check whether I was doing things right there, and whether there is a way to improve it.

Best

P

Patricio,

No problem. If there’s anything in particular you want me to look at, just email me. I’d love to help with ofxFX in any way, it’s such a great addon!

Thanks for your help Tim! So far it seems to be working perfectly!

Tim S:

Yes, the GPUParticlesSystemExample is based on ofxFlocking. It’s a simplified version of it, meant to show how to use fragment shaders and textures to process information other than images. Can you take a look at ofxFlocking? It could be optimized a LOT following your tips and discoveries. I’d really appreciate it.
Also, if you think of new modules for it, feel free to add them.

OK, I’m sure the modifications I’ve made here will translate directly to ofxFlocking. I’ll take a look at it in the next few days.