Advice with multithreading

Hi there, I’ve just started playing with multithreading and need some advice. I’ve had a look at the sample and posts etc. and get the basic idea of it… but need some help applying it how I want.

my basic update function is as follows:

1. fluid->update();  
2. particleSystem1->update();  // reads and writes data to fluid  
3. particleSystem2->update();  // reads and writes data to fluid  
4. for(int i=0; i<NUM_CAMERAS; i++) motionTracker[i].update();  // independent  
5. for(int i=0; i<NUM_CAMERAS; i++) motionTracker[i].addForceToFluid(); // writes to fluid  

I would like to make this multithreaded so the first 4 steps all run simultaneously. What I was thinking is that before step 1 I make a copy of the vector field that is needed in steps 2 and 3 (or alternate between two pointers). While the fluid is updating, the particle systems run in their own threads, reading data from the copy (from the last frame) and writing to their own temp buffers. Also, for each camera there is a motionTracker instance which runs in its own thread, performing optical flow and various other analysis and writing to its own buffer.
Once all 4 steps are finished, the main thread writes all the data from the particle systems and cameras into the fluid and we move on to the draw function.
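A minimal sketch of that plan, using plain std::thread rather than ofxThread. The names Field, particleWorker and updateFrame are made up for illustration, and the "particle system" is just a scaled copy of the field. The point is only the shape: snapshot, run workers that read the snapshot and write to their own buffers, join, then merge on the main thread.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the fluid's vector field: one float per cell.
using Field = std::vector<float>;

// A worker only reads the shared snapshot and only writes to its own buffer,
// so no locking is needed while it runs.
void particleWorker(const Field& snapshot, Field& forcesOut, float scale) {
    for (std::size_t i = 0; i < snapshot.size(); ++i)
        forcesOut[i] = snapshot[i] * scale;
}

// One frame: copy the field, run the workers in parallel, wait for them,
// then merge their results on the calling (main) thread.
void updateFrame(Field& fluid) {
    const Field snapshot = fluid;            // last frame's data, read-only from here on
    Field forces1(fluid.size(), 0.0f);       // private output of worker 1
    Field forces2(fluid.size(), 0.0f);       // private output of worker 2

    std::thread t1(particleWorker, std::cref(snapshot), std::ref(forces1), 0.5f);
    std::thread t2(particleWorker, std::cref(snapshot), std::ref(forces2), 0.25f);
    // The fluid solver itself could run here while the workers are busy.
    t1.join();
    t2.join();

    // Single-threaded again: safe to write everything back into the fluid.
    for (std::size_t i = 0; i < fluid.size(); ++i)
        fluid[i] += forces1[i] + forces2[i];
}
```

Because nothing is shared for writing while the threads run, the only synchronization needed is the two join() calls.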

So I think I don’t want my threads to loop automatically (so I don’t put a while in the threadedFunction - that’s easy enough :stuck_out_tongue:).

Just as an update:

For my fluid class I’ve extended ofxThread, put the body of each of its functions inside an if(lock()) { } block, and added this:

	bool bHasRunThisFrame;  
	void start(){	  
		bHasRunThisFrame = false;  
		startThread(true, true);   // blocking, verbose  
	void threadedFunction() {  
		while( isThreadRunning() != 0  && bHasRunThisFrame != true ) {  
			if( lock() ){  
				bHasRunThisFrame = true;  


and my main update function is now:

1. fluid->start();     // run the update function once in another thread?  
2. particleSystem1->update();  // removed all references to fluid   
3. // particleSystem2->update();  // got rid of this for now  
4. for(int i=0; i<NUM_CAMERAS; i++) motionTracker[i].update();  // independent   
5. for(int i=0; i<NUM_CAMERAS; i++) motionTracker[i].addForceToFluid(); // writes to fluid  

And the app is crashing at startup :stuck_out_tongue:
Would really appreciate some ideas as to what I’m doing wrong!

Here’s what my console has to say (interesting how it starts with a draw loop! I noticed that when I removed the multithreading stuff as well!):

(It’s entering the draw loop before the fluid update is finished. Fluid->draw() has an if(lock()) { } in it; is that part of the problem?)

[Session started at 2008-06-20 13:54:59 +0100.]
Starting DRAW loop
drawing Balls
ending DRAW loop
Starting UPDATE loop
starting BALLS update
ending BALLS update
ofxThread: waiting till mutext is unlocked
starting MotionTracker 0 update
ofxThread: we are in – mutext is now locked
starting FLUID update
ending MotionTracker 0 update
Starting DRAW loop
ofxThread: waiting till mutext is unlocked

[Session started at 2008-06-20 13:55:02 +0100.]
Loading program into debugger

GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Tue Oct 2 04:07:49 UTC 2007)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB. Type “show warranty” for details.
This GDB was configured as “i386-apple-darwin”.Program loaded.
sharedlibrary apply-load-rules all
Attaching to program: `/Users/memo/Documents/MAIN/Lab/OpenFrameworks/of_preRelease_v0.05_xcode_FAT/apps/addonsExamples/_fluid3/’, process 1093.
[Switching to process 1093 thread 0x6303]
[Switching to process 1093 thread 0x6303]

P.S. I got the latest updates to ofxThread


Hey memo,

When the debugger (GDB) kicks in it will usually show you where it has crashed, as long as you have your build profile set to Debug (rather than Release).

Press Apple-Shift-Y in Xcode (or choose Run->Debugger) and it should show you the calls leading to the crash, something like:

0 - someFunction or assembly stuff (usually not too easy to decipher)
1 - yourFunction that did the bad thing
2 - testApp update
3 - main()

Usually if you click on the second or third one down it will show you where in your code your error is.

Hope that helps!

Hi Theo, thanks for that, I wasn’t aware of it. Unfortunately now though I don’t get a crash… the app just hangs. I’ve identified that if I have if(lock()) { … unlock(); } around the whole of my fluid::addForce() function the app hangs… if I remove it, it works! This really confuses me, as I thought we had to protect the data inside these if(lock()) blocks so that it would wait till the other thread released the data before going in (I’ve got startThread(true)) - how can this work!?

any chance you could post a zip of the project?
threads can be a bit of a handful and I could debug it quicker if I could look through all the code.


Hi Theo, thanks for your help. the zip is here

At the moment it is set up to run single-threaded. At the top of fluidsystem.h is a define FLUID_MULTITHREAD; enable that to see it working in its own thread - the results are very different and wrong.

In the FluidSystem::addForce methods, I’ve commented out the if(lock()) lines. If I put them back in, the app hangs. With them commented out it runs, but behaves very weirdly, as the main thread is calling the addForce methods in the middle of the fluid update, which is not good.

Also in this version, the fluid thread is running completely on its own, not synced to the main loop. Ideally i would like the fluid update to run once per main update as per my first post… but not sure how to go about it!

Hi Memo,

The problem was you were locking and then trying to lock again within that same function. The outer function had called lock then had called the function addForceAtPixel which also called lock - the outer function could not free up the lock because addForceAtPixel was blocking till unlock was called.

Basically you had a lock() inside another lock() on the same mutex, which will never unlock - see below where the locks are happening.

// add force to normalized x, y coordinates  
void FluidSystem::addForceAtNorm(float x, float y, float dx, float dy, float generateMult, float velocityMult) {  
	printf("addForceAtNorm waiting \n");  
	if(lock()) {  //<------ ++++ your first lock here +++++  
		printf("addForceAtNorm inside \n");  
		int i = (int) (x * _NX + 1);  
		int j = (int) (y * _NY + 1);  
		// careful: this early return also skips the unlock() below  
		if(i<0 || i>_NX+1 || j<0 || j>_NY+1) return;  
		//+++ your second lock is inside here ++++  
		addForceAtPixel(i, j, dx, dy, generateMult, velocityMult);  
		unlock();  
	}  
	printf("addForceAtNorm done \n");  
}  

By commenting out the lock/unlock in addForceAtPixel, it then ran fine, with about double the fps (approx. 15 on my laptop).

I imagine with some tweaks to timing and the structure of the code you could get it running a lot faster!

btw that code looks super nice - the fluid looks great!

good luck!

Hi Theo, thanks. Yeah, it’s going to run across 6 projectors on a 20m projection so I need to optimize it a bit :stuck_out_tongue:

That double lock / unlock thing does look quite bad, I call addForceAtPixel() as well from other classes, in fact both can be called, so they both kind of need it, but I guess the inner function (addForceAtPixel) needs to check whether there is already a lock or not before applying its own lock…
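One common way around the "both functions need the lock" problem is to keep the locking in the public functions and move the real work into a private helper that assumes the lock is already held. The sketch below uses std::mutex instead of ofxThread's lock()/unlock(), and FluidSketch and addForceAtPixelUnlocked are hypothetical names rather than anything from the actual project.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

class FluidSketch {
public:
    FluidSketch(int nx, int ny)
        : nx_(nx), ny_(ny), cells_(static_cast<std::size_t>(nx) * ny, 0.0f) {}

    // Public entry point for pixel coordinates: locks once, then delegates.
    bool addForceAtPixel(int i, int j, float amount) {
        std::lock_guard<std::mutex> guard(mutex_);
        return addForceAtPixelUnlocked(i, j, amount);
    }

    // Public entry point for normalized coordinates in [0, 1).
    // The guard is released on every return path, including early returns.
    bool addForceAtNorm(float x, float y, float amount) {
        std::lock_guard<std::mutex> guard(mutex_);
        int i = static_cast<int>(x * nx_);
        int j = static_cast<int>(y * ny_);
        return addForceAtPixelUnlocked(i, j, amount);   // no second lock here
    }

    float valueAt(int i, int j) {
        std::lock_guard<std::mutex> guard(mutex_);
        return cells_[j * nx_ + i];
    }

private:
    // Caller must already hold mutex_. Returns false if (i, j) is out of range.
    bool addForceAtPixelUnlocked(int i, int j, float amount) {
        if (i < 0 || i >= nx_ || j < 0 || j >= ny_) return false;
        cells_[j * nx_ + i] += amount;
        return true;
    }

    int nx_;
    int ny_;
    std::vector<float> cells_;
    std::mutex mutex_;
};
```

Another option is std::recursive_mutex, which lets the same thread lock it more than once, but the helper approach keeps each call down to a single lock.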

But in the meantime, I got rid of the outer function (addForceAtNorm) and don’t use it at all; I just kept the lock on the pixel function. With that one lock on the pixel function it runs at about 3 seconds per frame (on an 8-core Mac Pro)! If I remove the lock it runs fast but completely wrong (the fluid uses the pixel function, not the norm one). The lock needs to be there so the cameras don’t add their velocity to the fluid until the update has finished… but I don’t understand what’s going wrong!

when I have the lock in the addForceAtPixel() (and removed the addForceAtNorm call altogether), my console is just full of pages of:
ofxThread: we are out – mutext is now unlocked
ofxThread: waiting till mutext is unlocked
ofxThread: we are in – mutext is now locked
ofxThread: we are out – mutext is now unlocked
ofxThread: waiting till mutext is unlocked
ofxThread: we are in – mutext is now locked
dunno if thats any help…

Sorry to appear to be spamming, but there are so many things going wrong that it’s quite frustrating!

I’ve got the fluid working in its own thread pretty damn fast now!

It turns out the problem was that I was doing a lock in the addForceAtPixel() function, which was being called in a massive for loop. I thought that when I called startThread with blocking set to true, the main thread would pause and wait when it encountered the first lock() call, i.e. the first one of the loop. It turns out that’s not what happens: it was going through the for loop and waiting on the lock for each iteration or something. So I moved the for loop into a function in the fluid class and put one lock around it, and now my app runs at 60fps!!! sweet!!!
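To illustrate the difference between locking per element and locking once around the whole loop, here is a rough sketch with std::mutex. LockedField, Force and the lockCount() counter are invented for the example; the counter is only there to make the number of lock acquisitions visible.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical force sample: which cell to touch and by how much.
struct Force {
    std::size_t index;
    float amount;
};

class LockedField {
public:
    explicit LockedField(std::size_t n) : cells_(n, 0.0f) {}

    // Per-element version: every call pays for its own lock and unlock,
    // and may have to wait behind another thread each time.
    void addOne(const Force& f) {
        std::lock_guard<std::mutex> guard(mutex_);
        ++lockCount_;
        cells_[f.index] += f.amount;
    }

    // Batched version: take the lock once, apply the whole list, release.
    void addMany(const std::vector<Force>& forces) {
        std::lock_guard<std::mutex> guard(mutex_);
        ++lockCount_;
        for (const Force& f : forces) cells_[f.index] += f.amount;
    }

    float valueAt(std::size_t i) {
        std::lock_guard<std::mutex> guard(mutex_);
        return cells_[i];
    }

    // How many times the mutex has been taken by addOne/addMany.
    int lockCount() {
        std::lock_guard<std::mutex> guard(mutex_);
        return lockCount_;
    }

private:
    std::vector<float> cells_;
    std::mutex mutex_;
    int lockCount_ = 0;
};
```

Calling addMany() with a thousand forces takes the mutex once, whereas calling addOne() in a loop takes it a thousand times.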

But my new problem is, I get a crash when I try to make the motion tracker a thread. The stack is:

glGetIntegerv(GL_UNPACK_ALIGNMENT, &prevAlignment);

so I guess it’s the OpenGL multithreading issue. Is there no way to capture video in a separate thread then?

(I’m sure I’ll have more questions soon :S sorry)

Nice! I was thinking that the lock looked like it was happening in the wrong place. I always think it’s best to minimize the amount of code inside a lock: just do memcpys etc. I try to keep it to one or two lines of code with no function calls, and use the thread to alter local data. Your situation might be different though.
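As a rough illustration of keeping the locked section tiny: the worker below does all of its heavy work on a local buffer and only holds the mutex for a swap. This is a sketch with std::mutex and std::thread, not ofxThread, and SharedResult and runWorkerOnce are made-up names.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class SharedResult {
public:
    // Runs on the worker thread. The loop stands in for expensive analysis
    // and touches only local data, so it needs no lock.
    void compute(std::size_t n) {
        std::vector<float> local(n);
        for (std::size_t i = 0; i < n; ++i) local[i] = static_cast<float>(i) * 0.5f;

        std::lock_guard<std::mutex> guard(mutex_);
        result_.swap(local);   // the only work done while holding the lock
    }

    // Runs on the main thread: copies the latest result out under the lock.
    std::vector<float> snapshot() {
        std::lock_guard<std::mutex> guard(mutex_);
        return result_;
    }

private:
    std::mutex mutex_;
    std::vector<float> result_;
};

// Runs compute() once on its own thread and waits for it to finish.
void runWorkerOnce(SharedResult& shared, std::size_t n) {
    std::thread worker(&SharedResult::compute, &shared, n);
    worker.join();
}
```

The main thread can only ever be held up for the duration of a swap or a copy, never for the analysis itself.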

can you try setUseTexture(false) with the video grabber after you set it up? that way as you grab there will be no opengl stuff happening (normally, on a new frame of video we update the internal texture). I could imagine that messing up in a thread. in order to draw the video you’ll have to get those pixels back and into a texture in the main app/GL thread.

good luck!!

Thanks Zach, the setUseTexture(false) didn’t occur to me - it is a good solution as I don’t need to render the videoGrabber directly to screen, i just need the pixels. At the moment I hack-solved the situation by looping through all the cameras and capturing a frame in the main thread, then starting its own thread to do the analysis if a new frame was captured.

for(int i=0; i<NUM_CAMERAS; i++)  
   if( motionTracker[i].capture() ) motionTracker[i].start();  

I also found that for my purposes, I preferred to have the threads not looping endlessly with a sleep in their threadedFunction, but instead run them once each frame - that way I know exactly when they are running etc. i.e. at the beginning of the update function I know that none of them are running for sure, because the update function can’t finish before all the threads finish.

Actually to be sure of that, I had to create dummy functions which just did if(lock()) unlock(); to stop the flow of the main thread and put them towards the end of the update function - maybe this isn’t the best way to do it, but it definitely solved some issues I had:

void testApp::update() {  
    .... // memcpy all data I need to temp buffers  
    fluid->start();  // start fluid solver in its own thread  
    particles->start();  // update particles in its own thread  
    for(all cameras) motiontracker->start();  // analyse all cameras, each in their own thread  
    fluid->waitForFinish();  // don't carry on until fluid thread has finished  
    particles->waitForFinish();  // and particles  
    for(all motiontrackers) motiontracker->waitForFinish(); // and motion tracking  
    .... // now combine all the data from the threads and get ready for drawing and next frame  
}  
So that way I know all the code outside of the starts() and waitForFinish() will all be single threaded and I don’t need to worry about conflicts :S I don’t know if this is the best way to do it… the disadvantage is the fluid is running at the same fps as the app when it doesn’t actually have to, so i need to manage that manually in its update function - and I don’t know if there is any overhead in the actual pthread_create() function - but I can’t complain, quite happy now (with still a few hairs left on my head!) :stuck_out_tongue:

the setUseTexture(false) worked for the vidGrabber, much nicer solution thanks!

As an update to anyone who might be reading this thread later on… the above method is NOT the way to do it. After a while the threads just stop being created. Not sure why, the total number of threads for the process never goes above 8-10, indicating that the old threads are being destroyed - but still doesn’t work. So I explored two options, one involving a flag determining whether the thread is allowed to run or not, and the other option involves posix thread condition variables. The latter is the much more robust option apparently, but is mac/unix only and I still have some sync issues I need to sort out - so postponing that for a few weeks till after my current project.
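For anyone curious what the condition variable option can look like, here is a sketch using std::condition_variable from the C++11 standard library, which is portable, rather than raw pthreads. FrameWorker and its members are hypothetical names, and the "work" is just a counter increment. It is meant as a sketch of the wake-up / wait-for-finish handshake, not as the code I ended up with.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// A long-lived worker that runs exactly one pass per wakeUp() call.
class FrameWorker {
public:
    FrameWorker() : thread_(&FrameWorker::run, this) {}

    ~FrameWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        cv_.notify_all();
        thread_.join();
    }

    // Main thread: request one pass of work.
    void wakeUp() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_ = true;
            done_ = false;
        }
        cv_.notify_all();
    }

    // Main thread: sleep until the requested pass has completed.
    void waitForFinish() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_; });
    }

    int getCounter() {
        std::lock_guard<std::mutex> lock(mutex_);
        return counter_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            cv_.wait(lock, [this] { return pending_ || quit_; });
            if (quit_) return;
            pending_ = false;
            lock.unlock();
            // Heavy per-frame work would go here, without holding the mutex.
            lock.lock();
            ++counter_;
            done_ = true;
            cv_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool pending_ = false;
    bool done_ = true;
    bool quit_ = false;
    int counter_ = 0;
    std::thread thread_;   // declared last so everything above exists before run() starts
};
```

The thread is created once and then sleeps inside wait() between frames, so there is no per-frame thread creation and no busy loop.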

So if anyone wants to do something similar (run a few threads in parallel at the beginning of the update loop, and make the main thread wait before running into the draw() function to sync up buffers etc.) the flag method code is below. This works stable as a rock so far (left it running over day doing some pretty heavy calculations on each thread and the counters stayed perfectly in sync)

not sure if I need that sleep in there, but I kept it just in case…

void App::setup() {  
	for(int i=0; i<NUM_THREADS; i++) thread[i].create(i);  
}  

void App::update(){  
	printf("\n[ **** %i : ", ofGetFrameNum());  
	for(int i=0; i<NUM_THREADS; i++) thread[i].wakeUp();  
	...// do stuff for the main thread that doesn't involve the other threads   
	for(int i=0; i<NUM_THREADS; i++) thread[i].waitForFinish();  
	...// now it's safe to read and write to all the threads to exchange data  
	printf(" %i ", ofGetFrameNum());  
	for(int i=0; i<NUM_THREADS; i++) printf(" %i ", thread[i].getCounter());  
	printf(" **** ]");  
}  

class MSAThread : public ofxThread {  
	    int counter;   
		int index;  
		bool bOkToRun;  
		~MSAThread() {  
		void create(int i = 0) {  
			counter = 0;  
			index = i;  
			bOkToRun = false;  
            startThread(true, false);   // blocking, not verbose  
		void threadedFunction(){  
			while(threadRunning) {  
				if(bOkToRun) {  
					bOkToRun = false;  
		void wakeUp() {  
			bOkToRun = true;  
		void waitForFinish() {  
		int getCounter() { return counter; }  
		void logCounter() { printf("%i ", counter); }		  
		virtual void update() {  
			unsigned int n = random();  
			float f = 0;  
			for(int i=0; i < n; i++) f = cos(sin(i));  
			printf(" { %i:%i } ", index, counter);  

i was re-reading this thread again. i’m working on an application where i want to put the cv stuff running on a separate thread. since i want everything to be synced i was trying out this example here, but it doesn’t seem to work…
for some reason the waitForFinish() method doesn’t seem to be working, not sure why…

i’m using the MSAThread class like you posted it here, the only thing i changed is that now it prints “threaded update finished”. For this example i’m creating a single MSAThread called “theMSAThread”
my code is as follows:

void testApp::setup(){  
void testApp::update(){  
        //just do some stuff here...  
	for(int i=0; i<random(); i++){  
		float a = sqrt(random());  
	printf("main counter: %i , thread counter: %i \n", ofGetFrameNum(), thread.getCounter());  

and after a while the console reads:

threaded update finished
main counter: 88 , thread counter: 84
threaded update finished
threaded update finished
main counter: 89 , thread counter: 86
main counter: 90 , thread counter: 86
threaded update finished
threaded update finished
main counter: 91 , thread counter: 88
threaded update finished
main counter: 92 , thread counter: 89
main counter: 93 , thread counter: 89
threaded update finished
main counter: 94 , thread counter: 90
threaded update finished
main counter: 95 , thread counter: 91
threaded update finished
main counter: 96 , thread counter: 92
threaded update finished
main counter: 97 , thread counter: 93
and so on…

it really seems that the main thread is not waiting for the other thread to finish even though it is set to “block”…
maybe you or anyone else knows whats going on?
i tried this in OF 006 and OF 005



ok, so i still dont know why the waitForFinish() method doesn’t seem to halt the main thread until the lock is successful, but changing that method to the code below seems to work:

void waitForFinish() {  
		bool bRunning = true;  
				bRunning = false;  

but it seems a little excessive having a while loop with successive locks and unlocks… shouldn’t a single call to lock() be enough to halt the main thread until the threadedFunction is finished? i have blocking set to true…
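On whether a single lock() should be enough: a mutex only stops two threads from being inside the protected section at the same time. If the main thread reaches lock() before the worker has noticed the wake-up flag and taken the mutex, lock() succeeds straight away and the main thread carries on without the pass having run. That is one likely explanation for the counters drifting apart in the log above, though I have not verified it against ofxThread itself. A sketch of the flag method that waits on the flag instead of the mutex is below. It uses std::atomic and std::thread rather than ofxThread, and FlagWorker is a made-up name.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class FlagWorker {
public:
    FlagWorker() : thread_(&FlagWorker::run, this) {}

    ~FlagWorker() {
        running_ = false;
        thread_.join();
    }

    // Main thread: allow the worker to run one pass.
    void wakeUp() { okToRun_ = true; }

    // Main thread: wait until the worker has cleared the flag. Because the
    // flag is only cleared after the pass is done, this cannot return early,
    // even if the worker has not yet noticed the request.
    void waitForFinish() {
        while (okToRun_) std::this_thread::sleep_for(std::chrono::microseconds(50));
    }

    int getCounter() const { return counter_; }

private:
    void run() {
        while (running_) {
            if (okToRun_) {
                ++counter_;          // stand-in for update()
                okToRun_ = false;    // cleared only after the work is finished
            } else {
                std::this_thread::sleep_for(std::chrono::microseconds(50));
            }
        }
    }

    std::atomic<bool> running_{true};
    std::atomic<bool> okToRun_{false};
    std::atomic<int> counter_{0};
    std::thread thread_;   // declared last so the flags exist before run() starts
};
```

The flags are std::atomic so that reads and writes from two threads are well defined. A plain bool shared between threads without any synchronization is a data race in C++.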



Could somebody explain to me how the waitForFinish() method could ensure that the main thread stops? I can’t figure out how this should work. :roll: