People tracking / optical flow

I’m working on a project to track people outdoors, in what I see as an uncontrolled environment. Here is some test footage that I’m working with. I have done some initial tests and have some OK results here. I want to figure out the best way to do this. I have been looking at optical flow videos on YouTube and think that’s the best approach. Check out these videos:

Any thoughts would be awesome, thanks!


Is this the same project you mentioned at OF Lab? The one Golan said you should use LIDAR for? :slight_smile: The “best way” to do it really depends on the application: what do you want to happen?

If you’re trying to do something similar to the YouTube videos, where people move things around on a screen, the optical flow should definitely be enough. I’m guessing the rotation and translation of the objects can be found by summing the rotational component of each optical flow vector about the object’s center, and by summing the vectors directly, respectively.
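To make that summing concrete, here’s a minimal sketch in plain C++ (no oF/OpenCV dependency; the `FlowVec` struct and function names are my own, just for illustration): translation as the mean of the flow vectors, rotation as the mean torque of each vector about the object’s center.

```cpp
#include <vector>
#include <cmath>

struct Vec2 { float x, y; };

// One optical flow sample: position p and displacement v.
struct FlowVec { Vec2 p, v; };

// Translation: the mean of all flow vectors.
Vec2 translation(const std::vector<FlowVec>& flow) {
    Vec2 sum = {0, 0};
    for (const auto& f : flow) { sum.x += f.v.x; sum.y += f.v.y; }
    if (!flow.empty()) { sum.x /= flow.size(); sum.y /= flow.size(); }
    return sum;
}

// Rotation: the 2D cross product of (p - c) with v, normalised by |p - c|^2,
// averaged over all samples, approximates angular velocity in radians/frame.
float rotation(const std::vector<FlowVec>& flow, Vec2 c) {
    float sum = 0; int n = 0;
    for (const auto& f : flow) {
        float rx = f.p.x - c.x, ry = f.p.y - c.y;
        float r2 = rx * rx + ry * ry;
        if (r2 < 1e-6f) continue;          // skip samples at the center
        sum += (rx * f.v.y - ry * f.v.x) / r2;
        ++n;
    }
    return n ? sum / n : 0;
}
```

Feeding it four tangential vectors around the origin gives back the pure rotation and a near-zero translation, which is the sanity check you’d want before trusting it on real flow data.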

If you’re worried about objects intersecting with people, keep in mind that optical flow doesn’t know the foreground from the background. Since the camera is still, you could model the background and subtract it, propagating strong flow vectors to weaker foreground vectors.
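Since the camera is still, one common way to model the background is a per-pixel running average. This is just a sketch of that idea in plain C++ (the class and parameter names are invented for illustration, not any OpenCV API):

```cpp
#include <vector>
#include <cmath>
#include <cstdint>

// Running-average background model: each pixel's background estimate
// drifts slowly toward the current frame; a pixel counts as foreground
// when it differs from the estimate by more than `thresh`.
class BackgroundModel {
public:
    BackgroundModel(size_t numPixels, float learnRate = 0.05f, int thresh = 25)
        : bg(numPixels, 0.0f), rate(learnRate), thresh(thresh), first(true) {}

    // Returns a 0/255 foreground mask for an 8-bit grayscale frame.
    std::vector<uint8_t> update(const std::vector<uint8_t>& frame) {
        std::vector<uint8_t> mask(frame.size(), 0);
        for (size_t i = 0; i < frame.size(); ++i) {
            if (first) bg[i] = frame[i];                   // seed on first frame
            else       bg[i] += rate * (frame[i] - bg[i]); // slow drift
            mask[i] = std::fabs(frame[i] - bg[i]) > thresh ? 255 : 0;
        }
        first = false;
        return mask;
    }
private:
    std::vector<float> bg;
    float rate;
    int thresh;
    bool first;
};
```

A low learning rate keeps slow lighting changes out of the foreground while still flagging people who move; you’d tune `learnRate` and `thresh` to the scene.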

Also: sweet demo of the GUI addon :slight_smile:

Edit: Andy just released a related demo.


This is exactly what I did for the Audience installation

Here is a screenshot…-de3b-o.jpg

I used optical flow to track specific points over time, rather than overall vector movement. When a target blob was selected, the software picks interesting features within that blob, then tracks them per frame. If those points go outside the blob they are removed; if there aren’t very many points within a blob, it adds more.

Crossover of blobs is a problem and points will jump, so I work out, for each point, which blob it belongs to; the blob that has the most points remains the target, and points in other blobs are removed. This has mostly worked fine, although I’m sure it could be improved.
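The bookkeeping described above might look roughly like this (a sketch in plain C++; `Point`, `Rect`, and the function names are mine, and a real version would use oF/OpenCV blob types):

```cpp
#include <vector>
#include <cstddef>

struct Point { float x, y; };
struct Rect {
    float x, y, w, h;
    bool contains(Point p) const {
        return p.x >= x && p.x < x + w && p.y >= y && p.y < y + h;
    }
};

// Drop tracked points that have left the target blob.
std::vector<Point> prunePoints(const std::vector<Point>& pts, const Rect& blob) {
    std::vector<Point> kept;
    for (const auto& p : pts)
        if (blob.contains(p)) kept.push_back(p);
    return kept;
}

// When blobs overlap, keep as target the blob that owns the most points.
int pickTargetBlob(const std::vector<Point>& pts, const std::vector<Rect>& blobs) {
    int best = -1; size_t bestCount = 0;
    for (size_t b = 0; b < blobs.size(); ++b) {
        size_t count = 0;
        for (const auto& p : pts)
            if (blobs[b].contains(p)) ++count;
        if (count > bestCount) { bestCount = count; best = (int)b; }
    }
    return best;   // -1 when no blob contains any point
}
```

Each frame you’d run `pickTargetBlob` after blob detection, then `prunePoints` against the winner, topping the point set back up with fresh features when it gets too small.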

Can’t share any code at the moment, but happy to answer questions.

I have a hand-rolled optical flow example that turns a movie into a vector field.


Ah, I have just seen your test footage. My code above was about tracking the position of people using an overhead camera. For your purposes, the optical flow vectors that everyone else mentions will work great.

Hey Todd, how’s it going? Here are a few random thoughts/ideas:

  • I think an overall average motion/vector won’t work too well because you have a lot of random movement in the background - and also small movements (e.g. just moving the hand) tend not to contribute to the average motion much.

  • I’m not sure how good blobs are going to come out in that scene, but maybe you could do a face detect, then take the height of the face, and go half of that distance to the left and right (people are on average as tall as they are wide with their arms outstretched) and use that to create a mask - i.e. the area that the person covers. Then look at the blobs only in that area.

  • In addition (or instead) you could do an optical flow (not averaged, but as a vector field), but only look at the velocities within the mask created above. So you don’t pick up the people walking around in the background, but you do get the velocity when the person moves his hand across the frame.

  • Probably the best solution: if you had the budget for a Bumblebee (stereo camera), you would most definitely get a blob of the people standing in the window separated from the background. I hear the data is quite noisy, but you probably don’t need pixel precision anyway, and it should be good enough for your purposes.

and then you can do an optical flow within that.

P.S. Chris, the Audience installation looks really fun. Where is it at?

I’ve been trying to do something similar, and I’m having big problems getting reliable data from optical flow. For example, using the average movement vector is really hard: moving your hand one way and then stopping actually causes a peak in the opposite direction, because the act of stopping tends to make your hand recoil slightly the other way. (Hard to explain - you’d have to try it to see what I mean.) Memo’s idea of using the Bumblebee would be a good one, although I hear they are about $1000 :s

Here is a graph of the average vectors for me moving my hand one way, stopping, and then moving it the other way. Notice the smaller and shorter inverse peaks immediately after the bigger ones.

Perhaps some kind of filtering might help this, but I still think you’d have the problem of movement in the background, like others have said.

It could be possible to do background subtraction and then blob detection on the resulting image, and ignore smaller blobs (cars).
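The blob-size filtering part of that might look like this sketch: a small flood-fill connected-components pass over the foreground mask that keeps only blobs above a minimum area (plain C++; `Blob`, `findBlobs`, and `minArea` are names I made up, and in practice you’d use the OpenCV contour finder instead):

```cpp
#include <vector>
#include <algorithm>
#include <cstdint>

struct Blob { int minX, minY, maxX, maxY, area; };

// Label 4-connected components in a 0/255 mask; return bounding boxes
// of blobs whose area is at least minArea (dropping small blobs, e.g. cars).
std::vector<Blob> findBlobs(const std::vector<uint8_t>& mask,
                            int w, int h, int minArea) {
    std::vector<uint8_t> seen(mask.size(), 0);
    std::vector<Blob> blobs;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int idx = y * w + x;
            if (!mask[idx] || seen[idx]) continue;
            Blob b{x, y, x, y, 0};
            std::vector<int> stack{idx};
            seen[idx] = 1;
            while (!stack.empty()) {                 // iterative flood fill
                int i = stack.back(); stack.pop_back();
                int px = i % w, py = i / w;
                b.area++;
                b.minX = std::min(b.minX, px); b.maxX = std::max(b.maxX, px);
                b.minY = std::min(b.minY, py); b.maxY = std::max(b.maxY, py);
                const int nx[4] = {px - 1, px + 1, px, px};
                const int ny[4] = {py, py, py - 1, py + 1};
                for (int k = 0; k < 4; ++k) {
                    if (nx[k] < 0 || nx[k] >= w || ny[k] < 0 || ny[k] >= h) continue;
                    int j = ny[k] * w + nx[k];
                    if (mask[j] && !seen[j]) { seen[j] = 1; stack.push_back(j); }
                }
            }
            if (b.area >= minArea) blobs.push_back(b);
        }
    return blobs;
}
```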

EDIT: How exactly do you intend to use the data? i.e. is it very important to you that you know exactly where the people are?

EDIT 2: Perhaps it would be possible to use some of the other cascade files with the Haar feature tracker (face detection). AFAIK there are files that detect upper bodies and full bodies?

Hey todd!

Perhaps you can filter out some of the background movement just by thresholding the optical flow: the movement in the foreground will surely be bigger than that in the background, so simply discarding values below a threshold gives you a really simple filter.

There’s some code here that makes it easy to use the Lucas-Kanade algorithm in OpenCV from oF:

Just add a threshold property to the class; to see the effect, add this to the draw method:

if(dx > movementThreshold || dy > movementThreshold) {

Don’t know if that will be enough if you need something very precise; if not, the OpenCV blob tracking library in cvvidsurv (not the old one) has some really advanced background subtraction classes for this kind of environment.

For my installation I use a combination of frame differencing, blob tracking, face tracking, and a motion field, plus a ‘skin tone tracker’ like this:

// skin tone tracker
col_1 = trackImg;
hue_pixels = col_1.getPixels();
for(int i = 0; i < trackWidth * trackHeight; i++) {
    int h = hue_pixels[i*3];
    int s = hue_pixels[i*3 + 1];
    int v = hue_pixels[i*3 + 2];
    if((h <= hue_tolerance/2 || h >= 180 - hue_tolerance/2)
       && s >= min_s && s <= max_s && v >= min_v && v <= max_v) {
        hue_filtered_pixels[i] = 255;
    } else {
        hue_filtered_pixels[i] = 0;
    }
}
hueFiltered.setFromPixels(hue_filtered_pixels, trackWidth, trackHeight);

which is very effective when you’re only interested in heads and head tracking.

greetings ascorbin

You might also want to look into something called Kalman filtering. I’ve never had the opportunity to use it myself, but I’ve wanted to for a while, because the results are amazing… the math setup seems complex though.


The cvvidsurv tracking module in OpenCV works with 4 stages: background subtraction, blob detection, blob tracking, and filtering. Among the filters in that last stage there’s Kalman filtering.

I’ve been told about Kalman filtering before, but I looked into it and found the maths too intimidating without having read loads more on the subject. I will definitely take a look at the Kalman filtering segment of the cvsurv module though - thanks Arturo.

Just remembered: the name of that module in OpenCV is not cvsurv but cvvidsurv, for CV video surveillance. I’m editing the previous posts, but just in case.

Thanks, I wondered why I couldn’t find it :stuck_out_tongue: