hand / arm / head / legs tracking

Hi

I’m working on a project where I want to detect hands, arms, legs and the head. Tracking the face/head will be done with the excellent opencv package. Now I’m searching for a way to track the hands and legs.

**I want to try the following solution:**
I’ll find the head using opencv and the hands using the algorithm/solution described here: http://forum.openframeworks.cc/t/fingertracking/846/0 (it uses an algorithm based on the pointy shapes of the fingers, and seems to be a solid, well-working solution).

Once I’ve got the hands and face, I can roughly construct the position of the body (not totally correct, but I want to test this).

Does someone have an idea/code for how I can detect the hands?

Roxlu

I did pose estimation for this installation
http://www.greyworld.org/#monument-to-t-…-artist-/v2

At first I tried comparing moments, horizontal & vertical projections and contour comparison, but none of them worked well with the noise of the scene. In the end I went for estimating the positions of heads, hands, feet and elbows based on a variety of methods. I then track these points over time, normalize them and compare them to a training set.

Here is a screenshot of some workings

Your first step should be to examine the extremities of the contour of a blob, then find peaks and troughs. For each blob, cycle through every contour point. Measure the distance from the blob centroid to the position of each point and store those distances in an array. This array is the graph you see in a small box just below the person in the top two images.

Then cycle through that list and find peaks: compare each point with the one before and the one after it. If the current point is higher than both the previous point and the next point, you have a peak. The index of that point in the list is then used to find the corresponding point in the contour list. These are shown as blue circles on the person in the top two images.
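
Roughly like this, as a minimal sketch (not my installation code; Pt and findExtremities are made-up names, and you would feed it the contour points and centroid you get from your blob tracker):

  
  
#include <vector>
#include <cmath>

struct Pt { float x, y; };

// returns indices into the contour where the distance to the centroid peaks
std::vector<int> findExtremities(const std::vector<Pt>& contour, const Pt& centroid) {
    // 1. distance from the blob centroid to every contour point (the "graph")
    std::vector<float> dist(contour.size());
    for (size_t i = 0; i < contour.size(); i++) {
        float dx = contour[i].x - centroid.x;
        float dy = contour[i].y - centroid.y;
        dist[i] = sqrtf(dx * dx + dy * dy);
    }

    // 2. a point is a peak if it is higher than both its neighbours
    std::vector<int> peaks;
    for (size_t i = 1; i + 1 < dist.size(); i++) {
        if (dist[i] > dist[i - 1] && dist[i] > dist[i + 1]) {
            peaks.push_back((int)i);
        }
    }
    return peaks;
}
  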

I’m doing a host of other things:
skeletonization to help me make points more accurate, although not really needed
http://www.inf.u-szeged.hu/~palagyi/skel/skel.html

copying the Eyesweb approach to estimating ‘man params’ (shown as the smaller crosses within the body)

Taking an orientation line (using moments) through the body up to the top of the contour to find the extremity nearest to the head. This helps if people have their hands above their head, since Eyesweb plots the head position as the highest point (not always correct).
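
For the orientation part, a minimal sketch of the moments maths (not my code; blobOrientation is just an illustrative name, using the plain OpenCV C functions that ofxOpenCv already links against):

  
  
#include "cv.h"     // the plain OpenCV C API pulled in by ofxOpenCv
#include <cmath>

// angle (in radians) of the blob's main axis, from its central image moments;
// fill the CvMoments with cvMoments() on the blob's mask or contour first
double blobOrientation(CvMoments* m) {
    double mu20 = cvGetCentralMoment(m, 2, 0);
    double mu02 = cvGetCentralMoment(m, 0, 2);
    double mu11 = cvGetCentralMoment(m, 1, 1);
    // standard principal-axis formula
    return 0.5 * atan2(2.0 * mu11, mu20 - mu02);
}
  

You then walk along that axis from the centroid up to the top of the contour and take the nearest extremity as the head.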


Hi Chris,

Thanks a lot for your reply! It’s indeed exactly what I need. I’ll try some of your suggestions!

**Is comparing 3 points to detect peaks enough?**

And you mention a training set… are you referring to a neural network?

Roxlu

For me 3 was enough, although I smoothed the data slightly by averaging the contour first, i.e. this point = (last point + this point + next point) / 3.
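
As a rough sketch, assuming the closed contour of points from the earlier sketch:

  
  
// sketch only: average each contour point with its neighbours before
// measuring the distances (the contour is closed, hence the wrap-around)
std::vector<Pt> smoothed(contour.size());
for (size_t i = 0; i < contour.size(); i++) {
    const Pt& prev = contour[(i + contour.size() - 1) % contour.size()];
    const Pt& next = contour[(i + 1) % contour.size()];
    smoothed[i].x = (prev.x + contour[i].x + next.x) / 3.0f;
    smoothed[i].y = (prev.y + contour[i].y + next.y) / 3.0f;
}
  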

Training was just to see if the same poses were made as those stored in my xml files.

WOW, haha I just understood what you made! What a perfect project! I’m flabbergasted!

What technique did you use to match the poses with your training set?

Would there be a chance I could get your source code?

Roxlu

So I have a set of points for a human body, normalise them down to fit within 0.0-1.0, then compare the normalised skeleton to those in my list and create a score. I have 16 poses in my records; each pose has any number of trained data sets, over multiple cameras. For each pose you can say which points of the body you want to compare, i.e. just head and hand, or feet and arms etc. If the score is over a threshold, it sends a trigger to the statue to do that pose.
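
In very rough terms it boils down to something like this (a sketch only, not the installation code; all names are made up):

  
  
#include <vector>
#include <cmath>
#include <cfloat>
#include <algorithm>

struct Pt { float x, y; };

// squeeze the tracked body points into a 0.0-1.0 box
std::vector<Pt> normalisePoints(const std::vector<Pt>& pts) {
    float minX = FLT_MAX, minY = FLT_MAX, maxX = -FLT_MAX, maxY = -FLT_MAX;
    for (size_t i = 0; i < pts.size(); i++) {
        minX = std::min(minX, pts[i].x);  maxX = std::max(maxX, pts[i].x);
        minY = std::min(minY, pts[i].y);  maxY = std::max(maxY, pts[i].y);
    }
    float w = std::max(maxX - minX, 0.000001f);
    float h = std::max(maxY - minY, 0.000001f);
    std::vector<Pt> out(pts.size());
    for (size_t i = 0; i < pts.size(); i++) {
        out[i].x = (pts[i].x - minX) / w;
        out[i].y = (pts[i].y - minY) / h;
    }
    return out;
}

// compare a live (normalised) skeleton against one trained pose;
// 'used' says which body points count for this pose (e.g. just head + hands)
float scorePose(const std::vector<Pt>& live, const std::vector<Pt>& trained,
                const std::vector<bool>& used) {
    float score = 0.0f;
    int count = 0;
    for (size_t i = 0; i < live.size() && i < trained.size(); i++) {
        if (!used[i]) continue;
        float dx = live[i].x - trained[i].x;
        float dy = live[i].y - trained[i].y;
        // closer points contribute more; sqrt(2) is the max distance in a unit box
        score += 1.0f - sqrtf(dx * dx + dy * dy) / 1.41421356f;
        count++;
    }
    return count > 0 ? score / count : 0.0f;   // 0..1, higher = better match
}
  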

The whole software took a very long time to make & I had to overcome many problems, like removing cars and background objects, coping with changes in light, and removing shadows.

Unfortunately I’m not going to release any of it, sorry. I’m happy to help on the forum & am contributing other parts of code back to the ofw community.

Start with the contour distances of points to the centre of the blob though, that should see you well for your needs.

Hi Chris,

I totally understand that you don’t want to share your code! No problem!
I’ll start with the contour thing you suggested.

Actually this is pretty much the same technique I used for detecting the hands of people in Funky Forest ( http://muonics.net/site-docs/work.php?id=41 )

It works really well! Blurring the contour as Chris suggested helps a lot too!

If you want the video to try against (from the installation) you can get it here:
http://muonics.net/cvMovies/multiPerson-…-ind-IR.mov

There are a bunch more here:
http://muonics.net/cvMovies/

[quote author=“chrisoshea”]I did pose estimation for this installation
http://www.greyworld.org/#monument-to-t-…-artist-/v2

At first I tried comparing moments, horizontal & vertical projections and contour comparison, but none of them worked well with the noise of the scene. In the end I went for estimating the positions of heads, hands, feet and elbows based on a variety of methods. I then track these points over time, normalize them and compare them to a training set.
(not always correct)
… snip…
[/quote]
chrisoshea, I was wondering, how did you manage to get those “perfect” blobs?
I’m trying to use background differencing with the ofxOpenCV addon, though I’m not totally happy with the results.

My code for bg differencing

  
  
#include "BlobberBackgroundDifference.h"  
BlobberBackgroundDifference::BlobberBackgroundDifference() {  
	m_refresh_background = true;  
	m_threshold = 40;  
}  
void BlobberBackgroundDifference::init(Blobber *pBlobber) {  
	m_gray_bg.allocate(pBlobber->getWidth(), pBlobber->getHeight());  
	m_gray_image.allocate(pBlobber->getWidth(), pBlobber->getHeight());  
	m_gray_diff.allocate(pBlobber->getWidth(), pBlobber->getHeight());  
}  
  
  
void BlobberBackgroundDifference::update(Blobber* pBlobber) {  
  
	m_gray_image = *pBlobber->getImage();  
  
	//test  
	//m_gray_image.threshold(30);  
	m_gray_image.dilate_3x3();  
	m_gray_image.erode_3x3();  
	//test  
  
	if (m_refresh_background) {  
		m_gray_bg = m_gray_image;  
		m_refresh_background = false;  
	}  
  
	m_gray_diff.absDiff(m_gray_bg, m_gray_image);  
	m_gray_diff.threshold(m_threshold);  
	m_gray_diff.blur();  
	pBlobber->setContourImage(&m_gray_diff);  
  
}  
  
void BlobberBackgroundDifference::draw(int iX, int iY) {  
	m_gray_diff.draw(iX, iY);  
}  
  
void BlobberBackgroundDifference::increaseThreshold() {  
	if (m_threshold == 100) {  
		return;  
	}  
	m_threshold++;  
	cout << m_threshold << "\n";  
}  
  
void BlobberBackgroundDifference::decreaseThreshold() {  
	if (m_threshold == 0) {  
		return;  
	}  
	m_threshold--;  
	cout << m_threshold << "\n";  
}  
void BlobberBackgroundDifference::updateBackground() {  
	m_refresh_background = true;  
}  
  

It looks to me like you are doing morphs on the live input image:
m_gray_image.dilate_3x3();
m_gray_image.erode_3x3();

Don’t do that, it will send your background difference crazy.

Try this…

  
  
void BlobberBackgroundDifference::update(Blobber* pBlobber) {  
  
   m_gray_image = *pBlobber->getImage();  
  
   if (m_refresh_background) {  
      m_gray_bg = m_gray_image;  
      m_refresh_background = false;  
   }  
  
   m_gray_diff.absDiff(m_gray_bg, m_gray_image);  
   m_gray_diff.threshold(m_threshold);  
  
   // morphs here  
   m_gray_diff.dilate_3x3();  
   m_gray_diff.erode_3x3();  
  
   m_gray_diff.blur();  
   pBlobber->setContourImage(&m_gray_diff);  
  
}   
  

What is the environment you are trying to track in? Stable lighting? Solid white background?

Hi Chris,

I’ve moved the morphs further down in the code, like this:

  
  
  
void BlobberBackgroundDifference::update(Blobber* pBlobber) {  
	m_gray_image = *pBlobber->getImage();  
  
	//test  
	//m_gray_image.threshold(30);  
  
	//test  
  
	if (m_refresh_background) {  
		m_gray_bg = m_gray_image;  
		/*  
		unsigned char* p = m_gray_bg.getPixels();  
		int size = pBlobber->getWidth() * pBlobber->getHeight();  
		for (int i = 0; i < size; i++) {  
			p[i] = 0xFF;  
		}  
		*/  
		m_refresh_background = false;  
	}  
  
	m_gray_diff.absDiff(m_gray_bg, m_gray_image);  
	m_gray_diff.threshold(m_threshold);  
  
	m_gray_diff.dilate_3x3();  
	m_gray_diff.erode_3x3();  
	m_gray_diff.erode_3x3();  
  
	m_gray_diff.blur();  
  
  
	pBlobber->setContourImage(&m_gray_diff);  
  
}  
  

It’s still not perfect but a lot better, thanks!
Maybe I need to do the differencing based on brightness? Or do you know a better solution?

Can you post some screenshots of what you are seeing?

Here you are:

Welcome to computer vision. It is very difficult to overcome obstacles like this for all environments. Segmenting the foreground from the background pixels in a clean way is hard, and researchers have been publishing papers about it for years.

There are two main problems with that image

  1. you have a global threshold of 40 over the whole image, but your hand is of similar brightness to the background.
  2. the shadows cast by your hand on the wall are being picked up in the foreground.

Number 2 is very difficult to solve, number 1 is a bit easier.

If you have a lower threshold you will see the hand, but you might pick up more noise.

There is nothing in openframeworks yet that will solve this for you straight away. However, there are a few things you could try using opencv functions.

Have a look on google for
adaptive threshold "background subtraction"
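
To give a flavour of one approach you’ll find under that search (purely an illustrative sketch, not code from this project): instead of one global threshold, keep a per-pixel background model so every pixel gets its own tolerance.

  
  
#include <vector>
#include <cmath>
#include <algorithm>

// illustrative per-pixel background model: running mean + running deviation
struct AdaptiveBackground {
    std::vector<float> mean;   // per-pixel background estimate
    std::vector<float> dev;    // per-pixel average deviation
    float learnRate;           // how fast the model adapts, e.g. 0.01
    float k;                   // sensitivity, e.g. 3.0

    void setup(int size, float learnRate_ = 0.01f, float k_ = 3.0f) {
        mean.assign(size, 0.0f);
        dev.assign(size, 10.0f);   // start with a loose tolerance
        learnRate = learnRate_;
        k = k_;
    }

    // seed the model with a clean background frame
    // (like the m_refresh_background grab in your code)
    void setBackground(const unsigned char* bg, int size) {
        for (int i = 0; i < size; i++) mean[i] = bg[i];
    }

    // curr: grayscale camera pixels, out: 0x00 = background, 0xFF = foreground
    void update(const unsigned char* curr, unsigned char* out, int size) {
        for (int i = 0; i < size; i++) {
            float d = fabsf(curr[i] - mean[i]);
            out[i] = (d > k * dev[i]) ? 0xFF : 0x00;
            if (out[i] == 0x00) {
                // only learn from pixels we believe are background
                mean[i] += learnRate * (curr[i] - mean[i]);
                dev[i]   = std::max(dev[i] + learnRate * (d - dev[i]), 2.0f);
            }
        }
    }
};
  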

Unfortunately there is no quick and easy answer. Maybe concentrate on other areas of your project though and get those working.

A nice approach for shadows and background subtraction is not to do an absolute difference between the background image and the foreground, but a straight subtraction - then you only care about pixels brighter than the background, not darker.
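
Something along these lines, as an untested sketch against the member images in the class posted above; only the pixel loop really matters, and darker-than-background pixels (shadows) simply never pass:

  
  
// needs #include <vector>
void BlobberBackgroundDifference::update(Blobber* pBlobber) {
    m_gray_image = *pBlobber->getImage();

    if (m_refresh_background) {
        m_gray_bg = m_gray_image;
        m_refresh_background = false;
    }

    int w = pBlobber->getWidth();
    int h = pBlobber->getHeight();
    unsigned char* curr = m_gray_image.getPixels();
    unsigned char* back = m_gray_bg.getPixels();
    std::vector<unsigned char> diff(w * h);

    for (int i = 0; i < w * h; i++) {
        // straight subtraction: only pixels brighter than the background count,
        // so shadows (darker than the background) never make it into the mask
        int d = (int)curr[i] - (int)back[i];
        diff[i] = (d > m_threshold) ? 0xFF : 0x00;
    }

    m_gray_diff.setFromPixels(&diff[0], w, h);
    m_gray_diff.dilate_3x3();
    m_gray_diff.erode_3x3();
    m_gray_diff.blur();
    pBlobber->setContourImage(&m_gray_diff);
}
  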

Theo, what do you mean by “straight subtraction”? Can you give an example? I was also thinking of using a brightness subtraction, as I want to use a white wall for the installation.

You can also try to turn off some of the color channels before doing the threshold. It seems like your hand is more blue, so maybe there will be more contrast in the blue channel.

hi all,

Thanks for all your replies! I’m doing a color difference now where certain channels are weighted more heavily than others.

**my code**

  
  
void BlobberBackgroundDifference::update(Blobber* pBlobber) {  
	// color differencing.  
	if (m_refresh_background) {  
		m_color_bg = *pBlobber->getImage();  
		m_refresh_background = false;  
	}  
  
	// Do a color difference.  
	m_color_fg = *pBlobber->getImage();  
  
	int size = pBlobber->getWidth() * pBlobber->getHeight();  
	int curr_r, curr_g, curr_b, back_r, back_g, back_b, diff_r, diff_g, diff_b, diff;  
	unsigned char* curr_pix = m_color_fg.getPixels();  
	unsigned char* back_pix = m_color_bg.getPixels();  
  
	for (int i = 0; i < size; i++) {  
		curr_r = *curr_pix++;  
		curr_g = *curr_pix++;  
		curr_b = *curr_pix++;  
		back_r = *back_pix++;  
		back_g = *back_pix++;  
		back_b = *back_pix++;  
		diff_r = abs(curr_r - back_r);  
		diff_g = abs(curr_g - back_g);  
		diff_b = abs(curr_b - back_b);  
  
		//diff = (diff_r * 0.5) + (diff_g * 0.59) + (diff_b * 0.11);  
		diff = (diff_r * 0.3) + (diff_g * 0.3) + (diff_b * 0.9);  
		//  0.30R + 0.59G + 0.11B  
  
  
		if (diff < m_threshold) {  
			m_color_diff[i] = 0x00;  
		}  
		else {  
			m_color_diff[i] = 0xFF;  
		}  
  
		//curr_pix += 3;  
	}  
	m_gray_diff.setFromPixels(m_color_diff, pBlobber->getWidth(), pBlobber->getHeight());  
	m_gray_diff.erode_3x3();  
	m_gray_diff.dilate_3x3();  
	m_gray_diff.blur();  
	pBlobber->setContourImage(&m_gray_diff);  
  
  
  
  
	//-------------------- gray scale method -----------  
	/* Gray image differencing  
	m_gray_image = *pBlobber->getImage();  
	if (m_refresh_background) {  
		m_gray_bg = m_gray_image;  
		m_refresh_background = false;  
	}  
  
	m_gray_diff.absDiff(m_gray_bg, m_gray_image);  
	m_gray_diff.threshold(m_threshold);  
  
	m_gray_diff.dilate_3x3();  
	m_gray_diff.erode_3x3();  
	m_gray_diff.erode_3x3();  
  
	m_gray_diff.blur();  
	pBlobber->setContourImage(&m_gray_diff);  
	*/  
  
}  
  

**result**

Is it improving?
I’m a noob in C++, so if I can do certain things better, please tell me.

And now I’m wondering: when you look at the image you can see two fingers which have very distinctive shapes. Is there a way to detect these shapes?

Roxlu

And now I’m wondering: when you look at the image you can see two fingers which have very distinctive shapes. Is there a way to detect these shapes?

I mentioned above how to do it.

  1. create a list storing the distance from the blob centre to every point in the contour points list.

  2. smooth out that list a little to remove noise

  3. find peaks in that list by comparing the points.

[quote author=“chrisoshea”]

And now I’m wondering: when you look at the image you can see two fingers which have very distinctive shapes. Is there a way to detect these shapes?

I mentioned above how to do it.

  1. create a list storing the distance from the blob centre to every point in the contour points list.

  2. smooth out that list a little to remove noise

  3. find peaks in that list by comparing the points.[/quote]

Ah, of course you did, sorry ;-). I’m thinking of implementing a neural network to detect gestures based on the points. I’m not sure if this will work, but I’ll keep you up to date.

Roxlu