ofxKinect: differentiating between people and environment

Hi everyone.

I’m trying to figure out how to detect and single out a person as an object separate from their environment (a very simple environment, just a big room). I figured the only way to do this would be to use an Xbox Kinect’s depth perception to detect things within a certain depth range, thus differentiating a body from the walls behind it.

So my questions:

1. Would I be able to play with a specific depth range without losing the rest of the environment? I mean, I don’t want to mask the background visually; I just want to limit whatever effects I apply to the targeted depth range. I have yet to see an example that proves this is possible…

2. Should I be thinking about this in terms of skeleton tracking instead of depth? I figured skeleton tracking would only be useful for movement-based communication, so I am using ofxKinect rather than ofxOpenNI (ofxKinect doesn’t seem to do skeleton tracking).

3. Would you go about doing this differently? If you know of a simpler execution, please let me know.

4. Any examples / tutorials you know of dealing with Kinect depth and masking or object detection that you think should be linked here?

So that you understand where I’m coming from:

I’m new to ofx, Xcode, and C++, but moderately skilled with Processing and some other languages, so I get the general concepts of programming.

I’ve been playing with the Kinect for the past several months through Processing and Unity plugins, so I guess I’m still pretty noobish in this department too.

Any tips or pointers greatly appreciated.

what you are trying to do is super simple:

  1. you store a depth frame from the kinect when there is no user in the frame.

  2. in normal operation you then compare the live depth frames to the stored one from step 1. if a pixel in the current image is significantly closer to the camera than in the stored frame, it is part of the foreground (the assumption being that anything in the foreground is a person). i’ve done this before and it works very well. see the rough sketch below.
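a rough sketch of those two steps (untested, and the names storedDepth, storeBackground() and isForeground() are just made up; storedDepth would be e.g. a vector<float> member of testApp):

// step 1: call this once while there is no one in the room
void testApp::storeBackground() {
	storedDepth.resize(640 * 480);
	for(int y = 0; y < 480; y++) {
		for(int x = 0; x < 640; x++) {
			storedDepth[x + y * 640] = kinect.getDistanceAt(x, y);
		}
	}
}

// step 2: check any pixel of the live frame against the stored one
bool testApp::isForeground(int x, int y) {
	float minDiff = 100; // how much closer a pixel must be to count as foreground - tune this
	float d = kinect.getDistanceAt(x, y);
	// d == 0 means the kinect has no reading there, so skip those
	return d > 0 && (storedDepth[x + y * 640] - d) > minDiff;
}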

an even simpler version, which works only if the background is far away, is to just set a depth threshold, i.e. anything closer than x is a person. super simple. you get a silhouette. works really well.

Thanks for your response dr.mo. I’m trying to do the easy route, just setting a depth threshold between where people would be located and where the backdrop would be, but I can’t seem to figure out how to target a depth range. I’m aiming to do this with the point cloud. Would I instead have to target a specific depth range by creating a threshold based on color blob detection, such as is done in the openCV example, and then somehow translate that captured threshold to the point cloud? Because the point cloud seems to only be based on a w x h mesh. Here’s my relevant code (basically part of Theo’s ofxKinect example, which I’ve been trying to sort of reverse engineer):

void testApp::drawPointCloud() {
	int w = 640;  //int w = 1280;
	int h = 480;  //int h = 960;
	ofMesh mesh;
	mesh.setMode(OF_PRIMITIVE_POINTS);
	int step = 1; //initially 2; step = distance between points
	for(int y = 0; y < h; y += step) {
		for(int x = 0; x < w; x += step) {
			if(kinect.getDistanceAt(x, y) > 0) {
				mesh.addColor(kinect.getColorAt(x, y));
				mesh.addVertex(kinect.getWorldCoordinateAt(x, y));
			}
		}
	}

	// The following lines are my pathetic failed attempt at applying a sound input variable (scaledVol)
	// to the point cloud's mesh of points (glPointSize) within a specific range (arbitrary for now)

	for(int y = 0; y < h; y += step) {
		for(int x = 0; x < w; x += step) {
			if(kinect.getDistanceAt(x, y) < nearThreshold - 50.0f && kinect.getDistanceAt(x, y) > farThreshold + 100.0f) {
				glPointSize(scaledVol * 500.0f); // initially 3; 2; changes size of each point
			} else {
				glPointSize(2);
			}
		}
	}


So to clarify, I’m trying to manipulate individual points in the point cloud within a specific depth range. I can’t figure out what terminology to plug in, or what method to use. Any pointers greatly appreciated.

i think you’ve got the right idea, but perhaps you are thinking about it in too complicated a way. i can’t stress enough how simple the concept is. there is no point cloud or mesh; those are display techniques that you can choose to use. the data from the kinect is a 2D grid of depth values (i.e. an image) that you can simply threshold. see the code below for an example.

//NB: this code is just an example - i did NOT test this...
for(int y = 0; y < h; y += step)
{
	for(int x = 0; x < w; x += step)
	{
		if(kinect.getDistanceAt(x, y) < nearThreshold)
		{
			//foreground
		}
		else
		{
			//background
		}
	}
}
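to actually see the silhouette you could write the result into a grayscale image, something like this (again untested; mask is a hypothetical ofImage member, allocated once as 640x480 OF_IMAGE_GRAYSCALE):

for(int y = 0; y < h; y++)
{
	for(int x = 0; x < w; x++)
	{
		float d = kinect.getDistanceAt(x, y);
		// white = foreground (person), black = background; d == 0 means no depth reading
		mask.getPixels()[x + y * w] = (d > 0 && d < nearThreshold) ? 255 : 0;
	}
}
mask.update(); // re-upload the changed pixels to the texture before drawing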
    
  

I did the same thing in a recent project, starting from the ofxKinect example. There’s probably a more efficient way, but this works. I created an array of floats with 320x240 = 76800 positions (the ofxKinect example loops through the 640x480 pixels at half resolution) and a function to be called when you want to store the background depth (I call it when the key 0 is pressed, for instance):

void testApp::storeBackgroundArray(){
	int w = 640;
	int h = 480;
	int step = 2;
	for(int y = 0; y < h; y += step) {
		for(int x = 0; x < w; x += step) {
			// one entry per sampled pixel: index the 320x240 grid as (x/step) + (y/step) * (w/step)
			backgroundDistance[(x / step) + (y / step) * (w / step)] = kinect.getDistanceAt(x, y);
		}
	}
}
  

Then inside the function drawPointCloud() the conditional checks whether the current depth value differs from the stored background by more than the established threshold:

	for(int y = 0; y < h; y += step) {
		for(int x = 0; x < w; x += step) {
			// same indexing as in storeBackgroundArray(); fabsf() because the distances are floats
			if(fabsf(backgroundDistance[(x / step) + (y / step) * (w / step)] - kinect.getDistanceAt(x, y)) > thresholdBackground) {
				mesh.addColor(0);
				mesh.addVertex(kinect.getWorldCoordinateAt(x, y));
			}
		}
	}
  

I hope it helps…

What is “thresholdBackground”? Where did you initialize that? Also, assuming backgroundDistance is the array, where do you initialize the array so that it applies to both parts of the code?

there is an addon specifically for ofxKinect & background segmentation:
https://github.com/jwbowler/ofxKinectSegmentation

“This is a set of methods for segmenting a RGB+depth image, such as from a Kinect: that is, separating the foreground (people standing in front of Kinect) from the background (the objects and walls of the room behind them).”

Haven’t used it before but I plan to.

Hello
@Kj1
I tried this addon and noticed an error in it, and I often get errors at runtime even when I use it in the simplest way possible. Did it work for you?

nope, still haven’t used it, and not planning to anymore…