touch table with kinect

i am doing kind of a touch table with kinect tracking at the moment.
there’s a table with objects lying on it. the objects are fixed to the table, and touching one of the objects triggers an event (no dragging or other stuff needed, only touching of the objects).

i thought i would do hand tracking with ofxOpenNI and define invisible boxes for each object. if the hand coordinates lie within one of these boxes the event is triggered. the kinect “watches” the people standing in front of the table.

one question is how i can code the placing of the trigger boxes so that my setup is as flexible as possible. meaning: if i re-setup the installation and the kinect is not in the exact position it was the last time, i want it to be as easy as possible to re-align the trigger boxes.
i was thinking about using a calibration point (for example a corner of the table) and calculating the positions of the trigger boxes in relation to the calibration point. this way i wouldn’t have to change the positions of every trigger box but only the calibration point.

is this more complicated than i am thinking? do i need to calculate a plane for the table surface or something like that? or do you have any ideas how to do this in a less complicated way?

are the objects fixed to the table?
if not:
What about doing some background subtraction?
use OpenCV for this, as it will be much easier.

  • take a snapshot of the depth image of the table without objects.
  • place the objects and take another one.
  • subtract both and you’ll get an image with just the objects in it.
  • run blob detection or contour finding on the latter image. Now you have the centroid of each blob or contour; take an average midpoint for its depth, calculate its bounding box, and you’re ready.
    Instead of taking just one image you might want to take, for instance, 100 images and then average them. It will give you a much “cleaner” image.

if your objects are fixed, then your approach is almost fine. just setting one point won’t give you any rotation info about the shift the setup might suffer. take the four corners of the table, or any four known points, and compute their homography. That will give you the transform matrix of the table relative to the camera.
apply this matrix to your invisible boxes and you’re ready.
you can even automate this process so it occurs, for instance, when you turn on the system or every day when the table is not being used.
take a look at the ofxCv addon, as it comes with some examples that will be useful in both cases.

Good luck!

the objects are definitely fixed.

at the moment i’m trying to map the table coordinates to the screen:

ofPoint dst[4]; // screen coordinates  
dst[0] = ofPoint(0,0,0);  
dst[1] = ofPoint(ofGetWidth(),0,0);  
dst[2] = ofPoint(ofGetWidth(),ofGetHeight(),0);  
dst[3] = ofPoint(0,ofGetHeight(),0);  
//using arturo’s findHomography.h  
ofMatrix4x4 matrix = findHomography(p, dst);  // p is the array with the table corners  
ofVec3f hnd = ofVec3f(user_l_hand.x,user_l_hand.y,user_l_hand.z); //hand coordinates  
hnd = hnd * matrix;  
ofSetColor(0, 255, 255);  
ofCircle(hnd.x, hnd.y, 10);  

but it does nothing … i can’t figure out how to do this properly

Hello Panopticon,

I adapted CCV (Community Core Vision) for the Kinect sensor for much the same reasons you have.
The software is named KinectCoreVision. You can find more info about it at:

The code is on GitHub, and there you can see I’m doing a hand-tracking algorithm… but in the end I’m only interested in the positions of the fingers. You can use the same structure to track objects by taking note of the angles between the corners of the objects.

Here’s a video of the final setup over an LCD TV

I hope you find it useful

try patricio’s app. it might help you a lot.

you are implementing the homography incorrectly.
the homography has to do with what the camera sees.
what you need to do is provide the findHomography function with the following arrays of coordinates:

  • the coordinates of the corners of your table in the image grabbed by the camera. these coordinates are 2d, as you get them from an image.
  • the 2d coordinates of the table seen from another point of view. in this case the idea is to provide the coordinates of the table’s corners as if it were being seen from above, perpendicular to the table surface. if you provide the screen’s corners, then it’s assumed that the table has the same proportions as the screen; the proportions of your table are what’s important.

The matrix you will get is the one you need to apply to the (virtual) camera to move it from seeing the table in the first position to seeing it in the second one.

So, you have to set your invisible boxes’ coordinates relative to the perpendicular view of the table, and then apply the homography matrix to them so they are seen placed in the same position relative to the table that was set.

I guess I’m not being very clear because I’m somewhat sleepy.

//a few hours’ sleep later. i figured out it might be clearer with an example

try out the following, it uses the ofxGLWarper addon I did.
I attached a screenshot and the drawing I printed just to have a visual reference.


#ifndef _TEST_APP  
#define _TEST_APP  

#include "ofMain.h"  
#include "ofxGLWarper.h"  

#define NUM_BOXES 3  

class box {  
public:  
	void setup(int _x, int _y, int _z, int _s){  
		x = _x;  
		y = _y;  
		z = _z;  
		s = _s;  
	}  
	int x, y, z, s;  
	void draw(){  
		ofBox(x+(s/2.0f), y+(s/2.0f), z+(s/2.0f), s);  
	}  
	void drawBottom(){  
		ofRect(x, y, s, s);  
	}  
};  

class testApp : public ofSimpleApp{  
public:  
	void setup();  
	void update();  
	void draw();  

	void keyPressed(int key);  
	void keyReleased(int key);  
	void mouseMoved(int x, int y);  
	void mouseDragged(int x, int y, int button);  
	void mousePressed(int x, int y, int button);  
	void mouseReleased();  

	void takeSnapshot();  

	ofxGLWarper warper;  
	ofImage img;  
	bool showSnapshot;  
	ofVideoGrabber 		vidGrabber;  
	int 				camWidth;  
	int 				camHeight;  
	bool drawCuadrado;  
	int frameSize[2];  
	box boxes[NUM_BOXES];  
};  
#endif  


#include "testApp.h"  
#include "stdio.h"  

void testApp::setup(){  
	ofSetFrameRate(60); //we run at 60 fps!  
	camWidth 		= 640;	// try to grab at this size.  
	camHeight 		= 480;  
	vidGrabber.initGrabber(camWidth, camHeight);  
	frameSize[0] = camWidth;  
	frameSize[1] = camHeight;  
	warper.setup(frameSize[0], frameSize[1]); //initializes ofxGLWarper  
	warper.activate();// this allows ofxGLWarper to automatically listen to the mouse and keyboard input and update its matrices.  
	img.allocate(camWidth, camHeight, OF_IMAGE_COLOR);  
	showSnapshot = false;  
	drawCuadrado = false;  
	// place the boxes; positions and sizes here are just placeholders.  
	for (int i = 0; i < NUM_BOXES; i++) {  
		boxes[i].setup(100 + i * 180, 150, 0, 100);  
	}  
}  

void testApp::update(){  
	ofBackground(20, 20, 20);  
	vidGrabber.update();  
}  

void testApp::draw(){  
	ofSetColor(255, 255, 255);  
	if (showSnapshot) {  
		img.draw(0, 0);  
	}else {  
		vidGrabber.draw(0, 0);  
	}  
	warper.draw();  
	///all the things that are drawn AFTER ofxGLWarper's draw method are affected by it.  
	// -- NOW LETS DRAW!!!!!!  -----  
	for (int i=0; i<NUM_BOXES; i++) {  
		ofSetColor(255, 0, 0);  
		if (drawCuadrado) {  
			ofFill();  
		}else {  
			ofNoFill();  
		}  
		boxes[i].draw();  
		ofSetColor(255, 255, 255);  
	}  
}  

void testApp::takeSnapshot(){  
	img.setFromPixels(vidGrabber.getPixels(), camWidth, camHeight, OF_IMAGE_COLOR);  
	img.mirror(false, true);  
}  

void testApp::keyPressed(int key){  
	switch (key) {  
		case ' ':  
			if (warper.isActive()) {  
				warper.deactivate(); //once you are done with the warping you should call this method, so it releases the keyboard and mouse and stops processing the transformation matrices.  
									 // this will reduce the amount of processing needed.  
			}else {  
				warper.activate();  
			}  
			break;  
		case 'c':  
			drawCuadrado = !drawCuadrado;  
			break;  
		case 's':  
			if (!showSnapshot) takeSnapshot();  
			showSnapshot = !showSnapshot;  
			break;  
	}  
}  

void testApp::keyReleased(int key){}  
void testApp::mouseMoved(int x, int y){}  
void testApp::mouseDragged(int x, int y, int button){}  
void testApp::mousePressed(int x, int y, int button){}  
void testApp::mouseReleased(){}  

update: I modified a few things in the code above, it works the same but it’s clearer.

TO USE: drag the corners of the frame drawn around the boxes to their corresponding corners in the image captured by the camera.
press ‘s’ to take a snapshot of the camera to place it more easily.
press ‘c’ to toggle the fill of the boxes.
press ’ ’ to enable/disable warping.