object detection

Hi all,

I’ve ported an old project that is essentially ripped from the OpenCV examples on feature-based object detection. I think it’s a fairly standard and powerful approach, and OpenCV makes it very simple to use and experiment with many similar algorithms via the templated feature/descriptor/matcher routines (a rough sketch of the pipeline is below the code link).
[flash=200,200]http://vimeo.com/29734948[/flash]

Code is here: http://github.com/pkmital/Real-Time-Object-Detection
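
In case it helps to see the shape of the approach before digging into the repo, here is a minimal sketch of that templated pipeline, assuming the OpenCV 2.x factory API of the time; the "SURF"/"FlannBased" strings are just example choices and the image paths are placeholders, not anything taken from the actual project:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    // Load a template of the object and a scene to search in (placeholder paths).
    cv::Mat object = cv::imread("object.png", 0);
    cv::Mat scene  = cv::imread("scene.png", 0);

    // The templated interface: swap algorithms just by changing the strings.
    // Note: depending on your OpenCV version, SURF may live in the nonfree module.
    cv::Ptr<cv::FeatureDetector>     detector  = cv::FeatureDetector::create("SURF");
    cv::Ptr<cv::DescriptorExtractor> extractor = cv::DescriptorExtractor::create("SURF");
    cv::Ptr<cv::DescriptorMatcher>   matcher   = cv::DescriptorMatcher::create("FlannBased");

    // Detect keypoints and compute descriptors for both images.
    std::vector<cv::KeyPoint> objectKeypoints, sceneKeypoints;
    cv::Mat objectDescriptors, sceneDescriptors;
    detector->detect(object, objectKeypoints);
    detector->detect(scene, sceneKeypoints);
    extractor->compute(object, objectKeypoints, objectDescriptors);
    extractor->compute(scene, sceneKeypoints, sceneDescriptors);

    // Match the object's descriptors against the scene's; a homography/RANSAC
    // step on the matched keypoints would then localise the object.
    std::vector<cv::DMatch> matches;
    matcher->match(objectDescriptors, sceneDescriptors, matches);

    return 0;
}
```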

It can be used for quite interesting applications other than just detection, of course. I have been using it to work on resynthesis algorithms, where I use object detection together with a database of objects to resynthesize video - concatenative synthesis meets video. Regurgitating Svankmajer’s “Dimensions of Dialogue” after having eaten Svankmajer’s “Food”:
[flash=200,200]http://vimeo.com/30095044[/flash]

Thanks

Interesting!

So for the Dimensions video you posted:
You analysed Jan Svankmajer’s “Food”, took all of the objects out of that film and put them into a database, then analysed “Dimensions of Dialogue” and recreated it using similar-looking objects from the database you made from Food?

It’s very fitting to the themes he explores in his own work, very cool project!

nice :slight_smile: (I always use manuals and stabilo markers too when testing feature trackers :smiley: )

Yes exactly! It is a painfully slow but automated process! That video took about 2 days to complete.

Haha maybe it wasn’t the best selection.

I bet it did take a long time. How, if it is an automated process, does it look at a frame and say “this is an object”? Or are you looking at the movie to be recreated first and then scouring the source video for a good match for certain pixel blobs?

Also, if you don’t mind a bit of critique, I think the resulting video would have a much stronger impact if you could leave some of the images on screen for longer. As it is now, they seem to be replaced every frame, so it is difficult to tell where they originally came from. Maybe the compression is killing it too.

Both actually. The first pass is the creation of the database, where it finds objects in each frame. The second pass is the resynthesis, again finding objects, but matching them to those in its database.
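
Very roughly, the structure is something like this (heavily simplified, not the actual repo code; the filenames, the distance threshold, and the scoring are placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Ptr<cv::FeatureDetector>     detector  = cv::FeatureDetector::create("FAST");
    cv::Ptr<cv::DescriptorExtractor> extractor = cv::DescriptorExtractor::create("SURF");
    cv::Ptr<cv::DescriptorMatcher>   matcher   = cv::DescriptorMatcher::create("FlannBased");

    cv::Mat frame, gray, descriptors;
    std::vector<cv::KeyPoint> keypoints;

    // Pass 1: build the database from the source film (placeholder filename).
    std::vector<cv::Mat> database;               // one descriptor matrix per frame
    cv::VideoCapture source("food.mov");
    while (source.read(frame)) {
        cv::cvtColor(frame, gray, CV_BGR2GRAY);
        detector->detect(gray, keypoints);
        extractor->compute(gray, keypoints, descriptors);
        database.push_back(descriptors.clone());
    }

    // Pass 2: for every frame of the target film, find the best database frame.
    cv::VideoCapture target("dimensions.mov");   // placeholder filename
    while (target.read(frame)) {
        cv::cvtColor(frame, gray, CV_BGR2GRAY);
        detector->detect(gray, keypoints);
        extractor->compute(gray, keypoints, descriptors);

        int bestFrame = -1;
        size_t bestScore = 0;
        for (size_t i = 0; i < database.size(); ++i) {
            if (database[i].empty() || descriptors.empty()) continue;
            std::vector<cv::DMatch> matches;
            matcher->match(descriptors, database[i], matches);
            size_t good = 0;                     // count sufficiently close matches
            for (size_t m = 0; m < matches.size(); ++m)
                if (matches[m].distance < 0.25f) ++good;
            if (good > bestScore) { bestScore = good; bestFrame = (int)i; }
        }
        // bestFrame then indexes the source footage used for the resynthesis.
    }
    return 0;
}
```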

Excellent point, and one I am working on currently. This essentially entails finding video segments in 3D, and I have a few ideas for this that should make the whole process much faster and perceptually more continuous.

So you’ll be doing a pattern match on a 3D object created from spatial and temporal data? Very interesting! I’d love to hear more about this.

Very cool, pkmital. Thanks a lot.

It’s strange that it runs very slow on my machine, whereas it seems very fast in your video.

Yes! I have yet to upload any preliminary results, but the idea is simple: rather than doing the FLANN-based vector-to-vector matching, I do sequence matching and concatenate vectors together to create a spatio-temporal descriptor.
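
Very roughly, something along these lines (simplified for illustration; the mean-pooling and the fixed window size here are not the real details):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Collapse one frame's keypoint descriptors (N x D) into a single 1 x D signature.
cv::Mat frameSignature(const cv::Mat &descriptors)
{
    cv::Mat mean;
    cv::reduce(descriptors, mean, 0, CV_REDUCE_AVG, CV_32F);
    return mean;
}

// Concatenate the signatures of `window` consecutive frames into one
// 1 x (window*D) spatio-temporal descriptor; rows built this way can be
// matched with FLANN exactly like the per-frame descriptors were.
cv::Mat spatioTemporalDescriptor(const std::vector<cv::Mat> &signatures,
                                 size_t start, size_t window)
{
    int D = signatures[start].cols;              // assumes all signatures are 1 x D, CV_32F
    cv::Mat st(1, (int)(window * D), CV_32F);
    for (size_t i = 0; i < window; ++i)
        signatures[start + i].copyTo(st.colRange((int)(i * D), (int)((i + 1) * D)));
    return st;
}
```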

What is the size of the image you are running on? You may find that reducing the image size is an easy way to speed up the algorithm at not too much cost to the detection quality; I was running on a 320x240 image. You could also try “FAST” instead of “DynamicSURF” for the FeatureDetector (see inside pkmDetector’s constructor), although the results will be less stable. Likewise, you could try “NONE_FILTER” instead of “CROSS_CHECK_FILTER”, though again the matches will be less reliable.
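
Roughly what those two speed tweaks look like in plain OpenCV calls (the actual spot to change them is inside pkmDetector, which will look a bit different):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Detect keypoints on a downscaled frame with a faster (but less stable) detector.
// Keypoint coordinates would need scaling back up if you draw them on the
// full-resolution frame.
std::vector<cv::KeyPoint> detectFast(const cv::Mat &frame)
{
    cv::Mat resized;
    cv::resize(frame, resized, cv::Size(320, 240));     // fewer pixels to search

    cv::Ptr<cv::FeatureDetector> detector = cv::FeatureDetector::create("FAST");
    // cv::Ptr<cv::FeatureDetector> detector = cv::FeatureDetector::create("DynamicSURF");

    std::vector<cv::KeyPoint> keypoints;
    detector->detect(resized, keypoints);
    return keypoints;
}
```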