Multi-object tracker logic


I wanted to know what the logic is when implementing a MOT tracker like the ones that come with OpenCV, or a DNN-based one.

Everything I see on the net is based on passing the tracker one or more bounding boxes to track and then running the tracker. So the logic is: analyze some frames to detect something, then track those detections, but…

What is the logic when you run object detection on every frame and want to feed the results to the tracker? I mean, when you are continuously running the detection step and want to feed those detections to the tracker.

Not sure if it answers your question, but take a look at the object tracker in ofxCv. It essentially runs object detection on each frame, then compares the detected rectangles with the previously detected ones and assigns the same ID to the most similar ones.
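That matching step could be sketched roughly like this. This is a minimal greedy version using center distance as the similarity measure; `Box`, `GreedyTracker`, and the distance threshold are my own names for illustration, not the ofxCv API (which also does things like keeping objects alive for a few frames after they disappear):

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical minimal rectangle type (ofxCv itself works with cv::Rect).
struct Box { float x, y, w, h; };

// Distance between rectangle centers, used as the similarity measure here.
static float centerDist(const Box& a, const Box& b) {
    float dx = (a.x + a.w * 0.5f) - (b.x + b.w * 0.5f);
    float dy = (a.y + a.h * 0.5f) - (b.y + b.h * 0.5f);
    return std::sqrt(dx * dx + dy * dy);
}

// Greedy tracking-by-detection sketch: every frame you feed in the fresh
// detections; each one takes the ID of the closest unmatched box from the
// previous frame (if it is near enough), otherwise it gets a new ID.
class GreedyTracker {
public:
    explicit GreedyTracker(float maxDist) : maxDist(maxDist) {}

    // Returns one ID per detection, in the same order as the input.
    std::vector<int> track(const std::vector<Box>& detections) {
        std::vector<int> ids(detections.size(), -1);
        std::vector<bool> used(previous.size(), false);
        for (size_t i = 0; i < detections.size(); ++i) {
            float best = maxDist;
            int bestJ = -1;
            for (size_t j = 0; j < previous.size(); ++j) {
                if (used[j]) continue;
                float d = centerDist(detections[i], previous[j].second);
                if (d < best) { best = d; bestJ = static_cast<int>(j); }
            }
            if (bestJ >= 0) { ids[i] = previous[bestJ].first; used[bestJ] = true; }
            else            { ids[i] = nextId++; }
        }
        // The current detections become the "previous" set for the next frame.
        previous.clear();
        for (size_t i = 0; i < detections.size(); ++i)
            previous.emplace_back(ids[i], detections[i]);
        return ids;
    }

private:
    std::vector<std::pair<int, Box>> previous;
    float maxDist;
    int nextId = 0;
};
```

So the answer to "how do I feed continuous detections to the tracker" is simply: call `track()` once per frame with that frame's detections, and the IDs persist as long as the boxes stay close between frames.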

Hi, you may have found this already, but I have forked a great solution to this from patriciogonzalezvivo.

My fork is here:

It works well with the current master. It also includes a nice test video I made with a Kinect mounted overhead of people walking around a space, a pretty reasonable test scenario that was useful for me. If that scenario is useful to you, here are some more videos for testing people-tracking in a space from above, all made with a Kinect using a depth filter to remove the floor.


Yep, those answers helped. From a quick look it seems both solutions use the same logic, so I will try to implement something generic that can be used with all the tracker algorithm variants.

I don't think those will work when one person passes behind another… but for simple usage I think it's enough.
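For the "something generic" idea, one possible shape (all names here are my own assumptions, not taken from the forks above) is to hide the matching strategy behind a single `update()` call, so the per-frame detection loop never changes when you swap tracker variants:

```cpp
#include <vector>

// Hypothetical types for illustration.
struct Box { float x, y, w, h; };
struct TrackedObject { int id; Box box; };

// Generic interface: the caller pushes this frame's detections and gets
// back stable IDs. The matching strategy (rectangle similarity, IoU
// overlap, a Kalman-predicted position, appearance features, ...) lives
// behind update() and can be swapped per tracker variant.
class MultiTracker {
public:
    virtual ~MultiTracker() = default;
    virtual std::vector<TrackedObject> update(const std::vector<Box>& detections) = 0;
};

// Trivial variant just to show the shape: it hands out a fresh ID to every
// detection on every frame, i.e. it does no real matching at all.
class NaiveTracker : public MultiTracker {
public:
    std::vector<TrackedObject> update(const std::vector<Box>& detections) override {
        std::vector<TrackedObject> out;
        for (const Box& b : detections) out.push_back({nextId++, b});
        return out;
    }
private:
    int nextId = 0;
};
```

A real variant would keep the previous frame's boxes as state and match against them inside `update()`; occlusion handling (one person passing behind another) is exactly the part where the variants differ, e.g. by predicting motion or keeping lost IDs alive for a few frames.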