Check out ofxCv by Kyle McDonald. It includes an example that does this.
You should try to implement a solution yourself. Keep a record of the previous frame's blobs, and compare each new blob against every blob from the previous frame, assigning a “best match” to each blob in the new list. A best match could be based on centroid position with a threshold for movement (as described above), or on a combination of factors (velocity, area, etc.).
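If it helps, here is a rough sketch of the centroid-plus-threshold idea in plain C++. It is not how ofxCv does it. The `Blob` struct, `matchBlobs`, `maxDistance` and `nextId` are all names I made up for illustration, and the matching is a simple greedy nearest-neighbour pass, so the result can depend on the order of the new blobs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical blob record (not an ofxCv or ofxOpenCv type).
struct Blob {
    int id;      // persistent label; -1 means "not assigned yet"
    float x, y;  // centroid position in pixels
};

// Greedy matching: each new blob takes the id of the closest previous blob
// that has not been claimed yet and lies within maxDistance.
// New blobs with no such match get a fresh id taken from nextId.
void matchBlobs(const std::vector<Blob>& previous,
                std::vector<Blob>& current,
                float maxDistance,
                int& nextId) {
    assert(maxDistance >= 0.0f);
    std::vector<bool> claimed(previous.size(), false);

    for (Blob& blob : current) {
        float bestDist = maxDistance;  // movement threshold
        int bestIndex = -1;

        for (std::size_t i = 0; i < previous.size(); ++i) {
            if (claimed[i]) continue;
            float d = std::hypot(blob.x - previous[i].x,
                                 blob.y - previous[i].y);
            if (d <= bestDist) {
                bestDist = d;
                bestIndex = static_cast<int>(i);
            }
        }

        if (bestIndex >= 0) {
            claimed[bestIndex] = true;          // one previous blob per new blob
            blob.id = previous[bestIndex].id;   // same object, keep its label
        } else {
            blob.id = nextId++;                 // nothing close enough: new object
        }
    }
}
```

Each frame you would call `matchBlobs(previousBlobs, currentBlobs, threshold, nextId)` and then copy `currentBlobs` into `previousBlobs`. To bring in velocity or area, one option is to replace the distance `d` with a weighted cost that combines those factors.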