I think that you could extend the method I described to deal with resyncing playback. To do this, you would use the basic method I described, with a heartbeat message being sent out by a main computer at regular intervals. The playback computers would listen to these messages continually during playback, continually updating their estimate of what time it truly is on the basis of the last N messages. Using their estimates of the true time, they could easily calculate what frame they should be playing, compare that to the frame actually being played, and, if off by more than some threshold amount (2 frames, for example), skip around in playback to resync. This assumes, of course, that playback can be controlled very precisely and that what is being played back can be changed without introducing additional latency.
The way I like to think about this problem is that you have a number of distributed computers doing video playback. They all want to know what time it is, i.e. what the “true” time is. The main controller computer knows the true time, but it can’t tell the playback computers what time it is without introducing error due to the latency involved. So, using a statistical model, the playback computers can come up with their best guess about what the true time really is based on a bunch of timestamps from the main computer.
The heartbeat messages sent out by the main computer contain a timestamp generated by the main computer at the moment each message is sent. This timestamp represents the true time. When each timestamp is received, the listening computer generates its own timestamp and, using the last N timestamp pairs, calculates the best-fit line connecting the true-time timestamps with the listening computer's timestamps. This best-fit line is a map that allows the playback computers, at any point in time, to estimate what the true time is using the following equation (which is just a rearrangement of the prediction equation I gave before)
T_et = (T_local - b0) / b1
where T_et is the Estimated True time, T_local is the current time on the playback computer, and b0 and b1 are the parameters of the best-fit line as calculated based on the last N heartbeat messages. With this estimate of the true time and the stored true time at which playback started, it is straightforward to calculate what frame should be playing. Basically, just take the difference between the estimated true time right now and the start time (in units of true time) and multiply by frames per time unit to get the current frame that should be playing.
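Here is a minimal sketch of that calculation in Python. The function names (fit_line, estimate_true_time, frame_for) and the choice of N are my own; the math is just an ordinary least-squares fit of T_local = b0 + b1 * T_true over the stored heartbeat pairs, followed by the rearranged equation above and the frame arithmetic.

```python
# Sketch of the best-fit-line method described above. Function names
# and N = 20 are illustrative choices, not part of the original method.
from collections import deque

N = 20  # number of recent heartbeat pairs to keep (an assumption)

def fit_line(pairs):
    """Least-squares fit of T_local = b0 + b1 * T_true over
    (T_true, T_local) heartbeat timestamp pairs. Returns (b0, b1)."""
    n = len(pairs)
    sx = sum(t for t, _ in pairs)
    sy = sum(l for _, l in pairs)
    sxx = sum(t * t for t, _ in pairs)
    sxy = sum(t * l for t, l in pairs)
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b0 = (sy - b1 * sx) / n
    return b0, b1

def estimate_true_time(t_local, b0, b1):
    # The rearrangement given in the text: T_et = (T_local - b0) / b1
    return (t_local - b0) / b1

def frame_for(t_et, t_start_true, fps):
    # (estimated true time now - start time, in true-time units) * fps
    return int((t_et - t_start_true) * fps)

# Example with synthetic timestamps: the local clock runs at the same
# rate as the true clock but is offset by 5 time units.
pairs = deque(maxlen=N)
for t in range(10):
    pairs.append((float(t), t + 5.0))  # (true, local)

b0, b1 = fit_line(list(pairs))
t_et = estimate_true_time(17.0, b0, b1)       # local clock reads 17.0
print(frame_for(t_et, t_start_true=2.0, fps=30))  # -> 300
```

With a perfectly clean 5-unit offset the fit recovers b0 = 5, b1 = 1 exactly; with real, jittery timestamps the fit averages the jitter out, which is the statistical cancellation mentioned later.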
So, at any point in time, the playback computer can use the equation above to estimate what the true time is, estimate what frame it should be playing, and then make any adjustments as needed. Depending on how costly it is to adjust video playback, the amount of error in the current playback frame needed to trigger an adjustment can vary a bit.
I’d like to point out that the equation I gave above is just the function that maps from local time to true time, whereas in my last post I gave the function
Y_pred = b0 + b1 * (X_N + X_off)
which can be rewritten, using T for time and with more generality as
T_local = b0 + b1 * T_true
which can now clearly be seen to map from true time to local time, and to be a rearrangement of the equation I gave for mapping from local time to true time. Thus, it is useful to think of the best-fit line as a bidirectional map that connects local time and true time in either direction. Naturally, it is not a perfectly accurate map, because it is affected by latency, both the average latency and the random message-to-message variation in latency, but it's still useful. The random variation in latency is mostly cancelled out statistically by using a large number of timestamps.
However, it is important to notice that this method is not robust against the average latency being different between the main computer and different playback computers. For example, if it takes, on average, 10 ms longer for the messages to get to one of the computers than all of the other computers, that computer will always be 10 ms farther behind true time than the other computers. Possibly the best way to address this is to estimate the average latency between the main computer and the other computers, and have all of the playback computers know about this average latency in order to be able to make an adjustment for it.
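One way to get that per-machine latency estimate, borrowing the round-trip idea used by NTP-style clock synchronization (this goes beyond what I described above, so treat it as a hypothetical extension): each playback computer occasionally pings the main computer, takes roughly half the round-trip time as the one-way latency, and adds that back onto its estimated true time, since the fitted line lags true time by the one-way latency.

```python
# Hypothetical latency-compensation sketch (not part of the method
# described above). Assumes one-way latency is about half the
# round-trip time, as in NTP-style offset estimation.
from statistics import median

def one_way_latency(rtt_samples):
    """Median of half the round-trip times; the median resists
    outlier samples caused by occasional network hiccups."""
    return median(rtt / 2.0 for rtt in rtt_samples)

def corrected_true_time(t_et, latency):
    # Heartbeats arrive `latency` late, so the best-fit line's
    # estimate lags true time by that amount; add it back.
    return t_et + latency

rtts = [0.020, 0.022, 0.019, 0.080, 0.021]  # seconds; one outlier
lat = one_way_latency(rtts)
print(round(lat, 4))  # -> 0.0105
```

This assumes latency is roughly symmetric in both directions, which is usually reasonable on a LAN; with that correction in place, a computer that is consistently 10 ms farther from the main computer no longer sits 10 ms behind the others.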