Hardware decoding live video stream(s) in openFrameworks?

Hey all,

I’m hoping to migrate a project I’ve developed in Touchdesigner over to openFrameworks. Currently I’m capturing two video streams from a pair of cameras being developed by a small startup. They’ve advised me that the new 4k model will probably overwhelm DirectShow-based capture in Windows, and that I should use the Media Foundation API to take advantage of hardware acceleration.

Touchdesigner has native support for streaming via Media Foundation, and I’ve noticed that openFrameworks has the ofxWMFVideoPlayer addon, but I haven’t found information on whether this addon can process streams or only play video files.

What are people using to accomplish this kind of thing?

hello,

The ofxWMFVideoPlayer addon has a technical specification file. The wrapper seems to be intended for video playback only, but the comments invite others to try to extend it.

Could a BlackMagic solution work for you?
This addon –> ofxBlackmagic2 has recently been updated.

Could you specify what kind of new cameras you are using? Just curious :wink:

I think the gstreamer decoders shipped with OF should be able to handle 4k streams. I’ve decoded 4k streams using libav/ffmpeg (without hw acceleration) and it works okay. FFmpeg does implement hardware acceleration but I haven’t been able to use it yet. Note that gstreamer uses ffmpeg or libav underneath…

In any case, none of these libraries is easy to get started with. I would suggest trying the gstreamer approach via OF and seeing if you can specify the decoder to be used; there may be hardware-accelerated ones available.
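
In case it helps, this is roughly what the custom-pipeline route looks like with ofGstVideoUtils on an OF build that uses GStreamer. It’s a sketch from memory and untested, so double check the exact method signatures in ofGstUtils.h; the RTSP address and element names are just placeholders, and you’d swap avdec_h264 for a hardware element such as vaapih264dec if your platform exposes one.

```cpp
// Rough sketch: feeding a custom GStreamer pipeline into OF via ofGstVideoUtils.
// From memory and untested -- check ofGstUtils.h in your OF version for the
// exact setPipeline() signature. The URL and element names are placeholders.
#include "ofMain.h"
#include "ofGstUtils.h" // where ofGstVideoUtils is declared in recent OF versions

class ofApp : public ofBaseApp {
public:
    ofGstVideoUtils gst;
    ofTexture tex;

    void setup(){
        // Everything up to (but not including) the appsink that OF appends itself;
        // swap avdec_h264 for a hardware decoder element (e.g. vaapih264dec) if available.
        std::string pipeline =
            "rtspsrc location=rtsp://192.168.1.10/stream ! "
            "rtph264depay ! h264parse ! avdec_h264 ! videoconvert";
        gst.setPipeline(pipeline, OF_PIXELS_RGB, true /*isStream*/, 3840, 2160);
        gst.startPipeline();
        gst.play();
    }

    void update(){
        gst.update();
        if(gst.isFrameNew()){
            tex.loadData(gst.getPixels());
        }
    }

    void draw(){
        if(tex.isAllocated()) tex.draw(0, 0, ofGetWidth(), ofGetHeight());
    }
};
```

Whether you actually get hardware decoding then just comes down to which decoder element ends up in the pipeline.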

Good luck!

Hi @fthrfrl,

Accessing a stream from a hardware device has two parts: (1) accessing the device’s stream and (2) transforming the stream into something that you can work with (e.g. RGB pixels). Hardware acceleration only comes into play in the second part, because most camera streams use a compressed data format to lower the bandwidth, such as YUV, NV12, MJPEG or H264.
The first part most of the time uses platform-specific code under the hood to access USB, FireWire and Ethernet ports in a uniform way, which is why you often see platform-specific backends used for this. FFmpeg does this too; on Windows, I think it still uses DirectShow for accessing capture devices, and TouchDesigner also lets you select which backend to use. The Media Foundation framework for Windows is newer, and I’ve used it successfully for tapping into a 4k stream coming from a USB 3.0 port. But it’s not the easiest way to go and needs a good understanding of how the framework works.
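
For reference, the bare-bones Media Foundation source-reader pattern looks roughly like this. It’s an untested sketch with all error handling stripped, but the attributes and calls are the standard ones; the function name and the “copy into ofPixels” step are just placeholders for however you want to hook it into OF.

```cpp
// Bare-bones Media Foundation source-reader sketch (Windows only).
// Untested illustration of the general pattern -- all error handling omitted.
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mf.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")
#pragma comment(lib, "ole32.lib")

void grabOneSample(){
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);
    MFStartup(MF_VERSION);

    // 1) Enumerate video capture devices and activate the first one.
    IMFAttributes* devAttr = nullptr;
    MFCreateAttributes(&devAttr, 1);
    devAttr->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
                     MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID);
    IMFActivate** devices = nullptr;
    UINT32 count = 0;
    MFEnumDeviceSources(devAttr, &devices, &count);

    IMFMediaSource* source = nullptr;
    devices[0]->ActivateObject(IID_PPV_ARGS(&source));

    // 2) Wrap it in a source reader, asking MF to use hardware transforms
    //    (hardware decoders/converters) where available.
    IMFAttributes* readerAttr = nullptr;
    MFCreateAttributes(&readerAttr, 1);
    readerAttr->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
    IMFSourceReader* reader = nullptr;
    MFCreateSourceReaderFromMediaSource(source, readerAttr, &reader);

    // 3) Pull a sample from the first video stream; in a real app this runs
    //    in a loop or via the async callback interface.
    DWORD streamIndex = 0, flags = 0;
    LONGLONG timestamp = 0;
    IMFSample* sample = nullptr;
    reader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM,
                       0, &streamIndex, &flags, &timestamp, &sample);

    // ... lock the sample's IMFMediaBuffer, copy into ofPixels or upload
    //     straight to an ofTexture, then Release() everything ...

    MFShutdown();
    CoUninitialize();
}
```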
For the second part: because transforming the stream into something you can work with is almost always needed, these frameworks tend to set up a pipeline that does the transformation for you. This can either be CPU based or hardware accelerated by special-purpose parts of the GPU. Keep in mind that GPU based is not always what you want, because then all the pixels need to be copied to the GPU first, transformed, and copied back to the CPU for use… On Intel chips things are a bit more streamlined, since the CPU and GPU are built into one chip and there are dedicated parts on that chip to do the decoding for you.

So an important question to ask is: what will you do with the pixels from the camera? Are you doing CPU pixel operations? Or can everything be done on the GPU, for instance only displaying the texture? NV12, a very common compression scheme for capture devices, can be transformed to RGB in a shader, for example. The nice thing here is also that NV12 packets are smaller and thus more efficient to upload. So in the second case I would surely try using DirectShow and see how far you can get.
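
To illustrate that last point, here is a rough sketch of an NV12-to-RGB fragment shader set up in OF with the programmable renderer. The texture names and the assumption that the Y and interleaved UV planes are uploaded as separate single-channel and two-channel rectangle textures are mine, so adjust to however your capture code hands you the planes.

```cpp
// Rough sketch: NV12 -> RGB in a fragment shader (OF programmable renderer, GL3).
// Assumes the full-res Y plane and the half-res interleaved UV plane are uploaded
// as two separate ARB rectangle textures; if you call ofDisableArbTex(), use
// sampler2D and normalized coordinates instead.
ofShader nv12Shader;

void setupNv12Shader(){
    std::string vert = R"(
        #version 150
        uniform mat4 modelViewProjectionMatrix;
        in vec4 position;
        in vec2 texcoord;
        out vec2 texCoordVarying;
        void main(){
            texCoordVarying = texcoord;
            gl_Position = modelViewProjectionMatrix * position;
        }
    )";
    std::string frag = R"(
        #version 150
        uniform sampler2DRect yTex;   // full-res luma plane (R8)
        uniform sampler2DRect uvTex;  // half-res interleaved chroma plane (RG8)
        in vec2 texCoordVarying;
        out vec4 outColor;
        void main(){
            float y = texture(yTex, texCoordVarying).r;
            vec2 uv = texture(uvTex, texCoordVarying * 0.5).rg;
            // BT.601 limited-range YCbCr -> RGB
            float c = 1.164 * (y - 16.0/255.0);
            float d = uv.x - 0.5; // Cb
            float e = uv.y - 0.5; // Cr
            vec3 rgb = vec3(c + 1.596*e,
                            c - 0.392*d - 0.813*e,
                            c + 2.017*d);
            outColor = vec4(clamp(rgb, 0.0, 1.0), 1.0);
        }
    )";
    nv12Shader.setupShaderFromSource(GL_VERTEX_SHADER, vert);
    nv12Shader.setupShaderFromSource(GL_FRAGMENT_SHADER, frag);
    nv12Shader.linkProgram();
}

// In draw(): bind the two planes and draw a quad the size of the video, e.g.
// nv12Shader.begin();
// nv12Shader.setUniformTexture("yTex",  yPlaneTex,  0);
// nv12Shader.setUniformTexture("uvTex", uvPlaneTex, 1);
// yPlaneTex.draw(0, 0);
// nv12Shader.end();
```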