I’ve been scrounging around the forum for a while and I haven’t found the precise answer to my question so I thought I’d throw it out here.
The context for my project: I have between 1000 and 2000 1080p videos (ranging from 4 s to 30 s) that need to be played back on the fly in a sort of video sequencer application. Of these videos only around 5-20 really need to be loaded at once, but the set of videos can be changed pretty much at any time. Only one video is playing at a time, but I want to be able to switch between loaded videos 30 times per second.
The project is being developed on OF v0.9.8.0 64-bit on Ubuntu 16.04. The most promising solutions seem to point towards using the AVFoundation players and codecs, which is a little disheartening because a sufficiently powerful Mac is not really an option.
I’ve loosely based my code on the ofxGaplessVideo addon, so loading is threaded via loadAsync. I’ve experimented with a couple of different codecs, the most promising being MJPEG and HAP. The main problem with MJPEG is that the decoding is done on the CPU: playback already pretty much maxes out my CPU, so loading a new video can cause the framerate to drop. I have a similar problem with HAP, which doesn’t support loadAsync and attaches itself to the update loop. This means that when I’m loading a HAP video there is sometimes a >30 ms delay between updates that I have no control over, and playback gets choppy.
The only working solution I’ve found is to preload all of my HAP videos into players. For now I’ve been testing on a subset of about 300 videos. If I preload them all, they take up about 4 GB of RAM and 500 MB of VRAM. This seems to correspond to approximately one decompressed frame per video (with float values and no alpha channel) stored in RAM. With this setup I’ll need a pretty exorbitant amount of memory to preload all my videos. The HAP videos are also going to fill up any SSD pretty fast, but I think I can maybe get away with keeping the videos on a spinning drive if I go with the preloading option. It still means I have to buy a crazy amount of RAM for it to work. I may also run some tests to see whether letting the RAM overflow into swap on an SSD hurts performance.
Ideally I would much prefer loading the videos on demand in a background thread in a manner that doesn’t impact playback. Does anyone have any suggestions?
A side problem is that I can’t seem to get h264 video to decode on the GPU even though I’m pretty sure my system is capable of it. Right now h264 is giving really terrible performance, but it could be the holy grail with async loading and GPU decoding.
Does anyone have some tips for getting h264 or h265 (HEVC) video to decode on the GPU with OF on Ubuntu?
Thanks for any help!
Some thoughts after reading your post.
Do you really need separate files, or could you classify and concatenate your movies into categories or themes? It’s always easier for a system to jump to a new frame than to load a new file. The latter always requires some housekeeping to be done (closing the player and opening a new one, maybe allocating resources on the CPU/GPU, …), which takes time.
IMHO: H264 and H265 (HEVC) aren’t good candidates for what you want. While super efficient in compression, these algorithms introduce dependencies between frames and require quite some decompression power. That’s why it’s best to give this task to the GPU, since GPUs have dedicated processors for video en/decoding. But this again requires housekeeping to happen. I have quite some experience with the Nvidia Video Codec SDK, and I know that a lot happens between loading a new file and showing the first texture on screen. In most cases the first frame(s) will be a reference frame, but not always, so the decoder may need multiple frames to detect the specific compression context parameters.
Loading things asynchronously will only provide a real benefit in specific situations, as not all resources for a movie can be constructed on a different thread. For example, in general (not always), OpenGL, which is what openFrameworks uses, requires resources to be created on the main thread. A texture, in particular, needs to be made on the main thread, and you will always need a texture to display video frames.
Compared to OpenGL, DirectX allows for a more multithreaded approach of loading textures and buffers.
My guess is you will be best off with a one-texture-per-frame video codec such as HAP, which stores frames as DXT textures that can be easily uploaded to and decoded by the GPU. Also make sure your videos have the same dimensions, framerate and compression parameters.
You can also try my addon ofxHPVPlayer. This is a cross-platform high-speed/high-fps video solution and was tested on Ubuntu. HPV files use the same DXT compression algorithms as HAP, but frames are de/compressed in different ways. Also, the HPV system has no dependencies on platform-specific media frameworks, so the code is transparent from reading the actual bytes all the way to display on the GPU. It should be really fast at jumping between multiple video files and at jumping between frames within one video file. Sound, however, is not supported.
Since the last version, the normal video player has a loadAsync option which loads the video without blocking the main thread.
Wow, thank you so much for the in-depth response!
I suppose I could concatenate a bunch of videos together and keep track of what frame each individual video starts at. I could just jump to the right frame when I need to switch videos. That would definitely fix my memory problem. I wonder if that approach would work with mjpeg files (fixing my hard drive space problem). I guess I still don’t fundamentally understand how video decoders are working, I was under the impression that a much larger video file would have a larger memory footprint.
Good to know for h.264/HEVC. I don’t really have time to try out a bunch of codecs and see how I can make them work so it’s good to know that I don’t need to explore that avenue.
I did come across your HPV player when I was looking for solutions but I guess I’m a little turned off by the encoding process (having to export my video to frames first). How would you say the encoded file size compares to HAP? That would be the main selling point for me to take the time to reencode my videos/update my codebase. I do really like the code transparency aspect as well as the look of the overlay you have (with HDD and decompression time) so I can figure out if there’s a bottleneck in my system.
Also when you say jumping between different video files do you mean closing and loading new files or switching between different preloaded videoplayers?
I’m pretty sure I’m using that when I decode mjpeg videos but my laptop cpu can’t quite manage the decoding and loading in parallel. By last version do you mean a version more recent than 0.9.8?
MJPEG (or PhotoJPEG in a Quicktime container) will probably also work and is good for jumping to frames in an already loaded video. It stresses the CPU more as you say. How does the ‘native’ OF videoplayer work for you with the MJPEG files, together with using loadAsync()?
Larger video files don’t necessarily have a bigger footprint. Most video player implementations buffer a number of frames (in a separate reading thread) while the display thread makes sure they are presented on screen. The reading thread will keep the buffer full but will never buffer more frames than specified.
Larger video dimensions (e.g. 4K vs FHD) of course do have an effect on the memory footprint.
I know making HPV files requires more steps than exporting HAP from e.g. Premiere. I’d have to compare file sizes with HAP; I can’t answer that off-hand. A HPV video with 250 frames of FHD video (= 10 s @ 25 fps) with lots of pixel change per frame typically compresses to between 120 and 150 MB.
With jumping between different files I do mean closing previous and opening new files.
Okay, that seems to be a significant gain over HAP; I was getting files of about 1.5 GB per minute. It wouldn’t be that complicated to make a simple Python script that calls ffmpeg to split the video into frames and then calls the HPV converter.
With MJPEG I nearly have something that works, except for very fast transitions. I’ve been playing around with different numbers of concurrent video players (4-30), with one playing at a time and an idle player loading the next video. My old implementation was trying to load too many videos in parallel, which was hogging CPU time. Basically it takes maybe 15 ms for the async load call to complete, and then another 100-300 ms to actually “prime” the video (get it to frame 1 in the playing state). This works until I do very fast transitions, where I get some strange behaviour: a video will get stuck in the loading state and isLoaded() just keeps returning false. There seem to be some unhandled messages in the GStreamer backend when this happens, so I may dig around there to see what’s going on. I might also just set a timer to reset/close any video that’s been loading too long. The timer solution at the moment causes a segfault, I think because I’m closing resources that ought not be touched.
I don’t actually need sound, so I think I might be able to speed up the process by disabling the audio part of the player. If I can’t get this solution working I’ll probably try to move to HPV, which might be more stable in these high-speed situations. I do like the prospect of saving hard drive space with MJPEG though.