Slow video playback in OF (compared to Processing)

For a new project, I need to load/play around 250 Quicktime movie files (~ 100 x 70 pixels each) simultaneously. This is obviously heavy, so I have been trying to load a single large video file (~ 1700 x 950 pixels) and then break it up into smaller videos from within OF. The large video is 60 seconds, and ~204MB.

However, I am getting very poor framerates with this technique (around 8 fps). With Processing, using the same technique, I get 30+ fps. This is surprising to me, as I had assumed OF/C++ was much faster than Java for all things, including video playback.

Loading the 250 videos individually seems to be out of the question (even with just 10 videos in OF, I get framerates ~1fps).

Also, in the past I remember having seen video wall projects where several hundred videos play at once with good performance – not sure what language these use if not C++…?

I am using MPEG-4 video compression, with keyframes every frame. Is there another compression I should try?

Also, I saw a mention somewhere of Zach having hacked something that reads QT files as a data stream, not sure if that would help.

Are there any other tips/ideas to get better video performance in OF?

Otherwise, I might be forced to go back to Processing for this project! :(

Thanks so much!

Jonathan

hi -

can you upload zips of the projects you think are slow and include the data folder? I’d really like to see the speeds - we haven’t seen any complaints about slowness and quicktime before, and I’ve seen several projects that (a) use a lot of movies and (b) use extremely large video files.

it could be a number of issues (texture loading, improper usage of quicktime, encoding, etc) but I’d really like to see it before speaking about it.

if you want to use processing, go for it !! we won’t cry… people should use the tools they want to and if you find p5 better for video for your usage, by all means use it. We think OF is faster for video, but if you find it isn’t then do it in what you think will be best

good luck !
zach

Hey Zach,

Thanks so much for the speedy response!

I prepared two zip files for you to have a look at. They are both based on the OF “moviePlayerExample” file.

The first is: http://number27.org/download/OF/moviePlayerMany.zip (13 MB)
This one loads and plays 10 copies of a small (3 MB) Quicktime movie simultaneously. I get 4 FPS on this.

The second is: http://number27.org/download/OF/moviePlayerBig.zip (285 MB)
This one loads and plays a single large (142 MB) Quicktime movie, which is encoded as MPEG-4, at 30 fps, with keyframes at every frame. I get 2 FPS on this.

I share your feeling that OF / C++ should perform much much faster than Java in this situation, but I’m confused as to why I’m getting these really slow framerates. Maybe you can figure out what I’m doing wrong. Any advice would be much appreciated!

Thanks again so much!

Jonathan

hey thanks ! the examples helped me understand the problem - I see several issues with the smaller example. the big example didn’t unzip properly so I have to try downloading it again…

this is based on OF 0.04 code - can you try it out under the 0.04 version? (openframeworks.cc/download) it would be very helpful -

testApp.h / testApp.cpp

  
  
class testApp : public ofSimpleApp{  
	  
	public:  
		  
		testApp();  
		void setup();  
		void update();  
		void draw();  
		  
		void keyPressed  (int key);  
		void mouseMoved(int x, int y );  
		void mouseDragged(int x, int y, int button);  
		void mousePressed(int x, int y, int button);  
		void mouseReleased();  
		  
		ofVideoPlayer 		fingerMovie[10];  
		  
};  
  

  
  
#include "testApp.h"  
#include <stdio.h>  
  
//--------------------------------------------------------------  
testApp::testApp(){  
	  
}  
  
//--------------------------------------------------------------  
void testApp::setup(){	   
	ofBackground(255,255,255);	  
	  
	for (int i = 0; i < 10; i++){  
		fingerMovie[i].setUseTexture(true);  
		fingerMovie[i].loadMovie("movies/fingers.mov");  
		fingerMovie[i].play();  
	}  
}  
  
//--------------------------------------------------------------  
void testApp::update(){  
	MoviesTask(0,0);    // idle all the movies, instead of one at a time.  
						// seems to be a lot faster with multiple movies  
}  
  
//--------------------------------------------------------------  
void testApp::draw(){  
  
	printf("frameRate %f \n", ofGetFrameRate());  
	  
	ofSetColor(0xFFFFFF);  
	  
	for (int i = 0; i < 10; i++){  
		fingerMovie[i].draw(i*30,20);  
	  
	}  
}  
  
//--------------------------------------------------------------  
void testApp::keyPressed  (int key){   
  
}  
  
//--------------------------------------------------------------  
void testApp::mouseMoved(int x, int y ){  
}  
  
//--------------------------------------------------------------  
void testApp::mouseDragged(int x, int y, int button){  
}  
  
//--------------------------------------------------------------  
void testApp::mousePressed(int x, int y, int button){  
}  
  
//--------------------------------------------------------------  
void testApp::mouseReleased(){  
}  
  
  

I think that there are several things that can be optimized but I’d like to know if this helps at all, I get about 25 fps playing 10 x 320x240.

there are some big costs associated with playing multiple movies, and I think we are for the most part maxing out quicktime, not OF - so this is a quicktime fix, which calls MoviesTask once per idle frame on all movies as opposed to once per movie per frame, which seemed to be a lot more taxing.

then there are some optimizations in terms of pixels and drawing that will help too - but this seemed to help at least on the pc side get 10 movies playing smoothly.

as an alternative, you might consider loading the movies into ram, ie calculating number of frames you would like, grabbing the frames into ram (or even textures, since graphics card space is cheap) and playing the movie that way. it will definitely be much, much faster because it will not be dependent on quicktime to decompress, read from disk, every frame.

zach

for the big movie, I get:

a) playback in quicktime itself (12 fps) ( I am on a pc, so it’s not so fast)
b) playback in OF (8 fps)
c) calling ofSetUseTexture(false) (10-11 fps)
d) commenting out all pixel code in quicktime callback: (18/19fps)

I’m not sure what’s reasonable, but here’s a quick explanation of what’s happening in the OF video player code:

  • when there is a new frame of video, the “DrawCompleteProc” gets called
  • when we get pixels via the callback we convert them from BGRA to RGB (this is going to cost a lot on such a big image) - commenting it out and putting the pixels directly into the texture gets us back to 11-12 fps.
  • if set use texture is true, we upload the data to the graphics card
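
For readers unfamiliar with the conversion step mentioned above, here is a minimal sketch of a BGRA to RGB repack in plain C++. The function name `bgraToRgb` and the use of `std::vector` are assumptions for illustration; this is not the actual ofVideoPlayer code, only the general shape of the per-pixel work it describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of openFrameworks): repack a tightly packed
// BGRA buffer (4 bytes per pixel) into RGB (3 bytes per pixel).
std::vector<std::uint8_t> bgraToRgb(const std::vector<std::uint8_t>& bgra) {
    const std::size_t numPixels = bgra.size() / 4;
    std::vector<std::uint8_t> rgb(numPixels * 3);
    for (std::size_t i = 0; i < numPixels; ++i) {
        rgb[i * 3 + 0] = bgra[i * 4 + 2]; // red is the third byte of BGRA
        rgb[i * 3 + 1] = bgra[i * 4 + 1]; // green stays in the middle
        rgb[i * 3 + 2] = bgra[i * 4 + 0]; // blue is the first byte of BGRA
        // the alpha byte (bgra[i * 4 + 3]) is dropped
    }
    return rgb;
}
```

On a frame of roughly 1700 x 950 pixels this loop touches about 1.6 million pixels every time a new frame arrives, which is why skipping it shows up in the framerate.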

so yes, there is some inefficiency which likely scales up across multiple movies or very large ones, but I think it’s not huge in the scheme of things. ie, the drop from 12 to 8fps would be a drop from 50 to 40fps if quicktime could process the movie faster. there is overhead in processing the pixels (ie, we drop from 12fps to 8fps), and if we don’t draw at all we are at 18 fps (the qt app is at 12 while drawing, so it’s not a fair comparison).

my suspicion is that the bottleneck here is two things that are hard for us to control but easy to experiment with and diagnose: (a) the quicktime compression and (b) the graphics card.

what kind of graphics card do you have on your mac? are there other codecs that might decompress faster, such as jpeg or sorenson?

I hope that info helps -

good luck
zach

Hey Jonathan,

I was intrigued by what you said - “I am using MPEG 4 video compression with keyframe every frame”.

Keyframing every frame seemed counter-logical to me as that is basically denying the compressor the ability to compress over time.

I always assumed that the value should be around 10-30 - but this site says it is meant to be 300 for NTSC and 250 for PAL.

Apparently the quicktime defaults haven’t changed in 8 years!

http://www.kenstone.net/fcp-homepage/qt-…-m-fcp.html

I am going to play with the large video you uploaded and see if I can get anything faster out of it.

Cheers!
Theo

Now I think about it maybe more keyframes is better in your case - less work for the compressor maybe?

Hmm on my laptop with the large movie I get between 8 and 28fps.

I notice that the framerate on my laptop shoots up when the video has less movement in it - that is because fewer pixels need to be updated.

openFrameworks is grabbing all the pixel data from quicktime so that would suggest the slow down is in quicktime not openFrameworks which is doing the same operation every frame.

There is definitely overhead in swapping the argb pixel order to rgb which is what openframeworks is doing but when I comment this out framerate only increases by a small amount overall (ie up to 10-29 fps).

Commenting out the upload to texture increases my fps to about 17 to 50 fps
I imagine - there could be some improvement by converting to a compressed texture format first?

You are on a mac right jonathan? Could you see what fps you get from within quicktime. If you open movie inspector - Playing FPS: should tell you (the movie has to be playing).

cheers,
theo

also the graphics card could have a big impact on performance – you are uploading a big texture every frame (approximately 4mb if I calculated right) and depending on your graphics card it could very well be the bottleneck. integrated graphics cards tend to not be able to handle large texture throughput (ie, uploading tons of textures every frame), they work well with textures (ie ofImage, ofFont will run fine - they get uploaded once) but with throughput you could hit some walls. I tested on an nvidia card, this weekend I can test it on an intel integrated card – framerates like 1fps, etc to me suggest your graphics card is unhappy handling the upload. using ofSetUseTexture(false) is a good diagnostic, because this will cancel the upload so you can see how expensive everything else is.
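
As a rough check on the upload size mentioned above, the arithmetic can be sketched like this. The helper names are made up for illustration, and the 3 bytes per pixel (RGB) and 30 fps figures are assumptions; the 1700 x 950 dimensions come from the first post.

```cpp
#include <cstdint>

// Hypothetical helper: bytes sent to the graphics card for one
// uncompressed frame of the given size.
std::uint64_t bytesPerFrame(std::uint64_t width, std::uint64_t height,
                            std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Hypothetical helper: sustained upload rate if every frame is re-uploaded.
std::uint64_t bytesPerSecond(std::uint64_t frameBytes, std::uint64_t fps) {
    return frameBytes * fps;
}
```

With these assumptions, `bytesPerFrame(1700, 950, 3)` gives 4,845,000 bytes per frame, in line with the rough 4mb estimate, and `bytesPerSecond(4845000, 30)` gives about 145 million bytes per second of texture upload.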

thanks!
zach

Zach / Theo,

Thanks so much for all your feedback!

Let me back up a bit, and try restating exactly what I’m trying to do, and the various constraints I’m working with, and perhaps you guys will be able to suggest the best approach.

TARGET PLATFORM:

  • A very high resolution (2160 x 3840 pixel) touch-screen PC
  • Graphics card: Radeon X1950 XT / 512MB GDDR4 / PCI Express
  • Processor: Core 2 Extreme QX6700 2.66GHz
  • RAM: 4GB PC4200 DDR2
  • Motherboard: Intel D975XBXLKR Core 2 ready, Crossfire

CONSTRAINTS:

  1. Need ~250 videos (108 x 70 pixels each) to be playing simultaneously.
  2. These videos only need to be silhouettes (grayscale is fine) – so all I need is a brightness or grayscale value for each pixel, not full color.
  3. These videos are prerecorded, so can be saved as data, or whatever will be the most optimal form (i.e., if there is some way to bypass Quicktime)
  4. Total video space is 952 x 1696 pixels
  5. Total video length is ~60 secs
  6. Videos need to play independently (not all in sync), so once loaded (however that happens) they need to be available as independent local objects.

VARIOUS APPROACHES:

  1. Load many (250) small video files and play them.
  2. Load one large video, composited in AfterEffects, consisting of all 250 smaller videos, and then cut up the larger video within OF
  3. Store the video data in text files as arrays of per-pixel brightness (grayscale) values, and then read in these text files. Seems the problem with this is that no video compression can be taken advantage of. I did a quick test here, and found that the 400k “fingers.mov” example video shoots up to 14MB when saved as a text file with per-pixel brightness values for each frame. Does this seem right?
  4. Some other way to bypass Quicktime? Work with straight pixel data somehow? Zach, you had mentioned storing the video data in either RAM or as a texture on the graphics card. I would have to look into how to do that, but do you think that might be a good approach?
  5. Any other ideas?
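
To make approach 2 concrete, here is a minimal sketch of cutting one tile out of a large frame in plain C++. It assumes a single-channel (grayscale) frame stored row by row with no padding; the name `cropTile` and its parameters are hypothetical and not part of OF.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy a tileW x tileH rectangle whose top-left corner
// is at (x0, y0) out of a grayscale frame that is frameW pixels wide.
// The caller is responsible for keeping the rectangle inside the frame.
std::vector<std::uint8_t> cropTile(const std::vector<std::uint8_t>& frame,
                                   std::size_t frameW,
                                   std::size_t x0, std::size_t y0,
                                   std::size_t tileW, std::size_t tileH) {
    std::vector<std::uint8_t> tile(tileW * tileH);
    for (std::size_t row = 0; row < tileH; ++row) {
        // Start of this tile row inside the big frame.
        const std::uint8_t* src = frame.data() + (y0 + row) * frameW + x0;
        std::memcpy(tile.data() + row * tileW, src, tileW);
    }
    return tile;
}
```

Note that with this approach every tile comes from the same decoded frame, so the small videos stay in sync with each other, which conflicts with constraint 6 unless the frames are cached first.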

HOW SHOULD VIDEO(S) BE SAVED:

  1. What’s the optimal framerate? 12fps? 30fps?
  2. Keyframes every… 12 frames? 30 frames? 1 frame?
  3. Codec: MPEG-4? Sorenson?

So, that’s what the landscape looks like. With your knowledge of video and OF, which path would you take?

Thanks again for all your continued help (and your fast responses)!

I’ll buy you both a few beers someday!

Jonathan

a quick answer is that the more you can do in texture memory, the faster it will be. texture memory is on the graphics card itself, which typically has a lot of space for images, and since it’s luminance, even better.

this is a demo I quickly coded that has 50 six-second 320x240 movies (of random noise) playing. each movie’s frames are basically an array of ofTexture objects. it takes a *long* time to start, since it is allocating and uploading those images, but better to do that once per app starting than every frame.

http://www.openframeworks.cc/files/text-…-pleSrc.zip

on a pc, the memory goes way up when you start it (up to 700mb) - don’t fret - this is just the system registering all the allocation, but once it’s up and running it drops down and is stable at 30mb memory and runs at 60fps. I think w/ out vertical sync it would be even faster, it’s not working that hard at all since the images are all in there on the graphics card. This is how video games work - pump textures to the card at the start and use them when needed.

this way, you might even just save your images as sequential jpegs or load the small movies and get out the individual frames into textures. the more compressed they are the better, since decompression is usually a lot faster than any disk reads.

I coded a simple logic for advancing frames, but it’s a bit hard to tell if it’s completely perfect. it uses an internal variable to represent where it is in the movie. You can also look at the code for the linux OF video player, which is doing the same thing. I think once you have recognizable images in there it might be easier to see if it’s playing clearly or not. currently it’s just random noise.
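
One way to sketch frame-advance logic for a looping sequence of pre-loaded frames is shown below. This is a time-based variant written for illustration, not the code from the demo (which keeps an internal position variable); the name `frameIndexAt` and its parameters are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: map elapsed playback time to a frame index in a
// looping sequence of numFrames pre-loaded frames.
int frameIndexAt(double elapsedSeconds, double movieFps, int numFrames) {
    if (numFrames <= 0 || movieFps <= 0.0 || elapsedSeconds < 0.0) {
        return 0; // nothing sensible to show, fall back to the first frame
    }
    const long long frame =
        static_cast<long long>(std::floor(elapsedSeconds * movieFps));
    return static_cast<int>(frame % numFrames); // wrap around to loop
}
```

Giving each movie its own start time (and subtracting it before calling the helper) would let the movies play out of sync with each other while sharing the same logic. Because the index is derived from time, a slow draw loop skips frames instead of slowing the movie down.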

hope that helps !
zach

Hey Zach,

Hooray for textures! They’re blazing fast! I have an example with 250 copies of the same video (each at 320x240 pixels) playing simultaneously, loaded into textures, and I get upwards of 70fps!!

You can grab the source here:
http://number27.org/download/OF/moviePixelgrabber.zip (6MB)

So this is definitely the way to go! Thanks so much for the pointers.

However, a couple of other issues have popped up:

  1. GL_LUMINANCE for textures is acting funny. While GL_RGB works fine, GL_LUMINANCE leads to a tiny checkerboard pattern in the texture, as opposed to giving the grayscale representation of the video, as I had expected. You can uncomment the GL_LUMINANCE section in the code to see what I mean.

  2. I noticed some basic math functions (floor, ceil, etc.) aren’t included in OF. This led me to try to include the standard math library Math.h, but I am not sure how to do this (a) find the source code for Math.h, (b) include it in the project – i.e. how do you find and add external libraries? Is there an example of this somewhere, and is there a good online resource for standard C++ libraries?

Thanks again!

Jonathan

cool –

a) are you on OF 0.03 or 0.04 ? we’ve moved to allow non power of 2 textures on graphics cards that support it (basically nvidia or ati). I think the artifact you are seeing might be based on that – I haven’t run your stuff yet, just guessing :)

b) we are including #include <math.h>, check in ofConstants.h. I just tested, at least on a pc, I have access to everything in math.h

http://www.cplusplus.com/reference/clibrary/cmath/

if you include ofMain you should be able to do stuff like this:

  
  
printf("%i \n", (int)ceil(1.4f));  
  

anywhere…

if you don’t, can you post screenshots of the errors in a separate thread in the osx subforum ?

thanks!
zach

Hey guys,

So, I am plugging away at this problem, and hitting some more walls with which I could really use some help.

My latest code is here:
http://number27.org/download/OF/textureMovie.zip

Basically, I am loading 60 six-second videos, saving them frame by frame as textures, and then looping through the textures to play the frames (to bypass Quicktime bottlenecks).

These are some of the issues I’m seeing:

1) Strange framerate fluctuation

As the program loops through the 180 frames of textures, my FPS fluctuates *consistently* between 30 fps (the target rate) and 8 fps (poor). After cycling through all 180 frames several times at 30 fps, the fps slowly drops to around 8 fps, before shooting back up to 30 fps each time frame 60 is encountered on each iteration. This goes on and on. I can’t understand why this is happening. The pixel complexity of all the frames seem to be roughly the same. Is there some strange array accessing issue? OF trying to catch its breath at the same place each time? Something else?

2) Program crashes with 100+ videos

I can load 60 videos ok, but when I increase the number of videos loaded to around 100 (determined by the “NUM_VIDEOS” variable in testApp.h), the program crashes after running for about 80 frames. This is surprising to me, because I had assumed the heaviest part of the process would be saving all the videos to textures, and that once that’s done, looping through the textures would be very fast. But the textures load fine, it’s the playback that causes the crash. When the 100 videos are being saved to textures in this case, the memory usage goes up to ~1.5GB. I have 4GB of RAM. Shouldn’t this be well within reason? Is there some way to see how much memory is allocated to OF? And some way to increase that amount of memory? For instance, in Processing there is a preference setting telling the machine how much memory to give to Processing. Is there some equivalent here (I’m using Dev-C++)? My real goal is to load ~220 videos (60 seconds & 60 x 170 pixels each). Is this reasonable? Or is this totally out of the question? If the latter, then I need to rethink this whole project (yikes!).

3) GL_LUMINANCE behaving badly

When I run this program, the GL_LUMINANCE textures come out as small black and white checkerboard patterns. The source video image is still recognizable, but is very distorted. Earlier in this thread, Zach mentioned that this could be a graphics card issue, associated with the non-power of 2 texture support. Is there some way to investigate / fix this for me?

Sorry for all the questions! I’m just trying to understand if what I’m trying to do will be possible, in terms of memory / performance, or whether it’s back to the drawing board.

Thanks again so much!

Jonathan

looking at 1 and 2 now -

for (3) you should check the API about textures, or see the texture example - the last parameter is the type of data you are passing in, not what is displayed.

this:

  
  
frames[i].loadData(pixels, WIDTH_MOV, HEIGHT_MOV, GL_LUMINANCE);  
  

should be:

  
  
frames[i].loadData(pixels, WIDTH_MOV, HEIGHT_MOV, GL_RGB);  
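
If single-channel textures are still wanted to save memory, the pixel data itself would have to be one byte per pixel before being passed in as GL_LUMINANCE. A minimal sketch of that reduction in plain C++ follows; the name `rgbToLuminance` is hypothetical, the weights are an integer approximation of the common 0.299 / 0.587 / 0.114 luma weights, and whether the texture must also be allocated as a luminance texture in a given OF version is something to check rather than assume.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reduce tightly packed RGB pixels (3 bytes each)
// to one brightness byte per pixel.
std::vector<std::uint8_t> rgbToLuminance(const std::vector<std::uint8_t>& rgb) {
    const std::size_t numPixels = rgb.size() / 3;
    std::vector<std::uint8_t> gray(numPixels);
    for (std::size_t i = 0; i < numPixels; ++i) {
        const unsigned r = rgb[i * 3 + 0];
        const unsigned g = rgb[i * 3 + 1];
        const unsigned b = rgb[i * 3 + 2];
        // Weights 77 + 150 + 29 sum to 256, so the shift divides by 256.
        gray[i] = static_cast<std::uint8_t>((r * 77 + g * 150 + b * 29) >> 8);
    }
    return gray;
}
```

Passing three-byte RGB pixels while declaring them as one-byte luminance makes the card read every color component as a separate pixel, which is consistent with the distorted checkerboard described earlier.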
  

-z

about (1) and (2), I am on a geforce 8600 and I’ve maxed out my texture memory with 100 movies. I don’t crash, but I am definitely not able to store 100 texture movies in texture memory. I think I must have 128 mb… I’ll take a look at my specs.

I am not crashing w/ 100 movies, and framerate is locked to 30…

can you let us know some specs about the machine you are testing on? this stuff seems directly graphics card related, so the more we know the better.

thanks!
zach

Thanks, Zach.

Yeah, my bad on the GL_LUMINANCE thing – should have read the documentation more carefully. So that fix worked for #3.

But I’m still seeing the (bigger) problems from 1 and 2.

Here are some specs on my machine:

  • 2160 x 3840 pixel display
  • Windows XP
  • Graphics card: Radeon X1950 XT / 512MB GDDR4 / PCI Express
  • Processor: Core 2 Extreme QX6700 2.66GHz
  • RAM: 4GB PC4200 DDR2
  • Motherboard: Intel D975XBXLKR Core 2 ready, Crossfire

Is the texture memory stuff being bounded by RAM (I have 4 GB), or by graphics card memory (I have 512MB)? If the latter, is it possible to get graphics cards with more memory? Are there other things to be done (allocating more memory to OF from the OS, etc.)?

Thanks again.

Jonathan

you are definitely maxing out the graphics card memory with 100 movies:

(320 x 240 pixels * 6 seconds * 30 fps * 100 movies) = 1,382,400,000 bytes at one byte per pixel, or about 1,318 mb
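
The same estimate can be written as a small C++ sketch so other sizes are easy to try. The function name `textureBytes` and the bytes-per-pixel parameter are assumptions added for illustration; the figure above corresponds to one byte per pixel.

```cpp
#include <cstdint>

// Hypothetical helper: total bytes needed to keep every frame of every
// movie resident as an uncompressed texture.
std::uint64_t textureBytes(std::uint64_t width, std::uint64_t height,
                           std::uint64_t seconds, std::uint64_t fps,
                           std::uint64_t numMovies,
                           std::uint64_t bytesPerPixel) {
    return width * height * seconds * fps * numMovies * bytesPerPixel;
}

// Hypothetical helper: convert bytes to whole megabytes (1024 * 1024 bytes).
std::uint64_t toMegabytes(std::uint64_t bytes) {
    return bytes / (1024ULL * 1024ULL);
}
```

`toMegabytes(textureBytes(320, 240, 6, 30, 100, 1))` gives 1318, matching the number above. With 3 bytes per pixel (RGB data) the same call gives 3955, so either way the total is well beyond a 512MB card.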

if the textures wind up in ram, they will be really slow - since they will need to move from RAM to the graphics card per frame.

you might investigate SLI, which allows you to link graphics cards together – that might give you a lot more memory. It looks like there are now also some 1gb graphics cards, which could be useful too…

I’m not sure about the slowdown (might be because you hit frames that are in RAM not texture memory) or the crashing (I really don’t know – I’d watch memory usage and see if something bad isn’t happening there). since I don’t have either the crash or the slowdown on my machine, I’m pretty much at a loss about how to debug, but I think you are hitting the limits of what might be possible with one computer.

have you thought about doing the rendering across multiple machines, or is it necessary to run on one only?

hope that helps!
best,
zach

hi jonathan -

another simple question, have you considered not doing full resolution? with opengl textures you get bilinear interpolation at almost no cost, so you can really work at much lower resolutions than you would imagine. for example, instead of 320x240, the movies could be 160x120 but drawn at 320x240
(ofTexture has a draw function which takes x,y,w,h). you could cut your texture memory requirements by 75% and I honestly think that unless there is super delicate pixel stuff happening, you wouldn’t see or feel a difference.
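
For anyone curious what the card is doing when it stretches a small texture, here is a CPU-side sketch of bilinear interpolation on a grayscale image. It only illustrates the math; the graphics card does this in hardware, and the name `sampleBilinear` is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: sample a w x h grayscale image at fractional pixel
// coordinates (x, y) by blending the four nearest pixels.
double sampleBilinear(const std::vector<std::uint8_t>& img, int w, int h,
                      double x, double y) {
    // Clamp the sample position to the image bounds.
    x = std::max(0.0, std::min(x, static_cast<double>(w - 1)));
    y = std::max(0.0, std::min(y, static_cast<double>(h - 1)));
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const int x1 = std::min(x0 + 1, w - 1);
    const int y1 = std::min(y0 + 1, h - 1);
    const double fx = x - x0; // horizontal blend factor
    const double fy = y - y0; // vertical blend factor
    const double top = img[y0 * w + x0] * (1.0 - fx) + img[y0 * w + x1] * fx;
    const double bottom = img[y1 * w + x0] * (1.0 - fx) + img[y1 * w + x1] * fx;
    return top * (1.0 - fy) + bottom * fy;
}
```

Drawing a 160x120 texture at 320x240 means most output pixels fall between source pixels and get a blended value like this, which is why the result looks smooth rather than blocky.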

just a thought !

take care
zach

Hey Zach,

Yes, lower resolution videos are a good idea and I’m planning to test it out. Am also looking into multiple graphics cards as you suggested, and I understand there is a new nVidia card with 1.5GB of memory, so will try to get one. Will keep you posted. Thanks for all the help so far. Really great stuff!

Jonathan

Hello Zach! I'm doing a project in Processing and I have a problem I hope you can help me with. The project has about 70 videos which play randomly in a grid. The problem is that when a new video is loaded, the others which are already playing freeze for a second, and that cannot happen! I really don't know how to fix this; if you could give me some help I would be very grateful.

I will leave this link to a zip file with my project: http://we.tl/SJ3YUXtTnO

Thanks in advance
Tiago