fastest (in the sense of optimising) way to resize an image

I’m making a video mirror installation, and as a small feature I need a way to capture and store what’s coming in from the camera and play it back periodically. The recorded images don’t need to be full resolution (in fact they can’t be, otherwise I’d quickly run out of memory), so I’m trying to establish the best way to resize each frame on the fly as it comes off the ofxVideoGrabber.

Each frame is 1280x960 and the video is coming in at 30fps.

At the moment I’m loading the pixels into a texture, drawing that into an FBO that’s the same size as my desired resolution, and reading the pixel data back off that, like this:

(Where I’ve already allocated an array of images and an ofxFBOTexture with my scaled dimensions)

  
  
void loadPixelsIntoMyBufferAndScaleThem(unsigned char * txtr, int w, int h){  
	  
	textureToScale.allocate(w, h, GL_RGB);  
	textureToScale.loadData(txtr, w, h, GL_RGB);
  
	// take that texture and draw it into an FBOtexture...  
	drawIntoMe.begin();  
  
	// where scW and scH are my scaled dimensions
	textureToScale.draw(0, 0, scW, scH);
  
	drawIntoMe.end();  
	  
	images[imgCount].setFromPixels((unsigned char *) drawIntoMe.getPixels(), scW, scH, OF_IMAGE_COLOR);  
}  

Doing this every frame causes a drop of about 10fps overall, but that’s still substantially faster than loading each frame into an ofImage and calling resize() on it.
I haven’t tried using openCV images and scaleInto(), but I could if that would be faster.
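
If I did go the openCV route, I imagine it would look something like this (just a rough, untested sketch; colorImgFull and colorImgSmall would be ofxCvColorImage members allocated once at the full and scaled sizes):

void scaleWithOpenCV(unsigned char * txtr, int w, int h){

	// copy the grabber frame into the full-size openCV image
	colorImgFull.setFromPixels(txtr, w, h);

	// scale the full image down into the small one; CV_INTER_NN (nearest
	// neighbour) is the cheapest interpolation, CV_INTER_LINEAR looks nicer
	colorImgSmall.scaleIntoMe(colorImgFull, CV_INTER_NN);

	// grab the scaled pixels, no FBO readback involved
	images[imgCount].setFromPixels(colorImgSmall.getPixels(), scW, scH, OF_IMAGE_COLOR);
}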

I will need to do the reverse when it comes to playing back too (i.e. go from the scaled dimensions back up to the full 1280x960), but I never need to show the image, only use the data. To that end I’ve called setUseTexture(false) on my images. I tried using ofTextures instead, but there wasn’t any difference in speed.
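
For that reverse step I’m assuming something as simple as a nearest-neighbour blow-up on the CPU would do, since the result is never drawn. A rough, untested sketch (fullW/fullH would be 1280/960, both buffers interleaved RGB, names are just placeholders):

void blowUpRGB(const unsigned char * src, unsigned char * dst,
               int srcW, int srcH, int fullW, int fullH){

	for (int y = 0; y < fullH; y++){
		int sy = y * srcH / fullH;                     // nearest source row
		for (int x = 0; x < fullW; x++){
			int sx = x * srcW / fullW;                 // nearest source column
			const unsigned char * p = src + (sy * srcW + sx) * 3;
			unsigned char * q = dst + (y * fullW + x) * 3;
			q[0] = p[0]; q[1] = p[1]; q[2] = p[2];     // copy R, G, B
		}
	}
}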

If anyone has any tips on optimising the procedure please let me know.

Cheers.

the framerate hit probably comes from uploading the pixel data to the video card. if you can resize the image in memory before uploading it (but not with the relatively high-quality resize() function, which is slow), it should be much faster.

to do this, if you’re downscaling by an integer fraction (1/n), you can skip fancy bicubic resize methods and just quickly run through the array taking a moving average of each n+1 pixels, first in columns and then in rows, applying an appropriate pixel-value scaling kernel. that should be very fast.

here’s my code that does this for a greyscale 50% resize. it uses libcvd, but hopefully you’ll be able to apply it to the getPixels() of an ofImage (just remember to call update() afterwards). i’ve made a few notes in the code about the conversion to ofImage; it should be very straightforward. more difficult will be converting to 3 bytes per pixel (RGB) instead of just 1: every pixel array offset (e.g. image.data() + n) will need to be multiplied by 3, and the increments in the row processing (the second half) will need to be 3 instead of just ++. there’s a rough RGB sketch after the listing.

  
  
Image<byte> ImagePyramid::fastResize( Image<byte> input )  
{  
	// create image for output  
	Image<byte> output(input.size()/2);  
  
	int w = input.size().x;  
	int h = input.size().y;  
	// stride is the number of bytes per row; it will just be the image
	// width (w) if you change this to use openFrameworks getPixels()
	int stride = input.row_stride();
	// ensure even width/height by throwing away the last row/column if necessary  
	if ( w % 2 != 0 )
		w--;
	if ( h % 2 != 0 )
		h--;
	int dstride = output.row_stride();  
	int dw = output.size().x;  
	  
	// separate the kernel processing into columns then rows for efficiency  
	int i,j;  
	// columns  
	for (j=0;j<w;j+=2) {  
		// input.data() is the same as ofImage getPixels()  
		byte* src = input.data()+j;  
		byte* end = src + stride*(h-2);  
		byte* dst = output.data()+j/2;  
		// pointer arithmetic - this will step through pixel by pixel until  
		// src is pointing at the same pixel as end  
		while (src != end) {  
			byte sum = (byte)(0.25*(src[0] + src[2*stride])
					   + 0.5*src[stride]);
			// assign sum to the pixel at dst  
			*(dst) = sum;  
			src += stride*2;  
			dst += dstride;  
		}  
	}  
	// rows  
	for (i=(h/2)-3;i>=0;i--) {  
		// we have already done column update and put the result in output  
		// so now read from and write to output only.  
		byte* src = output.data()+i*dstride;  
		byte* end = src + dw-2;  
		while (src != end) {  
			byte sum= (byte)(0.25*(src[0]+src[2])  
					   + 0.5*src[1]);  
			*(src+1*dstride+1)=sum;  
			++src;  
		}  
	}  
	  
	// resized  
	return output;  
}  
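
and here’s a rough, untested sketch of the RGB version for data straight from getPixels(). note it swaps the 3-tap kernel above for a plain 2x2 box average per output pixel, which keeps the pointer maths simpler and should be fine for a 50% reduction (the function name and arguments are just placeholders):

void halveRGB(const unsigned char * src, unsigned char * dst, int srcW, int srcH){

	int dstW = srcW / 2;
	int dstH = srcH / 2;
	int srcStride = srcW * 3;                          // bytes per source row

	for (int y = 0; y < dstH; y++){
		const unsigned char * row0 = src + (y * 2) * srcStride;  // top row of the 2x2 block
		const unsigned char * row1 = row0 + srcStride;           // row directly below it
		unsigned char * out = dst + y * dstW * 3;
		for (int x = 0; x < dstW; x++){
			int i = x * 2 * 3;                         // byte offset of the block's left pixel
			for (int c = 0; c < 3; c++){               // average the four neighbours per channel
				out[x * 3 + c] = (unsigned char)((row0[i + c] + row0[i + c + 3]
				                 + row1[i + c] + row1[i + c + 3]) / 4);
			}
		}
	}
}

after that you can setFromPixels() the small buffer straight into your ofImage and never touch the graphics card at all.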
  


Thanks for the tip. You’re right, the overhead was loading it into graphics memory.
Cheers,
a.