Texture3D usable in Fragment Shader


I don't even remember why I templated it :stuck_out_tongue:

Did you try just using the ofxVolumetrics example?

That kind of visual artifact usually happens when you are using an incorrect stride (how many bytes each pixel uses). There’s not much to do about it in the shader, so I would guess that it has to do with how you are allocating the memory in the texture.
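To make the stride point concrete, here is a minimal sketch in plain C++ (no openFrameworks calls; the function names are made up for illustration, and tightly packed 8-bit channels are assumed). It shows how the byte layout of a slice depends on the channel count: if the data is RGB but the upload is told it is RGBA, every row is read from the wrong byte offset.

```cpp
#include <cstddef>

// Bytes occupied by one row of tightly packed 8-bit pixels.
// `channels` is 1 for grayscale, 3 for RGB, 4 for RGBA.
constexpr std::size_t rowStride(std::size_t width, std::size_t channels) {
    return width * channels;
}

// Bytes occupied by one 2D slice of the volume.
constexpr std::size_t sliceBytes(std::size_t width, std::size_t height, std::size_t channels) {
    return rowStride(width, channels) * height;
}

// Byte offset of pixel (x, y) inside a slice for a given channel count.
constexpr std::size_t pixelOffset(std::size_t x, std::size_t y,
                                  std::size_t width, std::size_t channels) {
    return y * rowStride(width, channels) + x * channels;
}
```

For a 256-pixel-wide RGB image, row 1 starts at byte 768, while a reader that assumes RGBA looks for it at byte 1024, so each row ends up shifted a little further than the one before.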

Oh my friend, thanks for the hint, that helped a lot!

I actually saw that there are multiple ways to load data into a texture3d, and one of them takes an ofPixels& and automatically determines the format to apply.

So, by changing the loadData loop to:

	for (int z = 0; z < volDepth; z++) {
// We don't even need the intermediate volumeData array

it actually loads the images correctly as a texture3d, which can then be passed to the fragment shader and interpreted correctly!

Quick view : https://video.twimg.com/ext_tw_video/1270100564114845698/pu/vid/1280x720/FpdHClxJNZOzV32c.mp4

Thanks a lot! I owe you a beer :stuck_out_tongue:

Superb. Yes, that addon is quite nice.
The video perfectly captures that Japanese duality (which I love), where you have very peaceful things (like the images) and absolutely crazy ones (the music). :smiley:

No worries.

I am trying to use @totetmatt's code as a starting point for working with texture3d. As I am on a Mac, I understand the maximum GL version I can set OF to is 4.1 (@totetmatt is using 4.5).

On compiling I get the warning “extension ‘GL_ARB_texture_rectangle’ is not supported”. Apart from that everything seems to compile and run fine, except that nothing from the image composite from the texture3D is displayed.

I can confirm the shader is running (apart from the images not being displayed) by changing the last line of the fragment shader to

outputColor=vec4(t,1.) + vec4(uv,-uv);

which shows me a slowly rotating color gradient.

Any pointers on how I could get this to work? I am using Tim Scaffidi's branch of ofxVolumetrics.

I also use a texture3D with ofxVolumetrics and a fragment shader. Since it is a 3D game of life, I need to render the shader result slice by slice back into the texture3D. Basically, I wonder whether it is possible to keep the data on the GPU (load a 2D texture into the texture3D?). At the moment I write every shader result to an ofPixels object, which I then load into the texture3D as described above (copying the data to the CPU and back to the GPU seems to be the expensive part):

	for (int z = 0; z < volDepth; z++) {

Hey did you try it with maybe openGL 3.3 or higher? It looks like totetmatt’s shader code is using OpenGL 3.2 (#version 150). Also you can run the glInfoExample which will list the available extensions. On my 2015 mbp it lists GL_ARB_texture_rectangle as an available extension.

And if it helps, have a look at this forum post about using a sampler2DArray, which worked for me (on linux) with openGL 3.3. You might be able to use it instead of a sampler3D.

Sorry, edited for readability.

I can confirm GL_ARB_texture_rectangle shows up as available on my 2019 MBP on Catalina (using the glInfoExample).

But: the OpenGL version shows up as 2.1 ATI-3.10.23 in glInfoExample, although I understand this is somewhat scrambled by Apple and not directly related to the maximum usable openGL version? Calling glGetString(GL_VERSION) in setup() shows the same version I set in main.cpp.

Is this correct: The openGL version is set in main.cpp using the following command:
settings.setGLVersion(4, 5); // → openGL Version 4.5

If I understand this correctly, GLVersion was set to (4,5) in the code I downloaded from totetmatt, while the shader code was using 3.2 (#version 150).

Setting #version to 150 and settings.setGLVersion(3, 2) still throws the warning that GL_ARB_texture_rectangle is unsupported.

Same with #version 330 and settings.setGLVersion(3, 3).
Same with #version 410 and settings.setGLVersion(4, 1).

Do you have any advice? I also will check your suggested forum post, thanks a bunch for that!

Hey I like your approach of trying this and that to eliminate possibilities. I didn’t look at totetmatt’s main.cpp so I missed that he had 4.5 specified.

Have you tried loading square textures vs rectangular ones? If I remember right, sampler2D in the shader wants square textures (that are also a power of 2). So I’m wondering if sampler3D might be the same in this regard. With the sampler2DArray, the textures can be rectangles, though they may all have to be the same size and format (GL_RGBA).

Yeah I think this is where oF figures out which openGL version to use when it’s setting up the window and the context and stuff. And the mac probably won’t like 4.5 as you’ve mentioned, but 4.1 should be fine and 3.3 is fine for sure.

Then I pasted #extension GL_ARB_texture_rectangle:enable into a misc frag shader and got the same warning, both on an intel mbp and on my m1 mini. But, the shader still ran and did what it was supposed to do.

Also I can confirm that the sampler2DArray will work on a mac. The slitscan project worked great on the m1 mini, but it didn’t work the same on the intel mbp (integrated graphics). If you want to test it just let me know and I’ll get the files to you.

My macOS experience is pretty limited, and my openGL skills are sketchy and the internet has been a huge help. That said, I’m not sure if you need the extension; you could try and see if the shader compiles and runs without it. And then I try to pair the openGL version in oF with the correct #version in the shaders (usually 3.3 and #version 330).
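The pairing of context version and shader #version mentioned here can be written down as a small lookup. This is only a sketch in plain C++; glslVersionFor is a hypothetical helper and not part of openFrameworks. The numbers follow the usual OpenGL/GLSL correspondence (3.2 pairs with 150, and from 3.3 onwards the two numbers mirror each other).

```cpp
// GLSL "#version" number that matches an OpenGL context version.
// Returns 0 for versions this sketch does not know about.
int glslVersionFor(int major, int minor) {
    if (major == 2 && minor == 0) return 110;
    if (major == 2 && minor == 1) return 120;
    if (major == 3 && minor == 0) return 130;
    if (major == 3 && minor == 1) return 140;
    if (major == 3 && minor == 2) return 150;
    // From OpenGL 3.3 on, the GLSL number mirrors the GL version.
    if (major == 3 && minor == 3) return 330;
    if (major == 4 && minor >= 0 && minor <= 6) return major * 100 + minor * 10;
    return 0;
}
```

So settings.setGLVersion(3, 3) goes with #version 330, and the 4.1 context that macOS offers goes with #version 410.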

So @Jona I think you could definitely try loading a single texture into the 3d texture, without having to load them all into tex3d and then send that to the shader. In the slitscan project (with a sampler2Darray instead of a sampler3D), the texture got updated with the newest texture from a video player:

        glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray);
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, level, xoffset, yoffset, zoffset, width, height, depth, GL_RGB, GL_UNSIGNED_BYTE, videoPlayer.getPixels().getData());
        glBindTexture(GL_TEXTURE_2D_ARRAY, 0);

        zoffset += 1;
        if(zoffset >= maxDepth) {zoffset = 0;} // valid layers are 0 .. maxDepth - 1

So it might be as easy as just substituting GL_TEXTURE_3D for GL_TEXTURE_2D_ARRAY, also like this stack overflow thread.
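The wrap-around of zoffset in the snippet above can also be written with a modulo. A minimal sketch, assuming the layers are numbered 0 to maxDepth - 1 (nextLayer is a hypothetical helper, not something from the slitscan project):

```cpp
// Advance the write position of a ring buffer of texture layers.
// Valid layer indices are 0 .. maxDepth - 1, so the index wraps
// back to 0 as soon as it would reach maxDepth.
constexpr int nextLayer(int zoffset, int maxDepth) {
    return (zoffset + 1) % maxDepth;
}
```

With maxDepth = 128, layer 127 is followed by layer 0, so the oldest slice is always the next one to be overwritten.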

@TimChi thanks. And sorry, my question was not very clear. It works for replacing only one slice of the texture3D, but for that I still need to get the pixels from the texture2D. I wonder, if I can pass a texture2D to a texture3D without using pixels (if it is the case, that texture → pixels → texture is a bottleneck). But it seems, that ofxVolumetrics only accepts pixels for loading data.

Ah sorry and yes that’s more clear. In the slitscan project I found that loading (cpu → gpu) the entire std::vector into the texture array was slow, but loading a single texture “slice” was possible and much quicker. So getting the pixels from the video player texture and using glTexSubImage3D() to send them to the texture array worked well.

So after google searching a bit I found glCopyTexSubImage3D(), which looks like it will copy the current GL_READ_BUFFER into the GL_TEXTURE_3D or similar. So you might be able to do all of the copying on the gpu if you can get the slice into the GL_READ_BUFFER.

Hey @TimChi, yes I did try the square textures sized a power of 2 (1024x1024). No luck. If I could test the slitscan project that would actually be amazing (I am basically trying to do something quite similar, a time-displacement per pixel from a displacement map, but I am only getting about 8fps on full-hd when doing this on cpu, so trying to shift it over to the gpu.)

Thanks a lot. I think that’s it. When I redraw a 256x256x99 texture every frame, the pixels method drops down to 10 fps while glCopyTexSubImage3D stays at almost 60 fps.
This works for me (I only wonder how glCopyTexSubImage3D knows which texture3D to use):

    for (int x = 0; x < volDepth; x++) {
        ofSetColor(ofRandom(255), 12, 0);
        ofDrawRectangle(ofRandom(20), ofRandom(20), 50, 50);
        ofSetColor(50, ofRandom(255), 100);
        ofDrawCircle(90 + ofRandom(20), 90 + ofRandom(20), 50);
        glCopyTexSubImage3D(GL_TEXTURE_3D, 0, 0, 0, x, 0, 0, 256, 256); // copy the current read buffer into slice x
    }

In case it’s helpful (and apologies because I haven’t followed the whole conversation): for a recent project I found it useful to have some large FBOs and draw images into them as tiles, in order to have something like a 3d texture in the shader. IIRC I had about 10 4096x4096 FBOs and was drawing 64 512x512 images in each one, so a total of 640 frames were stored and accessible in the shader. I then passed each FBO to the shader, and could grab specific pixels out, etc. Was used for these kinds of works


I will definitely try a 3d texture – but wanted to mention this 2d approach I used if helpful…
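The tiling arithmetic behind this 2d approach could look roughly like the sketch below. It is plain C++ with made-up names (TileLocation, locateFrame), and it assumes the tiles are laid out row by row from the top-left corner, which the post does not actually specify.

```cpp
// Location of one stored frame inside a set of tiled FBOs.
struct TileLocation {
    int fbo; // which FBO holds the frame
    int x;   // left edge of the tile in pixels
    int y;   // top edge of the tile in pixels
};

// Map a frame index to its tile, assuming square FBOs filled row by row.
// Defaults match the sizes from the post: 4096x4096 FBOs, 512x512 tiles.
TileLocation locateFrame(int frame, int fboSize = 4096, int tileSize = 512) {
    const int tilesPerRow = fboSize / tileSize;        // 8 with the defaults
    const int tilesPerFbo = tilesPerRow * tilesPerRow; // 64 with the defaults
    const int local = frame % tilesPerFbo;             // index inside its FBO
    return TileLocation{frame / tilesPerFbo,
                        (local % tilesPerRow) * tileSize,
                        (local / tilesPerRow) * tileSize};
}
```

With the default sizes, frame 64 is the first tile of the second FBO, and frame 639 is the last tile of the tenth. The same arithmetic would be needed in the shader to turn a frame number into a texture coordinate.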


Hey @Jona wow that’s a nice bump in performance! So it sounds like this glCopyTexSubImage3D function is working for you. I like this idea of being able to copy a texture from one thing to another in the gpu, without having to go thru an ofPixels object.

So if you look at the block of code I posted above, I think the application knows which 3d texture to use from the one that is bound with glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray), where textureArray is the GLuint handle for the array. Then you can “unbind” textureArray from GL_TEXTURE_2D_ARRAY with glBindTexture(GL_TEXTURE_2D_ARRAY, 0). I think you’re copying a snapshot of the fbo each loop iteration, right? If so, then the fbo must be bound to GL_READ_BUFFER. But, I’m speculating on this.

So this approach of using big ofFbos is pretty cool, because I don’t think all of your textures would have to be the same size. I could be wrong, but I have a feeling that the textures in the texture array all have to be the same size (or at least the size of the largest texture). But I haven’t verified this to know for sure. Using an ofFbo is a very “oF way” to do it too!

Yeah, even with 256x256x256 it runs smoothly, but with 512x512x512 I get 100% GPU load and the fps drops. But that’s also a lot of texture and I do not have the newest GPU (GTX970). And it is much better than with the pixels.
I just do not know how to get the GLuint handle from a texture, so that I can use it in:

glBindTexture(GL_TEXTURE_3D, texture3D);

Somehow it also works fine without binding the texture3D (maybe because it’s already bound?), but I guess that’s not ideal.

Yes, I am copying a snapshot of the fbo each iteration. Somehow it works without binding to GL_READ_BUFFER, but maybe that’s also not ideal.

Here is the ofxVolumetricsGPUCopyExample (adapted from the ofxVolumetricsExample - and without use of shaders yet): ofEmscriptenExamples/ofxVolumetricsGPUCopyExample at main · Jonathhhan/ofEmscriptenExamples · GitHub
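As a rough sanity check on the volume sizes mentioned above, the memory involved can be estimated with a few lines of C++. This is only a sketch: 8 bits per channel and 4 channels (RGBA) are assumptions, and volumeBytes and toMiB are made-up helper names.

```cpp
#include <cstdint>

// Bytes needed for a w x h x d volume with 8 bits per channel.
// channels = 4 (RGBA) is an assumption; use 1 or 3 if the texture
// is allocated as a single-channel or RGB format.
constexpr std::uint64_t volumeBytes(std::uint64_t w, std::uint64_t h,
                                    std::uint64_t d, std::uint64_t channels = 4) {
    return w * h * d * channels;
}

// Convert a byte count to mebibytes.
constexpr double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

Under these assumptions a 256x256x256 volume is 64 MiB, a 512x512x512 volume is 512 MiB (eight times as much), and redrawing 256x256x99 moves about 24.75 MiB per frame, which would fit with the larger volume being much heavier on the GPU.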

@mativa I was so hoping that the lack of square textures was the problem. But awesome that you tried already. And then you probably tried both normalized and non-normalized values for uniform vec2 iResolution in the fragment shader too I’ll bet.

OK here is the slitscan project, pared down a bit. It should compile and run; just add a video to the data folder for it to work on. It runs great on my m1 mini (Monterey), and a Dell laptop on linux Mint 20.2 with Intel integrated graphics (i7-6600U). It runs on my 2015 mbp retina, but only 1 “slice” is drawn (likely the newest one). So I’m not sure why. Also @Jona have a look for how to set up the openGL stuff for the array texture.


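// main.cpp
#include "ofMain.h"
#include "ofApp.h"
int main( ){
    ofGLFWWindowSettings settings;
    settings.setGLVersion(3, 3); // assumed: matches the "#version 330" shaders below
    settings.setSize(1920, 1080);
    ofCreateWindow(settings); // the window has to exist before ofRunApp()
    ofRunApp(new ofApp());
}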


// ofApp.h
#pragma once
#include "ofMain.h"

class ofApp : public ofBaseApp{
public:
    void setup();
    void update();
    void draw();

    ofVideoPlayer videoPlayer;    
    ofShader shader;

    // stuff for the sampler2DArray
    GLuint textureArray; // a handle for the array
    GLsizei maxDepth; // max number of textures in textureArray
    GLsizei numLevels; // max number of mipmap levels
    GLint level; // the mipmap level(level 0, since there is only 1 level)
    GLint xoffset; // subsection x
    GLint yoffset; // subsection y
    GLint zoffset; // the depth (level) index value into textureArray
    GLsizei width; // texture width
    GLsizei height; // texture height
    GLsizei depth; // number of layers written per glTexSubImage3D() call
};


// ofApp.cpp
#include "ofApp.h"
void ofApp::setup(){
    shader.load("common.vert", "radialSlitScan.frag");


    maxDepth = 128;
    numLevels = 1; // the number of mipmap levels
    level = 0; // level 0, the index, since there is only 1 level
    xoffset = 0; // offset for subsection
    yoffset = 0; // offset for subsection
    zoffset = 0; // this gets incremented
    width = static_cast<GLsizei>(videoPlayer.getWidth());
    height = static_cast<GLsizei>(videoPlayer.getHeight());
    depth = 1; // the texture layer; this does not get incremented

    // get a handle for textureArray, then bind the target and allocate it:
    glGenTextures(1,&textureArray); // get 1 handle and store the value in textureArray
    glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray); // bind the target
    glTexStorage3D(GL_TEXTURE_2D_ARRAY, numLevels, GL_RGB8, width, height, maxDepth); // allocate some storage for it

    // calls for good practice (?):

    // unbind the target
    glBindTexture(GL_TEXTURE_2D_ARRAY, 0);
}

void ofApp::update(){

    /* updating the sampler2DArray each time with all the textures is very slow; so use an indexing scheme in the fragment shader, and make only 1 call to glTexSubImage3D() each cycle to update the oldest texture with the newest frame from the video player */
        glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray);
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, level, xoffset, yoffset, zoffset, width, height, depth, GL_RGB, GL_UNSIGNED_BYTE, videoPlayer.getPixels().getData());
        glBindTexture(GL_TEXTURE_2D_ARRAY, 0);

        zoffset += 1;
        if(zoffset >= maxDepth) {zoffset = 0;} // valid layers are 0 .. maxDepth - 1
}

void ofApp::draw(){
    // not sure if all of these gl calls are necessary; but it might be good practice to consistently bind/enable and unbind/disable stuff
    // glActiveTexture() takes a texture unit (GL_TEXTURE0 + n), not the texture handle;
    // unit 0 matches the last argument of setUniformTexture() below.
    // glClientActiveTexture() is legacy fixed-function state and is not needed with #version 330 shaders.
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray);

    shader.begin(); // uniforms can only be set while the shader is bound
    shader.setUniformTexture("textureArray", GL_TEXTURE_2D_ARRAY, textureArray, 0);
    shader.setUniform2f("resolution", static_cast<float>(width), static_cast<float>(height));
    shader.setUniform1f("zoffset", static_cast<float>(zoffset));
    shader.setUniform1f("maxDepth", static_cast<float>(maxDepth));
    videoPlayer.draw(0.f, 0.f); // some texcoord for the shader
    shader.end();

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D_ARRAY, 0);
}

#version 330
// common.vert
uniform mat4 modelViewProjectionMatrix;
in vec4 position;
in vec2 texcoord;

out vec2 vTexCoord;

void main()
{
    gl_Position = modelViewProjectionMatrix * position;
    vTexCoord = texcoord;
}


#version 330
// radialSlitScan.frag
uniform sampler2DArray textureArray;
uniform vec2 resolution;
uniform float zoffset; // the newest layer in the textures array
uniform float maxDepth; // the max number of layers in the textures array
in vec2 vTexCoord;
out vec4 fragColor;

void main()
{
    vec2 tc = vTexCoord / resolution.xy;

    // shift and correct for the rectangular nature of the textures
    float aspectRatio = resolution.x / resolution.y;
    tc = tc * 2.0 - 1.0;
    tc.x *= aspectRatio;
    tc = tc * 0.5 + 0.5;

    float index = 1.0 - distance(vec2(0.5), tc);
    index *= maxDepth;  // radial tiles
    index += zoffset; // the newest texture first
    if(index >= maxDepth) {index -= maxDepth;} // keep floor(index) within 0 .. maxDepth - 1
    vec3 color = texture(textureArray, vec3(tc, floor(index))).rgb;

    fragColor = vec4(color, 1.0);
}
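To see what the index arithmetic in the fragment shader does, here is the same calculation ported to plain C++ as a sketch. layerFor is a made-up name, and the wrap is written with >= so that the result always stays between 0 and maxDepth - 1 for distances between 0 and 1.

```cpp
#include <cmath>

// Layer sampled for a fragment at normalized distance `dist` (0 .. 1)
// from the centre of the image. `zoffset` is the ring buffer position
// and `maxDepth` the number of layers in the texture array.
int layerFor(float dist, float zoffset, float maxDepth) {
    float index = (1.0f - dist) * maxDepth + zoffset; // radial offset plus ring position
    if (index >= maxDepth) { index -= maxDepth; }      // wrap around the ring buffer
    return static_cast<int>(std::floor(index));
}
```

With maxDepth = 128 and zoffset = 100, a fragment halfway out (dist = 0.5) reads layer 36, i.e. 64 layers on from the current ring position, wrapped around the end of the array.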

Hey @TimChi, thanks a lot for this! All of your inputs gave me an amazing starting point!

Yes, I did try to mess with both normalized and non-normalized values for vec2 iResolution, to no avail… I am quite confused about when normalized and when non-normalized values are required.

The slitscan project appears to run correctly, maxing out at 6 fps on my MacBook Pro 16" 2019 (2.3 GHz i9, 32 GB RAM, AMD Radeon Pro 5500M 8 GB, Catalina). Doubling the maxDepth halves the frame rate. This I don’t quite understand: as far as I can read your code, only one image per draw loop is uploaded to the gpu…? Interesting that this runs so much better on your m1!
The input video (1920x1080) does get squashed into a square format on my MacBook.

Also interesting: right now I get a slightly higher frame rate with my own project for heatmap-based time displacement. That one mainly uses ofImages with a for-loop and ofImage.getColor() / ofImage.setColor()… Makes me worry it will not perform better in a shader :frowning: I definitely will try though!

I tried your example for the GL_TEXTURE_2D_ARRAY that you shared earlier via the “this forum post” link. This works really well.

Hey happy to hear it runs! And also glad you found the 2-image test from the other thread. On the m1 mini, the application runs at 60 fps without a lot of load on the cpu and moderate load on the gpu. I used a 1920x1080 video (.MOV from an iPhone) for testing, and the player says it’s playing at 30 fps. But it’s strange because my linux i7-6600U laptop runs it just fine too.

The newest frame from the video gets uploaded to the texture array with the following call in ofApp::update(); zoffset tracks the most recent layer in the texture array:

glTexSubImage3D(GL_TEXTURE_2D_ARRAY, level, xoffset, yoffset, zoffset, width, height, depth, GL_RGB, GL_UNSIGNED_BYTE, videoPlayer.getPixels().getData());

And then I don’t think the texture array transfers between the cpu and gpu in the shader, because it’s stored on the gpu and the shader gets access to it (what and where it is) with:

shader.setUniformTexture("textureArray", GL_TEXTURE_2D_ARRAY, textureArray, 0);

You can “un-square” the video by commenting out these lines in the fragment shader:

    // shift and correct for the rectangular nature of the textures
    float aspectRatio = resolution.x / resolution.y;
    tc = tc * 2.0 - 1.0;
    tc.x *= aspectRatio;
    tc = tc * 0.5 + 0.5;

I get confused by normalized vs un-normalized coords too, and square vs rectangle, etc. It’s interesting that the sampler2DArray coords are normalized, so that using values of 0.0 - 1.0 in the texture() call gets the correct color.
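As far as I understand it, the difference comes down to the sampler type: rectangle textures (sampler2DRect, which oF uses by default for ofTexture) are addressed in pixels, while sampler2D and sampler2DArray expect 0.0 - 1.0. The sketch below mirrors the two coordinate steps of the fragment shader in plain C++; Vec2, toNormalized and correctAspect are made-up names for illustration.

```cpp
// Minimal 2D vector, standing in for GLSL's vec2.
struct Vec2 {
    float x;
    float y;
};

// Pixel coordinates (0 .. width, 0 .. height) to normalized coordinates (0 .. 1),
// the same step as "vTexCoord / resolution.xy" in the shader.
Vec2 toNormalized(Vec2 pixel, Vec2 resolution) {
    return Vec2{pixel.x / resolution.x, pixel.y / resolution.y};
}

// The aspect correction from the shader: x is scaled around the centre (0.5)
// by width / height, while y is left unchanged.
Vec2 correctAspect(Vec2 tc, Vec2 resolution) {
    const float aspect = resolution.x / resolution.y;
    const float x = ((tc.x * 2.0f - 1.0f) * aspect) * 0.5f + 0.5f;
    return Vec2{x, tc.y};
}
```

For a 1920x1080 frame the centre pixel (960, 540) becomes (0.5, 0.5) and stays there after the correction, while the right edge (x = 1.0) moves out to about 1.39, past the end of the normalized range.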

OK well at least you’ve got a place to start. I had to comment the code so heavily so I wouldn’t forget how it all comes together. If you can figure out why it runs so differently on the mbps then please post back! I’ve tried a few things but haven’t been able to figure it out. Maybe it’s just an Apple thing though.