Texture3D usable in Fragment Shader

So this approach of using big ofFbos is pretty cool, because I don’t think all of your textures would have to be the same size. I could be wrong, but I have a feeling that the textures in a texture array all have to be the same size (or at least the size of the largest texture), though I haven’t verified this. Using an ofFbo is a very “oF way” to do it too!
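(For what it’s worth: with a GL_TEXTURE_2D_ARRAY allocated via glTexStorage3D, every layer does share the same width and height, fixed at allocation time, so differently-sized textures would have to be resized or written into a sub-region of a layer.)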

Yeah, even with 256x256x256 it runs smoothly, but with 512x512x512 I get 100% GPU load and the fps drops. But that’s also a lot of texture data, and I don’t have the newest GPU (a GTX 970). It’s still much better than going through the pixels.
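For scale, a rough estimate of the memory involved (assuming 4 bytes per texel, e.g. RGBA8): 256 x 256 x 256 x 4 B = 64 MiB, while 512 x 512 x 512 x 4 B = 512 MiB, an 8x jump in data that also has to be sampled per fragment.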
I just do not know how to get the GLuint handle from a texture, so that I can use it in:

glBindTexture(GL_TEXTURE_3D, texture3D);

Somehow it also works fine without binding the texture3D (maybe because it’s already bound?), but I guess that’s not ideal.
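For what it’s worth, a minimal sketch of one way to get at the raw handle, assuming the volume lives in an already-allocated ofTexture (getTextureData() exposes the underlying GL id; the names here are made up):

ofTexture volumeTex; // assumed: allocated elsewhere as a 3D texture
GLuint texture3D = volumeTex.getTextureData().textureID; // the raw GL handle

glBindTexture(GL_TEXTURE_3D, texture3D); // bind explicitly before raw GL calls
// ... glTexSubImage3D() / glCopyTexSubImage3D() etc. ...
glBindTexture(GL_TEXTURE_3D, 0); // unbind when done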

Yes, I am copying a snapshot of the fbo each iteration. Somehow it works without setting GL_READ_BUFFER, but maybe that’s also not ideal.
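If it’s useful, here is a sketch of how that per-frame GPU-side copy can look, assuming an ofFbo named fbo and an allocated 3D texture handle texture3D (glCopyTexSubImage3D reads from whatever is bound as the read framebuffer):

glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo.getId()); // source: the fbo
glReadBuffer(GL_COLOR_ATTACHMENT0); // read from its first color attachment
glBindTexture(GL_TEXTURE_3D, texture3D); // destination: the volume texture
// copy the fbo contents into layer zoffset of the volume
glCopyTexSubImage3D(GL_TEXTURE_3D, 0, 0, 0, zoffset, 0, 0,
                    static_cast<GLsizei>(fbo.getWidth()), static_cast<GLsizei>(fbo.getHeight()));
glBindTexture(GL_TEXTURE_3D, 0);
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);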

Here is the ofxVolumetricsGPUCopyExample (adapted from the ofxVolumetricsExample - and without use of shaders yet): ofEmscriptenExamples/ofxVolumetricsGPUCopyExample at main · Jonathhhan/ofEmscriptenExamples · GitHub

@mativa I was so hoping that the lack of square textures was the problem. But awesome that you already tried that. And I’ll bet you also tried both normalized and non-normalized values for uniform vec2 iResolution in the fragment shader.

OK, here is the slitscan project, pared down a bit. It should compile and run; just add a video to the data folder for it to work on. It runs great on my M1 mini (Monterey) and on a Dell laptop running Linux Mint 20.2 with Intel integrated graphics (i7-6600U). It runs on my 2015 retina MacBook Pro, but only 1 “slice” is drawn (likely the newest one), and I’m not sure why. Also @Jona, have a look for how to set up the OpenGL stuff for the array texture.

main.cpp:

#include "ofMain.h"
#include "ofApp.h"
//========================================================================
int main( ){
    ofGLFWWindowSettings settings;
    settings.setGLVersion(3,3);
    settings.setSize(1920, 1080);
    ofCreateWindow(settings);
    ofRunApp(new ofApp());
}

ofApp.h:

#pragma once
#include "ofMain.h"

class ofApp : public ofBaseApp{
public:
    void setup();
    void update();
    void draw();

    ofVideoPlayer videoPlayer;    
    ofShader shader;

    // stuff for the sampler2DArray
    GLuint textureArray; // a handle for the array
    GLsizei maxDepth; // max number of textures in textureArray
    GLsizei numLevels; // max number of mipmap levels
    GLint level; // the mipmap level (level 0, since there is only 1 level)
    GLint xoffset; // subsection x
    GLint yoffset; // subsection y
    GLint zoffset; // the depth (level) index value into textureArray
    GLsizei width; // texture width
    GLsizei height; // texture height
    GLsizei depth; // texture depth
};

ofApp.cpp

#include "ofApp.h"
//--------------------------------------------------------------
void ofApp::setup(){
    shader.load("common.vert", "radialSlitScan.frag");

    videoPlayer.load("video.MOV");
    videoPlayer.play();
    videoPlayer.update();

    maxDepth = 128;
    numLevels = 1; // the number of mipmap levels
    level = 0; // level 0, the index, since there is only 1 level
    xoffset = 0; // offset for subsection
    yoffset = 0; // offset for subsection
    zoffset = 0; // this gets incremented
    width = static_cast<GLsizei>(videoPlayer.getWidth());
    height = static_cast<GLsizei>(videoPlayer.getHeight());
    depth = 1; // number of layers written per upload; this does not get incremented

    // get a handle for textureArray, then bind the target and allocate it:
    glGenTextures(1,&textureArray); // get 1 handle and store the value in textureArray
    glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray); // bind the target
    glTexStorage3D(GL_TEXTURE_2D_ARRAY, numLevels, GL_RGB8, width, height, maxDepth); // allocate some storage for it

    // filtering and wrap parameters; probably good practice to set these explicitly:
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    // unbind the target
    glBindTexture(GL_TEXTURE_2D_ARRAY,0);
}
//--------------------------------------------------------------
void ofApp::update(){
    videoPlayer.update();

    /* Updating the sampler2DArray each time with all the textures is very slow;
       so use an indexing scheme in the fragment shader, and make only 1 call to
       glTexSubImage3D() each cycle to update the oldest texture with the newest
       frame from the video player. */
    if(videoPlayer.isFrameNew())
    {
        glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray);
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, level, xoffset, yoffset, zoffset, width, height, depth, GL_RGB, GL_UNSIGNED_BYTE, videoPlayer.getPixels().getData());
        glBindTexture(GL_TEXTURE_2D_ARRAY, 0);

        zoffset += 1;
        if(zoffset >= maxDepth) {zoffset = 0;} // wrap: valid layers are 0 .. maxDepth - 1
    }
}
//--------------------------------------------------------------
void ofApp::draw(){
    // ofShader::setUniformTexture() below binds the texture to a texture unit and
    // sets the sampler uniform, so no manual glActiveTexture()/glBindTexture() is
    // needed before shader.begin(); glEnable(GL_TEXTURE_2D_ARRAY) and
    // glClientActiveTexture() are not valid calls in the 3.3 core profile anyway

    shader.begin();
    shader.setUniformTexture("textureArray", GL_TEXTURE_2D_ARRAY, textureArray, 0);
    shader.setUniform2f("resolution", static_cast<float>(width), static_cast<float>(height));
    shader.setUniform1f("zoffset", static_cast<float>(zoffset));
    shader.setUniform1f("maxDepth", static_cast<float>(maxDepth));
    videoPlayer.draw(0.f, 0.f); // draws a quad whose texcoords feed the shader
    shader.end();

    // reset the texture binding for cleanliness
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D_ARRAY, 0);
}

common.vert:

#version 330
uniform mat4 modelViewProjectionMatrix;
in vec4 position;
in vec2 texcoord;

out vec2 vTexCoord;

void main()
{
    gl_Position = modelViewProjectionMatrix * position;
    vTexCoord = texcoord;
}

radialSlitScan.frag:

#version 330
uniform sampler2DArray textureArray;
uniform vec2 resolution;
uniform float zoffset; // the newest layer in the textures array
uniform float maxDepth; // the max number of layers in the textures array
in vec2 vTexCoord;
out vec4 fragColor;

void main()
{
    vec2 tc = vTexCoord / resolution.xy;

    // shift and correct for the rectangular nature of the textures
    float aspectRatio = resolution.x / resolution.y;
    tc = tc * 2.0 - 1.0;
    tc.x *= aspectRatio;
    tc = tc * 0.5 + 0.5;

    float index = 1.0 - distance(vec2(0.5), tc);
    index *= maxDepth;  // radial tiles
    index += zoffset; // the newest texture first
    if(index >= maxDepth) {index -= maxDepth;} // wrap back into the valid layer range
    vec3 color = texture(textureArray, vec3(tc, floor(index))).rgb;

    fragColor = vec4(color, 1.0);
}

Hey @TimChi, thanks a lot for this! All of your inputs gave me an amazing starting point!

Yes, I did try both normalized and non-normalized values for vec2 iResolution, to no avail… I’m still quite confused about when normalized and when non-normalized values are required.

The slitscan project appears to run correctly, maxing out at 6 fps on my MacBook Pro 16" 2019 (2.3 GHz i9, 32 GB RAM, AMD Radeon Pro 5500M 8 GB, Catalina). Doubling maxDepth halves the frame rate, which I don’t quite understand: as far as I can tell from your code, only one image per draw loop is uploaded to the GPU…? Interesting that this runs so much better on your M1!
The input video (1920x1080) does get squashed into a square format on my MacBook.

Also interesting: right now I get a slightly higher frame rate with my own project for heatmap-based time displacement. That one mainly uses ofImages with a for-loop and ofImage.getColor()/ofImage.setColor()… Makes me worry it won’t perform better in a shader :( I definitely will try though!
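Roughly, that CPU loop looks like this (a simplified sketch with made-up names, not the actual project code):

// per-pixel time displacement on the CPU (all names hypothetical)
ofImage heatmap;              // grayscale map: brightness selects the delay
std::vector<ofImage> history; // buffer of past video frames, oldest to newest
ofImage output;

for(int y = 0; y < output.getHeight(); y++){
    for(int x = 0; x < output.getWidth(); x++){
        // map heatmap brightness (0..255) to an index into the frame history
        float b = heatmap.getColor(x, y).getBrightness();
        int frameIndex = static_cast<int>(b / 255.f * (history.size() - 1));
        output.setColor(x, y, history[frameIndex].getColor(x, y));
    }
}
output.update(); // re-upload the modified pixels to the GPU

Every getColor()/setColor() touches pixels on the CPU, and the whole image then has to be re-uploaded each frame, which is why the shader version should eventually win.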

I tried the GL_TEXTURE_2D_ARRAY example you shared earlier, under the link in this forum post. It works really well.

Hey, happy to hear it runs! And also glad you found the 2-image test from the other thread. On the M1 mini, the application runs at 60 fps without a lot of load on the CPU and moderate load on the GPU. I used a 1920x1080 video (a .MOV from an iPhone) for testing, and the player says it’s playing at 30 fps. But it’s strange, because my Linux laptop (i7-6600U) runs it just fine too.

The newest frame from the video gets uploaded to the texture array with the following call in ofApp::update(); zoffset tracks the most recent layer in the texture array:

glTexSubImage3D(GL_TEXTURE_2D_ARRAY, level, xoffset, yoffset, zoffset, width, height, depth, GL_RGB, GL_UNSIGNED_BYTE, videoPlayer.getPixels().getData());

And then I don’t think the texture array transfers between the CPU and GPU each frame, because it’s stored on the GPU and the shader just gets access to it (what and where it is) with:

shader.setUniformTexture("textureArray", GL_TEXTURE_2D_ARRAY, textureArray, 0);
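Under the hood that call amounts to something like this (a sketch of the equivalent GL calls, not oF’s exact implementation):

// roughly what setUniformTexture("textureArray", GL_TEXTURE_2D_ARRAY, textureArray, 0) does:
glActiveTexture(GL_TEXTURE0 + 0);                 // select texture unit 0
glBindTexture(GL_TEXTURE_2D_ARRAY, textureArray); // bind the array texture to that unit
shader.setUniform1i("textureArray", 0);           // point the sampler at unit 0

So the pixel data itself stays resident on the GPU; only the unit index crosses over each frame.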

You can “un-square” the video by commenting out these lines in the fragment shader:

    // shift and correct for the rectangular nature of the textures
    float aspectRatio = resolution.x / resolution.y;
    tc = tc * 2.0 - 1.0;
    tc.x *= aspectRatio;
    tc = tc * 0.5 + 0.5;

I get confused by normalized vs. un-normalized coords too, and square vs. rectangle, etc. It’s interesting that the sampler2DArray coords are normalized, so that using values of 0.0 - 1.0 in the texture() call gets the correct color.

OK, well at least you’ve got a place to start. I had to comment the code heavily so I wouldn’t forget how it all comes together. If you can figure out why it runs so differently on the MacBook Pros, please post back! I’ve tried a few things but haven’t been able to figure it out. Maybe it’s just an Apple thing though.