Body Extraction ARKit

Hi,

After looking for a way to use the body extraction algorithms from ARKit in OF, I decided to try to implement it myself.

It now works on iOS 13.0 and above with OF iOS 0.11, and you can find it here.

Three textures are extracted from each frame using the ARMatteGenerator: the alpha matte, the body depth, and the frame (scene) depth.
I was helped by this repo by @aferriss, this post on Stack Overflow, and this sample code from Apple, as well as hints from @sortofsleepy, to understand how it works.
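For context, the generation side looks roughly like this. This is a sketch: device, commandQueue, and frame are assumed to be set up elsewhere and the names are illustrative, but the ARMatteGenerator calls are the actual ARKit API:

#import <ARKit/ARKit.h>
#import <Metal/Metal.h>

// one-time setup (device is an id<MTLDevice>)
ARMatteGenerator *matteGenerator =
    [[ARMatteGenerator alloc] initWithDevice:device
                             matteResolution:ARMatteResolutionFull];

// per frame: generate the alpha matte and the dilated body depth
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLTexture> alphaMatte = [matteGenerator generateMatteFromFrame:frame
                                                     commandBuffer:commandBuffer];
id<MTLTexture> bodyDepth  = [matteGenerator generateDilatedDepthFromFrame:frame
                                                            commandBuffer:commandBuffer];
[commandBuffer commit];

// the frame (scene) depth is already a CVPixelBuffer on the ARFrame
CVPixelBufferRef frameDepth = frame.estimatedDepthData;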

To convert the first two to OF:

- (CVOpenGLESTextureRef)convertFromMTLToOpenGL:(id<MTLTexture>)texture
                                  pixel_buffer:(CVPixelBufferRef)pixel_buffer
                            _videoTextureCache:(CVOpenGLESTextureCacheRef)vidTextureCache
{
    int width  = (int)texture.width;
    int height = (int)texture.height;
    // NSLog(@"texture pixelFormat : %lu width : %d height : %d", (unsigned long)texture.pixelFormat, width, height);

    // copy the Metal texture into the CPU-accessible pixel buffer
    CVPixelBufferLockBaseAddress(pixel_buffer, 0);
    void * CV_NULLABLE pixelBufferBaseAddress = CVPixelBufferGetBaseAddress(pixel_buffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixel_buffer);

    [texture getBytes:pixelBufferBaseAddress
          bytesPerRow:bytesPerRow
           fromRegion:MTLRegionMake2D(0, 0, width, height)
          mipmapLevel:0];

    size_t w = CVPixelBufferGetWidth(pixel_buffer);
    size_t h = CVPixelBufferGetHeight(pixel_buffer);

    // wrap the pixel buffer in an OpenGL ES texture via the texture cache
    CVOpenGLESTextureRef texGLES = nil;
    CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                                vidTextureCache,
                                                                pixel_buffer,
                                                                nil,
                                                                GL_TEXTURE_2D,
                                                                GL_LUMINANCE,
                                                                (GLsizei)w,
                                                                (GLsizei)h,
                                                                GL_LUMINANCE,
                                                                GL_UNSIGNED_BYTE,
                                                                0,
                                                                &texGLES);
    if (err != kCVReturnSuccess) {
        // NSLog(@"error on CVOpenGLESTextureCacheCreateTextureFromImage");
    }

    CVPixelBufferUnlockBaseAddress(pixel_buffer, 0);
    CVOpenGLESTextureCacheFlush(vidTextureCache, 0);

    // correct wrapping and filtering
    glBindTexture(CVOpenGLESTextureGetTarget(texGLES), CVOpenGLESTextureGetName(texGLES));
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glBindTexture(CVOpenGLESTextureGetTarget(texGLES), 0);

    return texGLES;
}
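And this is roughly how I call the conversion each frame, once the matte has been rendered. The waitUntilCompleted and the CFRelease of the previous frame's texture are my assumptions about synchronization and ownership, not something the API mandates:

// a sketch, per frame, after generating the matte (names illustrative)
[commandBuffer commit];
[commandBuffer waitUntilCompleted]; // make sure Metal has finished writing the matte

if (_texMatteAlpha) {
    CFRelease(_texMatteAlpha); // release last frame's GLES texture
}
_texMatteAlpha = [self convertFromMTLToOpenGL:alphaMatte
                                 pixel_buffer:pixel_bufferAlphaMatte
                           _videoTextureCache:_videoTextureCache];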

with the CVPixelBufferRef created using:

//=================================================================================
// create a pixel buffer whose size and pixel format match the MTLTexture:
// alpha matte at full resolution: 1920 x 1440,
// format (10) MTLPixelFormatR8Unorm, corresponding to
// kCVPixelFormatType_OneComponent8 for the pixel buffer
//=================================================================================

cvret = CVPixelBufferCreate(kCFAllocatorDefault,
                            1920, 1440,
                            kCVPixelFormatType_OneComponent8,
                            (__bridge CFDictionaryRef)pixelBufferAttributes,
                            &pixel_bufferAlphaMatte);
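The pixelBufferAttributes dictionary isn't shown above; something like the following should do, assuming the texture cache wants an IOSurface-backed, GLES-compatible buffer:

NSDictionary *pixelBufferAttributes = @{
    (id)kCVPixelBufferOpenGLESCompatibilityKey : @YES,
    (id)kCVPixelBufferIOSurfacePropertiesKey   : @{} // IOSurface-backed buffer
};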

It took me ages to figure out the texture conversion, so I'm leaving it here for future reference; maybe it will help someone.

The camera depth, on the other hand, is already formatted as a CVPixelBuffer, so I just needed to grab a reference to it.
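Since it already is a CVPixelBuffer (kCVPixelFormatType_DepthFloat32), I believe it can be wrapped straight through the same texture cache, skipping the Metal copy step; roughly:

// a sketch: wrap the camera depth buffer directly (names illustrative)
CVPixelBufferRef depthBuffer = frame.estimatedDepthData;
CVOpenGLESTextureRef texDepth = NULL;
CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(
    kCFAllocatorDefault,
    _videoTextureCache,
    depthBuffer,
    nil,
    GL_TEXTURE_2D,
    GL_LUMINANCE,                                  // single channel
    (GLsizei)CVPixelBufferGetWidth(depthBuffer),
    (GLsizei)CVPixelBufferGetHeight(depthBuffer),
    GL_LUMINANCE,
    GL_FLOAT,                                      // 32-bit float depth
    0,
    &texDepth);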

The matte textures are mapped to the camera image, not the scene, so they need remapping using a CGAffineTransform, which is passed as a uniform to the shader (see the sketch below).
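The transform itself can be derived from ARFrame's displayTransformForOrientation:viewportSize:, inverted so it maps display UVs back into camera image space (the orientation and viewport below are assumptions):

// a sketch: display UV -> camera image UV
CGSize viewport = CGSizeMake(ofGetWidth(), ofGetHeight());
CGAffineTransform cameraToDisplay =
    [frame displayTransformForOrientation:UIInterfaceOrientationPortrait
                             viewportSize:viewport];
CGAffineTransform displayToCamera = CGAffineTransformInvert(cameraToDisplay);
// displayToCamera.a/b/c/d go into cAffineCamABCD, .tx/.ty into cAffineCamTxTy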
Shader code to process it follows.

Vertex shader:

attribute vec2 position;

uniform vec4 cAffineCamABCD;
uniform vec2 cAffineCamTxTy;

varying vec2 vUv;
varying vec2 vUvCam;

// https://developer.apple.com/documentation/coregraphics/cgaffinetransform
vec2 affineTransform(vec2 uv, vec4 coeff, vec2 offset){
    return vec2(uv.s * coeff.x + uv.t * coeff.z + offset.x,
                uv.s * coeff.y + uv.t * coeff.w + offset.y);
}

const vec2 scale = vec2(0.5, 0.5);

void main(){
    // map the fullscreen quad positions [-1,1] to texture coordinates [0,1]
    vec2 uV = position.xy * scale + scale;
    vUv = vec2(uV.s, 1.0 - uV.t);
    // remap into camera image space for the matte / depth textures
    vUvCam = affineTransform(vUv, cAffineCamABCD, cAffineCamTxTy);

    gl_Position = vec4(position, 0.0, 1.0);
}

Fragment shader:

precision highp float;

varying vec2 vUv;
varying vec2 vUvCam;

uniform sampler2D tex;          // camera image
uniform sampler2D texAlphaBody; // body alpha matte
uniform sampler2D texDepthBody; // dilated body depth
uniform sampler2D texDepth;     // frame (scene) depth

uniform mat4 u_CameraProjectionMat;

uniform float u_time;

void main(){
    vec4  sceneColor = texture2D(tex, vUv);
    float sceneDepth = texture2D(texDepth, vUvCam).r;

    float alpha              = texture2D(texAlphaBody, vUvCam).r;
    float dilatedLinearDepth = texture2D(texDepthBody, vUvCam).r;

    // convert the linear body depth to normalized depth via the camera projection matrix
    float dilatedDepth = clamp((u_CameraProjectionMat[2][2] * -dilatedLinearDepth + u_CameraProjectionMat[3][2])
                             / (u_CameraProjectionMat[2][3] * -dilatedLinearDepth + u_CameraProjectionMat[3][3]),
                               0.0, 1.0);

    // show the occluder only where the body is closer than the scene (forwardZ case)
    float showOccluder = step(dilatedDepth, sceneDepth);

    // camera color is a sine of the actual color * time
    vec4 cameraColor = vec4(sceneColor.r + abs(sin(u_time)),
                            sceneColor.g + abs(cos(u_time)),
                            sceneColor.b,
                            sceneColor.a) * dilatedDepth;

    vec4 occluderResult = mix(sceneColor, cameraColor, alpha);
    vec4 mattingResult  = mix(sceneColor, occluderResult, showOccluder);
    gl_FragColor = mattingResult;
}

with the uniforms set from the OF side:

// get and draw textures
auto _tex = [_view getConvertedTexture];

#ifdef ARBodyTrackingBool_h
auto _texMatteAlpha = [_view getConvertedTextureMatteAlpha];
auto _texMatteDepth = [_view getConvertedTextureMatteDepth];
auto _texDepth      = [_view getConvertedTextureDepth];
// affine transform used to remap the matte textures
CGAffineTransform cAffine = [_view getAffineCameraTransform];
#endif

if(_tex){
    shader.begin();
    shader.setUniformTexture("tex", CVOpenGLESTextureGetTarget(_tex), CVOpenGLESTextureGetName(_tex), 0);

#ifdef ARBodyTrackingBool_h
    if(_texMatteAlpha) shader.setUniformTexture("texAlphaBody", CVOpenGLESTextureGetTarget(_texMatteAlpha), CVOpenGLESTextureGetName(_texMatteAlpha), 1);
    if(_texMatteDepth) shader.setUniformTexture("texDepthBody", CVOpenGLESTextureGetTarget(_texMatteDepth), CVOpenGLESTextureGetName(_texMatteDepth), 2);
    if(_texDepth)      shader.setUniformTexture("texDepth",     CVOpenGLESTextureGetTarget(_texDepth),     CVOpenGLESTextureGetName(_texDepth),     3);
    // affine coefficients for the texture remapping
    shader.setUniform4f("cAffineCamABCD", float(cAffine.a), float(cAffine.b), float(cAffine.c), float(cAffine.d));
    shader.setUniform2f("cAffineCamTxTy", float(cAffine.tx), float(cAffine.ty));

    shader.setUniform1f("u_time", ofGetElapsedTimef());
    shader.setUniformMatrix4f("u_CameraProjectionMat", getProjectionMatrix());
#endif

    mesh.draw();
    shader.end();
}

I know I could have done something a bit neater, maybe by processing some of the textures so they could be used directly as OF textures, without any remapping?

I'm not sure how to do that and might leave it for now, but I thought I'd share my progress here.

The textures can be accessed through the ARProcessor from the ofApp using:

CVOpenGLESTextureRef getTextureMatteAlpha();
CVOpenGLESTextureRef getTextureMatteDepth();
CVOpenGLESTextureRef getTextureDepth();
ofMatrix3x3 getAffineTransform();
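For example, from ofApp::draw() one of them can be bound the same way as above (processor here stands in for however the ARProcessor instance is held):

// a sketch: bind the alpha matte from the ofApp (names illustrative)
CVOpenGLESTextureRef alphaTex = processor->getTextureMatteAlpha();
if (alphaTex) {
    shader.begin();
    shader.setUniformTexture("texAlphaBody",
                             CVOpenGLESTextureGetTarget(alphaTex),
                             CVOpenGLESTextureGetName(alphaTex), 1);
    mesh.draw();
    shader.end();
}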

I wanted to make things like this, this, or that, and I'm pretty sure this is a step closer to doing so? @zach?

++


Nice, thanks. I'll look into it in detail at some point; I couldn't figure this out when I tried a few months ago.

Just an FYI - this code is on the develop branch now if anyone is interested in trying this out.