Flask server hosting GAN model to OF app

This is a bit of a combo Python/OF question, but I think this community is uniquely suited to help. I have a Flask server that listens for a JSON object containing a 512-dimensional latent vector, then uses it as input to a Generator model to produce an image. At the moment it saves the image to disk, but what I want is for it to pass the image to an ofApp. I have a couple of concerns: How do I package the data on the Python end? Is there a way for this to be fast (RunwayML can generate a 1024x1024 image on my machine in about 60 ms)? On the OF end, how do you reassemble the data into an image? Any guidance is appreciated.
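
For context, the request the server below expects is just a JSON body with a single "z" field holding 512 floats. A minimal test client (hypothetical; the endpoint and payload shape are taken from the Flask code that follows):

import requests
import numpy as np

# hypothetical test client: POST a 512-dimensional latent vector as JSON
z = np.random.randn(512).tolist()
r = requests.post("http://localhost:9000/json", json={"z": z})
print(r.status_code)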

from flask import Flask, jsonify, request, send_file
import json
import io
from flask_cors import CORS
import pickle
import dnnlib
import torch
import PIL.Image
import legacy

device = torch.device('cuda')
with dnnlib.util.open_url("snapshot.pkl") as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device) # type: ignore

# initialize our Flask application
app = Flask(__name__)
CORS(app)

@app.route("/test", methods=['GET', 'POST', 'PUT'])
def test():
    return "OK"

@app.route("/json", methods=['GET', 'POST', 'PUT'])
def getjsondata():

    if request.method=='POST':
        print("received POST")

        data = request.get_json()

        #print(format(data['z']))
        jzf = [float(i) for i in data['z']]
        jzft = torch.FloatTensor(jzf)
        jzftr = jzft.reshape([1, 512])

        z = jzftr.cuda()
        c = None                   # class labels (not used)
        trunc = 1
        img = G(z, c, trunc)       # generator forward pass; trunc is passed as truncation_psi
        # scale the generator output from roughly [-1, 1] to uint8 RGB (HWC) and save to disk
        img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
        PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save('newtest.png')

    return 'OK'


if __name__ == '__main__':
    app.run(host='localhost', port=9000, debug=True)
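
Since send_file and io are already imported above but unused, one option (just a sketch, not what I ended up doing) would be to stream the PNG straight from memory instead of saving it to disk:

def serve_pil_image(pil_img):
    # hypothetical helper: write the PIL image into an in-memory buffer
    # and return it as a PNG response, avoiding the disk round trip
    img_io = io.BytesIO()
    pil_img.save(img_io, 'PNG')
    img_io.seek(0)
    return send_file(img_io, mimetype='image/png')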

I managed a solution; for now the receiving app runs in a browser, but I will move it to OF.

# additional imports needed for this version
from io import BytesIO
import base64
from torchvision import transforms

@app.route("/json", methods=['GET', 'POST', 'PUT'])
def getjsondata():

    if request.method=='POST':
        # print("received POST")

        data = request.get_json()

        #print(format(data['z']))
        jzf = [float(i) for i in data['z']]
        jzft = torch.FloatTensor(jzf)
        jzftr = jzft.reshape([1, 512])

        z = jzftr.cuda()
        c = None                   # class labels (not used in this example)
        trunc = 1
        img = G(z, c, trunc)

        #img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
        img = (img * 127.5 + 128).clamp(0, 255).to(torch.uint8)

        # turn into PIL image
        pil_img = transforms.ToPILImage()(img[0]).convert("RGB")
        #pil_img = PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB')
        pil_img.save('newtest.png')

        response = serve_pil_image64(pil_img)
        response.headers.add('Access-Control-Allow-Origin', '*')
        return response


    return 'OK'

def serve_pil_image64(pil_img):
    img_io = BytesIO()
    pil_img.save(img_io, 'JPEG', quality=70)
    img_str = base64.b64encode(img_io.getvalue()).decode("utf-8")
    return jsonify({'status': True, 'image': img_str})
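
To get the pixels back on the receiving end, the same payload just needs the reverse path: base64-decode the "image" field, then decode the JPEG. A quick Python check of that round trip (hypothetical client; on the OF side the equivalent step is decoding the received bytes into an ofImage):

import base64
import io
import requests
import PIL.Image

# hypothetical client: request an image and reassemble it from the JSON payload
resp = requests.post("http://localhost:9000/json", json={"z": [0.0] * 512}).json()
jpeg_bytes = base64.b64decode(resp["image"])
img = PIL.Image.open(io.BytesIO(jpeg_bytes))
print(img.size, img.mode)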

Importantly, the image is served as base64 inside a JSON object. On the JavaScript side:

// construct an HTTP request
var xhr = new XMLHttpRequest();

// upon successful completion of request...
xhr.onreadystatechange = function() {
    if (xhr.readyState == XMLHttpRequest.DONE) {
        var json = JSON.parse(xhr.responseText);
        // console.log(json);
        document.getElementById("image_output").src = "data:image/jpeg;base64," + json.image;

    }
}


xhr.open("POST", "http://localhost:9000/json");
xhr.setRequestHeader('Content-Type', 'application/json; charset=UTF-8');

// send the 512-value latent vector as the request body
// (zArray is assumed to hold the 512 floats)
xhr.send(JSON.stringify({ z: zArray }));

Absolute stab in the dark, but I'd say NDI is great for that:
https://github.com/buresu/ndi-python

I’m gonna try this route myself soon.
Then on the OF side, just use ofxNDI.
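
On the Python side, sending frames could look roughly like this (a sketch adapted from the send example in the ndi-python repo; exact names may differ between versions):

import numpy as np
import NDIlib as ndi

# sketch: publish frames as an NDI source named 'gan-out'
if not ndi.initialize():
    raise RuntimeError('NDI could not be initialized')

send_settings = ndi.SendCreate()
send_settings.ndi_name = 'gan-out'
sender = ndi.send_create(send_settings)

frame = ndi.VideoFrameV2()
img = np.zeros((1024, 1024, 4), dtype=np.uint8)   # BGRX pixel buffer
frame.data = img
frame.FourCC = ndi.FOURCC_VIDEO_TYPE_BGRX

ndi.send_send_video_v2(sender, frame)             # call once per generated frame

ndi.send_destroy(sender)
ndi.destroy()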

Hope this helps,

Best,

P

This is a good idea. I was trying to recreate the functionality of the RunwayML “local” app (and did, in the end), but I hadn’t thought about why they use network interfacing in the first place, or why a Syphon- or NDI-style approach might work better. It might speed things up, too. Thanks!