Driving multiple oscillators (hundreds) to create a spectrogram reader

I’m working on a project to create a spectrogram reader for iOS and I’m coming up against some issues. I’m wondering if someone can take a look at my approach and offer some suggestions, or suggest a different method for approaching this problem.

The basic idea for the spectrogram reader is to take a vertical slice from the center of the camera input in grayscale and, for now, map the brightness of the pixels to the amplitudes of an array of oscillators. Right now I’m thresholding so that values above 200 read as white and anything below reads as black. The array is the same size as the number of vertical pixels (so for an iPhone 4, this would be 480). I’m able to get this working on OS X (MacBook Air) for testing with passable results (there are some artifacts in the audio that I’d like to clean up, and if anyone has suggestions for that I’d really appreciate them, but for now I’m concerned with just a working proof of concept). From top to bottom, the oscillators have frequencies from 2000 Hz down to 80 Hz. Moving a drawing of a diagonal line across the camera, from bottom to top, produces a rising pitch, and vice versa, which is the expected behavior. When I try to build this for my iPhone 4 and iPhone 6, though, the audio playback stops.

If I manually drop the number of oscillators down to, say, 50, audio does work, so my first guess is that there are simply too many oscillators for my iPhone to process. I’m wondering if my method is broken, or if there might be a better way to achieve what I’m trying to do. As a point of reference, I think what I’d like to do is recreate the behavior of the oscbank~ object in Max/MSP. Right now I’m still just trying to prototype the idea…

Below I’ll post both projects’ code. The first is for OS X and the second is for iOS. They’re nearly identical apart from platform-specific things. I’ve also put the iOS code on GitHub if that’s faster/easier to work with.

This is with openFrameworks 0.8.4 on Yosemite for OS X and iOS 8 for iPhone.

Please let me know if this needs clarification and if you have any suggestions or tips!

Tagging @admsyn in case he knows any platform-specific tricks for audio on iOS :wink:

– OSX ofApp.h –

#pragma once
#include "ofMain.h"
#include "ofxOpenCv.h"
#include "oscillator.h"

class ofApp : public ofBaseApp{

public:
	void setup();
	void update();
	void draw();

	void keyPressed(int key);
	void keyReleased(int key);
	void mouseMoved(int x, int y );
	void mouseDragged(int x, int y, int button);
	void mousePressed(int x, int y, int button);
	void mouseReleased(int x, int y, int button);
	void windowResized(int w, int h);
	void dragEvent(ofDragInfo dragInfo);
	void gotMessage(ofMessage msg);


    //video and opencv
    int camW;
    int camH;
    ofVideoGrabber grabber;
    ofxCvColorImage	colorImg;
    ofxCvGrayscaleImage grayImage;
    unsigned char* grayImagePixels;
    std::vector<int> grayscaleVerticalLine;
    
    //audio
    void audioOut(float * output, int bufferSize, int nChannels);
    ofSoundStream stream;
    std::vector<oscillator> oscillators;
    
};

– OSX ofApp.cpp –

#include "ofApp.h"

//--------------------------------------------------------------
void ofApp::setup(){
    
    
    camW = 320;
    camH = 240;
    
    grabber.initGrabber(camW, camH);
    
    camW = grabber.getWidth();
    camH = grabber.getHeight();
    
    colorImg.allocate(camW, camH);
    grayImage.allocate(camW, camH);
    cout << "of w: " << ofGetWidth() << " of h: " << ofGetHeight() << endl;
    
    //setup our grayimage vertical line vector, throw all black (0) into it
    for (int y=0; y<grayImage.getHeight(); y++){
        grayscaleVerticalLine.push_back(0);
    }
    
    //setup audio
    int sampleRate = 44100;
    int bufferSize = 512;
    ofSoundStreamSetup(2, 0, this, sampleRate, bufferSize, 4);
    
    
    int numberofOscillators = grayImage.getHeight();
    
    for (int i=0; i<numberofOscillators; i++){
        oscillator osc;
        osc.setup(44100);
        osc.setVolume(0.5);
        
        //scale freq for osc depending on how many we have
        osc.setFrequency(ofMap(i, 0, numberofOscillators-1, 2000, 80));
        oscillators.push_back(osc);
    }
    
    cout << "oscillators size: " << oscillators.size() << endl;
    
    for (int i = 0; i<oscillators.size(); i++){
        cout << "osc #" << i << ": freq: " << oscillators[i].getFrequency() << endl;
    }
    
}

//--------------------------------------------------------------
void ofApp::update(){
    
    
    grabber.update();
    
    if (grabber.isFrameNew()) {
        if (grabber.getPixels() != NULL) {
            
            //make a grayscale image out of our video grabber
            colorImg.setFromPixels(grabber.getPixels(), camW, camH);
            grayImage = colorImg;
            
            //collect the grayscale values from the center vertical line in the grayscale image, store in a vector
            grayImagePixels = grayImage.getPixels();
            
            
            for( int y = 0; y<grayImage.getHeight(); y++){
                int position = grayImage.getWidth()/2 + (y * grayImage.getWidth());
                
                float invertedGrayscaleValue = 255 - grayImagePixels[position];
                invertedGrayscaleValue = invertedGrayscaleValue > 200 ? 255 : 0;
                grayscaleVerticalLine[y] = invertedGrayscaleValue;
                
            }
            
            //change osc volumes depending on the black/white value in the vertical line from the camera
            for (int i = 0; i<oscillators.size(); i++) {
                
                float new_volume = ofMap(grayscaleVerticalLine[i], 0, 255, 0.0, 1.0);
                oscillators[i].setVolume(new_volume);
            }
            
            
        }
    }


}

//--------------------------------------------------------------
void ofApp::draw(){
    
    grayImage.draw(0, 0);
    
    ofMesh verticalLine;
    verticalLine.setMode(OF_PRIMITIVE_LINE_STRIP);
    verticalLine.enableColors();
    
    for (int i=0; i<grayscaleVerticalLine.size(); i++) {
        verticalLine.addVertex(ofVec3f(ofGetWidth()/2, i, 0)); // change position of line here
        
        float invertedGrayscaleColor = (255 - grayscaleVerticalLine[i]) / 255.0;
        verticalLine.addColor(ofFloatColor(invertedGrayscaleColor,
                                           invertedGrayscaleColor,
                                           invertedGrayscaleColor,
                                           1.0));
    }
    
    verticalLine.draw();
}

//--------------------------------------------------------------
void ofApp::keyPressed(int key){

}

//--------------------------------------------------------------
void ofApp::keyReleased(int key){

}

//--------------------------------------------------------------
void ofApp::mouseMoved(int x, int y ){

}

//--------------------------------------------------------------
void ofApp::mouseDragged(int x, int y, int button){

}

//--------------------------------------------------------------
void ofApp::mousePressed(int x, int y, int button){

}

//--------------------------------------------------------------
void ofApp::mouseReleased(int x, int y, int button){

}

//--------------------------------------------------------------
void ofApp::windowResized(int w, int h){

}

//--------------------------------------------------------------
void ofApp::gotMessage(ofMessage msg){

}

//--------------------------------------------------------------
void ofApp::dragEvent(ofDragInfo dragInfo){ 

}

//--------------------------------------------------------------
void ofApp::audioOut(float * output, int bufferSize, int nChannels){
    
    int totalSize = oscillators.size();
    
    for (int i = 0; i < bufferSize; i++){
        
        float sample = 0;
        
        //sum every oscillator's current sample, then normalize by the count
        for (int j = 0; j < totalSize; j++) {
            sample += oscillators[j].getSample();
        }
        
        sample = sample / totalSize;
        
        output[i*nChannels    ] = sample;
        output[i*nChannels + 1] = sample;
        
    }
    
}

– iOS ofApp.h –

#pragma once

#include "ofMain.h"
#include "ofxiOS.h"
#include "ofxiOSExtras.h"
#include "ofxOpenCv.h"
#include "oscillator.h"

class ofApp : public ofxiOSApp {
	
    public:
        void setup();
        void update();
        void draw();
        void exit();
	
        void touchDown(ofTouchEventArgs & touch);
        void touchMoved(ofTouchEventArgs & touch);
        void touchUp(ofTouchEventArgs & touch);
        void touchDoubleTap(ofTouchEventArgs & touch);
        void touchCancelled(ofTouchEventArgs & touch);

        void lostFocus();
        void gotFocus();
        void gotMemoryWarning();
        void deviceOrientationChanged(int newOrientation);
    
        //video and opencv
        int camW;
        int camH;
        ofVideoGrabber grabber;
        ofxCvColorImage	colorImg;
        ofxCvGrayscaleImage grayImage;
        unsigned char* grayImagePixels;
        std::vector<int> grayscaleVerticalLine;
    
        //audio
        void audioOut(float * output, int bufferSize, int nChannels);
        ofSoundStream stream;
        std::vector<oscillator> oscillators;

};

– iOS ofApp.mm –

#include "ofApp.h"

//--------------------------------------------------------------
void ofApp::setup(){
    
    camW = ofGetWidth();
    camH = ofGetHeight();

    grabber.initGrabber(camW, camH);
    
    camW = grabber.getWidth();
    camH = grabber.getHeight();
    
    colorImg.allocate(camW, camH);
    grayImage.allocate(camW, camH);
    cout << "of w: " << ofGetWidth() << " of h: " << ofGetHeight() << endl;
    
    //setup our grayimage vertical line vector, throw all black (0) into it
    for (int y=0; y<grayImage.getHeight(); y++){
        grayscaleVerticalLine.push_back(0);
    }
    
    //setup audio
    int sampleRate = 44100;
    int bufferSize = 512;
    ofSoundStreamSetup(2, 0, this, sampleRate, bufferSize, 4);
    
    
    int numberofOscillators = grayImage.getHeight();
    
    for (int i=0; i<numberofOscillators; i++){
        oscillator osc;
        osc.setup(44100);
        osc.setVolume(0.5);
        
        //scale freq for osc depending on how many we have
        osc.setFrequency(ofMap(i, 0, numberofOscillators-1, 2000, 80));
        oscillators.push_back(osc);
    }
    
    cout << "oscillators size: " << oscillators.size() << endl;
    
    for (int i = 0; i<oscillators.size(); i++){
        cout << "osc #" << i << ": freq: " << oscillators[i].getFrequency() << endl;
    }
    
}

//--------------------------------------------------------------
void ofApp::update(){
    
    grabber.update();
    
    if (grabber.isFrameNew()) {
        if (grabber.getPixels() != NULL) {
            
            //make a grayscale image out of our video grabber
            colorImg.setFromPixels(grabber.getPixels(), camW, camH);
            grayImage = colorImg;
            
            //collect the grayscale values from the center vertical line in the grayscale image, store in a vector
            grayImagePixels = grayImage.getPixels();
            
            
            for( int y = 0; y<grayImage.getHeight(); y++){
                int position = grayImage.getWidth()/2 + (y * grayImage.getWidth());
                
                float invertedGrayscaleValue = 255 - grayImagePixels[position];
                invertedGrayscaleValue = invertedGrayscaleValue > 200 ? 255 : 0;
                grayscaleVerticalLine[y] = invertedGrayscaleValue;
                
            }
            
            //change osc volumes depending on the black/white value in the vertical line from the camera
            for (int i = 0; i<oscillators.size(); i++) {
                float new_volume = ofMap(grayscaleVerticalLine[i], 0, 255, 0.0, 1.0);
                oscillators[i].setVolume(new_volume);
            }
    
        }
    }

}

//--------------------------------------------------------------
void ofApp::draw(){
            
    grayImage.draw(0, 0);
    
    ofMesh verticalLine;
    verticalLine.setMode(OF_PRIMITIVE_LINE_STRIP);
    verticalLine.enableColors();
    
    for (int i=0; i<grayscaleVerticalLine.size(); i++) {
        verticalLine.addVertex(ofVec3f(ofGetWidth()/2, i, 0));
        
        float invertedGrayscaleColor = (255 - grayscaleVerticalLine[i]) / 255.0;
        verticalLine.addColor(ofFloatColor(invertedGrayscaleColor,
                                           invertedGrayscaleColor,
                                           invertedGrayscaleColor,
                                           1.0));
    }
    
    verticalLine.draw();

}

//--------------------------------------------------------------
void ofApp::exit(){

}

//--------------------------------------------------------------
void ofApp::touchDown(ofTouchEventArgs & touch){

}

//--------------------------------------------------------------
void ofApp::touchMoved(ofTouchEventArgs & touch){

}

//--------------------------------------------------------------
void ofApp::touchUp(ofTouchEventArgs & touch){

}

//--------------------------------------------------------------
void ofApp::touchDoubleTap(ofTouchEventArgs & touch){

}

//--------------------------------------------------------------
void ofApp::touchCancelled(ofTouchEventArgs & touch){
    
}

//--------------------------------------------------------------
void ofApp::lostFocus(){

}

//--------------------------------------------------------------
void ofApp::gotFocus(){

}

//--------------------------------------------------------------
void ofApp::gotMemoryWarning(){

}

//--------------------------------------------------------------
void ofApp::deviceOrientationChanged(int newOrientation){

}

//--------------------------------------------------------------
void ofApp::audioOut(float * output, int bufferSize, int nChannels){
    
    int totalSize = oscillators.size(); // change to any number for testing (50, 100 etc)
    
    for (int i = 0; i < bufferSize; i++){
        
        float sample = 0;
        
        //sum every oscillator's current sample, then normalize by the count
        for (int j = 0; j < totalSize; j++) {
            sample += oscillators[j].getSample();
        }
        
        sample = sample / totalSize;
        
        output[i*nChannels    ] = sample;
        output[i*nChannels + 1] = sample;
        
    }
    
}

Both projects use a simple oscillator class based on ofZach’s example code:

– oscillator.h –

//
//  oscillator.h
//  musicPrints-iOS-openFrameworks
//
//  Created by Johann Diedrick on 12/13/14.
//
//

//class referenced from ofZach's week two course at avsys 2012
//https://github.com/ofZach/avsys2012/tree/master/week2_oscillatorTypes

#pragma once

#include "ofMain.h"


class oscillator{
    
public:
    
    enum{
        sineWave, squareWave, triangleWave, sawWave, inverseSawWave
    } waveType;
    
    int type;
    
    int sampleRate;
    float frequency;
    float volume;
    float phase;
    float phaseAdder;
    
    void setup (int sampRate);
    void setFrequency (float freq);
    void setVolume (float vol);
    float getFrequency();
    
    float getSample();
    
};

– oscillator.cpp –

//
//  oscillator.cpp
//  musicPrints-iOS-openFrameworks
//
//  Created by Johann Diedrick on 12/13/14.
//
//
//class referenced from ofZach's week two course at avsys 2012
//https://github.com/ofZach/avsys2012/tree/master/week2_oscillatorTypes

#include "oscillator.h"

void oscillator::setup (int sampRate){
    sampleRate = sampRate;
    type = sineWave;
    frequency = volume = 0;
    phase = phaseAdder = 0; //these were never initialized anywhere else
}

void oscillator::setFrequency(float freq){
    frequency = freq;
    phaseAdder = (float)(frequency * TWO_PI) / (float)sampleRate;
}

void oscillator::setVolume(float vol){
    volume = vol;
}

float oscillator::getFrequency(){
    return frequency;
}

float oscillator::getSample(){
    phase += phaseAdder;
    while (phase>TWO_PI) {
        phase-=TWO_PI;
    }
    
    if (type == sineWave) {
        return sin(phase) * volume;
    } else if (type==squareWave){
        return (sin(phase) > 0 ? 1 : -1) * volume;
    } else if (type==triangleWave){
        float pct = phase / TWO_PI;
        return (pct < 0.5 ? ofMap(pct, 0, 0.5, -1, 1) : ofMap(pct, 0.5, 1.0, 1, -1)) * volume;
    } else if (type==sawWave){
        float pct = phase / TWO_PI;
        return ofMap (pct, 0, 1, -1, 1) * volume;
    } else if (type==inverseSawWave){
        float pct = phase / TWO_PI;
        return ofMap(pct, 0, 1, 1, -1) * volume;
    }
    
    return 0; //fall through for an unrecognized wave type
}

Hey @jdiedrick, I haven’t read the code in depth, but here are some thoughts:

  • If you haven’t yet, it’d be good to look into FFTs, since they’re the basis of spectrograms
  • Following that, check out the principles of Additive Synthesis, since it sounds pretty similar to the system you’re building
  • A viable path to take here would be to create an additive synthesis system by treating your pixels as FFT values and running an inverse FFT to produce a waveform. I’ve got a quick-hack version of that over in the description for this video, though it wasn’t really written to be read by anyone but me :slight_smile:
  • The issue you’re running into is probably just that the iPhone can’t generate the samples fast enough. The specific issue is probably that you’re generating each sample via sin(), which is slow-ish. Try a lookup table: there’s a simple lookup table technique described in the ofBook (rough sketch just below this list)
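
To make the lookup table idea concrete, here’s a rough sketch (not the exact ofBook code, and the class name is just a placeholder): build one shared sine table at startup, then have each oscillator step through it with a phase index instead of calling sin() every sample.

#include <vector>
#include <math.h>

// One shared sine table for every oscillator; each oscillator keeps only a
// phase position into the table instead of calling sin() per sample.
class WavetableOsc {
public:
    static std::vector<float> table;

    static void buildTable(int size){
        table.resize(size);
        for (int i = 0; i < size; i++){
            table[i] = sinf(6.28318530718f * i / size);
        }
    }

    void setup(int sampRate){
        sampleRate = sampRate;
        phase = 0;
        phaseInc = 0;
        volume = 0;
    }

    void setFrequency(float freq){
        phaseInc = freq * table.size() / sampleRate;   // table steps per audio sample
    }

    void setVolume(float vol){
        volume = vol;
    }

    float getSample(){
        phase += phaseInc;
        if (phase >= table.size()) phase -= table.size();
        return table[(int)phase] * volume;             // nearest-sample lookup, no interpolation
    }

private:
    int   sampleRate;
    float phase, phaseInc, volume;
};

std::vector<float> WavetableOsc::table;

You’d build the table once in setup() (e.g. WavetableOsc::buildTable(4096)) and the rest of the per-oscillator code stays the same, just with no sin() calls left in the audio callback.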

Good luck, let me know how it goes or if you want any clarification!

@admsyn Thanks for the direction! I read through the ofBook example, and I’m still reading through that Additive Synthesis link. I know just enough about FFTs to know how they work… I don’t think that’s entirely what I need right now. For what I’m doing presently, I’m assigning a certain frequency to the top of my screen and sweeping down to a bottom frequency. Down the line I think I’d want to use that FFT/iFFT method once we know more about where the “spectrogram” images are coming from, but for now I’m just using basic black ink drawings for testing, so the frequency range can be arbitrary at this point.

I looked at your video and code as well, and I think we’re doing similar things. What I’m trying to do is a much more stripped-down version of what you’re doing, though :slight_smile: I’ll keep looking over it to see if I can gain some insight from it…

I took your advice and implemented a lookup table based on your example. It definitely helped clean up the sound “artifacts” that were present before, but I still feel like it’s not fast enough for what I’m trying to do.

I’m able to get, say, 50 sine waves going well, but anything above that seems to choke the device (I’m working on an iPhone 6 now). I tried dropping the sample rate to 50 and the buffer size to 8 (just experimenting), and it did seem to help a little, but not in a significant way. I feel like all the heavy lifting is happening here:

//--------------------------------------------------------------
void ofApp::audioOut(float * output, int bufferSize, int nChannels){
    
    int totalSize = oscillators.size(); // change to any number for testing (50, 100 etc)
    
    for (int i = 0; i < bufferSize; i++){
        
        float sample = 0;
        
        for (int j = 0; j < totalSize; j++) {
            sample += oscillators[j].getWavetableSample();
        }
        
        sample = sample / totalSize;
        
        output[i*nChannels    ] = sample;
        output[i*nChannels + 1] = sample;
        
    }
}

where I’m adding up all of the samples from the oscillators and dividing by the number of oscillators. Is this the most efficient way to do this? Do you have any suggestions on a better, faster method?
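
One thing I might try in the meantime is skipping oscillators whose volume is zero, since the thresholding leaves most rows silent at any given moment. A rough, untested sketch of what I mean (assuming the volume member is still public in my updated oscillator class):

//same loop as before, but rows that thresholded to black (volume == 0) cost nothing
void ofApp::audioOut(float * output, int bufferSize, int nChannels){

    int totalSize = oscillators.size();

    for (int i = 0; i < bufferSize; i++){

        float sample = 0;

        for (int j = 0; j < totalSize; j++) {
            if (oscillators[j].volume > 0.0f){
                sample += oscillators[j].getWavetableSample();
            }
        }

        sample = sample / totalSize;

        output[i*nChannels    ] = sample;
        output[i*nChannels + 1] = sample;
    }
}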

My updated code can be found here with more comments:

Let me know if this makes sense!

I haven’t had time to thoroughly think about this or do the math, but supposing you only need sine waves for each of your oscillators, using an inverse FFT to generate your waveform should save a lot of computation, and I second @admsyn’s recommendation. Without going too deep into the theory, here are the relevant parts:

  • Any discrete signal (a digital audio buffer, in this case) can be represented as a sum of sinusoids that start at the DC value and increase in frequency up to the maximum frequency that can be represented (which is half your sample rate)
  • A forward FFT takes an input buffer (a number of samples in time) and finds the magnitude and phase of each sinusoid; this gives you a frequency-domain representation of that particular buffer, and it is usually how spectral analysis is done
  • An inverse FFT converts the magnitudes and phases back into the time domain (a series of samples). If the spectrum is modified in some way, the resynthesized version comes out different (this forms the basis of FFT-based equalization, which in practice requires a lot of other nuances to work properly)

Now, disregarding the “analysis” part involving the forward FFT, it would be possible to construct your signal in the frequency domain and apply the inverse FFT to get the time-domain signal. This way, you can compute an entire buffer by going through your list of frequencies once, defining their amplitudes and phases, and then performing a single iFFT to generate the output buffer representing all of your oscillators at once. One limitation is that you can no longer give each oscillator an arbitrary frequency the way you currently do when you construct them and add them up: they are restricted to the frequencies defined by the FFT, which start at DC and go all the way up to the maximum possible frequency (sample rate / 2) in even steps, with the number of steps equal to half the FFT size*. For a spectrogram, though, this is exactly what you’d want anyway. The other tricky thing here is keeping track of the phase of each oscillator between buffers, since knowing only the amplitude and frequency of an oscillator within each buffer is not enough to create continuous transitions across buffers; this is generally referred to as “phase unwrapping” in the related literature.

*For example: if your sample rate is 44100, your maximum frequency is 22050, and if you use an FFT size of 1024, you end up with frequencies of:

0, 22050/512, 2 × 22050/512, 3 × 22050/512 … all the way up to 22050
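
To make that concrete, here is a rough sketch of the bookkeeping involved. The names are made up, and the per-sample sine loop is only a stand-in for a real inverse FFT (on iOS you would likely use the Accelerate framework or an FFT addon for that step), but the bin frequencies and the per-buffer phase tracking are the parts that matter:

#include <vector>
#include <math.h>

// Additive synthesis on the FFT bin grid, with per-bin phase carried across
// buffers so there are no discontinuities at buffer boundaries. The O(N^2)
// per-sample loop below stands in for a proper inverse FFT.
struct BinBank {
    int sampleRate;
    int numBins;                   // fftSize / 2 bins, from DC up to (not including) Nyquist
    std::vector<float> amp;        // one amplitude per bin, e.g. taken from your pixel column
    std::vector<float> phase;      // running phase per bin, kept between buffers

    BinBank(int sr, int fftSize)
        : sampleRate(sr), numBins(fftSize / 2),
          amp(fftSize / 2, 0.0f), phase(fftSize / 2, 0.0f) {}

    float binFrequency(int k) const {
        return (float)k * sampleRate / (2.0f * numBins);   // k * sampleRate / fftSize
    }

    void fillBuffer(float * output, int bufferSize, int nChannels){
        const float twoPi = 6.28318530718f;

        for (int n = 0; n < bufferSize; n++){
            float sample = 0.0f;
            for (int k = 1; k < numBins; k++){              // skip DC (k == 0)
                if (amp[k] == 0.0f) continue;               // silent bins cost nothing
                float w = twoPi * binFrequency(k) / sampleRate;
                sample += amp[k] * sinf(phase[k] + w * n);
            }
            sample /= numBins;
            output[n*nChannels    ] = sample;               // assumes a stereo stream
            output[n*nChannels + 1] = sample;
        }

        // advance each bin's phase by one buffer's worth of rotation,
        // so the next buffer starts exactly where this one left off
        for (int k = 1; k < numBins; k++){
            float w = twoPi * binFrequency(k) / sampleRate;
            phase[k] = fmodf(phase[k] + w * bufferSize, twoPi);
        }
    }
};

In practice you would fill amp[k] from your pixel column each frame (mapping each row to its nearest bin) and replace the per-sample sine sum with an inverse FFT of size 2 × numBins; the phase update at the end is what keeps successive buffers continuous.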