Pitch detection


I want to get pitches from a WAV file.

When I used ofGetSpectrum(), I couldn’t get an exact frequency number like 440 Hz or something…

If I can get all the frequency info every 20 ms, I can map each frequency to the piano scale… I guess.

What do I need to do to get the frequency info?

I need your help…

Thanks in advance.


ps: http://vimeo.com/1932816

It’s exactly what I want.


Hi Jung-im,

The data returned by ofGetSpectrum() is frequency data, you just need to know what it means.

If you’re sampling at 44.1 kHz, and ofGetSpectrum() returns 512 bins, then those bins are spread evenly from 0 to 44.1/2 kHz = 22.05kHz

bin 0 = 22,050 Hz * 0 / 512 = 0 Hz
bin 1 = 22,050 Hz * 1 / 512 = 43 Hz
bin 2 = 22,050 Hz * 2 / 512 = 86 Hz
bin 3 = 22,050 Hz * 3 / 512 = 129 Hz

bin 511 = 22,050 Hz * 511 / 512 = 22,006 Hz

If you don’t need to know precise frequency, and just which “pitch class” a sound falls into, a simple pitch detection algorithm involves taking the 512 frequency bins and placing them in 12 pitch bins, and picking the strongest.
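
A rough sketch of that pitch-class idea, under a few assumptions: `spectrum[i]` holds the magnitude of bin i, the bins run evenly from 0 Hz to half the sample rate, and notes follow the usual MIDI convention (69 = A4 = 440 Hz). The function name `strongestPitchClass` is invented for this example.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: sums every frequency bin into one of 12 pitch bins
// and returns the strongest one (0 = C, 1 = C#, ... 9 = A, ... 11 = B).
int strongestPitchClass(const std::vector<float>& spectrum, float sampleRate) {
    float pitchBins[12] = {0.f};
    const float nyquist = sampleRate * 0.5f;
    for (std::size_t i = 1; i < spectrum.size(); i++) {   // skip bin 0 (0 Hz)
        float frequency = nyquist * i / spectrum.size();
        // Nearest MIDI note number: 69 is A4 (440 Hz), 12 notes per octave.
        long midi = std::lround(69.0 + 12.0 * std::log2(frequency / 440.0));
        if (midi < 0) continue;                           // below the MIDI range
        pitchBins[midi % 12] += spectrum[i];
    }
    int strongest = 0;
    for (int p = 1; p < 12; p++) {
        if (pitchBins[p] > pitchBins[strongest]) strongest = p;
    }
    return strongest;
}
```

Because the bins are linearly spaced, the lowest octaves are covered by only a handful of bins, so this works better for mid and high notes than for bass.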

I think I’ve seen another post on this forum about pitch detection, but I can’t find it now…



You are my HERO!

if you do need pitch, here’s an example which uses the aubio (aubio.org) library to detect the pitch :


a student in the workshop we’re giving in the netherlands needed an example of pitch detection:


it’s an osx project – and comes from code from this class:


there’s some nice audio visual code there, look for “code” examples…

  • z

Wow, thanks Zach. I have been working on trying to get something like this for a while, and will continue on my own this summer; but this is definitely helpful.

Hello Zach,

The source code you linked is really good. aubio is a great library.

Thanks a lot!

Hello Kyle,

I want to implement what you suggested.

I think aubio provides a file named “fftOctaveAnalyzer.cpp”.

There’s a formula which maps frequency to MIDI note number:

m = 69 + 12 * log2(freq / 440)

But, as you explained to me, the first frequency bin (bin 1) maps to F1,

and the next bin maps to F2, so I have no choice but to lose the pitches in between, like F#, G, A…

What can I do to get all the MIDI pitches?

any advice will save one girl!!

many thanks!


Hey Jung-im, glad to help :)

I think you probably want frequency-to-pitch conversion, so you can do bin to frequency to pitch.

Here’s some code that should do that:

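#include <cmath>

// Converts a frequency in Hz to an octave, a note index and an offset in cents.
// Notes are counted from A (0 = A, 1 = A#, ... 11 = G#), so 440 Hz is octave 4, note 0.
void frequencyToNote(float frequency, int& octave, int& note, float& cents) {
	float x = logf(frequency / 440.f) / logf(2.f) + 4.f; // octaves, with A4 = 4.0
	octave = (int) floorf(x);
	x -= octave;
	x *= 12.f; // semitones above the A of this octave
	note = (int) roundf(x);
	x -= note;
	cents = x * 100.f;
	if (note == 12) { // rounded up to the A of the next octave
		note = 0;
		octave++;
	}
}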

(I haven’t tested this.)
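
For MIDI note numbers specifically, the formula quoted earlier in the thread (with a base-2 logarithm) can be chained with the bin-to-frequency mapping. This is a sketch only: `frequencyToMidi` and `binToMidi` are invented names, and the bins are assumed to be spread evenly from 0 Hz to half the sample rate.

```cpp
#include <cmath>

// MIDI note number (possibly fractional) for a frequency in Hz.
// 69 is A4 (440 Hz) and each semitone adds 1.
float frequencyToMidi(float frequency) {
    return 69.f + 12.f * std::log2(frequency / 440.f);
}

// Hypothetical helper: nearest MIDI note for a bin, assuming `numBins` bins
// spread evenly from 0 Hz to sampleRate / 2. Returns -1 for bin 0 (0 Hz),
// which has no pitch.
int binToMidi(int bin, int numBins, float sampleRate) {
    if (bin <= 0) return -1;
    float frequency = (sampleRate * 0.5f) * bin / numBins;
    return (int) std::lround(frequencyToMidi(frequency));
}
```

With 512 bins at 44.1 kHz, bins 1, 2 and 3 land on MIDI notes 29 (F1), 41 (F2) and 48 (C3), which is exactly the gap problem described in the question: the conversion works, but the low bins are too coarse to separate neighbouring notes.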


I always thought frequency and pitch were the same?

I have a question on this topic that I’m sure many people will have too.

Basically I want to take the sound buffer and get the overall pitch and amplitude (volume).

I downloaded Kyle’s ofxFft.

I get the buffer like this:
ofSoundStreamSetup( 0, 1, this, 44100, 512, 4 );

and start the fft like:
fft = ofxFft::create(512, OF_FFT_WINDOW_HAMMING, OF_FFT_BASIC);

This makes 257 bins (this->binSize = (signalSize / 2) + 1;) in ofxFft.

What’s the best way to get a single frequency number from this?

How can I get the amplitude of the sound?

Many thanks

Frequency is a measurable physical phenomenon, while pitch is a psychoacoustic one. Given a single sine wave, we hear a single pitch at that frequency. Given multiple sine waves, we might identify the loudest one as the pitch. With a more complex sound, it’s unclear what we will identify as the pitch, or whether we will even identify the sound as “pitched” at all.

That said, finding the dominant frequency bin will give you some useful results for simple signals like whistling. Simply search the 257 bins for the largest value, and then do a transform from the bin number (index) to the bin’s frequency center (described in my last post in this thread).

The amplitude of that frequency bin will be the value of the frequency bin.

The amplitude of the sound as a whole might be calculated by using the RMS power of the frame or by summing the amplitude bins.
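
A minimal sketch of those three steps (peak search, bin to frequency, RMS). The function names are made up for illustration and are not part of ofxFft. It assumes the amplitude array has fftSize / 2 + 1 entries (257 for a 512-sample FFT), with bin 0 at 0 Hz and the last bin at half the sample rate.

```cpp
#include <algorithm>
#include <cmath>
#include <iterator>
#include <vector>

// Index of the largest value in an array of FFT amplitudes.
int findPeakBin(const std::vector<float>& amplitudes) {
    return (int) std::distance(amplitudes.begin(),
        std::max_element(amplitudes.begin(), amplitudes.end()));
}

// Center frequency of a bin, assuming numBins = fftSize / 2 + 1.
float binCenterFrequency(int bin, int numBins, float sampleRate) {
    return bin * sampleRate / (2.f * (numBins - 1));
}

// RMS level of one frame of audio samples (0 for an empty frame).
float rms(const std::vector<float>& frame) {
    if (frame.empty()) return 0.f;
    float sumOfSquares = 0.f;
    for (float sample : frame) sumOfSquares += sample * sample;
    return std::sqrt(sumOfSquares / frame.size());
}
```

The value at `amplitudes[findPeakBin(amplitudes)]` is then the amplitude of the dominant bin, and `rms()` applied to the raw audio buffer gives an overall level.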

I hope this explains things!


Thanks Kyle.

Does this look right to you?

At 44100 Hz and a buffer of 512, there are 257 bins in the FFT.

This means 44100 / 257 = 171.59 Hz per bin (range). The middle frequency of a bin is 85.79 Hz (171.59 / 2).

To get the middle frequency of, say, bin 17:

(171.59 * 17) + 85.79 = 3002.82

Does that look OK?

Here is a video…

Although noisy, it gives us the loudest frequency at that peak.

By the way, I noticed this when using fft->draw()

See it flare up randomly? I don’t see this when I copy the fftOutput and draw it in my own class. The only thing I can see within draw() that could cause it is getAmplitude()? (I am using the basic FFT, OF_FFT_BASIC.)


The flare ups are probably some issue with threading: drawing before the audio is normalized.

Your peak detection looks correct, but your math for calculating the frequency of a bin is wrong.

The first bin will be the 0 Hz component (a constant DC offset to the signal).

The remaining bins (256 total) cover the 44.1 kHz / 2 = 22.05 kHz of audio available, so the bins are spaced 22,050 Hz / 256 = 86.13 Hz apart.

Bin 0 is centered on 0 Hz.

Bin 1 is centered on 86.13 Hz.

Bin 17 is centered on 17 * 86.13 Hz = 1464 Hz.

The range of bin 17 is roughly 1421 Hz to 1507 Hz (half a bin spacing either side of the center).

I just added a couple things to the addon to help people out in the future:

float getAmplitudeAtBin(float bin);  
float getBinFromFrequency(float frequency, float sampleRate = 44100);  
float getAmplitudeAtFrequency(float frequency, float sampleRate = 44100);  

zach wrote: “if you do need pitch, here’s an example which uses the aubio (aubio.org) library to detect the pitch:”


Sorry, Zach. I am a novice with openFrameworks and sound visualization. When I saw your “aubio” example, I thought it would be very useful for my project.
So I downloaded “pitchDetectionExample_061_10.6_SL.zip” and copied the files in its “src” sub-folder (which includes the “aubio” library folder) into a new Code::Blocks project (duplicated from “allAddonsExample”). But when I compiled, it didn’t work, and the errors are:

||=== allAddonsExample, release ===|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|4|aubio.h: No such file or directory|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|28|error: 'aubio_pitchdetection_mode' does not name a type|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|29|error: 'aubio_pitchdetection_type' does not name a type|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|31|error: ISO C++ forbids declaration of 'fvec_t' with no type|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|31|error: expected ';' before '*' token|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|32|error: ISO C++ forbids declaration of 'aubio_pitchdetection_t' with no type|
C:\Users\kar\Desktop\fyp_practice\workplace\apps\extraExamples\pitchDetection3\src\aubioAnalyzer.h|32|error: expected ';' before '*' token|
||=== Build finished: 7 errors, 0 warnings ===|

Did I put the “aubio” library in the wrong place, or is there something I missed before running?
I searched the web about this for a long time, but there was almost no information about installing or importing “aubio” in Code::Blocks.

Hello OF. This is my very first post after working with OF for about a year now (I even learned programming with OF).
I have been working on a polyphonic pitch detection algorithm for a while and I got something running (not perfectly, but it might be useful).
I’m able to get up to five pitches out of the audio signal within the range of a piano, except the lowest octave,
and I made 4 oscillators to reproduce the notes.
I tested it on my electric guitar: if I look only for the lowest pitch it works really well and quite fast, but the more notes are played, the more wrong pitches are displayed.
Take a look at the source (it’s not very clean and I should make a class out of it, but I’m not really an advanced coder yet, so sorry if you get confused) and please tell me if there are better or other methods to get this done.

I hope it helps someone, because the OF people helped me a lot. Thanks for that.

P.S. It works without additional libs or addons and it is realtime (RtAudio).


There seems to be a lot of confusion within this topic.

I checked out the code posted by Yonas and there are several things that interfere with correct detection of the pitch. I’ll try to correct those later.
Its main problem has to do with understanding pitch and frequency, as Kyle explained very well, but also with how the bins are distributed in relation to the notes.

The pitches (notes) have their frequencies logarithmically spaced, while the FFT has its bins linearly spaced. Although converting from a linear to a log distribution is trivial, you cannot reliably detect pitch just by taking the highest-valued bin. It might work for single-note sounds, but with lots of false matches.

Real-world sounds are quite complex, composed of several overlapping sine waves at different harmonics (multiples of the base frequency). In many cases you might get a very high peak for some harmonic, which can then be detected as a different “note”.
To get the pitch of a single-pitched sound you can detect the peaks of the FFT and take the lowest bin among them; ignoring the first 2 or 3 bins will probably help.
Keep in mind that there are several other algorithms that might be much more efficient and precise at getting a single pitch from a sound source. Just google “pitch detection algorithm”.
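
The "lowest peak" idea could look roughly like this. It is a sketch under assumptions: `lowestPeakBin` is an invented name, a peak is taken to be a local maximum, and the threshold (a fraction of the largest amplitude) is an arbitrary choice that would need tuning on real signals.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: index of the lowest local maximum whose amplitude is
// at least `relativeThreshold` times the largest amplitude, ignoring the
// first `skipBins` bins. Returns -1 if no such peak exists.
int lowestPeakBin(const std::vector<float>& amplitudes,
                  int skipBins = 3, float relativeThreshold = 0.5f) {
    if (amplitudes.size() < 3) return -1;
    float largest = *std::max_element(amplitudes.begin(), amplitudes.end());
    if (largest <= 0.f) return -1;                 // silent frame
    int first = std::max(skipBins, 1);             // need a left neighbour
    for (int i = first; i + 1 < (int) amplitudes.size(); i++) {
        bool isLocalMax = amplitudes[i] > amplitudes[i - 1]
                       && amplitudes[i] >= amplitudes[i + 1];
        if (isLocalMax && amplitudes[i] >= relativeThreshold * largest) {
            return i;                              // lowest qualifying peak
        }
    }
    return -1;
}
```

If a harmonic at bin 20 is louder than the fundamental at bin 10, a plain maximum search returns 20, while this returns 10 as long as the fundamental is above the threshold.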

Polyphonic pitch detection is another thing altogether; getting a reliable result is really complex. Consider that just a few years ago Melodyne, a very well-known pitch correction program in the music production business, introduced polyphonic pitch detection, and it was really groundbreaking. It yields some impressive results. Google it and check it out.

Another issue with FFTs and pitch, due to the linear/log spacing, is that at low-pitched notes two or more semitones can fall into the same bin, making pitch detection impossible. The solution is to process an FFT with more samples, and hence more bins, so each bin has a narrower bandwidth; but the processing time is higher and the temporal resolution of the FFT is lower, which is mainly noticeable at higher frequencies.
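
To put numbers on that trade-off, here is a small sketch that estimates how large an FFT has to be before its bin spacing is no wider than one semitone at a given note. Both function names are invented, and "bin spacing no wider than a semitone" is just one reasonable criterion, not a guarantee that neighbouring notes will be cleanly separated.

```cpp
#include <cmath>

// Distance in Hz from `frequency` to the note one semitone above it.
float semitoneSpacing(float frequency) {
    return frequency * (std::pow(2.f, 1.f / 12.f) - 1.f);
}

// Hypothetical helper: smallest power-of-two FFT size whose bin spacing
// (sampleRate / fftSize) is no wider than one semitone at `lowestFrequency`.
// Returns -1 for invalid input or if no size up to 2^30 is enough.
int fftSizeForSemitones(float lowestFrequency, float sampleRate) {
    if (lowestFrequency <= 0.f || sampleRate <= 0.f) return -1;
    const float spacing = semitoneSpacing(lowestFrequency);
    for (int fftSize = 2; fftSize <= (1 << 30) && fftSize > 0; fftSize *= 2) {
        if (sampleRate / fftSize <= spacing) return fftSize;
    }
    return -1;
}
```

At 44.1 kHz this gives 2048 samples for A4 (440 Hz) but 16384 samples for A1 (55 Hz), which is about 0.37 seconds of audio per frame.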
An alternative is to use a constant-Q transform, in which the bins of an FFT are weighted and averaged in a certain way to produce a transform whose bins each correspond to a note of the musical scale.

Some years ago, while learning Processing, I decided to implement a constant-Q method for Processing (at that time the only piece of code I found was in MATLAB). I published it on Google Code but I haven’t updated it since then.

Although it works and the constant-Q algorithm is correctly implemented, there are several flaws in the visualization (I just checked it and there’s a vertical offset on the “pianoroll” grid).
I’ll resurrect this project and port it to OF.

I hope this is of some help.
If my writing is somewhat confusing, it’s because I’m tired. lol.


I heard about the constant-Q algorithm but I didn’t find an implementation. That would really help, because my aim is to make a realtime algorithm (for interaction) and performing an FFT with high resolution is very expensive… I heard of Melodyne too, but it’s not realtime; it uses image-processing-like algorithms (relevance evaluation algorithm) on larger pieces of data to sharpen the results of a signal transformation.
Having two semitones in the same bin isn’t a real problem, because the bins next to the peak tell you what the exact pitch is. I tested it, and I guess you could even use it for tuning or vibrato detection by calculating the ratios between the bins around the peak, because they are never at zero. That is, if you have a clear signal, which I think is the biggest problem…
I will test high-pass filtering for reducing noise,
using a set of bandpass filters as a recursive frequency-domain analysis (but I don’t expect much),
and hopefully (:-)) the constant-Q transform.
But I don’t think I will ever get something useful out of my piano; the FFT results are really bad.
If there is interest I will take a closer look at non-realtime algorithms.

Well, using a bandpass filter for each pitch (of a piano roll → 88) turned out to be a good idea.
It’s a recursive method, unlike the FFT, so it’s a lot faster and less CPU-expensive… The only thing is that when I reduce the bandwidth to a really small value, which directly corresponds to the resolution of the analysis, the filters produce a delay-like effect. I couldn’t get this fixed yet, but it’s a really simple way of getting pitches detected.
Take a look,


Good job!
Right now I’m very busy, so I can’t peek inside the code.
I’m not sure which would be less CPU-expensive, the bandpass bank or the constant-Q.
The nice thing about the constant-Q implementation is that the weighted bins are precalculated, so processing the signal just involves an ordinary FFT that is then multiplied by the precomputed weights.


One other thing you can do to detect pitch is to use the algorithm from the Max/MSP and Pd external called fiddle~

The source code is here (in C):


It’s pretty good.

Another possibility: SuperCollider also has some very nice pitch detection capabilities, as well as amplitude tracking and detection of other sound characteristics. Having SuperCollider communicate with OF via OSC would probably be much quicker than building all the pitch stuff from scratch.