The last couple of weeks I’ve been working on a solution to get a live video stream
from an openFrameworks application to the web. The idea was to grab a bunch of
pixels and send them to a Flash Media Server (FMS). I used rtmpd as a test server
and libraries like ffmpeg/libav/gstreamer for encoding and muxing the video.
Streaming video to the web works quite well: I ran a 48-hour test which
streamed my webcam to the web (no memory leaks, low CPU usage, just 8 MB of memory).
In short, I do this: grab pixels from the application, convert them to YUV420 (using
libswscale, part of libav), encode them with H264 and mux the result into FLV. The generated
FLV is then sent over what RTMPD calls “inboundflv”, which is just a TCP stream of FLV data.
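As a rough sketch of the conversion step (not my exact code; the function and variable names are just for illustration, and the constant names follow current FFmpeg headers, while older libav spelled them PIX_FMT_RGB24 / PIX_FMT_YUV420P), the libswscale part looks something like this:

```cpp
#include <cstdint>
extern "C" {
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
#include <libavutil/mem.h>
}

// 'rgb' points at w*h*3 bytes of packed RGB24 grabbed from the application.
void convertFrame(const uint8_t* rgb, int w, int h) {
    SwsContext* sws = sws_getContext(w, h, AV_PIX_FMT_RGB24,
                                     w, h, AV_PIX_FMT_YUV420P,
                                     SWS_FAST_BILINEAR, NULL, NULL, NULL);

    const uint8_t* src[1] = { rgb };
    int src_stride[1]     = { 3 * w };  // bytes per packed RGB24 row

    uint8_t* dst[4];
    int dst_stride[4];
    av_image_alloc(dst, dst_stride, w, h, AV_PIX_FMT_YUV420P, 16);

    // Convert the whole frame; dst[0..2] now hold the Y, U and V planes.
    sws_scale(sws, src, src_stride, 0, h, dst, dst_stride);

    // ... hand dst/dst_stride to the H264 encoder here ...

    av_freep(&dst[0]);
    sws_freeContext(sws);
}
```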
The next step was adding sound. This took a lot more testing, and I had to read up on audio,
as I hadn’t done anything with digital sound until now. Damian gave me a quick intro to
the world of digital audio, and it turns out to be much like working with pixels.
Somehow I couldn’t get the correct settings for the sound which is passed into audioIn().
I was asking for 2 channels at 44100 Hz, but when I encoded these samples the sound
had a very low pitch. I finally got it to work by asking ofSoundStream for only 1 channel
and telling the encoding library (or the avconv utility, see www.avconv.org) that I was using 2 channels.
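For reference, a hedged sketch of that workaround in testApp.cpp; the ofSoundStreamSetup signature below is the one from the openFrameworks 007 era and may differ in other versions:

```cpp
void testApp::setup() {
    // Ask for 1 input channel at 44100 Hz; the encoder is separately told
    // that the audio is stereo (the workaround described above).
    ofSoundStreamSetup(0 /* outputs */, 1 /* inputs */, this, 44100, 256, 4);
}

void testApp::audioIn(float* input, int bufferSize, int nChannels) {
    // Mono float samples in [-1, 1] arrive here; convert to 16-bit PCM
    // before handing them to the audio encoder.
    for (int i = 0; i < bufferSize; ++i) {
        short sample = (short)(input[i] * 32767.0f);
        // ... append 'sample' to the buffer that feeds the encoder ...
    }
}
```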
So now that I had sound with a normal pitch, I had to find a way to get audio and video in sync. After
spending so many hours getting into encoding and into these libraries I didn’t have time for another
adventure to get the sync working. It seems that learning/getting into these libraries and all
their peculiarities takes more time than writing things yourself. Basically, only three things
are being done: encoding, muxing and input/output (streaming over TCP for example).
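To make that concrete, here is the shape I mean as a purely hypothetical C++ outline (none of these types exist anywhere; they just name the three jobs):

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

// Purely illustrative interfaces for the three stages of a streaming lib.
struct Encoder {
    virtual ~Encoder() {}
    // raw RGB24 or PCM in, compressed packet out
    virtual std::vector<uint8_t> encode(const uint8_t* raw, size_t nbytes) = 0;
};

struct Muxer {
    virtual ~Muxer() {}
    // wrap a compressed packet in a container format (FLV, WebM, ...)
    virtual std::vector<uint8_t> mux(const std::vector<uint8_t>& packet, int64_t pts) = 0;
};

struct Output {
    virtual ~Output() {}
    // push the muxed bytes out, e.g. over a TCP connection to the server
    virtual void write(const std::vector<uint8_t>& bytes) = 0;
};
```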
Update 1, 2012.08.08
So I’m at this point now, and I’m thinking about creating a simple library which can create a
live stream to the web from raw data (RGB24 video, PCM audio). I want to use 100% open technologies,
so no Flash and no H264. This means that either Ogg or WebM is the way to go. I’ve got a
pretty clear idea of how to make this work. I want this simple library to stand on its
own and only use what is necessary to create a video stream with audio to the web. This means
no huge libraries and no wrappers like ffmpeg, gstreamer, libav, vlc etc. All that’s needed is libvpx
plus equally small libraries for audio encoding and muxing.
Update 2, 2012.09.09
I’m still looking into this and writing a couple of test applications which each take care of a small part
of what is needed to create this streaming library. At this point I’ve got a working example which
encodes raw RGB24 frames with libvpx and muxes them into a .webm video stream. This stream is
then sent to an Icecast server, which makes the stream available in a web page. All the code I wrote
for this is still part of my research to find out how to write a streaming video lib. The next part will be
looking into audio encoding with the Xiph codecs (http://www.theora.org/downloads/), Opus (http://www.opus-codec.org/), MP3 and Speex.
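To give an idea of what the libvpx side of that test looks like, here is a minimal sketch against the public libvpx encoder API (not my actual test code; the framerate and bitrate values are made up):

```cpp
#include <cstdio>
#include <vpx/vpx_encoder.h>
#include <vpx/vp8cx.h>

// Minimal VP8 setup: encode one I420 frame and collect the output packets,
// which would be handed to the WebM muxer.
bool encodeOneFrame(int w, int h) {
    vpx_codec_enc_cfg_t cfg;
    if (vpx_codec_enc_config_default(vpx_codec_vp8_cx(), &cfg, 0)) return false;
    cfg.g_w = w;
    cfg.g_h = h;
    cfg.g_timebase.num = 1;
    cfg.g_timebase.den = 30;        // assumed 30 fps
    cfg.rc_target_bitrate = 800;    // kbit/s, made-up value

    vpx_codec_ctx_t codec;
    if (vpx_codec_enc_init(&codec, vpx_codec_vp8_cx(), &cfg, 0)) return false;

    vpx_image_t img;
    vpx_img_alloc(&img, VPX_IMG_FMT_I420, w, h, 1);
    // ... fill img.planes[0..2] with the YUV420 data converted from RGB24 ...

    // Encode one frame (pts = 0, duration = 1 timebase unit).
    vpx_codec_encode(&codec, &img, 0, 1, 0, VPX_DL_REALTIME);

    vpx_codec_iter_t iter = NULL;
    const vpx_codec_cx_pkt_t* pkt;
    while ((pkt = vpx_codec_get_cx_data(&codec, &iter)) != NULL) {
        if (pkt->kind == VPX_CODEC_CX_FRAME_PKT) {
            // pkt->data.frame.buf / pkt->data.frame.sz -> WebM muxer
            printf("encoded packet: %zu bytes\n", pkt->data.frame.sz);
        }
    }

    vpx_img_free(&img);
    vpx_codec_destroy(&codec);
    return true;
}
```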
Update 3, 2012.10.19
Just wanted to mention that I’ve got a proof of concept working which uses only 2 libraries: x264 + libspeex. I’m muxing the encoded data into FLV and sending that to a Flash Media Server. Video + audio is nicely in sync! It needs a bit more work and testing.
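For the curious, the x264 half of that proof of concept looks roughly like this; it’s a sketch against the public x264 API, not the actual code, and the preset/profile choices are my assumptions:

```cpp
#include <stdint.h>
#include <x264.h>

// Rough x264 setup for low-latency streaming into FLV; values are illustrative.
x264_t* openEncoder(int w, int h) {
    x264_param_t param;
    x264_param_default_preset(&param, "veryfast", "zerolatency");
    param.i_width   = w;
    param.i_height  = h;
    param.i_fps_num = 30;   // assumed framerate
    param.i_fps_den = 1;
    param.b_annexb  = 0;    // FLV wants length-prefixed NALs, not Annex-B
    x264_param_apply_profile(&param, "baseline");
    return x264_encoder_open(&param);
}

// Encode one YUV420 frame (pic_in allocated with x264_picture_alloc and
// X264_CSP_I420); the resulting NAL data gets wrapped in an FLV video tag.
int encodeFrame(x264_t* enc, x264_picture_t* pic_in) {
    x264_picture_t pic_out;
    x264_nal_t* nals;
    int num_nals = 0;
    int size = x264_encoder_encode(enc, &nals, &num_nals, pic_in, &pic_out);
    if (size > 0) {
        // nals[0].p_payload holds 'size' contiguous bytes of encoded data;
        // ... wrap them in an FLV video tag and send to the media server ...
    }
    return size;
}
```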
I know people here are interested in something like this. If you’re interested and
want to help, you’re more than welcome! Please reply to this post if you want to help make this work
and/or have ideas about how to approach it.