Monitoring Audio Levels with PulseAudio
I'm working on driving an analog VU meter from my Raspberry Pi using whatever audio is going out the Pi's sound outputs. The de facto Linux sound system, PulseAudio, allows any sound output (or "sink" in PulseAudio's nomenclature) to be monitored. In PulseAudio land, each sink has a corresponding "source" called the monitor source, which can be read just like any other PulseAudio input such as a microphone. In fact, to help with volume meter style applications, PulseAudio even allows you to ask for peak level measurements, which means you can sample the monitor source at a low frequency, with low CPU utilisation, but still produce a useful volume display. When this feature is used, each sample read indicates the peak level since the last sample.
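To make the idea of "peak level since the last sample" concrete, here is a small conceptual sketch in plain Python 3. It is not how PulseAudio implements peak detection internally; the function name `peak_downsample` and the sample values are made up for illustration.

```python
def peak_downsample(samples, window):
    """Return the peak absolute value of each consecutive window of samples."""
    peaks = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peaks.append(max(abs(s) for s in chunk))
    return peaks


# Hypothetical signed 8-bit audio samples arriving at a high rate
audio = [0, 12, -40, 7, 3, -5, 90, -2, -128, 1, 0, 4]

# Reading one peak per 4 input samples keeps the loudest value of each window
peaks = peak_downsample(audio, window=4)
print(peaks)
```

Even though only one value is read per window, a short loud transient inside the window still shows up in the result, which is what makes a low sampling rate usable for a meter.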
The main PulseAudio API is asynchronous and callback based, and the documentation is primarily just an API reference. This makes it a little difficult to figure out how to get everything to hang together. Using the code from various open source projects (primarily veromix-plasmoid and pavumeter), along with the API reference, I was able to develop a fairly minimal code example that will hopefully be useful to others trying to do something similar. Although this example is written in Python, it uses the PulseAudio C API directly (via ctypes), so it should still be relevant if your application is written in C or another language.
Here's the demo code. The latest version is also available on Bitbucket. Note that in order to run this example you need to install Vincent Breitmoser's ctypes PulseAudio wrapper available at https://github.com/Valodim/python-pulseaudio.
import sys
from Queue import Queue
from ctypes import POINTER, c_ubyte, c_void_p, c_ulong, cast

# From https://github.com/Valodim/python-pulseaudio
from pulseaudio.lib_pulseaudio import *

# edit to match your sink
SINK_NAME = 'alsa_output.pci-0000_00_1b.0.analog-stereo'
METER_RATE = 344
MAX_SAMPLE_VALUE = 127
DISPLAY_SCALE = 2
MAX_SPACES = MAX_SAMPLE_VALUE >> DISPLAY_SCALE

class PeakMonitor(object):

    def __init__(self, sink_name, rate):
        self.sink_name = sink_name
        self.rate = rate

        # Wrap callback methods in appropriate ctypefunc instances so
        # that the Pulseaudio C API can call them
        self._context_notify_cb = pa_context_notify_cb_t(self.context_notify_cb)
        self._sink_info_cb = pa_sink_info_cb_t(self.sink_info_cb)
        self._stream_read_cb = pa_stream_request_cb_t(self.stream_read_cb)

        # stream_read_cb() puts peak samples into this Queue instance
        self._samples = Queue()

        # Create the mainloop thread and set our context_notify_cb
        # method to be called when there's updates relating to the
        # connection to Pulseaudio
        _mainloop = pa_threaded_mainloop_new()
        _mainloop_api = pa_threaded_mainloop_get_api(_mainloop)
        context = pa_context_new(_mainloop_api, 'peak_demo')
        pa_context_set_state_callback(context, self._context_notify_cb, None)
        pa_context_connect(context, None, 0, None)
        pa_threaded_mainloop_start(_mainloop)

    def __iter__(self):
        while True:
            yield self._samples.get()

    def context_notify_cb(self, context, _):
        state = pa_context_get_state(context)

        if state == PA_CONTEXT_READY:
            print "Pulseaudio connection ready..."
            # Connected to Pulseaudio. Now request that sink_info_cb
            # be called with information about the available sinks.
            o = pa_context_get_sink_info_list(context, self._sink_info_cb, None)
            pa_operation_unref(o)

        elif state == PA_CONTEXT_FAILED :
            print "Connection failed"

        elif state == PA_CONTEXT_TERMINATED:
            print "Connection terminated"

    def sink_info_cb(self, context, sink_info_p, _, __):
        if not sink_info_p:
            return

        sink_info = sink_info_p.contents
        print '-'* 60
        print 'index:', sink_info.index
        print 'name:', sink_info.name
        print 'description:', sink_info.description

        if sink_info.name == self.sink_name:
            # Found the sink we want to monitor for peak levels.
            # Tell PA to call stream_read_cb with peak samples.
            print
            print 'setting up peak recording using', sink_info.monitor_source_name
            print
            samplespec = pa_sample_spec()
            samplespec.channels = 1
            samplespec.format = PA_SAMPLE_U8
            samplespec.rate = self.rate

            pa_stream = pa_stream_new(context, "peak detect demo", samplespec, None)
            pa_stream_set_read_callback(pa_stream,
                                        self._stream_read_cb,
                                        sink_info.index)
            pa_stream_connect_record(pa_stream,
                                     sink_info.monitor_source_name,
                                     None,
                                     PA_STREAM_PEAK_DETECT)

    def stream_read_cb(self, stream, length, index_incr):
        data = c_void_p()
        pa_stream_peek(stream, data, c_ulong(length))
        data = cast(data, POINTER(c_ubyte))
        for i in xrange(length):
            # When PA_SAMPLE_U8 is used, samples values range from 128
            # to 255 because the underlying audio data is signed but
            # it doesn't make sense to return signed peaks.
            self._samples.put(data[i] - 128)
        pa_stream_drop(stream)

def main():
    monitor = PeakMonitor(SINK_NAME, METER_RATE)
    for sample in monitor:
        sample = sample >> DISPLAY_SCALE
        bar = '>' * sample
        spaces = ' ' * (MAX_SPACES - sample)
        print ' %3d %s%s\r' % (sample, bar, spaces),
        sys.stdout.flush()

if __name__ == '__main__':
    main()
When running this demo, you'll need to modify SINK_NAME to match the name of the sink you want to monitor. If you're not sure of the sink, just run it - the program prints the details of all available sinks to stdout. If all goes to plan you should see a basic volume display in the console (when sound is actually playing!).
In this demo, the PeakMonitor class does all the interaction with the PulseAudio API. It needs the name of a sink to monitor and the sampling rate. Iterating over a PeakMonitor instance will give 8-bit peak level samples. The raw samples from PulseAudio actually range from 128 to 255, and stream_read_cb subtracts 128 so that the values yielded range from 0 to 127 (see the comment in the code and the comment from Tanu on this article).
The main function implements a simple volume display. Some of the constants at the top of the code control how the volume level is displayed.
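As a rough illustration of how those constants shape the display, the sketch below pulls the bar rendering out of main into a standalone function. It is written for Python 3 so it can be run without PulseAudio, and the name `render_bar` is hypothetical; the demo itself does this inline.

```python
MAX_SAMPLE_VALUE = 127
DISPLAY_SCALE = 2
# Widest possible bar: 127 >> 2 == 31 columns
MAX_SPACES = MAX_SAMPLE_VALUE >> DISPLAY_SCALE


def render_bar(sample):
    """Format one peak sample (0 to 127) as a fixed-width text bar."""
    scaled = sample >> DISPLAY_SCALE          # divide by 4 to fit the console
    bar = '>' * scaled
    spaces = ' ' * (MAX_SPACES - scaled)      # pad so old output is overwritten
    return '%3d %s%s' % (scaled, bar, spaces)


for sample in (0, 64, 127):
    print('[' + render_bar(sample) + ']')
```

Padding with spaces keeps every line the same width, so when the demo redraws the bar in place with a carriage return, a quieter sample fully erases a louder one. Increasing DISPLAY_SCALE by one halves the width of the bar.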
As mentioned earlier, the PulseAudio API is asynchronous, making heavy use of callbacks. This goes for API calls that return information about the sound configuration of the system as well as API calls that return or accept actual sound data. The reason for this is that the available sound devices may change at any point as a program is running (e.g. when USB audio devices are connected) and PulseAudio clients may need to be able to handle such changes.
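The callback wrapping that the demo does in __init__ can be tried in isolation. The following Python 3 sketch uses only ctypes from the standard library; the callback type `state_cb_t` and the `Listener` class are made up for illustration and are not part of PulseAudio or the wrapper library. There is no C library here, so the wrapper is called directly from Python to stand in for the C side.

```python
from ctypes import CFUNCTYPE, c_int

# Hypothetical C callback type: void (*)(int state)
state_cb_t = CFUNCTYPE(None, c_int)


class Listener(object):
    def __init__(self):
        self.states = []
        # Store the wrapper on self. ctypes callback objects must stay
        # referenced for as long as C code might call them, otherwise
        # they can be garbage collected out from under the library.
        self._on_state = state_cb_t(self.on_state)

    def on_state(self, state):
        self.states.append(state)


listener = Listener()

# A real C library would call the function pointer from its own loop;
# here the wrapper is invoked directly to simulate that.
for code in (1, 2, 4):
    listener._on_state(code)

print(listener.states)
```

This is the reason the demo keeps _context_notify_cb, _sink_info_cb and _stream_read_cb as instance attributes rather than creating the wrappers inline at each call site.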
Here's how the demo program interacts with PulseAudio:
- __init__ sets up the PulseAudio processing loop in a separate thread, establishes a connection to the PulseAudio daemon and asks PulseAudio to call the context_notify_cb method with updates about the status of the connection to PulseAudio.
- context_notify_cb will be called several times by the PulseAudio API as the connection is established. All things going to plan, the connection will reach the PA_CONTEXT_READY state. At this point, we request that the sink_info_cb method be called with information about each available sink.
- sink_info_cb is called once for each available sink. If a sink with the name matching the one passed to __init__ is seen, a stream is created on its monitor source and stream_read_cb is set to be called with 8-bit peak level samples.
- stream_read_cb is then called at the rate requested when the PeakMonitor class was instantiated. pa_stream_peek reads the available sample(s) from the stream and pa_stream_drop tells PulseAudio that we're done with the stream data (presumably so the buffer can be re-used or deallocated). The callback functions are all called on the mainloop thread created by the PulseAudio API, so the _samples Queue instance is used to safely return samples back to the main thread (__iter__ reads the samples from this queue).
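The hand-off between the mainloop thread and the main thread can be sketched without PulseAudio at all. In the Python 3 example below, a plain worker thread stands in for the PulseAudio mainloop thread; `FakeMonitor` and the None sentinel that ends iteration are assumptions added for illustration (the real PeakMonitor iterates forever).

```python
import threading
from queue import Queue


class FakeMonitor(object):
    """Stand-in for PeakMonitor: a worker thread feeds a thread-safe queue."""

    def __init__(self, samples):
        self._samples = Queue()
        worker = threading.Thread(target=self._produce, args=(samples,))
        worker.daemon = True
        worker.start()

    def _produce(self, samples):
        # Plays the role of stream_read_cb running on the mainloop thread
        for sample in samples:
            self._samples.put(sample)
        self._samples.put(None)  # sentinel so this sketch can terminate

    def __iter__(self):
        # Runs on the caller's thread and blocks until a sample arrives
        while True:
            sample = self._samples.get()
            if sample is None:
                return
            yield sample


received = list(FakeMonitor([3, 50, 127, 0]))
print(received)
```

Because Queue does its own locking, neither side needs explicit synchronisation, and the blocking get means the consumer sleeps while no audio data is arriving.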
When running this demo on my fairly high-spec laptop it has no noticeable impact on CPU utilisation, but it's another story when it runs on the Raspberry Pi: around 25% of the CPU is required when the monitor sampling rate is 344 Hz. The issue is that we've got PulseAudio calling our Python function (stream_read_cb) at a significant frequency and Python just isn't that fast on the Raspberry Pi's 700 MHz CPU. The pointer manipulation being done in stream_read_cb, which would be incredibly fast in C, is being done using a significant amount of Python bytecode and function calls (partially because ctypes is being used to do them).
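One possible way to trim the per-sample Python work, short of moving to C, is to copy the whole buffer in a single ctypes call and reduce it to one value. The Python 3 sketch below shows the idea on a locally created buffer. The function name `peak_of_buffer` is hypothetical, it changes the behaviour to one value per callback rather than one per sample, and I have not measured whether it helps on the Raspberry Pi. It would not remove the cost of the callback itself.

```python
import ctypes


def peak_of_buffer(address, length):
    """Return the largest peak (0 to 127) in a buffer of U8 peak samples."""
    if length == 0:
        return 0
    # One C-level copy into a bytes object instead of indexing a
    # ctypes pointer once per sample from Python
    raw = ctypes.string_at(address, length)
    # Raw U8 peak samples range from 128 to 255, so shift down by 128
    return max(raw) - 128


# Stand-in for the memory that pa_stream_peek would hand back
buf = (ctypes.c_ubyte * 4)(128, 200, 255, 130)
peak = peak_of_buffer(ctypes.addressof(buf), len(buf))
print(peak)
```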
I'm not too comfortable with constantly pegging the Raspberry Pi CPU at 25%, as I'd eventually like to have an always-running daemon process driving the VU meter on the Raspberry Pi. It doesn't seem like a good idea to constantly have the CPU doing that much work. I plan on trying out Cython to convert the PulseAudio interaction parts to C, or rewriting the entire program in pure C since it's not really taking much advantage of Python's features. More on this to come.
I hope this article is useful to developers who are interested in using the PulseAudio API. Please note that I'm by no means a PulseAudio expert. Please let me know if there's anything I could be doing better and I'll update the article.
(This article is quite far behind where I'm actually at with the VU meter project. I already have a DAC connected to the GPIO ports, driving an analog VU meter using the sound going out the Raspberry Pi's audio outputs. I'm hoping to publish some more articles this week.)
Update (2013-02-14): I've updated the code and article to explain why peak samples only range from 128 to 255. See Tanu's comment below too.
Update (2013-05-06): There's now a new article that describes how this code is used to actually drive a VU meter from the Raspberry Pi.
posted: Sun, 10 Feb 2013 22:05