As I promised here is Part 3 of the audio series. The library covered here is called pyo. Not only does this library aim to assist in composing music, it's also intended to be used as a backend for audio processing programs, for example Cecilia 5, PsychoPy, Soundgrain and Zyne.
This library is very large, but I will only cover the basics here because it unfortunately has a nasty habit of failing to boot an audio server (see the note below, and also the comments).
Installation
According to the docs, this library supports Python 2 as well as 3.5, 3.6 and 3.7. They don't say whether it works on 3.8, but so far I've been using pyo on 3.8 without problems.
pip install pyo
This installs pyo, but you should also install wxPython (the Python bindings for wxWidgets) because pyo has GUI functions you can take advantage of. wxPython might take a long time to install because it has to compile C modules.
Quick start
This example plays a 1000 Hz sine tone for 3 seconds - the "beep" you hear when words are censored:
import time
from pyo import *
# Initialize a Server object
s = Server().boot()
# Start playing audio
s.start()
# Plays the sine wave in Python console, returns immediately
a = Sine(mul=0.01).out()
time.sleep(3)
a.stop()
Note: sometimes you might get an error when running Server().boot() if you are using Portaudio on Linux. I sometimes get these errors too and I haven't found a working solution. But remember, this is the audio series, and not every library is going to be perfect. If you have Jack installed, try using Server(audio="jack").boot(), although I haven't tested this.
Without the sleep, running this snippet in a script or otherwise non-interactively won't play a sound, because the script exits right away and pyo sounds are stopped when the script exits. You could sleep with import time; time.sleep(seconds) for as many seconds as you want to play, but another workaround is to spawn a GUI with a button to play the sound, like this:
from pyo import *
s = Server().boot()
s.start()
a = Sine(mul=0.01).out()
s.gui(locals())
Creating the Sine object before booting the server (calling boot()) will raise a PyoServerStateException. The server needs to be booted before creating audio objects.
Also you may have noticed that the sine wave sounds very quiet. That is because the sine wave's gain (the mul parameter) was made very low. The loudness of all sounds can be controlled at once by adjusting s.amp, the server's global gain. This example sets it to 0.1, which drops the gain by 20 dB:
from pyo import *
s = Server().boot()
s.amp = 0.1
a = Sine().out() # Plays on left channel
a2 = Sine().out(1) # Plays on right channel
s.gui(locals())
Here's a decibel-to-amplitude conversion table for reference:
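The values in such a table follow from the standard relation amplitude = 10^(dB/20). Here is a minimal sketch in plain Python that prints a few rows; the helper names db_to_amp and amp_to_db are my own for illustration and are not taken from pyo:

```python
import math


def db_to_amp(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def amp_to_db(amp):
    """Convert a linear amplitude factor (must be > 0) to decibels."""
    return 20 * math.log10(amp)


# Print a small reference table, from full scale down to -60 dB.
for db in (0, -6, -12, -20, -40, -60):
    print(f"{db:>4} dB -> amplitude {db_to_amp(db):.4f}")
```

The -20 dB row gives an amplitude of 0.1, which is the value assigned to s.amp in the example above.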
Variables should not be overwritten if you are using the GUI. If you overwrite a with something else, it won't play on the left channel.
The reason we need to pass locals() as the argument to s.gui() is to allow it to make a prompt for you to type Python commands in. gui() does not return, and if you press Ctrl-C in the terminal or quit the GUI then the whole Python process will exit. To make the GUI return to the Python console instead, you need to call it with gui(..., exit=False).
One problem that I noticed with this interpreter is that you can't seem to type multi-line commands in it.
I prefer starting the GUI over playing the sounds directly, and that is reflected in the examples I show here.
Setting up your audio on Windows
Windows needs special attention to make sure pyo plays audio properly. The default Windows audio host used by pyo is DirectSound; however, any version of Windows since Vista should use the WASAPI host to play audio properly. So on those operating system versions, pyo needs to be configured to use WASAPI with Server(..., winhost="wasapi"). As stated earlier, not using the winhost parameter defaults the server to DirectSound, which most likely won't produce the expected results unless you are using Windows XP.
You must also make sure that the sample rate used by your Windows audio device is the same as the one used by pyo, and pyo defaults to a 44100 Hz sample rate. To verify this you right-click the volume icon and click "Playback Devices", which should open a window like this:
Then you right-click on your speakers or other audio device and select Properties, which should show you the sample rate used by the audio hardware. So if your speakers have a sample rate of 48000 Hz, you can tell pyo to launch a server with that sample rate with Server(sr=48000, ...). Not specifying sr defaults to 44100.
It's not required, but you can also turn on Exclusive Mode, which bypasses the Windows volume control and any other effects the audio driver applies, and passes whatever sound pyo makes directly to the speakers or other output device. In particular, sounds other programs make won't be heard (more information). The Priority checkbox lets pyo use the device exclusively even if another program is using it. If you're not sure what to do here, leave both of these checkboxes alone.
Last, and this applies to all operating systems, if you have a built-in sound card then you probably want to increase the buffer size to prevent glitches in audio playback, using something like Server(buffersize=512, ...) (the default is 256).
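To pull the advice in this section together, here is a rough sketch that collects the three Server() arguments mentioned so far (sr, buffersize and winhost) into one dictionary. The helper names server_options and boot_server, and their parameters, are hypothetical and only for illustration; they are not part of pyo:

```python
import sys


def server_options(device_sample_rate=44100, builtin_sound_card=True, platform=None):
    """Build keyword arguments for pyo's Server() from the advice above.

    Hypothetical helper: only the keys sr, buffersize and winhost are
    Server() arguments named in this article.
    """
    platform = sys.platform if platform is None else platform
    options = {"sr": device_sample_rate}
    # Use a larger buffer than the default 256 for built-in sound cards.
    options["buffersize"] = 512 if builtin_sound_card else 256
    # Ask for WASAPI instead of the DirectSound default on Windows.
    if platform.startswith("win"):
        options["winhost"] = "wasapi"
    return options


def boot_server(options):
    """Boot a pyo server with the given options (needs pyo and an audio device)."""
    from pyo import Server  # imported lazily so server_options() works without pyo
    return Server(**options).boot()


print(server_options(device_sample_rate=48000, platform="win32"))
```

On a machine with pyo installed you would then call something like s = boot_server(server_options(device_sample_rate=48000)).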
Increasing the buffer size directly affects latency. Latency is determined by buffer_size/sample_rate, so at a fixed sample rate a smaller buffer means a shorter latency. With a very small buffer the sound card has to be refilled so often that a built-in card like the one I just mentioned may not keep up, and samples of the sound get skipped at playback, so the buffer size must be increased. Increasing the buffer size by too much makes the delay before you hear the result of a change noticeable, so don't make the buffer size too large.
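To get a feel for the numbers, the buffer_size/sample_rate relation can be evaluated directly. This is a small sketch in plain Python; the function name buffer_latency_ms is my own, and it only covers the delay of a single buffer, not anything the driver or hardware adds on top:

```python
def buffer_latency_ms(buffer_size, sample_rate):
    """Time in milliseconds that one buffer of audio lasts."""
    return 1000.0 * buffer_size / sample_rate


# Compare a few buffer sizes at pyo's default sample rate.
for size in (64, 256, 512, 1024, 4096):
    print(f"buffersize={size:>4}: {buffer_latency_ms(size, 44100):6.2f} ms")
```

At 44100 Hz the default buffer of 256 samples lasts a little under 6 ms, and doubling it to 512 doubles that figure.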
Again, pyo defaults to a 44100 Hz sample rate and 32-bit float depth. 64-bit float depth can be used by importing pyo64 instead of pyo. Regarding bit depth, you shouldn't need to change it to make playback work.
Sources, processes and sinks
Audio objects that create a sound are called sources. Audio objects that modify a sound are called processes, and these return audio objects themselves. Processes can modify both sources and other processes, as you will see below. Finally, audio objects are sent to a sink for output, which could be a physical audio device or one of the channels of a speaker system. Nearly all speaker setups have a left and a right channel, and some have more than two channels. If you have 5.1 surround then you typically have 6 channels, and if you have 7.1 surround it's typically 8.
Among other sources, there is a sine wave source Sine(), a white noise source Noise() and a phase incrementor Phasor(). I will have more to say about sources and processes in part 4.
Playing audio objects in parallel
You just make the audio objects you want to play and call out() on each of them. They will play as soon as you call start() or gui():
from pyo import *
s = Server().boot()
s.amp = 0.1
a = Sine()
hr = Harmonizer(a).out()
ch = Chorus(a).out()
sh = FreqShift(a).out()
s.gui(locals())
Creating all of your pyo objects before starting playback improves performance of pyo.
It's also possible to chain the processes together. Here I pass the sine wave through four harmonizers:
from pyo import *
s = Server().boot()
s.amp = 0.1
a = Sine().out()
h1 = Harmonizer(a).out()
h2 = Harmonizer(h1).out()
h3 = Harmonizer(h2).out()
h4 = Harmonizer(h3).out()
s.gui(locals())
Controlling sound playback
By default, out() plays the audio object on channel 0, which is usually the left channel. To play it on all channels you have to call out() for each channel number. Passing a number to out() plays the sound on a specific channel. Usually, 0 denotes the left channel, 1 denotes the right channel and higher numbers denote successive channels, but it depends on the order in which your operating system numbers the channels.
out() can also control the delay before a sound is played and how long the sound plays. It has the keyword arguments delay, which controls the delay in seconds, and dur, the duration in seconds. Both accept floats for fractional seconds. This example plays noise frequencies below 1000 Hz after a delay of 5 seconds for a duration of 10 seconds:
from pyo import *
s = Server().boot()
s.amp = 0.1
n = Noise()
lp = ButLP(n).out(dur=10, delay=5)
s.gui(locals())
At any time, a sound can be stopped by calling its stop() method.
Finally, some audio objects don't play as soon as they are created. You need to call their play() method to play them, and those objects will be pointed out as I cover them. All audio objects have a play() method that plays them, though it's usually called automatically, and an isPlaying() method that returns a boolean indicating whether the object is playing right now.
GUI controls
As if what we've seen so far wasn't good enough, pyo can also create widgets that allow you to control parameters to your audio objects while they're playing. This snippet creates GUI controls for two frequency modulators (FM) and one for the harmonics of the sine wave:
from pyo import *
s = Server().boot()
s.amp = 0.1
# Creates two frequency modulation (FM) generators, one per channel.
a = FM().out()
b = FM().out(1)
# Opens the controller windows.
a.ctrl(title="Frequency modulation left channel")
b.ctrl(title="Frequency modulation right channel")
# If a list of values is given at a particular argument, the ctrl
# window will show a multislider to set each value separately.
oscs = Sine([100, 200, 300, 400, 500, 600, 700, 800], mul=0.4).out()
oscs.ctrl(title="Simple additive synthesis")
s.gui(locals())
In this picture I paused the playback.
In the "Simple additive synthesis" control, we are able to manipulate each harmonic (each of the eight sine waves), and the result is a combined waveform made up of the sine waves at those frequencies. Remember that harmonics are the sine components of a waveform at multiples of its fundamental frequency, and their relative strengths determine the shape of the waveform. The phase can be changed as well.
Visualization of the waveform aka. signal
At this point it's worth noting waveforms are sometimes called signals especially in the field of signal processing.
pyo contains a Scope object which creates an animated graph of the waveform and updates it in real time as the waveform is played. Here is a visualization of three sine waves:
from pyo import *
s = Server().boot().start()
a = Sine(freq=100, mul=0.5)
b = Sine(freq=100, mul=0.5, add=0.5)
c = Sine(freq=100, mul=0.01)
sc = Scope([a, b, c])
s.gui(locals())
And here is a visualization of band-limited square waves:
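from pyo import *
import random
s = Server().boot().start()
osc = []
for pitch in [48, 52, 55, 60]:
    # Slow fade-in envelope for this voice
    amp = Fader(fadein=5, mul=0.1).play()
    # Frequency band of +/- 0.1 semitone around the MIDI pitch
    lo, hi = midiToHz((pitch - 0.1, pitch + 0.1))
    # 50 slowly drifting frequencies and sharpness values per voice
    fr = Randi(lo, hi, [random.uniform(.2, .4) for i in range(50)])
    sh = Randi(0.1, 0.9, [random.uniform(.2, .4) for i in range(50)])
    # type=2 selects the square waveform
    osc.append(LFO(fr, sharp=sh, type=2, mul=amp).out())
sc = Scope(osc)
s.gui(locals())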
A lot of new classes and parameters have been shown here. One of them is the add parameter to an audio object. It changes the vertical offset of a waveform (audio object). For a waveform that is sent to the audio output, there is no point in pushing it above 1 or below -1 because only the parts of the waveform that lie between -1 and 1 will be heard. The waveform is first multiplied by the mul parameter and then the value in add is added to it. Almost all audio objects have mul and add arguments in their function signatures.
There is also a range() method that squeezes the waveform between a minimum and maximum value. It's used like c = Sine(freq=100).range(-0.25, 0.5). This will shrink the sine wave to values between -0.25 and 0.5, and it would be shown like that on the scope graph. For a waveform you want to hear, the range minimum and maximum should be between -1 and 1.
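The arithmetic behind mul, add and range() can be checked without any audio hardware. The sketch below is plain Python, not pyo code, and the helper names are my own. It assumes that squeezing a waveform that spans -1 to 1 into a range amounts to picking a suitable mul and add, which is how I understand range() to behave for a sine wave:

```python
import math


def apply_mul_add(sample, mul=1.0, add=0.0):
    """Scale a sample first, then offset it: sample * mul + add."""
    return sample * mul + add


def range_to_mul_add(minimum, maximum):
    """mul and add that map a signal spanning [-1, 1] onto [minimum, maximum]."""
    mul = (maximum - minimum) / 2
    add = (maximum + minimum) / 2
    return mul, add


# One cycle of a full-scale sine wave, sampled at 8 points.
cycle = [math.sin(2 * math.pi * i / 8) for i in range(8)]

mul, add = range_to_mul_add(-0.25, 0.5)
scaled = [apply_mul_add(x, mul, add) for x in cycle]
print("mul =", mul, "add =", add)
print("min =", min(scaled), "max =", max(scaled))
```

The same arithmetic shows why Sine(freq=100, mul=0.5, add=0.5) in the scope example stays between 0 and 1.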
The midiToHz() function takes a MIDI note number, which might be fractional, and converts it into a frequency in Hz. It can also take lists and tuples of MIDI note numbers.
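For reference, the usual equal-temperament conversion puts MIDI note 69 (A4) at 440 Hz and doubles the frequency every 12 notes. The function below is a plain Python sketch of that formula under the assumption that pyo follows the same convention; midi_to_hz is my own name, not the pyo function:

```python
def midi_to_hz(note):
    """Equal-tempered frequency for a MIDI note number (A4 = note 69 = 440 Hz)."""
    if isinstance(note, (list, tuple)):
        # Convert each element and keep the container type.
        return type(note)(midi_to_hz(n) for n in note)
    return 440.0 * 2 ** ((note - 69) / 12)


print(midi_to_hz(69))
print(midi_to_hz((47.9, 48.1)))
```

Fractional note numbers give the small detuning band used in the square wave example, where each pitch is widened by 0.1 of a semitone on either side.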
Randi(min=0.0, max=1.0, freq=1.0) is a pseudo-random number generator which generates numbers between min and max at frequency freq.
Fader(fadein=0.01, fadeout=0.1, dur=0) makes a fade-in and/or fade-out effect. Specifically, it makes an amplitude envelope that varies between 0 and 1. You have to explicitly call its play() method to start the fader.
Last, there is a Spectrum widget that plots the frequency content of the waveform against magnitude. Different frequencies inside the waveform have different amplitudes.
from pyo import *
s = Server().boot()
s.amp = 0.1
# Full scale sine wave
a = Sine()
# Creates a Dummy object `b` with `mul` attribute
# set to 0.5 and leaves `a` unchanged.
b = a * 0.5
b.out()
# Computes a ring modulation between two PyoObjects
# and scales the amplitude of the resulting signal.
c = Sine(300)
d = a * c * 0.3
d.out()
# PyoObject can be used with Exponent operator.
e = c ** 10 * 0.4
e.out(1)
# Displays the ringmod and the rectified signals.
sp = Spectrum([d, e])
sc = Scope([d, e])
s.gui(locals())
This spectrum widget has a strange-looking horizontal scroll bar which can pan horizontally to the part of the plot range you want to see. In this spectrum, the logarithm of the frequency is plotted against the logarithm of the magnitude, with the signal filtered through a Hanning window.
Filters
If you made it this far, congratulations 🎉 this is where things start to get fun.
As stated in the pyo documentation, "One of the most important thing with computer music is the trajectories taken by parameters over time. This is what gives life to the synthesized sound."
And indeed it's very important. Good synthesized sound needs more than simple sine and square waves.
Enter LFOs, low frequency oscillators. They are oscillators which take a base waveform, a fundamental frequency and a sharpness (and of course mul and add). The sharpness is a measure of how many harmonics you want in the spectrum; higher means more harmonics. In pyo, an LFO can have one of 8 base waveforms:
- Saw up (default)
- Saw down
- Square
- Triangle
- Pulse
- Bipolar pulse
- Sample and hold
- Modulated Sine
The frequency you specify here is clamped between 0.00001 and the server sample rate/4.
Despite its name, an LFO can represent very high fundamental frequencies (the object was actually misspelled LFO early in the design phase). These LFOs are band-limited, which means none of their partials (the sine waves the LFOs are made of) exceeds the Nyquist frequency, which is sample rate/2. The Nyquist frequency is the highest frequency that can be reproduced.
Once again, it's important the sample rate used by pyo is high enough so the higher harmonics don't wrap around the Nyquist frequency, producing aliasing in the waveform.
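To make the Nyquist limit concrete, here is a small plain Python sketch that counts how many harmonics of a fundamental fit below the Nyquist frequency, and shows where a partial above it would land if it were not band-limited. The function names are my own; none of this is pyo code:

```python
def nyquist(sample_rate):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate / 2


def max_partials(fundamental, sample_rate):
    """Number of harmonics of `fundamental` that fit at or below Nyquist."""
    return int(nyquist(sample_rate) // fundamental)


def aliased_frequency(freq, sample_rate):
    """Frequency that results when `freq` is sampled without band-limiting."""
    folded = freq % sample_rate
    # Anything above Nyquist is mirrored back below it.
    return sample_rate - folded if folded > nyquist(sample_rate) else folded


print(max_partials(187.5, 44100))
print(aliased_frequency(30000, 44100))
```

With a 44100 Hz sample rate, a 30000 Hz partial would fold back to 14100 Hz, which is the kind of aliasing a band-limited oscillator avoids.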
from pyo import *
s = Server().boot()
# Creates a noise source
n = Noise()
# Creates an LFO oscillating +/- 500 around 1000 (filter's frequency)
lfo1 = Sine(freq=.1, mul=500, add=1000)
# Creates an LFO oscillating between 2 and 8 (filter's Q)
lfo2 = Sine(freq=.4).range(2, 8)
# Creates a dynamic bandpass filter applied to the noise source
bp1 = ButBP(n, freq=lfo1, q=lfo2).out()
# The LFO object provides more waveforms than just a sine wave
# Creates a ramp oscillating +/- 1000 around 1200 (filter's frequency)
lfo3 = LFO(freq=.25, type=1, mul=1000, add=1200)
# Creates a square oscillating between 4 and 12 (filter's Q)
lfo4 = LFO(freq=4, type=2).range(4, 12)
# Creates a second dynamic bandpass filter applied to the noise source
bp2 = ButBP(n, freq=lfo3, q=lfo4).out(1)
sc = Scope([bp1, bp2])
s.gui(locals())
Looks good, and sounds good too. 😎
Channels/Streams
Pyo audio objects can have more than one waveform, referred to in pyo as streams. The consequence of this is that nearly all object attributes can take a list of values instead of a single value.
It is useful to mix streams down into a smaller number of streams before processing an audio object, as this saves CPU cycles. This can be accomplished with the mix(voices=1) method. By default it mixes all the streams down to one stream.
An audio object with two streams has the same effect as a stereo channel, whereas all the other audio objects we've dealt with so far had only one stream and were therefore mono. A sound with two streams plays on two output channels at the same time. This has far-reaching implications. It makes editing conventional stereo sound possible. Speech input can be mixed down to mono. You no longer have to call out() twice. In fact, an object with n streams can play on n channels, so you can easily edit things like quadraphonic sound too.
It's safe to assume that the defining property of widely used audio processing tools is their ability to process multi-channel audio.
When you pass lists of different lengths to different properties of the same object, the smaller lists wrap around, possibly many times, to fill the length of the longest list. But if you inspect the properties it shows the original lists.
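The wrap-around rule is easy to picture with a few lines of plain Python. This is only a model of the behavior described above, not pyo's actual implementation, and expand_arguments is a hypothetical name:

```python
def expand_arguments(**arguments):
    """Return one dict of scalar arguments per stream, wrapping short lists."""
    # Treat single values as one-element lists.
    as_lists = {k: v if isinstance(v, list) else [v] for k, v in arguments.items()}
    # The longest list decides how many streams there are.
    n_streams = max(len(v) for v in as_lists.values())
    return [
        {k: v[i % len(v)] for k, v in as_lists.items()}
        for i in range(n_streams)
    ]


streams = expand_arguments(freq=[100, 200, 300, 400, 500], mul=[0.1, 0.2], add=0)
for stream in streams:
    print(stream)
```

Here the five frequencies produce five streams, the two mul values alternate across them, and the single add value is shared by all.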
In addition to chnl, out() takes yet another parameter called inc, which specifies a step that skips some channels and outputs to others. It's best described with an example. Assuming audio object a has four streams, a.out(chnl=0, inc=2) will output the four streams to channels 0, 2, 4 and 6 respectively (assuming your audio hardware supports at least 7 channels). That's about as precise as it gets (but see below). There is room for improvement in the way the output channels are chosen; I personally would want to use a list of channels as an argument to out().
In fact, this very behavior is possible and implemented. If you pass a list to chnl, each stream will be output to the respective channel. Just make sure that the list is the same length as the number of streams.
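The two ways of choosing output channels can be modeled in a few lines of plain Python. This is a sketch of the mapping as described in this section, not pyo's own code; output_channels is a hypothetical helper, and raising an error for a mismatched list is my own choice for the illustration:

```python
def output_channels(n_streams, chnl=0, inc=1):
    """Channel number for each stream, given out()'s chnl and inc arguments."""
    if isinstance(chnl, list):
        # Explicit list: one channel per stream, in order.
        if len(chnl) != n_streams:
            raise ValueError("chnl list must have one entry per stream")
        return list(chnl)
    # Single start channel: step by inc for each successive stream.
    return [chnl + i * inc for i in range(n_streams)]


print(output_channels(4, chnl=0, inc=2))
print(output_channels(4, chnl=[3, 1, 0, 2]))
```

The first call reproduces the 0, 2, 4, 6 example, and the second shows an arbitrary ordering given as a list.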
Spectrum generators
Pyo has four objects which can synthesize oscillators with complex spectrums:
- Blit, impulse train generator with control over the number of harmonics
- RCOsc, an RC circuit approximation (a capacitor and a resistor in series)
- SineLoop, sine wave oscillator with feedback
- SuperSaw, Roland JP-8000 Supersaw emulator
I won't be displaying scopes or spectrums of these here since there are so many generators, but this demo lets you experiment with them. In it, you can move the "voice" control to adjust the interpolation:
from pyo import *
s = Server().boot()
# Sets fundamental frequency.
freq = 187.5
# Impulse train generator.
lfo1 = Sine(.1).range(1, 50)
osc1 = Blit(freq=freq, harms=lfo1, mul=0.3)
# RC circuit.
lfo2 = Sine(.1, mul=0.5, add=0.5)
osc2 = RCOsc(freq=freq, sharp=lfo2, mul=0.3)
# Sine wave oscillator with feedback.
lfo3 = Sine(.1).range(0, .18)
osc3 = SineLoop(freq=freq, feedback=lfo3, mul=0.3)
# Roland JP-8000 Supersaw emulator.
lfo4 = Sine(.1).range(0.1, 0.75)
osc4 = SuperSaw(freq=freq, detune=lfo4, mul=0.3)
# Interpolates between input objects to produce a single output
sel = Selector([osc1, osc2, osc3, osc4]).out()
sel.ctrl(title="Input interpolator (0=Blit, 1=RCOsc, 2=SineLoop, 3=SuperSaw)")
# Displays the waveform of the chosen source
sc = Scope(sel)
# Displays the spectrum contents of the chosen source
sp = Spectrum(sel)
s.gui(locals())
FM generators
Pyo has two frequency modulation (FM) generators although it's simple for users to implement a custom generator.
from pyo import *
s = Server().boot()
# FM implements the basic Chowning algorithm
fm1 = FM(carrier=250, ratio=[1.5,1.49], index=10, mul=0.3)
fm1.ctrl()
# CrossFM implements a frequency modulation synthesis where the
# output of both oscillators modulates the frequency of the other one.
fm2 = CrossFM(carrier=250, ratio=[1.5,1.49], ind1=10, ind2=2, mul=0.3)
fm2.ctrl()
# Interpolates between input objects to produce a single output
sel = Selector([fm1, fm2]).out()
sel.ctrl(title="Input interpolator (0=FM, 1=CrossFM)")
sp = Spectrum(sel)
s.gui(locals())
Noise generators
In addition to white noise, pink noise and brown noise can be synthesized.
from pyo import *
s = Server().boot()
n1 = Noise(0.3)
n2 = PinkNoise(0.3)
n3 = BrownNoise(0.3)
sel = Selector([n1, n2, n3]).out()
sel.ctrl(title="Input interpolator (0=White, 1=Pink, 2=Brown)")
sp = Spectrum(sel)
s.gui(locals())
Strange attractors (yes that's the name)
There is a special group of waveforms called strange attractors. Without getting into too much math, these attractors have fractal properties and display chaotic behavior in the scope. Pyo has three strange attractors, Rossler, Lorenz and ChenLee, all of which support generating a stereo waveform.
The strange attractors can also be used as LFOs. In that case the attractor runs at a very low pitch and its output is used as the frequency of another oscillator.
from pyo import *
s = Server().boot()
# LFO applied to the chaos attribute
lfo = Sine(0.2).range(0, 1)
# Rossler attractor
n1 = Rossler(pitch=0.5, chaos=lfo, stereo=True)
# Lorenz attractor
n2 = Lorenz(pitch=0.5, chaos=lfo, stereo=True)
# ChenLee attractor
n3 = ChenLee(pitch=0.5, chaos=lfo, stereo=True)
sel = Selector([n1, n2, n3])
sel.ctrl(title="Input interpolator (0=Rossler, 1=Lorenz, 2=ChenLee)")
sc = Scope(sel)
# Lorenz with very low pitch value that acts as a LFO
freq = Lorenz(0.005, chaos=0.7, stereo=True, mul=250, add=500)
a = Sine(freq, mul=0.3).out()
s.gui(locals())
Random number generators
To wrap up this section I will show you generators that create random numbers, which can then be used for things like frequencies and the add and mul parameters. These generators can take a list of frequency values, which causes the generator to return that many streams of random numbers.
Choice will choose a random MIDI note from a list of notes (or a list of lists of notes, which will trigger list expansion) at the given frequencies. Randi makes a floating point number between a minimum and maximum value at the given frequencies. The "i" in Randi stands for interpolation: Randi interpolates between old and new values. RandInt makes a random integer between 0 and a maximum number (exclusive) at the given frequencies. All of these take mul and add parameters.
from pyo import *
s = Server().boot()
# Two streams of midi pitches chosen randomly in a predefined list.
# The argument `choice` of the Choice object can be a list of lists to
# trigger list expansion.
mid = Choice(choice=[60,62,63,65,67,69,71,72], freq=[2,3])
# Two small jitters applied on frequency streams.
# Randi interpolates between old and new values.
jit = Randi(min=0.993, max=1.007, freq=[4.3,3.5])
# Converts midi pitches to frequencies and applies the jitters.
fr = MToF(mid, mul=jit)
# Chooses a new feedback value, between 0 and 0.15, every 4 seconds.
fd = Randi(min=0, max=0.15, freq=0.25)
# RandInt generates a pseudo-random integer number between 0 and max
# values at a frequency specified by the `freq` parameter. It holds the
# value until the next generation.
# Generates a new LFO frequency once per second.
sp = RandInt(max=6, freq=1, add=8)
# Creates an LFO oscillating between 0 and 0.4.
amp = Sine(sp, mul=0.2, add=0.2)
# A simple synth...
a = SineLoop(freq=fr, feedback=fd, mul=amp).out()
s.gui(locals())
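To illustrate what "interpolates between old and new values" means for the jitter above, here is a plain Python model of a Randi-like generator. It is a sketch of the idea only, not how pyo implements Randi; the function name, the low sample_rate and the seed parameter are all my own choices for the illustration:

```python
import random


def randi_like(minimum, maximum, freq, duration, sample_rate=100, seed=None):
    """Linearly interpolate between random targets, `freq` new targets per second."""
    rng = random.Random(seed)
    samples_per_segment = max(1, round(sample_rate / freq))
    current = rng.uniform(minimum, maximum)
    target = rng.uniform(minimum, maximum)
    out = []
    for n in range(int(duration * sample_rate)):
        position = n % samples_per_segment
        if position == 0 and n > 0:
            # Reached the target: it becomes the start of the next ramp.
            current, target = target, rng.uniform(minimum, maximum)
        out.append(current + (target - current) * position / samples_per_segment)
    return out


values = randi_like(0.993, 1.007, freq=4, duration=2, seed=1)
print(len(values), min(values), max(values))
```

Each value glides from one random target to the next, whereas a generator that holds its value, like RandInt as described above, would jump in steps.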
And we're done
The large number of examples shows that pyo is an advanced library, but further discussion was cut short by the audio server problems, which prevented me from doing anything more with pyo. It's a shame, because it has a lot of other classes I didn't get to write about here. In the next part I will look at another library which hopefully doesn't have these kinds of errors.
Sometimes it takes just one bug to break the user experience.
Image by Gerd Altmann from Pixabay
Top comments (4)
s = Server().boot()
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_a52.c:823:(_snd_pcm_a52_open) a52 is only for playback
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
Pyo warning: Portaudio input device HDA Intel HDMI: 0 (hw:0,3) has fewer channels (0) than requested (2).
Portaudio error in Pa_OpenStream: Invalid number of channels
Pyo error: From portaudio, Invalid number of channels
Portaudio error in Pa_CloseStream (pa_deinit): PortAudio not initialized
Portaudio error in Pa_Terminate (pa_deinit): PortAudio not initialized
Pyo error:
Server not booted.
Any ideas why this doesn't work? Audacity, pd-l2or, VLC and many other audio applications have no trouble communicating with my sound interfaces.
It seems like a problem in Portaudio (the backend that Pyo uses). This message in particular:
I've been getting that error sometimes and I can't seem to reproduce it consistently. My wild guess is that something else exclusively opened the input device, but I haven't looked hard enough at the Portaudio code base to confirm this.
Thanks for the answer Ali. Tried all sorts, making sure nothing else using audio, increasing priority of python, etc. No joy I'm afraid. And no idea why other audio applications (audacity, pd-l2ork etc.) do not have any such problems. Suspect something wonky about the python implementation vis-à-vis the low-level audio interface. As I said, all other applications involving audio work without a hitch.
In case you're wondering why the image is low quality: It got degraded by DEV's CDN. The original image has much higher quality than this.
Update Feb 18: It looks like pyo's not the one which is causing trouble with audio server booting, it's the Portaudio backend that pyo uses internally. I found this out while experimenting with pyaudio yesterday, which also uses Portaudio. Portaudio is a C library not a Python library. Fortunately I found a (rather minimalist) python library that doesn't use Portaudio so I'm going to check that out.