Pitch shifting in Web Audio API

Published 2020-02-23
Last updated 2021-06-07

Using samples is a quick and easy way to create realistic-sounding virtual instruments. We only need a couple of samples, typically recordings of individual notes played on a real instrument, which can then be pitch shifted to play any note. This article will teach you how to create a sample-based instrument using JavaScript and the Web Audio API.

Changing the pitch

Let's begin by loading a sample and playing it. We'll use a recording of the note C₄ played on a harpsichord by pjcohen, licensed under CC0 1.0.

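const context = new AudioContext();

// Fetch an audio file and decode it into an AudioBuffer.
function loadSample(url) {
  return fetch(url)
    .then(response => {
      // fetch() does not reject on HTTP errors, so check the status explicitly.
      if (!response.ok) {
        throw new Error(`Failed to load ${url}: HTTP ${response.status}`);
      }
      return response.arrayBuffer();
    })
    .then(buffer => context.decodeAudioData(buffer));
}

// Play a decoded AudioBuffer once through the default output.
function playSample(sample) {
  const source = context.createBufferSource();
  source.buffer = sample;
  source.connect(context.destination);
  source.start(0);
}

loadSample('harpsichord-c4.wav')
  .then(sample => playSample(sample));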

How can we modify the pitch of the sample? Looking at the available options, the detune property of AudioBufferSourceNode seems to be the obvious answer. However, it has a couple of major drawbacks:

Safari does not support detune at all (see WebKit bug 193445).
Firefox does support detune, but its range is limited to one octave (see Firefox bug 1624681). This limitation comes from an older version of the Web Audio API specification, so other browsers (or their old versions) may have the same limitation.

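Instead of using detune, we'll use the better-supported playbackRate property for pitch shifting. For example, setting playbackRate to 2 plays the sample twice as fast, and setting it to 0.5 plays it at half speed. Because the sample is resampled rather than time-stretched, its pitch rises and falls together with the speed.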

function playSample(sample, rate) {
  const source = context.createBufferSource();
  source.buffer = sample;
  source.playbackRate.value = rate;
  source.connect(context.destination);
  source.start(0);
}
playSample(sample, 0.5);
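One side effect worth keeping in mind is that the length of the sound changes along with its pitch. The following is a minimal sketch of that relationship; playedDuration is a hypothetical helper written for illustration, not part of the Web Audio API, and it only considers positive rates.

```javascript
// Hypothetical helper: how long a buffer lasting `bufferDuration` seconds
// will sound when it is played back at the given positive `rate`.
function playedDuration(bufferDuration, rate) {
  if (!(rate > 0)) {
    throw new RangeError('rate must be a positive number');
  }
  return bufferDuration / rate;
}

console.log(playedDuration(3, 2));   // 1.5 (twice as fast, half as long)
console.log(playedDuration(3, 0.5)); // 6 (half speed, twice as long)
```

In a browser, the first argument could come from the duration property of the decoded AudioBuffer.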

Test different playback rates below:

Make it musical

We can now change the pitch of the sample by changing its playback rate. But which playback rate should we use to play a specific musical note? To answer this, we need to understand how the frequencies of pitches are related to each other.

Let's start with the fundamental relationship between frequencies: an octave. An octave means that the ratio between two frequencies is 2:1. In other words, the playback rate for one octave higher is 2 and for two octaves higher it's 4. Likewise, the playback rate for one octave lower is 0.5 and for two octaves lower it's 0.25. We can express this as the following function:

\[\text{playback-rate}(\text{octaves}) = 2^{\text{octaves}}\]

In Western music, each octave is divided into 12 equal parts called semitones. "Equal" here means that every semitone step multiplies the frequency by the same factor. This can be expressed with a small modification to the previous function:

\[\text{playback-rate}(\text{semitones}) = 2^{\text{semitones}/12}\]
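To get a feel for the numbers, the formula can be evaluated for a few common intervals. This is only an illustrative sketch; semitonesToRate is a name chosen here, not something provided by the Web Audio API.

```javascript
// Playback rate needed to shift a sample by a number of semitones.
// Positive values shift up, negative values shift down.
function semitonesToRate(semitones) {
  return 2 ** (semitones / 12);
}

console.log(semitonesToRate(0));   // 1 (unchanged)
console.log(semitonesToRate(12));  // 2 (one octave up)
console.log(semitonesToRate(-12)); // 0.5 (one octave down)
console.log(semitonesToRate(7).toFixed(4)); // "1.4983" (a fifth up, close to 3:2)
```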

Typically, we think in terms of absolute pitches like C₄ or D₄. We can represent these pitches as MIDI notes 60 and 62, which makes it easy to calculate their difference of 2 semitones. In general, the playback rate required to play MIDI note \(a\) using a sample of MIDI note \(b\) is the following:

\[\text{playback-rate}(a,b) = 2^{(a-b)/12}\]

Let's turn this into code:

function playSample(sample, sampleNote, noteToPlay) {
  const source = context.createBufferSource();
  source.buffer = sample;
  source.playbackRate.value = 2 ** ((noteToPlay - sampleNote) / 12);
  source.connect(context.destination);
  source.start(0);
}
playSample(sample, 60, 62);

Now we can use this function to build complete musical instruments. Try out the harpsichord below by clicking or tapping the keys or by using your computer's keyboard (first move focus by clicking or tabbing, then play starting from the Q or Z key on a QWERTY keyboard):

If you're interested, check out the source code for the harpsichord.
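The MIDI note numbers passed to playSample can also be derived from note names. The sketch below assumes the convention used in this article, where C₄ is MIDI note 60, and accepts plain ASCII names such as 'C4', 'F#3' or 'Bb2'. The function name and the accepted name format are assumptions made for illustration.

```javascript
// Semitone offset of each natural note from C within one octave.
const SEMITONES_FROM_C = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

// Hypothetical helper: converts a note name like 'C4' or 'F#3' to a
// MIDI note number, assuming C4 = 60 (so octave -1 starts at note 0).
function noteNameToMidi(name) {
  const match = /^([A-G])([#b]?)(-?\d+)$/.exec(name);
  if (!match) {
    throw new Error(`Unrecognized note name: ${name}`);
  }
  const [, letter, accidental, octave] = match;
  const offset = accidental === '#' ? 1 : accidental === 'b' ? -1 : 0;
  return 12 * (Number(octave) + 1) + SEMITONES_FROM_C[letter] + offset;
}

console.log(noteNameToMidi('C4')); // 60
console.log(noteNameToMidi('D4')); // 62
console.log(noteNameToMidi('A4')); // 69
```

With this in place, a call such as playSample(sample, noteNameToMidi('C4'), noteNameToMidi('D4')) is equivalent to playSample(sample, 60, 62).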

Sometimes we need intervals smaller than a semitone, for example to implement vibrato or pitch bending. A commonly used unit for this is the cent, which is also the unit of the detune property. There are 100 cents in a semitone, so calculating the playback rate is straightforward:

\[\text{playback-rate}(\text{cents}) = 2^{\text{cents}/1200}\]

However, it's easy to convert cents to fractional semitones, so we don't need to rewrite our code in order to use cents:

let bend = 10; // 10 cents = 0.1 semitones
playSample(sample, 60, 62 + bend / 100);
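For vibrato, the pitch has to change while the note is sounding. One possible approach, sketched below under some assumptions, is to precompute a curve of playback rates and schedule it with the setValueCurveAtTime method of the playbackRate AudioParam. Both centsToRate and vibratoCurve are hypothetical helpers, and the depth of 30 cents and the 256 curve points are arbitrary illustrative choices.

```javascript
// Playback-rate multiplier for an offset in cents (1200 cents = 1 octave).
function centsToRate(cents) {
  return 2 ** (cents / 1200);
}

// Hypothetical helper: samples a sine-shaped vibrato around `baseRate`.
// `depthCents` is the peak deviation, `cycles` the number of full
// oscillations spread over the curve, `points` the curve resolution.
function vibratoCurve(baseRate, depthCents, cycles, points = 256) {
  const curve = new Float32Array(points);
  for (let i = 0; i < points; i++) {
    const phase = (i / (points - 1)) * cycles * 2 * Math.PI;
    curve[i] = baseRate * centsToRate(depthCents * Math.sin(phase));
  }
  return curve;
}

// Assumed usage in a browser, with `source` created as in playSample:
// const curve = vibratoCurve(source.playbackRate.value, 30, 10);
// source.playbackRate.setValueCurveAtTime(curve, context.currentTime, 2);
```

In the commented usage, 10 cycles spread over 2 seconds would give a vibrato of roughly 5 Hz.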

Development ideas

There are many ways to improve upon the implementation presented here:

Using just a single sample can create a dull sound, especially when it's pitch shifted considerably. For a more realistic result, it's a good idea to use multiple samples of different notes and pitch shift the closest one to play a note.
Adding dynamics, i.e. playing a note hard or softly, is another way to increase the realism and expressiveness of an instrument. Dynamics can be faked by changing the volume, but for a better result multiple samples of different velocities should be used.
Using the Web MIDI API, the instrument could be played on a physical MIDI keyboard.
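The first two ideas can be sketched roughly as follows. This is an illustration under stated assumptions, not a finished design: the shape of the samples array ({ note, buffer } objects), the function names, and the linear mapping from MIDI velocity (0 to 127) to gain are all choices made here for the example.

```javascript
// Picks the sample whose recorded note is nearest to the note to play.
// `samples` is an assumed shape: [{ note: 48, buffer: AudioBuffer }, ...].
// On a tie, the sample that appears first in the array wins.
function pickClosestSample(samples, noteToPlay) {
  if (samples.length === 0) {
    throw new Error('No samples available');
  }
  return samples.reduce((best, candidate) =>
    Math.abs(candidate.note - noteToPlay) < Math.abs(best.note - noteToPlay)
      ? candidate
      : best
  );
}

// Plays `noteToPlay` by pitch shifting the closest sample and scaling
// its volume by velocity (a simple linear mapping, assumed here).
function playNote(context, samples, noteToPlay, velocity = 127) {
  const { note, buffer } = pickClosestSample(samples, noteToPlay);
  const source = context.createBufferSource();
  source.buffer = buffer;
  source.playbackRate.value = 2 ** ((noteToPlay - note) / 12);

  const gain = context.createGain();
  gain.gain.value = velocity / 127;

  source.connect(gain);
  gain.connect(context.destination);
  source.start(0);
  return source;
}
```

With samples recorded at, say, notes 48, 60 and 72, playing note 62 would shift the sample for note 60 up by two semitones instead of stretching a more distant sample.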

Finally, there are many sound synthesis techniques like filters and envelopes which can be used, with or without samples, to create more varied and interesting sounds.