
File formats in Web Audio API

Web Audio API makes it possible to play and modify sounds and music in the browser. Unfortunately, different browsers, operating systems and devices support different audio formats. The common WAVE files are universally supported, but in most situations using them wastes precious network bandwidth. In this article, we investigate how to serve audio to users in the most suitable format.

Browser support

To provide content to a variety of platforms on the web, server-driven content negotiation has traditionally been used. A newer approach is to specify multiple versions of the same content in HTML and let the browser choose the most suitable one. This is done, for example, in <audio> and <picture> elements with multiple <source> elements. However, neither of these approaches is supported in Web Audio API.

Luckily, it's possible to replicate this behavior with JavaScript. We can use the canPlayType method of HTMLMediaElement to check whether the browser supports a certain multimedia format. Creating an HTMLMediaElement is easy with new Audio(), which is equivalent to calling document.createElement('audio') or writing <audio> in HTML. Now we can check, for example, if the browser supports Ogg Vorbis:

const audio = new Audio();
const supports = audio.canPlayType('audio/ogg; codecs=vorbis') === 'probably';

The return value is a bit strange: if the browser might support the format, canPlayType returns either "maybe" or "probably". In order to receive the more certain value "probably", we must explicitly specify both the container format and the codec:

Implementors are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" for a type that allows the codecs parameter if that parameter is not present.
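
For example, the results might look roughly like this (the exact strings depend on the browser and its codec support):

const audio = new Audio();

// Without the codecs parameter the browser can only guess:
audio.canPlayType('audio/ogg');                // typically "maybe"

// With an explicit codec it can give a more confident answer:
audio.canPlayType('audio/ogg; codecs=vorbis'); // "probably" if Vorbis is supported

// A type the browser knows it cannot play returns an empty string:
audio.canPlayType('audio/example');            // ""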

The following file formats should provide a good experience for most users:

Format        | MIME types                          | Target
Opus in Ogg   | audio/ogg; codecs=opus              | Modern platforms
Vorbis in Ogg | audio/ogg; codecs=vorbis            | Old platforms
Opus in CAF   | audio/x-caf; codecs=opus            | macOS High Sierra/iOS 11 or later
AAC-LC in MP4 | audio/mp4; codecs=mp4a.40.2         | Old Apple devices
PCM in WAVE   | audio/wave, audio/wav, audio/x-wav  | Fallback

Instead of the lossy formats listed above, we might sometimes want to preserve audio quality by using lossless formats:

Format                   | MIME types                          | Target
FLAC in native container | audio/flac                          | Platforms with FLAC support
PCM in WAVE              | audio/wave, audio/wav, audio/x-wav  | Fallback
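
As a minimal sketch of how to use these tables (the same idea, spelled out in full, appears in the example at the end of this article), we can list the candidate MIME types in order of preference and pick the first one the browser is confident about:

const audio = new Audio();

// Candidate formats in order of preference, taken from the table above.
const candidates = [
  { type: 'audio/ogg; codecs=opus',      extension: '.opus' },
  { type: 'audio/ogg; codecs=vorbis',    extension: '.ogg'  },
  { type: 'audio/x-caf; codecs=opus',    extension: '.caf'  },
  { type: 'audio/mp4; codecs=mp4a.40.2', extension: '.m4a'  },
];

// Pick the first format the browser reports as "probably" playable,
// or fall back to WAVE.
const best = candidates.find(c => audio.canPlayType(c.type) === 'probably');
const extension = best ? best.extension : '.wav';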


For more information on different formats, read Media type and format guide on MDN. For more details on MIME types in different browsers, check out Video type parameters on WHATWG Wiki.

Encoding files

Now we just need to encode the audio files in the different formats. This can be easily automated with FFmpeg, free software that supports pretty much any format we'll ever need.

Let's assume that there's a high-quality source file sound.wav. By running the following command in a terminal or a build script, we can encode the file in the previously mentioned formats:

$ ffmpeg -y -i sound.wav -f ogg  -c libopus   dist/sound.opus \
                         -f ogg  -c libvorbis dist/sound.ogg  \
                         -f caf  -c libopus   dist/sound.caf  \
                         -f mp4  -c aac       dist/sound.m4a  \
                         -f flac -c flac      dist/sound.flac \
                         -f wav  -c pcm_s16le dist/sound.wav

The default options of FFmpeg are reasonable, but we could trade audio quality for smaller files by limiting the bitrate, for instance with the -b:a option.

Example

To bring everything together, let's fetch a sound effect in the most suitable format and play it. We'll use the sound Calling_Bell_02.wav by RSilveira_88, which is licensed under CC BY 3.0.

const context = new AudioContext();

const audio = new Audio();
const extensions = [];
if (audio.canPlayType('audio/ogg; codecs=opus') === 'probably') {
  extensions.push('.opus');
}
if (audio.canPlayType('audio/ogg; codecs=vorbis') === 'probably') {
  extensions.push('.ogg');
}
if (audio.canPlayType('audio/x-caf; codecs=opus') === 'probably') {
  extensions.push('.caf');
}
if (audio.canPlayType('audio/mp4; codecs=mp4a.40.2') === 'probably') {
  extensions.push('.m4a');
}
const flacSupport = audio.canPlayType('audio/flac');
if (flacSupport === 'maybe' || flacSupport === 'probably') {
  extensions.push('.flac');
}
extensions.push('.wav');

async function loadSound(url) {
  // Try each extension in order of preference until one can be decoded.
  for (const extension of extensions) {
    const response = await fetch(url + extension);
    if (!response.ok) {
      throw new Error(response.statusText);
    }
    const buffer = await response.arrayBuffer();
    try {
      return await context.decodeAudioData(buffer);
    } catch (e) {
      // Decoding failed; fall through and try the next format.
    }
  }
  throw new Error('Unable to decode audio in any format');
}

loadSound('Calling_Bell_02')
  .then(buffer => {
    const source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(context.destination);
    source.start(0);
  });

Here are the file sizes of different formats using the default settings of FFmpeg:

Format                  | File size
Opus in Ogg             | 13 KiB
Vorbis in Ogg           | 9.9 KiB
Opus in CAF             | 13 KiB
AAC in MP4              | 9.7 KiB
FLAC (native container) | 106 KiB
PCM in WAVE             | 113 KiB

Unsurprisingly, the file sizes of the lossy formats are about one-tenth of the uncompressed WAVE file. Newer formats like Opus don't automatically result in smaller files than older formats like Vorbis: the file size depends heavily on the audio content and the encoder options. Instead of testing supported formats in a fixed order, we could pick the smallest one the browser supports.
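
As a rough sketch of that idea, assuming the file sizes are known up front (for example written out by the build script; the manifest below is hypothetical and uses the sizes from the table above), we could filter the candidates with canPlayType and sort them by size:

const audio = new Audio();

// Hypothetical manifest: MIME type and size (in KiB) for each encoded file.
const manifest = [
  { extension: '.opus', type: 'audio/ogg; codecs=opus',      kib: 13   },
  { extension: '.ogg',  type: 'audio/ogg; codecs=vorbis',    kib: 9.9  },
  { extension: '.caf',  type: 'audio/x-caf; codecs=opus',    kib: 13   },
  { extension: '.m4a',  type: 'audio/mp4; codecs=mp4a.40.2', kib: 9.7  },
  { extension: '.wav',  type: 'audio/wave',                  kib: 113  },
];

// Keep the formats the browser might play ("maybe" or "probably")
// and pick the smallest of them, falling back to WAVE.
const playable = manifest.filter(f => audio.canPlayType(f.type) !== '');
playable.sort((a, b) => a.kib - b.kib);
const extension = playable.length > 0 ? playable[0].extension : '.wav';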