Tuomas Siipola

File Formats in Web Audio API

The Web Audio API makes it possible to play and modify sounds and music in the browser. Unfortunately, different browsers and operating systems support different audio formats. Uncompressed WAVE files are universally supported, but in most situations serving them wastes network bandwidth. We should instead serve each user audio in the most suitable format their browser can decode.

Browser support

To provide content to a variety of platforms on the web, server-driven content negotiation has traditionally been used. A newer approach is to specify multiple versions of the same content in HTML and let the browser choose the most suitable one. This is done, for example, in <audio> and <picture> elements with multiple <source> elements. However, neither of these approaches is supported in the Web Audio API.

Luckily, it's possible to replicate this behavior with JavaScript. We can use HTMLMediaElement.canPlayType() to check whether the browser supports a certain multimedia format. For example, the following code checks if a common type of WAVE file is supported:

const $audio = document.createElement('audio');
const canPlayWav = $audio.canPlayType('audio/wav; codecs=1') === 'probably';

The return value is admittedly pretty strange. If the format is not supported, canPlayType returns an empty string. If the format is supported, it returns either "maybe" or "probably". As the HTML specification puts it:

Implementors are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" for a type that allows the codecs parameter if that parameter is not present.
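To make the three possible answers easier to compare, here is a minimal sketch that maps them to numbers. The names CONFIDENCE and supportLevel are hypothetical helpers, not part of any browser API, and the function takes canPlayType as a parameter so it can also be exercised outside a browser.

```javascript
// Hypothetical ranking of the three strings canPlayType() can return.
const CONFIDENCE = { '': 0, maybe: 1, probably: 2 };

// Returns 0 (unsupported), 1 ("maybe") or 2 ("probably") for a MIME type.
// `canPlayType` is any function with the same contract as
// HTMLMediaElement.canPlayType().
function supportLevel(canPlayType, mimeType) {
  return CONFIDENCE[canPlayType(mimeType)] || 0;
}

// In a browser, wrap the method of an <audio> element.
if (typeof document !== 'undefined') {
  const audio = document.createElement('audio');
  const canPlayType = type => audio.canPlayType(type);
  // Without the codecs parameter a browser should answer "maybe" at best.
  console.log(supportLevel(canPlayType, 'audio/wav'));
  console.log(supportLevel(canPlayType, 'audio/wav; codecs=1'));
}
```

Comparing numbers instead of strings makes it straightforward to accept "maybe" as a weaker match if no format reaches "probably".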

To make sure the browser actually supports the desired file format, it's important to explicitly specify the container format and codec. The following file formats should provide a good experience for most users:

| Format | MIME type | Target |
|---|---|---|
| Vorbis in WebM | audio/webm; codecs=vorbis | Desktop browsers and Android |
| AAC in MP4 | audio/mp4; codecs=mp4a.40.5 | iOS devices |
| PCM in WAVE | audio/wav; codecs=1 | Fallback for other platforms |

A couple of things to note:

Learn more about supported media formats on MDN. For more information on MIME types, see the somewhat outdated but still relevant page on the WHATWG Wiki.
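The format table can also be expressed as data, so that the preference order lives in one place. The following is only a sketch: FORMATS and pickExtension are hypothetical names, and falling back to 'wav' when nothing reports "probably" is an assumption based on WAVE being universally supported.

```javascript
// Preferred formats in priority order, mirroring the table in this section.
const FORMATS = [
  { extension: 'webm', mimeType: 'audio/webm; codecs=vorbis' },
  { extension: 'm4a', mimeType: 'audio/mp4; codecs=mp4a.40.5' },
  { extension: 'wav', mimeType: 'audio/wav; codecs=1' },
];

// Returns the extension of the first format the browser is confident about.
// `canPlayType` is any function with the contract of
// HTMLMediaElement.canPlayType(); `fallback` is used when nothing matches.
function pickExtension(canPlayType, formats = FORMATS, fallback = 'wav') {
  const match = formats.find(f => canPlayType(f.mimeType) === 'probably');
  return match ? match.extension : fallback;
}

// Browser usage: delegate to a detached <audio> element.
if (typeof document !== 'undefined') {
  const audio = document.createElement('audio');
  console.log(pickExtension(type => audio.canPlayType(type)));
}
```

Adding another format later then only requires a new entry in the array.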

Encoding files

Now we just need to encode the audio files in the different formats. This is easy to automate with FFmpeg, free software that supports pretty much any format we'll ever need.

Let's assume that there's a high-quality source file sound.wav. By running the following command in a terminal or a build script, we can encode the file in the formats listed in the previous table:

$ ffmpeg -y -i sound.wav -c libvorbis dist/sound.webm \
                         -c aac       dist/sound.m4a \
                         -c pcm_s16le dist/sound.wav
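For a build script, the same command line can be generated from a list of formats. This is a rough sketch rather than a finished tool: ENCODERS and ffmpegArgs are hypothetical names, and it uses the audio-specific -c:a form of the codec option, which is equivalent to -c for an audio-only input.

```javascript
// Hypothetical table of outputs: file extension and FFmpeg audio encoder.
const ENCODERS = [
  { extension: 'webm', encoder: 'libvorbis' },
  { extension: 'm4a', encoder: 'aac' },
  { extension: 'wav', encoder: 'pcm_s16le' },
];

// Builds the arguments for one ffmpeg invocation that writes one output
// file per entry in ENCODERS.
function ffmpegArgs(input, outDir, name) {
  const args = ['-y', '-i', input];
  for (const { extension, encoder } of ENCODERS) {
    args.push('-c:a', encoder, `${outDir}/${name}.${extension}`);
  }
  return args;
}

// Print the resulting command for inspection.
console.log(['ffmpeg', ...ffmpegArgs('sound.wav', 'dist', 'sound')].join(' '));
```

In a Node.js build script the array could be passed to child_process.execFileSync('ffmpeg', args), assuming ffmpeg is on the PATH and the dist directory already exists.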

Example

To bring everything together, let's fetch a sound effect in the most suitable format and play it.

// Older versions of Safari expose AudioContext only with a webkit prefix.
const AudioContext = window.AudioContext || window.webkitAudioContext;
const context = new AudioContext();

const $audio = document.createElement('audio');

// Pick the first format the browser is confident it can play.
// WAVE is universally supported, so it is the default fallback.
let extension = 'wav';

if ($audio.canPlayType('audio/webm; codecs=vorbis') === 'probably') {
  extension = 'webm';
} else if ($audio.canPlayType('audio/mp4; codecs=mp4a.40.5') === 'probably') {
  extension = 'm4a';
}

fetch(`sound.${extension}`)
  .then(response => response.arrayBuffer())
  // The callback form of decodeAudioData also works in older browsers
  // that do not return a promise from it.
  .then(data => context.decodeAudioData(data, buffer => {
    // Wrap the decoded buffer in a source node and play it immediately.
    const source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(context.destination);
    source.start(0);
  }));
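The fetch-and-decode step can also be wrapped in a reusable function that reports errors. The sketch below is one possible way to do that: decode, loadSound and play are hypothetical names, and the fetchFn parameter exists only so the function can be tried without a network.

```javascript
// Wraps the callback form of decodeAudioData in a promise so it behaves
// the same in browsers with and without the promise-based variant.
function decode(context, data) {
  return new Promise((resolve, reject) => {
    context.decodeAudioData(data, resolve, reject);
  });
}

// Fetches a URL and decodes it into an AudioBuffer.
// Rejects if the HTTP request fails or the data cannot be decoded.
async function loadSound(context, url, fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  return decode(context, await response.arrayBuffer());
}

// Plays a decoded buffer once through the context's default output.
function play(context, buffer) {
  const source = context.createBufferSource();
  source.buffer = buffer;
  source.connect(context.destination);
  source.start(0);
  return source;
}
```

With an existing context, usage could look like loadSound(context, 'sound.webm').then(buffer => play(context, buffer)). Keeping the decoded buffer around allows the same sound to be played again without another download.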

Unsurprisingly, the compressed Vorbis and AAC files are about one-tenth the size of the uncompressed WAVE file:

| Format | File size |
|---|---|
| Vorbis in WebM | 9.7 KiB |
| AAC in MP4 | 11 KiB |
| PCM in WAVE | 113 KiB |

Full example code can be found here. The sound Calling_Bell_02.wav used in the example was created by RSilveira_88 and is licensed under CC BY 3.0.