If you haven’t read Part 1 yet, we highly recommend doing so. Before we see how compression works, there are two terms you should know: codec and container.
1) CODEC
Say you recorded sound as digital data and want to listen to it now. To do this, you need a ‘codec’ (a portmanteau of coder-decoder): a program or algorithm that encodes sound data into a compact stream for storage or delivery, and decodes that stream back into sound data for playback. There is no ‘perfect’ codec; the choice depends on factors like output file size, audio quality, delivery method, etc. In video, the predominant codec is currently H.264 (H.265 hasn’t found mainstream popularity yet), and it is usually paired with the AAC audio codec.
2) CONTAINER
The sound data that we encode is ‘contained’ in a container. Containers pack the audio stream together with other things like metadata. ‘Container’ is basically the technical term for what we loosely call a ‘format’. Containers are usually identified by file extensions; for example, the MPEG-4 container uses the .mp4 extension.
Codecs and containers are used in pairs, with the most prominent combination currently being H.264 video in an MPEG-4 container.
Okay, so you used a codec and converted your recorded data to an audio format. This format is typically WAV (Waveform Audio File Format) or AIFF (Audio Interchange File Format). They are very similar: WAV was developed by Microsoft and IBM, and AIFF is Apple’s counterpart. Both formats are used interchangeably today and, quite simply, they usually hold uncompressed, high-fidelity copies of the source audio.
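To make the codec/container split concrete, here is a minimal sketch using Python’s standard `wave` module. It writes one second of a sine tone as uncompressed PCM into a WAV container held in memory, then reads back the metadata the container stores alongside the samples. The function names, the 440 Hz tone and the mono 16-bit settings are our own choices for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # samples per second (CD quality)
SAMPLE_WIDTH = 2       # bytes per sample, i.e. 16-bit
CHANNELS = 1           # mono keeps the example small


def make_wav_bytes(frequency_hz: float = 440.0, seconds: float = 1.0) -> bytes:
    """Return a complete WAV file (header + PCM samples) holding a sine tone."""
    n_frames = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for n in range(n_frames):
        sample = math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
        # Scale to the 16-bit signed range and pack as little-endian.
        frames += struct.pack("<h", int(sample * 32767))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def describe_wav(data: bytes) -> dict:
    """Read back the metadata that the WAV container stores in its header."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
        }


if __name__ == "__main__":
    wav_bytes = make_wav_bytes()
    print(describe_wav(wav_bytes))
    print("total bytes:", len(wav_bytes))
```

The samples themselves are the audio stream; everything `describe_wav` returns comes from the header the container wraps around them.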
However, you don’t need a high-fidelity, original copy of Despacito or Afreen Afreen. Rather, you can’t afford to spare 34 MB for one song. Enter audio compression. (At this point we highly recommend that you check out this article on data compression.)
Audio compression is of two types:
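Where does a figure like 34 MB come from? A rough sketch of the arithmetic, assuming CD-quality audio (44.1 kHz sampling, 16 bits per sample, stereo) and a song of about 3 minutes 20 seconds. Those parameters are our assumptions, not properties of any particular song.

```python
def pcm_size_bytes(seconds: int, sample_rate_hz: int = 44_100,
                   bit_depth: int = 16, channels: int = 2) -> int:
    """Size of the raw (uncompressed) PCM samples, ignoring the small file header."""
    return seconds * sample_rate_hz * (bit_depth // 8) * channels


def pcm_bit_rate_kbps(sample_rate_hz: int = 44_100,
                      bit_depth: int = 16, channels: int = 2) -> float:
    """Bit rate of the uncompressed stream in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


if __name__ == "__main__":
    seconds = 3 * 60 + 20  # a 3:20 song
    size = pcm_size_bytes(seconds)
    print(f"{size / 1_000_000:.2f} MB")      # decimal megabytes
    print(f"{size / (1024 * 1024):.2f} MiB")  # binary mebibytes
    print(f"{pcm_bit_rate_kbps():.1f} kbps")
```

Under these assumptions the song needs roughly 35 MB (about 33.6 MiB), which is in line with the figure quoted above.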
• Lossy or Irreversible Compression
Lossy compression works using ‘psychoacoustics’, the study of sound perception. Basically, it removes information that we are less likely to hear, giving the illusion that the sound is unaltered. To an extent it works as described, but compressing a file beyond a certain degree results in distortion and the addition of ‘artifacts’, undesirable sounds which weren’t present in the original file. This discarding of data can shrink a file to as little as one-seventh of its original size, depending on the codec and settings used. The most ubiquitous audio format of the digital music landscape, MP3 (MPEG-1 Audio Layer 3), utilizes lossy audio compression. Although its patents expired in May, that didn’t sound the death knell for this omnipresent file format (used by Google Play Music). This does not imply that it’s the best lossy audio format. There are better codecs out there, like AAC (used in Apple Music) and Ogg Vorbis (used in Spotify). Vorbis (.ogg) is a free, open-source codec that provides better sound quality than MP3 at a similar file size, yet hasn’t found the same mainstream success.
These codecs are typically used at bit rates ranging from 96 kbps to 320 kbps.
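Bit rate translates directly into file size, which is why it matters so much. The sketch below estimates the size of a lossy file at a constant bit rate and compares it with uncompressed CD audio (1411.2 kbps, i.e. 44.1 kHz × 16 bits × 2 channels). The 200-second song length is an assumption for illustration, and real encoders often vary the bit rate, so treat the numbers as approximate.

```python
CD_BIT_RATE_KBPS = 1411.2  # 44,100 samples/s * 16 bits * 2 channels / 1000


def encoded_size_mb(seconds: float, bit_rate_kbps: float) -> float:
    """Approximate file size in decimal megabytes at a constant bit rate."""
    bits = seconds * bit_rate_kbps * 1000
    return bits / 8 / 1_000_000


def compression_ratio(bit_rate_kbps: float,
                      source_kbps: float = CD_BIT_RATE_KBPS) -> float:
    """How many times smaller the encoded stream is than the source."""
    return source_kbps / bit_rate_kbps


if __name__ == "__main__":
    song_seconds = 200  # assumed length, about 3:20
    for rate in (96, 128, 192, 320):
        size = encoded_size_mb(song_seconds, rate)
        ratio = compression_ratio(rate)
        print(f"{rate:>3} kbps: {size:.1f} MB, about {ratio:.1f}x smaller than CD audio")
```

At 192 kbps the stream is a little over seven times smaller than CD audio, which matches the reduction mentioned above.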
• Lossless or Reversible Compression
Lossless compression reduces file size by about half while maintaining exactly the same audio quality. This is achieved with techniques like RLE (run-length encoding), which compresses silences to almost zero space. Popular lossless codecs are FLAC (Free Lossless Audio Codec), ALAC (Apple Lossless Audio Codec) and WMA (Windows Media Audio) Lossless.
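Here is a toy sketch of run-length encoding on a list of integer samples, to show why silence compresses so well and why the process is fully reversible. It is only an illustration: real lossless codecs such as FLAC rely mainly on prediction and entropy coding, and the function names here are our own.

```python
from itertools import groupby


def rle_encode(samples: list[int]) -> list[tuple[int, int]]:
    """Collapse each run of identical samples into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(samples)]


def rle_decode(pairs: list[tuple[int, int]]) -> list[int]:
    """Expand (value, run_length) pairs back into the original samples."""
    samples: list[int] = []
    for value, count in pairs:
        samples.extend([value] * count)
    return samples


if __name__ == "__main__":
    # 1000 samples of silence, a short burst of sound, then 500 more of silence.
    samples = [0] * 1000 + [3, -2, 5] + [0] * 500
    encoded = rle_encode(samples)
    print(len(samples), "samples ->", len(encoded), "pairs")
    print(encoded)
    assert rle_decode(encoded) == samples  # nothing was lost
```

The 1503 samples collapse into five pairs, and decoding restores them exactly. Note that RLE only pays off when there are long runs of identical values; ordinary music rarely repeats a sample value many times in a row.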
Of these, FLAC is considered the gold standard for lossless compression and is offered by some music streaming services like Tidal (owned by Jay-Z), Qobuz and Deezer in their premium plans. Its bit rate varies with the material but is high: typically around half of the 1411 kbps of uncompressed CD audio. ALAC is quite similar to FLAC, except it used to be proprietary. Apart from FLACs and WAVs, there’s another format called DSD (Direct Stream Digital). It is a high-res format that uses an altogether different method of storing sound information. Instead of multi-bit samples, it stores a stream of 1-bit samples taken millions of times per second (about 2.8 million at the base DSD64 rate). However, it is far from popular and it’s possible that it may never be.
Now the bone of contention is whether high-fidelity audio formats like FLAC are worth their salt. The answer is, in most cases, no. You wouldn’t be able to perceive the difference on your run-of-the-mill headphones. And unfortunately, high-end audio systems don’t ship with a learning curve. Even audiophiles sometimes find it difficult to tell a FLAC from a 320 kbps Vorbis file. And even with today’s technological standards, we still seem to be running short of storage all the time. For now, at least, it seems quantity stands tall over quality. Perhaps with the burgeoning reign of music streaming services, higher-quality audio will become more prevalent.