High Quality Audio Discs

An overview of high-density audio formats,
sample rates, and what they mean

by Richard Elen, August, 2000
Originally published on AudioRevolution.com

 

Several years ago, the search began for a new consumer audio/video distribution medium to supersede VHS and LaserDisc. The new format would be entirely digital, with both digital picture and sound. It would also be compact. And it would offer 5.1 surround-sound capability.

The DVD-Video format was designed to meet these goals, using a CD-like disc of much higher density, read by a laser beam of shorter wavelength and thus able to carry more data in the same space.

But while DVD offered digital images and wide-screen picture formats, the audio, despite offering 5.1 surround, did not greatly advance the level of audio quality over that of Compact Disc — at least not initially.

Compact Disc uses PCM (pulse code modulation) to encode 16-bit words at a 44.1 kHz sample rate. This means that in the digital recorder used to master a CD, the converter measures the instantaneous voltage of the waveform 44,100 times a second and stores each value as a "word" 16 binary digits ("bits") in length.
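To make the idea concrete, here is a minimal sketch of that measure-and-store step, assuming an idealized converter and a signal scaled so that full level is 1.0. The function names are purely illustrative and not taken from any real mastering system.

```python
import math

SAMPLE_RATE = 44_100   # CD sample rate in Hz
BITS = 16              # CD word length

def quantize(x, bits=BITS):
    """Map a value in [-1.0, 1.0] to a signed integer word of the given length."""
    full_scale = 2 ** (bits - 1) - 1      # 32767 for 16 bits
    x = max(-1.0, min(1.0, x))            # clip anything beyond full scale
    return round(x * full_scale)

def sample_sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Measure a full-level sine once per sample period and quantize each value."""
    return [quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate))
            for n in range(n_samples)]

# The first eight 16-bit words of a 1 kHz tone.
words = sample_sine(1_000, 8)
print(words)
```

Each entry in `words` is one 16-bit measurement; a CD simply stores 44,100 of these per second for each channel.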

There are two limitations here: both the sample rate and the word length limit the maximum audio quality to a level that is arguably less than we can hear. For example, the dynamic range — the difference in loudness between the loudest and quietest signal you can record — of a 16-bit signal is about 96 dB. Although the vast majority of music does not use this much dynamic range (rock and pop music in particular), we can probably hear greater differences in level than 16 bits allows. In fact we can notice a difference between 16 and 20 bits, and even between 20 and 24, although the difference is less easy to spot. But 24 bits is probably beyond the limits of the ear/brain combination's ability to distinguish additional detail.
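The 96 dB figure follows from the usual rule of thumb of roughly 6 dB per bit. As a quick check, assuming the simple ratio between full scale and one quantization step (and ignoring dither and noise shaping), the theoretical numbers for the word lengths mentioned here can be worked out like this:

```python
import math

def dynamic_range_db(bits):
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 20, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

These are idealized figures only; what a real converter achieves is another matter.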

Robert Stuart, audio expert, head of leading consumer audio manufacturer Meridian Audio (http://www.meridian-audio.com) and co-developer of the Meridian Lossless Packing (MLP) system used in DVD-Audio, notes that the word length does not determine the dynamic range in real life. Instead, the number of bits defines the noise floor, not the resolution of the system. This is because all modern digital systems use dither — generally a special kind of noise — to smooth out transitions from one bit to another at low levels. The better-dithered a system is, the higher its resolution — so much so that perfect dither would produce a system with infinite resolution, irrespective of the actual number of bits, which would simply define the noise floor. In a good digital system, just like analog, you can hear music way down below the noise — and dither makes a digital system behave more like analog. And there are real limits to how low the noise floor can be, because of such factors as thermal noise in the components. A 24-bit system theoretically yields a noise floor of -144 dB, but you are hard-pushed to exceed about -120.
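A rough way to see what dither does is to quantize a signal that is smaller than one step. The sketch below is an illustration under stated assumptions, not Stuart's or Meridian's method: it assumes a quantizer with a step of 1.0, a sine wave peaking at 0.4 of a step, and triangular (TPDF) dither made by summing two uniform random values. All names are hypothetical.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole quantization step (1.0 = one step)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF noise spanning +/-1 step: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def correlation(a, b):
    """Normalized correlation; returns 0.0 if either sequence is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20_000
# A sine wave peaking at 0.4 of one step: too small to cross a step on its own.
signal = [0.4 * math.sin(2 * math.pi * 50 * i / n) for i in range(n)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither(rng)) for s in signal]

print("without dither:", correlation(signal, plain))
print("with dither:   ", correlation(signal, dithered))
```

Without dither every sample rounds to zero and the signal is simply gone. With dither the output is noisy, but it still correlates with the original: the music survives beneath the noise floor.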

For some years, there have been calls for higher sampling rates to permit the capture of higher frequencies, even though these may be inaudible directly.

Thanks to the Shannon-Nyquist Theorem in information theory, a sample rate of 44.1 kHz means that the highest frequency you can record is about 22.05 kHz — half the sample rate, known as the "Nyquist Limit". Well, of course, you might argue that as we can only hear up to 18 kHz, this doesn’t matter — and you may be right. But arguably, there may be some instruments that produce ultrasonic signals, and these interfere with each other in the air to produce audible beat frequencies which affect the timbre of the sound. Thus, if you have an upper frequency limit of 22 kHz, you may be missing something. Maybe.
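The arithmetic is simple enough to sketch. The example below assumes an ideal sampler with no filter in front of it, and the function names are illustrative only. It shows both the Nyquist limit and where a tone above that limit would turn up if it were allowed through.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be represented at a given sample rate."""
    return sample_rate_hz / 2

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling with no filter in front."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_limit(44_100))            # 22050.0
print(alias_frequency(10_000, 44_100))  # 10000: below the limit, unchanged
print(alias_frequency(30_000, 44_100))  # 14100: folds back into the audio band
```

The last line is why a filter is needed ahead of the converter: an ultrasonic 30 kHz tone would otherwise reappear as a perfectly audible 14.1 kHz one.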

Another argument in favor of higher sample rates has been that a lot of the problems with early digital recordings were related to the filters used to cut off the audio signal before it reaches the Nyquist limit. In the old days, these were analog "brick-wall" filters that introduced ringing and significant phase-shifts that adversely affected the sound, making it harsh and clangy, and resulting in early criticism of digital.

Some audiophiles, unfortunately, never listened to digital again, and thereby missed out. Because in the intervening years, we learned a great deal about digital audio. In particular, we discovered "oversampling". Here, the system is clocked at a multiple of the "real" sample rate, creating a set of imaginary zero-level samples between the real ones. These are then integrated to generate a series of samples at a much higher sample rate, where the Nyquist Limit is way up out of the way — today you can oversample tens of times and have a Nyquist limit so high up that there is absolutely no useful information lost in the conversion process.
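As a minimal sketch of the idea, the code below inserts zero-level samples between the real ones and then smooths them with a short filter. The filter here is plain linear interpolation, chosen only because it is easy to follow; that is an assumption made for illustration, and real converters use far longer low-pass filters. The function names are hypothetical.

```python
def zero_stuff(samples, factor):
    """Insert factor-1 zero-level samples after each real sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out

def linear_interpolation_taps(factor):
    """Triangular filter that turns the stuffed zeros into straight-line ramps."""
    return [1 - abs(k) / factor for k in range(-(factor - 1), factor)]

def fir_filter(signal, taps):
    """Centered convolution; the output has the same length as the input."""
    half = len(taps) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = n + half - k
            if 0 <= idx < len(signal):
                acc += tap * signal[idx]
        out.append(acc)
    return out

def oversample(samples, factor):
    """Zero-stuff, then filter, giving factor times as many samples."""
    return fir_filter(zero_stuff(samples, factor), linear_interpolation_taps(factor))

upsampled = oversample([0.0, 4.0, 8.0], 4)
print(upsampled[:9])   # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

The values after the last real sample ramp back toward zero, because the filter has nothing further to interpolate toward.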

There is an advantage in having a higher sample rate, however, in that you get a string of real samples instead of most of them being imaginary. Perhaps, you can hear the difference...

In addition, we simply don’t do brick-wall analog filters any more. We don’t even do conversion the way we did thirty or even fifteen years ago. Today, we use "Delta-Sigma" converters, which essentially produce a one-bit bitstream without the limitations of previous methods. That one-bit technique, incidentally, forms the foundation of a completely different digital recording system, DSD, that we’ll look at shortly.
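For the curious, a first-order Delta-Sigma modulator can be sketched in a few lines. This is a textbook simplification, assumed here purely for illustration; commercial converters use higher-order designs, and the names below are hypothetical. The input is a value between -1 and 1, and the output is a stream of +1 and -1 values whose running average follows the input.

```python
def delta_sigma(samples):
    """First-order modulator: input in [-1, 1], output a stream of +1/-1 values."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback                    # accumulate the running error
        feedback = 1.0 if integrator >= 0 else -1.0   # one-bit quantizer
        bits.append(feedback)
    return bits

# A steady input of 0.5 yields a bitstream that averages out to about 0.5.
stream = delta_sigma([0.5] * 1_000)
print(stream[:8])
print(sum(stream) / len(stream))
```

The information is carried in the density of the ones rather than in multi-bit words, which is why such a stream needs a very high clock rate.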

Despite evidence that a properly recorded 24-bit digital signal, even at a sample rate as low as 44.1 kHz, can give extremely high quality results at least exceeding those achieved with traditional analog techniques, there has been pressure for some time for a carrier that can offer better results. The obvious choice of sample rates was double the existing 44.1 and 48 kHz (the latter is used in some professional environments and in conjunction with video), i.e., 88.2 and 96 kHz. Any audibly perceptible improvement in a digital signal disappears by the time you get to a sample rate of about 64 kHz, but it made a lot of technical sense to make the new sample rates simple multiples of existing ones, for ease of sample rate conversion. Converting from 48 to 44.1 kHz, for example, requires a 276-pole digital filter, which is likely to sound a bit nasty; converting from 88.2 to 44.1, on the other hand, simply means keeping every other sample, plus a different filter.
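The awkwardness of some conversions shows up as soon as you reduce the rates to a ratio. The sketch below only does that arithmetic; it says nothing about the filter design itself, and the function name is illustrative.

```python
from fractions import Fraction

def conversion_ratio(source_hz, target_hz):
    """Target rate over source rate, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)

print(conversion_ratio(48_000, 44_100))   # 147/160
print(conversion_ratio(88_200, 44_100))   # 1/2
print(conversion_ratio(96_000, 48_000))   # 1/2
print(conversion_ratio(96_000, 44_100))   # 147/320
```

A ratio of 1/2 means every second sample lines up exactly. A ratio of 147/160 means the two sample grids coincide only once every 160 input samples, so nearly every output sample has to be calculated from its neighbors.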

DVD-Video offers six channels of audio, but in most cases you can only manage 48 kHz sampling and up to 20 bit word lengths. You can, as some smaller record companies like Chesky have done, create two-channel, 24-bit, 96 kHz DVD-Video discs with very limited graphic content, but there isn’t the room for high-density surround sound.

The recent DVD-Audio specification is intended to address this concern. It offers up to six channels of 24/96 digital audio, or even two channels of 24-bit, 192 kHz sampling, which is probably excessive for all but the most exacting audiophile purposes.

You’ll notice that there are six full-bandwidth channels. There are six full-bandwidth channels in DVD-Video, too, although the bandwidth is, of course, not as wide. If you have all your five main channels offering full bandwidth (from DC on up), what is the Low Frequency Effects channel doing? Essentially nothing. If you’ve read my article An Introduction to 5.1 (or whatever it’s called), you’ll recall that the LFE was designed to carry very low frequency sounds like dinosaur footfalls in an analog movie theater environment, separately so that they would not cause additional distortion in the main audio channels. Neither DVD-Video nor DVD-Audio needs one, as your system’s bass management capabilities are there to make sure that wherever the bass comes from, it is sent to the speakers best able to handle it. Especially when it comes to DVD-Audio, that LFE is just taking up precious real-estate, so why not use it for something else? The obvious choice is to use it for height information, literally adding a new dimension to the home listening experience.

As it is, DVD-Audio is hard-pushed to get a reasonable playing time on a disc. The only thing that makes it possible is compression.
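Some back-of-envelope arithmetic shows the scale of the problem. The sketch below assumes a single-layer disc holding 4.7 billion bytes and ignores all formatting overhead, so treat the playing time as an optimistic upper bound rather than a specification figure. The names are illustrative.

```python
DISC_BYTES = 4_700_000_000   # assumed single-layer capacity, no overhead counted

def pcm_bit_rate(channels, bits, sample_rate_hz):
    """Raw, uncompressed PCM data rate in bits per second."""
    return channels * bits * sample_rate_hz

def playing_time_minutes(capacity_bytes, bit_rate):
    """How long a disc of the given capacity lasts at the given data rate."""
    return capacity_bytes * 8 / bit_rate / 60

cd = pcm_bit_rate(2, 16, 44_100)            # 1,411,200 bit/s
surround = pcm_bit_rate(6, 24, 96_000)      # 13,824,000 bit/s
stereo_192 = pcm_bit_rate(2, 24, 192_000)   # 9,216,000 bit/s

print(f"Six-channel 24/96 is {surround / cd:.1f} times the CD data rate")
print(f"Uncompressed, it fills the disc in "
      f"{playing_time_minutes(DISC_BYTES, surround):.0f} minutes")
```

On these assumptions, uncompressed six-channel 24/96 audio needs nearly ten times the data rate of a CD and fills the disc in about three quarters of an hour.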

Now compression, rightfully, has a bad name among audiophiles. When we think of compression, we think of lossy compression, aka perceptual coding, where theoretically inaudible information is removed to save space. Dolby AC-3 does this on DVD-Video. So does DTS. So does MiniDisc (although it’s got better over the years). And of course, so do MP3 and its competitors to an enormous extent, which is why Internet audio sounds so horrible. To avoid the quality compromises of lossy compression, the record industry specified that DVD-Audio must use lossless compression, i.e., a compression system that does not lose any data and recovers the same bits on playback as were there in the original master. Not only that, MLP is so efficient that the data rate on a DVD-Audio disc is surprisingly low: only about 1.5 times that of a CD!
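What "lossless" means in practice is easy to demonstrate. The sketch below uses Python's general-purpose zlib compressor as a stand-in; it is emphatically not MLP, which is designed specifically for audio and works quite differently. The point is only the round trip: whatever goes in comes back out bit for bit.

```python
import math
import struct
import zlib

def make_pcm(n_samples=4_410, freq_hz=440, sample_rate=44_100):
    """Build a short block of 16-bit little-endian PCM holding a sine tone."""
    samples = [round(20_000 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
               for n in range(n_samples)]
    return struct.pack(f"<{n_samples}h", *samples)

original = make_pcm()
packed = zlib.compress(original, 9)   # stand-in lossless coder, not MLP
restored = zlib.decompress(packed)

print(len(original), "bytes in,", len(packed), "bytes packed")
print("bit-for-bit identical:", restored == original)
```

A lossy coder could never pass that final comparison, since by design it throws information away.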

The DVD Consortium held a competition in Japan to choose the compression technique, and the winner was Meridian Lossless Packing (MLP) which is now the mandatory compression scheme on DVD-Audio. If you have the room, however, you can place a Dolby AC-3 or DTS 5.1 data stream on the disc. This creates a "DVD-Universal" disc that can be played on special DVD-Audio players and on DVD-Video players too, although at a lower level of quality.

Meanwhile, there is another technology that seeks to challenge DVD-Audio to become the next generation audio distribution medium. This technology comes from Sony and Philips, the developers of the Compact Disc, and it’s called Super Audio CD. The technology uses similar discs to those used by DVD, but the audio is recorded with a completely different type of digital recording technique than the familiar PCM. Instead, Direct Stream Digital (DSD) is used. This approach uses one-bit conversion and samples at a massive 2.8 MHz. Many experienced audio engineers and producers feel that DSD sounds best of all, although there are some practical limitations to what the technology can do today.
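To put that sampling rate in context, here is a quick sum. It assumes, as is usually stated for DSD, that the rate is 64 times the CD sample rate; that multiple is not given above, so treat it as an assumption, and the names as illustrative.

```python
CD_SAMPLE_RATE = 44_100
DSD_MULTIPLE = 64    # assumption: DSD runs at 64 times the CD sample rate

def dsd_sample_rate(multiple=DSD_MULTIPLE, base_hz=CD_SAMPLE_RATE):
    """One-bit samples per second, per channel."""
    return multiple * base_hz

def bits_per_second_per_channel(bits, sample_rate_hz):
    """Raw data rate of a single channel."""
    return bits * sample_rate_hz

dsd = bits_per_second_per_channel(1, dsd_sample_rate())
cd = bits_per_second_per_channel(16, CD_SAMPLE_RATE)
pcm_24_96 = bits_per_second_per_channel(24, 96_000)

print(f"DSD:   {dsd:,} bit/s per channel")
print(f"CD:    {cd:,} bit/s per channel")
print(f"24/96: {pcm_24_96:,} bit/s per channel")
```

On that assumption a DSD channel carries four times the raw data of a CD channel, and somewhat more than a 24-bit, 96 kHz PCM channel.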

Unfortunately, the existence of two high-quality audio media introduces the possibility of a format war, like that between VHS and Betamax. This would hurt the industry and reduce sales. The answer is for the DVD Consortium to mandate that all DVD-Audio players should play Super Audio CDs, and for Sony and Philips to mandate that all SACD players should also play DVD-Audio discs. If this were the case, consumers could buy any player knowing it would not become obsolete, and producers could use whichever of the two techniques best suits the material. If only the market were so sensible.
