XiVero GmbH, Düsseldorf, Germany, 22.04.2015
High Resolution or not High Resolution: That is the question!
What is and why should I use High Resolution Audio?
The introduction of the Compact Disc in 1981 brought a digital standard of 16Bit resolution and a sampling frequency of 44.1kHz.
That system provides a theoretical signal to noise ratio of about 96dB and a frequency response of up to 20kHz. Although the Nyquist frequency is at 22.05kHz, analog reconstruction requires a steep low pass filter that has to fall from 0dB to -96dB within the narrow frequency range of 2.05kHz.
The problem of constructing analog filters with such a steep frequency response has been solved by introducing digital filters. Those digital filters still need to be steep, but technologies like FIR filters achieve better results in the digital domain than is possible in the analog domain.
The oversampled output of those digital filters can then be handled easily by analog filters with a gentle slope and minimal impact on the music signal itself.
Why do we talk about digital filters if we want to explain High Resolution Audio?
Well, it is one of the advantages of HD-Audio that the necessary low pass filters can work far away from the music signal reducing any negative impact.
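To put rough numbers on this, the following sketch compares the room the reconstruction filter has at different sampling rates. It assumes that the audio band ends at 20kHz and that the filter has to reach 96dB of attenuation at the Nyquist frequency. The helper name and the idea of expressing the result as an average slope in dB per octave are illustrative assumptions, not part of any standard.

```python
import math

def transition_band(fs_hz, attenuation_db, passband_hz=20_000.0):
    """Return (width in Hz, width in octaves, average slope in dB/octave)
    for a low pass that is flat up to passband_hz and has to reach
    attenuation_db at the Nyquist frequency fs_hz / 2."""
    nyquist_hz = fs_hz / 2.0
    width_hz = nyquist_hz - passband_hz
    octaves = math.log2(nyquist_hz / passband_hz)
    return width_hz, octaves, attenuation_db / octaves

for fs in (44_100, 48_000, 96_000, 192_000):
    width, octaves, slope = transition_band(fs, 96.0)
    print(f"{fs / 1000:6.1f} kHz: {width / 1000:6.2f} kHz wide, "
          f"{octaves:5.2f} octaves, about {slope:4.0f} dB/octave")
```

Under these assumptions the filter at 44.1kHz needs an average slope of roughly 680dB per octave, while at 96kHz roughly 76dB per octave is enough.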
So, what are the criteria for real HD-Audio?
The industry agrees that a resolution of at least 24Bit is a precondition for HD-Audio.
In theory, 24Bit would provide a Signal to Noise Ratio (SNR) of 144dB in contrast to the 96dB of 16Bit, but even the best electronics are limited to around 120dB, the equivalent of around 20Bit.
There is a factor of 1 million between the lowest and highest signal amplitude for a 120dB dynamic range.
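The figures above follow from the rule of thumb that every bit doubles the amplitude range and therefore adds about 6dB. A minimal sketch of that arithmetic (the function names are made up for illustration, and the formula ignores dither and the small correction term used for sine wave SNR):

```python
import math

def dynamic_range_db(bits):
    """Theoretical amplitude range of linear PCM: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2.0 ** bits)

def equivalent_bits(range_db):
    """How many bits a given dynamic range in dB corresponds to."""
    return range_db / dynamic_range_db(1)

def amplitude_ratio(range_db):
    """Ratio between the highest and lowest amplitude for a range in dB."""
    return 10.0 ** (range_db / 20.0)

for bits in (16, 24):
    print(f"{bits}Bit: {dynamic_range_db(bits):.1f} dB")
print(f"120 dB: {equivalent_bits(120.0):.1f} Bit, "
      f"amplitude ratio {amplitude_ratio(120.0):,.0f}")
```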
What about sampling frequency?
There is quite a debate about the necessary sampling rate. If we just apply the sampling theorem, the sampling rate needs to be more than twice the highest frequency we want to reproduce.
It is generally accepted that human hearing, under the best conditions, can perceive a sine signal at frequencies of up to around 20kHz. Some research further suggests that the harmonics of instruments are perceived even if they lie above 20kHz.
Because our systems aren’t mathematically perfect, we need a sampling frequency that is a bit higher than twice the highest audio frequency to reconstruct the analog signal from the digital domain. That is one of the reasons why the CD standard uses 44.1kHz.
You may ask why on earth they chose such an odd sampling rate. The explanation lies in how digital signals were recorded on magnetic tape in 1981. The output of the Analog to Digital Converter was recorded on video tape. To do so without much modification of the video equipment, the 16Bit samples had to be stored as video signals, with 3 samples per line. The calculation for the PAL standard (588 active lines of 625 total, i.e. 294 per field) is 294 Lines/Field x 50 Fields/Second x 3 Samples/Line = 44,100 Samples/Second.
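The arithmetic can be checked in a few lines. The sketch reads the 294 lines as lines per field (half of the 588 active lines of an interlaced PAL picture) and the 50 per second as the PAL field rate. The constant names are chosen for illustration only.

```python
LINES_PER_FIELD = 294      # half of the 588 active PAL lines
FIELDS_PER_SECOND = 50     # PAL: 25 interlaced pictures = 50 fields per second
SAMPLES_PER_LINE = 3       # 16Bit samples stored on each video line

sample_rate = LINES_PER_FIELD * FIELDS_PER_SECOND * SAMPLES_PER_LINE
print(f"{sample_rate} samples per second")
```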
Ok, nice, but what sample frequency would be enough for HD-Audio?
If we take the two key points discussed above into consideration then we need to have a sampling frequency of at least 2 x 20kHz + Margin and we need to make sure that the low pass reconstruction filters act in a way that they don’t touch the music signal itself.
With the invention of the DVD in 1996 the specification team selected 48kHz as minimum sample frequency.
Listening tests show that the combination of 24Bit / 48kHz can achieve better results than 16Bit / 44.1kHz. It seems that the increased bit resolution adds the most to the perceived improvement.
Why should I go for 96kHz, 192kHz or even 384kHz?
Higher sampling rates increase the SNR, but not as effectively as additional bits. We need a 4x higher sampling rate to improve the SNR by around 6dB, which equals 1Bit of additional resolution.
Comparing a 384kHz with a 48kHz recording, the 8x higher rate would gain around 9dB of SNR, or 1.5Bit. As described above, that makes no sense because we are not even able to make full use of the available 24Bit.
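The rule of 6dB per 4x sampling rate generalizes to 10 x log10(ratio). The sketch below assumes plain quantization noise that is spread evenly up to the Nyquist frequency and no noise shaping. The function names are illustrative.

```python
import math

def oversampling_gain_db(fs_hz, reference_fs_hz):
    """In-band SNR gain from spreading the same quantization noise
    over a wider band (no noise shaping assumed)."""
    return 10.0 * math.log10(fs_hz / reference_fs_hz)

def gain_in_bits(gain_db):
    """Convert an SNR gain in dB into equivalent bits (about 6.02dB each)."""
    return gain_db / (20.0 * math.log10(2.0))

for fs in (96_000, 192_000, 384_000):
    gain = oversampling_gain_db(fs, 48_000)
    print(f"{fs // 1000}kHz vs 48kHz: {gain:.1f} dB, {gain_in_bits(gain):.1f} Bit")
```

Under this assumption the 4x step from 48kHz to 192kHz yields about 6dB and the 8x step to 384kHz about 9dB.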
If the SNR is not an argument for choosing a higher sampling rate then the reproducible frequency range should be, right?
Real world digital recordings are able to show musical content of up to around 48kHz.
MusicScope Screenshot showing a good HD-Recording (24Bit / 96kHz)
With Nyquist in mind, we need a sampling rate of at least 96kHz to reproduce all frequencies within the original signal. Furthermore, if we take the research on the perception of instrument harmonics into consideration, the higher sampling rate could be justified.
Choosing even higher sampling rates has the advantage of pushing the slope of the low pass reconstruction filter further away from the musical content, reducing its impact to a minimum.
We can now say that HD-Audio starts with at least 24Bit / 48kHz and improves at 24Bit / 96kHz.
Unfortunately, real world implementations of electronics are quite complex and any DAC has a sweet spot where it works best. It is safe to say that most of the available Digital to Analog Converters work very well at 24Bit / 96kHz, but we cannot say how they perform at higher sampling rates, where build tolerances have a larger impact on the sonic quality.
What about the different formats PCM versus DSD?
DSD (Direct Stream Digital) started as an archive format for analog master tapes with a 1Bit resolution at a sampling rate of 64 x 44.1kHz = 2.8224MHz, hence the name DSD64.
DSD was chosen as an archive format because it can easily be converted into any PCM format by applying a simple decimation process.
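To illustrate the idea, the following toy example turns a sine into a 1Bit stream and back into 44.1kHz PCM. It is only a sketch under strong simplifications: the modulator is a first-order delta-sigma loop and the decimation filter is a plain block average, whereas real DSD encoders and converters use higher-order noise shaping and much better filters. All names are hypothetical.

```python
import math

DSD_RATE_HZ = 64 * 44_100      # DSD64 sampling rate
DECIMATION = 64                # 2.8224MHz / 64 = 44.1kHz

def modulate_1bit(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> list of 0/1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback            # accumulate the error so far
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(1 if feedback > 0.0 else 0)
    return bits

def decimate(bits, factor):
    """Average each block of `factor` bits (mapped to -1/+1) into one PCM value."""
    pcm = []
    for start in range(0, len(bits) - factor + 1, factor):
        block = bits[start:start + factor]
        pcm.append(sum(2 * b - 1 for b in block) / factor)
    return pcm

# 10ms of a 1kHz sine at half of full scale, sampled at the DSD64 rate
n_samples = DECIMATION * 441
signal = [0.5 * math.sin(2 * math.pi * 1_000 * n / DSD_RATE_HZ)
          for n in range(n_samples)]
bits = modulate_1bit(signal)
pcm = decimate(bits, DECIMATION)
print(len(bits), "one-bit samples ->", len(pcm), "PCM samples at 44.1kHz")
```

The density of ones in the bit stream follows the signal level, so averaging over 64 bits recovers a coarse version of the sine. The residual error is the quantization noise that the low pass filter of a real converter has to remove.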
After the patents for the CD expired the companies Philips and Sony wanted to improve the sonic quality by introducing a new format. The SACD (Super Audio Compact Disc), using DSD as audio format, was born.
DSD64 claims to achieve a frequency response of up to 100kHz and a dynamic range of around 120dB. The full frequency range, however, is not really usable: at around 30kHz the quantization noise of the 1Bit digitization rises above the wanted signal. Sony even recommends activating special low pass filters within the SACD players that limit the usable frequency range, to avoid damage to the tweeters caused by the ultrasonic quantization noise.
MusicScope Screenshot – DSD64 Recording with musical content up to 35kHz
Higher DSD sample rates have the advantage of reducing the quantization noise:
- DSD64 (Standard-rate = 2.8224MHz)
- DSD128 (Double-rate = 5.6448MHz)
- DSD256 (Quad-rate = 11.2896MHz)
- DSD512 (Octuple-rate = 22.5792MHz)
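All rates in the list are multiples of the CD sampling rate of 44.1kHz, which a short check makes explicit (the function name is illustrative):

```python
BASE_RATE_HZ = 44_100   # CD sampling rate

def dsd_rate_hz(multiple):
    """Sampling rate of DSD<multiple>, e.g. multiple=64 for DSD64."""
    return multiple * BASE_RATE_HZ

for multiple in (64, 128, 256, 512):
    print(f"DSD{multiple}: {dsd_rate_hz(multiple) / 1e6:.4f} MHz")
```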
MusicScope Screenshot – DSD128 Recording with musical content up to 50kHz. The quantization noise starts to increase at higher frequencies.
MusicScope Screenshot – DSD256 Recording
That recording uses 715MB for 4:13 of recording time and seems to be an original DSD64 track converted to DSD256, which even introduced periodic distortions (vertical broken line).
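The file size is plausible for DSD256. A rough estimate, assuming two channels, no compression, no file header and 1MB = 1,000,000 bytes (none of which is stated for the recording itself):

```python
def dsd_size_megabytes(multiple, seconds, channels=2):
    """Raw size of a DSD stream: one bit per sample and channel."""
    bits = multiple * 44_100 * channels * seconds
    return bits / 8 / 1e6

duration_s = 4 * 60 + 13
print(f"DSD256: {dsd_size_megabytes(256, duration_s):.0f} MB")
print(f"DSD64:  {dsd_size_megabytes(64, duration_s):.0f} MB")
```

The same track stored as DSD64 would need only a quarter of that space.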
Whereas DSD is quite good for archived material, it loses some advantages for modern recordings, where a digital mixing and mastering process is always involved. Because there is no way to master DSD directly, the standard proposes to use DXD within the studio software environment, a 24Bit PCM format with a sample rate of 352.8kHz (8 x 44.1kHz). This has the disadvantage of an additional conversion back to DSD, which always adds more quantization noise.
An increasing number of recording studios invest quite some effort to create great native DSD recordings by avoiding any conversion into PCM and back to DSD.
What are the advantages of HD-Audio besides increased SNR and improved Frequency Response?
Carefully recorded Native Studio Master Recordings, applying at least 24Bit and >= 48kHz, have the potential to preserve a good deal of dynamic range during Mastering. Of course, a pop song still needs some compression, but modern HD-Audio records should at least try to preserve a good dynamic range.
Today’s standard recordings (16Bit/44.1kHz) are plagued by Inter Sample Peaks, caused by faulty limiting during recording or intentionally loud mastering, which lead to audible distortions. The danger of producing such Inter Sample Peaks decreases with an increase in sample frequency.
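A small experiment shows how a peak can hide between two samples. It uses the textbook worst case of a pure sine at a quarter of the sampling rate whose crests fall exactly between two sampling instants. Real music and real limiters are more complex, so this is only a sketch, and the helper names are made up for illustration.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(max(abs(s) for s in samples))

def worst_case_underread_db(tone_hz, fs_hz):
    """How far the sample peak of a pure sine can sit below its true peak
    when two neighbouring samples straddle the crest symmetrically."""
    return -20.0 * math.log10(math.cos(math.pi * tone_hz / fs_hz))

AMPLITUDE = math.sqrt(2.0)   # chosen so that the samples just reach full scale

def waveform(t):
    """Sine at fs/4 with a 45 degree phase offset; t is in sample periods."""
    return AMPLITUDE * math.sin(math.pi / 2.0 * t + math.pi / 4.0)

samples = [waveform(n) for n in range(64)]               # what the file stores
between = [waveform(n / 8.0) for n in range(64 * 8)]     # 8 points per sample

print(f"sample peak: {peak_dbfs(samples):+.2f} dBFS")
print(f"true peak:   {peak_dbfs(between):+.2f} dBFS")
for fs in (44_100, 96_000, 192_000):
    hidden = worst_case_underread_db(11_025, fs)
    print(f"11.025kHz tone at {fs}Hz: up to {hidden:.2f} dB hidden")
```

The stored samples never exceed full scale, yet the reconstructed waveform overshoots by about 3dB. For the same tone the possible overshoot shrinks as the sampling rate rises, because the samples lie closer to the crest.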
We just introduced the AudioRepair tool to repair Inter Sample Peaks within standard digital (16Bit / 44.1kHz) recordings and to enhance the listening experience.
It works on different audio formats to repair Inter Sample Peaks and to make sure that the Frequency Response stays within the standardized limits.
What are the quality differences between HD-Audio files and how can I visualize them?
The MusicScope provides several functions to analyze standard and High Resolution digital recordings.
In the following, we present a couple of MusicScope measurements to point out the differences between HD-Recordings, and we also show some simple and some well done forgeries available in the real world. Furthermore, we present distortions introduced during recording that degrade the achievable quality.
Let’s start with a good HD-Recording where the amplitude of the spectrum decreases smoothly with increasing frequency. This is the natural behavior, which can be different if electronic instruments (synthesizers) are involved.
MusicScope Screenshot showing a good HD-Audio recording (24Bit / 96kHz) with a natural spectrum and spectrogram.
Well, not all available HD-Recordings are really HD, or the kind of HD they promise to be.
The following example is offered as 192kHz but the material is actually 96kHz.
MusicScope Screenshot showing an up-sampled version of a 96kHz track offered as a 192kHz version.
The red line shows the automatic Cut-Off Frequency Detection algorithm in action.
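How such a detection could work in principle is sketched below. This is not the MusicScope algorithm, which is not described here, but a deliberately simple stand-in: it scans an averaged spectrum from the top downwards and reports the highest frequency at which a short run of bins rises above an assumed noise floor. The function name, the threshold of -110dB, the run length and the spectrum itself are invented for illustration.

```python
def estimate_cutoff_hz(freqs_hz, levels_db, floor_db=-110.0, min_run=3):
    """Scan from the top of the spectrum downwards and return the highest
    frequency that starts a run of `min_run` bins above `floor_db`.
    Returns None if no such run exists."""
    run = 0
    for i in range(len(levels_db) - 1, -1, -1):
        run = run + 1 if levels_db[i] > floor_db else 0
        if run >= min_run:
            return freqs_hz[i + min_run - 1]
    return None

# Invented spectrum of a "192kHz" file in 1kHz bins: content that falls
# off gently up to 48kHz, then nothing but a very low noise floor.
freqs = [k * 1_000 for k in range(97)]
levels = [-40.0 - 0.5 * k if k <= 48 else -140.0 for k in range(97)]

cutoff = estimate_cutoff_hz(freqs, levels)
print(f"content ends near {cutoff / 1000:.0f}kHz, "
      f"the file format allows {freqs[-1] / 1000:.0f}kHz")
```

A cut-off at half of the available bandwidth, as in this made-up example, is what an up-sampled 96kHz source would look like.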
It gets even worse! Next we show an example that is an up-sampling from 44.1kHz to 192kHz.
To stay with “Hamlet”: It goes up to 22kHz – The rest is silence!
MusicScope Screenshot showing an up-sampled version of a 44.1kHz track offered as a 192kHz version. The steep edge even indicates a conversion from 16Bit to 24Bit. Measuring the Spectrogram amplitude in “Minimum Mode” shows no value below -100dB, which is a strong indication of 16Bit source material.
Well, and there are even some smart forgers who simply add harmonics to the recording by using tools like “Maximizer” to create artificial frequencies beyond the recorded frequency range.
This makes it a bit more difficult to reveal the fake, but what is worse is the negative impact on the audio quality.
MusicScope Screenshot of a real world fake by adding harmonics to create artificial content above 22kHz.
MusicScope Screenshot of the same fake as above but the spectrogram is in “Min” and “Col” mode to make the forgery more obvious. There are even two distortions visible as vertical lines.
The following measurement examines a DSD recording that exhibits strong distortions compromising the achievable quality.
Periodic distortions (e.g. sine signals) are introduced, most likely by the recording equipment used.
MusicScope Screenshot of a DSD64 recording.
There is a strong sine distortion with an amplitude of nearly -40dB at 25kHz, introduced during the recording process.
What is the conclusion?
We’ve seen a lot of different measurements of high resolution audio recordings available on the market. There are really great Native Studio Masters but also some bad apples.
Generally we can say that HD-Audio can add more sonic quality to the world of music, not just because of the bit resolution and sampling rate but because it mitigates the impact of technical limitations like low pass reconstruction filters necessary to transform the signal back into the analog domain.
Finally, there is a great chance that mastering engineers take the opportunity to invest even more effort into the technical side of a recording to bring the best results to their customers in terms of dynamics and therefore lively music.