Catching Up

Posted on December 9, 2018 by soundkeeperb

With 19 months having passed since the last entry in this blog, yes, it is high time to do some catching up.

One of the most interesting projects I’ve worked on since the last entry in this blog was the newest album by Jason Vitelli, whose Confluence I had the good fortune to produce, record, and release on Soundkeeper Recordings. For his latest, Head Above Tide (extended-res version here), Jason needed a different approach than the one we used for Confluence. Where the latter was recorded live to stereo, for this project he needed the ability to overdub and to record different parts at different times. The project utilized the technique of recording the various parts with a stereo microphone array, similar to what I use for Soundkeeper projects, but with provision for laying each of them down at different times. (I wrote about this technique in Recording in Stereo (Part 2)).

The basic tracks and many of the overdubs were done at Top of the World Studios, which I designed for my good friend Art Halperin. Art and Jason recorded it and the three of us mixed it there. Then I mastered it back at my own studio. Those familiar with mastering know that it involves listening to an album repeatedly. After doing the mixes and mastering this record, I think it notable that when I wanted to relax afterward and listen to some music, I kept going back to this album. Kudos to Jason for creating another original that challenges the listener (as all great music does) and rewards the effort with new joys on each hearing.

***

The first time I mentioned Metric Halo in this blog was back in November of 2013 in the entry called Three Decisions (Part 1). For those who may be new to MH, they are a premier supplier of pro audio hardware and software, with a fiercely loyal following among those who’ve been lucky enough to use their gear. The hardware consists of computer interfaces that serve as microphone preamps, A-D (analog-to-digital) converters, headphone amps, and D-A (digital-to-analog) converters, with more features than I will list here. The software consists of various plug-ins, a sophisticated audio analysis application, and the MIO Console with Record Panel, the latter being built into their hardware units. Granted I have not heard every single competing product out there, but I believe I’ve heard the contenders (many in blind comparison tests). That said, to my ears, the MH gear excels in each of these categories to the point where, in terms of ability to simply get out of the way, I have not heard anything that comes close to matching it, much less besting it.

A while back, Metric Halo announced an upgrade was coming for their hardware and software. They called it 3d – a step up from the 2d boards it was to succeed. Keeping in mind the last sentence in the previous paragraph, I was curious to hear what the new hardware and software would achieve. Earlier this year, the hardware upgrade for my ULN-8 became available. The 3d hardware was in, but the beta software was still to be developed. And the unit wouldn’t run without it.

***

Toward the end of 2017, I spoke with Markus Schwartz about the idea of doing a follow-up to the Equinox project I produced and recorded back in 2010, and which was selected by Stereophile as their Recording of the Month in February of 2011. Thus the seed was planted for the next Soundkeeper Recording. Markus had ideas about the music and direction he wanted to go in, and about the players he would select for this outing. I told him about the upgrade to the recording gear from Metric Halo, and that there was time since I couldn’t record until I had received and tested the new software. More on this project in the next entry in this blog.

By the Spring of 2018, the software component of the 3d upgrade arrived and the listening tests began. Somehow, designer B.J. Buchalter had taken what I’d already felt was the best recording gear I’d ever experienced (particularly when used to make high-resolution, 24-bit, 192k recordings), and raised it up another level. Dynamics, at both micro and macro levels, are more in evidence. Spatial resolution and overall sense of focus have been improved, increasing the realism of the recordings and allowing the gear to get even further out of the way than its previous iteration. Sometimes you have to hear something better to know how something can be better. Congratulations B.J. and Metric Halo.

***

When Soundkeeper first started with downloads, we were breaking up the extended-resolution (24/96) and high-resolution (24/192) versions of our albums into gigabyte-sized files in order to keep download times as short as possible. Somewhere along the way we realized this was not necessary, and that a full album at any of the resolutions we offer could be provided as a single downloadable zip file.

Another development related to downloads is that most customers now seem to prefer these to the files-on-disc formats we offered before we got into downloads. For those who play files on their computers or via a dedicated music server, this makes sense as there are no shipping costs and the music arrives in minutes. With this in mind, the next Soundkeeper Recordings release will be offered as a CD and in six downloadable formats: 16/44, 24/96, and 24/192, as .aif and .wav. There will be no files-on-disc formats and no CD-R version. (We do have some stock of these for our previous releases but they will not be replaced once they’ve sold out.)

Next time, the new album.

The Lowdown on Downloads

Posted on March 16, 2017 by soundkeeperb

(This entry was updated 12/2/18.)

Three years ago I posted the entry in this blog called Listening to Tomorrow. I wrote about the wonders I experienced after loading my music library onto a computer hard drive and using the computer as a “music server.” Since then, the idea of music existing as computer files—as opposed to physical discs one loads into a player—has expanded.

Today, there are a myriad of music server applications for the various computer operating systems. For those who want to take the fidelity beyond the capabilities of their computer’s sound card, there are countless external digital-to-analog converters (DACs) to choose from. There are also numerous online sources for downloading music. Some still offer the data-reduced formats such as mp3. Others now tout “full CD quality”—in some quarters, an oxymoron. And some offer extended-resolution and high-resolution files. (For more information on the different formats, see the blog entry cited above.)

My music server has become the way I listen, whether via Wi-Fi feeding smaller systems in the house, or via direct connection to the music library drive when listening on the main system. Yet for several reasons, as a consumer I have been hesitant to purchase downloads. Early experiences with more than one provider were disturbing in that what was often sold as “high resolution” turned out to be upsampled Redbook—in other words, plain old CD sound, in a high res “package”—sold at a high res price. Whatever the reason (or reasons), this was so rampant I feared the fledgling market might never get off the ground.

I was also not enamored of the .flac format in which the vendors delivered their downloads. While called a “lossless” way to reduce file size, making for convenient, faster download times, the results were not so lossless according to everyone participating in the comparison tests we ran in my studio. (Based on what I see on the Internet and in many printed audio journals, it seems many listeners are not bothered by flac. In our tests however, the results were unanimous—everyone heard a difference between the source .aif masters and the .flac files created from them.)

In time I was glad to see some vendors offer what appeared to be the raw PCM formats I prefer, such as .aif and .wav. These are the formats used to make the recordings. However, it turned out that at least with some of the vendors, what was being delivered to the customer was still a .flac file. The “download manager” software the vendors provided for use on the customer’s computer expanded the file back to .aif or .wav. For my own purchases, I avoided the downloads and stayed with CDs or with the high resolution files-on-DVD versions that some of the vendors sold. When the discs arrived, I’d extract the files—this is called “ripping” a disc—and add them to the server myself. For all the files on my server I chose the uncompressed .aif format—the same format I use to make and master recordings.

As the owner of the Soundkeeper Recordings label, I stayed away from offering downloads for several reasons, even though many folks have requested them over the years. The prime reason is that I seek to deliver our recordings to our customers with nothing less than the very best sonics, and from my perspective the download schemes I’ve seen involve compromises.

A full album at high resolution (24-bit/192 kHz sampling rate) can be larger than four Gigabytes in size. Where others reduce file size—and by that means shorten download times—by utilizing so-called “lossless” compression formats (such as .flac or .alac), to my ears these result in subtle alterations of the sound, hence I don’t consider them lossless. Trading fidelity for convenience is not what Soundkeeper wants to offer our customers.
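As a rough check on that figure, here is a minimal sketch of the arithmetic for uncompressed stereo PCM. The 60-minute running time and the function name are assumptions chosen for illustration, not figures from any particular Soundkeeper release, and file headers and metadata are ignored.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of raw (uncompressed) PCM audio, ignoring headers and metadata."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

ALBUM_SECONDS = 60 * 60  # assumption: a 60-minute album

for label, rate, bits in [("16/44", 44_100, 16),
                          ("24/96", 96_000, 24),
                          ("24/192", 192_000, 24)]:
    size = pcm_size_bytes(rate, bits, channels=2, seconds=ALBUM_SECONDS)
    print(f"{label}: {size / 1e9:.2f} GB")
```

Under those assumptions, the 24/192 version comes to a little over four Gigabytes.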

Another common approach taken with downloads is to break albums up into “singles”. Our artists go to considerable effort to create whole albums, so whole albums are the only way we want to deliver their work to our customers.

It took a while for the answer to come but I believe there is another way. Soundkeeper Recordings will soon offer downloads without any of the compromises cited above. How to deliver full albums at up to 24/192 resolution? Fans of the so-called “lossless” formats compare them to zipping a word processor file. Yes, the zipped words come back intact, even though I can’t say I find the same to be true of flac’d music.

So what about zipped music? We’ve used zipped music files before, such as those on the Format Comparison page of the Soundkeeper Recordings website. And when unzipped, no one who participated in our tests could differentiate between the source file and the copy that had been zipped.

What about file size? Converting an .aif or .wav file to a .zip file does not reduce the size to any significant degree. It does make for simple downloads though, without exacting a sonic price. When the files have been downloaded, the user unzips the file and simply drags the tracks into the server application of their choice (iTunes, Amarra, etc.).
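For anyone who wants to confirm on their own machine that a zipped file comes back bit-for-bit identical, a minimal sketch using Python's standard zipfile and hashlib modules follows. The payload is random stand-in data rather than real audio, and the file name is hypothetical. The sketch checks byte identity only; it says nothing about listening tests.

```python
import hashlib
import io
import random
import zipfile

def zip_round_trip(name, payload):
    """Zip the payload in memory, unzip it again, and return the restored bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    with zipfile.ZipFile(buffer) as zf:
        return zf.read(name)

# Stand-in data (an assumption): pseudo-random bytes in place of a real .aif track.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(200_000))

restored = zip_round_trip("track01.aif", payload)  # hypothetical file name

# Matching SHA-256 digests mean the unzipped copy is identical to the source.
same = hashlib.sha256(payload).digest() == hashlib.sha256(restored).digest()
print("identical after unzip:", same)
```

The same comparison can be made on a real download by hashing the source file and the unzipped copy.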

One of the reasons I prefer .aif format for my music files is that the files can contain metadata (artist, album title, track title, composer, album cover art, etc.). This metadata becomes part of the file. The .wav format does not support metadata, so when the user adds this information in their music application, it resides in the application and not in the file. If the file is moved out of the application, the metadata is lost. In contrast, move an .aif file and the metadata travels with it. The .aif file downloads from Soundkeeper Recordings will have the full metadata in them when they arrive on the customer’s machine.

Within the next week, we’ll begin offering downloads in six formats: 16/44, 24/96, and 24/192, .aif or .wav. For those who prefer disc formats, we plan to continue offering CD versions. The downloads will replace our files-on-disc formats and are just a long overdue addition that will please a different set of Soundkeeper listeners.

Toward a definition of high resolution audio

Posted on June 15, 2014 by soundkeeperb

We are starting to see the idea of high resolution audio gain some traction beyond the audiophile world, where it has been enjoyed for the past several years. Some of the major labels, perhaps seeing new opportunities for commerce, have formed a working group to define exactly what high resolution audio is.

I think we’re going to see a wide variety of perspectives on this one. The labels are using variations of the term “Master Quality”, with different designations, depending on whether the original source is an analog tape, a CD master, or some other digital format. My take is this can be quite vague, particularly in view of the fact that there is such enormous variation from recording to recording, even within one of the above source formats. In some ways, the sonic differences between recordings can far exceed the sonic differences between formats.

Another definition, which seems to come up a lot in the hobbyist fora, is “anything better than CD quality”, meaning anything where the digital audio is encoded with a word length longer than CD’s 16-bits and a sample rate higher than CD’s 44.1k. (Word length and sample rate are discussed in the previous entry, Is “too much” not enough?)

Sometimes I think terms like “Master Quality” or “CD quality” are oxymorons, like “the sound of silence”, “jumbo shrimp”, “living dead”, or “civil war”.

Personally, I would differentiate between “high resolution” and “not as low resolution”. (How’s that for a selling point? “This new album is not as low resolution as the previous one!” ;-} )

As I hear it, going from 16/44 to 24/44 is an improvement, as is going to 16/48 or 24/48, but I wouldn’t refer to any of these as “high resolution” for the simple reason that to my ears, they are not. 24/44 does not do as much damage to low level information as 16/44 but in my view, it still suffers from an inadequate sampling rate, as does 24/48. The anti-aliasing filter (also discussed in the previous entry) is still way too close to the top of the audible range and its consequences reach down well into the audible range. (Yes, I know some claim otherwise. I’m still waiting to hear the audible evidence to support such claims. So far, it speaks otherwise.)

As we get to the 2x sample rates (i.e., 88.2k or 96k), there is much less damage and perhaps I’d refer to these as “intermediate resolution”. I say this because of what I perceive as the critical threshold that is crossed when 4x rates (i.e., 176.4k or 192k) are properly done. While “properly done” still seems to describe the minority of devices carrying these numbers in their spec sheets, those that do achieve it do something I’ve never heard from any other format, including the best analog—and that is what I have been referring to as “getting out of the way”. This alone makes the 4x rates, to my ears, a bigger jump upward in quality over the 2x rates than the latter are over standard CD. And this alone differentiates them in my mind as being true high resolution.

While the intermediate resolution rates can sound very good, this is exactly what I believe prevents me from thinking of them as high resolution: they sound. I don’t want gear or recordings or formats that sound “good”, “detailed”, “smooth”, etc., I want them to not sound. I want them to get out of the way, leaving the sound to that which is being recorded, played and listened to—the performance.

I’m reminded of how the video world defined intermediate resolutions—those better than standard but not really high—with the term “extended”.

Personally, I’d place anything at 1x rates (or with a 16-bit word length) in the SRA (“Standard Resolution Audio”) category. This would include 16/44, 16/48, 24/44 and 24/48. With the latter two, my experience has been that while the added word length helps, the limitations of having the low-pass filtering so close to the audible range—and thus, its effects within the audible range—mean that in the end, these are all effectively just minor variations of “CD resolution”. (I would ultimately consider 16/96 or 16/192 SRA also. There is, in my view, no good reason to record with less than 24-bits and if the release is going to be at one of these sample rates, I would deem word length reduction to 16-bits counterproductive and just plain silly.)

The above is at odds with what appears to be the more common “anything better than CD” definition of high resolution. To me, that is like saying anything better than a Big Mac is filet mignon. Or anything better than Night Train is Dom Perignon. I don’t think so. I think there are intermediate levels and that it takes more than being better than mediocre (or just plain bad) to qualify as “fine”.

Any 2x recording (88.2 kHz or 96 kHz), again at 24-bits, I would refer to as ERA (“Extended Resolution Audio”). Now we have a real improvement in fidelity to the input. It doesn’t quite get out of the way, but to my ears, it is noticeably better than SRA.

HRA (“High Resolution Audio”) I would reserve for 4x recordings (176.4 kHz or 192 kHz), again at 24-bits. Properly done and played back on gear that can actually perform at these rates, we have the first format in my experience that is truly capable of getting out of the way. This is what high resolution audio is about.

By these definitions, I would consider Soundkeeper Recordings’ CDs as well as our CD-Rs to be SRA. The latter certainly sounds closer to the master than the pressing does but ultimately, they’re both 16-bits. I’d call our 24/96 DVDs and 24/96 files-on-disc releases ERA and our 24/192 files-on-disc releases HRA.

Of course, as I’ve long said, my belief is that 90-95% or more of a recording’s ultimate sonic quality has already been determined by the time the signals are leaving the microphones. The delivery format just determines how much of that original quality is available for playback. I’d rather hear a CD (or even an mp3) of a Keith Johnson recording than a 24/192 (or the original masters) of recordings from a lot of other engineers. But best of all, is the HRA version of Keith’s work.

Is “too much” not enough?

Posted on May 22, 2014 by soundkeeperb

As digital audio and the means of playing it back mature, there is an increasing divergence of perspectives to be found on the Internet. Some revel in the sonics of music heard at high resolution, while others argue that the CD standard is not to be audibly improved upon and still others want even higher resolution. All this while Joe and Jane Average download one song at a time at resolutions that throw away at least 75% of the information contained on a CD.
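To see where a figure like that comes from, here is a minimal sketch comparing the CD data rate with a few lossy download bitrates. The 128, 256 and 320 kbps values are assumptions chosen as typical examples, not figures for any particular download service.

```python
# CD data rate: 44,100 samples per second x 16 bits x 2 channels.
CD_BITS_PER_SECOND = 44_100 * 16 * 2  # 1,411,200 bits per second

def fraction_discarded(lossy_bits_per_second):
    """Share of the CD data rate that a lossy stream does not carry."""
    return 1 - lossy_bits_per_second / CD_BITS_PER_SECOND

# Assumed example bitrates for lossy downloads.
for kbps in (128, 256, 320):
    share = fraction_discarded(kbps * 1000)
    print(f"{kbps} kbps keeps {1 - share:.0%} of the CD data rate, discards {share:.0%}")
```

Even the highest of these assumed bitrates carries less than a quarter of the CD data rate.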

There are new efforts from some quarters to show Joe and Jane what they’re missing and to elevate what the download services offer. The idea is to, at the very least, deliver 100% of what the CD offers and at best, deliver true high resolution. Yet these efforts have spawned Internet “papers” and articles that, in effect, ridicule the very idea of high resolution and argue the supposed inaudibility of its benefits or, worse, suggest that high resolution by definition will sound worse, not better.

I can’t speak for what others find but I can say that whatever these folks are reporting is quite the opposite of what I experience. I’m hearing fidelity such as I’ve dreamed about for years and when I read those stories, they strike me much as though the authors are trying to convince me there are no colors in a rainbow.

The arrival of high resolution digital has the potential to fulfill the promise digital audio first made more than a quarter century ago. Back then, astute listeners wondered at the marketing mantra “perfect sound forever” while cringing at the dry, bleached and airless sounds delivered by the first CD players. While a great deal of progress has been made during the intervening years, the inherent limitations of the format remain.

Looked at in the most rudimentary fashion, the specifications for CD would, on the surface, appear to be all that is needed to perfectly reproduce anything that can be heard. Human hearing is nominally sensitive to frequencies from 20 Hz through 20 kHz (i.e., 20 cycles per second through 20,000 cycles per second). As we age, the top end limit decreases and most adults would be lucky to hear 15 kHz. With CD, music is sampled 44,100 times per second. That is, the digital recorder “looks at” the sound 44,100 times every second and captures a sample. According to the theory, all frequencies below half the sample rate (in this case, all frequencies below 22,050 cycles per second) will be captured accurately, and since this is well beyond what most folks can hear, it all sounds quite neat.

These digital samples are each a series of digital bits, with each bit representing one of two binary states or values, often thought of as “ones and zeros”. Each sample is stored in a digital word. The CD standard uses 16-bit words, where each sample contains 16 bits. The particular combination of ones and zeros represents the level (i.e., volume) of each sample. A series of 16 zeros (i.e., 0000000000000000) would be the lowest level that can be encoded and represents complete silence. A 16-bit word representing an intermediate level might look like this: 0111011110101110. The highest level would be 0111111111111111, a zero followed by 15 ones. (For technical reasons which are beyond the scope of this entry, the loudest value is not a series of 16 ones.)

A word length of 16-bits allows up to 65,536 different levels to be represented. The difference between the loudest sound that can be captured and the noise floor of the format is called the signal-to-noise ratio. Signal-to-noise ratio is measured in units of loudness called decibels (dB). For a 16-bit format like CD, the signal-to-noise ratio is approximately 96 dB, which means the noise floor (the inherent noise of the format) is 96 decibels below the loudest sound that can be captured. This is much quieter than vinyl or analog tape. Any hiss heard on a CD is captured from the source and is not inherent in the medium. Many folks confuse the signal-to-noise ratio specification with dynamic range (the difference in level between the loudest possible sound and the lowest sound). We’ll come back to this later and see why this is misguided.
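The two figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the simple ratio of the largest value to one step (about 6.02 dB per bit); the function names are illustrative.

```python
import math

def level_count(bits):
    """Number of distinct levels a word of this length can represent."""
    return 2 ** bits

def approx_snr_db(bits):
    """Ratio of full scale to one step, in decibels (about 6.02 dB per bit)."""
    return 20 * math.log10(level_count(bits))

for bits in (16, 24):
    print(f"{bits}-bit: {level_count(bits):,} levels, about {approx_snr_db(bits):.1f} dB")
```

For 16 bits this gives 65,536 levels and roughly 96 dB, matching the numbers quoted for CD.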

The problems start when we move from the theoretical to the practical. (Someone, perhaps it was Yogi Berra, once said “In theory, there is no difference between theory and practice, but in practice, there is.”) When digital audio is recorded, any frequencies above half the sample rate can cause problems – they engender aliases or aliasing distortion, false frequencies that are not part of the program material. In order to avoid aliasing, when digital audio is encoded, as well as when it is played back, most digital processors use a filter to ensure that no frequencies above half the sample rate can pass. These anti-aliasing filters have audible side effects, manifesting in the time domain – the signal gets smeared in time. Some designers will use gentler filters to minimize the time smear but in doing so, they cause the higher frequencies to fall off prematurely. A number of modern playback devices have user-selectable filters where the listener can select between steep filtering and its associated time issues or gentler filtering and its associated frequency issues.
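To put a number on aliasing, here is a minimal sketch of the standard frequency-folding arithmetic: a tone above half the sample rate, if not filtered out, shows up as a false tone below it. The function name and the example tones are assumptions for illustration.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a sampled tone appears, folded into 0..sample_rate/2."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

# An unfiltered 30 kHz tone sampled at the CD rate lands inside the audible range.
print(alias_frequency(30_000, 44_100))   # 14100

# The same tone at 192k is below half the sample rate, so it is captured as itself.
print(alias_frequency(30_000, 192_000))  # 30000
```

This is why the filter has to sit below half the sample rate, and why its position matters so much at 44.1k.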

So, while CD can capture all the audible frequency range, the requisite filtering means the frequencies delivered to the listener are not all arriving on time or are not all arriving in the same proportion in which they were captured, or some combination of both of these. One great advantage of the higher sample rates is that the anti-aliasing filter is moved far above the audible range. This allows gentler filtering to be used without affecting the audible frequency range.
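A quick sketch of how much room the filter has to work in at each rate, taking 20 kHz as the nominal top of the audible range (as described earlier in this entry). The function names are illustrative.

```python
import math

AUDIBLE_TOP_HZ = 20_000  # nominal upper limit of hearing, per the text above

def filter_room_hz(sample_rate_hz):
    """Gap between the top of the audible range and half the sample rate."""
    return sample_rate_hz / 2 - AUDIBLE_TOP_HZ

def filter_room_octaves(sample_rate_hz):
    """The same gap expressed in octaves above 20 kHz."""
    return math.log2((sample_rate_hz / 2) / AUDIBLE_TOP_HZ)

for rate in (44_100, 48_000, 88_200, 96_000, 176_400, 192_000):
    print(f"{rate / 1000:g}k: {filter_room_hz(rate):,.0f} Hz "
          f"({filter_room_octaves(rate):.2f} octaves) above 20 kHz")
```

At 44.1k the filter has only 2,050 Hz in which to do its work; at 192k it has 76,000 Hz.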

In recent years, thanks in no small part to formats like DVD and others, which are capable of storing more information than will fit on a CD, digital audio has grown up from the 16-bit words and 44.1 kHz sample rates by which sound is encoded for CD. We’ve had 24-bit audio with sample rates of 96 kHz, 176.4 kHz and 192 kHz. For reference, a 24/96 (24-bit, 96 kHz) version of a given recording contains more than three times the information contained in the same recording at 16/44 (16-bit, 44.1 kHz). A 24/192 version contains more than six times the information. And where a word length of 16-bits allows up to 65,536 different levels to be represented, going to 24-bits increases the dynamic resolution 256 times, allowing up to 16,777,216 different levels to be represented.
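The ratios quoted above follow directly from word length multiplied by sample rate. A minimal sketch, with illustrative function names:

```python
def data_rate(bits, sample_rate_hz):
    """Bits per second for one channel of PCM audio."""
    return bits * sample_rate_hz

CD_RATE = data_rate(16, 44_100)

for label, bits, rate in [("24/96", 24, 96_000), ("24/192", 24, 192_000)]:
    ratio = data_rate(bits, rate) / CD_RATE
    print(f"{label} carries {ratio:.2f} times the data of 16/44")

# Going from 16 to 24 bits adds 8 bits, multiplying the number of levels by 2**8.
print(2 ** 24 // 2 ** 16)  # 256
```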

The widespread use of computers (and computing devices) for audio playback has enabled the proliferation of high resolution audio and emancipated music from the confines of silver discs and the limitations imposed by the process of retrieving music from these in real time. (Separating the processing “overhead” from the playback will provide higher quality playback.) Good as the best disc players and transports can be, my experience has been that there is invariably a loss of focus and fine detail, often subtle, sometimes not so subtle. It is only via proper computer playback that I’ve heard results that I find indistinguishable from listening to the master used to create those silver discs.

This is good news, even for music at CD resolution, because the listener at home can now hear what is effectively the CD master itself. However, while the limitations of playback from molded disc have been removed, the limitations of the format remain. In addition to the frequency and time-related issues brought about by having the anti-aliasing filter just above the audible range, there are the consequences of inadequate word length. Although the noise floor with a 16-bit medium like CD is 96 dB below the loudest possible sound that can be captured by the format, many often confuse this signal-to-noise ratio with dynamic range. The assumption is that if the noise floor is 96 dB below the loudest sound, sounds just above the noise floor will be captured with the same fidelity, providing a range of dynamics as wide as the signal-to-noise ratio. In fact, with a 16-bit medium, the fidelity plummets at lower levels.

The full resolution, in this case 16-bits, is only realized for sounds near the top of the volume range. Each bit captures about 6 dB of the dynamic range (about 6.02 dB to be more precise but let’s use 6 in this example to keep things simple), so in a 16-bit system, sounds lower in level than 6 dB below the maximum will effectively be captured at less than 16-bit resolution. To wit, if this lower level information is say, 12 dB lower in level, it will be encoded at what is effectively approximately 2 bits less than the full resolution of the format (i.e., 14 bits in a 16-bit recording, 22 bits in a 24-bit recording). If it is say, 36 dB lower in level, it will be encoded at what is effectively approximately 6 bits less resolution (i.e., 10 bits in a 16-bit recording, 18 bits in a 24-bit recording).
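The rule of thumb in that paragraph (roughly one bit for every 6 dB below the maximum) can be written out as a short sketch. It simply encodes the approximation as stated here; the function name is illustrative.

```python
DB_PER_BIT = 6.02  # each bit covers about 6.02 dB

def effective_bits(word_length, db_below_max):
    """Approximate resolution left for a sound this far below the maximum level."""
    return word_length - db_below_max / DB_PER_BIT

for db in (12, 36):
    for word_length in (16, 24):
        bits = effective_bits(word_length, db)
        print(f"{db} dB down in a {word_length}-bit recording: about {bits:.0f} bits")
```

By this reckoning, a sound 36 dB down in a 24-bit recording still gets about 18 bits, more than the full 16 available to the loudest sounds on a CD.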

Some information, such as the trailing end of reverb as it fades away, or the higher harmonics of musical instruments, can be well more than that 36 dB lower in level than the loudest sounds and will be encoded with resolutions corresponding to fewer bits. This results in the thinned, bleached and coarsened instrumental harmonics in even the best 16-bit recordings, as compared to a good 24-bit recording (or of course, the original sound in real life). It also results in the defocusing of the spatial information and in the relative airlessness in the 16-bit recording compared to a good 24-bit recording (and real life).

While the level meter may show a peak on that 16-bit recording that is within the top 6 dB, this, like the waveform views shown by some computer software, is only a view of the “top” part of the musical waveform — the loudest part. Sounds and components of sounds that are underneath the top part (i.e., in the background) are not captured as faithfully. Accordingly, when considering the dynamic range of the format, it is a good idea to take into account the relative distortion at different levels within the range. If increasing distortion is not desirable, the real dynamic range potential is going to be considerably less than what the spec sheet might suggest (or is often echoed in the audio press and in some places on the Internet). Note that even with low level information as in the examples above, a 24-bit recording still delivers more resolution than a 16-bit recording at its best.

Why then, would someone publish a “white paper” against higher resolution or declare that resolutions like 24/192 are “pointless” or worse? A few possible reasons come to mind:

The higher sample rates place significantly increased demands on the gear used to record and play them back. For example, digital gear contains an internal clock to control the timing as the device encodes or decodes the stream of digital samples. Spacing between the samples must be kept accurate or the reconstructed analog waveform that we hear will not have the correct shape and hence, will not provide the correct sound. Irregularities in timing are referred to as jitter. Higher sample rates also mean the analog stages of the gear must be able to perform at the wider bandwidths. Perhaps the folks complaining about high resolution are using gear that does not have clocking that is up to the task and analog stages that can perform at high bandwidth. Such gear will either not reveal any benefits or will actually sound worse at the higher rates than it does at the easier, lower rates like 24/96. (This is true of a number of “professional” units as well as those sold to audio enthusiasts. A built-in, $250 “soundcard” simply won’t do it, regardless of what the specs claim. In today’s market, it may cost 10 times this amount for a device truly capable of revealing the potential of these sample rates. Maybe it is no wonder these folks hear little or no difference.)
It could be possible that the rest of the system these folks are using isn’t up to resolving a wide band recording. Or it could be that these folks are just not sensitive to these particular differences. I’ve always found that different folks have different sensitivities to different aspects of sound.
Perhaps they believe CDs (or 24/96) already sound identical to the input signal. If that is the case, I can understand that anything more would seem wasteful.

Sample rates like 176.4k and 192k don’t, as some have erroneously suggested, “have more jitter”. Sample rates don’t have jitter. As stated above, higher sampling rates do place greater demands on clocking accuracy (just one reason why buying a DAC (digital-to-analog converter) “by the chip” is at best a foolish enterprise). They also place greater demands on the analog stages surrounding the digital stage.

Why some would see these characteristics as “flaws” (and write papers or articles on the subject), I don’t understand. I’ve always gone with empirical evidence over theoretical analysis; that is, when “theory” and direct experience are at odds with each other, I’ll tend to seek a new theory. (As I see it, theory should explain the experience, not the other way around.)

All this to say, when a firmware upgrade enabled 192k capability in the converters I use for my work, I approached it conservatively — even continuing to do a few recording sessions at 96k because I was familiar with it and could be confident in the results. But then I started running tests at 192k and quite quickly found I had to get my jaw up off the floor: for the very first time in my experience, I was hearing (with this device anyway) a recording device “disappear”. I had never heard that before, even with the best analog recorders and most certainly nothing close with the best digital recorders, even with this very device when used at 96k.

Now I felt a threshold had been crossed (I’ve read similar words since then from one of my favorite audio engineers, Keith Johnson). The results no longer sounded like “great digital”; they no longer sounded “digital” at all. They didn’t sound like “great analog” either. The jump from 24/96 to 24/192, when done well, is to my ears a much more significant jump than the one from 16/44 to 24/96. It’s all about that threshold; this is the promise digital made in 1983, finally and for real. (While it certainly sounds more faithful to the input signal than 16/44 does, 24/96 doesn’t yet, to my ears, “get out of the way”. Having the anti-aliasing filter moved well up and away from the audible range definitely helps but it is the rates like 176.4k and 192k where I find the threshold is crossed. Interestingly, to me, while many speak of the treble response, I find some of the greatest benefits to be in how much more lifelike I find the bass.) 24/192 pointless?!? Only if real progress in music reproduction is pointless.

While some decry the higher sample rates, either declaring them to offer no audible sonic differences from the lower ones or to offer inferior sonics (!) to the lower ones, other voices are talking even higher numbers. We are seeing marketers talking 32-bits and sample rates like 384 kHz. As long as there are customers who are taken in by sheer numbers, there will be those who see an opportunity for commerce, who will accommodate them with sheer numbers.

I get a chuckle out of these things because my experience has been that in reality (that is, with the hardware or software one can go out and buy today, as well as the recordings one can purchase to play on these), I see gear that isn’t particularly clean at 24-bits. I see other gear which, when presented with 4x sample rates (176.4k or 192k), performs worse than it does at 2x rates (88.2k or 96k). Yet the spec sheets and ads say “24-bit” (or more) and they say “192k” (or more). And the reviewers simply echo the numbers.

In the here and now, if it is a minority that can achieve the performance potential of 24/192, I take claims of higher numbers as a joke at best and cynical marketing at worst. Just my opinion of course but with so few showing they can design for 4x rates, why would anyone think those same few could deliver 8x rates (or more)? I find it interesting that those claims are not coming from the designers of gear that can achieve the potential of 4x rates. (We have the equivalent of makers of 2-cylinder subcompacts claiming to make cars that can outrun a Lotus!)

The numbers game isn’t limited to hardware. I’ve seen one company release CDs they claim were made with “32-bit mastering” and another claim “100 kHz resolution”. (Do they have 100k gear or are they rounding up from 96k?) Does anyone think those CDs are anything other than 16-bit, 44.1k? If there are such folks, I have a fine bridge in New York City to sell. The tools I use to create a CD master have 80-bit data paths and I’m working at 192k. The higher quality tools do result in a higher quality CD but should I then say they are made with “80-bit mastering”? Or that they exhibit “192 kHz [or 200 kHz] resolution”? I’d rather make records than sell bridges.

The finest 24/192 I’ve heard to date has given me back recordings I have not yet been able to discern from the direct input from my microphones. (To be clear, I am referring to gear that actually seems to achieve the potential of these numbers and not merely gear that sports them on a spec sheet.) Would 32/384 sound better? I suppose I’d have to hear the flaw(s) in properly done 24/192 first. And second, if the first condition was met (and in my experience, it has not even been challenged yet), that 32/384 gear would have to actually achieve the potential of that resolution and not merely claim it on a spec sheet. For me, right now, it is just marketing. Someday perhaps, we’ll have the audible evidence. Perhaps. Right now, I’m trying to imagine how it might be better than what is (so far) indistinguishable from the input signal.

24/192? 32/384? 64/768? Or should I wait for the 128/1536 version?
Is the best of today’s 24/192 too much? Is it not enough? I think it is just right.

Listening to Tomorrow

Posted on February 23, 2014 by soundkeeperb

I have heard the future and it has already been here for quite a while. For years, when I wanted to listen to recorded music, I’d walk over to the shelves housing my music library and select a disc to play. Whether a vinyl disc or a polycarbonate CD, it was always a physical disc, which needed to be retrieved from the shelf, removed from its cover and placed on (or into) the player. Note the operative word there is was. I’m talking about the integration of the home computer and the music system—in other words, what is also called a “music server.”

Music servers can take many forms. The music stored on them can also take many forms, as can the user interface. At its simplest, think of song files loaded into a program like iTunes. Some companies are now making dedicated hardware servers, essentially hiding the computing guts within a somewhat traditional looking component. The only giveaway might be the little display screen on the front of the box. Otherwise, it could be confused with any electronic component with knobs and/or buttons on the front.

The music files played by the servers come in a wide variety of formats. First there were the early “lossy” formats, such as mp3. In order to shorten download time and minimize the amount of storage space required, these formats shrink file size by throwing away information. Some lossy files contain less than 1/10th the information on a CD (!) while others contain about 1/5th. These formats can be convenient for very fast file transfer via the Internet, such as when sampling tracks from an album. However, one must pause at claims from some quarters that removing 80-90% or more of the information on a CD results in something that sounds the same as that CD.
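For readers who like to check the arithmetic, here is a rough sketch of where those fractions come from. The CD figures (16-bit, 44.1 kHz, two channels) are standard; the lossy bit rates of 128, 256 and 320 kbps are my assumption of commonly encountered settings and are not the only ones in use.

```python
# Compare assumed lossy bit rates with the fixed data rate of CD audio.
CD_SAMPLE_RATE = 44_100   # samples per second, per channel
CD_BITS = 16              # bits per sample
CD_CHANNELS = 2           # stereo

cd_bits_per_second = CD_SAMPLE_RATE * CD_BITS * CD_CHANNELS

# Assumed lossy bit rates in kbps (illustrative, not taken from the text).
LOSSY_KBPS = (128, 256, 320)


def fraction_of_cd(kbps):
    """Return a lossy bit rate as a fraction of the CD data rate."""
    return (kbps * 1000) / cd_bits_per_second


for kbps in LOSSY_KBPS:
    print(f"{kbps} kbps is about {fraction_of_cd(kbps):.1%} of the CD data rate")
```

Under these assumptions, a 128 kbps file carries a bit less than one tenth of the CD's data, and a 256 kbps file a bit less than one fifth.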

More recently, a number of so-called “lossless” formats, such as .flac and .alac, have become popular. These, too, seek to shrink file size, though in theory they are able to reconstruct the missing data upon playback, rather than simply throwing it away. (There is some debate as to whether the results actually sound like the non-reduced original. To my ears, they do not.) Lastly, there are the raw, non-reduced PCM formats, such as .aif and .wav and also the variations on an even more recent format, DSD. All of these raw formats require more storage capacity, as the file sizes are not reduced. However, with the price of storage coming down every day, this is simply not an issue for many listeners, who prefer the performance of the raw formats – the same formats in which the recordings are originally created, mixed and mastered.

So let’s say you’ve selected a computer (and server application) or hardware player. The next step is to build your music library. Music files can be purchased as downloads (or files-on-disc) via the Internet or “ripped” from your own music collection. “Ripping” is the term used to describe the extraction of music files from a disc, such as a CD. Some of the music server programs, iTunes for example, have ripping functions built-in. A number of dedicated ripping applications are available for free on the Internet. Digital discs are not the only source music lovers are using to get files into their server libraries. Many vinyl enthusiasts will create “needledrops” by digitally recording their favorite vinyl records.

With all the files installed in your server, what next? What does having a music server mean? In my experience, the shortest answer is that having a music server will change your relationship to your music collection and it will change the way you listen, both of these in very positive ways. First, let’s say you want to hear a particular album or a particular track from a particular album. The music server isn’t just a player, it is a database too. Type in an artist’s name or the name of a musical track, hit enter and you can be listening to music in a lot less time than it would take you to walk to the music shelf, much less find the disc, put it in the player and press Play. But there is so much more. Typing in an artist’s name will show you everything in your library by that artist. Typing in a track name will show you every version of that piece of music in your library. There are countless variations, all dependent on what is called “metadata”.

Metadata is the information that accompanies each music file. Many server applications will automatically fill in a lot of the metadata for each track. For example, when you rip a CD from your collection, the server application might check the Internet and fill in all the artist, song and genre information, perhaps also adding composer, year of release, or other information, including the cover art. For those who want more details, many programs allow user entry of metadata as well. (I like to include who played what instrument for the jazz albums in my collection.)

With the right metadata, the music lover can search their library in almost any way they can imagine. Sometimes I like searching by composer, particularly among jazz standards but also among the other types of music in the collection. Or, I might just enter a genre, say, if I’m in a mood to hear some rhythm & blues. The point is, with this sort of capability, other ways of approaching the music library come to mind all the time. And sometimes, I’ll just set the server to “Shuffle” and let it pick the next tune. This has resulted in some very pleasant surprises and the rediscovery of old gems in the collection.
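To make the idea concrete, here is a minimal sketch of how a metadata search might work. It does not represent how any particular server application is built; the `LIBRARY` entries and the `search` helper are hypothetical placeholders invented for illustration.

```python
# Hypothetical library entries; every value here is a placeholder.
LIBRARY = [
    {"title": "Track One", "artist": "Artist A",
     "composer": "Composer X", "genre": "Jazz"},
    {"title": "Track One", "artist": "Artist B",
     "composer": "Composer X", "genre": "Jazz"},
    {"title": "Track Two", "artist": "Artist A",
     "composer": "Composer Y", "genre": "Rhythm & Blues"},
]


def search(library, **criteria):
    """Return tracks whose metadata contains every requested value.

    Matching is case-insensitive and accepts partial text, so
    genre="rhythm" finds "Rhythm & Blues".
    """
    def matches(track):
        return all(
            str(wanted).lower() in str(track.get(field, "")).lower()
            for field, wanted in criteria.items()
        )
    return [track for track in library if matches(track)]


# Every version of one piece, regardless of who performs it.
for track in search(LIBRARY, title="track one"):
    print(track["artist"], "-", track["title"])
```

The same helper handles the other searches described above: `search(LIBRARY, composer="Composer X")` or `search(LIBRARY, genre="rhythm")`. The richer the metadata, the more ways there are to slice the collection.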

So, what is the sonic price of all this convenience? Just how much does one have to give up compared to what they can get from a good CD player? Well, assuming the music library is comprised of at least CD resolution files and not just the lossy files still popular among the larger download services, this is where it gets really good. Or rather, where it can. At its simplest, a server can be used with the speakers built into the computer or with small, powered desktop speakers. (Lossy files can be fine in this application because the playback hardware is limited anyway.) However, if instead of taking the sound from the computer speakers, the signal is taken from the server in digital form, it can be fed to an outboard DAC (digital-to-analog converter). The DAC in turn, can be connected to a full size audio system.

What I’ve found is that when CDs are ripped to a raw PCM format (I tend to favor .aif, which is the same format in which I record, mix and master) and both the CD in the player and the ripped file in the computer are fed to the same external DAC, there is not only no tradeoff but, in fact, playback from the computer beats the CD every time.

Ever since I mastered my first CD back in 1983 and compared what came back from the replication plants with the masters used to make those CDs, I’ve found that CDs from different plants (sometimes different lines within the same plant) all sound different from each other and none sounds indistinguishable from the master used to create it. This is true regardless of the CD player or transport used, regardless of price or design. To my ears, comparing playback from disc with playback of the master used to create said disc, there are always losses of focus and fine detail, sometimes subtle, other times not so subtle at all.

Interestingly, when those same CDs are ripped to computer as raw PCM files and then compared with the masters, all the differences go away. In other words, with playback of these files via a good server, for the first time in my experience, the user can have the sound of the CD master at home. So, the convenience of a music server not only does not exact a sonic price, the results actually sound better than playback from a disc player or transport. (It might not beat good vinyl playback in some ways but that is a subject for another day. And besides, what I’ve outlined above is only the beginning. Read on.)

But wait, it gets even better. Music via a server is not limited to CD resolution, which in fact, might be seen as the “cassette” of the digital world. (I will leave to the reader what this says about the lesser digital formats.) Music servers make possible what has previously only been available via some of the post-CD music formats, such as DVD-Audio or SACD. Here again though, the differences between playback from disc and playback via a server come down in strong favor of the latter. The quality ceiling at this point, is determined by the resolution capability of the DAC.

The resolution of a digital file is determined by two factors. The first is its “sample rate” (how many times per second the original sound is converted into a digital sample during recording — think of frames per second with motion pictures, each frame being a “sample” of the video motion). The second factor determining the resolution of a digital file is its “word length” (how many digital “bits” are used to describe each sample). CDs carry 16-bit, 44.1kHz audio (sometimes abbreviated as “16/44”), which means each digital sample is described with 16 bits and the original sound is sampled 44.1 thousand times per second.
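The two factors multiply together to give the raw data rate of a PCM file. The sketch below works that out for a few resolutions; it assumes two-channel (stereo) audio and ignores the small overhead of file headers, and the function names are mine.

```python
def pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels=2):
    """Raw PCM data rate: samples per second x bits per sample x channels."""
    return sample_rate_hz * bits_per_sample * channels


def megabytes_per_minute(sample_rate_hz, bits_per_sample, channels=2):
    """Approximate storage for one minute of raw PCM, in megabytes (10^6 bytes)."""
    bits = pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels) * 60
    return bits / 8 / 1_000_000


for label, rate, bits in (("16/44", 44_100, 16),
                          ("24/96", 96_000, 24),
                          ("24/192", 192_000, 24)):
    print(f"{label}: {pcm_bits_per_second(rate, bits):,} bits/s, "
          f"{megabytes_per_minute(rate, bits):.2f} MB per minute")
```

On these assumptions, a minute of CD-resolution stereo needs roughly 10.6 MB, while a minute at 24/192 needs roughly 69 MB.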

Some music is now available in 24-bit formats (as opposed to CD’s 16-bits), with sample rates of 88.2 kHz, 96 kHz, 176.4 kHz and 192 kHz (as opposed to CD’s 44.1 kHz). While 24/96 recordings can certainly sound much more realistic than 16/44 CDs, it is with the highest rates (I’m thinking 24/176 and 24/192 recordings) that a threshold is crossed and with the finest DACs, I have experienced something I’ve sought for as long as I’ve been an engineer but never attained until now. That is, when at recording sessions, I have not yet been able to discern the recorded sound from my direct microphone feed. This is something I’ve never experienced with any of the finest analog formats or with any other digital format.

Given 24/192 files on a music server, in raw PCM format, the listener now has, for the first time, what is essentially the recorded master itself. Note that the finest modern recordings are made at resolutions considerably higher than CD. In order to create a CD master, the sample rate must be converted (reduced) to 44.1k and the word length must be reduced from 24-bits to 16-bits. A good rip from CD can bring home the sound of the CD master but the 24/192 files will contain several times the amount of information that a 16/44 CD is capable of holding. It is the difference between the CD master and the high resolution master itself.
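As a quick check on "several times", the sketch below compares raw data per second, per channel, against CD. It treats "information" simply as the count of raw bits, which is my simplifying assumption; it says nothing about how the result sounds.

```python
CD_RATE_HZ = 44_100
CD_BITS = 16


def raw_data_ratio(sample_rate_hz, bits_per_sample):
    """Ratio of raw bits per second (per channel) relative to CD audio."""
    return (sample_rate_hz * bits_per_sample) / (CD_RATE_HZ * CD_BITS)


print(f"24/96  carries about {raw_data_ratio(96_000, 24):.2f}x the data of 16/44")
print(f"24/192 carries about {raw_data_ratio(192_000, 24):.2f}x the data of 16/44")
```

By this measure, 24/192 holds roughly six and a half times the raw data of a CD, and 24/96 a little over three times.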

I started this entry by saying “I have heard the future” and that is what I said out loud when I first experienced my music via a server: “This is the future!” Many young listeners have never known disc playback. For them it has always been downloaded files played on the computer via a program like iTunes or played via their iPods. As more mature music and audio enthusiasts got into the idea, a number of designers catering to this market started to offer higher fidelity options and the market for “high res” downloads (and files-on-disc) is really just getting started. There are even services that will take an existing disc collection and rip it to computer drive for a per-disc fee. Several years ago, I remember hearing the term “convergence” used a lot to suggest an expanded role for computers (and computing machines) in our lives. Music servers are a great example, fundamentally altering how music lovers interact with and enjoy their music collections.


