Catching Up

With 19 months having passed since the last entry in this blog, yes, it is high time to do some catching up.

One of the most interesting projects I’ve worked on in that time was the newest album by Jason Vitelli, whose Confluence I had the good fortune to produce, record, and release on Soundkeeper Recordings. For his latest, Head Above Tide (extended-res version here), Jason needed a different approach than the one we used for Confluence. Where the latter was recorded live to stereo, for this project he needed the ability to overdub and to record different parts at different times. The project utilized the technique of recording the various parts with a stereo microphone array, similar to what I use for Soundkeeper projects, but with provision for laying each of them down at different times. (I wrote about this technique in Recording in Stereo (Part 2).)

The basic tracks and many of the overdubs were done at Top of the World Studios, which I designed for my good friend Art Halperin. Art and Jason recorded it and the three of us mixed it there. Then I mastered it back at my own studio. Those familiar with mastering know that it involves listening to an album repeatedly. After doing the mixes and mastering this record, I think it notable that when I wanted to relax afterward and listen to some music, I kept going back to this album. Kudos to Jason for creating another original that challenges the listener (as all great music does) and rewards the effort with new joys on each hearing.

***

The first time I mentioned Metric Halo in this blog was back in November of 2013 in the entry called Three Decisions (Part 1). For those who may be new to MH, they are a premier supplier of pro audio hardware and software, with a fiercely loyal following among those who’ve been lucky enough to use their gear. The hardware consists of computer interfaces that serve as microphone preamps, A-D (analog-to-digital) converters, headphone amps, and D-A (digital-to-analog) converters, with more features than I will list here. The software consists of various plug-ins, a sophisticated audio analysis application, and the MIO Console with Record Panel, the latter being built into their hardware units. Granted, I have not heard every single competing product out there, but I believe I’ve heard the contenders (many in blind comparison tests). That said, to my ears, the MH gear excels in each of these categories to the point where, in terms of ability to simply get out of the way, I have not heard anything that comes close to matching it, much less besting it.

A while back, Metric Halo announced an upgrade was coming for their hardware and software. They called it 3d – a step up from the 2d boards it was to succeed. Keeping in mind the last sentence in the previous paragraph, I was curious to hear what the new hardware and software would achieve. Earlier this year, the hardware upgrade for my ULN-8 became available. The 3d hardware was in, but the beta software was still to be developed. And the unit wouldn’t run without it.

***

Toward the end of 2017, I spoke with Markus Schwartz about the idea of doing a follow-up to the Equinox project I produced and recorded back in 2010, and which was selected by Stereophile as their Recording of the Month in February of 2011. Thus the seed was planted for the next Soundkeeper Recording. Markus had ideas about the music and direction he wanted to go in, and about the players he would select for this outing. I told him about the upgrade to the recording gear from Metric Halo, and that there was time since I couldn’t record until I had received and tested the new software. More on this project in the next entry in this blog.

By the Spring of 2018, the software component of the 3d upgrade arrived and the listening tests began. Somehow, designer B.J. Buchalter had taken what I’d already felt was the best recording gear I’d ever experienced (particularly when used to make high-resolution, 24-bit, 192k recordings), and raised it up another level. Dynamics, at both micro and macro levels, are more in evidence. Spatial resolution and overall sense of focus have been improved, increasing the realism of the recordings and allowing the gear to get even further out of the way than its previous iteration. Sometimes you have to hear something better to know how something can be better. Congratulations B.J. and Metric Halo.

***

When Soundkeeper first started with downloads, we were breaking up the extended-resolution (24/96) and high-resolution (24/192) versions of our albums into gigabyte-sized files in order to keep download times as short as possible. Somewhere along the way we realized this was not necessary, and that a full album at any of the resolutions we offer could be provided as a single downloadable zip file.
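To put rough numbers on those download sizes, here is a minimal sketch of the arithmetic for uncompressed stereo PCM, which is what .aif and .wav files carry (headers are ignored). The 60-minute album length and the function name are assumptions for illustration only, not figures from any Soundkeeper release.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=2):
    """Size of uncompressed PCM audio in bytes (file headers ignored)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

ALBUM_SECONDS = 60 * 60  # assumed album length: 60 minutes

for label, rate, bits in [("16/44", 44_100, 16),
                          ("24/96", 96_000, 24),
                          ("24/192", 192_000, 24)]:
    size_gb = pcm_size_bytes(rate, bits, ALBUM_SECONDS) / 1e9
    print(f"{label}: {size_gb:.2f} GB")
```

Under these assumptions, an hour of 24/192 comes to a little over 4 GB before any zip compression, which gives a sense of why gigabyte-sized pieces once seemed prudent.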

Another development related to downloads is that most customers now seem to prefer these to the files-on-disc formats we offered before we got into downloads. For those who play files on their computers or via a dedicated music server, this makes sense as there are no shipping costs and the music arrives in minutes. With this in mind, the next Soundkeeper Recordings release will be offered as a CD and in six downloadable formats: 16/44, 24/96, and 24/192, as .aif and .wav. There will be no files-on-disc formats and no CD-R version. (We do have some stock of these for our previous releases but they will not be replaced once they’ve sold out.)

Next time, the new album.

Everything Still Matters

Soon after the previous entry in this blog, Everything Matters, was posted, I heard from a friend who recently purchased a 24-bit, 192 kHz, high-resolution download of a classic album.  Like many of us, he sought an even better “view” of the recording than is offered by the CD version he already owns.  To his surprise, he prefers listening to the CD version, and finds that the high-res download sounds “a bit bright.”

The authors of some recent tech website articles denigrating high resolution might see my friend’s comments as vindication.  In my view, this says more about the authors than it does about the audible reality.  Why these websites didn’t choose authors more experienced with systems for music playback, and more interested in sound quality, remains a mystery.  (Vide John Atkinson’s very well considered Access Journalism vs Accountability Journalism.)

In order to determine whether high resolution is the source of the problem (any problem), it must be compared with its standard resolution equivalent.  This means for a valid comparison of delivery formats the only difference must be the delivery format.  Both versions must be created at the same mastering session, by the same engineer, using the same channel (signal path).  There, as the man once said, is the rub.  In most cases the two items being compared were created at different mastering sessions, often by different engineers, in completely different mastering studios.  Right away any sort of equivalence is out the window.

Different mastering engineers have different ears, different sensibilities, different approaches, different talents, and different weaknesses.  Even the same engineer might take a diverging tack when remastering something they’ve mastered in the past.  When the two versions are done by different engineers the likelihood of variance in their methods is pretty much a sure thing.  This is to be expected since they don’t share a common set of ears, and no two engineers I know of will do things the same way.  With regard to new masterings, in Everything Matters I said, “Sometimes the results are improved sonics, with newly revealed nuances from the original recording that were lost in the original mastering.  Other times, and sadly all too often, the remastering is simply a louder, brighter rendering.”

There is also a very good to excellent chance the signal path for the two versions differed.  Even in the same studio, things tend to change and evolve over time.  For an album like the one my friend purchased, which was originally recorded on analog tape, the A-D converter used in mastering can have a profound effect on the results.  This is particularly true at higher resolutions, where I have found many converters are stretched beyond their capabilities.  To wit, a lot of converters specified for 24/192 actually perform worse at this rate than they do at lower rates.  This I attribute to the significantly increased demands on clocking accuracy and on analog stage performance at the wider bandwidths.  It would seem to be easy to put 24/192 on a spec sheet but not so easy to design a device that can perform to the potential of the format.  And the converter is just one of several components comprising the signal path, each of which will have its own sonic consequences.

All of the above assumes the same source tape was used for the different versions.  This is a big assumption, even when “original” is claimed.  I’ve experienced a number of instances where, having handled the tapes myself, I knew the subsequent claims from some quarters of “original” were at best mistaken. Whether original or not, if different source tapes were used, the outcome could be acutely altered.

The bottom line here should not be surprising: A carefully made CD (or CD resolution file) will easily outperform a not-so-carefully made 24/192 file.  This has to do with how effectively the capabilities of each delivery format are realized—or not realized, as this case illustrates.

I concluded Everything Matters by saying “Everything after the microphones (i.e., mic cables, AC mains power, AC mains cables, mic preamps, recording format, recording device, mix, if any, mastering, playback format, playback device, interconnecting cables, amplification, speaker cables, speakers, speaker positioning, vibration isolation, room acoustics, etc., etc.) merely determines how much of what was captured the listener gets to hear.”  In my experience, when everything in the production of an album is the same except for the delivery format, a 24/192 file should reveal so much more of the source as to make the 16/44 (CD) version sound coarse, ill-defined, airless, and broken by comparison.  So either my friend’s 24/192 file was created from an inferior source, or the mastering was just not up to that achieved for the CD.

To my ears, properly done digital audio at 24/192 fulfills the promise digital made back in 1982 when the sonically hamstrung CD format made its first appearance.  I have said elsewhere that 24/192 is the first format I’ve ever heard where I have not yet been able to distinguish the output from the input—the first format I know of that is capable of giving us a virtually perfect rendition of the source.  In view of this, I must admit to being somewhat astonished at the negativity from some quarters of the tech web and tech press.  Nevertheless, if music lovers are to receive the benefits of this wonderful fruit of technological progress, the folks creating it must tend their crop more carefully.

Pressing Matters

It is my sincere hope that this blog provides entries of interest to music lovers, musicians, and audio enthusiasts, as well as folks who make records.  A few previous entries, such as Can you hear what you’re doing? (Part 1) and Can you hear what you’re doing? (Part 2), were aimed specifically at those setting up studios in order to make records, among whom there are a great many musicians.  Of course, it has been my hope that others would find these of interest as well.  So it is with the current entry.  While it is intended primarily for those who make records, if I’m lucky, those who purchase and listen to records will also find something of value herein.

With audio mastering completed for the new Work of Art album entitled Winds of Change (first mentioned in the August 22, 2014 entry of this blog, also called Winds of Change), and with the album artwork approved, it was time to contact the CD replicator in order to get the “pressed” versions manufactured.  Actually, CDs are not pressed like vinyl records.  They are made using an injection molding process, but the term pressing seems to have endured in common use.

Those familiar with my label, Soundkeeper Recordings, know that we release each album in several different formats.  In addition to the regular CD, we offer six custom burned formats, including CD-R and five formats with higher resolution than a CD can provide:
–   Music-only DVD-R with 24-bit, 96 kHz audio, playable in most regular DVD players
–   24-bit, 96 kHz .aif files-on-disc
–   24-bit, 96 kHz .wav files-on-disc
–   24-bit, 192 kHz .aif files-on-disc
–   24-bit, 192 kHz .wav files-on-disc

For more about the different resolutions, see the May 22, 2014 entry in this blog, Is “too much” not enough?

As for standard, 16-bit, 44.1 kHz CD resolution, the reason Soundkeeper Recordings offers our releases in CD-R format, and the true subject of this entry, is something I’ve said since I heard the finished product for the very first CD I mastered, back in January of 1983: CDs made at different plants all sound different from each other and none sounds indistinguishable from the master used to make it.  This may sound strange in view of the marketing that has accompanied the CD format from the beginning, primarily in the form of the slogan “Perfect Sound Forever” and the widely accepted idea that nothing can change once the signal is in a digital format.

Imagine my surprise then, when I first started mastering CDs and found that the same digital U-Matic tape (the format used at the time to send CD masters to replication facilities) sounded different depending on which side of the Sony DAE-1100 editor I used to play it.  The DAE-1100 was commonly used at the time to assemble CD masters.  The unit controlled tape machines for the ¾” tape cartridges that comprised the U-Matic format.  One or two machines could be used on the Playback side and another machine was used on the Record side.  The CD master was assembled on a U-Matic tape in the machine connected to the Record side of the editor.

Early on in my experience with this system, I wanted to compare a tape that was copied from another, just to hear for myself that a digital copy was indistinguishable from the original, as we’d all been told.  Unfortunately, the test never got that far.  What I found was that the original tape, played from the Playback side of the editor, sounded better than the copy.  Something was getting lost on the copy, as it seemed coarser and less well defined than the original.  I don’t recall what made me try it but I decided to swap the tapes, listening to the copy from the machine attached to the Playback side of the editor and the original from the machine attached to the Record side.  To my surprise, now the copy sounded better (i.e., more like the analog source tape I was using) than the original digital conversion.  When heard from the Record side of the editor, the original digital tape now sounded coarser and less well defined than the copy!  Clearly, there was something else going on.

Perhaps it was this experience that diminished the surprise when the finished CDs for that first CD mastering project came in and I compared them with the CD master used to make those discs.  Here the coarseness was even greater than what I’d encountered on the different sides of the DAE-1100 editor.  The finished CDs almost sounded “out of focus” compared with the CD master, such was the extent of the loss of clarity and fine detail.

Things got more interesting when I created CD masters for albums where large sales were expected.  In those days, there were fewer CD plants than there are today and they were all working at capacity.  In order to accommodate expected demand for the big sellers, the CD master would be cloned and those clones were sent to multiple replicators in order to get back sufficient numbers of finished discs to meet the demand.  This was an education in that I discovered that CDs from different plants all sounded different from each other.  Sometimes CDs from different lines within the same plant sounded different from each other.

So much for “Perfect Sound Forever”.  The format has been claimed to deliver perfect copies of the master.  Logic would demand that if this is the case, all those perfect copies would sound indistinguishable from each other and they’d all sound indistinguishable from the masters from which they were made.  But they weren’t then and they still aren’t today.  (There is an exception that I’ll get to shortly.)

Having sent CD masters to plants all over the world and all over the USA, I’ve had the opportunity to compare a lot of finished product to the masters from which said product was made.  Happily, the days of U-Matic tapes are long gone and the advent (long ago) of computer workstations made for many improvements.  Still, even with the most sophisticated CD mastering tools, the reality from the replication facilities remains—the finished discs don’t sound like the masters.

In my experience, a slow-burned CD-R made directly from the computer-based CD master sounds more like that master than any pressed CD, even the best I have heard.  This is why Soundkeeper Recordings offers our releases in CD-R format as well as replicated CDs.  But how then, to select a CD replicator?  If they all produce discs that sound different from the CD master, how does one find the most faithfully made discs?  This is the question that was on my mind when I started the label.  Knowing that a lot of folks just prefer a factory-made disc to a burned version, even if the latter is more faithful to the master, I needed to find a replicator for Soundkeeper CDs.  My whole reason for starting this label was to avoid the compromises I feel are too often part of the record making process.  I wanted a no-compromise replicator, if such a thing existed.

I reached out to contacts at most of the plants I’d sent masters to over the years.  I told them about my concept for the label and that I needed the most faithful to the master, highest quality discs.  All but one of them told me essentially the same thing.  They said their CDs were perfect replicas of the CD master.  Since my own experience consistently told me something quite different, I could only conclude they were not hearing it the same way I was.  Or they just weren’t listening and were simply repeating the received mantra.  I thanked each in turn and moved on to the next person on my list.

Out of all the replication facilities, only one person at one facility told me, with no prompting from me whatsoever, “Oh the finished CDs will never sound like the CD master.”  I wanted to hear more but knew by then that I’d found my CD replicator.  Here at last, was someone who appeared to actually be listening.  It turned out, this replicator took an unusual approach to making their finished CDs too.  Where many plants increased their throughput – and hence, their income – by speeding up the process, this plant kept with the slower methods.

The first step in manufacturing a CD involves cutting what is called the glass master.  The CD master from the mastering facility is fed into a Laser Beam Recorder (LBR), where a laser is used to create the pits in a photoresist coating on a glass disc.  This disc is used in the subsequent steps of CD manufacturing.  Most plants cut the glass master at high speed.  Some will cut the glass master in real time, at additional cost.  Many folks have found real-time glass cutting to result in finished discs that sound closer to the original CD master.  The person at this plant told me they cut all their glass in real time, at no additional cost.  It is just how they do it.

In addition, most CD replicators have moved to shorter injection molding cycles.  The faster the cycle, the more finished discs that can be produced in a given day.  Typical injection molding cycles for CDs are now about 4 seconds long.  The person at this plant told me they use a slower cycle, closer to 9 seconds long.  This makes for better formed pits on the finished discs, making it easier for the laser in the CD player to read the discs and minimizing the incidence of playback errors.
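The throughput cost of the slower cycle is easy to estimate. This is only a back-of-the-envelope sketch: it assumes one disc per cycle on a single molding line running around the clock, none of which comes from the plant itself.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def discs_per_day(cycle_seconds):
    """Discs one molding line could turn out in 24 hours of nonstop running."""
    return SECONDS_PER_DAY // cycle_seconds

print(discs_per_day(4))  # the typical cycle mentioned above
print(discs_per_day(9))  # the slower cycle mentioned above
```

On those assumptions the 9 second cycle yields well under half the discs of the 4 second cycle, which is the income a plant gives up by staying with the slower method.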

Whether the real-time glass cutting and slower injection molding cycle are the reasons or whether some other factors might be involved, I don’t know.  What I do know is that when I master an album, I listen to it so many times that I become intimately familiar with all the details of its sound.  Often, when I hear the finished CD that comes back from the replicator, it takes only a few seconds to hear the typical loss of focus and fine detail.  Something like a chord strummed on an acoustic guitar becomes a loose mélange rather than the six discrete, individual string sounds heard on the CD master.  With CDs from this replicator, the sound is so much closer to the CD master, I need to synchronize playback of the finished disc with the CD master in order to discern the remaining differences.  (Still not as close to the master as the CD-R but closer by far than I’ve heard from other CD plants.)

Now earlier on in this entry, I mentioned an exception.  In fact, I wrote about this in the February 23, 2014 entry in this blog, entitled Listening to Tomorrow.  Basically, what I’ve found is that what I’ve written about in the current entry comes into play when the CD is played in a CD player or via a CD transport.  This has been my experience regardless of the player or transport, or its price.  However, when the CD is properly extracted to a computer, the audible differences do go away.  To date, after 31 years of the CD format, it is only via computer that I’ve heard the audio from a CD sound indistinguishable from the master used to create that CD.  Still, those listening to computer music servers with CD or better resolution (as opposed to mp3 or other reduced formats) are in the minority.
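For those who want to check the data side of an extraction, as opposed to the listening side, one possible approach is to compare the audio samples of the extracted file against those of the master. The sketch below assumes both are available as .wav files and uses only Python's standard library. The function names are my own, and it ignores practical wrinkles such as drive read offsets or extra silence at the start or end of a track.

```python
import hashlib
import wave

def pcm_digest(path):
    """Return the audio format and a SHA-256 digest of the samples in a WAV file."""
    with wave.open(path, "rb") as wav:
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        digest = hashlib.sha256()
        while True:
            chunk = wav.readframes(65536)  # read in pieces to keep memory use low
            if not chunk:
                break
            digest.update(chunk)
    return fmt, digest.hexdigest()

def same_audio(path_a, path_b):
    """True when both files hold identical samples in the same format."""
    return pcm_digest(path_a) == pcm_digest(path_b)
```

Only the samples are hashed, not the whole file, so differences in headers or tags do not produce a false mismatch.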

Most of the music lovers I know of who purchase CDs will listen to them in CD players or via a separate CD transport feeding an external digital-to-analog converter (DAC).  In order to provide these folks with a CD that truly represents the CD master approved by the artist and producer, selection of the replicator is critical.  To this end, I feel very lucky to have found Bryan Kelley and the folks at GrooveHouse Records, whom I have been recommending to mastering clients since my first conversation with Bryan, and who, as far as I’m concerned, are the official CD replicators for Soundkeeper Recordings.

Toward a definition of high resolution audio

We are starting to see the idea of high resolution audio gain some traction beyond the audiophile world, where it has been enjoyed for the past several years.  Some of the major labels, perhaps seeing new opportunities for commerce, have formed a working group to define exactly what high resolution audio is.

I think we’re going to see a wide variety of perspectives on this one.  The labels are using variations of the term “Master Quality”, with different designations, depending on whether the original source is an analog tape, a CD master, or some other digital format.  My take is this can be quite vague, particularly in view of the fact that there is such enormous variation from recording to recording, even within one of the above source formats.  In some ways, the sonic differences between recordings can far exceed the sonic differences between formats.

Another definition, which seems to come up a lot in the hobbyist fora, is “anything better than CD quality”, meaning anything where the digital audio is encoded with a word length longer than CD’s 16 bits and a sample rate higher than CD’s 44.1k.  (Word length and sample rate are discussed in the previous entry, Is “too much” not enough?)

Sometimes I think terms like “Master Quality” or “CD quality” are oxymorons, like “the sound of silence”, “jumbo shrimp”, “living dead”, or “civil war”.

Personally, I would differentiate between “high resolution” and “not as low resolution”.  (How’s that for a selling point?  “This new album is not as low resolution as the previous one!” ;-} )

As I hear it, going from 16/44 to 24/44 is an improvement, as is going to 16/48 or 24/48, but I wouldn’t refer to any of these as “high resolution” for the simple reason that to my ears, they are not.  24/44 does not do as much damage to low level information as 16/44 but in my view, it still suffers from an inadequate sampling rate, as does 24/48.  The anti-aliasing filter (also discussed in the previous entry) is still way too close to the top of the audible range and its consequences reach down well into the audible range.  (Yes, I know some claim otherwise.  I’m still waiting to hear the audible evidence to support such claims.  So far, it speaks otherwise.)

As we get to the 2x sample rates (i.e., 88.2k or 96k), there is much less damage and perhaps I’d refer to these as “intermediate resolution”.  I say this because of what I perceive as the critical threshold that is crossed when 4x rates (i.e., 176.4k or 192k) are properly done.  While “properly done” still seems to describe the minority of devices carrying these numbers in their spec sheets, those that do achieve it do something I’ve never heard from any other format, including the best analog—and that is what I have been referring to as “getting out of the way”.  This alone makes the 4x rates, to my ears, a bigger jump upward in quality over the 2x rates than the latter are over standard CD.  And this alone differentiates them in my mind as being true high resolution.

While the intermediate resolution rates can sound very good, this is exactly what I believe prevents me from thinking of them as high resolution:  they sound.  I don’t want gear or recordings or formats that sound “good”, “detailed”, “smooth”, etc., I want them to not sound.  I want them to get out of the way, leaving the sound to that which is being recorded, played and listened to—the performance.

I’m reminded of how the video world defined intermediate resolutions—those better than standard but not really high—with the term “extended”.

Personally, I’d place anything at 1x rates (or with a 16-bit word length) in the SRA (“Standard Resolution Audio”) category.  This would include 16/44, 16/48, 24/44 and 24/48.  With the latter two, my experience has been that while the added word length helps, the limitations of having the low-pass filtering so close to the audible range—and thus, its effects within the audible range—mean that in the end, these are all effectively just minor variations of “CD resolution”.  (I would ultimately consider 16/96 or 16/192 SRA also.  There is, in my view, no good reason to record with less than 24-bits and if the release is going to be at one of these sample rates, I would deem word length reduction to 16-bits counterproductive and just plain silly.)

The above is at odds with what appears to be the more common “anything better than CD” definition of high resolution.  To me, that is like saying anything better than a Big Mac is filet mignon.  Or anything better than Night Train is Dom Perignon.  I don’t think so.  I think there are intermediate levels and that it takes more than being better than mediocre (or just plain bad) to qualify as “fine”.

Any 2x recording (88.2 kHz or 96 kHz), again at 24 bits, I would refer to as ERA (“Extended Resolution Audio”).  Now we have a real improvement in fidelity to the input.  It doesn’t quite get out of the way, but to my ears, it is noticeably better than SRA.

HRA (“High Resolution Audio”), I would reserve for 4x recordings (176.4 kHz or 192 kHz), again at 24 bits.  Properly done and played back on gear that can actually perform at these rates, we have the first format in my experience that is truly capable of getting out of the way.  This is what high resolution audio is about.

By these definitions, I would consider Soundkeeper Recordings’ CDs as well as our CD-Rs to be SRA.  The latter certainly sounds closer to the master than the pressing but ultimately, they’re both 16-bit.  I’d call our 24/96 DVDs and 24/96 files-on-disc releases ERA and our 24/192 files-on-disc releases HRA.
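For anyone who likes to see the three categories spelled out as a rule, here is one way the definitions above might be written down. It is a sketch of my own categories only, not any industry standard. The function names are invented for the example, and since I have only discussed 16-bit and 24-bit word lengths, the code declines to classify anything else.

```python
BASE_RATES_HZ = (44_100, 48_000)  # the "1x" sample rates

def rate_multiple(sample_rate_hz):
    """Return 1, 2 or 4 if the rate is that multiple of 44.1k or 48k, else None."""
    for base in BASE_RATES_HZ:
        for multiple in (1, 2, 4):
            if sample_rate_hz == base * multiple:
                return multiple
    return None

def resolution_category(bits, sample_rate_hz):
    """Map a word length and sample rate onto SRA, ERA or HRA."""
    if bits not in (16, 24):
        raise ValueError(f"word length not covered by these definitions: {bits}")
    multiple = rate_multiple(sample_rate_hz)
    if multiple is None:
        raise ValueError(f"sample rate not covered by these definitions: {sample_rate_hz}")
    if bits == 16 or multiple == 1:
        return "SRA"  # any 1x rate, or any 16-bit release
    return "ERA" if multiple == 2 else "HRA"

print(resolution_category(16, 44_100))   # CD and CD-R
print(resolution_category(24, 96_000))   # 24/96 releases
print(resolution_category(24, 192_000))  # 24/192 releases
```

Note that the 16-bit test comes first, so 16/96 and 16/192 land in SRA, just as described earlier.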

Of course, as I’ve long said, my belief is that 90-95% or more of a recording’s ultimate sonic quality has already been determined by the time the signals are leaving the microphones.  The delivery format just determines how much of that original quality is available for playback.  I’d rather hear a CD (or even an mp3) of a Keith Johnson recording than a 24/192 (or the original masters) of recordings from a lot of other engineers.  But best of all, is the HRA version of Keith’s work.

Is “too much” not enough?

As digital audio and the means of playing it back mature, there is an increasing divergence of perspectives to be found on the Internet.  Some revel in the sonics of music heard at high resolution, while others argue that the CD standard is not to be audibly improved upon and still others want even higher resolution.  All this while Joe and Jane Average download one song at a time at resolutions that throw away at least 75% of the information contained on a CD.
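A rough way to see where a figure like 75% comes from is to compare data rates. The sketch below takes the CD data rate (44,100 samples per second, 16 bits, two channels) and compares it with a few lossy download rates. The 128, 256 and 320 kbps figures are assumptions chosen as common examples, not rates quoted by any particular service, and data rate is only a proxy for "information".

```python
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps for stereo CD audio

def fraction_discarded(download_kbps):
    """Share of the CD data rate that a lower-rate download does not carry."""
    return 1 - download_kbps / CD_BITRATE_KBPS

for kbps in (128, 256, 320):
    print(f"{kbps} kbps: {fraction_discarded(kbps):.0%} fewer bits than CD")
```

Even at the highest of these assumed rates, more than three quarters of the bits are gone.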

There are new efforts from some quarters to show Joe and Jane what they’re missing and to elevate what the download services offer.  The idea is to, at the very least, deliver 100% of what the CD offers and at best, deliver true high resolution.  Yet these efforts have spawned Internet “papers” and articles that, in effect, ridicule the very idea of high resolution and argue the supposed inaudibility of its benefits or, worse, suggest that high resolution by definition will sound worse, not better.

I can’t speak for what others find but I can say that whatever these folks are reporting is quite the opposite of what I experience.  I’m hearing fidelity such as I’ve dreamed about for years and when I read those stories, they strike me much as though the authors are trying to convince me there are no colors in a rainbow.

The arrival of high resolution digital has the potential to fulfill the promise digital audio first made more than a quarter century ago.  Back then, astute listeners wondered at the marketing mantra “perfect sound forever” while cringing at the dry, bleached and airless sounds delivered by the first CD players.  While a great deal of progress has been made during the intervening years, the inherent limitations of the format remain.

Looked at in the most rudimentary fashion, the specifications for CD would, on the surface, appear to be all that is needed to perfectly reproduce anything that can be heard.  Human hearing is nominally sensitive to frequencies from 20 Hz through 20 kHz (i.e., 20 cycles per second through 20,000 cycles per second).  As we age, the top end limit decreases and most adults would be lucky to hear 15 kHz.  With CD, music is sampled 44,100 times per second.  That is, the digital recorder “looks at” the sound 44,100 times every second and captures a sample.  According to the theory, all frequencies below half the sample rate (in this case, all frequencies below 22,050 cycles per second) will be captured accurately and, since this is well beyond what most folks can hear, it all sounds quite neat.

These digital samples are each a series of digital bits, with each bit representing one of two binary states or values, often thought of as “ones and zeros”.  Each sample is stored in a digital word.  The CD standard uses 16-bit words, so each sample consists of 16 bits.  The particular combination of ones and zeros represents the level (i.e., volume) of each sample.  A series of 16 zeros (i.e., 0000000000000000) would be the lowest level that can be encoded and represents complete silence.  A 16-bit word representing an intermediate level might look like this: 0111011110101110.  The highest level would be 0111111111111111, a zero followed by 15 ones.  (For technical reasons which are beyond the scope of this entry, the loudest value is not a series of 16 ones.)
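For the curious, the example words above can be reproduced with a few lines of code. CD audio stores each sample as a signed 16-bit number in the scheme known as two's complement, where the first bit marks negative values. That is why a word of 16 ones is not the loudest value: it stands for -1. The function names below are my own, chosen for illustration.

```python
def to_word(sample):
    """16-bit two's-complement bit pattern for a sample in -32768..32767."""
    if not -32768 <= sample <= 32767:
        raise ValueError("sample out of 16-bit range")
    return format(sample & 0xFFFF, "016b")

def from_word(word):
    """Signed sample value for a 16-character string of ones and zeros."""
    value = int(word, 2)
    return value - 0x10000 if value & 0x8000 else value

print(to_word(0))                      # silence
print(from_word("0111011110101110"))   # the intermediate example above
print(to_word(32767))                  # the highest positive level
print(from_word("1111111111111111"))   # sixteen ones
```

The intermediate example works out to 30638, a little below the maximum of 32767.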

A word length of 16 bits allows up to 65,536 different levels to be represented.  The difference between the loudest sound that can be captured and the noise floor of the format is called the signal-to-noise ratio.  Signal-to-noise ratio is measured in units of relative level called decibels (dB).  For a 16-bit format like CD, the signal-to-noise ratio is approximately 96 dB, which means the noise floor (the inherent noise of the format) is 96 decibels below the loudest sound that can be captured.  This is much quieter than vinyl or analog tape.  Any hiss heard on a CD is captured from the source and is not inherent in the medium.  Many folks confuse the signal-to-noise ratio specification with dynamic range (the difference in level between the loudest possible sound and the lowest sound).  We’ll come back to this later and see why this is misguided.
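The arithmetic behind these figures is easy to check.  The sketch below uses the common rule of thumb that the ratio between the largest and smallest step is 2 raised to the number of bits, expressed in decibels as 20 times the base-10 logarithm.  The function names are my own, chosen for illustration.

```python
import math


def level_count(bits: int) -> int:
    """Number of distinct levels a word of this length can represent."""
    return 2 ** bits


def snr_db(bits: int) -> float:
    """Rule-of-thumb signal-to-noise ratio in dB for a given word length."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: {level_count(bits):,} levels, about {snr_db(bits):.1f} dB")
# 16-bit: 65,536 levels, about 96.3 dB
# 24-bit: 16,777,216 levels, about 144.5 dB
```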

The problems start when we move from the theoretical to the practical.  (Someone, perhaps it was Yogi Berra, once said “In theory, there is no difference between theory and practice, but in practice, there is.”)  When digital audio is recorded, any frequencies above half the sample rate can cause problems – they engender aliases or aliasing distortion, false frequencies that are not part of the program material.  In order to avoid aliasing, when digital audio is encoded, as well as when it is played back, most digital processors use a filter to ensure that no frequencies above half the sample rate can pass.  These anti-aliasing filters have audible side effects, manifesting in the time domain – the signal gets smeared in time.  Some designers will use gentler filters to minimize the time smear but in doing so, they cause the higher frequencies to fall off prematurely.  A number of modern playback devices have user-selectable filters where the listener can select between steep filtering and its associated time issues or gentler filtering and its associated frequency issues.
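To make the idea of an alias concrete, here is a small sketch of where an unfiltered tone would land after sampling.  It is the textbook "folding" calculation, not a model of any particular converter, and `alias_frequency` is a name made up for this example.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after sampling with no filter in place.

    Tones below half the sample rate come back unchanged; tones above it
    fold back down into the range below half the sample rate.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(10_000, 44_100))  # 10000 (below half the rate, unchanged)
print(alias_frequency(25_000, 44_100))  # 19100 (a false tone in the audible range)
print(alias_frequency(25_000, 96_000))  # 25000 (below half the rate at 96k)
```

A 25 kHz component, which no one would hear directly, would come back at 19.1 kHz on an unfiltered 44.1k recording.  That is the false frequency the filter is there to prevent.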

So, while CD can capture all the audible frequency range, the requisite filtering means the frequencies delivered to the listener are not all arriving on time or are not all arriving in the same proportion in which they were captured, or some combination of both of these.  One great advantage of the higher sample rates is that the anti-aliasing filter is moved far above the audible range.  This allows gentler filtering to be used without affecting the audible frequency range.

In recent years, thanks in no small part to formats like DVD and others, which are capable of storing more information than will fit on a CD, digital audio has grown up from the 16-bit words and 44.1 kHz sample rates by which sound is encoded for CD.  We’ve had 24-bit audio with sample rates of 96 kHz, 176.4 kHz and 192 kHz.  For reference, a 24/96 (24-bit, 96 kHz) version of a given recording contains more than three times the information contained in the same recording at 16/44 (16-bit, 44.1 kHz).  A 24/192 version contains more than six times the information.  And where a word length of 16 bits allows up to 65,536 different levels to be represented, going to 24 bits increases the dynamic resolution 256 times, allowing up to 16,777,216 different levels to be represented.
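Those multiples can be verified with a few lines of arithmetic.  This sketch treats "information" simply as raw data rate (word length times sample rate times two channels), which is an assumption on my part about how the comparison is meant.

```python
def bits_per_second(word_length: int, sample_rate_hz: int, channels: int = 2) -> int:
    """Raw (uncompressed) data rate of a recording."""
    return word_length * sample_rate_hz * channels


cd_rate = bits_per_second(16, 44_100)
for word_length, rate in ((24, 96_000), (24, 176_400), (24, 192_000)):
    ratio = bits_per_second(word_length, rate) / cd_rate
    print(f"{word_length}/{rate / 1000:g}k carries {ratio:.2f} times the data of 16/44.1k")
# 24/96k carries 3.27 times the data of 16/44.1k
# 24/176.4k carries 6.00 times the data of 16/44.1k
# 24/192k carries 6.53 times the data of 16/44.1k
```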

The widespread use of computers (and computing devices) for audio playback has enabled the proliferation of high resolution audio and emancipated music from the confines of silver discs and the limitations imposed by the process of retrieving music from these in real time.  (Separating the processing “overhead” from the playback will provide higher quality playback.)  Good as the best disc players and transports can be, my experience has been that there is invariably a loss of focus and fine detail, often subtle, sometimes not so subtle.  It is only via proper computer playback that I’ve heard results that I find indistinguishable from listening to the master used to create those silver discs.

This is good news, even for music at CD resolution, because the listener at home can now hear what is effectively the CD master itself.  However, while the limitations of playback from molded disc have been removed, the limitations of the format remain.  In addition to the frequency and time-related issues brought about by having the anti-aliasing filter just above the audible range, there are the consequences of inadequate word length.  Although the noise floor with a 16-bit medium like CD is 96 dB below the loudest possible sound that can be captured by the format, many often confuse this signal-to-noise ratio with dynamic range.  The assumption is that if the noise floor is 96 dB below the loudest sound, sounds just above the noise floor will be captured with the same fidelity, providing a range of dynamics as wide as the signal-to-noise ratio.  In fact, with a 16-bit medium, the fidelity plummets at lower levels.

The full resolution, in this case 16-bits, is only realized for sounds near the top of the volume range.  Each bit captures about 6 dB of the dynamic range (about 6.02 dB to be more precise but let’s use 6 in this example to keep things simple), so in a 16-bit system, sounds lower in level than 6 dB below the maximum will effectively be captured at less than 16-bit resolution.  To wit, if this lower level information is say, 12 dB lower in level, it will be encoded at what is effectively approximately 2 bits less than the full resolution of the format (i.e., 14 bits in a 16-bit recording, 22 bits in a 24-bit recording). If it is say, 36 dB lower in level, it will be encoded at what is effectively approximately 6 bits less resolution (i.e., 10 bits in a 16-bit recording, 18 bits in a 24-bit recording).
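The same reasoning can be written as a short calculation.  This is only a sketch of the rule of thumb used above (about 6.02 dB per bit), not a measurement of any real converter, and the names are invented for the example.

```python
DB_PER_BIT = 6.02  # approximate level change represented by one bit


def effective_bits(word_length: int, db_below_full_scale: float) -> float:
    """Approximate resolution left for a sound sitting below the maximum level."""
    return max(0.0, word_length - db_below_full_scale / DB_PER_BIT)


for level in (12, 36, 60):
    print(
        f"{level} dB down: about {effective_bits(16, level):.1f} bits at 16-bit, "
        f"{effective_bits(24, level):.1f} bits at 24-bit"
    )
# 12 dB down: about 14.0 bits at 16-bit, 22.0 bits at 24-bit
# 36 dB down: about 10.0 bits at 16-bit, 18.0 bits at 24-bit
# 60 dB down: about 6.0 bits at 16-bit, 14.0 bits at 24-bit
```

The 60 dB row is my own added example.  It suggests what is left for something like a fading reverb tail.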

Some information, such as the trailing end of reverb as it fades away, or the higher harmonics of musical instruments, can be well more than that 36 dB lower in level than the loudest sounds and will be encoded with resolutions corresponding to fewer bits.  This results in the thinned, bleached and coarsened instrumental harmonics in even the best 16-bit recordings, as compared to a good 24-bit recording (or of course, the original sound in real life).  It also results in the defocusing of the spatial information and in the relative airlessness in the 16-bit recording compared to a good 24-bit recording (and real life).

While the level meter may show a peak on that 16-bit recording that is within the top 6 dB, this, like the waveform views shown by some computer software, is only a view of the “top” part of the musical waveform — the loudest part.  Sounds and components of sounds that are underneath the top part (i.e., in the background) are not captured as faithfully.  Accordingly,  when considering the dynamic range of the format, it is a good idea to take into account the relative distortion at different levels within the range.  If increasing distortion is not desirable, the real dynamic range potential is going to be considerably less than what the spec sheet might suggest (or is often echoed in the audio press and in some places on the Internet).  Note that even with low level information as in the examples above, a 24-bit recording still delivers more resolution than a 16-bit recording at its best.

Why then, would someone publish a “white paper” against higher resolution or declare that resolutions like 24/192 are “pointless” or worse?  A few possible reasons come to mind:

  1. The higher sample rates place significantly increased demands on the gear used to record and play them back.  For example, digital gear contains an internal clock to control the timing as the device encodes or decodes the stream of digital samples.  Spacing between the samples must be kept accurate or the reconstructed analog waveform that we hear will not have the correct shape and hence, will not provide the correct sound.  Irregularities in timing are referred to as jitter.  Higher sample rates also mean the analog stages of the gear must be able to perform at the wider bandwidths.  Perhaps the folks complaining about high resolution are using gear that does not have clocking that is up to the task and analog stages that can perform at high bandwidth.  Such will either not reveal any benefits or will actually sound worse than they do at the easier, lower rates like 24/96. (This is true of a number of “professional” units as well as those sold to audio enthusiasts.  A  built-in, $250 “soundcard” simply won’t do it, regardless of what the specs claim.  In today’s market, it may cost 10 times this amount for a device truly capable of revealing the potential of these sample rates.  Maybe it is no wonder these folks hear little or no difference.)
  2. It could be possible that the rest of the system these folks are using isn’t up to resolving a wide band recording.  Or it could be that these folks are just not sensitive to these particular differences.  I’ve always found that different folks have different sensitivities to different aspects of sound.
  3. Perhaps they believe CDs (or 24/96) already sound identical to the input signal.  If that is the case, I can understand that anything more would seem wasteful.

Sample rates like 176.4k and 192k don’t, as some have erroneously suggested, “have more jitter”.  Sample rates don’t have jitter.  As stated above, higher sampling rates do place greater demands on clocking accuracy (just one reason why buying a DAC (digital-to-analog converter) “by the chip” is at best a foolish enterprise).  They also place greater demands on the analog stages surrounding the digital stage.
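One way to get a feel for why clocking matters more as bandwidth widens is the standard textbook estimate of the best signal-to-noise ratio a clock can support for a full-scale sine wave.  The sketch below uses that estimate.  The jitter figures are purely illustrative values I picked, not measurements of any device, and the function name is hypothetical.

```python
import math


def jitter_limited_snr_db(signal_hz: float, rms_jitter_s: float) -> float:
    """Textbook estimate of the SNR ceiling set by clock jitter alone.

    Assumes a full-scale sine wave and random timing error with the
    given RMS value in seconds.
    """
    return -20 * math.log10(2 * math.pi * signal_hz * rms_jitter_s)


for signal_hz in (20_000, 40_000, 80_000):
    for jitter_s in (1e-9, 100e-12):
        snr = jitter_limited_snr_db(signal_hz, jitter_s)
        print(f"{signal_hz / 1000:g} kHz, {jitter_s * 1e12:g} ps jitter: {snr:.1f} dB")
```

In this estimate, each doubling of the signal frequency costs about 6 dB for the same clock.  The jitter belongs to the clock, not to the sample rate, but a wider band asks more of that clock.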

Why some would see these characteristics as “flaws” (and write papers or articles on the subject), I don’t understand.  I’ve always gone with empirical evidence over theoretical analysis; that is, when “theory” and direct experience are at odds with each other, I’ll tend to seek a new theory.  (As I see it, theory should explain the experience, not the other way around.)

All this to say, when a firmware upgrade enabled 192k capability in the converters I use for my work, I approached it conservatively — even continuing to do a few recording sessions at 96k because I was familiar with it and could be confident in the results.  But then I started running tests at 192k and quite quickly found I had to get my jaw up off the floor: for the very first time in my experience, I was hearing (with this device anyway) a recording device “disappear”.  I had never heard that before, even with the best analog recorders and most certainly nothing close with the best digital recorders, even with this very device when used at 96k.

Now I felt a threshold had been crossed (I’ve read similar words since then from one of my favorite audio engineers, Keith Johnson).  The results no longer sounded like “great digital”; they no longer sounded “digital” at all.  They didn’t sound like “great analog” either.  The jump from 24/96 to 24/192, when done well, is to my ears a much more significant jump than the one from 16/44 to 24/96.  It’s all about that threshold; this is the promise digital made in 1983, finally and for real.  (While it certainly sounds more faithful to the input signal than 16/44 does, 24/96 doesn’t yet, to my ears, “get out of the way”.  Having the anti-aliasing filter moved well up and away from the audible range definitely helps but it is the rates like 176.4k and 192k where I find the threshold is crossed.  Interestingly, to me, while many speak of the treble response, I find some of the greatest benefits to be in how much more lifelike I find the bass.)  24/192 pointless?!?  Only if real progress in music reproduction is pointless.

While some decry the higher sample rates, either declaring them to offer no audible sonic differences from the lower ones or to offer inferior sonics (!) to the lower ones, other voices are talking even higher numbers.  We are seeing marketers talking 32-bits and sample rates like 384 kHz.  As long as there are customers who are taken in by sheer numbers, there will be those who see an opportunity for commerce, who will accommodate them with sheer numbers.

I get a chuckle out of these things because my experience has been that in reality (that is, with the hardware or software one can go out and buy today, as well as the recordings one can purchase to play on these), I see gear that isn’t particularly clean at 24-bits.  I see other gear which, when presented with 4x sample rates (176.4k or 192k), performs worse than it does at 2x rates (88.2k or 96k).  Yet the spec sheets and ads say “24-bit” (or more) and they say “192k” (or more).  And the reviewers simply echo the numbers.

In the here and now, if it is a minority that can achieve the performance potential of 24/192, I take claims of higher numbers as a joke at best and cynical marketing at worst.  Just my opinion of course but with so few showing they can design for 4x rates, why would anyone think those same few could deliver 8x rates (or more)?  I find it interesting that those claims are not coming from the designers of gear that can achieve the potential of 4x rates.  (We have the equivalent of makers of 2-cylinder subcompacts claiming to make cars that can outrun a Lotus!)

The numbers game isn’t limited to hardware.  I’ve seen one company release CDs they claim were made with “32-bit mastering” and another claim “100 kHz resolution”.  (Do they have 100k gear or are they rounding up from 96k?)  Does anyone think those CDs are anything other than 16-bit, 44.1k?  If there are such folks, I have a fine bridge in New York City to sell.  The tools I use to create a CD master have 80-bit data paths and I’m working at 192k.  The higher quality tools do result in a higher quality CD but should I then say they are made with “80-bit mastering”?  Or that they exhibit “192 kHz [or 200 kHz] resolution”?  I’d rather make records than sell bridges.

The finest 24/192 I’ve heard to date has given me back recordings I have not yet been able to discern from the direct input from my microphones.  (To be clear, I am referring to gear that actually seems to achieve the potential of these numbers and not merely gear that sports them on a spec sheet.)  Would 32/384 sound better?  I suppose I’d have to hear the flaw(s) in properly done 24/192 first.  And second, if the first condition was met (and in my experience, it has not even been challenged yet), that 32/384 gear would have to actually achieve the potential of that resolution and not merely claim it on a spec sheet.  For me, right now, it is just marketing.  Someday perhaps, we’ll have the audible evidence.  Perhaps.  Right now, I’m trying to imagine how it might be better than what is (so far) indistinguishable from the input signal.

24/192?  32/384?  64/768?  Or should I wait for the 128/1536 version?
Is the best of today’s 24/192 too much?  Is it not enough?  I think it is just right.

 

Perfect Sound Forever? (Part 2)

There I was in 1984, Atlantic Records’ “CD mastering department”, responsible for creating a good portion of the masters used to replicate the monthly CD releases for the label and associated divisions (Atco, Elektra, etc.).  Demand for CD was on the increase and it was clear this was where recorded music was going.  The small CD section at the local Tower Records store was a bit larger every time I visited, slowly but surely encroaching upon the real estate that was, for the moment, dominated by vinyl LPs.  I saw customers so eager for new CDs, I got the impression even a disc of dog barks would be a hot sales item.

The manufacturers behind the format proclaimed “Perfect Sound Forever”, distortion-free music on a medium that would not wear out.  It sounded too good to be true.  Like most things that sound too good to be true, it wasn’t true.  I remember the expectation with which I first listened to digital masters and to the earliest CDs.  Despite the raves of my colleagues and those in the press, what I heard every time I listened sounded to me not like an evolutionary step forward for audio but like an electronic equivalent of fingernails on a blackboard, an irritating harshness that felt like a good deal of the music had been replaced by something unnatural, something mechanical, something cold.

A number of colleagues I spoke with did not seem to have the same experience.  In fact, they looked at me askance when I expressed great disappointment in what I’d heard, as if I was missing something so obvious, they couldn’t believe it.  They would point out how flat the frequency response measurements were, that the wow-and-flutter (a measure of speed inaccuracy) was virtually unmeasurable.  They would say “Just listen to the noise!”, amazed to have a medium that did not add any hiss.  I would respond “Just listen to the music!”

Yes, piano recordings did display a steadiness of pitch devoid of the indeterminacy sometimes engendered by analog media (played on less than great tape machines or turntables, or when either the tape was stretched or the vinyl pressing suffered a slightly off-center hole).  If any hiss was audible at all, it was the hiss from the original analog recording.  The digital medium wasn’t adding any that I could detect.  Yet, what good were rock steady speed and dead silent backgrounds when the piano sounded like it was made of aluminum?  And the cello sounded like a cousin of the kazoo?  Instrumental harmonics were bleached into thin, pale ghosts of themselves and the very air around the players (on recordings that had such) seemed to have been sucked from the room.  A great rock record invites the listener to turn up the volume.  Doing so with a rock CD just brought on the headache that much sooner.  What was wrong?

I had done everything I knew to ensure the highest possible quality.  I set up the CD mastering room with the audiophile sensibilities I sought to bring to my work.  I created CD masters bypassing most of the electronics in the room, keeping the signal path as short as possible, introducing only what was absolutely necessary and avoiding extra switches, wires, patch bays, consoles, etc.  I even took to carrying my own cables to work every day, replacing the generic studio cables connecting the output of the tape machine to the analog-to-digital converters with one of the best audiophile designs of the day, one that had repeatedly shown me it was capable of passing more of the musical information, with less degradation than the regular studio cabling.  Still, even with the CD masters created this way, a comparison with their vinyl counterparts, made using a far less purist approach, showed just how much more of the musical information on the master tape made it to the finished LP than ever made it to the CD.  There were no exceptions.  This was the case every single time.  Digital acolytes in the press attributed any favor shown the LP to euphonic (i.e., pleasant sounding) colorations in the medium, where CD was supposedly truer.  But as is often the case, the audible evidence said otherwise.  A well set up $100 turntable/cartridge combination would, in terms of bringing back the sound of the master recording, sonically wipe the floor with a $1000 CD player.

A fellow mastering engineer, one whose work I had admired for years, called one day and invited me to sit on a panel of mastering engineers to discuss CD at a meeting of the Audio Engineering Society in New York City.  I gratefully accepted and not long afterward, found myself sitting at a long table on stage in an auditorium, next to four other colleagues, all of us involved in CD mastering.  When I spoke, I felt quite alone in that my colleagues all sang the praises of the new medium while I (quite shyly at the time) said “I just don’t feel it sounds as good as my vinyl yet.”  (Yet?!?)  I explained how I felt vinyl was revealing much more of the musical information contained in the master tapes.  Despite any technical flaws or issues in manufacturing and playback, things that did not at the time seem to plague CD (at least not when one just looked at the surface of things), vinyl was providing more music and to my ears, that was more important.  When I left that evening, I thought folks were looking at me as though I had two heads.

What we came to learn as time passed and more audiophile companies got involved with digital and CD, was that a major part of that bad sound in the early days was due to the digital recording and playback gear itself, perhaps most specifically in the filtering that is an essential part of these mechanisms but also in the converter chips at their core.  I found it interesting that when folks like Bob Stuart started writing articles about jitter (timing irregularities between samples in the stream of digital data), a number of folks who had previously raved about CD (seemingly because of the “good” specifications they’d read) now found issues with the format.

With the advent of new knowledge came new filter designs and new converter chips.  The players were starting to get better.  Even the Sony 1630 converters I used in the studio got new retrofit filters that made for noticeable sonic improvements.  The CD format was growing in popularity every day and clearly was going to be around for a while.  The thought occurred that vinyl mastering engineers were routinely credited for their work on albums but no one as yet (at least to my knowledge) had been credited with CD mastering.  I spoke about this with management and after a conversation with the art department, saw the first CD booklet with my name in it.  As the format continued to grow and demand for more releases increased, outside facilities were contracted to create masters in addition to the ones that were keeping me busy full-time.  The only problem was the art department was not always informed when a master was going to be done by a third party.  As a result, some CDs I mastered did not have a credit and some CDs mastered by others have my name on them.  (In a way, I came to know whence the phrase “Be careful what you wish for” comes.)

I made some other observations regarding the digital audio of the day.  First, the playback and record sides of the Sony DAE-1100 digital audio editor did not sound the same.  The official word was that a digital tape could be cloned (“clone” being the term used to describe a digital copy) to create an identical copy.  Yet, when I cloned a digital tape and played it back to compare it with the original, the original always sounded cleaner.  Was there some degradation in the copy?  I found it interesting that when I took the tapes out of their respective machines and swapped them, putting the copy in the “playback” machine and the original in the “record” machine, the original now sounded degraded.  It turned out (for reasons I’m still not sure of) that playing back a tape from the playback side of the editor just sounded better than playing the same tape from the record side.

As CD grew, we started using more and more replication facilities.  When sales for a particular release were expected to be large, often a single replicator could not produce a sufficient quantity of discs, so I’d create a CD master and then send clones of that master to different replicators.  When the discs came back, I made another discovery.  The discs from all the replicators sounded different from each other, sometimes subtly so and other times not so subtly.  And none of the discs sounded indistinguishable from the master used to make it.

It was plain to see there was much more to be learned about this digital juggernaut.  My thinking was that we’d had vinyl for about a hundred years.  In another hundred years, I expected CD would be pretty good.  Happily, it hasn’t taken nearly as long as that.  Today, CD can be “pretty good” if not exactly competitive with fine vinyl, despite what is said in some quarters.  Perfect sound forever?  Not to my ears.  It is more like “Decent sound, once in a while” but I can see how that is a bit less catchy as a marketing phrase.

Sonically, there was lots of room for digital to grow.  As futuristic as the equipment seemed at the time, it too, along with many of the very techniques involved in recording and editing, would soon undergo a revolution, as recording and mastering began to take advantage of the nascent world of desktop computing.

Perfect Sound Forever? (Part 1)

In early 1983, I created my first master for Compact Disc.  I first heard of the format nearly a decade earlier, while still in college.  I remember a promotional mock-up, looking very much like a miniature LP jacket.  Inside was a cardboard disc printed with the distinctive rainbow reflections of the real thing.

Atlantic’s west coast affiliate, Warner Brothers, was already creating CD masters when it was decided that Atlantic would open its own CD mastering room.  I was to be the CD mastering “department” and was sent to Los Angeles to spend a few days with my counterpart, learning the procedures Warner Brothers had in place for creating CD masters.

At this point, the only CD mastering rooms I knew to exist were at Sony in Japan, Polygram in Germany, Warner in California, perhaps DADC in Terre Haute and now, Atlantic.  To my knowledge, I was one of the first engineers to do CD mastering.  Technically, the process of creating a master for CD replication is referred to as “premastering”.  To the replication facilities, the term “mastering” refers to the first stage of manufacturing, when the glass master is “cut”.  Glass mastering is the creation of a glass disc, etched by a laser beam recorder.  This disc is electroplated and used as the first part in the process that yields the injection molded finished CD.  Still, in terms of the creative process, which occurs prior to manufacturing, creating a CD master is still referred to as “mastering”.  Mastering, for any format, not just CD, has always been the last step in the creative process and also, the first in the manufacturing process.  It is the last chance to make any adjustments to the sound and it is where the “part” used to initiate manufacturing is created.

In those days, the CD master sent to the replication facility was recorded on a U-Matic video tape cartridge, housing ¾” (~19 mm) wide tape.  It was recorded using the video capacity to store the digital audio signal.  A parallel track stored the usual time code used by both video and digital audio.  The system was built around two U-Matic machines (one to play, one to record), the 1610 (later 1630) analog to digital (and digital to analog) converters, and the DAE-1100 digital audio editor.  Ancillary gear included another Sony device, the DTA-2000, to analyze finished tapes and provide a printout of error occurrences per minute.  This, along with a written “table of contents” indicating start and end time code locations for every track and other incidental details was sent to the replication facility with the CD master.  A pair of U-Matic machines, the 1630, the analyzer and the electronics associated with the editor filled an equipment rack several feet tall.

The editor itself was a small console, a few feet wide.  It contained controls for up to three tape machines (two for playback, one to record), readouts of the time code indicating the location of the tape in each machine, controls to perform editing, and a fader used for gain (i.e., level) adjustments.  Editing in the digital domain no longer involved using a razor blade to physically alter the original tape, as we had always done with analog tape.  (There were some short-lived exceptions in the form of the digital multitrack reel-to-reel recorders that were to come later.)  Digital editing was now effected by playing the original tape while recording the edits onto a new tape.  The finished result needed to be created sequentially.  If, upon listening to the results of an editing session, the producer decided to add to or remove anything from the middle of the program, a new tape was created, requiring the entirety of the program prior to the new edit to be copied first.

As the music was playing and the engineer heard the section where the desired edit point was located, the press of a button on the editor would store a 6-second sample of the music: the three seconds before the button press and three seconds after.  The playback and record machines would stop.  A small wheel in the middle of the editor was used to manually move forward and backward in the captured sample of audio, so the engineer could precisely locate the edit point on the newly recorded tape.  Turning this wheel accomplished what used to be done with analog tape by having one hand on each reel and manually “rocking” the tape past the analog machine’s playback head in order to locate the desired edit point.  Where the edit point used to be marked with a grease pencil, all the engineer needed to do now was press another button on the editor.  Now that the “out” edit point was selected on the record machine, a similar process of location would be done on the playback machine to find the “in” point from which the new tape was to continue.  Once the edit point on each tape was selected, a preview button started a process where both tape machines would shuttle backward a predetermined amount of time, still synchronized with each other, and then both started to play.  The audio would be from the record machine (i.e., what had already been recorded prior to the edit point) until the edit point was reached, when audio would switch to the playback machine, in effect allowing the engineer to hear the edit before committing to it.

If a recording or mixing studio console was reminiscent in some way of an airplane or Space Shuttle cockpit, my first look at the DAE-1100 editor reminded me of Star Trek.  It felt like the future, with its smooth, uninterrupted surface of subtle gray, with darker gray, red, orange and blue “buttons”.  Being able to test an edit without committing to it meant all sorts of edits could be attempted without fear of having to splice together a missed edit.  I used to describe the precision of the edits as allowing me to “get in and out within the width” of the razor blade cuts we used to make.  In comparison, I described the thought of editing with a razor blade as now feeling much like editing with a hammer.

Having long experienced what seemed to me to be the inadequate monitoring in the studios I’d worked in, visited and read about in the professionally oriented magazines, I sought to do something different in the new CD mastering room at Atlantic.  Rather than loudness optimized speakers placed against the wall, near the corners, or over the engineer’s head, or small, dynamically challenged speakers placed where they would create a midrange dip at the listening position (both commonly seen in every studio in my experience), I wanted to bring some audiophile sensibilities into the room.  At my request, studio management agreed to install a pair of Dahlquist DQ-10 speakers (my favorites at the time).  These I placed a few feet off the wall behind them, in free space, with nothing else near the speakers.

Once the room was set up properly and known master tapes played back to my satisfaction, it was time to get my first really good listen to digital audio.  The advance word from the hobbyist and professional magazines, as well as from colleagues who’d already gotten to listen to a bunch of the earliest CD samples, was very positive.  Everyone was enthusiastic.  I was going to hear what had widely been touted as “Perfect Sound Forever”.  With great enthusiasm and anticipation, I listened to my first sample.  Then I listened to another one.  And another one.  I listened to all the samples we had.  I went back and listened to some analog master tapes and vinyl LPs to make sure the monitoring was what I expected it to be.  With the analog sources, it was.  With the digital sources, I wondered just what everyone had been raving about.