Real stereo, loudness wars and a fork in the road

Having been an avid music lover and audio enthusiast since childhood, having done pro audio work in a number of studios for over a decade and having read all the books and journals on the subject I could find, I was not prepared for the experience I had in 1988, when I heard stereo for the first time.  Apart from the unparalleled joy the experience elicited (as in “Who knew audio playback could be this realistic?”), it engendered a great deal of thought, including the realization that what I’d heard until then was essentially dual mono, synchronized but separate programs, not the coherent and convincing whole possible with real stereo.

Now that I’d learned how proper setup of stereo loudspeakers allowed the speakers to better “disappear” and (with recordings containing the information) leave behind a three dimensional sense of the performers and the space in which they played, I saw new possibilities for the recordings themselves.

In my last experiment with direct to stereo recording, I had considered the relationship between the positions of the two microphones during recording and the two speakers during playback.  I thought there might be some reciprocity between the two ends of the chain.  To more closely emulate the space between playback speakers, I used a 6 foot (~1.8 meter) spacing between the microphones.  Considering the time element, my reasoning was that the time a signal took to get from left mic to right mic should match the time it took to travel between the speakers in the listening room, hoping the symmetry would get me closer to “being there” when listening to the result.  While that recording avoided the hole-in-the-middle common to too many recordings made with two widely spaced microphones, I found that instruments positioned slightly off center during the recording session had a tendency to “pull” to the near speaker on playback.  (I wrote an article called Recording in Stereo about my experiences in these tests.  A highly abridged version follows herein.)

How to get stable stereo without introducing the time-based distortions (i.e., “ghost” images) that would result from adding more microphones?  I decided to try some iterative experiments in the studio, recording my speaking voice as I walked around in front of the microphones.  Each test was repeated with the spacing between microphones changed slightly.  I started with my original 6 foot spacing, announcing my position to the microphones, for example “3 feet left of center, 2 feet left of center, 1 foot left of center, center, 1 foot right of center, 2 feet right of center”, etc.  Next, I did the same thing with the mics a bit closer together, then another test with the mics still closer together and so on until the mics were 7 inches (~18 cm) apart, matching the spacing between a typical listener’s ears.
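The effect those spacing changes have on the time cue can be put in rough numbers.  The sketch below is my own illustration, not part of the original tests: it assumes a speed of sound of about 343 m/s and a talker roughly 1 meter from the line of the microphones, and computes the arrival-time difference between the two omnis at three of the spacings tried.

```python
import math

C = 343.0  # approximate speed of sound in air, m/s (room temperature)

def arrival_time_difference(spacing_m, offset_m, distance_m):
    """Time-of-arrival difference (seconds, left minus right) between two
    omnidirectional mics on a horizontal line, for a source at a lateral
    offset from center and a perpendicular distance from the mic line.
    Negative means the sound reaches the left mic first."""
    d_left = math.hypot(offset_m + spacing_m / 2.0, distance_m)
    d_right = math.hypot(offset_m - spacing_m / 2.0, distance_m)
    return (d_left - d_right) / C

# A talker 1 foot (~0.3 m) left of center, 1 m from the mic line,
# at three of the spacings tried: 6 feet, 15 inches and 7 inches.
for spacing_in in (72, 15, 7):
    dt_ms = arrival_time_difference(spacing_in * 0.0254, -0.305, 1.0) * 1000
    print(f"{spacing_in:>2} in spacing: {dt_ms:+.2f} ms")
```

Even in this toy geometry, the 6 foot spacing produces an inter-mic time difference several times larger than the narrower spacings for the same slightly off-center source, which is consistent with the “pull” toward the near speaker described above.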

To quote the article cited above, “On playback, I paid particular attention to the just off center area that had proven problematic with the 6 foot spacing.  Somewhere around 15 inches (~38 cm), things seemed to gel.  All the qualities I liked were there with considerably less vagueness in the image.  My long held belief in recording with omnis spaced at 6 feet was being revised.”

“I started researching just how it is our brains perceive stereo and the cues required for localization (our ability to determine where a sound is coming from).  What I learned was our brains use three types of cues to determine localization:  intensity, time and frequency [specifically, differences in intensity, time and frequency between the sounds arriving at each of our ears].  Then it dawned on me that if nature could have gotten by with fewer cues, it would have done so.  I began to consider what was needed to supply all three types of cues in a stereo recording.”
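To put a rough number on the time cue for ear-like spacing, here is a back-of-the-envelope sketch of my own (not from the article), using the simple straight-line far-field model dt = d × sin(θ) / c:

```python
import math

C = 343.0  # approximate speed of sound in air, m/s

def time_cue_us(spacing_m, angle_deg):
    """Inter-ear (or inter-mic) time difference in microseconds for a
    distant source, simple straight-line model: dt = d * sin(theta) / c.
    theta is the angle off the front axis; 90 deg = fully to one side."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C * 1e6

# Ears roughly 7 inches (~0.18 m) apart, source directly to one side:
print(f"{time_cue_us(0.18, 90):.0f} microseconds")  # prints "525 microseconds"
```

This straight-line model understates the real case somewhat: sound diffracts around the head rather than passing through it, so measured maximum interaural time differences are a bit larger, on the order of 600 to 700 microseconds.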

The last step was the design of an absorbent baffle to be placed between the microphones.  Again from the article, “I’d found my way to record in stereo, incorporating all three types of cues nature uses to inform us of where a sound is coming from:  intensity, timing and frequency.  The timing information provided by the omnis benefited from the disk-shaped baffle which provided increased intensity differences between mics as well as frequency discrimination between mics.”

All the while these experiments went on, I was doing mastering work for a number of labels.  A disturbing trend was making itself evident, in that all too many of the A&R folks and producers at the labels were talking more and more about loudness.  I can recall one “name” producer who asked me “How much do you usually raise the level of tapes that come in here?  We do 6 dB.”  I wasn’t sure how to reply to his query because my experience had long ago taught me that some tapes require the level to be dropped, not raised, if one wanted to get the best possible results from the master.  Other record folks were requesting a “balls to the wall” sound (ouch!).  These were the foundations of the so-called “Loudness Wars”, an arms race of sorts, where folks wanted their record to be louder than everyone else’s.  There were folks who evaluated my mastering work with VU meters and not with loudspeakers!  “If it goes in the black, you lose.”  (For more on the subject, see my article Declaring an end to the loudness wars.)

At this point, I had to stop and ask myself why I became an audio engineer and just what I sought to accomplish in my work.  I knew for sure that the weaponizing of sound and music was not among my goals.  What a strange dichotomy.   As I sought to create recordings that sounded more like life and had more dynamic range, the larger trend in the industry was to eviscerate dynamics, seeking ever greater quantity without regard for the cost in quality.

Where the best records of “loud” music invite the listener to turn up the playback volume, casualties of the loudness wars cause physical discomfort.  My personal take is that the loudness wars have played a large part in the decline of the record industry.  Highly compressed sound brings about a stress response in the listener.  Joe and Jane Average may not be consciously aware of this but as a result, they don’t buy nearly as many records as they used to and they don’t listen to the ones they do purchase as many times as they used to.  New records are supposed to bring pleasure, not a “fight or flight” response.

Having considered my reasons for being an audio engineer, I decided to take a two-fold approach.  First, those making mastering inquiries are asked how important final level is to them.  The many benefits of achieving loudness with the playback volume control, as opposed to recorded level, are explained.  Those whose prime interest is in the quality of the music and sound tend to become clients.  Those who really want loud records are gently referred elsewhere.  (I know many mastering engineers say they prefer not to squeeze the life from their clients’ recordings but consent to do this because they need or want the work.  That is a personal decision each individual must make for themselves.)  Second, my fascination with making records that sound like music itself—as opposed to simply sounding like records—was on the rise.  While mastering can be very rewarding, I came to understand that 90-95% (or more) of any recording’s ultimate sound quality has already been determined by the time the signals are leaving the microphones.  In other words, the overall quality is already there (or not) as soon as the signals enter the mic cables.  Everything else is just relatively minor adjustments to the overall picture.

Having rented time in a few different studios to do my mastering work, I started thinking of designing a space for myself.  The idea was more than appealing since I could have complete control over the acoustic design and gear selection.  Of course, monitoring, as always, was the prime concern.  In addition, it was time to assemble a recording kit of my own and lose the dependence on what I could borrow or rent.  And to provide a vehicle for distributing the new recordings, I was thinking about a new kind of record label.

Digital grows and first experiments in stereo

When I first heard of digital audio, it seemed full of excitement and promise, with claims of perfect sound, perfect copies and a noiseless medium that was indestructible.  When I first experienced the subject of all these claims, I heard pain-inducing sound, questionable copies, new forms of noise and found the media more than a little bit fragile.

The earliest digital systems did well in the published laboratory measurements.  Frequency response was flat, without the “head bump” in the bass or the diminishing energy at either end of the spectrum suffered by analog tape.  Measurements of gross speed inaccuracy showed there wasn’t any.  Signal-to-noise ratio measurements also revealed devices capable of hiss-free recordings.  But when one sat down to listen to the recordings created with these digital systems, they just didn’t sound very good.

The news got better as some designers who heard the flaws in the technology began to study and address its weaknesses.  The devices used to convert signals from analog to digital got better, as did those used to convert digital audio back to an analog signal for playback.  While digital had the edge in terms of measured response, there was still a very long way to go before its sound was going to be competitive with analog.  One of the major stepping stones on that road to progress was the personal computer, which was just coming into popular use at the time.  In the second half of the ‘80s, I was introduced to one of the first computer-based digital audio systems.  Where the first digital editing systems I’d seen seemed like futuristic machines allowing edits I couldn’t have imagined doing with the razor blade and Edit-All bar from the analog tape days, the computer-based system, called a digital audio workstation (or DAW), took the concept an order of magnitude further.  Access was fully random and instantaneous.  No more having to first record everything prior to an edit point because the old system required masters to be assembled in sequence.  No more waiting for tape to wind to a specific place to hear a specific passage.  The entire program (or a tiny fraction of a second of that program) could be viewed on screen at once.  A click of the mouse was all that was required to hear any part of that program instantly.  All sorts of sonic adjustments could be made that could not be made before, at a level of detail unattainable in the past.

Another promise of the digital audio workstation was something I had long looked forward to, which was the elimination of tape.  While it had served well as an analog medium, my experience with tape for digital audio was that it was quite fragile.  A particle of dust was all it took for playback to suffer a “dropout”, a momentary muting of the audio.  Digital recordings on tape didn’t age well either, as our digital tape analyzers confirmed with significantly increased incidence of the digital system’s error correction coming into play as a tape got older.  Some tape formats, like the minuscule DAT (Digital Audio Tape, a digital audio cassette of sorts) used tape so thin and so fragile it was not uncommon for 6-month-old DATs to no longer be playable, the audio devolving from music into something more closely resembling a fax transmission.  The digital audio workstation had an accessory disc recorder, which recorded on blank discs, recordable CDs (or CD-R).  The first blank discs I saw sold for $75 each and the failure rate (the creation of “coasters”) was high.  How far we’ve come since then, with very high reliability, no-failure discs selling for 35 cents apiece!

At this point in my experience, however, I got suspicious.  I’d been there before with new technologies offering undeniable improvements in certain aspects of the quality or in certain aspects of the mechanical operations required to capture audio and turn it into a finished recording for the listener.  There was always that little detail though:  the sound.  Almost a faux pas to mention it in some circles but it is what all this is about, isn’t it?  So I wanted a real demo of this new computerized system.  I wanted to hear what happened to audio that passed through it.  I wanted to compare a CD-R made on one of these systems with the signal used to burn that disc.

While all these developments were occurring, I had been engaged in a related pursuit with my early experiments in recording in stereo.  I had learned and used the techniques common to most studio practices where multiple microphones were deployed to capture multiple sounds which were later combined during the mix down to (the 2-channel, dual mono result that is commonly but erroneously referred to as) “stereo”.  As interesting as this was and as interested as I was in honing the techniques in order to create something more convincing—something that sounded “in here” (in the control room) more like it sounded “out there” (in the studio with the musicians)—I found the idea of a much simpler approach even more intriguing.  I began to experiment with a more first-principles strategy, questioning every single aspect of record making, every single component of the process and every single decision involved.  This was the beginning of what I later came to think of as “The Questions”.  These are questions that need to be asked if one is ever to arrive at answers.  They are the questions I’d never seen mentioned in any of the books on recording I’d ever read or in any of the magazines.  They are the questions I was never taught to ask when I was an assistant engineer, the questions that students in today’s “audio engineering” schools never encounter.  How fortunate I was that it ultimately occurred to me to ask them.

The questions are in fact, simple; so simple, they and the answers they might lead to tend to get overlooked:
“Why this microphone?”
“What results do I expect from selecting this microphone?”
“What results of selecting this microphone might occur which I do not expect?”
“Why place it here?”
“What results do I expect from placing this microphone here?”
“What results of placing this microphone here might occur which I do not expect?”
“Why am I turning this particular knob to adjust the sound?”
“What did I do wrong in a previous step that I believe will be remedied by turning this knob?”
“What results might occur which I do not expect?”

There are an infinite number of questions, as many as there are decisions to be made in the process of making a record, from conception to manufacturing the finished product.  As I set out to find the questions and hopefully some answers to same, I started making recordings in an entirely different fashion.  Rather than layering multiple recordings, each picked up with a large number of microphones, I sought to capture real performances in a single shot, recording “live” (for the microphones), using only as many microphones as there would be playback channels.  In the case of stereo, that meant only two microphones.  (I’ve developed the technique since then to allow layered recording, i.e., overdubbing, where players do not all have to perform at the same time or where a musician or vocalist can perform more than one part.  However, I became increasingly taken with the idea of capturing real performances in real stereo.)

The first tests were solo piano recordings and these provided a great deal of education in terms of capturing what I’d hoped to capture but even more regarding certain aspects of the results that I did not expect.  For all of these tests, with the goal of maximum fidelity in mind, I was using microphones more commonly employed for critical measurements of sound than in making actual music recordings, where microphones with more pronounced sonic character were (and remain) much more the rule.  These were the Danish microphones from Brüel & Kjaer (B&K, now Danish Pro Audio or DPA), with relatively small diaphragms compared to the large diaphragm mics generally used to record music.  The B&Ks were also omnidirectional microphones—they “heard” sounds from all directions—whereas most studio mics have a more directional pickup tending to focus on what is directly in front of them.  (This most common, front-hearing type of microphone directivity is called “cardioid” because of the vaguely heart-shaped laboratory representation of how it “hears”.)  Over time, I came to believe that all microphones are in fact omnidirectional but some (the sort called “directional”) apply more color—are less transparent—to off-axis sounds, those coming from the sides or behind them.  True omnis are more neutral in terms of timbre than their directional counterparts.  They’re better at getting out of the way.  (Of course, not all recordists want their gear to get out of the way.  What is “good” depends entirely on the results one seeks.  For the purpose of making a recording that sounds like what occurs in the presence of the microphones, I want gear that gets out of the way.)

The mics captured so much of what was occurring in the room, they showed me things I had up to then failed to consider in the recording.  Prime among these is the room in which the performance occurs.  In hindsight, this only makes sense since the departure from close mic placement means the engineer is no longer simply mic’ing the instrument; they are mic’ing the event.  The place in which it occurs is very much a key sonic component of the event.  A fine grand piano sounds very different in a nice auditorium than it does in even a large domestic room.  The latter has intimacy but the former is required to access the grandeur—assuming the music and performance call for this.  (Here again, what is “good” depends entirely on the results one seeks.)

For the next experiment, I got permission to use a more suitable space:  Atlantic’s Studio A.  The instrumentation for this project consisted of grand piano, synthesizer, saxophone, bass and drums.  I’d been giving a lot of thought to how I would deploy the microphones this time.  For the earlier solo piano experiments, my thinking was really in terms of the piano, though the results taught me I should have taken a wider perspective to include the space.  Now I was also considering the relationship between the positions of the two microphones during recording and the two speakers during playback.  I thought there might be some reciprocity between the two ends of the chain.  (I will return to this concept in a future entry.)

The recording I made that night had a sense of coherence and focus I had heard on only a tiny number of recordings before then.  Though it was far from perfect and offered a number of new insights on what I should (and should not) do, it was a personal landmark insomuch as it really did offer a sense of being there, of bringing the listener to the performance, in the space in which that performance occurred.

When the folks offering the demo of the digital audio workstation responded to my skepticism by offering to burn me a CD-R, I knew exactly which recording would tell me the most about how passing through that computer system would affect the sound.  When they delivered the disc, I spent a lot of time doing synchronized comparisons of the disc playback with the original recording, switching back and forth between the two.  In the end, at the time, I could not detect a sonic difference.  Further tests of the workstation also revealed that while the existing (pre-computer) system introduced new types of distortions during certain operations, those same operations could be performed on the computer-based system with transparent results.  (There will be more to say about this in a future entry about the evolution of digital audio.)

Happily, digital audio was to make some great progress to get to where it is today.  Before that was to happen though, a perhaps even more earth-shaking experience was coming.  Despite what I’d been taught and what I’d read about in all the years I had enjoyed playing back recorded music, I was soon going to hear stereo for the first time.