
Audyssey Labs' MultEQ

by Patrick Hart — December 28, 2004

Tom Holman had a problem. As Professor of Film Sound at the University of Southern California's School of Cinema-Television, one of Tom's duties was to set up, calibrate and acoustically equalize Norris Theater. This minimum five-hour job was a step-by-laborious-step process of set-up and reconfirmation of settings made months before. Inevitably, recalibration accompanied by more fine-tuning was required to make each channel of the Norris Theater system perform optimally. Norris was but one of literally tens of theaters and studio facilities that Tom had been calibrating manually for many years.

As an elective in a subject which interested him, Chris Kyriakakis had signed up for Tom's introductory course on Film Sound. Kyriakakis was first-year faculty in Audio Signal Processing at the USC Viterbi School of Engineering. He was looking for a difficult problem to tackle so that he could produce a number of publications that would eventually lead to tenure. (In support of this interview Chris sent eleven technical articles, most of which were co-written with other members of the Audyssey founding team.)

Tom Holman, father of THX, had a quick answer to Chris' "problem". Tom wanted himself replaced in the Norris Theater set-up equation. So in the summer of 2000 Tom and Chris set up a test with his students using Norris Theater as the base. All known conventional equalization devices were to be tried out using double-blind, A-B comparisons to determine which system, consumer or professional, sounded best.

Chris Kyriakakis' challenge, Tom said, would be to do his doctoral thesis on something which had never been done. Chris was tasked to take the winner of this "shootout" and audibly best it… double-blind… with Tom as the listener. And, Tom added, Chris' system had to perform this acoustic miracle automatically.

The MultEQ Technologies

Audyssey is a spin-off of USC's Immersive Audio Lab, which Tom Holman and Chris Kyriakakis co-founded nine years ago. MultEQ has been in the works for the last 5 or 6 years and is the result of Sunil Bharitkar's PhD thesis. Sunil was a doctoral student with Chris at USC and is a co-founder of Audyssey. Yes, Audyssey is a start-up, but this effort differs from most in that it has multi-million dollar funding backing it through an endowment from the National Science Foundation. At the time of my interview in early December I got to hear a demonstration of Audyssey's first technology implant into a consumer electronics receiver, the new $6,000 Denon Flagship AVR-5805.

Within the Immersive Audio Lab on the USC campus, I sat down with Chris Kyriakakis and Tom Holman to learn about this ingenious technology. MultEQ is but the first in a series of unique audio solutions that Chris' team is developing. Chris promises there are more spin-off applications in the pipeline that we'll be hearing about in the near future.

MultEQ is a comprehensive approach to system set-up, calibration and room correction. It is claimed to be radically different from the two or three "homegrown" technologies that other companies have put out into the consumer market so far.

Beginning with an analysis of how home theater systems' embedded set-up routines have traditionally been configured, the Audyssey team determined that their system would need much more precise input information from the beginning in order to establish a baseline from which accurate manipulation of the data could take place. The MultEQ system therefore would have to depend on itself to gather more exacting information than could be measured or fed in by a consumer. And, since the system would be configured to use Finite Impulse Response (FIR) filters, the baseline noise level of the "room-with-loudspeakers" system had to be at a specified minimum (in dB) before MultEQ would begin a sequence. Thus, the starting point for measurement is to place the spec-approved microphone at ear level in the main "audiophile" listening position.

Next up in the preliminaries, the system will determine how many speakers are hooked up. In the case of the Denon AVR-5805 the default is 9.1. But as will be seen later, a much more integrated, cohesive sound can be had by adding two extra subwoofers for which Audyssey MultEQ and the Denon are provisioned.

Determination of absolute phase (that is, in-phase with positive polarity) is a critical aspect of on-the-screen/center channel speaker symbiosis that, as Tom points out, was known by Todd-AO as far back as 1954. So Chris Kyriakakis' team has thrown out the common method utilized by other systems of determining phase by looking at speakers in pairs. Instead, the Audyssey system can accurately determine both a positive wave front and distance to the audiophile seat (primary listening position) from the single center-channel speaker. The Denon's on-screen display within the lab indicated that the set-up distances were good down to 0.1 foot. In reality, Chris relates, "We spec that we're accurate to within 1" but actually the accuracy is closer to ¼"." (Note that a wavelength of ¼" corresponds to a frequency above 40kHz, so it begins to become obvious that MultEQ can also deconvolve extremely accurate time and phase information.)
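To make the first-arrival idea concrete, here is a minimal Python sketch. It is not Audyssey's algorithm: the threshold rule, the function names, the toy impulse response, the 48 kHz sample rate and the 343 m/s speed of sound are all assumptions chosen for illustration. It reads polarity and distance from the first strong peak of a measured response, and checks the ¼" wavelength arithmetic from the note above.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C
INCH_M = 0.0254


def first_arrival(impulse_response, sample_rate_hz, threshold_ratio=0.5):
    """Return (polarity, distance_m) from the first strong peak.

    Hypothetical helper: polarity is +1 or -1. The distance assumes that
    sample 0 is the instant the speaker was driven (no converter or DSP
    latency), which is a simplification.
    """
    peak = max(abs(x) for x in impulse_response)
    if peak == 0:
        raise ValueError("silent measurement")
    for index, sample in enumerate(impulse_response):
        if abs(sample) >= threshold_ratio * peak:
            polarity = 1 if sample > 0 else -1
            distance_m = index / sample_rate_hz * SPEED_OF_SOUND_M_S
            return polarity, distance_m


def frequency_for_wavelength(wavelength_in):
    """Frequency in Hz whose wavelength in air is wavelength_in inches."""
    return SPEED_OF_SOUND_M_S / (wavelength_in * INCH_M)


# Toy response: direct sound at sample 140, a weaker inverted reflection later.
toy = [0.0] * 140 + [1.0] + [0.0] * 99 + [-0.6] + [0.0] * 20
polarity, distance_m = first_arrival(toy, 48_000)
flipped_polarity, _ = first_arrival([-x for x in toy], 48_000)
quarter_inch_hz = frequency_for_wavelength(0.25)

print(polarity, round(distance_m, 3), flipped_polarity, round(quarter_inch_hz))
```

Under these assumptions the direct sound lands about one meter away with positive polarity, the later reflection does not change the verdict, and a ¼" wavelength works out to roughly 54 kHz, which is indeed above 40kHz.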

"What does it do if you're measuring dipoles?" I ask. Tom answers, "The system is waveform dependent and since dipoles are not dipolar at 80Hz MultEQ will measure their distance to the audiophile seat the same as it would a monopole." Chris adds that "It looks at the first arrival. The waveform doesn't flip with reflections. You still have a positive peak."

The MultEQ system now sets delays for the prime seat. Why? Other technologies within the MultEQ package require a baseline map of the speaker systems' positions with respect to the audiophile seat before the other embedded technologies can be implemented. And the FIR filters which are utilized can easily be designed to be "linear phase". Such filters, by definition, delay the input signal but don't distort its phase.
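The linear-phase property can be checked numerically. The sketch below is a generic illustration with made-up coefficients, not a MultEQ filter: an FIR filter whose taps are symmetric about the middle behaves like a pure delay of (N-1)/2 samples, so once that delay is removed its frequency response is purely real. An asymmetric filter with the same taps in a different order does not pass the same check.

```python
import cmath
import math


def dtft(taps, omega):
    """Frequency response of an FIR filter at omega radians per sample."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(taps))


def delay_compensated(taps, omega):
    """Response with the nominal (N-1)/2 sample delay removed."""
    delay = (len(taps) - 1) / 2
    return dtft(taps, omega) * cmath.exp(1j * omega * delay)


# Illustrative coefficients only (assumed, not taken from any product).
symmetric = [0.1, 0.2, 0.4, 0.2, 0.1]
asymmetric = [0.4, 0.2, 0.2, 0.1, 0.1]

omegas = [0.1 * math.pi * k for k in range(1, 10)]
sym_residual = max(abs(delay_compensated(symmetric, w).imag) for w in omegas)
asym_residual = max(abs(delay_compensated(asymmetric, w).imag) for w in omegas)

print(f"symmetric residual:  {sym_residual:.2e}")
print(f"asymmetric residual: {asym_residual:.2e}")
```

The symmetric filter leaves only floating-point noise in the imaginary part, meaning every frequency is delayed by the same two samples; the asymmetric one leaves a clearly non-zero residual, meaning its delay varies with frequency.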

The Denon AVR-5805 allows for three subwoofers and each of three optimum crossover frequencies to be determined independently. In finding the best low-pass/high-pass filter set, Tom explains, they used "a priori knowledge" of what the best splices might be. "For instance, the THX spec calls for a 4th order low-pass filter and a second order high-pass. This is not asymmetrical as one might think if you take into consideration that an acoustic suspension satellite system capable of getting down to 80Hz will have its own second order slope." Tom gives an example. "We say, 'If we get this set of measurements then they must be coming from this speaker'. Does that make sense?" When I say "no" Tom goes on to explain, "It is determining what a speaker must be doing versus what the room is doing. Based on the timing (first arrival) information there are very few possibilities from which to choose. So Audyssey fixes both the point and the slope of the sub-sat combo. THX works but you have to use all the ingredient parts and you have to space them in a particular way. That's all you could do at the time. But my gosh, it's been ten years."
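Tom's point about the crossover not being asymmetrical can be checked with a little arithmetic. The sketch below assumes Butterworth-shaped second-order sections and assumes the satellite's own acoustic roll-off sits at the same 80Hz as the electrical high-pass; neither detail is stated in the interview. Under those assumptions the satellite's two 12 dB/octave slopes add up to the same 24 dB/octave as the subwoofer's 4th order low-pass.

```python
import math

CROSSOVER_HZ = 80.0  # crossover point mentioned in the interview


def hp2_db(f, fc=CROSSOVER_HZ):
    """Magnitude in dB of a 2nd-order Butterworth high-pass (assumed shape)."""
    r = f / fc
    return 20 * math.log10(r * r / math.sqrt(1 + r ** 4))


def lp2_db(f, fc=CROSSOVER_HZ):
    """Magnitude in dB of a 2nd-order Butterworth low-pass (assumed shape)."""
    r = f / fc
    return 20 * math.log10(1 / math.sqrt(1 + r ** 4))


def satellite_db(f):
    # Electrical 2nd-order high-pass plus the speaker's own 2nd-order roll-off.
    return hp2_db(f) + hp2_db(f)


def subwoofer_db(f):
    # 4th-order low-pass modelled as two cascaded 2nd-order sections.
    return lp2_db(f) + lp2_db(f)


def slope_db_per_octave(response_db, f):
    """Change in level over the octave from f to 2f."""
    return response_db(2 * f) - response_db(f)


# Measure well away from the crossover, where the slopes are asymptotic.
satellite_slope = slope_db_per_octave(satellite_db, CROSSOVER_HZ / 16)
subwoofer_slope = slope_db_per_octave(subwoofer_db, CROSSOVER_HZ * 8)

print(round(satellite_slope, 1), round(subwoofer_slope, 1))
```

The satellite rises at about 24 dB per octave below the crossover and the subwoofer falls at about 24 dB per octave above it, so the acoustic result is a matched pair of slopes.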

Chris then talks about an adjunct application to MultEQ intended for HTIB systems. It's called PrevEQ. "The main problem with home theater-in-a-box systems is the huge hole between the sub and sat. With an HTIB system we would have the luxury of having the speaker systems in advance so we could pre-characterize the speakers and boost the subwoofer to make it go up higher in frequency (to match with the satellites)." Tom and Chris then both clarify that "the filter placed on the content side is 120Hz for Dolby and 80Hz for DTS which is a problem for movie theaters also. Audyssey can be set up in the Denon for the three subs as filter pairs, each to match with a different satellite set. MultEQ is in fact the only existing system to figure out the distance to the sub for timing information."

"The approach to solving this problem in the past has been based on parametric EQ, which is an extension of what was done with analog equalizers, just done digitally. The first problem is that you never have enough bands, typically 10, using an IIR (infinite impulse response) filter. IIR filters allow you to do things in the frequency domain but they do unknown things to the time domain. In many cases this manifests itself in ringing or smearing."

"Our approach is based on FIR filters which in the past have been computationally intensive but this is not an issue any more because the DSP power has increased so dramatically. FIR filters allow us to correct the time domain and frequency domain at the same time. 'Well, you might say, FIR filters don't give you enough resolution if you want to keep them relatively short.' And that's true. This is the reason we implemented Dynamic Frequency Allocation (another of the embedded technologies) which gives non-linear spacing. So instead of having only 80Hz or so resolution we can get down, at low frequencies (where it matters), to under 5Hz of resolution. It's on a Bark Scale but the resolution starts below 5Hz at the lowest frequencies and goes up to a few tens of Hertz at 20kHz." (The Bark Scale ranges from 1 to 24 Barks, corresponding to the first 24 critical bands of hearing. For computing all-pass transformations, it is preferable to optimize the all-pass fit to the inverse of the map, i.e. Barks vs. Hz, so that the mapping error will be measured in Barks versus Hz.)
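The resolution trade-off Chris describes can be put into numbers. In the sketch below the 48 kHz sample rate and the 256-point grid are assumptions; the Hz-to-Bark conversion is Traunmüller's published approximation, used here only as a stand-in since the interview does not say how Dynamic Frequency Allocation maps frequencies. It compares the uniform spacing of a short FIR filter with a grid spaced evenly in Barks, and it is not meant to reproduce the specific figures quoted above.

```python
def uniform_resolution_hz(sample_rate_hz, taps):
    """Frequency spacing of an FIR filter with evenly spaced bins."""
    return sample_rate_hz / taps


def hz_to_bark(f):
    """Traunmueller's approximation of the Bark scale."""
    return 26.81 * f / (1960.0 + f) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def bark_spaced_grid(f_low, f_high, points):
    """Frequencies spaced evenly in Barks between f_low and f_high."""
    z_low, z_high = hz_to_bark(f_low), hz_to_bark(f_high)
    step = (z_high - z_low) / (points - 1)
    return [bark_to_hz(z_low + k * step) for k in range(points)]


SAMPLE_RATE_HZ = 48_000  # assumed; not stated in the article

short_filter = uniform_resolution_hz(SAMPLE_RATE_HZ, 512)
long_filter = uniform_resolution_hz(SAMPLE_RATE_HZ, 8000)

grid = bark_spaced_grid(20.0, 20_000.0, 256)  # 256 points is an assumption
low_end_spacing = grid[1] - grid[0]
high_end_spacing = grid[-1] - grid[-2]

print(f"512 taps, uniform:  {short_filter:.2f} Hz")
print(f"8000 taps, uniform: {long_filter:.2f} Hz")
print(f"Bark grid spacing:  {low_end_spacing:.1f} Hz (low) to {high_end_spacing:.0f} Hz (high)")
```

A 512-tap filter with uniform bins has the same spacing everywhere, far too coarse in the bass, while a warped grid packs its points into the low frequencies and spends very few of them near 20kHz. That is the general idea behind non-linear spacing, whatever the exact mapping used.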

The conversation now turned to the bottom line technology within MultEQ. The ability to have every seat be a good seat. Again Tom provided his historical perspective from tuning theaters in the early eighties. "While real-time analysis is 'time-blind' (so you have to know something about the time domain before you use it) nevertheless, if you clean it up, it has some advantages over the FFT-based analyzers. The THX R2 (from the eighties) was readily able to do spatial averaging and temporal averaging and we realized if we made an extension of it using a laptop with an add-on spectrum analyzer peripheral that we could send signals across dynamically from the analyzer and do a lot of mathematics to it and therefore clean up the signal."

Chris takes over, "So part 1 was, we knew if you EQ for the single sweet spot then every other position would suffer from much poorer frequency response. (And that was one of the reasons for the bad name 1/3 rd octave equalizers were given.-Tom) Initially Denon and every other potential customer thought 'let's have two modes'. One for a sole listener and one for when you have several listeners in a room. Well, it turns out if you EQ a whole room the audiophile seat gets better. If you take more of the problems of the room into account you're fixing a bigger area than just the audiophile seat so there's no need for two modes."

Chris continues, "The approach other people have taken is to throw DSP at it. There are room correction units on the market that do just that. They can do 8000-tap FIRs and you need 3 DSPs per channel. But if you want to be in a consumer product you have to make some computing decisions. So that was the thinking that went into Audyssey's Dynamic Frequency Allocation."

I then asked, "Does it give the same response at each listening location? How is that possible if, for instance, you have a standard D'Appolito-style center channel, which is known to have a lobe that points mostly toward the audiophile seat?" Chris responds, "By measuring the response at different locations we use a fuzzy-logic based clustering approach which, after computation, makes the sound at the audiophile seat better. The average assigns equal importance to each seat, an importance of 1. Now by applying a weighting factor automatically we use an approach based on pattern recognition. It doesn't have anything to do with what we know about acoustics," Chris stresses. "This is the leap of faith. It is the first application of fuzzy logic that I know of in audio."

"If we were to treat the time domain version of these responses and say which of the criteria are closer to each other as far as pattern similarity, then I find for instance that seats 1, 3 and 5 in the room are "clustered" as far as similarity, seat 2 is by itself and seats 2 and 4 are similarly grouped together."

I interject and ask if the sound the system is reading is mainly direct sound and first order reflections and the answer was "No". "The response that we're taking is quite long. It's 8000 samples over 200 milliseconds. If you look at the time response, it has a pattern. But if the seats have similar problems, they will fall into similar clusters as set up by our pattern recognition method. Where it gets fuzzy is that a particular seat can belong to more than one cluster. In other words, what it says is that based on our theory that seat #2 has 80% of the characteristics of seat #3 but 20% of the characteristics of seat #1. So there are no hard boundaries."

"So now we have six responses which we've clustered into 3 groups. From each response we elect a representative of the cluster. It's not any one (exactly within the cluster), it's one that represents each one in the cluster in the optimal way. That's called a cluster centroid. So now, of the 3 clusters you have, you have 3 representative responses. So you do it again until you finally end up with the "President response" which represents the constituent responses in the optimal way. So the final representative response is the one we take and invert. When we invert we are inverting proportionally and non-linearly."
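For readers who have not met fuzzy clustering, the sketch below shows the textbook fuzzy c-means procedure on six toy "seat responses". It is a stand-in for illustration, not Audyssey's method: the three-number vectors, the starting centroids, the fuzziness exponent of 2 and the iteration count are all assumptions, and the final steps Chris describes (reducing the centroids to one representative response and inverting it) are left out.

```python
import math


def fuzzy_c_means(points, centroids, fuzziness=2.0, iterations=50):
    """Textbook fuzzy c-means; returns (memberships, centroids).

    memberships[k][i] is how strongly point k belongs to cluster i,
    and each row sums to 1.
    """
    centroids = [list(c) for c in centroids]
    exponent = 2.0 / (fuzziness - 1.0)
    dims = len(points[0])
    memberships = []
    for _ in range(iterations):
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            if 0.0 in dists:
                # A point sitting exactly on a centroid belongs fully to it.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [v / total for v in row]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        for i in range(len(centroids)):
            weights = [row[i] ** fuzziness for row in memberships]
            total = sum(weights)
            centroids[i] = [
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dims)
            ]
    return memberships, centroids


# Six toy "responses" (made-up 3-sample vectors, one per seat).
seats = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],  # deliberately halfway between the first two groups
]
memberships, centroids = fuzzy_c_means(seats, [seats[0], seats[2], seats[4]])

for number, row in enumerate(memberships, start=1):
    print(f"seat {number}: " + ", ".join(f"{u:.2f}" for u in row))
```

The last toy seat ends up with a substantial share in two clusters at once, which is the "no hard boundaries" behavior described above, and each cluster's centroid is a weighted blend of its members rather than any one measured response.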

Tom now breaks off and talks about the early two-channel $12K SigTech box which had been brought to Skywalker Sound when Tom was still working at The Ranch. The SigTech supposedly fixed the first 50 milliseconds of time. Tom had tried an A-B with and without the SigTech on the Skywalker sound stage which he had previously calibrated using the THX R2. The SigTech was inaudible in double-blind testing. This prompts my question, "So if a room is properly treated, can the system sound better?" And the answer from Tom: "You can't fix a first reflection" (electronically).

The Listening Environment

USC's Immersive Audio Lab appeared larger than an IEC-standard 3000 cubic foot listening room. I would guess the dimensions at ~15' W x ~20' L x ~12' H, or approximately 3600 cubic feet. Since the Denon AVR-5805 is THX Ultra 2 certified, for our test purposes there should have been sufficient power to take the prospect of amplifier distortion on peaks out of the listening equation.

There were two complete 5.1 systems in the room. The powered, 12" three-way amplified Genelecs that Tom always uses for a reference and a second, much lower priced Klipsch system which would be used for the day's demonstration.

In case you're wondering, the Genelec systems used for "dipole" side-mounted surrounds consisted of two 12" three-way systems stacked, one on top of the other, but with one system facing toward the front of the room and the second system facing toward the rear. These two giant "dipoles" were mounted about 8' high on custom-made steel four-poster stands and positioned at about 80 degrees from the front wall's center, just in front of our listening position.

The Klipsch speakers were a ~$2500 system featuring double 5.25" mid-woofers in the two-way center channel speaker. Klipsch's ubiquitous Tractrix horn tweeter was between the two mid-woofers. This speaker was mounted horizontally on a 48" stand in front of us so that it could fire just over the A/V mixing console that dominated the center of the room in front of the listening position. The left and right speakers appeared to be the same models as the center except that they were mounted vertically on somewhat shorter stands. The surrounds were 5.25" two-way dipole designs with an approximate 90 degree included angle on their opposing faces. They too were mounted on stands. We three were standing a bit back from the console, almost on a plane with the rear speakers when listening.

It appeared that the room had been carefully treated with a mix of moveable and fixed absorptive and diffusive panels on the room's sides and back wall. On the front wall, in addition to an ~80" projection screen, there were a couple of absorptive panels along with just a couple of foam, 7th order quadratic diffuser panels. The ceiling was a drop ceiling which, if not properly weighted from above, can act as a bass vent as SPLs rise. One of the Audyssey technology papers which Chris and his partners have presented in the last couple of years describes a test room with a virtually perfect RT60 = 0.3 seconds. I figured this was the room.

The first-article Denon AVR-5805 that was to be used for the demonstration was laid ignominiously upside-down on its top so that the TI chipset containing the MultEQ 512-tap per channel algorithms could be accessed. The Denon's cooling fans were running full time as a result of having their intended (top) air box outlets blocked by being put in such an unorthodox position. But even full-on, the sound of the fans was noticeable only with no program material playing and having an ear within 2 feet of the unit.

The Demonstration

I had been intending to rent the "Standing in the Shadows of Motown" DVD-Video for quite a while, but hadn't. So unfortunately, I thought, I would be going into yet another of the hundreds of audio demos I had heard over the years, without a good point of reference, such as how this acclaimed documentary might sound on my home system.

Chris cued up track 1, the live concert segment featuring Joan Osborne's rendition of "(Love is Like a) Heat Wave". Further thoughts of any reason for an A-B with my well set-up home system vanished. After thirty years in this business, as a Product Manager, as a speaker designer, even as a trained listener in Harman's well-regarded (and very neutral) Multi-channel Listening Lab, I have never heard such a monumental improvement in the sound of an audio system as I heard with the Denon AVR-5805 with MultEQ engaged.

Conversely, the sound of the Klipsch left/center/right trio without MultEQ engaged was distinctly Klipsch and resided in each of the three front speakers at their 4' height level. Joan's voice came from the center channel and the sounds of the band and crowd could be heard neatly separated to their appropriate left and right positions. In the rear, the crowd sounds could be distinctly heard to our left and right with a gap in the crowd sound between the left side surround and left front speaker and with a similar gap between the right side surround and right front speaker.

With MultEQ engaged once again the entire surround sound stage defined by the crowd wrapped completely around us in a 360 degree circumference as if we were situated at the camera angle as seen on the screen, about 30 feet back in the audience from the stage. In addition, the crowd sound had both depth and height, adding to the 3 dimensional effect. The crowd clapping was distinct and individually delineated for audience members who had been close to the surround microphones. Up front, Joan's voice and the sound of the band were up at exactly their height and spread precisely across the stage as depicted on the screen. (This was three feet above the actual height of the speakers!)

[Figure: two panels, "MultEQ Engaged" and "MultEQ OFF"]

You could hear that the recording engineer accurately tracked and mixed both the vocals and the band's instruments so that the band's stage reinforcement sound system, as heard at that distance, was probably exactly what the crowd had heard.

The drums and bass guitar were completely distinct from each other, with defined slam at all frequencies, and tuneful as all get out. (Did I mention that this perfect splicing of the satellite center channel to the LFE channel was done with a single subwoofer?)

What was heard was the sound of a great band backing up Joan Osborne who herself was belting out the Martha and the Vandellas hit for all she was worth. The room's walls had disappeared. I mean, I could still see that the walls were there but the immersion experience was so deep that it seemed like the sound actually expanded, in a completely natural manner, beyond the wall's boundaries.

The speakers and any sonic character I might have attributed to that particular brand from previous listenings became irrelevant. This was the most realistic, electronically reproduced presentation I've ever heard. It seemed apparent that the Denon/Audyssey system was pulling off the recorded information in a manner which had been heard by the recording engineer at the time of the event. The only spoilers to the illusion were the two dimensionality of the video itself and that the lights were on in the room.

Going Forward

One of the lessons I've learned over the years is that if you're aware that you're listening to something, whether it be the speaker, the amplifier or a new set of cables, then, by definition, you're listening to a distorted reproduction generated by some component within your system that is itself flawed. Reproduction at home has almost always been so. For it is our listening environment which speaks to us just as loudly as the reproduction we hear from even our finest CDs and DVDs when played back within that room.

No electronic system is capable of changing the actual reverberation characteristic of a room by absorbing the acoustic energy of reflecting sound waves. Rather, the energy is manipulated, with appropriate attention and understanding given to the psychoacoustic importance of each and all frequency bands as related to the room/loudspeaker system characteristics at the listening position(s).

Having said that, the Audyssey MultEQ algorithms appear to offer the most accurate solutions to date for resolution of absolute phase, seamless splicing of subwoofer-satellites, and "every seat is a good seat". The technology fits neatly within Texas Instruments' Performance Audio Framework DSP package and is apparently very scalable and therefore memory-efficient. Yes, the Audyssey technologies, promising as they are, must still prove themselves in the consumer arena where perfectly treated rooms do not exist. But having seen and heard this first, brilliant execution of MultEQ, I have no doubt that Audyssey Labs is up to the challenge.

Audyssey's Founding Team consists of:

Chris Kyriakakis
co-founder and CTO

Tom Holman
co-founder and Chief Scientist

Sunil Bharitkar, PhD
co-founder and VP Research

Philip Hilmes
co-founder and VP Engineering

For more information visit Audyssey