Why We Measure Audio Equipment Performance
In the Beginning…
As a former telecommunications electrical engineer, I come from a profession where rigorous analysis and measurement are necessities when designing a new product. Working for one of the largest government defense contractors in our country, there was little room for subjective opinion or impressions for a finished good. Our client would give us a set of requirements, and I would be tasked to design a box with the necessary electronics contained inside that would meet or exceed those requirements. Failure to do so would mean losing our contract, and I would subsequently lose my job. Doing that line of work for nearly 7 years prior to quitting to pursue Audioholics.com full-time was a lucrative and educational endeavor for me. I learned the importance of correlating design metrics with hand calculations, computer simulations and finally measured lab results. When all three matched, it meant my job was done and I was ready to bring the prototype to first article production for field tests.
When designing audio-related products I was always tasked to measure audio quality in terms of SPL output level achieved over a specified bandwidth and below a certain distortion threshold. One project I designed for NAOC (National Airborne Operations Center) was called a Red/Black speaker. The purpose of this product was to receive Red (secure) and Black (non-secure) audio into one device while keeping the signals isolated from each other but summing them onto a loudspeaker platform capable of delivering at least 20dB higher output than the aircraft noise floor, which I measured to be almost 70dB! I’ve always been the type of engineer that over-designs, meaning if you give me a spec to meet, I will typically design to exceed it by a fair margin. The speaker I designed for NAOC wound up producing almost 5dB more output than was spec’d by the client and had a wider bandwidth and lower distortion than required. My manager was not pleased, stating I designed a Cadillac on a Hyundai budget. In reality, the few better components I used really didn’t involve a significant cost adder. His angst towards me was quickly squashed when we were all sitting in a meeting room with top ranking Generals and Colonels commenting that this new speaker sounded better than their personal home theater systems. What they didn’t realize was that prior to the final design, I spent weeks tweaking the feedback circuit of the power amplifier, testing various loudspeaker drivers, enclosure stuffing and Zobel filter networks to ensure the driving amplifier was happy with the load it was supplying current to. Some of this stuff wasn’t measurable or quantifiable, at least not at that level of my career. I went with instinct and I relied heavily on the most important instrument at my disposal: my ears!
While measurements were important to validate that my design was correct, they didn’t play as big of a role as expected in the decision making process when putting the finishing touches on the design. Needless to say, the product worked, the clients were happy, my bosses were happy that their clients were happy and my job was secure. In fact, my job was so secure that by the time I turned in my resignation papers, they continued to call me to come back for months not understanding how I could prefer working on an internet site over designing top secret audio communications equipment for the government. I’m not a big fan of sitting in a cubicle and I kindly suggested they watch the movie “Office Space” to gain some perspective.
Subjective vs. Objective
Over the years we’ve found there are some consumers who purchase products based on measured performance while others rely solely on subjective impressions. Hence the subjectivist vs. objectivist camps were born and to this day still slug it out with each other.
There are some manufacturers who simply will NOT submit review samples to us because we produce 3rd-party verification measurements. There are also manufacturers more than willing to submit their review samples for us to measure. We have found a strong correlation between high-performing products and manufacturers that freely allow 3rd-party organizations like Audioholics.com to measure them. This indicates to us that these manufacturers have nothing to hide and stand behind the products that they produce. Even if their products don’t measure up to expectations, you have to respect them for submitting samples and allowing for full public disclosure of performance. Still, regardless of how well a product measures, it doesn’t really give us a full picture of performance. The only exception perhaps would be passive components like cables, but amplifiers and especially loudspeakers are much more complex devices.
It’s important to run down the list of various audio components to briefly discuss what can and cannot be easily measured.
Cables are the easiest, most basic devices to measure, yet they are typically surrounded by the most controversy and pseudo-science. I believe this is the case simply because it’s the only way exotic cable manufacturers can make their products stand out from their competitors. Unless a cable is poorly designed, it really won’t perform much differently than a budget yet correctly designed alternative. In fact, we live by the motto “only poorly designed cables are sonically distinguishable”. You really have to botch design parameters to make a cable act like a tone control. There are some folks that like using cables as tone controls and to them I say more power to you. Who are we to judge how one likes to alter the sound of their systems? Accuracy isn’t always a primary goal for audiophiles, which is very apparent when they deliberately seek out cables that are unusually high in inductance or capacitance.
Many exotic cable manufacturers prey on the consumer’s lack of knowledge of basic electrical theory. They do so by wrapping their products in bogus science (a.k.a. snake oil) that can’t be measured or quantified. They claim you just have to trust that their design is the holy grail and will transport your system to new sonic heights, and if you can’t hear the difference it’s because your components aren’t good enough to allow you to discern it. Our response is that this is total hogwash. Cables are easy to measure and it’s easy to quantify differences to determine if there will be any deleterious measurable effects that will lead to audible sonic differences. For HDMI, it’s even easier to measure how well a cable will transmit its data across a certain distance. If the cable fails, it will exhibit sparkling in the picture or audio dropouts. HDMI transmits digital signals that either get from the source to the display correctly or don’t. Period.
For more information, check out our A/V cable tech articles.
Amplifiers & A/V Preamps
The primary job of an amplifier is to increase the low level audio signal to high output levels for driving a loudspeaker. An ideal amplifier would have infinitely high input impedance and zero output impedance, which would allow it to hold its output voltage constant and therefore double its power output each time the loudspeaker impedance is halved. In reality, very few amplifiers behave like a true voltage source due to limitations in the power supply or output devices or both. A great deal of what contributes to good sound can be measured for an amplifier, such as bandwidth linearity, Signal to Noise Ratio (SNR), crosstalk and full bandwidth power under various loading conditions. For A/V preamps, we measure their bass management circuits to ensure they properly filter “small” speaker channels and sum the bass to the dedicated LFE channel.
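The voltage-source behavior described above can be sketched with a little arithmetic. This is a minimal illustration, not a model of any real amplifier: the function name, the 20V drive level and the 0.5 ohm output impedance are assumptions chosen for the example, and the sketch ignores the power supply and output device current limits that dominate in practice.

```python
def load_power_watts(source_volts_rms, output_impedance_ohms, load_ohms):
    """Power delivered to a resistive load by a source with a series output impedance."""
    # Voltage divider between the output impedance and the load.
    load_volts = source_volts_rms * load_ohms / (load_ohms + output_impedance_ohms)
    return load_volts ** 2 / load_ohms


if __name__ == "__main__":
    for z_out in (0.0, 0.5):  # ideal source vs. an assumed 0.5 ohm output impedance
        p8 = load_power_watts(20.0, z_out, 8.0)
        p4 = load_power_watts(20.0, z_out, 4.0)
        print(f"Zout={z_out} ohm: 8 ohm -> {p8:.1f} W, 4 ohm -> {p4:.1f} W, ratio {p4 / p8:.2f}")
```

With zero output impedance the power ratio between the 4 ohm and 8 ohm loads is exactly 2; any series impedance pulls the ratio below 2, which is one reason power is measured under several loading conditions.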
For more information on how we measure amplifiers see: Basic Amplifier Measurement Techniques
It’s important to note that although measurements can tell you a great deal about how an amplifier performs, we typically measure them with a worst case signal (a continuous sine-wave sweep) under an ideal test load (a resistor). As a result, you don’t really get the full picture of how the particular amplifier will work in your system driving real speakers while listening to real music. While measurements are important to weed out the bad performers, listening tests still play a vital role in capturing the overall performance of the product.
The primary job of a source device (i.e. a BD, DVD or CD player) is to play back physical media and transmit its signals via analog or digital connections to a preamplifier. The analog outputs of a source device are not unlike those of preamplifiers, which are fairly straightforward to measure. Bandwidth, SNR, drive level vs. distortion, etc. can all be easily quantified. The quality of components that make up a DAC (including the analog circuits) plays a big role in how well the product will measure, but so does another often overlooked design aspect: component layout. Just because a manufacturer uses a fancy name DAC in their product doesn’t guarantee good performance. Board layout, power supply robustness and decoupling are equally important. Audible differences between two products can be anywhere from small to large depending on the competency of their designs. These sonic differences tend to disappear when directly comparing their digital outputs fed into the same A/V processor or receiver. Unless the source device is down-converting a multi-channel bitstream to another format before transmitting it to your A/V preamp or receiver, the two devices should produce a transparent sonic signature and be virtually indistinguishable from each other. We measure the analog outputs of source devices as well as how they decode and transmit bitstream data. But more and more consumers are simply bypassing the analog outputs of their source devices in favor of HDMI to take advantage of the new HD audio formats such as TrueHD and DTS-HD.
For an example of how we measure source devices, see:
Oppo BDP-93 & 95 Audio Measurements Review Supplemental
The subwoofer is arguably the most important speaker in a multi-channel playback system. It provides the deep and tactile bass response to recreate explosions and other low frequency sound effects in movie soundtracks.
As of late, we’ve been investing a lot of resources into developing the industry’s most comprehensive measurement standard for evaluating subwoofer performance.
A lot can be gauged by measuring a subwoofer properly. By properly, we mean measuring the subwoofer outdoors away from adjacent surfaces via groundplane technique or freespace.
Output, bandwidth linearity, distortion and how well a sub behaves under high output are all measurable parameters. Because we measure every subwoofer with the same strict protocol, it is easy for a consumer or manufacturer to make apples-to-apples comparisons between them.
As good as our measurement protocol is, it still doesn’t give you a complete picture of how a sub will sound in a real room under real world listening conditions.
While the CEA2010 test signals detect harmonic distortion, the test procedure can still produce passing results for a subwoofer that sounds overdriven or distorted due to port, mechanical, or enclosure noises, buzzes, and rattles. Another subwoofer may fail for 3rd harmonic distortion yet be subjectively free from these noises and sound much better to the ear. For example, the really deep bass frequencies below 20Hz often fail for 2nd or 3rd harmonic distortion or noise in the F6 and higher bandwidth, which has a threshold of only -40dB from the fundamental, but subjectively the subwoofer output does not sound bad. The extra low signal to noise ratio when testing very deep bass frequencies outdoors can also allow background noise to cause a failure. At other times, when conducting CEA2010 type testing at the upper end of the bass range (50-125Hz), the sub can be producing much louder levels and might be eliciting some spurious vibration, buzz or rattles from the driver or cabinet, but CEA2010 shows it to be well below the distortion thresholds. Subjectively the sub is obviously not sonically "pure" to the ear at that point, but it is judged as clean from the standpoint of harmonic distortion. In theory, a subwoofer is allowed nearly 40% THD if all distortion harmonics came in just below the threshold of each bandwidth. Some units actually produce 30% THD and still manage to sound rather good at the same time.
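To see where a figure of nearly 40% can come from, here is a rough sketch of the arithmetic. The -40dB limit for the 6th and higher harmonics follows the text above; the limits assumed for the 2nd through 5th harmonics, the cutoff at the 10th harmonic, and the names used are illustrative assumptions only and should be checked against the standard itself.

```python
import math

# Assumed per-harmonic limits in dB relative to the fundamental.
# Only the -40 dB value for the 6th and higher harmonics comes from the article;
# the lower-order values are illustrative assumptions.
ASSUMED_LIMITS_DB = {2: -10, 3: -15, 4: -20, 5: -20,
                     6: -40, 7: -40, 8: -40, 9: -40, 10: -40}


def thd_percent(harmonic_levels_db):
    """THD in percent from harmonic levels given in dB relative to the fundamental."""
    # Convert each level to a power ratio, sum, and take the root for an amplitude ratio.
    power_sum = sum(10 ** (level_db / 10) for level_db in harmonic_levels_db)
    return 100 * math.sqrt(power_sum)


if __name__ == "__main__":
    worst_case = thd_percent(ASSUMED_LIMITS_DB.values())
    print(f"THD with every harmonic sitting at its limit: {worst_case:.1f}%")
```

Under these assumptions the total lands at roughly 39%, dominated almost entirely by the 2nd and 3rd harmonics.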
CEA2010 testing is a great gauge of performance but we look at it more as a measure of peak SPL and dynamic output capabilities rather than a sound purity gauge. That's not to say that it isn't useful in that aspect but there can be so much variation in units, the way that they behave and the way that human hearing varies depending on the frequency band that quantifying this with a single quick measurement is simply not possible.
This is why we do subjective listening tests and also listen to how the sub sounds during our continuous reverse sine-wave sweeps, which can give a more revealing glimpse at how the subwoofer behaves when pushed to the limit and what types of driver, passive radiator, or port overload noises develop, or whether there are resonances, rattles or buzzes related to the cabinet or grille that might develop at high power. These things are often not readily apparent during CEA2010 burst type testing.
This is explained in greater detail in our article: How to Interpret our Subwoofer Measurement Data
The bottom line is subjective listening tests are still needed to get an overall complete picture of true subwoofer performance.
We saved the best (or at least the most complicated) for last. The job of a loudspeaker system is to convert the electrical audio signals from your amplifier into mechanical vibrations to fill your theater room with sound. One would think it’s a fairly straightforward process to measure and quantify loudspeaker performance similar to that of a subwoofer. But, it’s really not the case.
Most loudspeakers have an array of drivers playing at different frequencies. At low frequencies, the loudspeaker behaves much like a subwoofer does in that its bass waves radiate omnidirectionally. As frequency increases, the radiation pattern of the loudspeaker typically becomes narrower. Depending on the driver topology, the acoustical convergence of all the drivers will occur at different distances. A simple loudspeaker containing only one driver in a sealed box can easily be measured nearfield but a multi-driver loudspeaker employing multiple woofers, midranges and tweeters may not acoustically converge until you reach a distance of 2 meters or more. Thus a fixed measurement distance cannot be specified to encompass all loudspeaker designs. It’s also very difficult to measure a loudspeaker in a room and in doing so, care must be taken to properly sum the driver and port responses and/or gate the measurement to remove as much of the room effects as possible.
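As a rough illustration of what gating involves, the sketch below estimates how long the measurement window can stay open before the first reflection arrives, and the approximate lowest frequency the gated data can resolve. It considers only a single floor bounce, assumes a speed of sound of 343 m/s, and uses the common rule of thumb that the low-frequency limit is about one over the gate time; the function name and example geometry are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly room temperature


def gate_window(distance_m, speaker_height_m, mic_height_m):
    """Return (gate time in seconds, approximate lowest valid frequency in Hz).

    Only the floor reflection is modeled; walls and ceiling are ignored.
    """
    direct = math.hypot(distance_m, speaker_height_m - mic_height_m)
    # Mirror-image path: the reflection behaves as if it came from below the floor.
    reflected = math.hypot(distance_m, speaker_height_m + mic_height_m)
    gate_s = (reflected - direct) / SPEED_OF_SOUND_M_S
    if gate_s <= 0:
        raise ValueError("speaker and microphone must both be above the floor")
    return gate_s, 1.0 / gate_s


if __name__ == "__main__":
    gate_s, low_hz = gate_window(distance_m=2.0, speaker_height_m=1.0, mic_height_m=1.0)
    print(f"gate: {gate_s * 1000:.2f} ms, data valid above roughly {low_hz:.0f} Hz")
```

Moving the microphone farther back to let the drivers converge shortens the gate, which raises the lowest usable frequency; that trade-off is part of why in-room loudspeaker measurements are difficult.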
For a more detailed discussion on this topic see: Audio Measurements – the Useful vs the Bogus
We've also recently created a Loudspeaker Measurement Standard in an attempt to more accurately quantify objective performance.
Even if we measure a loudspeaker’s listening window (on/off axis response) and power response (frequency response 360 degrees around the loudspeaker), it still doesn’t give you all the information on how it will play or more importantly sound in a real room. It’s often difficult or near impossible to quantify in measurements the “airiness” and depth in soundstage a loudspeaker can portray in a room. Some of this has to do with how smooth and narrow the baffle is with relation to the tweeter. A lot of it also has to do with the quality of the drivers and execution of the crossover network. Phase coherency through the crossover region also enhances the sound for improved pinpoint imaging and sound stage. A tweeter with great off-axis response, high power handling, and low resonance frequency, allows the designer to cross the system over at a lower frequency to prevent the midrange/woofer driver from beaming. These attributes aren’t always quantifiable by simple frequency response measurements. Listening tests and just knowing how to make the best usage of the components in the design are paramount.
Most loudspeaker distortion measurements published by A/V magazines and manufacturer marketing literature are done at low power and by sweeping single discrete tones across the frequency range. While these measurement systems can be very accurate for measuring frequency response, they are typically not so great at measuring distortion. I’ve seen this firsthand when measuring the speakers built into my display device: just gazing at the results, one would think it was a high performance speaker system.
Measurement systems like LMS, for example, have dual band-pass tracking filters for routine SPL measurements and frequency sweeps, but these filters are not steep enough to reject the fundamental tone and other nearby harmonics. Consequently, the true distortion may be higher than the curves they produce suggest, and the curves may look smoother than reality. The folks at LMS told us the stock M31 microphone has at least 20-30% distortion at higher SPLs, making it impossible to accurately measure loudspeaker distortion without instead measuring the limitations of the microphone. This is the case with many measurement systems and not just a limit of LMS.
A sine-sweep test will be unrevealing for problems such as driver break-up or low frequency modulation of a midrange driver caused by an improperly designed or missing crossover network. To see these problems, one must put in two frequencies simultaneously and view the output on a spectrum analyzer, where the distortion products show up as sum and difference frequencies. This is known as Intermodulation Distortion, or IM. Unlike THD, IM is readily audible even in very small amounts because it is not harmonically related to the music in any way. Western music is based on octaves and musical 3rds, so 2nd- and 3rd-order THD components typically sound like they’re “part of the music,” and the ear tends to dismiss them, even in relatively large amounts. Conversely, IM distortion is very dissonant and harsh-sounding and it stands out like an audible ‘sore thumb.’ Measuring IM distortion is a fairly simple process and one all audio engineers are familiar with, but the results are rarely published by the manufacturer or professional reviewer. It’s also very difficult to measure loudspeaker distortion at high output levels because many competent designers employ tweeter protection circuits in their designs, which will be activated when presented with a large continuous sine-wave sweep generated by the measurement system. Thus some sort of burst testing would seem more appropriate. Over the years this has been proposed at AES but nobody has come to an agreement for standardization. We may revisit this topic in the near future, as a detailed discussion goes beyond the scope of this article.
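The two-tone idea can be demonstrated numerically. The sketch below is a toy model, not a loudspeaker simulation: it passes two tones through an assumed memoryless nonlinearity (y = x + k2·x²) and then checks for energy at the difference frequency, which neither input tone contains. The tone frequencies, sample rate, k2 value and function names are all assumptions chosen for illustration.

```python
import cmath
import math


def tone_amplitude(samples, freq_hz, sample_rate):
    """Amplitude of the sinusoidal component at freq_hz (a single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


def distorted_output(freqs_hz, sample_rate=8000, k2=0.1):
    """One second of summed unit-amplitude tones passed through y = x + k2 * x**2."""
    out = []
    for i in range(sample_rate):
        t = i / sample_rate
        x = sum(math.cos(2 * math.pi * f * t) for f in freqs_hz)
        out.append(x + k2 * x * x)
    return out


if __name__ == "__main__":
    f1, f2 = 1000, 1300
    single = distorted_output([f1])
    pair = distorted_output([f1, f2])
    for label, sig in (("single tone", single), ("two tones", pair)):
        print(label,
              "| 2*f1:", round(tone_amplitude(sig, 2 * f1, 8000), 3),
              "| f2-f1:", round(tone_amplitude(sig, f2 - f1, 8000), 3),
              "| f1+f2:", round(tone_amplitude(sig, f1 + f2, 8000), 3))
```

With a single tone, this nonlinearity only produces a harmonic at twice the input frequency; the sum and difference products appear only when both tones are present, which is why a single-tone sweep cannot reveal them.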
Suffice it to say, distortion measurements, accurate or not, are NOT a substitute for what our ears can hear. We are very sensitive to certain types of distortion and compression. Compression is basically a squashing of the music’s dynamic range (the loud and soft passages becoming less distinguishable). Think of how a bad MP3 or XM radio sounds compared to a high quality CD, DVD-A or SACD.
Recommended reading for more info on Compression: Dumbing Down of Audio
Compression in loudspeakers is usually caused by heating of the drivers and/or crossover elements, or the driver voice coils jumping the gap under high input levels. We are especially sensitive to compression in the midrange frequencies (200Hz to 4kHz). Extensive listening tests at various loudness levels using familiar program material in a familiar acoustical listening space are necessary to weed out good from bad loudspeaker designs. Even better is if the consumer or reviewer can do controlled and level matched listening tests between products to get a better frame of reference as to what type of sound seems more pleasing and real to them. The flattest most accurate speaker may not be preferred, as many listeners prefer a little boost in midbass and upper treble response. Loudspeaker designers are well aware of this and many purposely incorporate this into their designs. This is why we always stress continued listening tests over the course of days or weeks to determine if you can ultimately live with the speaker you are considering buying.
Editorial Note by Philip Bamberg on Listening
I always say that if a speaker doesn’t fully reproduce signal fed to it, then that is veiling. Conversely if a speaker plays sounds NOT sent to it, then that is distortion. Furthermore, some audiophiles conclude that the veiled speaker is pleasing because it sounds “smooth”. And conversely other audiophiles incorrectly conclude that the distorting speaker sounds “detailed”. Next, listen to that smooth speaker for an hour to be sure it doesn’t sound boring. Or listen to the detailed speaker for an hour to be sure it doesn’t fatigue you.
The proof is often found with simple tests of your speakers. Play a continuously increasing sine sweep at a moderate to high level. Listen for improper sounds, such as buzzes, hot frequencies, or harmonic distortion at certain tones. (Also be certain that the bad sound isn’t simply the picture frame rattling on the shelf. And hot spot tones or distortion can typically come from room reflections which may make you incorrectly conclude that the speaker is bad.)
Finally, do not try to listen to speaker deficiencies at high playback levels. It is too much of a strain on your hearing. Distortion testing of speakers is a non-trivial endeavor because it requires a controlled environment with an echo-free and quiet background, and proper test equipment and methods. Comparisons can only be made under absolutely identical test conditions, and it is generally impractical to compare one tester’s results with another’s.
So Why Do We Measure?
There are several reasons why we measure product performance, which are listed below:
- To verify the product is free from obvious defects (either by design or damage during shipping)
- To attempt to correlate our subjective listening impressions with objective measurements
- To legitimize our subjective listening comments
- To validate manufacturer’s performance claims
- To provide a way for consumers to make apples-to-apples comparisons between competing products
- To find areas of product performance that can be improved via future designs
- Because we can
More often than not we can find a manufacturing defect in a product with just a measurement or two. A quick impedance sweep of a loudspeaker will reveal crossover design issues. A quick power sweep will validate a manufacturer’s power claims and so on. We typically conduct our listening tests prior to doing product measurements to help minimize any unintentional biases introduced by interpreting measurement results. However, it’s nice to be able to back up a subjective claim like “the bass this subwoofer produced was earthshaking” when you can show it was able to cleanly hit above 100dB at 20Hz via a 2 meter groundplane test. It’s even more of a bonus when our measurements can validate or even exceed a manufacturer’s performance claims. In my opinion, the best marketing tool a manufacturer can use is to quote our performance tests exceeding their own published results. It instills consumer confidence for that product and brand. It also makes it easier for us to gush over the product in our reviews.
Despite our best efforts of measuring product performance, it’s not always easy to make apples-to-apples comparisons. We do our best to test similar products similarly as is the case with our amplifier and subwoofer measurements. This task is much more difficult for loudspeaker measurements, as previously stated.
One of the most rewarding aspects of measuring product performance is when we encounter receptive manufacturers willing to consult us to improve their product performance rather than to dispute the validity of our measurements. We often work with them after the review on beta testing projects which involve more detailed measurements and analysis and consultation on our part in hopes the feedback helps them produce better future generation products. We’ve seen great results over the years and it partly explains why some of our closest partners are continually improving the performance and value of their products.
We Do It Because We Can!
Lastly, we measure because we can. We measure product performance because as engineers, we rely on a set of data to ensure the product is functioning as expected. Audio is as much a science as it is an art. We measure to complete the overall review process. Subjective data (listening tests) plus objective data (measurements) gives a much more comprehensive overview of a product under review. Consumer feedback in our forums from actual users of the product completes the picture.
After Audioholics covers a product and its readers weigh in with their opinions in our forums, you can rest assured the product performance is pretty well covered. You’ll know what to expect should you plan on buying it for your own usage. When all is said and done, the ultimate test is when you actually take delivery of the product and try it out in your own system and listening environment. We hope you will agree with our assessment or at least comment in our forums why you don’t. Nobody has all of the answers, but collectively we can at least approach an educated assessment to help other Audioholics reach their goal of sonic nirvana while separating the average from the exceptional performing audio equipment.