Doing the Comparison

by Tom Andry last modified February 24, 2009

The first decision you'll probably make after picking the components is who should do the comparison. If you're an end user, the only opinion that matters is your own (and maybe those of a few family members). In a professional setting, you want people you know have golden ears, right?

Wrong.

The difference between a person with "golden" ears and everyone else is 90% experience (provided there is no hearing loss). There is nothing whatsoever wrong with having a "regular" person off the street in your participant group. In fact, I'd suggest that it would lend your comparison more validity. A reviewer tends to have very definite tastes, which may well bias them toward speakers/components that sound like their own. Joe Average goes in with no preconceptions.

That being said, it is probably easiest to include reviewers in a professional comparison. First of all, they already know the vocabulary. They will be able to describe what they are hearing in a way that is easy both to read and to understand. Reviewers are also used to doing these types of comparisons and can probably finish them in half the time (or less) that a regular person would need. When you are trying to do a number of comparisons in a single day, this can make a world of difference.

Listener fatigue is the idea that you become desensitized during a listening test after an extended period of time. It is real, especially at louder volumes. The way to combat it is to limit the number of comparisons your group does and to TAKE BREAKS. Give your listeners a chance to relax between comparisons; because these comparisons tend to be loud, breaks are especially important. Also, vary the volume often. Not only will this reduce fatigue, it also gives your participants a chance to experience the components at a variety of volume levels.

I've mentioned the importance of switching seats before, but let me reiterate it here. Even when you are not evaluating speakers, where one seat might be ever so slightly closer to one speaker than another, room effects are not as uniform as you might think. While you might measure a 10 dB suckout at a particular frequency in one seat, the next seat over might have a much less dramatic suckout or even a bump! Switching seats during each comparison will help balance out some of those room effects.
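If you want to plan the seat switching ahead of time, one simple option is to shift everyone over by one seat for each comparison. The sketch below is only an illustration under my own assumptions: the function name `seat_rotation` and the listener names are made up, and a plain one-seat shift is just one reasonable way to rotate, not a prescribed method.

```python
def seat_rotation(listeners, num_comparisons):
    """Return one seating order per comparison.

    Index 0 of each inner list is seat 1, index 1 is seat 2, and so on.
    Everyone moves one seat to the right for each new comparison, and the
    person in the last seat wraps around to the first.
    """
    n = len(listeners)
    schedule = []
    for round_index in range(num_comparisons):
        shift = round_index % n
        # The listener now in a given seat started `shift` seats to the left.
        schedule.append([listeners[(seat - shift) % n] for seat in range(n)])
    return schedule


# Hypothetical three-person panel doing three comparisons.
for number, seating in enumerate(seat_rotation(["Ann", "Ben", "Cal"], 3), start=1):
    print(f"Comparison {number}: {seating}")
```

With as many comparisons as there are seats, every listener ends up in every seat exactly once, which spreads any one seat's suckout or bump across the whole group.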

I've addressed sighted vs. single blind vs. double blind listening tests before, but let me sum up. In a sighted test, everyone sees everything. The listeners know which components are being compared, even if at a given moment they might not know exactly which one is playing. The single blind test is where the listeners don't know which components are playing but the facilitator does. The double blind test is where no one (listener or facilitator) knows which are playing until after the comparison. Most comparisons in audio are either sighted or single blind. Double blind usually takes equipment that most people just don't have (including us here at Audioholics).

The problems with the sighted test are obvious. If you know that your favorite speakers are playing, you're obviously going to be biased toward them. The single blind test eliminates many of these problems, but the facilitator can still affect the results, purposely or even unconsciously. There is a ton of research out there on experimenter expectancy bias and more, but I'll let those interested read about all of that on their own time. Personally, I believe that if you take a few precautions, you'll be just fine. Here is a short list of suggestions:

  • Rules for the comparison should be set beforehand so that each participant will know what to expect.
  • Comparison pairs should be randomly selected.
  • Comparison pairs should be masked/disguised as much as possible. Components not in the current comparison must also be out of view of the participants.
  • The facilitator may not speak to anyone during each comparison (participants or anyone else).
  • The facilitator must be out of sight of the participants during as much of the comparison as possible.
  • The facilitator should allow the participants to switch freely between the comparison pairs at will, as many times as necessary.
  • Once the comparison is over, notes should be collected from the participants, and any revisions should be moderated only by the facilitator.
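The random selection and masking suggestions above can be sketched in a few lines of code. This is a minimal illustration under my own assumptions, not a required procedure: it assumes every component is compared against every other one, the function names `blind_schedule` and `participant_view` are hypothetical, and so are the component names.

```python
import itertools
import random


def blind_schedule(components, seed=None):
    """Build the facilitator's private key: a random order of pairs, with a
    random choice of which component hides behind "A" and which behind "B"."""
    rng = random.Random(seed)
    # Assumption: a full round robin, so each component meets every other once.
    pairs = [list(pair) for pair in itertools.combinations(components, 2)]
    rng.shuffle(pairs)  # random order of the comparisons
    key = []
    for pair in pairs:
        rng.shuffle(pair)  # random A/B assignment within the pair
        key.append({"A": pair[0], "B": pair[1]})
    return key


def participant_view(key):
    """What the listeners get to see: numbered comparisons, no component names."""
    return [f"Comparison {number}: A vs B" for number, _ in enumerate(key, start=1)]


# Hypothetical components. Only the facilitator ever sees `key`.
key = blind_schedule(["Speaker 1", "Speaker 2", "Speaker 3"], seed=2009)
for line in participant_view(key):
    print(line)
```

Passing a `seed` lets the facilitator regenerate the same key later when writing up the results. The participants' note sheets only ever need the "A" and "B" labels.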

The problem with the single blind test is that the facilitator can sometimes give off clues to their preferences. That is why I suggest no talking (even talking to others about unrelated things since it might have an effect) and being out of sight from the participants as much as possible. The facilitator should basically be invisible.

Also, I would highly recommend doing both sighted and single blind tests (the blind test first). If nothing else, it would be interesting to see how the participants' observations change when they know what they are listening to. If their observations don't change, that too is interesting. Regardless, I would NOT reveal which components were which in the blind test until well after all the comparisons (sighted and blind) are completed. I'm a bit of a sadist in this regard, and I often don't let the participants know until they read the final report.

Finally, while I've talked about the importance of the facilitator not interacting during the comparison, it is probably more important that the participants don't interact. They need to have a completely unique and authentic experience that they can report. Also, make sure after every comparison you collect the forms so that they aren't modified after the inevitable discussion during the breaks.

 

Recent Forum Posts:

mtrycrafts posts on March 03, 2009 21:53
R-Carpenter;531996
1)That's a good idea but it brings up a question of how different size speakers will interact and affect each other's sound in free space. Baffle step compensation is designed differently by different manufacturers.
I'd go with turntable. It's not hard to manufacture. Basically it's a large Lazy Susan with carpeted base.


Apparently the turntable method at the labs is enough to avoid interference between the speakers when they are 180 degrees behind in the off mode.

R-Carpenter;531996
2)Ehh, yes if we want to be perfectly precise. How do we account for differences in off axis FR between different speakers? It will affect SPL matching unless the reviewer is in an anechoic chamber.
Personally, I'd go with pink noise within 0.5 dB. Differences in FR between the speakers are very obvious in comparison to amps, CD players and other electronics.


The point of level matching is not to make it harder to differentiate; it is to eliminate a volume difference that the listener may misconstrue as better sound. You don't EQ the whole frequency band; you match the level at perhaps one frequency, like 1 kHz. The other parameters are really what will tell one speaker apart from the other and make it stand out. That is the whole point: which is the better speaker.

R-Carpenter;531996
3)I was being a bit sarcastic about it.


missed it

R-Carpenter;531996
On another note, I think a reviewer should use real MLS measurements and not a Mickey Mouse analyzer.


What's wrong with Mickey Mouse?
mtrycrafts posts on March 03, 2009 21:39
majorloser;532165
They might be the best sounding speakers in the world.
But if they're ugly, they won't be going in my home.
So yeah, there is a bias in sighted tests. But we are talking about comparisons, not strictly testing for performance.

Besides, if you do it right there will be a non-voting person controlling the a/b switch and not telling you which pair you're listening to.


Yes, but if you can see the speakers, I am sure you will have a good probability of knowing which one is on.

The visual impact should play a role, of course, but in my estimation only after your sonic experimentation; then see which has more weight for you.

Even with that switcher person not telling you, it still matters and can bias the responses. That is why DBT is used when it matters.
In this case though...
mtrycrafts posts on March 03, 2009 21:31
theranman;529538
I've conducted a few comparo's myself, and to be honest, the whole single-blind, double-blind stuff is totally unnecessary. When a test is set up properly, you'll find that speakers differ SO MUCH that you'll laugh at the notion of DBT's. ...

Go to town and be amazed how quickly and easily you can tell differences.
....


The issue is not whether there are sonic differences between speakers but where one's bias will take them: to the one that visually impresses and affects the sensory inputs or, in fact, to the one that one really prefers due only to the sound and nothing else.

If DBT was not necessary, even with speakers, research labs would not use it. After all, that is what they used to find out what people of all backgrounds prefer: flat frequency response, wide dispersion, etc. They also demonstrated that without it, the other sensory inputs do a number on preferences.
mtrycrafts posts on March 03, 2009 21:20
But what about the differences in sensitivity and FR between speakers, is it possible to get multiple speakers within 0.1dB of each other?

You cannot, and I don't think the labs EQ the speakers so the levels are matched, but you do need to level match, and perhaps, with speakers, you need a very sensitive SPL meter that is capable of that resolution.
Come to think of it, matching the voltage may not be enough here, as it is when the speakers are the constant and the components are swapped; in that case it is imperative to use a voltmeter, since the speaker will output the same level with the same input voltage.
majorloser posts on March 03, 2009 11:31
avaserfi;532004
I made this post in another thread, but considering the discussion here I feel it is worthy of being copied:


They might be the best sounding speakers in the world.
But if they're ugly, they won't be going in my home.


So yeah, there is a bias in sighted tests. But we are talking about comparisons, not strictly testing for performance.

Besides, if you do it right there will be a non-voting person controlling the a/b switch and not telling you which pair you're listening to. The people listening just need to sit back, relax, close their eyes and listen. Once your ears pick out a particular pair you like, it will become very obvious when that pair is playing. It worked out quite well at my house last year when Gene ran the show. We came up with the top two bookshelves being the Status Acoustic Decimos and a pair of AV123 x-ls speakers. Definitely not in the same ballpark with appearance and price range. And for what it's worth, the worst performers were the SVS bookshelves.

Unfortunately the Decimos sounded so good I had to buy a pair. And at this point, I don't want to find anything that sounds better. My wallet couldn't stand it and wife would kill me.

So what's the lesson to learn from this?

DON'T KEEP EXPENSIVE GEAR THAT YOU DON'T OWN IN YOUR HOME FOR VERY LONG.
YOU MIGHT END UP WANTING TO BUY IT.