Overview of Testing Methodologies - page 2
So how does this make random error your friend? Say that you know that a specific question always gets vastly different responses based on a person's gender, ethnicity, and income. If you randomly select the people you ask, then it is reasonable to expect that roughly half of your group will be predisposed (biased) to answer one way while the other half will be predisposed to answer the other - effectively canceling out the biases. See, you've taken a systematic error and randomized it! Brilliant!
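If you'd like to see the canceling-out happen, here is a minimal sketch. Everything in it is made up for illustration: the population of 10,000, the +1/-1 "predisposition" scores, the sample size of 500, and the function names are all assumptions, not data from any real survey.

```python
import random

def average_response(sample):
    """Mean of the responses in a sample."""
    return sum(sample) / len(sample)

def make_population(size, rng):
    # Hypothetical population: half are predisposed to answer about +1,
    # the other half about -1, with a little individual noise on top.
    population = []
    for i in range(size):
        bias = 1.0 if i % 2 == 0 else -1.0
        population.append(bias + rng.gauss(0, 0.1))
    return population

rng = random.Random(42)  # fixed seed so the run is repeatable
population = make_population(10_000, rng)

# Systematic selection: only ask people who lean one way.
one_sided = [x for x in population if x > 0][:500]

# Random selection: everyone has the same chance of being asked.
random_sample = rng.sample(population, 500)

print(f"one-sided sample mean: {average_response(one_sided):+.2f}")
print(f"random sample mean:    {average_response(random_sample):+.2f}")
```

The one-sided sample lands near +1, while the random sample lands near zero: the individual biases are still there, they just pull in opposite directions and wash out.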
Reliability - More than what your girlfriend says you aren't
I briefly touched on reliability above. Why is reliability so important? When NASA engineers want to know the weight of the space shuttle so they know how much fuel to include, it's pretty darn important that their scale is reliable. If it is not, they may include too much fuel (a waste at best, a falling bomb at worst) or too little (nothing good can come of that). But what about for audio?
Say you want to know if your new speakers are better than your old speakers. You set both pairs of speakers up on a switch and fill the room with your family. You play the same material and switch between the speakers, each time asking which set of speakers the family liked better. Systematic bias would be if one set of speakers was louder than the other (studies have shown that louder equates to better for most people). Random error would be if your house was close to an interstate and, on occasion, the noise from the vehicles interfered with the test.
But, you say, only the second scenario shows an unreliable test. As long as you always set the speakers up in the same way, the test would always be biased towards the louder speakers. True, and herein lies the true evil behind the unreliable test:
An unreliable test can either make differences harder to detect (random error) or make you think that differences exist when they don't or don't exist when they do (systematic error).
That's right, true believers, a test that is fraught with systematic error will actually seem MORE reliable. You'll measure the same thing multiple times and it'll come up the same every time. Reliable! Nope, 'cause it's wrong. Give someone else the same ruler, and they'll get a different result. But it'll be the same every time. Give it to a third person and they will consistently get a result that is different from the first two. Unreliable - but it appears to be reliable. Oooooh, evil!
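The ruler story can be mocked up in a few lines. This is only a sketch with invented numbers: the true length of 100, the three people's offsets, the noise levels, and the `measure` helper are all assumptions made for the demo.

```python
import random
import statistics

TRUE_LENGTH = 100.0  # hypothetical true value, arbitrary units

def measure(offset, noise_sd, n, rng):
    """Return n readings with a fixed offset (systematic error)
    plus Gaussian noise (random error)."""
    return [TRUE_LENGTH + offset + rng.gauss(0, noise_sd) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Three people who each use the ruler in a consistently different way:
# large personal offset, almost no scatter.
biased = {name: measure(offset, 0.05, 20, rng)
          for name, offset in [("A", 2.0), ("B", -1.5), ("C", 3.0)]}

# One person with no offset but a noisy reading.
noisy = measure(0.0, 2.0, 20, rng)

for name, readings in biased.items():
    print(name, round(statistics.mean(readings), 2),
          round(statistics.stdev(readings), 2))
print("noisy", round(statistics.mean(noisy), 2),
      round(statistics.stdev(noisy), 2))
```

Each of A, B, and C gets a tiny spread, so each looks wonderfully reliable alone, yet their averages disagree with each other and with the true value. The noisy measurer has a large spread, but the readings scatter around the true value rather than around a personal offset.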
But how does random error make differences harder to detect? Well, you've all seen those statistics that come with a "plus or minus X percentage points" attached. Well, that's random error. They are telling you that random error could move the result by as much as X points in either direction. Basically, the larger the random error, the bigger that number; the bigger the number, the wider the range; and the wider the range, the more chance there is that it stretches into the "didn't detect a difference" realm, no matter what the middle of that range (the number they always report) seems to say. The fix is to increase the number of measurements taken. By increasing the number of measurements, you slowly become more and more confident that the average of your measurements closely approximates the actual true value. If an instrument were perfectly reliable, you could measure something once and be done with it (even carpenters measure twice, right?). As the reliability of the instrument decreases, the number of measurements you have to take increases in order to be confident that you are close to the true value.
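To put rough numbers on "more measurements, smaller plus-or-minus", here is a sketch using the standard textbook approximation for the margin of error of an average: z times the standard deviation divided by the square root of the number of measurements. It assumes independent measurements and a roughly normal spread; the function names, the standard deviation of 2.0, and the z value of 1.96 (the usual 95% figure) are choices made for this example.

```python
import math

def margin_of_error(std_dev, n, z=1.96):
    """Approximate 95% margin of error for the average of n independent
    measurements whose individual spread is std_dev."""
    return z * std_dev / math.sqrt(n)

def measurements_needed(std_dev, target_margin, z=1.96):
    """Smallest n whose margin of error is at most target_margin."""
    return math.ceil((z * std_dev / target_margin) ** 2)

# A noisy instrument (std_dev = 2.0) measured more and more often.
for n in (1, 4, 16, 64):
    print(n, round(margin_of_error(2.0, n), 2))

# How many readings until the plus-or-minus is down to 0.5?
print(measurements_needed(2.0, 0.5))
```

Notice the diminishing returns: quadrupling the number of measurements only halves the margin. That is why a less reliable instrument costs you so many extra measurements.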
The Double-Blind Experiment - Finally!
Ahhh…it's about time. What exactly is the double-blind method and why is it so desirable? By now, you should be able to guess that the reason it is so desirable is that it is very good at controlling bias. The double-blind experiment is simply where both the participants and the researchers do not know who is in the experimental group. Aha! Clear as mud.
An experiment (in the truest sense) tests something (usually a theory or hypothesis). In a double-blind experiment, there are two groups of subjects (participants): one group gets the treatment and the other doesn't. Generally, the subjects don't know whether they are receiving the treatment or not, but the researchers do…this is called a single-blind experiment. In the double-blind, the researchers don't know which group is which either. This methodology is most often used in pharmaceutical research, so let's use that as an example:
Example 7: StickItToDaLittleGuy Inc. is a large pharmaceutical company that has developed a new drug to treat chronic headaches. The company recruits a large number of people with chronic headaches to be participants. A bunch of pills are created and put in either a blue or red bottle. One bottle contains the actual drug while the other contains a sugar pill (called a placebo). Participants are randomly given either a red or blue bottle. They are instructed that, when they get a headache, they are to take two pills. If the pills haven't worked within 30 minutes, they are to take their normal medication. The company wants to know how many times the pills worked, how many times they didn't, and what, if any, side effects are experienced.
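The bookkeeping behind Example 7 might look something like the sketch below. It is one possible way to do it, not a description of how any real trial is run: the function names are invented, the even red/blue split is an assumption (the example only says bottles are handed out randomly), and the "sealed key" stands in for whatever third party keeps the color-to-drug mapping away from participants and researchers.

```python
import random

def assign_bottles(participant_ids, rng):
    """Randomly give each participant a red or blue bottle.
    Assumption: the two groups are kept the same size."""
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {pid: "red" for pid in ids[:half]}
    assignment.update({pid: "blue" for pid in ids[half:]})
    return assignment

def make_sealed_key(rng):
    """Decide which color holds the real drug. In a double-blind setup
    this key is held back from everyone running or taking the test."""
    drug_color = rng.choice(["red", "blue"])
    placebo_color = "blue" if drug_color == "red" else "red"
    return {drug_color: "drug", placebo_color: "placebo"}

def unblind(assignment, sealed_key):
    """Only after all results are in: translate colors to treatments."""
    return {pid: sealed_key[color] for pid, color in assignment.items()}

rng = random.Random(7)  # fixed seed so the run is repeatable
assignment = assign_bottles(range(1, 11), rng)  # ten made-up participants
sealed_key = make_sealed_key(rng)

# During the trial, researchers and participants only ever see colors.
print(assignment)

# After data collection ends, the key is opened.
print(unblind(assignment, sealed_key))
```

The point of splitting things up this way is that nothing the researchers handle during the trial reveals who is on the drug. They can record "red" and "blue" all day without being able to nudge the results.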
