The Notes: What are within-subject experimental designs?

Introduction

In an experiment, a particular comparison is produced while other factors are held constant. For example, to investigate the effects of music on reading comprehension, an experimenter might compare the effects of music versus no music on the comprehension of a chapter from a history textbook. The comparison that is produced—music versus no music—is called the independent variable. An independent variable must have at least two levels or values so that a comparison can be made. The behavior that is observed or measured is called the dependent variable, which would be some measure of reading comprehension in the example.

Presumably, any changes in reading comprehension during the experiment depend on changes in the levels of the independent variable. The intent of an experiment is to hold everything constant except the changes in the levels of the independent variable. If this is done, the experimenter can assume that changes in the dependent variable were caused by changes in the levels of the independent variable.

Role of Independent Variable

Experimental design concerns the way in which the levels of the independent variable are assigned to experimental subjects. This is a crucial concern, because the experimenter wants to make sure that it is the independent variable and not something else that causes changes in behavior. Between-subject designs are plans in which different participants receive the levels of the independent variable. Therefore, in terms of the example already mentioned, some people would read with music playing and other people would read without music. Within-subject designs are plans in which each participant receives each level of the manipulated variable. In a within-subject design, each person would read a history chapter both while music is playing and in silence. Each of these designs has unwanted features that make it difficult to decide whether the independent variable caused changes in the dependent variable.

Because different subjects receive each level of the independent variable in a between-subject design, the levels of the independent variable vary with the subjects in each condition. Any effect observed in the experiment could result from either the independent variable or the characteristics of the subjects in a particular condition. For example, the people who read while music is playing might simply be better readers than those who read in silence. This difference between the people in the two groups would make it difficult to determine whether music or reading ability caused changes in comprehension. When something other than the independent variable could cause the results of an experiment, the results are confounded. In between-subject designs, the potential effects of the independent variable are confounded with the different subjects in each condition. Instead of the independent variable, individual differences, such as intelligence or reading ability, could account for differences between groups. This confounding (the variation of other variables with the independent variable of interest, as a result of which any effects cannot be attributed with certainty to the independent variable) may be minimized by assigning participants to conditions randomly or by matching the different subjects in some way, but these tactics do not eliminate the potential confounding. For this and other reasons, many experimenters prefer to use within-subject designs.
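As a rough illustration of the random assignment mentioned above, the following sketch shuffles a list of participants and deals them into a music group and a silence group. The participant labels, the group names, and the function name `randomly_assign` are hypothetical and are not taken from any study described here.

```python
import random

def randomly_assign(participants, conditions=("music", "silence"), seed=None):
    """Shuffle participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)  # a seed is used only to make the demo repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

participants = [f"P{i}" for i in range(1, 21)]  # 20 hypothetical participants
groups = randomly_assign(participants, seed=1)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Shuffling makes it unlikely that the better readers all land in one group, but in any single sample the groups can still differ by chance, which is why random assignment reduces the confounding rather than removing it.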

Because each subject receives each level of the independent variable in within-subject designs, subjects are not confounded with the independent variable. In the example experiment, this means that both good and bad readers would read with and without music. Yet the order in which a subject receives the levels of the independent variable is confounded with the levels of the independent variable. Therefore, determining whether a change in the dependent variable occurred because of the independent variable or as a result of the timing of the administration of the treatment might be difficult. This kind of confounding is called a carryover effect. The effects of one value of the independent variable might carry over to the period when the next level is being tested. Just as likely, an unwanted carryover effect could result because the subject’s behavior changes as the experiment progresses. The subject might become better at the task because of practicing it or worse because of boredom or fatigue. Whatever the source of the carryover effects, they represent serious potential confounding.

Counterbalancing

Carryover effects can be minimized by counterbalancing. Counterbalancing means that the order of administering the conditions of an experiment is systematically varied. Consider the reading experiment: One condition is reading with music (M), and the comparison level is reading in silence (S). If all subjects received S before M, order would be confounded with condition. If half the subjects had M before S and the remaining subjects had S before M, the order of treatments would not be confounded with the nature of the treatments. This is so because both treatment conditions occur first and second equally often.
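The two-condition case can be sketched in a few lines. The example below alternates the two possible orders across a hypothetical list of subjects and then counts how often each condition comes first; the subject labels and the function name `counterbalance_two` are assumptions made for illustration.

```python
def counterbalance_two(subjects, first="M", second="S"):
    """Alternate the two possible orders so each is used equally often."""
    orders = [(first, second), (second, first)]
    return {subject: orders[i % 2] for i, subject in enumerate(subjects)}

subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]  # hypothetical subject labels
assignment = counterbalance_two(subjects)
for subject, order in assignment.items():
    print(subject, "->", " then ".join(order))

# With an even number of subjects, each condition comes first for half of them.
first_counts = {"M": 0, "S": 0}
for order in assignment.values():
    first_counts[order[0]] += 1
print(first_counts)
```

In practice an experimenter would probably also randomize which subjects receive which order; the strict alternation here is only meant to make the balance easy to see.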

Complete counterbalancing is done when all possible orders of the levels of the independent variable are administered. Complete counterbalancing is easy when there are two or three levels of the independent variable (two or six orders, respectively). With four or more levels, however, complete counterbalancing becomes cumbersome because of the number of different orders of conditions that can be generated. With more than three levels, experimenters usually use a balanced Latin square to decide the order of administering conditions. In a balanced Latin square, each condition occurs equally often in each time period, and each treatment immediately precedes and immediately follows each other treatment equally often. Imagine an experiment with four levels of the independent variable, called A, B, C, and D. One might think of these as four different types of music that are being tested in the reading-comprehension example. Suppose there are four subjects, numbered 1, 2, 3, and 4. In a balanced Latin square, the following would be the orders for the four subjects: subject 1, A, B, D, C; subject 2, B, C, A, D; subject 3, C, D, B, A; subject 4, D, A, C, B. Notice that across subjects each treatment occurs once in the first, second, third, and fourth positions. Notice also that each treatment immediately precedes and immediately follows each other treatment exactly once. Although these four orders do not exhaust the possibilities for four treatments (there are a total of twenty-four), they do minimize the confounding from carryover effects.
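The four orders listed above can be generated and checked with a short program. The sketch below uses one common textbook construction for an even number of conditions (a first row following the pattern 0, 1, n-1, 2, n-2, ..., with each later row stepping every condition forward by one). Whether the article's author used this construction is an assumption; the function names `balanced_latin_square` and `check_balance` are hypothetical.

```python
from itertools import permutations

def balanced_latin_square(conditions):
    """Build a balanced Latin square for an even number of conditions.

    The first row follows the index pattern 0, 1, n-1, 2, n-2, ...; each
    later row replaces every condition with the next one in the list,
    wrapping around at the end.
    """
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    first_row = [0]
    low, high = 1, n - 1
    while len(first_row) < n:
        first_row.append(low)
        low += 1
        if len(first_row) < n:
            first_row.append(high)
            high -= 1
    return [[conditions[(idx + shift) % n] for idx in first_row]
            for shift in range(n)]

def check_balance(square):
    """True if each condition appears once per position and every ordered
    pair of conditions appears exactly once as immediate neighbors."""
    n = len(square)
    columns_ok = all(len({row[pos] for row in square}) == n for pos in range(n))
    pairs = [(row[i], row[i + 1]) for row in square for i in range(n - 1)]
    pairs_ok = len(pairs) == len(set(pairs)) == n * (n - 1)
    return columns_ok and pairs_ok

square = balanced_latin_square(["A", "B", "C", "D"])
for subject, order in enumerate(square, start=1):
    print(f"subject {subject}:", ", ".join(order))
print("balanced:", check_balance(square))
print("all possible orders:", len(list(permutations("ABCD"))))
```

This construction covers only an even number of conditions. With an odd number, a single square of this kind cannot balance the immediate sequences, and a second, mirrored square is typically added.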

Inferential Statistics and Testing Subjects

Another feature favoring within-subject designs concerns inferential statistics. Because each participant serves in all conditions in within-subject designs, variability associated with individual differences among subjects has little influence on the statistical significance of the results. This means that within-subject designs are more likely than between-subject designs to yield a statistically significant result. Experimenters are more likely to find an effect attributable to the independent variable when its levels vary within subjects rather than between them.
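A small numerical sketch may make the statistical point concrete. The scores below are invented for illustration (they are not data from any study): eight readers differ widely from one another, but each scores a few points higher in one condition. The same numbers are analyzed twice, once as paired observations and once as if they came from two separate groups, using only Python's standard library.

```python
import math
from statistics import mean, stdev

# Hypothetical comprehension scores for eight readers (not real data).
silence = [55, 62, 70, 48, 81, 66, 59, 74]
music = [58, 64, 74, 49, 84, 68, 63, 77]

def paired_t(x, y):
    """t statistic for the same subjects measured under two conditions."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def independent_t(x, y):
    """Welch t statistic treating the two lists as separate groups."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(y) - mean(x)) / se

print("within-subject (paired) t:", round(paired_t(silence, music), 2))
print("between-subject (Welch) t:", round(independent_t(silence, music), 2))
```

The mean difference is identical in both analyses, but the paired statistic comes out many times larger because the large reader-to-reader variation drops out of the difference scores. In a real between-subject study the two lists would come from different people; reusing one data set here is only a device for comparing the arithmetic.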

A final reason within-subject designs are preferred to between-subject ones is that they require fewer subjects for testing. To try to minimize the confounding effects of individual differences in between-subject designs, experimenters typically assign many subjects randomly to each condition of the experiment. Since individual differences are not a hindrance in within-subject designs, fewer subjects can be tested, and there is a corresponding savings in time and effort.

Reversal Design

Experimenters in all areas of psychology use within-subject designs. These designs are suitable whenever the independent variable is unlikely to have permanent carryover effects. They cannot be used when characteristics of the subjects themselves are the variable of interest (such as place of birth or reading ability); those variables must be varied between subjects. If permanent carryover effects are themselves of interest (such as learning to type as a function of practice), however, experimenters use within-subject plans.

Many experiments undertaken to solve practical problems use within-subject designs. These experiments are often small-n designs, which means that the number of subjects (n) is small—sometimes only one. Consider an experiment conducted by Betty M. Hart and her associates. They wanted to decrease the amount of crying exhibited by a four-year-old boy in nursery school. They observed his behavior for several days to find the baseline rate of crying episodes. During a ten-day period, the boy had between five and ten crying episodes each day that lasted at least five seconds. Hart and her associates noted that the teacher often tried to soothe the boy when he began crying. The researchers believed that this attention rewarded the crying behavior. Therefore, in the second phase of the experiment, the teacher ignored the boy’s crying unless it resulted from an injury. Within five days, the crying episodes had decreased and remained at no more than one per day for a week. To gain better evidence that it was the teacher’s attention that influenced the rate of crying, a third phase of the experiment reinstated the conditions of the baseline phase. The teacher paid attention to the boy when he whined and cried, and in a few days the level of crying was back to six or seven episodes per day.

The small-n design used by Hart and her associates is an example of a reversal design. In a reversal design, there is first a baseline phase, then a treatment phase, and finally a return to the baseline phase to make sure that it was the treatment that changed the behavior. Hart’s experiment had a fourth phase in which the teacher again ignored the boy’s crying, because the purpose of the treatment was to reduce the crying. In the fourth phase, the level of crying dropped to a negligible level.

When there is only one subject in an experiment, counterbalancing cannot be used to minimize carryover effects. Thus, the experience in the treatment phase of a reversal design might carry over into the second baseline phase. Experimenters seek an approximate return to the original behavior during the second baseline phase, but the behavior is seldom exactly as it was before the treatment period. Therefore, deciding about the effectiveness of the treatment introduced in the second phase may be difficult. This means that the reversal design is not a perfect experimental design. It has important applications in psychology, however, especially in clinical psychology, where practical results rather than strict experimental control are often very important.

Trappers Case Study

Lise Saari conducted an experiment that used a more conventional within-subject design. Saari wanted to assess the effect of payment schedule on the performance and attitudes of beaver trappers. The trappers received an hourly wage from a forest-products company while they participated in the following experiment.

Initially, trapping performance was measured under the ordinary hourly payment plan. Later, the trappers worked under two incentive plans manipulated in a within-subject design. In the continuous-reward condition, trappers received an additional dollar for each animal that was trapped. In the second condition, trappers could receive a reward of four dollars for each beaver they brought in, but only if they correctly predicted twice whether the roll of a die would yield an even or an odd number. In this variable-ratio condition, a trapper would guess both rolls correctly one time in four by chance alone. In summary, the trappers always received a one-dollar reward in the continuous-reward condition. In the variable-ratio condition, however, the payment of four dollars occurred once every four times on average. Therefore, the trappers averaged an extra dollar for each beaver in each condition.
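The equality of the two payment plans can be verified with exact arithmetic. The sketch below assumes a fair die and two independent guesses, neither of which is stated explicitly in the description above; the variable names are chosen for illustration.

```python
from fractions import Fraction

p_single_guess = Fraction(1, 2)   # even vs. odd on a fair die (assumed)
p_win = p_single_guess ** 2       # both of two independent guesses correct

continuous_bonus = Fraction(1)    # dollars per beaver, paid every time
variable_bonus = Fraction(4)      # dollars per beaver, paid only on a win

expected_continuous = continuous_bonus
expected_variable = variable_bonus * p_win

print("P(win) =", p_win)
print("expected bonus, continuous:", expected_continuous)
print("expected bonus, variable ratio:", expected_variable)
```

Under these assumptions the expected bonus per beaver is the same in both conditions, so any difference in trapping performance cannot be attributed to a difference in average pay.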

To minimize carryover effects, the order of treatments was counterbalanced as follows. The trappers were split into two groups, which alternated between the two schedules, spending a week at a time on each. This weekly alternation of experimental payment continued for the entire trapping season.

The results showed that, compared with trapping under the hourly wage alone, beaver trapping increased under both the continuous and the variable payment schemes. The increase was, however, much larger under the variable payment plan than under the continuous one. In addition, Saari found that the trappers preferred to work under the variable-ratio scheme. Since both plans yielded the same amount of extra money on average, the mode of giving the payment (continuous or variable) seems crucial.

The experiment by Saari has obvious important practical implications concerning methods of payment. Still, it is equally important that the design of the experiment was free of confounding. The counterbalancing scheme minimized the possibility of confounding the payment scheme with order. Thus, Saari could conclude that the change in attitudes and the increased trapping performance resulted from the variable payment plan, not from some confounding carryover effect.

Use in Psychology

Within-subject designs have a long history of use in psychology. The psychophysics experiments conducted by Ernst Weber and Gustav Fechner in the nineteenth century were among the first within-subject experiments in psychology. The tradition of obtaining many observations on a few subjects started by Weber and Fechner continues in modern psychophysical scaling and signal-detection experiments.

One of the most famous small-n experiments in psychology is that reported by Hermann Ebbinghaus in his book Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie (1885; Memory: A Contribution to Experimental Psychology, 1913). Ebbinghaus tested himself in a series of memory experiments. In his work on remembering nonsense syllables and poetry, he discovered many laws of retaining and forgetting. These laws are now firmly established. Numerous modern experiments with larger numbers of experimental participants and various verbal materials have yielded results confirming Ebbinghaus’s work. Among the most important findings are the shape of the curve of forgetting over time, the important role of practice in improving retention, and the benefits of distributing practice as opposed to cramming it.

B. F. Skinner pioneered the use of small-n designs for laboratory experiments on rats and pigeons in the 1930s. Skinner’s work on schedules of reinforcement is among the most frequently cited in psychology. In his work, Skinner insisted on making numerous observations of few subjects under tightly controlled conditions. His ability to control the behavior of experimental subjects and obtain reliable results in within-subject plans such as the reversal design has led to the wide acceptance of within-subject plans in laboratory and applied experimental work.

Developmental psychologists regularly use a variant of the within-subject design. This is the longitudinal design, in which repeated observations are made as the subject develops and grows older. In a typical longitudinal experiment, a child first might receive a test of problem solving when he or she is three years old. Then the test would be repeated at ages five and seven.

Cross-Sectional Plan

The longitudinal design inherently confounds age or development with period of testing, since age cannot be counterbalanced for an individual. An alternative developmental design is the cross-sectional plan. In this design, subjects of different ages are tested at the same time. Since participants of different ages have grown up in different time periods with different people, age is confounded with generation of birth in the cross-sectional design. Thus, the cross-sectional plan is between subjects and cannot control for individual differences. Although the longitudinal design confounds age with time of testing, individual differences do not confound the results. Therefore, the longitudinal design is a valuable research tool for the developmental psychologist.

Because of their control, efficiency, and statistical power, within-subject designs are popular and important in psychology. All areas of applied and basic scientific psychology rely heavily on within-subject designs, and such designs are likely to remain important in the field.

Wednesday, January 28, 2009
