INTERIM SUMMARY
Because observing behavior continuously is demanding and probably not feasible in applied settings, many researchers utilize some version of interval recording to acquire a representative sample of the target behavior. Although automated recording devices are sometimes available, human observers are still relied on frequently to collect data in such settings. Humans can be fallible as observers, but through appropriate training their reliability and validity in collecting data can be increased substantially.
INTEROBSERVER AGREEMENT
Notice that in the previous section we referred to observers, indicating that more than one person may be observing and recording the relevant behavior in a study. The rationale behind this practice is fairly intuitive. Because many studies in applied settings may not allow for automated recording and measurement, human observers will be doing the business of data collection. As we have seen, humans are not infallible observers, even when well trained prior to the study. As a means of controlling for error and maintaining the integrity of the observational and measurement process, applied studies frequently employ two (and, on rare occasions, more than two) observers to collect data. Typically, a study employs one primary observer and one secondary observer. The primary observer usually observes during each and every session of a study. The secondary observer usually observes, concurrently but independently of the primary observer, during some subset of the sessions of a study; 20% to 33% of the total number of study sessions is a suggested standard in single-case research studies that use observation recording systems. Data collected by the secondary observer serve essentially as a check against the data collected by the primary observer and permit the researcher to calculate indices of observational reliability or interobserver agreement.

We cannot be lulled into thinking, though, that just because we have two individuals observing behavior the measures they provide will be accurate and worthy of supporting whatever conclusions they may prompt. This means some effort must be made to ensure the reliability of the observations made by our human observers. Interobserver agreement represents an objective method of calculating the amount of correspondence between observers' reports, thus indicating a measure of reliability for the collected data. Because human observers are the major source of data in many applied research projects, a considerable dialogue has emerged concerning interobserver agreement and how best to assess it. The issue is a complex one, and no consensus yet exists. We can nevertheless present
some of the more frequently used methods for evaluating interobserver agreement and come to appreciate some of the thorny issues that surround this topic.
Interobserver Agreement for Frequency/Rate Measures
One of the simplest ways to evaluate interobserver agreement (also known in the literature as interobserver reliability and interrater reliability) is by calculating percentage agreement among observers. This essentially amounts to counting all observations made by the observers and identifying the percentage of observations on which the observers agreed. For instance, say that two observers count, during a 12-hour period, how frequently a resident of a nursing home initiates conversations with other residents. The primary observer records 11 such occurrences, and the secondary observer reports having observed 13 occurrences. By simply dividing the smaller number by the larger, and then multiplying by 100%, we obtain a percentage agreement, in this case 11/13 × 100% = 85%. Although this agreement would seem to be pretty high, the number does not tell us whether the two observers agreed or disagreed on specific instances of the behavior.
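To make the arithmetic explicit, here is a minimal sketch in Python (our own illustration; the function name is invented for this example) of the total-count agreement index just described:

    def frequency_agreement(count_primary, count_secondary):
        """Total-count interobserver agreement: smaller count / larger count x 100.
        Note that this index says nothing about whether the observers agreed
        on specific instances of the behavior, only on the overall totals."""
        if count_primary == 0 and count_secondary == 0:
            return 100.0  # both observers report zero occurrences
        smaller, larger = sorted([count_primary, count_secondary])
        return smaller / larger * 100

    # Example from the text: 11 vs. 13 recorded conversation initiations.
    print(round(frequency_agreement(11, 13)))  # prints 85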
Interobserver Agreement for Duration and Latency Measures
Recall that duration represents the amount of time that elapses between the onset and offset of the target behavior. Suppose two separate observers collect measures of exercise duration for a person recovering from a stroke. Prior to calculating agreement, the researcher must ensure that the observers are collecting data for exactly the same time period. It makes no sense to check the reliability of the observers' data if this is not the case. Suppose that for the designated observation period, one observer reports a duration of 37 minutes and 0 seconds, and the second observer reports a duration of 40 minutes and 0 seconds. To calculate agreement, we simply divide the smaller number by the larger number and multiply by 100%. In this case, interobserver agreement would be reported as 92.5% (37 minutes/40 minutes × 100% = 92.5%). Calculating agreement for latency, which is the amount of time elapsing between a stimulus and the target behavior, follows the same formula. The shorter time period is divided by the longer time period and multiplied by 100% to render a percentage agreement index.
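The same smaller-over-larger formula applies to duration and latency alike, provided both observers timed the same observation period. A brief sketch (again our own, with invented names), working in seconds to avoid mixed units:

    def timed_agreement(primary_seconds, secondary_seconds):
        """Interobserver agreement for duration or latency measures:
        shorter time / longer time x 100."""
        shorter, longer = sorted([primary_seconds, secondary_seconds])
        return shorter / longer * 100

    # Example from the text: 37 min 0 s versus 40 min 0 s of observed exercise.
    print(timed_agreement(37 * 60, 40 * 60))  # prints 92.5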
Interobserver Agreement for Interval Recording and Time Sampling
Calculating agreement for data obtained through interval recording and time sampling methods is a bit more involved because we are counting not simply how often the behavior occurred or how long it lasted but rather the percentage of intervals in
which our observers agree that the behavior did or did not occur. Figure 3.2 depicts a typical scoring sheet showing the recorded observations of two observers for a sequence of 15 successive intervals. Remember, the length of the interval or how often the behavior occurred during each interval does not matter. A check is placed in the box to indicate that, for that particular interval, the behavior was observed to occur.

Given two observers and a dichotomous recording decision (behavior occurred
or did not occur), four outcomes are possible for each interval. Both observers could record the occurrence of the behavior, in which case there will be a check in both of the corresponding interval boxes (see interval boxes 1–5, 8, and 12–14 in Figure 3.2). This first outcome represents an agreement between the primary and the secondary observer. It also is possible, of course, that both observers record a nonoccurrence, in which case both boxes will be left blank or marked with an "X" (see interval boxes 6, 9, and 11). This outcome also represents an agreement. In addition, the primary observer could record an occurrence while the secondary observer records a nonoccurrence (a disagreement). Finally, the primary observer could record a nonoccurrence while the secondary observer records an occurrence (also a disagreement; see interval boxes 7, 10, and 15 for examples of disagreements). We can quantify interobserver agreement for these data by calculating the overall percentage of agreement across intervals.

Two of the preceding interobserver outcomes are relatively straightforward. If
both observers record an instance of behavior for the same interval, we say that agreement has occurred. If one observer records an instance of behavior and the other does not, we score this as a disagreement. The more complicated outcome is one for which neither observer scores an instance of behavior (interval boxes 6, 9, and 11). Although you might be tempted to view this as an agreement, there are conflicting views on this matter. If the behavior being observed is very subtle or tends to occur at very low rates, or if the interval is very brief, then occurrences of the behavior could go unnoticed by both observers. For this reason, our confidence in agreements of nonoccurrence is weaker than our confidence in agreements of occurrence. Consequently, some authorities suggest that researchers should report three indices of interobserver agreement: (1) occurrences only, when behavior is observed in 30% or fewer of the intervals; (2) nonoccurrences only, when behavior is scored in 70% or more of the intervals; and (3) all boxes, that is, occurrences and nonoccurrences combined (Cooper, Heron, & Heward, 2007). Each of these measures of interobserver agreement can be reported in published studies, and readers can interpret the reported levels of agreement and make their own decisions about the reliability of the measurement system. Let us take a look at how the data in Figure 3.2 would be calculated as interobserver agreement percentages using each of these three methods.
Occurrences Only. This method of calculating agreement uses only those interval boxes in which at least one, but possibly both, of the observers (primary and/or secondary) placed a check, indicating an occurrence of the behavior. This means that boxes 6, 9, and 11 are omitted from the calculation because neither observer recorded an occurrence. Thus, the calculation for occurrences only becomes the following: agreements on occurrence / (agreements on occurrence + disagreements on occurrence) × 100% = % agreement on occurrences only, or 9/12 × 100% = 75% agreement on occurrences only.
Nonoccurrences Only. There are six interval boxes in Figure 3.2 in which a nonoccurrence was recorded by one or both of the observers: boxes 6, 7, 9, 10, 11, and 15. This means that the calculation of interobserver agreement can utilize only these six boxes. Of these six intervals, both primary and secondary observers recorded nonoccurrences in three boxes (6, 9, and 11), and these represent agreements. Calculation of interobserver agreement for nonoccurrences only reduces to agreements on nonoccurrence / (agreements on nonoccurrence + disagreements on nonoccurrence) × 100% = % agreement on nonoccurrences only, or 3/6 × 100% = 50% agreement on nonoccurrences only.
Occurrences and Nonoccurrences. This measure of interobserver agreement utilizes all interval boxes, regardless of outcome (recorded occurrences or nonoccurrences). For the data in Figure 3.2, the 15 interval boxes result in the following calculation of interobserver agreement: agreements on occurrences and nonoccurrences / (agreements + disagreements on occurrences and nonoccurrences) × 100% = % agreement on occurrences and nonoccurrences, or 12/15 × 100% = 80%.
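The three indices can also be computed mechanically from the two observers' interval records. The sketch below is our own illustration (not the authors' procedure), using the hypothetical data from Figure 3.2, and it reproduces the 75%, 50%, and 80% values just calculated:

    # Interval records from Figure 3.2: True = occurrence, False = nonoccurrence.
    primary   = [True, True, True, True, True, False, True, True, False,
                 False, False, True, True, True, True]
    secondary = [True, True, True, True, True, False, False, True, False,
                 True, False, True, True, True, False]

    def interval_agreement(primary, secondary, mode="all"):
        """Percentage agreement across intervals.
        mode="occurrence":    only intervals in which at least one observer scored an occurrence
        mode="nonoccurrence": only intervals in which at least one observer scored a nonoccurrence
        mode="all":           every interval"""
        if mode == "occurrence":
            pairs = [(p, s) for p, s in zip(primary, secondary) if p or s]
        elif mode == "nonoccurrence":
            pairs = [(p, s) for p, s in zip(primary, secondary) if (not p) or (not s)]
        else:
            pairs = list(zip(primary, secondary))
        agreements = sum(1 for p, s in pairs if p == s)
        return agreements / len(pairs) * 100

    print(interval_agreement(primary, secondary, "occurrence"))     # prints 75.0
    print(interval_agreement(primary, secondary, "nonoccurrence"))  # prints 50.0
    print(interval_agreement(primary, secondary, "all"))            # prints 80.0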
Clearly, the method of calculating interobserver agreement has significant implications, not only for these hypothetical data but for actual data reported in the literature. In the current example, calculations ranged from 50% to 80% agreement, a disparity that seems to justify recommendations that researchers report both calculated agreement percentages and their methods of calculation. This practice allows readers of the study to interpret the reported levels of agreement and make their own decisions about the reliability of the measurement system.

Perhaps the bigger question to be addressed, however, is just what makes for
acceptable levels of interobserver agreement when human observers provide the primary data for a study. This is not an easy question to answer because there is no magical level of interobserver agreement that defines a good study, although clearly researchers strive for high levels of reliability. Generally speaking, sources in the literature suggest that 80% constitutes a general guideline for minimally acceptable interobserver agreement, although even this level of reliability may be
insufficient when the behavior of interest is well defined and commonly studied (Miller, 1997). Other factors, such as the observational setting, the complexity of the operational definition, and whether multiple behaviors are being observed simultaneously, may influence the level of agreement that would be acceptable in a specific study. Interobserver agreement levels that fall below acceptable standards suggest that it might be a good idea to revisit the operational definition or to retrain the observers.

It is not unusual for interobserver agreement to fluctuate during the course of a
study, particularly studies that entail long periods or phases of observation and data collection. The tendency for an observer to provide inaccurate measures of behavior over time is referred to as observer drift. This phenomenon may be due to fatigue, boredom, or simply forgetting to apply, or haphazardly applying, the operational definition. Whatever the cause, observer drift needs to be attended to in order to maintain measurement integrity. This can be done by retraining observers on the operational definition and conducting periodic reliability checks on observations. Observers who are aware that their observations will be monitored tend to be less susceptible to observer drift and provide more reliable data (Kent, O'Leary, Dietz, & Diament, 1979).
Reactivity
The social nature of human behavior has consequences as well for subjects whose behavior is being observed for research purposes. Although it is sometimes possible to observe behavior unobtrusively, without the subject or client being aware of the observation, this is frequently not the case in behavioral and health science research.
Figure 3.2   Calculating interobserver agreement for interval recording

Successive intervals:   1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Primary observer:       √  √  √  √  √  X  √  √  X  X  X  √  √  √  √
Secondary observer:     √  √  √  √  √  X  X  √  X  √  X  √  √  √  X

√ = target behavior was observed to occur in the interval (occurrence)
X = target behavior was not observed to occur in the interval (nonoccurrence)
More often, subjects and clients are fully aware not only that their behavior is being observed but also of the purposes for which it is being observed. To understand why this may be important, all you have to do is imagine how your own behavior differs depending on whether you think someone is or is not watching you. A case in point is the encounter with the uncooperative vending machine. We have all had the frustrating experience of putting our money into a machine, making our selection, and standing helplessly by as the machine fails to deliver our requested food or drink. This is, of course, where things get interesting. People can become marvelously creative problem solvers under this condition, and their behavior often runs the gamut from pressing every button on the machine to reaching their arms as far up into the machine as they can to attempting to coax their selection out of the machine by slapping, rocking, and otherwise assaulting the mechanical monster. Naturally, the degree of frustration actually vented on the machine is largely determined by whether there is an audience. Vandalism, after all, is an act punishable by law.

When people behave in a different-from-usual manner in response to being
observed by others, we say that reactivity has occurred. The problem with reactivity is that the behaviors observed may represent not the person's usual pattern of behavior but rather an idiosyncratic response to being observed. Any parent who has ever visited an elementary classroom to observe his or her child understands the impact that his or her presence has, at least initially, on the children. A strange adult in the classroom is a very conspicuous change in the classroom environment, and some disruption of the daily routine is to be expected. Fortunately, reactivity tends to diminish with repeated observation (Haynes & Horn, 1982). Eventually, the children will habituate, or get used to, the parent's being present and will settle down into their daily schedule. For the most part, reactivity is problematic only during the initial portion of an observation period. The threat that reactivity poses is that our measures of behavior will paint an artificial picture of the target behavior. For this reason, it is often a good idea to allow initial observation to continue long enough to establish whether reactivity has occurred. If it has, continuing observation and measurement will allow the behavior to return to its natural state, and this is essential if the data being collected are to be used in drawing meaningful inferences.
INTERIM SUMMARY
Before collecting the primary data for a study, researchers must assess the reliability of the behavioral measures reported by human observers. Interobserver agreement, reported as a percentage measure, can be calculated for various dimensions of
behavior, including response frequency/rate, duration, latency, and occurrence or nonoccurrence. Establishing sufficient levels of interobserver agreement should occur during training of observers and before formal data collection begins for the study. Finally, individuals being observed may behave in an atypical manner in the beginning stages of observation, a phenomenon known as reactivity. Reactivity is usually short lived, and researchers should allow for its effects to diminish before beginning data collection. In the next chapter, we will begin considering the unique strategies used to both monitor and graphically present the repeated measures data that are so instrumental to single-case research methodology.
KEY TERMS GLOSSARY
Observation The act of hearing, seeing, feeling, or otherwise coming into contact with some stimulus in the environment.
Operational definition Definition of a research variable based on operations used to measure the variable.
Response rate A dimension of behavior consisting of number of responses occurring during a period of time.
Duration A dimension of behavior consisting of time from onset to offset of the behavior.
Latency A dimension of behavior consisting of time between an antecedent stimulus and the onset of a behavior.
Intensity A dimension of behavior consisting of magnitude or strength.
Environmental by-products A dimension of behavior consisting of a lasting or permanent impact on the environment produced by the behavior.
Interval recording A method of observation in which a specified time period is subdivided into shorter blocks and observation occurs systematically for each block.
Time sampling An interval recording strategy in which observation and nonobservation intervals are alternated over a prolonged period of time.
Reliability The degree to which a measurement strategy or tool renders consistent and accurate measures of a variable.
Interobserver agreement An objective method of calculating the amount of correspondence between observers’ reports.