
11.2 comparing data displayed in box plots answers

27/11/2021 Client: muhammad11 Deadline: 2 Day

LEARNING OBJECTIVES When you have completed this chapter, you will be able to:

LO4-1 Construct and interpret a dot plot.

LO4-2 Construct and describe a stem-and-leaf display.

LO4-3 Identify and compute measures of position.

LO4-4 Construct and analyze a box plot.

LO4-5 Compute and interpret the coefficient of skewness.

LO4-6 Create and interpret a scatter diagram.

LO4-7 Develop and explain a contingency table.

MCGIVERN JEWELERS recently posted an advertisement on a social media site reporting the shape, size, price, and cut grade for 33 of its diamonds in stock. Develop a box plot of the variable price and comment on the result. (See Exercise 37 and LO4-4.)

CHAPTER 4 Describing Data: DISPLAYING AND EXPLORING DATA


INTRODUCTION

Chapter 2 began our study of descriptive statistics. In order to transform raw or ungrouped data into a meaningful form, we organize the data into a frequency distribution. We present the frequency distribution in graphic form as a histogram or a frequency polygon. This allows us to visualize where the data tend to cluster, the largest and the smallest values, and the general shape of the data.

In Chapter 3, we first computed several measures of location, such as the mean, median, and mode. These measures of location allow us to report a typical value in the set of observations. We also computed several measures of dispersion, such as the range, variance, and standard deviation. These measures of dispersion allow us to describe the variation or the spread in a set of observations.

We continue our study of descriptive statistics in this chapter. We study (1) dot plots, (2) stem-and-leaf displays, (3) percentiles, and (4) box plots. These charts and statistics give us additional insight into where the values are concentrated as well as the general shape of the data. Then we consider bivariate data. In bivariate data, we observe two variables for each individual or observation. Examples include the number of hours a student studied and the points earned on an examination; if a sampled product meets quality specifications and the shift on which it is manufactured; or the amount of electricity used in a month by a homeowner and the mean daily high temperature in the region for the month. These charts and graphs provide useful insights as we use business analytics to enhance our understanding of data.

DOT PLOTS

Recall for the Applewood Auto Group data, we summarized the profit earned on the 180 vehicles sold with a frequency distribution using eight classes. When we organized the data into the eight classes, we lost the exact value of the observations. A dot plot, on the other hand, groups the data as little as possible, and we do not lose the identity of an individual observation. To develop a dot plot, we display a dot for each observation along a horizontal number line indicating the possible values of the data. If there are identical observations or the observations are too close to be shown individually, the dots are “piled” on top of each other. This allows us to see the shape of the distribution, the value about which the data tend to cluster, and the largest and smallest observations. Dot plots are most useful for smaller data sets, whereas histograms tend to be most useful for large data sets. An example will show how to construct and interpret dot plots.

E X A M P L E

The service departments at Tionesta Ford Lincoln and Sheffield Motors Inc., two of the four Applewood Auto Group dealerships, were both open 24 days last month. Listed below is the number of vehicles serviced last month at the two dealerships. Construct dot plots and report summary statistics to compare the two dealerships.

Tionesta Ford Lincoln

Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
  23       33         27          28        39        26
  30       32         28          33        35        32
  29       25         36          31        32        27
  35       32         35          37        36        30

96 CHAPTER 4

Sheffield Motors Inc.

Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
  31       35         44          36        34        37
  30       37         43          31        40        31
  32       44         36          34        43        36
  26       38         37          30        42        33

S O L U T I O N

The Minitab system provides a dot plot and outputs the mean, median, maximum, and minimum values, and the standard deviation for the number of cars serviced at each dealership over the last 24 working days.

The dot plots, shown in the center of the output, graphically illustrate the distributions for each dealership. The plots show the difference in the location and dispersion of the observations. By looking at the dot plots, we can see that the number of vehicles serviced at the Sheffield dealership is more widely dispersed and has a larger mean than at the Tionesta dealership. Several other features of the number of vehicles serviced are:

• Tionesta serviced the fewest cars in any day, 23.
• Sheffield serviced 26 cars during its slowest day, which is 4 cars fewer than the next lowest day.
• Tionesta serviced exactly 32 cars on four different days.
• The numbers of cars serviced cluster around 36 for Sheffield and 32 for Tionesta.

From the descriptive statistics, we see Sheffield serviced a mean of 35.83 vehicles per day. Tionesta serviced a mean of 31.292 vehicles per day during the same period. So Sheffield typically services 4.54 more vehicles per day. There is also more dispersion, or variation, in the daily number of vehicles serviced at Sheffield than at Tionesta. How do we know this? The standard deviation is larger at Sheffield (4.96 vehicles per day) than at Tionesta (4.112 vehicles per day).
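The summary statistics quoted above can be reproduced without Minitab. The following is a minimal Python sketch, not the text's method: the function names `summarize` and `text_dot_plot` are illustrative assumptions, and the dot plot is drawn with one row per value rather than along a horizontal axis.

```python
from collections import Counter
from statistics import mean, median, stdev

# Vehicles serviced per day over 24 working days (from the example above).
tionesta = [23, 33, 27, 28, 39, 26, 30, 32, 28, 33, 35, 32,
            29, 25, 36, 31, 32, 27, 35, 32, 35, 37, 36, 30]
sheffield = [31, 35, 44, 36, 34, 37, 30, 37, 43, 31, 40, 31,
             32, 44, 36, 34, 43, 36, 26, 38, 37, 30, 42, 33]


def summarize(values):
    """Return the summary statistics reported in the example."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "stdev": stdev(values),  # sample standard deviation (n - 1 divisor)
    }


def text_dot_plot(values):
    """Draw one row per possible value, with one dot per observation."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} " + "•" * counts[value])
    return "\n".join(rows)


if __name__ == "__main__":
    for name, data in (("Tionesta", tionesta), ("Sheffield", sheffield)):
        print(name, summarize(data))
        print(text_dot_plot(data))
```

Piled dots in a row correspond to repeated observations, such as the four days on which Tionesta serviced exactly 32 cars.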

STEM-AND-LEAF DISPLAYS

In Chapter 2, we showed how to organize data into a frequency distribution so we could summarize the raw data into a meaningful form. The major advantage to organizing the data into a frequency distribution is we get a quick visual picture of the shape of the distribution without doing any further calculation. To put it another way, we can see where the data are concentrated and also determine whether there are any extremely large or small values. There are two disadvantages, however, to organizing the data into a frequency distribution: (1) we lose the exact identity of each value and (2) we are not sure how the values within each class are distributed. To explain, the Theater of the Republic in Erie, Pennsylvania, books live theater and musical performances. The theater’s capacity is 160 seats. Last year, among the forty-five performances, there were eight different plays and twelve different bands. The following frequency distribution shows that eighty up to ninety people attended two of the forty-five performances; there were seven performances where ninety up to one hundred people attended. However, is the attendance within this class clustered about 90, spread evenly throughout the class, or clustered near 99? We cannot tell.

Attendance        Frequency
 80 up to  90         2
 90 up to 100         7
100 up to 110         6
110 up to 120         9
120 up to 130         8
130 up to 140         7
140 up to 150         3
150 up to 160         3
Total                45

One technique used to display quantitative information in a condensed form and provide more information than the frequency distribution is the stem-and-leaf display. An advantage of the stem-and-leaf display over a frequency distribution is we do not lose the identity of each observation. In the above example, we would not know the identity of the values in the 90 up to 100 class. To illustrate the construction of a stem-and-leaf display using the number of people attending each performance, suppose the seven observations in the 90 up to 100 class are 96, 94, 93, 94, 95, 96, and 97. The stem value is the leading digit or digits, in this case 9. The leaves are the trailing digits. The stem is placed to the left of a vertical line and the leaf values to the right.

The values in the 90 up to 100 class would appear as follows:

9 ∣ 6 4 3 4 5 6 7

It is also customary to sort the values within each stem from smallest to largest. Thus, the second row of the stem-and-leaf display would appear as follows:

9 ∣ 3 4 4 5 6 6 7

With the stem-and-leaf display, we can quickly observe that 94 people attended two performances and the number attending ranged from 93 to 97. A stem-and-leaf display is similar to a frequency distribution with more information, that is, the identity of the observations is preserved.

STEM-AND-LEAF DISPLAY A statistical technique to present a set of data. Each numerical value is divided into two parts. The leading digit(s) becomes the stem and the trailing digit the leaf. The stems are located along the vertical axis, and the leaf values are stacked against each other along the horizontal axis.


The following example explains the details of developing a stem-and-leaf display.

E X A M P L E

Listed in Table 4–1 is the number of people attending each of the 45 performances at the Theater of the Republic last year. Organize the data into a stem-and-leaf display. Around what values does attendance tend to cluster? What is the smallest attendance? The largest attendance?

S O L U T I O N

From the data in Table 4–1, we note that the smallest attendance is 88. So we will make the first stem value 8. The largest attendance is 156, so we will have the stem values begin at 8 and continue to 15. The first number in Table 4–1 is 96, which has a stem value of 9 and a leaf value of 6. Moving across the top row, the second value is 93 and the third is 88. After the first 3 data values are considered, the chart is as follows.

Stem   Leaf
  8 ∣ 8
  9 ∣ 6 3
 10 ∣
 11 ∣
 12 ∣
 13 ∣
 14 ∣
 15 ∣

Organizing all the data, the stem-and-leaf chart looks as follows.

Stem   Leaf
  8 ∣ 8 9
  9 ∣ 6 3 5 6 4 4 7
 10 ∣ 8 7 3 4 6 3
 11 ∣ 7 3 2 7 2 1 9 8 3
 12 ∣ 7 5 7 0 5 5 0 4
 13 ∣ 9 5 2 9 4 6 8
 14 ∣ 8 2 3
 15 ∣ 6 5 5

The usual procedure is to sort the leaf values from the smallest to largest. The last line, the row referring to the values in the 150s, would appear as:

15 ∣ 5 5 6

TABLE 4–1 Number of People Attending Each of the 45 Performances at the Theater of the Republic

 96   93   88  117  127   95  113   96  108
 94  148  156  139  142   94  107  125  155
155  103  112  127  117  120  112  135  132
111  125  104  106  139  134  119   97   89
118  136  125  143  120  103  113  124  138


The final table would appear as follows, where we have sorted all of the leaf values.

Stem   Leaf
  8 ∣ 8 9
  9 ∣ 3 4 4 5 6 6 7
 10 ∣ 3 3 4 6 7 8
 11 ∣ 1 2 2 3 3 7 7 8 9
 12 ∣ 0 0 4 5 5 5 7 7
 13 ∣ 2 4 5 6 8 9 9
 14 ∣ 2 3 8
 15 ∣ 5 5 6

You can draw several conclusions from the stem-and-leaf display. First, the minimum number of people attending is 88 and the maximum is 156. There were two performances with fewer than 90 people attending, and three performances with 150 or more. You can observe, for example, that for the three performances with more than 150 people attending, the actual attendances were 155, 155, and 156. The concentration of attendance is between 110 and 130. There were nine performances with attendance between 110 and 119 and eight performances between 120 and 129. We can also tell that within the 120 to 129 group the actual attendances were spread evenly throughout the class. That is, 120 people attended two performances, 124 people attended one performance, 125 people attended three performances, and 127 people attended two performances.
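The construction steps in this example (split each value into a stem and a leaf, group by stem, then sort the leaves) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `stem_and_leaf` and `format_display` are assumptions, and the code assumes whole-number data with a single trailing digit as the leaf.

```python
from collections import defaultdict

# Attendance at the 45 performances (Table 4-1).
attendance = [
    96, 93, 88, 117, 127, 95, 113, 96, 108,
    94, 148, 156, 139, 142, 94, 107, 125, 155,
    155, 103, 112, 127, 117, 120, 112, 135, 132,
    111, 125, 104, 106, 139, 134, 119, 97, 89,
    118, 136, 125, 143, 120, 103, 113, 124, 138,
]


def stem_and_leaf(values, stem_unit=10):
    """Map each stem to its sorted leaves; empty stems are kept."""
    rows = defaultdict(list)
    for value in sorted(values):          # sorting first keeps leaves ordered
        stem, leaf = divmod(value, stem_unit)
        rows[stem].append(leaf)
    stems = range(min(rows), max(rows) + 1)
    return {stem: rows.get(stem, []) for stem in stems}


def format_display(display):
    """Render the display with the stem left of a vertical bar."""
    return "\n".join(
        f"{stem:>2} | " + " ".join(str(leaf) for leaf in leaves)
        for stem, leaves in display.items()
    )


if __name__ == "__main__":
    print(format_display(stem_and_leaf(attendance)))
```

Counting the leaves on each stem recovers the frequency distribution shown earlier, which is why the display is described as a frequency distribution with more information.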

We also can generate this information on the Minitab software system. We have named the variable Attendance. The Minitab output is below. You can find the Minitab commands that will produce this output in Appendix C.

The Minitab solution provides some additional information regarding cumulative totals. In the column to the left of the stem values are numbers such as 2, 9, 15, and so on. The number 9 indicates there are 9 observations that have occurred before the value of 100. The number 15 indicates that 15 observations have occurred prior to 110. About halfway down the column the number 9 appears in parentheses. The parentheses indicate that the middle value or median appears in that row and there are nine values in this group. In this case, we describe the middle value as the value below which half of the observations occur. There are a total of 45 observations, so the middle value, if the data were arranged from smallest to largest, would be the 23rd observation; its value is 118. After the median, the values begin to decline. These values represent the “more than” cumulative totals. There are 21 observations of 120 or more, 13 of 130 or more, and so on.
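The cumulative column described above can be mimicked with a short calculation. The sketch below is an assumption about how such a column could be computed, not Minitab's actual algorithm: it accumulates counts from the top until it reaches the row holding the median, shows that row's own count in parentheses, and reports "more than" totals for the rows after it. The function name `depth_column` is hypothetical.

```python
# Number of leaves on each stem of the sorted attendance display (stems 8 to 15).
leaf_counts = {8: 2, 9: 7, 10: 6, 11: 9, 12: 8, 13: 7, 14: 3, 15: 3}


def depth_column(counts_by_stem):
    """Return one cumulative label per stem, in stem order."""
    counts = list(counts_by_stem.values())
    n = sum(counts)
    median_position = (n + 1) / 2          # 23 when n = 45
    labels = []
    running = 0                            # observations counted so far
    for count in counts:
        below = running
        running += count
        if below < median_position <= running:
            labels.append(f"({count})")    # row containing the median
        elif running < median_position:
            labels.append(str(running))    # cumulative from the smallest value
        else:
            labels.append(str(n - below))  # cumulative from the largest value
    return labels


if __name__ == "__main__":
    for stem, label in zip(leaf_counts, depth_column(leaf_counts)):
        print(f"{label:>4}  {stem:>2}")
```

For the attendance data this yields 2, 9, 15, (9), 21, 13, 6, 3, which agrees with the values quoted in the paragraph above.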

Which is the better choice, a dot plot or a stem-and-leaf chart? This is really a matter of personal choice and convenience. For presenting data, especially with a large number of observations, you will find dot plots are more frequently used. You will see dot plots in analytical literature, marketing reports, and occasionally in annual reports. If you are doing a quick analysis for yourself, stem-and-leaf tallies are handy and easy, particularly on a smaller set of data.

S E L F - R E V I E W 4–1

1. The number of employees at each of the 142 Home Depot stores in the Southeast region is shown in the following dot plot.

[Dot plot: Number of employees, horizontal axis from 80 to 104]

(a) What are the maximum and minimum numbers of employees per store?
(b) How many stores employ 91 people?
(c) Around what values does the number of employees per store tend to cluster?

2. The rate of return for 21 stocks is:

8.3 9.6 9.5 9.1 8.8 11.2 7.7 10.1 9.9 10.8 10.2 8.0 8.4 8.1 11.6 9.6 8.8 8.0 10.4 9.8 9.2

Organize this information into a stem-and-leaf display.
(a) How many rates are less than 9.0?
(b) List the rates in the 10.0 up to 11.0 category.
(c) What is the median?
(d) What are the maximum and the minimum rates of return?

E X E R C I S E S

1. Describe the differences between a histogram and a dot plot. When might a dot plot be better than a histogram?

2. Describe the differences between a histogram and a stem-and-leaf display.

3. Consider the following chart.

[Chart: horizontal axis labeled 1 through 7]

a. What is this chart called?
b. How many observations are in the study?
c. What are the maximum and the minimum values?
d. Around what values do the observations tend to cluster?

4. The following chart reports the number of cell phones sold at a big-box retail store for the last 26 days.

[Chart: number of cell phones sold per day]

a. What are the maximum and the minimum numbers of cell phones sold in a day?
b. What is a typical number of cell phones sold?

5. The first row of a stem-and-leaf chart appears as follows: 62 | 1 3 3 7 9. Assume whole number values.

a. What is the “possible range” of the values in this row?
b. How many data values are in this row?
c. List the actual values in this row of data.

6. The third row of a stem-and-leaf chart appears as follows: 21 | 0 1 3 5 7 9. Assume whole number values.

a. What is the “possible range” of the values in this row?
b. How many data values are in this row?
c. List the actual values in this row of data.

7. The following stem-and-leaf chart shows the number of units produced per day in a factory.

Stem   Leaf
  3 | 8
  4 |
  5 | 6
  6 | 0133559
  7 | 0236778
  8 | 59
  9 | 00156
 10 | 36

a. How many days were studied?
b. How many observations are in the first class?
c. What are the minimum value and the maximum value?
d. List the actual values in the fourth row.
e. List the actual values in the second row.
f. How many values are less than 70?
g. How many values are 80 or more?
h. What is the median?
i. How many values are between 60 and 89, inclusive?

8. The following stem-and-leaf chart reports the number of prescriptions filled per day at the pharmacy on the corner of Fourth and Main Streets.

Stem   Leaf
 12 | 689
 13 | 123
 14 | 6889
 15 | 589
 16 | 35
 17 | 24568
 18 | 268
 19 | 13456
 20 | 034679
 21 | 2239
 22 | 789
 23 | 00179
 24 | 8
 25 | 13
 26 |
 27 | 0

a. How many days were studied?
b. How many observations are in the last class?
c. What are the maximum and the minimum values in the entire set of data?
d. List the actual values in the fourth row.
e. List the actual values in the next to the last row.
f. On how many days were fewer than 160 prescriptions filled?
g. On how many days were 220 or more prescriptions filled?
h. What is the middle value?
i. How many days did the number of filled prescriptions range between 170 and 210?

9. A survey of the number of phone calls made by a sample of 16 Verizon subscribers last week revealed the following information. Develop a stem-and-leaf chart. How many calls did a typical subscriber make? What were the maximum and the minimum number of calls made?

52 43 30 38 30 42 12 46 39 37 34 46 32 18 41 5

10. Aloha Banking Co. is studying ATM use in suburban Honolulu. Yesterday, for a sample of 30 ATMs, the bank counted the number of times each machine was used. The data are presented in the table. Develop a stem-and-leaf chart to summarize the data. What were the typical, minimum, and maximum number of times each ATM was used?

83 64 84 76 84 54 75 59 70 61 63 80 84 73 68 52 65 90 52 77 95 36 78 61 59 84 95 47 87 60

MEASURES OF POSITION

The standard deviation is the most widely used measure of dispersion. However, there are other ways of describing the variation or spread in a set of data. One method is to determine the location of values that divide a set of observations into equal parts. These measures include quartiles, deciles, and percentiles.

Quartiles divide a set of observations into four equal parts. To explain further, think of any set of values arranged from the minimum to the maximum. In Chapter 3, we called the middle value of a set of data arranged from the minimum to the maximum the median. That is, 50% of the observations are larger than the median and 50% are smaller. The median is a measure of location because it pinpoints the center of the data. In a similar fashion, quartiles divide a set of observations into four equal parts. The first quartile, usually labeled Q1, is the value below which 25% of the observations occur, and the third quartile, usually labeled Q3, is the value below which 75% of the observations occur.

Similarly, deciles divide a set of observations into 10 equal parts and percentiles into 100 equal parts. So if you found that your GPA was in the 8th decile at your university, you could conclude that 80% of the students had a GPA lower than yours and 20% had a higher GPA. If your GPA was in the 92nd percentile, then 92% of students had a GPA less than your GPA and only 8% of students had a GPA greater than your GPA. Percentile scores are frequently used to report results on such national standardized tests as the SAT, ACT, GMAT (used to judge entry into many master of business administration programs), and LSAT (used to judge entry into law school).

Quartiles, Deciles, and Percentiles

To formalize the computational procedure, let Lp refer to the location of a desired percentile. So if we want to find the 92nd percentile we would use L92, and if we wanted the median, the 50th percentile, then L50. For a number of observations, n, the location of the Pth percentile can be found using the formula:

LOCATION OF A PERCENTILE
\[ L_p = (n + 1)\frac{P}{100} \] [4–1]

An example will help to explain further.

E X A M P L E

Morgan Stanley is an investment company with offices located throughout the United States. Listed below are the commissions earned last month by a sample of 15 brokers at the Morgan Stanley office in Oakland, California.

$2,038 $1,758 $1,721 $1,637 $2,097 $2,047 $2,205 $1,787 $2,287 1,940 2,311 2,054 2,406 1,471 1,460

Locate the median, the first quartile, and the third quartile for the commissions earned.

S O L U T I O N

The first step is to sort the data from the smallest commission to the largest.

$1,460 $1,471 $1,637 $1,721 $1,758 $1,787 $1,940 $2,038 2,047 2,054 2,097 2,205 2,287 2,311 2,406

The median value is the observation in the center and is the same as the 50th percentile, so P equals 50. So the median or L50 is located at (n + 1)(50/100), where n is the number of observations. In this case, that is position number 8, found by (15 + 1)(50/100). The eighth commission in the ordered array is $2,038. So we conclude this is the median and that half the brokers earned commissions more than $2,038 and half earned less than $2,038. The result using formula (4–1) to find the median is the same as the method presented in Chapter 3.

Recall the definition of a quartile. Quartiles divide a set of observations into four equal parts. Hence 25% of the observations will be less than the first quartile. Seventy-five percent of the observations will be less than the third quartile. To locate the first quartile, we use formula (4–1), where n = 15 and P = 25:

\[ L_{25} = (n + 1)\frac{P}{100} = (15 + 1)\frac{25}{100} = 4 \]

and to locate the third quartile, n = 15 and P = 75:

\[ L_{75} = (n + 1)\frac{P}{100} = (15 + 1)\frac{75}{100} = 12 \]

Therefore, the first and third quartile values are located at positions 4 and 12, respectively. The fourth value in the ordered array is $1,721 and the twelfth is $2,205. These are the first and third quartiles.

In the above example, the location formula yielded a whole number. That is, we wanted to find the first quartile and there were 15 observations, so the location formula indicated we should find the fourth ordered value. What if there were 20 observations in the sample, that is n = 20, and we wanted to locate the first quartile? From the location formula (4–1):

\[ L_{25} = (n + 1)\frac{P}{100} = (20 + 1)\frac{25}{100} = 5.25 \]

We would locate the fifth value in the ordered array and then move .25 of the distance between the fifth and sixth values and report that as the first quartile. Like the median, the quartile does not need to be one of the actual values in the data set.

To explain further, suppose a data set contained the six values 91, 75, 61, 101, 43, and 104. We want to locate the first quartile. We order the values from the minimum to the maximum: 43, 61, 75, 91, 101, and 104. The first quartile is located at

\[ L_{25} = (n + 1)\frac{P}{100} = (6 + 1)\frac{25}{100} = 1.75 \]

The position formula tells us that the first quartile is located between the first and the second values and it is .75 of the distance between the first and the second values. The first value is 43 and the second is 61. So the distance between these two values is 18. To locate the first quartile, we need to move .75 of the distance between the first and second values, so .75(18) = 13.5. To complete the procedure, we add 13.5 to the first value, 43, and report that the first quartile is 56.5.

We can extend the idea to include both deciles and percentiles. To locate the 23rd percentile in a sample of 80 observations, we would look for the 18.63 position.

\[ L_{23} = (n + 1)\frac{P}{100} = (80 + 1)\frac{23}{100} = 18.63 \]


To find the value corresponding to the 23rd percentile, we would locate the 18th value and the 19th value and determine the distance between the two values. Next, we would multiply this difference by 0.63 and add the result to the smaller value. The result would be the 23rd percentile.
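The location-and-interpolate procedure of formula (4–1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `percentile_location` and `percentile_value` are hypothetical, and locations that fall before the first or after the last ordered value are simply clamped to the minimum or maximum, a choice the text does not discuss.

```python
# Commissions earned by the 15 Morgan Stanley brokers (from the example).
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]


def percentile_location(n, p):
    """Formula (4-1): location of the Pth percentile among n ordered values."""
    return (n + 1) * p / 100


def percentile_value(data, p):
    """Interpolate between neighboring ordered values at the location."""
    ordered = sorted(data)
    n = len(ordered)
    location = percentile_location(n, p)
    whole = int(location)            # position of the lower neighbor (1-based)
    fraction = location - whole      # share of the distance to the next value
    if whole < 1:                    # assumption: clamp to the minimum
        return ordered[0]
    if whole >= n:                   # assumption: clamp to the maximum
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


if __name__ == "__main__":
    print(percentile_value(commissions, 25))   # first quartile
    print(percentile_value(commissions, 50))   # median
    print(percentile_value(commissions, 75))   # third quartile
    print(percentile_value([91, 75, 61, 101, 43, 104], 25))
```

With the six-value data set, the location 1.75 leads to 43 + 0.75(61 − 43) = 56.5, matching the hand calculation in the text.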

Statistical software is very helpful when describing and summarizing data. Excel, Minitab, and MegaStat, a statistical analysis Excel add-in, all provide summary statistics that include quartiles. For example, the Minitab summary of the Morgan Stanley commission data, shown below, includes the first and third quartiles, and other statistics. Based on the reported quartiles, 25% of the commissions earned were less than $1,721 and 75% were less than $2,205. These are the same values we calculated using formula (4–1).

There are ways other than formula (4–1) to locate quartile values. For example, another method uses 0.25n + 0.75 to locate the position of the first quartile and 0.75n + 0.25 to locate the position of the third quartile. We will call this the Excel Method. In the Morgan Stanley data, this method would place the first quartile at position 4.5 (.25 × 15 + .75) and the third quartile at position 11.5 (.75 × 15 + .25). The first quartile would be interpolated at 0.5, or one-half the distance, between the fourth- and the fifth-ranked values. Based on this method, the first quartile is $1,739.50, found by ($1,721 + 0.5[$1,758 − $1,721]). The third quartile, at position 11.5, would be $2,151, or one-half the distance between the eleventh- and the twelfth-ranked values, found by ($2,097 + 0.5[$2,205 − $2,097]). Excel, as shown in the Morgan Stanley and Applewood examples, can compute quartiles using either of the two methods. Please note the text uses formula (4–1) to calculate quartiles.

Is the difference between the two methods important? No. Usually it is just a nuisance. In general, both methods calculate values that will support the statement that approximately 25% of the values are less than the value of the first quartile, and approximately 75% of the data values are less than the value of the third quartile. When the sample is large, the difference in the results from the two methods is small. For example, in the Applewood Auto Group data there are 180 vehicles. The quartiles computed using both methods are shown to the left. Based on the variable profit, 45 of the 180 values (25%) are less than both values of the first quartile, and 135 of the 180 values (75%) are less than both values of the third quartile.

Morgan Stanley Commissions (Excel output)

Method              Quartile 1   Quartile 3
Equation 4-1          1721         2205
Alternate Method      1739.5       2151

Applewood Profit (Excel output)

Method              Quartile 1   Quartile 3
Equation 4-1          1415.5       2275.5
Alternate Method      1422.5       2268.5

STATISTICS IN ACTION

John W. Tukey (1915–2000) received a PhD in mathematics from Princeton in 1939. However, when he joined the Fire Control Research Office during World War II, his interest in abstract mathematics shifted to applied statistics. He developed effective numerical and graphical methods for studying patterns in data. Among the graphics he developed are the stem-and-leaf diagram and the box-and-whisker plot or box plot. From 1960 to 1980, Tukey headed the statistical division of NBC’s election night vote projection team. He became renowned in 1960 for preventing an early call of victory for Richard Nixon in the presidential election won by John F. Kennedy.

When using Excel, be careful to understand the method used to calculate quartiles. Excel 2013 and Excel 2016 offer both methods. The Excel function, Quartile.exc, will result in the same answer as Equation 4–1. The Excel function, Quartile.inc, will result in the Excel Method answers.
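The two quartile methods can be compared side by side in Python. The sketch below implements the alternate positions 0.25n + 0.75 and 0.75n + 0.25 by hand; the helper names are illustrative assumptions. As a cross-check it also calls `statistics.quantiles` from the Python standard library, whose `exclusive` method uses the (n + 1) positions of formula (4–1) and whose `inclusive` method uses positions equivalent to the alternate method.

```python
from statistics import quantiles

# Morgan Stanley commissions, 15 brokers (from the example).
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]


def alternate_quartile_positions(n):
    """Positions of Q1 and Q3 under the alternate (Excel Method) rule."""
    return 0.25 * n + 0.75, 0.75 * n + 0.25


def value_at_position(ordered, position):
    """Interpolate between the ordered values that bracket a 1-based position."""
    whole = int(position)
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return float(lower)
    return lower + fraction * (ordered[whole] - lower)


def alternate_quartiles(data):
    """Return (Q1, Q3) using the alternate positions."""
    ordered = sorted(data)
    q1_pos, q3_pos = alternate_quartile_positions(len(ordered))
    return value_at_position(ordered, q1_pos), value_at_position(ordered, q3_pos)


if __name__ == "__main__":
    print(alternate_quartiles(commissions))
    print(quantiles(commissions, n=4, method="exclusive"))
    print(quantiles(commissions, n=4, method="inclusive"))
```

Both approaches agree on the median of these data; only the first and third quartiles differ, and the gap narrows as the sample grows.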

S E L F - R E V I E W 4–2

The Quality Control department of Plainsville Peanut Company is responsible for checking the weight of the 8-ounce jar of peanut butter. The weights of a sample of nine jars produced last hour are:

7.69 7.72 7.80 7.86 7.90 7.94 7.97 8.06 8.09

(a) What is the median weight?
(b) Determine the weights corresponding to the first and third quartiles.

E X E R C I S E S

11. Determine the median and the first and third quartiles in the following data.

46 47 49 49 51 53 54 54 55 55 59

12. Determine the median and the first and third quartiles in the following data.

5.24 6.02 6.67 7.30 7.59 7.99 8.03 8.35 8.81 9.45 9.61 10.37 10.39 11.86 12.22 12.71 13.07 13.59 13.89 15.42

13. The Thomas Supply Company Inc. is a distributor of gas-powered generators. As with any business, the length of time customers take to pay their invoices is important. Listed below, arranged from smallest to largest, is the time, in days, for a sample of The Thomas Supply Company Inc. invoices.

13 13 13 20 26 27 31 34 34 34 35 35 36 37 38 41 41 41 45 47 47 47 50 51 53 54 56 62 67 82

a. Determine the first and third quartiles.
b. Determine the second decile and the eighth decile.
c. Determine the 67th percentile.

14. Kevin Horn is the national sales manager for National Textbooks Inc. He has a sales staff of 40 who visit college professors all over the United States. Each Saturday morning he requires his sales staff to send him a report. This report includes, among other things, the number of professors visited during the previous week. Listed below, ordered from smallest to largest, are the number of visits last week.

38 40 41 45 48 48 50 50 51 51 52 52 53 54 55 55 55 56 56 57 59 59 59 62 62 62 63 64 65 66 66 67 67 69 69 71 77 78 79 79

a. Determine the median number of calls.
b. Determine the first and third quartiles.
c. Determine the first decile and the ninth decile.
d. Determine the 33rd percentile.

BOX PLOTS

A box plot is a graphical display, based on quartiles, that helps us picture a set of data. To construct a box plot, we need only five statistics: the minimum value, Q1 (the first quartile), the median, Q3 (the third quartile), and the maximum value. An example will help to explain.

E X A M P L E

Alexander’s Pizza offers free delivery of its pizza within 15 miles. Alex, the owner, wants some information on the time it takes for delivery. How long does a typical delivery take? Within what range of times will most deliveries be completed? For a sample of 20 deliveries, he determined the following information:

Minimum value = 13 minutes

Q1 = 15 minutes

Median = 18 minutes

Q3 = 22 minutes

Maximum value = 30 minutes

Develop a box plot for the delivery times. What conclusions can you make about the delivery times?

S O L U T I O N

The first step in drawing a box plot is to create an appropriate scale along the horizontal axis. Next, we draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box we place a vertical line to represent the median (18 minutes). Finally, we extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes). These horizontal lines outside of the box are sometimes called “whiskers” because they look a bit like a cat’s whiskers.

[Box plot of delivery times in minutes: horizontal axis from 12 to 32, with the minimum value, Q1, median, Q3, and maximum value marked]

The box plot also shows the interquartile range of delivery times between Q1 and Q3. The interquartile range is 7 minutes and indicates that 50% of the deliveries are between 15 and 22 minutes.

The box plot also reveals that the distribution of delivery times is positively skewed. In Chapter 3, we defined skewness as the lack of symmetry in a set of data. How do we know this distribution is positively skewed? In this case, there are actually two pieces of information that suggest this. First, the dashed line to the right of the box from 22 minutes (Q3) to the maximum time of 30 minutes is longer than the dashed line from the left of 15 minutes (Q1) to the minimum value of 13 minutes. To put it another way, the 25% of the data larger than the third quartile is more spread out than the 25% less than the first quartile. A second indication of positive skewness is that the median is not in the center of the box. The distance from the first quartile to the median is smaller than the distance from the median to the third quartile. We know that the number of delivery times between 15 minutes and 18 minutes is the same as the number of delivery times between 18 minutes and 22 minutes.
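The two visual checks for skewness described above (compare the whisker lengths, then compare the two halves of the box) reduce to simple subtraction on the five statistics. The following Python sketch illustrates that reasoning; the function name `box_plot_shape` and the wording of the returned labels are assumptions, and the rule is only a rough reading of a box plot, not a formal test.

```python
def box_plot_shape(minimum, q1, median, q3, maximum):
    """Summarize spread and apparent skew from a five-number summary."""
    left_whisker = q1 - minimum      # spread of the lowest 25%
    right_whisker = maximum - q3     # spread of the highest 25%
    lower_box = median - q1          # spread of the second 25%
    upper_box = q3 - median          # spread of the third 25%

    if right_whisker > left_whisker and upper_box > lower_box:
        shape = "positively skewed"
    elif right_whisker < left_whisker and upper_box < lower_box:
        shape = "negatively skewed"
    else:
        shape = "no clear skew from these two checks"

    return {
        "interquartile_range": q3 - q1,
        "left_whisker": left_whisker,
        "right_whisker": right_whisker,
        "lower_box": lower_box,
        "upper_box": upper_box,
        "shape": shape,
    }


if __name__ == "__main__":
    # Alexander's Pizza delivery times, in minutes.
    print(box_plot_shape(13, 15, 18, 22, 30))
```

For the delivery times, the right whisker spans 8 minutes against 2 on the left, and the upper part of the box spans 4 minutes against 3, so both checks point the same way.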

E X A M P L E

Refer to the Applewood Auto Group data. Develop a box plot for the variable age of the buyer. What can we conclude about the distribution of the age of the buyer?

S O L U T I O N

Minitab was used to develop the following chart and summary statistics.

The median age of the purchaser is 46 years, 25% of the purchasers are less than 40 years of age, and 25% are more than 52.75 years of age. Based on the summary information and the box plot, we conclude:

• Fifty percent of the purchasers are between the ages of 40 and 52.75 years.

• The distribution of ages is fairly symmetric. There are two reasons for this conclusion. The length of the whisker above 52.75 years (Q3) is about the same length as the whisker below 40 years (Q1). Also, the area in the box between 40 years and the median of 46 years is about the same as the area between the median and 52.75.

There are three asterisks (*) above 70 years. What do they indicate? In a box plot, an asterisk identifies an outlier. An outlier is a value that is inconsistent with the rest of the data. It is defined as a value that lies more than 1.5 times the interquartile range below Q1 or above Q3. In this example, an outlier would be a value larger than 71.875 years, found by:

Outlier > Q3 + 1.5(Q3 − Q1) = 52.75 + 1.5(52.75 − 40) = 71.875

An outlier would also be a value less than 20.875 years.

Outlier < Q1 − 1.5(Q3 − Q1) = 40 − 1.5(52.75 − 40) = 20.875
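The two outlier limits follow the same rule, so they can be computed together. This is a minimal sketch under the 1.5 × interquartile range rule stated above; the function names `outlier_limits` and `find_outliers` and the list of ages are hypothetical, chosen only to illustrate the calculation.

```python
def outlier_limits(q1, q3, multiplier=1.5):
    """Return the (lower, upper) limits beyond which a value is an outlier."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall below the lower or above the upper limit."""
    lower, upper = outlier_limits(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Quartiles for the age of the buyer: Q1 = 40, Q3 = 52.75.
lower, upper = outlier_limits(40, 52.75)
print(lower, upper)

# Hypothetical ages for illustration only, not the Applewood data set.
ages = [21, 40, 46, 52, 72, 72, 73]
print(find_outliers(ages, 40, 52.75))
```

The limits reproduce the 20.875 and 71.875 years computed above, and in the illustrative list only the ages 72, 72, and 73 fall outside them.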

From the box plot, we conclude there are three purchasers 72 years of age or older and none less than 21 years of age. Technical note: In some cases, a single asterisk may represent more than one observation because of the limitations of the software and space available. It is a good idea to check the actual data. In this instance, there are three purchasers 72 years old or older; two are 72 and one is 73.

S E L F - R E V I E W 4–3

The following box plot shows the assets in millions of dollars for credit unions in Seattle, Washington.

[Box plot of credit union assets: horizontal axis from 0 to 100 in steps of 10.]

What are the smallest and largest values, the first and third quartiles, and the median? Would you agree that the distribution is symmetrical? Are there any outliers?

E X E R C I S E S

15. The box plot below shows the amount spent for books and supplies per year by students at four-year public colleges.

[Box plot: horizontal axis from $0 to $1,750 in steps of $350.]

a. Estimate the median amount spent.
b. Estimate the first and third quartiles for the amount spent.
c. Estimate the interquartile range for the amount spent.
d. Beyond what point is a value considered an outlier?
e. Identify any outliers and estimate their value.
f. Is the distribution symmetrical or positively or negatively skewed?

16. The box plot shows the undergraduate in-state tuition per credit hour at four-year public colleges.

[Box plot: horizontal axis from $0 to $1,500 in steps of $300; one asterisk (*) is plotted.]

a. Estimate the median.
b. Estimate the first and third quartiles.
c. Determine the interquartile range.
d. Beyond what point is a value considered an outlier?
e. Identify any outliers and estimate their value.
f. Is the distribution symmetrical or positively or negatively skewed?

17. In a study of the gasoline mileage of model year 2016 automobiles, the mean miles per gallon was 27.5 and the median was 26.8. The smallest value in the study was 12.70 miles per gallon, and the largest was 50.20. The first and third quartiles were 17.95 and 35.45 miles per gallon, respectively. Develop a box plot and comment on the distribution. Is it a symmetric distribution?

SKEWNESS In Chapter 3, we described measures of central location for a distribution of data by reporting the mean, median, and mode. We also described measures that show the amount of spread or variation in a distribution, such as the range and the standard deviation.

Another characteristic of a distribution is the shape. There are four shapes commonly observed: symmetric, positively skewed, negatively skewed, and bimodal. In a symmetric distribution the mean and median are equal and the data values are evenly spread around these values. The shape of the distribution below the mean and median is a mirror image of the distribution above the mean and median. A distribution of values is skewed to the right or positively skewed if there is a single peak, but the values extend much farther to the right of the peak than to the left of the peak. In this case, the mean is larger than the median. In a negatively skewed distribution there is a single peak, but the observations extend farther to the left, in the negative direction, than to the right. In a negatively skewed distribution, the mean is smaller than the median. Positively skewed distributions are more common. Salaries often follow this pattern. Think of the salaries of those employed in a small company of about 100 people. The president and a few top executives would have very large salaries relative to the other workers, and hence the distribution of salaries would exhibit positive skewness. A bimodal distribution will have two or more peaks. This is often the case when the values are from two or more populations. This information is summarized in Chart 4–1.

LO4-5 Compute and interpret the coefficient of skewness.

[Chart 4–1 shows four frequency polygons, each with frequency on the vertical axis: Symmetric (ages in years; mean and median both at 45); Positively Skewed (monthly salaries; median $3,000, mean $4,000); Negatively Skewed (test scores; mean 75, median 80); Bimodal (outside diameter in inches; .98 and 1.04 marked on the axis, with the mean indicated).]

CHART 4–1 Shapes of Frequency Polygons

There are several formulas in the statistical literature used to calculate skewness. The simplest, developed by Professor Karl Pearson (1857–1936), is based on the difference between the mean and the median.
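The formula itself is not reproduced in this excerpt. The form usually attributed to Pearson is sk = 3(mean − median)/s, where s is the sample standard deviation, and the sketch below assumes that form. The function name `pearson_skewness` and both data lists are hypothetical, made up only to show the sign of the result.

```python
from statistics import mean, median, stdev


def pearson_skewness(values):
    """Assumed form of Pearson's coefficient: 3 * (mean - median) / s."""
    return 3 * (mean(values) - median(values)) / stdev(values)


# Hypothetical data for illustration only.
symmetric = [1, 2, 3, 4, 5]        # mean equals median
right_skewed = [1, 2, 2, 3, 12]    # one large value pulls the mean above the median

print(pearson_skewness(symmetric))
print(pearson_skewness(right_skewed))
```

A result of zero corresponds to a symmetric set of values, a positive result to a mean larger than the median (positive skewness), and a negative result to a mean smaller than the median (negative skewness).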

18. A sample of 28 time shares in the Orlando, Florida, area revealed the following daily charges for a one-bedroom suite. For convenience, the data are ordered from smallest to largest. Construct a box plot to represent the data. Comment on the distribution. Be sure to identify the first and third quartiles and the median.
