CH271
Spreadsheet Exercise 2
Reference: Section 4.5C in textbook
In this exercise, we will set up a spreadsheet that compares the means of two sets of data obtained on two separate samples and determines whether the means are the same or significantly different. In other words, is the difference between the two sets of data real (that is, the means differ because of a systematic error in one or both analyses, or because the samples are not the same), or is the difference simply the result of random error? To answer this question, we will use the Student’s t test. As stated in your textbook, for two sets of data consisting of n1 and n2 measurements (with averages x1 and x2 and standard deviations s1 and s2), the value of t is calculated using the following formula (equation 4-4, p. 89):
$$t_{\text{calc}} = \frac{|x_1 - x_2|}{s_{\text{pooled}}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \qquad (1)$$
where spooled is a “pooled” standard deviation for both sets of data. The latter is calculated with the assumption that the standard deviations for the two sets of data are not significantly different. Spooled may be calculated using the formula
$$s_{\text{pooled}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \qquad (2)$$
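Equations 1 and 2 can also be checked outside of Excel. The following is a minimal Python sketch, not a replacement for the spreadsheet. The function names `pooled_std` and `t_calc` are illustrative, and the numbers are the summary statistics from Dr. Crawford's example spreadsheet, with the Sample 1 average written out unrounded (35.46625, displayed as 35.47 in the example).

```python
import math

def pooled_std(s1, n1, s2, n2):
    """Equation 2: pooled standard deviation of two data sets."""
    return math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))

def t_calc(x1, x2, s_pooled, n1, n2):
    """Equation 1: calculated t for the difference between two averages."""
    return abs(x1 - x2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

# Summary statistics from the example spreadsheet (Sample 1 and Sample 2).
x1, s1, n1 = 35.46625, 0.029246489, 8   # unrounded average of the 8 Sample 1 values
x2, s2, n2 = 35.27, 0.018257419, 7

sp = pooled_std(s1, n1, s2, n2)
t = t_calc(x1, x2, sp, n1, n2)
print(f"s_pooled = {sp:.6f}, t_calc = {t:.3f}")
```

The results should agree with the spooled and tcalc values shown in the example spreadsheet (about 0.0248 and 15.30).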
Once t has been calculated, it is compared to the critical value of t from a table (Table 4-2, p. 87) at the desired confidence level and n1 + n2 - 2 degrees of freedom. If the calculated t is less than the critical t value, there is no significant difference between the two means. If the calculated t is greater than the critical t value, there is a significant difference between the two means.
To obtain the critical value of t, tcrit, use Excel's TINV function, which returns the two-tailed value. The syntax is TINV(probability, deg_freedom). The probability is related to the confidence level by confidence level = 100*(1 - probability), so for the 95% confidence level the probability is entered as 0.05. For example, with 10 degrees of freedom at the 95% confidence level, the formula entered into Excel to obtain the critical t value is =TINV(0.05,10).
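For readers curious about what TINV returns, the sketch below approximates the same two-tailed critical value in Python using only the standard library. It is an illustration under stated assumptions, not Excel's actual algorithm: the area under the t distribution is integrated with Simpson's rule and the critical value is located by bisection. The names `t_pdf`, `two_tailed_prob`, and `t_crit` are made up for this example.

```python
import math

def t_pdf(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_tailed_prob(t, df, steps=2000):
    """P(|T| > t): one minus twice the area from 0 to t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)

def t_crit(probability, df):
    """Two-tailed critical t for the given probability, found by bisection."""
    lo, hi = 0.0, 1.0
    while two_tailed_prob(hi, df) > probability:   # widen the bracket if needed
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if two_tailed_prob(mid, df) > probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Counterpart of =TINV(0.05,10): 95% confidence level, 10 degrees of freedom.
print(round(t_crit(0.05, 10), 3))
```

For 13 degrees of freedom the same call, `t_crit(0.05, 13)`, should agree with the tcrit value in the example spreadsheet to several decimal places.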
The data for replicate measurements carried out by Carol Khemistry using two different analytical methods with similar precision on the same sample are given below:
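| Trial | Method 1A Results | Method 2A Results |
|-------|-------------------|-------------------|
| 1     | 1.97              | 1.89              |
| 2     | 1.93              | 1.88              |
| 3     | 1.97              | 1.91              |
| 4     | 1.96              | 1.92              |
| 5     | 1.98              | 1.90              |
| 6     | 1.94              | 1.88              |
| 7     | 1.99              | 1.88              |
| 8     | 1.96              | 1.89              |
| 9     | 1.93              | 1.92              |
| 10    | 1.96              | 1.91              |
| 11    | 1.95              | 1.90              |
| 12    | 1.97              | 1.89              |
| 13    | 1.94              | 1.91              |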
Set up a spreadsheet that will calculate the average for each method, the standard deviation for each set of data, the number of data points for each method, the degrees of freedom (calculated as n1 + n2 - 2), the pooled standard deviation for the two sets of data (using equation 2), the tcalc value for the two sets of data (using equation 1), and the tcrit value. The spreadsheet should also determine if the means are the same or statistically different using the IF function of Excel.
The average and standard deviation may be calculated directly by using the AVERAGE() and STDEV() functions in Excel. For example, to average the numeric values found in cells A1 through A5, use the following formula: =AVERAGE(A1:A5). A similar formula, =STDEV(A1:A5), calculates the standard deviation. Because the formula for spooled uses the variance, i.e., the square of the standard deviation of each set of data, you can use the VAR function of Excel to calculate the variance to be used in the spreadsheet equation for spooled. To calculate the variance for the numeric values found in cells A1 through A5, use the following formula: =VAR(A1:A5). The COUNT() function should be used to count the number of trials in each set of data. To count the numeric values found in cells A1 through A5, use the formula: =COUNT(A1:A5).
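As a cross-check on these Excel functions, the same quantities can be computed with Python's standard `statistics` module. This is only a sketch for comparison; the list name `sample_2` is illustrative and holds the seven Sample 2 values from Dr. Crawford's example spreadsheet.

```python
import statistics

# Sample 2 results from the example spreadsheet
sample_2 = [35.25, 35.25, 35.28, 35.28, 35.27, 35.26, 35.30]

average = statistics.mean(sample_2)       # counterpart of =AVERAGE(range)
std_dev = statistics.stdev(sample_2)      # counterpart of =STDEV(range), n - 1 denominator
variance = statistics.variance(sample_2)  # counterpart of =VAR(range), n - 1 denominator
n = len(sample_2)                         # counterpart of =COUNT(range) for all-numeric data

print(f"average = {average:.2f}, std dev = {std_dev:.9f}, variance = {variance:.9f}, N = {n}")
```

The values should match the Sample 2 column of the example spreadsheet. Note that the variance is the square of the standard deviation, which is why either VAR or the square of STDEV can be used in equation 2.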
Once a tcalc value has been calculated by Excel, you can compare it to tcrit using the IF function and print the appropriate statement: (1) “The two methods give the same results.” or (2) “The two methods give statistically different results.” The syntax for this function is IF(logical_test,value_if_true,value_if_false). Thus, to use the IF function to compare tcalc in B18 to tcrit in B19, the formula would be: =IF(B18>B19, "The two methods give statistically different results.", "The two methods give the same results.")
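The logic of the IF function is the same as an if/else branch in a programming language. As a rough illustration only, the Python function below mirrors the formula above; the name `compare_means` is made up for this example, and the test values are the tcalc and tcrit from Dr. Crawford's example spreadsheet.

```python
def compare_means(t_calc, t_crit):
    """Mirror of =IF(t_calc > t_crit, value_if_true, value_if_false)."""
    if t_calc > t_crit:
        return "The two methods give statistically different results."
    return "The two methods give the same results."

# tcalc and tcrit taken from the example spreadsheet
print(compare_means(15.29763484, 2.160368652))
```

Because the calculated t is far larger than the critical t in the example, the first statement is returned.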
Your spreadsheet should have the following properly labeled columns: Trial, Method 1A Results, Method 2A Results. The average, standard deviation, variance, and number of measurements should be calculated below each column of data. Below these values, include the following properly labeled values: degrees of freedom, spooled, tcrit, and tcalc. Finally, the IF function should indicate whether the two methods give the same or different results. An example spreadsheet is attached for your reference.
Note that all calculations are to be completed using Excel. Be sure to type your name, the date, and a title for the spreadsheet at the top of the worksheet. Include at the bottom of the page text which clearly gives the formulas used in the calculations and the columns or cells in which they are located. See the example spreadsheet given to you with the Introduction to Spreadsheets handout.
Although students can help each other with questions and problems that might arise while writing a spreadsheet, each student is to complete his or her own spreadsheet!!! Handing in the exact same spreadsheet generated with or by other students is academic dishonesty, and will negate the original purpose of doing the exercise, which is to learn how to use spreadsheets.
Hand in a paper copy of your spreadsheet, and email it as an attachment to Dr. Crawford at pcrawford@semo.edu.
Be sure to save this spreadsheet since we will be using it again in lab.
CH271
Spreadsheet Exercise 2
Comparison of Two Means Using the t-Test
Dr. Crawford's Example
| Trial              | Sample 1 Results | Sample 2 Results |
|--------------------|------------------|------------------|
| 1                  | 35.51            | 35.25            |
| 2                  | 35.46            | 35.25            |
| 3                  | 35.44            | 35.28            |
| 4                  | 35.48            | 35.28            |
| 5                  | 35.50            | 35.27            |
| 6                  | 35.43            | 35.26            |
| 7                  | 35.44            | 35.30            |
| 8                  | 35.47            |                  |
| Average            | 35.47            | 35.27            |
| Standard Deviation | 0.029246489      | 0.018257419      |
| Variance           | 0.000855357      | 0.000333333      |
| N                  | 8                | 7                |

Degrees of Freedom: 13
spooled: 0.024787559
tcrit: 2.160368652
tcalc: 15.29763484
Conclusion: The two means are significantly different.
Equations Used
Average: =AVERAGE(B8:B15)
Standard Deviation: =STDEV(B8:B15)
Variance: =VAR(B8:B15)
N: =COUNT(B8:B15)
Degrees of Freedom: =B19+C19-2
spooled: =SQRT(((B18*(B19-1))+(C18*(C19-1)))/(B19+C19-2))
tcrit: =TINV(0.05,(B19+C19-2))
tcalc: =((ABS(B16-C16))/B22)*SQRT((B19*C19)/(B19+C19))
Conclusion: =IF(B25>B24, "The two means are significantly different.", "The two means are not significantly different.")