PROBABILITY AND STATISTICAL INFERENCE Ninth Edition
Robert V. Hogg
Elliot A. Tanis
Dale L. Zimmerman
Boston Columbus Indianapolis New York San Francisco
Upper Saddle River Amsterdam Cape Town Dubai
London Madrid Milan Munich Paris Montreal Toronto
Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul
Singapore Taipei Tokyo
Editor in Chief: Deirdre Lynch Acquisitions Editor: Christopher Cummings Sponsoring Editor: Christina Lepre Assistant Editor: Sonia Ashraf Marketing Manager: Erin Lane Marketing Assistant: Kathleen DeChavez Senior Managing Editor: Karen Wernholm Senior Production Editor: Beth Houston Procurement Manager: Vincent Scelta Procurement Specialist: Carol Melville Associate Director of Design, USHE EMSS/HSC/EDU: Andrea Nix Art Director: Heather Scott Interior Designer: Tamara Newnam Cover Designer: Heather Scott Cover Image: Agsandrew/Shutterstock Full-Service Project Management: Integra Software Services Composition: Integra Software Services
Copyright © 2015, 2010, 2006 by Pearson Education, Inc. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Higher Education, Rights and Contracts Department, One Lake Street, Upper Saddle River, NJ 07458, or fax your request to 201-236-3290.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Library of Congress Cataloging-in-Publication Data Hogg, Robert V.
Probability and Statistical Inference/ Robert V. Hogg, Elliot A. Tanis, Dale Zimmerman. – 9th ed.
p. cm. ISBN 978-0-321-92327-1
1. Mathematical statistics. I. Hogg, Robert V., II. Tanis, Elliot A. III. Title. QA276.H59 2013 519.5–dc23
2011034906
10 9 8 7 6 5 4 3 2 1 EBM 17 16 15 14 13
www.pearsonhighered.com ISBN-10: 0-321-92327-8 ISBN-13: 978-0-321-92327-1
Contents
Preface v
Prologue vii
1 Probability 1
1.1 Properties of Probability 1
1.2 Methods of Enumeration 11
1.3 Conditional Probability 20
1.4 Independent Events 29
1.5 Bayes’ Theorem 35
2 Discrete Distributions 41
2.1 Random Variables of the Discrete Type 41
2.2 Mathematical Expectation 49
2.3 Special Mathematical Expectations 56
2.4 The Binomial Distribution 65
2.5 The Negative Binomial Distribution 74
2.6 The Poisson Distribution 79
3 Continuous Distributions 87
3.1 Random Variables of the Continuous Type 87
3.2 The Exponential, Gamma, and Chi-Square Distributions 95
3.3 The Normal Distribution 105
3.4* Additional Models 114
4 Bivariate Distributions 125
4.1 Bivariate Distributions of the Discrete Type 125
4.2 The Correlation Coefficient 134
4.3 Conditional Distributions 140
4.4 Bivariate Distributions of the Continuous Type 146
4.5 The Bivariate Normal Distribution 155
5 Distributions of Functions of Random Variables 163
5.1 Functions of One Random Variable 163
5.2 Transformations of Two Random Variables 171
5.3 Several Random Variables 180
5.4 The Moment-Generating Function Technique 187
5.5 Random Functions Associated with Normal Distributions 192
5.6 The Central Limit Theorem 200
5.7 Approximations for Discrete Distributions 206
5.8 Chebyshev’s Inequality and Convergence in Probability 213
5.9 Limiting Moment-Generating Functions 217
6 Point Estimation 225
6.1 Descriptive Statistics 225
6.2 Exploratory Data Analysis 238
6.3 Order Statistics 248
6.4 Maximum Likelihood Estimation 256
6.5 A Simple Regression Problem 267
6.6* Asymptotic Distributions of Maximum Likelihood Estimators 275
6.7 Sufficient Statistics 280
6.8 Bayesian Estimation 288
6.9* More Bayesian Concepts 294
7 Interval Estimation 301
7.1 Confidence Intervals for Means 301
7.2 Confidence Intervals for the Difference of Two Means 308
7.3 Confidence Intervals for Proportions 318
7.4 Sample Size 324
7.5 Distribution-Free Confidence Intervals for Percentiles 331
7.6* More Regression 338
7.7* Resampling Methods 347
8 Tests of Statistical Hypotheses 355
8.1 Tests About One Mean 355
8.2 Tests of the Equality of Two Means 365
8.3 Tests About Proportions 373
8.4 The Wilcoxon Tests 381
8.5 Power of a Statistical Test 392
8.6 Best Critical Regions 399
8.7* Likelihood Ratio Tests 406
9 More Tests 415
9.1 Chi-Square Goodness-of-Fit Tests 415
9.2 Contingency Tables 424
9.3 One-Factor Analysis of Variance 435
9.4 Two-Way Analysis of Variance 445
9.5* General Factorial and 2^k Factorial Designs 455
9.6* Tests Concerning Regression and Correlation 462
9.7* Statistical Quality Control 467
Epilogue 479
Appendices
A References 481
B Tables 483
C Answers to Odd-Numbered Exercises 509
D Review of Selected Mathematical Techniques 521
D.1 Algebra of Sets 521
D.2 Mathematical Tools for the Hypergeometric Distribution 525
D.3 Limits 528
D.4 Infinite Series 529
D.5 Integration 533
D.6 Multivariate Calculus 535
Index 541
Preface
In this Ninth Edition of Probability and Statistical Inference, Bob Hogg and Elliot Tanis are excited to add a third person to their writing team to contribute to the continued success of this text. Dale Zimmerman is the Robert V. Hogg Professor in the Department of Statistics and Actuarial Science at the University of Iowa. Dale has rewritten several parts of the text, making the terminology more consistent and contributing much to a substantial revision. The text is designed for a two-semester course, but it can be adapted for a one-semester course. A good calculus background is needed, but no previous study of probability or statistics is required.
CONTENT AND COURSE PLANNING
In this revision, the first five chapters on probability are much the same as in the eighth edition. They include the following topics: probability, conditional probability, independence, Bayes’ theorem, discrete and continuous distributions, certain mathematical expectations, bivariate distributions along with marginal and conditional distributions, correlation, functions of random variables and their distributions, including the moment-generating function technique, and the central limit theorem. While this strong probability coverage of the course is important for all students, it has been particularly helpful to actuarial students who are studying for Exam P in the Society of Actuaries’ series (or Exam 1 of the Casualty Actuarial Society).
The greatest change to this edition is in the statistical inference coverage, now Chapters 6–9. The first two of these chapters provide an excellent presentation of estimation. Chapter 6 covers point estimation, including descriptive and order statistics, maximum likelihood estimators and their distributions, sufficient statistics, and Bayesian estimation. Interval estimation is covered in Chapter 7, including the topics of confidence intervals for means and proportions, distribution-free confidence intervals for percentiles, confidence intervals for regression coefficients, and resampling methods (in particular, bootstrapping).
The last two chapters are about tests of statistical hypotheses. Chapter 8 considers terminology and standard tests on means and proportions, the Wilcoxon tests, the power of a test, best critical regions (Neyman/Pearson), and likelihood ratio tests. The topics in Chapter 9 are standard chi-square tests, analysis of variance including general factorial designs, and some procedures associated with regression, correlation, and statistical quality control.
The first semester of the course should contain most of the topics in Chapters 1–5. The second semester includes some topics omitted there and many of those in Chapters 6–9. A more basic course might omit some of the (optional) starred sections, but we believe that the order of topics will give the instructor the flexibility needed in his or her course. The usual nonparametric and Bayesian techniques are placed at appropriate places in the text rather than in separate chapters. We find that many persons like the applications associated with statistical quality control in the last section. Overall, one of the authors, Hogg, believes that the presentation (at a somewhat reduced mathematical level) is much like that given in the earlier editions of Hogg and Craig (see References).
The Prologue suggests many fields in which statistical methods can be used. In the Epilogue, the importance of understanding variation is stressed, particularly for its need in continuous quality improvement as described in the usual Six-Sigma programs. At the end of each chapter we give some interesting historical comments, which have proved to be very worthwhile in the past editions.
The answers given in this text for questions that involve the standard distributions were calculated using our probability tables which, of course, are rounded off for printing. If you use a statistical package, your answers may differ slightly from those given.
ANCILLARIES
Data sets from this textbook are available on Pearson Education’s Math & Statistics Student Resources website: http://www.pearsonhighered.com/mathstatsresources.
An Instructor’s Solutions Manual containing worked-out solutions to the even-numbered exercises in the text is available for download from Pearson Education’s Instructor Resource Center at www.pearsonhighered.com/irc. Some of the numerical exercises were solved with Maple. For additional exercises that involve simulations, a separate manual, Probability & Statistics: Explorations with MAPLE, second edition, by Zaven Karian and Elliot Tanis, is also available for download from Pearson Education’s Instructor Resource Center. Several exercises in that manual also make use of the power of Maple as a computer algebra system.
If you find any errors in this text, please send them to tanis@hope.edu so that they can be corrected in a future printing. These errata will also be posted on http://www.math.hope.edu/tanis/.
ACKNOWLEDGMENTS
We wish to thank our colleagues, students, and friends for many suggestions and for their generosity in supplying data for exercises and examples. In particular, we would like to thank the reviewers of the eighth edition who made suggestions for this edition. They are Steven T. Garren from James Madison University, Daniel C. Weiner from Boston University, and Kyle Siegrist from the University of Alabama in Huntsville. Mark Mills from Central College in Iowa also made some helpful comments. We also acknowledge the excellent suggestions from our copy editor, Kristen Cassereau Ng, and the fine work of our accuracy checkers, Kyle Siegrist and Steven Garren. We also thank the University of Iowa and Hope College for providing office space and encouragement. Finally, our families, through nine editions, have been most understanding during the preparation of all of this material. We would especially like to thank our wives, Ann, Elaine, and Bridget. We truly appreciate their patience and needed their love.
Robert V. Hogg
Elliot A. Tanis tanis@hope.edu
Dale L. Zimmerman dale-zimmerman@uiowa.edu
Prologue
The discipline of statistics deals with the collection and analysis of data. Advances in computing technology, particularly in relation to changes in science and business, have increased the need for more statistical scientists to examine the huge amount of data being collected. We know that data are not equivalent to information. Once data (hopefully of high quality) are collected, there is a strong need for statisticians to make sense of them. That is, data must be analyzed in order to provide information upon which decisions can be made. In light of this great demand, opportunities for the discipline of statistics have never been greater, and there is a special need for more bright young persons to go into statistical science.
If we think of fields in which data play a major part, the list is almost endless: accounting, actuarial science, atmospheric science, biological science, economics, educational measurement, environmental science, epidemiology, finance, genetics, manufacturing, marketing, medicine, pharmaceutical industries, psychology, sociology, sports, and on and on. Because statistics is useful in all of these areas, it really should be taught as an applied science. Nevertheless, to go very far in such an applied science, it is necessary to understand the importance of creating models for each situation under study. Now, no model is ever exactly right, but some are extremely useful as an approximation to the real situation. Most appropriate models in statistics require a certain mathematical background in probability. Accordingly, while alluding to applications in the examples and the exercises, this textbook is really about the mathematics needed for the appreciation of probabilistic models necessary for statistical inferences.
In a sense, statistical techniques are really the heart of the scientific method. Observations are made that suggest conjectures. These conjectures are tested, and data are collected and analyzed, providing information about the truth of the conjectures. Sometimes the conjectures are supported by the data, but often the conjectures need to be modified and more data must be collected to test the modifications, and so on. Clearly, in this iterative process, statistics plays a major role with its emphasis on the proper design and analysis of experiments and the resulting inferences upon which decisions can be made. Through statistics, information is provided that is relevant to taking certain actions, including improving manufactured products, providing better services, marketing new products or services, forecasting energy needs, classifying diseases better, and so on.
Statisticians recognize that there are often small errors in their inferences, and they attempt to quantify the probabilities of those mistakes and make them as small as possible. That these uncertainties even exist is due to the fact that there is variation in the data. Even though experiments are repeated under seemingly the same conditions, the results vary from trial to trial. We try to improve the quality of the data by making them as reliable as possible, but the data simply do not fall on given patterns. In light of this uncertainty, the statistician tries to determine the pattern in the best possible way, always explaining the error structures of the statistical estimates.
This is an important lesson to be learned: Variation is almost everywhere. It is the statistician’s job to understand variation. Often, as in manufacturing, the desire is to reduce variation because the products will be more consistent. In other words, car doors will fit better in the manufacturing of automobiles if the variation is decreased by making each door closer to its target values.
Many statisticians in industry have stressed the need for “statistical thinking” in everyday operations. This need is based upon three points (two of which have been mentioned in the preceding paragraph): (1) Variation exists in all processes; (2) understanding and reducing undesirable variation is a key to success; and (3) all work occurs in a system of interconnected processes. W. Edwards Deming, an esteemed statistician and quality improvement “guru,” stressed these three points, particularly the third one. He would carefully note that you could not maximize the total operation by maximizing the individual components unless they are independent of each other. However, in most instances, they are highly dependent, and persons in different departments must work together in creating the best products and services. If not, what one unit does to better itself could very well hurt others. He often cited an orchestra as an illustration of the need for the members to work together to create an outcome that is consistent and desirable.
Any student of statistics should understand the nature of variability and the necessity for creating probabilistic models of that variability. We cannot avoid making inferences and decisions in the face of this uncertainty; however, these inferences and decisions are greatly influenced by the probabilistic models selected. Some persons are better model builders than others and accordingly will make better inferences and decisions. The assumptions needed for each statistical model are carefully examined; it is hoped that thereby the reader will become a better model builder.
Finally, we must mention how modern statistical analyses have become dependent upon the computer. Statisticians and computer scientists really should work together in areas of exploratory data analysis and “data mining.” Statistical software development is critical today, for the best of it is needed in complicated data analyses. In light of this growing relationship between these two fields, it is good advice for bright students to take substantial offerings in statistics and in computer science.
Students majoring in statistics, computer science, or a joint program are in great demand in the workplace and in graduate programs. Clearly, they can earn advanced degrees in statistics or computer science or both. But, more important, they are highly desirable candidates for graduate work in other areas: actuarial science, industrial engineering, finance, marketing, accounting, management science, psychology, economics, law, sociology, medicine, health sciences, etc. So many fields have been “mathematized” that their programs are begging for majors in statistics or computer science. Often, such students become “stars” in these other areas. We truly hope that we can interest students enough that they want to study more statistics. If they do, they will find that the opportunities for very successful careers are numerous.
Chapter 1
Probability
1.1 Properties of Probability
1.2 Methods of Enumeration
1.3 Conditional Probability
1.4 Independent Events
1.5 Bayes’ Theorem
1.1 PROPERTIES OF PROBABILITY
It is usually difficult to explain to the general public what statisticians do. Many think of us as “math nerds” who seem to enjoy dealing with numbers. And there is some truth to that concept. But if we consider the bigger picture, many recognize that statisticians can be extremely helpful in many investigations.
Consider the following:
1. There is some problem or situation that needs to be considered; so statisticians are often asked to work with investigators or research scientists.
2. Suppose that some measure (or measures) is needed to help us understand the situation better. The measurement problem is often extremely difficult, and creating good measures is a valuable skill. As an illustration, in higher education, how do we measure good teaching? This is a question to which we have not found a satisfactory answer, although several measures, such as student evaluations, have been used in the past.
3. After the measuring instrument has been developed, we must collect data through observation, possibly the results of a survey or an experiment.
4. Using these data, statisticians summarize the results, often with descriptive statistics and graphical methods.
5. These summaries are then used to analyze the situation. Here it is possible that statisticians make what are called statistical inferences.
6. Finally, a report is presented, along with some recommendations that are based upon the data and the analysis of them. Frequently such a recommendation might be to perform the survey or experiment again, possibly changing some of the questions or factors involved. This is how statistics is used in what is referred to as the scientific method, because often the analysis of the data suggests other experiments. Accordingly, the scientist must consider different possibilities in his or her search for an answer and thus performs similar experiments over and over again.
The discipline of statistics deals with the collection and analysis of data. When measurements are taken, even seemingly under the same conditions, the results usually vary. Despite this variability, a statistician tries to find a pattern; yet due to the “noise,” not all of the data fit into the pattern. In the face of the variability, the statistician must still determine the best way to describe the pattern. Accordingly, statisticians know that mistakes will be made in data analysis, and they try to minimize those errors as much as possible and then give bounds on the possible errors. By considering these bounds, decision makers can decide how much confidence they want to place in the data and in their analysis of them. If the bounds are wide, perhaps more data should be collected. If, however, the bounds are narrow, the person involved in the study might want to make a decision and proceed accordingly.
Variability is a fact of life, and proper statistical methods can help us understand data collected under inherent variability. Because of this variability, many decisions have to be made that involve uncertainties. In medical research, interest may center on the effectiveness of a new vaccine for mumps; an agronomist must decide whether an increase in yield can be attributed to a new strain of wheat; a meteorologist is interested in predicting the probability of rain; the state legislature must decide whether decreasing speed limits will result in fewer accidents; the admissions officer of a college must predict the college performance of an incoming freshman; a biologist is interested in estimating the clutch size for a particular type of bird; an economist desires to estimate the unemployment rate; an environmentalist tests whether new controls have resulted in a reduction in pollution.
In reviewing the preceding (relatively short) list of possible areas of applications of statistics, the reader should recognize that good statistics is closely associated with careful thinking in many investigations. As an illustration, students should appreciate how statistics is used in the endless cycle of the scientific method. We observe nature and ask questions, we run experiments and collect data that shed light on these questions, we analyze the data and compare the results of the analysis with what we previously thought, we raise new questions, and on and on. Or if you like, statistics is clearly part of the important “plan–do–study–act” cycle: Questions are raised and investigations planned and carried out. The resulting data are studied and analyzed and then acted upon, often raising new questions.
There are many aspects of statistics. Some people get interested in the subject by collecting data and trying to make sense out of their observations. In some cases the answers are obvious and little training in statistical methods is necessary. But if a person goes very far in many investigations, he or she soon realizes that there is a need for some theory to help describe the error structure associated with the various estimates of the patterns. That is, at some point appropriate probability and mathematical models are required to make sense out of complicated data sets. Statistics and the probabilistic foundation on which statistical methods are based can provide the models to help people do this. So in this book, we are more concerned with the mathematical, rather than the applied, aspects of statistics. Still, we give enough real examples so that the reader can get a good sense of a number of important applications of statistical methods.
In the study of statistics, we consider experiments for which the outcome cannot be predicted with certainty. Such experiments are called random experiments. Although the specific outcome of a random experiment cannot be predicted with certainty before the experiment is performed, the collection of all possible outcomes is known and can be described and perhaps listed. The collection of all possible outcomes is denoted by S and is called the outcome space (also called the sample space). Given an outcome space S, let A be a part of the collection of outcomes in S; that is, A ⊂ S. Then A is called an event. When the random experiment is performed and the outcome of the experiment is in A, we say that event A has occurred.
Since, in studying probability, the words set and event are interchangeable, the reader might want to review algebra of sets. Here we remind the reader of some terminology:
• ∅ denotes the null or empty set;
• A ⊂ B means A is a subset of B;
• A ∪ B is the union of A and B;
• A ∩ B is the intersection of A and B;
• A′ is the complement of A (i.e., all elements in S that are not in A).
Some of these sets are depicted by the shaded regions in Figure 1.1-1, in which S is the interior of the rectangles. Such figures are called Venn diagrams.
Special terminology associated with events that is often used by statisticians includes the following:
1. A1, A2, . . . , Ak are mutually exclusive events means that Ai ∩ Aj = ∅, i ≠ j; that is, A1, A2, . . . , Ak are disjoint sets;
2. A1, A2, . . . , Ak are exhaustive events means that A1 ∪ A2 ∪ · · · ∪ Ak = S.
So if A1, A2, . . . , Ak are mutually exclusive and exhaustive events, we know that Ai ∩ Aj = ∅, i ≠ j, and A1 ∪ A2 ∪ · · · ∪ Ak = S.
Set operations satisfy several properties. For example, if A, B, and C are subsets of S, we have the following:
Commutative Laws
A ∪ B = B ∪ A A ∩ B = B ∩ A
Figure 1.1-1 Algebra of sets: (a) A′, (b) A ∪ B, (c) A ∩ B, (d) A ∪ B ∪ C
Associative Laws
(A ∪ B) ∪ C = A ∪ (B ∪ C) (A ∩ B) ∩ C = A ∩ (B ∩ C)
Distributive Laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
De Morgan’s Laws
(A ∪ B)′ = A′ ∩ B′
(A ∩ B)′ = A′ ∪ B′
A Venn diagram will be used to justify the first of De Morgan’s laws. In Figure 1.1-2(a), A ∪ B is represented by horizontal lines, and thus (A ∪ B)′ is the region represented by vertical lines. In Figure 1.1-2(b), A′ is indicated with horizontal lines, and B′ is indicated with vertical lines. An element belongs to A′ ∩ B′ if it belongs to both A′ and B′. Thus the crosshatched region represents A′ ∩ B′. Clearly, this crosshatched region is the same as that shaded with vertical lines in Figure 1.1-2(a).
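The Venn diagram argument can also be checked by brute force on a small outcome space. The following is a minimal sketch, not part of the original text: it assumes a hypothetical outcome space S = {1, 2, ..., 6}, and the helper names `complement`, `all_events`, and `de_morgan_holds` are introduced only for illustration. It tests both of De Morgan's laws for every pair of events in S.

```python
from itertools import combinations

# Hypothetical outcome space, chosen only for illustration.
S = frozenset(range(1, 7))

def complement(event, space=S):
    """Return the elements of the space that are not in the event."""
    return space - event

def all_events(space=S):
    """Yield every subset of the space, that is, every possible event."""
    items = sorted(space)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def de_morgan_holds(space=S):
    """Check both De Morgan laws for every pair of events in the space."""
    events = list(all_events(space))
    return all(
        complement(a | b, space) == complement(a, space) & complement(b, space)
        and complement(a & b, space) == complement(a, space) | complement(b, space)
        for a in events
        for b in events
    )

print(de_morgan_holds())
```

An exhaustive check on one finite space is of course not a proof for arbitrary sets; it only illustrates what the two laws assert.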
Figure 1.1-2 Venn diagrams illustrating De Morgan’s laws, panels (a) and (b)
We are interested in defining what is meant by the probability of event A, denoted by P(A) and often called the chance of A occurring. To help us understand what is meant by the probability of A, consider repeating the experiment a number of times, say, n times. Count the number of times that event A actually occurred throughout these n performances; this number is called the frequency of event A and is denoted by N(A). The ratio N(A)/n is called the relative frequency of event A in these n repetitions of the experiment. A relative frequency is usually very unstable for small values of n, but it tends to stabilize as n increases. This suggests that we associate with event A a number, say, p, that is equal to the number about which the relative frequency tends to stabilize. This number p can then be taken as the number that the relative frequency of event A will be near in future performances of the experiment. Thus, although we cannot predict the outcome of a random experiment with certainty, if we know p, for a large value of n, we can predict fairly accurately the relative frequency associated with event A. The number p assigned to event A is called the probability of event A and is denoted by P(A). That is, P(A) represents the proportion of outcomes of a random experiment that terminate in the event A as the number of trials of that experiment increases without bound.
The next example will help to illustrate some of the ideas just presented.
Example 1.1-1
A fair six-sided die is rolled six times. If the face numbered k is the outcome on roll k for k = 1, 2, . . . , 6, we say that a match has occurred. The experiment is called a success if at least one match occurs during the six trials. Otherwise, the experiment is called a failure. The sample space is S = {success, failure}. Let A = {success}. We would like to assign a value to P(A). Accordingly, this experiment was simulated 500 times on a computer. Figure 1.1-3 depicts the results of this simulation, and the following table summarizes a few of the results:
n      N(A)    N(A)/n
50     37      0.740
100    69      0.690
250    172     0.688
500    330     0.660
The probability of event A is not intuitively obvious, but it will be shown in Example 1.4-6 that P(A) = 1 − (1 − 1/6)^6 ≈ 0.665. This assignment is certainly supported by the simulation (although not proved by it).
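A simulation of this kind is easy to reproduce. The sketch below is our own illustration, not the program used for the table above: the function names `has_match` and `relative_frequency`, the seed, and the choice of Python's `random` module are all assumptions, so the relative frequencies it prints will not match the tabulated values exactly.

```python
import random

def has_match(rng):
    """Roll a fair die six times; report whether roll k shows face k for some k."""
    rolls = [rng.randint(1, 6) for _ in range(6)]
    return any(face == k for k, face in enumerate(rolls, start=1))

def relative_frequency(n, seed=0):
    """Relative frequency N(A)/n of at least one match in n simulated experiments."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    successes = sum(has_match(rng) for _ in range(n))
    return successes / n

# Value quoted in the text for P(A), about 0.665.
exact = 1 - (1 - 1 / 6) ** 6

for n in (50, 100, 250, 500, 20_000):
    print(n, round(relative_frequency(n), 3))
print("exact", round(exact, 3))
```

As n grows, the printed relative frequencies should settle near the exact value, which is the stabilizing behavior described before the example.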
Example 1.1-1 shows that at times intuition cannot be used to assign probabilities, although simulation can perhaps help to assign a probability empirically. The next example illustrates where intuition can help in assigning a probability to an event.
Figure 1.1-3 Fraction of experiments having at least one match (vertical axis: freq/n, from 0.4 to 1.0; horizontal axis: n, from 0 to 500)
Example 1.1-2
A disk 2 inches in diameter is thrown at random on a tiled floor, where each tile is a square with sides 4 inches in length. Let C be the event that the disk will land entirely on one tile. In order to assign a value to P(C), consider the center of the disk. In what region must the center lie to ensure that the disk lies entirely on one tile? If you draw a picture, it should be clear that the center must lie within a square having sides of length 2 and with its center coincident with the center of a tile. Since the area of this square is 4 and the area of a tile is 16, it makes sense to let P(C) = 4/16.
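The area argument can be compared with a simple simulation. The following sketch is an illustration under a stated assumption: because the floor pattern repeats, we take the center of the disk to be uniformly distributed over a single 4-by-4 tile. The names `exact_probability` and `estimate_probability` and the seed are ours.

```python
import random

TILE_SIDE = 4.0    # inches, from the example
DISK_RADIUS = 1.0  # a disk 2 inches in diameter

def exact_probability(side=TILE_SIDE, radius=DISK_RADIUS):
    """Area of the square of allowed centers divided by the area of a tile."""
    allowed_side = side - 2 * radius
    return allowed_side ** 2 / side ** 2

def estimate_probability(n, seed=0, side=TILE_SIDE, radius=DISK_RADIUS):
    """Fraction of n random throws in which the disk lies entirely on one tile."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # Center of the disk, assumed uniform over one tile.
        x = rng.uniform(0, side)
        y = rng.uniform(0, side)
        # The disk stays on the tile when the center is at least one radius from every edge.
        if radius <= x <= side - radius and radius <= y <= side - radius:
            hits += 1
    return hits / n

print(exact_probability())
print(estimate_probability(20_000))
```

The first value is the ratio 4/16 from the example; the second is a relative frequency that should be close to it for large n.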
Sometimes the nature of an experiment is such that the probability of A can be assigned easily. For example, when a state lottery randomly selects a three-digit integer, we would expect each of the 1000 possible three-digit numbers to have the same chance of being selected, namely, 1/1000. If we let A = {233, 323, 332}, then it makes sense to let P(A) = 3/1000. Or if we let B = {234, 243, 324, 342, 423, 432}, then we would let P(B) = 6/1000, the probability of the event B. Probabilities of events associated with many random experiments are perhaps not quite as obvious and straightforward as was seen in Example 1.1-1.
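When all outcomes are equally likely, a probability of this kind reduces to counting. The short sketch below is an illustration only: it assumes the 1000 outcomes are the strings 000 through 999, and the helper name `probability_of_digits` is hypothetical. It counts the outcomes that use a given collection of digits in any order.

```python
from fractions import Fraction

# All 1000 equally likely three-digit outcomes, written with leading zeros (an assumption).
outcomes = [f"{n:03d}" for n in range(1000)]

def probability_of_digits(digits):
    """P(the selected number uses exactly these digits, in any order)."""
    target = sorted(digits)
    favorable = [s for s in outcomes if sorted(s) == target]
    return Fraction(len(favorable), len(outcomes))

print(probability_of_digits("233"))  # event A in the text
print(probability_of_digits("234"))  # event B in the text
```

Note that `Fraction` reduces to lowest terms when printed, so a value such as 6/1000 is displayed in reduced form.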
So we wish to associate with A a number P(A) about which the relative frequency N(A)/n of the event A tends to stabilize with large n. A function such as P(A) that is evaluated for a set A is called a set function. In this section, we consider the probability set function P(A) and discuss some of its properties. In succeeding sections, we shall describe how the probability set function is defined for particular experiments.
To help decide what properties the probability set function should satisfy, consider properties possessed by the relative frequency N(A)/n. For example, N(A)/n is always nonnegative. If A = S, the sample space, then the outcome of the experiment will always belong to S, and thus N(S)/n = 1. Also, if A and B are two mutually exclusive events, then N(A ∪ B)/n = N(A)/n + N(B)/n. Hopefully, these remarks will help to motivate the following definition.
Definition 1.1-1 Probability is a real-valued set function P that assigns, to each event A in the sample space S, a number P(A), called the probability of the event A, such that the following properties are satisfied:
(a) P(A) ≥ 0; (b) P(S) = 1; (c) if A1, A2, A3, . . . are events and Ai ∩ Aj = ∅, i ̸= j, then
P(A1 ∪ A2 ∪ · · · ∪ Ak) = P(A1) + P(A2) + · · · + P(Ak)
for each positive integer k, and
P(A1 ∪ A2 ∪ A3 ∪ · · · ) = P(A1) + P(A2) + P(A3) + · · ·
for an infinite, but countable, number of events.
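For a finite sample space, properties (a), (b), and the two-event case of (c) can be checked by brute force. The sketch below uses a hypothetical model, one roll of a fair six-sided die, that is not part of the definition; it illustrates the properties and does not prove them, and the countably infinite case of (c) cannot be verified this way.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical model: one roll of a fair six-sided die.
S = frozenset(range(1, 7))
weight = {outcome: Fraction(1, 6) for outcome in S}

def P(event):
    """Probability of an event (a subset of S): the sum of its outcome weights."""
    return sum((weight[e] for e in event), Fraction(0))

# All 2**6 = 64 events (subsets) of the sample space.
events = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

nonnegative = all(P(A) >= 0 for A in events)            # property (a)
total_is_one = P(S) == 1                                # property (b)
additive = all(P(A | B) == P(A) + P(B)                  # property (c) with k = 2
               for A in events for B in events if not (A & B))
print(nonnegative, total_is_one, additive)
```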
The theorems that follow give some other important properties of the probability set function. When one considers these theorems, it is important to understand the theoretical concepts and proofs. However, if the reader keeps the relative frequency concept in mind, the theorems should also have some intuitive appeal.
Theorem 1.1-1
For each event A,
P(A) = 1 − P(A′).
Proof [See Figure 1.1-1(a).] We have
S = A ∪ A′ and A ∩ A′ = ∅.
Thus, from properties (b) and (c), it follows that
1 = P(A) + P(A′).
Hence
P(A) = 1 − P(A′). □
Example 1.1-3
A fair coin is flipped successively until the same face is observed on successive flips. Let A = {x : x = 3, 4, 5, . . .}; that is, A is the event that it will take three or more flips of the coin to observe the same face on two consecutive flips. To find P(A), we first find the probability of A′ = {x : x = 2}, the complement of A. In two flips of a coin, the possible outcomes are {HH, HT, TH, TT}, and we assume that each of these four points has the same chance of being observed. Thus,
P(A′) = P({HH, TT}) = 2/4.
It follows from Theorem 1.1-1 that
P(A) = 1 − P(A′) = 1 − 2/4 = 2/4.
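The value of P(A) in this example can also be approximated by simulating the experiment. The following sketch is an illustration only; the function name, seed, and number of trials are assumptions introduced here. It records x, the number of flips needed to observe the same face on two consecutive flips, and compares the relative frequencies of A = {x ≥ 3} and A′ = {x = 2}.

```python
import random

def flips_until_repeat(rng):
    """Flip a fair coin until two consecutive flips match; return the number of flips."""
    previous = rng.choice("HT")
    count = 1
    while True:
        current = rng.choice("HT")
        count += 1
        if current == previous:
            return count
        previous = current

rng = random.Random(1)                 # fixed seed so that the run is reproducible
trials = 100_000
results = [flips_until_repeat(rng) for _ in range(trials)]

freq_A = sum(x >= 3 for x in results) / trials             # relative frequency of A
freq_A_complement = sum(x == 2 for x in results) / trials  # relative frequency of A'
print(freq_A, freq_A_complement)
```

Both relative frequencies should be close to 1/2, and they sum to 1 because every trial belongs to exactly one of A and A′.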
Theorem 1.1-2
P(∅) = 0.
Proof In Theorem 1.1-1, take A = ∅ so that A′ = S. Then
P(∅) = 1 − P(S) = 1 − 1 = 0. □
Theorem 1.1-3
If events A and B are such that A ⊂ B, then P(A) ≤ P(B).
Proof We have
B = A ∪ (B ∩ A′) and A ∩ (B ∩ A′) = ∅.
Hence, from property (c),
P(B) = P(A) + P(B ∩ A′) ≥ P(A)
because, from property (a),
P(B ∩ A′) ≥ 0. □
Theorem 1.1-4
For each event A, P(A) ≤ 1.
Proof Since A ⊂ S, we have, by Theorem 1.1-3 and property (b),
P(A) ≤ P(S) = 1,
which gives the desired result. □
Property (a), along with Theorem 1.1-4, shows that, for each event A,
0 ≤ P(A) ≤ 1.
Theorem 1.1-5
If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
Proof [See Figure 1.1-1(b).] The event A ∪ B can be represented as a union of mutually exclusive events, namely,
A ∪ B = A ∪ (A′ ∩ B).
Hence, by property (c),
P(A ∪ B) = P(A) + P(A′ ∩ B). (1.1-1)
However,
B = (A ∩ B) ∪ (A′ ∩ B),
which is a union of mutually exclusive events. Thus,
P(B) = P(A ∩ B) + P(A′ ∩ B)
and
P(A′ ∩ B) = P(B) − P(A ∩ B).
If the right-hand side of this equation is substituted into Equation 1.1-1, we obtain
P(A ∪ B) = P(A) + P(B) − P(A ∩ B),
which is the desired result. □
Example 1.1-4
A faculty leader was meeting two students in Paris, one arriving by train from Amsterdam and the other arriving by train from Brussels at approximately the same time. Let A and B be the events that the respective trains are on time. Suppose we know from past experience that P(A) = 0.93, P(B) = 0.89, and P(A ∩ B) = 0.87. Then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.93 + 0.89 − 0.87 = 0.95
is the probability that at least one train is on time.
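The arithmetic of this example can be written as a short computation. This is a minimal sketch: the function name `prob_union` is introduced here for illustration, and exact fractions are used to avoid floating-point rounding. As an added step not in the example, Theorem 1.1-1 then gives the probability that neither train is on time.

```python
from fractions import Fraction

def prob_union(p_a, p_b, p_a_and_b):
    """Theorem 1.1-5: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)."""
    return p_a + p_b - p_a_and_b

p_a = Fraction("0.93")        # train from Amsterdam is on time
p_b = Fraction("0.89")        # train from Brussels is on time
p_a_and_b = Fraction("0.87")  # both trains are on time

p_union = prob_union(p_a, p_b, p_a_and_b)   # at least one train is on time
p_neither = 1 - p_union                     # complement, by Theorem 1.1-1
print(float(p_union), float(p_neither))
```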
Theorem 1.1-6
If A, B, and C are any three events, then
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).
Proof [See Figure 1.1-1(d).] Write
A ∪ B ∪ C = A ∪ (B ∪ C)
and apply Theorem 1.1-5. The details are left as an exercise. □
Example 1.1-5
A survey was taken of a group’s viewing habits of sporting events on TV during the last year. Let A = {watched football}, B = {watched basketball}, C = {watched baseball}. The results indicate that if a person is selected at random from the surveyed group, then P(A) = 0.43, P(B) = 0.40, P(C) = 0.32, P(A ∩ B) = 0.29, P(A ∩ C) = 0.22, P(B ∩ C) = 0.20, and P(A ∩ B ∩ C) = 0.15. It then follows that
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
= 0.43 + 0.40 + 0.32 − 0.29 − 0.22 − 0.20 + 0.15 = 0.59
is the probability that this person watched at least one of these sports.
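The computation in this example, together with a numerical check of Theorem 1.1-6, can be sketched as follows. The function name `prob_union3` and the randomly generated events on a sample space of 20 equally likely outcomes are assumptions made for illustration; the check confirms the formula for one particular choice of events and is not a proof.

```python
from fractions import Fraction
import random

def prob_union3(pa, pb, pc, pab, pac, pbc, pabc):
    """Theorem 1.1-6: probability of the union of three events."""
    return pa + pb + pc - pab - pac - pbc + pabc

# Survey values from the example, kept as exact fractions.
survey = prob_union3(*(Fraction(s) for s in
                       ("0.43", "0.40", "0.32", "0.29", "0.22", "0.20", "0.15")))

# Cross-check on a hypothetical sample space of 20 equally likely outcomes.
rng = random.Random(0)
S = range(20)
A, B, C = (frozenset(rng.sample(S, 8)) for _ in range(3))

def P(event):
    """P(E) = N(E)/N(S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

direct = P(A | B | C)   # count the union directly
by_formula = prob_union3(P(A), P(B), P(C), P(A & B), P(A & C), P(B & C), P(A & B & C))
print(float(survey), direct == by_formula)
```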
Let a probability set function be defined on a sample space S. Let S = {e1, e2, . . . , em}, where each ei is a possible outcome of the experiment. The integer m is called the total number of ways in which the random experiment can terminate. If each of these outcomes has the same probability of occurring, we say that the m outcomes are equally likely. That is,
P({ei}) = 1/m, i = 1, 2, . . . , m.
If the number of outcomes in an event A is h, then the integer h is called the number of ways that are favorable to the event A. In this case, P(A) is equal to the number of ways favorable to the event A divided by the total number of ways in which the experiment can terminate. That is, under this assumption of equally likely outcomes, we have
P(A) = h/m = N(A)/N(S),
where h = N(A) is the number of ways A can occur and m = N(S) is the number of ways S can occur. Exercise 1.1-15 considers this assignment of probability in a more theoretical manner.
It should be emphasized that in order to assign the probability h/m to the event A, we must assume that each of the outcomes e1, e2, . . . , em has the same probability 1/m. This assumption is then an important part of our probability model; if it is not realistic in an application, then the probability of the event A cannot be computed in this way. Actually, we have used this result in the simple case given in Example 1.1-3 because it seemed realistic to assume that each of the possible outcomes in S = {HH, HT, TH, TT} had the same chance of being observed.
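The rule P(A) = N(A)/N(S) and the role of the equally likely assumption can be illustrated with the two-flip sample space. This is a minimal sketch with an illustrative function name, `classical_probability`. The second part is an added illustration, not from the text: if the experiment is described by the number of heads, the three outcomes 0, 1, and 2 are not equally likely, so assigning each the probability 1/3 would give the wrong answer.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """P(A) = N(A)/N(S); valid only when all outcomes in S are equally likely."""
    favorable = [e for e in sample_space if e in event]
    return Fraction(len(favorable), len(sample_space))

# Two flips of a fair coin: HH, HT, TH, TT, assumed equally likely.
S = ["".join(p) for p in product("HT", repeat=2)]
same_face = {"HH", "TT"}
p_same_face = classical_probability(same_face, S)

# Added illustration: outcomes described by the number of heads (0, 1, or 2)
# are NOT equally likely, so h/m with m = 3 would be incorrect.
one_head = {o for o in S if o.count("H") == 1}
p_one_head = classical_probability(one_head, S)   # correct value from the four-point space
naive_guess = Fraction(1, 3)                      # wrong: treats 0, 1, 2 as equally likely
print(p_same_face, p_one_head, naive_guess)
```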