APRIL 2014 / THE CPA JOURNAL
The following discussion will not dwell upon “attributes” sampling (used principally to test the effectiveness of internal controls) or the details of how to use the various sampling tools available for testing amounts or balances. Instead, it will focus on which sampling tools to use in a variety of situations and why I think each is the proper tool for the situation described.
Setting the Stage
Auditors audit financial statements to provide users with assurance that the financial statements are not materially misstated. They employ a variety of testing techniques to accumulate sufficient appropriate audit evidence in order to support an opinion on the financial statements: inquiry, observation, reperformance, confirmation, analytical procedures, tests of internal control, and tests of details. Some of these techniques are nonsampling procedures and others are sampling procedures, which include statistical sampling and nonstatistical sampling. Audit sampling figures in many popular myths that are not quite true, as shown in the sidebar, Myths and Truths About Audit Sampling.
Of these techniques and sampling procedures, how do we decide which tools to use?
Nonsampling
Many tests are not samples. One of the most common is to “select all items > x” for testing and either ignore the remainder of the items or test “a few” of the items < x. This approach can be effective for testing highly skewed populations, because the > x items make up a very large percentage of the population, rendering the < x items immaterial in the aggregate; however, such an approach is not necessarily the most efficient, because the number of items that must be tested to reduce the aggregate untested population to an immaterial amount might be more than would be tested in a statistical sample.
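The efficiency point can be made concrete by counting how many of the largest items must be tested before the untested remainder drops below an immaterial amount. The following is only a sketch: the helper name, the two made-up populations, and the 50,000 threshold are assumptions for illustration, not figures from this article.

```python
def items_needed_for_cutoff(values, immaterial_amount):
    """Count how many of the largest items must be tested so that the
    aggregate untested remainder is below immaterial_amount.
    (Hypothetical helper for illustration.)"""
    ordered = sorted(values, reverse=True)
    untested = sum(ordered)
    tested = 0
    for value in ordered:
        if untested < immaterial_amount:
            break
        untested -= value   # this item moves into the tested group
        tested += 1
    return tested

# Highly skewed population: three large items dominate the total.
skewed = [500_000, 300_000, 150_000] + [1_000] * 40
# Flat population with the same number of items and a similar total.
flat = [23_000] * 43

print(items_needed_for_cutoff(skewed, 50_000))  # 3
print(items_needed_for_cutoff(flat, 50_000))    # 41
```

In the skewed case three items suffice, while in the flat case nearly the whole population must be tested, which is the situation where a statistical sample would likely require less work.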
Another nonsampling procedure is a substantive analytical procedure. Such procedures can be both effective and efficient in many scenarios where the item being tested has a known relationship to some other item that has already been tested (e.g., commission expenses have a predictable relationship to sales).
Nonstatistical Sampling
Any sample where the sample items are not selected according to the laws of chance (that is, by probability sampling) is a nonstatistical sample. Selecting an arbitrary number of items for testing is an example of nonstatistical sampling. The problems with nonstatistical sampling are:
■ while the test provides evidence about the items tested, the results cannot be projected to the untested population, and
■ it is easy to inadvertently bias the sample selection.
PERSPECTIVES: auditing
Myths and Inconvenient Truths about Audit Sampling
An Audit Partner’s Perspective
By Howard Sibelman
Audit sampling, like any powerful tool, is of great value when used properly but of great potential harm when used improperly. The auditing standards provided by the AICPA, PCAOB, and International Auditing and Assurance Standards Board (IAASB) provide scant guidance about audit sampling. The AICPA’s audit sampling guide is useful, but I believe that more guidance is needed.
The auditing literature specifies that nonstatistical sampling is acceptable, but one might ask why. The literature also specifies that nonstatistical sample sizes should approximate statistical sample sizes. If one needs to do the same amount of work in either scenario, why choose a tool (nonstatistical sampling) that does not permit conclusions about the untested population? From my point of view, unless the number of items in the population is very small (fewer than 200), statistical sampling should always be preferred over nonstatistical sampling.
Statistical Sampling
Statistical samples include those selected and evaluated using proper statistical methodology, based upon either an equal probability or a probability proportional to size. Such samples include the following.
Monetary unit sampling (MUS). This approach selects sample items with a probability proportional to their size. This is the tool to use when the concern is overstatement. The most common uses, in my experience, are to test for the existence of accounts receivable, inventory, and fixed asset additions, and to search for unrecorded liabilities (completeness test) by testing the population of subsequent disbursements for existence (overstatement), that is, expenditures that should have been recorded in the period under audit.
As noted in the sidebar, I believe MUS is a relatively easy and highly efficient tool to use when testing for overstatement, and this is often the principal audit concern; however, “principal” does not mean “exclusive,” and so auditors must think about assertions where understatement may be as significant a concern as overstatement.
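To illustrate what “probability proportional to size” means in practice, here is a minimal sketch of systematic monetary unit selection. It is one common way to select an MUS sample, not necessarily what any particular audit software does; the function name and the recorded values are made up for illustration.

```python
import random

def mus_select(recorded_values, sample_size, seed=0):
    """Systematic monetary unit selection (illustrative sketch).
    Every monetary unit has the same chance of being hit, so an item's
    chance of selection is proportional to its recorded value."""
    total = sum(recorded_values)
    interval = total / sample_size            # sampling interval in currency units
    rng = random.Random(seed)
    start = rng.uniform(0, interval)          # random start within the first interval
    targets = [start + k * interval for k in range(sample_size)]

    selected = []
    cumulative = 0.0
    t = 0
    for index, value in enumerate(recorded_values):
        cumulative += value
        hit = False
        # Consume every target unit that falls inside this item.
        while t < len(targets) and targets[t] < cumulative:
            hit = True
            t += 1
        if hit:
            selected.append(index)
    return selected

# Hypothetical recorded balances; two items are recorded at zero.
recorded = [50_000, 0, 1_200, 800, 30_000, 500, 0, 17_500]
print(mus_select(recorded, sample_size=4, seed=1))
```

With a total of 100,000 and four selections, the interval is 25,000, so the 50,000 and 30,000 items are always selected, while the items recorded at zero can never be selected. An item that is badly understated carries little weight in the selection, which is why MUS is a tool for overstatement.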
Equal probability sampling. This tool is frequently referred to as classical variables sampling (CVS). When should auditors be as concerned about understatement as overstatement? There is no correct answer to this question other than “always,” because material misstatement is a bidirectional concern. Nevertheless, there are certain realities to consider when evaluating the risk of material understatement:
■ Many entities are motivated to understate results. Family-owned entities may understate to reduce taxes, but not so much so as to inhibit lenders from extending credit. Not-for-profit entities may understate so as to encourage donations.
■ Some accounts are prone to both inadvertent overstatement and understatement misstatement (e.g., inventory valuation: pricing misstatement).
In the first scenario, how hard can the auditor look for understatement? In the real world, client relations come into the picture. Nevertheless, auditors generally perform a variety of nonsampling procedures that address the potential for understatement in one way or another, such as a revenue cutoff test, inventory observation (assuming that all of the inventory locations are known), and substantive analytical procedures that are just as likely to indicate understatement as overstatement (e.g., the reasonableness of depreciation, inventory turnover, changes in cost of goods sold).
The second scenario is another matter. Putting aside deliberate understatement, client personnel and systems might be just as prone to understating amounts as overstating amounts. Inventory valuation is a classic example because of the possibility of inputting errors (e.g., a misplaced decimal point) or making a mistake while doing an inventory count (e.g., units of measure, incorrect arithmetic, overlooked quantities). Once auditors conclude that they must test for understatement because of the assessment of the possibility of material understatement, MUS goes out the window as a sampling tool. The reason is probably evident: because MUS selection is weighted toward high-value items, the selection of understated items would be mere happenstance.
The only valid method to address the understatement concern is to use CVS. In CVS tests, items in the population have an equal probability of selection (i.e., selection is random), so an understated item has as much chance of being selected for testing as an overstated item (assuming understatements and overstatements occur at the same rate in the population). The problem is that CVS is a much more complicated tool to use than MUS and requires larger sample sizes. One of the reasons for larger sample sizes is that, as opposed to MUS, which looks at high-value items that are easy to see for overstatement, CVS must consider all items, because the concern is understatement and understated items do not “stick out.”

MYTHS AND TRUTHS ABOUT AUDIT SAMPLING
■ Myth: This is a 100% audit.
  Truth: Virtually all audits involve sampling.
■ Myth: Errors in data are nonexistent, or so rare that it is a waste of time trying to detect any errors.
  Truth: Stuff happens. Humans are not perfect and computer systems are designed by humans. Risk assessment drives the scope, but one cannot audit based on an assumption that errors do not exist.
■ Myth: Understated = conservative, therefore auditors don’t need to worry about understatement.
  Truth: In many scenarios, auditors are mostly concerned about overstatement and little concerned with understatement.
■ Myth: MUS is an appropriate tool when concerned about both over- and understatement.
  Truth: Testing for overstatement is relatively easy (MUS); testing for understatement is relatively difficult (CVS).
■ Myth: Extrapolating the error rate from a nonstatistical sample is a valid means of estimating error in the untested population.
  Truth: Statistical sampling results take into account sampling risk, and can be used as a proposed adjusting journal entry.
Two Practical Approaches
The following are what I consider to be two practical approaches that balance a concern for understatement with the need to be efficient. The context is testing inventory valuation, and so the right tool is CVS. Readers will immediately note that I propose using only two strata. Most of the CVS literature talks about many strata, which are better suited for estimating population value than detecting the presence of material misstatement. When using CVS to estimate a population’s total value, or to come to a statistically valid conclusion that a population is within an acceptable range of values, many strata may be needed to “efficiently” achieve the required precision. When attempting to detect material misstatement, the preferable approach is to use fewer strata. My objective in these practical approaches is not to calculate population values, but rather to assess the likelihood of material misstatement in the population.
Practical Approach 1
Step 1. Stratify the population into two strata: all items > x (the 100% layer) and all other items. The 100% layer will include all items greater than or equal to the tolerable misstatement (performance materiality) for the test, but x will likely be lower if there are few or no individual items greater than or equal to performance materiality. In other words, x can be no greater than performance materiality for the test and will likely be lower; this is more or less what the software would do for an MUS test.
Step 2. Determine the sampling parameters as in an “attributes” test, with regard to the number of items in the population, confidence level (the flip side of risk), and tolerable error rate. The expected error rate should be set at one-half of the tolerable error rate. The attribute in this test is whether the inventory item is properly valued. The resulting sample size will be three to four times larger than a sample for discovery sampling. The benefit of this larger sample size is that if the population is error-prone, the sample will include enough errors to provide the basis for the calculation of a confidence interval that can serve as a reasonable basis for an audit decision.
Step 3. Input the sampling parameters into an “attributes” sample size calculator. The result will be a sample size of more items than MUS, but fewer than CVS (which considers the variability of the entire population).
Step 4. Perform the test on the 100% layer and the sample items (selected randomly).
Step 5. Input the results, including the 100% layer, into a CVS evaluation program.
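Steps 1 through 4 can be sketched in a few lines. This is only an illustration under stated assumptions: the population is made up, the cutoff of 20,000 is arbitrary, and the sample size routine uses a plain binomial calculation that ignores the number of items in the population. Attributes sampling programs differ in their algorithms, so the result of this sketch need not match the sample size produced by any particular program.

```python
import math
import random

def stratify(items, x):
    """Step 1: split {item_id: recorded_value} into the 100% layer
    (value > x) and all other items."""
    top = {k: v for k, v in items.items() if v > x}
    rest = {k: v for k, v in items.items() if v <= x}
    return top, rest

def binom_cdf(k, n, p):
    """P(at most k deviations in n items when the true rate is p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def attribute_sample_size(confidence, tolerable_rate, expected_rate, max_n=5000):
    """Steps 2-3 (assumed binomial method): smallest n for which finding
    the expected number of deviations still leaves no more than
    (1 - confidence) risk that the true rate equals the tolerable rate."""
    risk = 1.0 - confidence
    for n in range(1, max_n + 1):
        allowed = math.ceil(expected_rate * n)
        if binom_cdf(allowed, n, tolerable_rate) <= risk:
            return n
    raise ValueError("no sample size found up to max_n")

# Hypothetical inventory listing (made-up values, not the article's data).
rng = random.Random(42)
population = {f"item-{i}": round(rng.lognormvariate(7, 1.2), 2) for i in range(2000)}

top_layer, remainder = stratify(population, x=20_000)
n = attribute_sample_size(confidence=0.90, tolerable_rate=0.05, expected_rate=0.025)

# Step 4: equal-probability selection, without replacement, from the remainder.
sample_ids = rng.sample(sorted(remainder), min(n, len(remainder)))
print(len(top_layer), n, len(sample_ids))
```

Setting the expected error rate at one-half of the tolerable rate, as Step 2 suggests, is what drives the sample size well above a discovery sample, where no errors are expected.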
What happens next depends on the sample results. The CVS evaluation program will produce several numbers that are used in the following evaluation rules:
■ Precision
■ Lower confidence limit/lower error limit (LEL)
■ Projected error
■ Upper confidence limit/upper error limit (UEL).

EXHIBIT 1
Results for Practical Approach 1

Sample  Number of      Rate of                  Lower       Estimated     Upper       Proposed
Number  Misstatements  Misstatement  Precision  Confidence  Misstatement  Confidence  Correction
1       13             7.4%          286,692    (258,141)   28,552        315,244     -
2       17             9.7%          478,327    (257,115)   221,212       699,540     -
3       15             8.6%          315,483    (550,994)   (235,510)     79,973      -
4       15             8.6%          237,220    (399,814)   (162,594)     74,626      -
5       8              4.6%          252,794    (324,703)   (71,908)      180,886     -
6       17             9.7%          800,424    (368,253)   432,171       1,232,594   Reject
7       21             12.0%         368,268    (493,441)   (125,174)     243,094     -
8       10             5.7%          334,009    (329,553)   4,457         338,466     -
9       17             9.7%          330,763    (772,299)   (441,536)     (110,774)   -
10      14             8.0%          346,298    (697,857)   (351,558)     (5,260)     -
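The four evaluation outputs are closely related: in the exhibits, the lower and upper limits equal the estimated misstatement minus and plus the precision. The sketch below shows one textbook way such numbers can be produced, a two-strata difference estimate in which the 100% layer contributes its known errors and the sampled stratum is projected. The normal-distribution factor, the two-sided interval, the finite population correction, and all names and figures are assumptions; a given CVS evaluation program may compute these values differently.

```python
import math
import statistics

def evaluate_difference_estimate(top_errors, sample_errors, rest_count, confidence=0.90):
    """Illustrative two-strata difference estimate.
    top_errors:    misstatements found in the 100% layer (all items tested)
    sample_errors: misstatement per sampled item (0.0 when correctly valued)
    rest_count:    number of items in the sampled stratum
    Errors are recorded minus audited amount, so overstatement is positive."""
    n = len(sample_errors)
    mean_diff = statistics.fmean(sample_errors)
    sd = statistics.stdev(sample_errors)
    # Two-sided normal factor, e.g. about 1.645 at 90% confidence (assumption).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = math.sqrt(1 - n / rest_count)       # finite population correction
    projected = sum(top_errors) + rest_count * mean_diff
    precision = z * rest_count * sd / math.sqrt(n) * fpc
    return {
        "precision": precision,
        "lel": projected - precision,
        "projected": projected,
        "uel": projected + precision,
    }

# Made-up results: one error in the 100% layer, two in a sample of ten.
result = evaluate_difference_estimate(
    top_errors=[500.0],
    sample_errors=[0.0] * 8 + [120.0, -80.0],
    rest_count=1000,
)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

Because the 100% layer is tested in full, it adds to the projected error but not to the precision; only the sampled stratum carries sampling risk.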
Rule 1. If precision is greater than performance materiality (PM), the test fails; that is, because the results are not sufficiently precise, there is no basis upon which to propose any correction. This leaves one with three alternatives:
■ Increase the sample size and test enough items in a single additional step to achieve a precision that is less than PM; this might be a lot of work.
■ Ask the client to rework the population to reduce the error rate and then retest the population from scratch.
Neither of these two alternatives will endear an auditor to a client, but it is not an auditor’s fault that the population has so many errors in it.
■ Finally, see if lowering the confidence level produces an acceptable precision.
It is with some trepidation that I even mention this third alternative. While theoretically acceptable, this choice is too prone, in my opinion, to rationalization. The confidence level at which the test was originally designed and performed should not be second-guessed because of the results. Accordingly, I urge that one not be seduced by this alternative. If the work is ever questioned, it will be difficult to defend the assertion that confidence was lowered to accommodate the results of the test.
Rule 2. If precision is less than PM, the test may be relied upon, but further analysis is required.
Rule 2a. If the magnitude of either LEL or UEL is greater than PM, a correction of misstatement must be posted to the schedule of unadjusted differences for further evaluation. See Rule 3.
Rule 2b. If the magnitudes of LEL and UEL are both less than PM, the population can be accepted as presented.
Rule 3. When Rule 2a requires a proposed adjustment, the amount of the adjustment is the amount by which the magnitude of the LEL or UEL exceeds PM (the greater of the two, if both exceed it), and it may be a positive or negative value, depending upon the sign of the LEL or UEL.
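The rules above can be expressed as a short decision function. This is a sketch of the rules as I read them, not the logic of any particular evaluation program; the function name is hypothetical, and treating precision exactly equal to PM as acceptable is an assumption, since the rules only address “greater than” and “less than.” The inputs in the usage lines are taken from the article's exhibits, where parentheses denote negative amounts.

```python
def evaluate_sample(precision, lel, uel, pm):
    """Apply Rules 1-3. Returns "Reject", a signed proposed correction,
    or None when the population can be accepted as presented."""
    if precision > pm:
        return "Reject"                       # Rule 1: not precise enough
    # Look at whichever limit is larger in magnitude.
    worst = lel if abs(lel) >= abs(uel) else uel
    if abs(worst) > pm:                       # Rule 2a, so apply Rule 3
        excess = abs(worst) - pm
        return excess if worst > 0 else -excess
    return None                               # Rule 2b: accept as presented

PM = 800_000
print(evaluate_sample(800_424, -368_253, 1_232_594, PM))  # sample 6:  Reject
print(evaluate_sample(286_692, -258_141, 315_244, PM))    # sample 1:  None
print(evaluate_sample(464_457, -842_441, 86_473, PM))     # sample 12: -42441
print(evaluate_sample(561_791, -176_698, 946_884, PM))    # sample 13: 146884
```

The outputs agree with the Proposed Correction column for these four samples in the exhibits.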
EXHIBIT 2
Results for Practical Approach 2

Sample  Number of      Rate of                  Lower       Estimated     Upper       Proposed
Number  Misstatements  Misstatement  Precision  Confidence  Misstatement  Confidence  Correction
11      10             8.6%          853,594    (684,355)   169,240       1,022,834   Reject
12      8              6.9%          464,457    (842,441)   (377,984)     86,473      (42,441)
13      10             8.6%          561,791    (176,698)   385,093       946,884     146,884
14      16             13.8%         740,005    (866,017)   (126,012)     613,993     (66,017)
15      7              6.0%          461,602    (769,426)   (307,825)     153,777     -
16      11             9.5%          313,667    (288,498)   25,170        338,837     -
17      11             9.5%          293,546    (300,782)   (7,236)       286,309     -
18      13             11.2%         658,178    (1,071,067) (412,889)     245,288     (271,067)
19      5              4.3%          442,952    (282,611)   160,341       603,293     -
20      12             10.3%         836,411    2,566       838,977       1,675,388   Reject

For example, consider a series of results based on the following:
■ Population, 4,890 items; value, $14,512,000; performance materiality, $800,000
■ 230 items are misstated, some over, some under; a 4.7% rate of misstatement
■ Net misstatement is approximately zero (this is a fact one will never know in the real world, but this is a test case)
■ Using an attributes sampling program, a sample of 175 is calculated using 90% confidence, 5% tolerable error, and 2.5% expected error
■ For illustrative purposes, 10 different random numbers were used to pull 10 different samples. (Do not attempt this in the field; clients won’t pay for it.)
How do we interpret the results shown in Exhibit 1?
Sample #6 will be rejected because precision (800,424) is greater than PM (800,000). This follows Rule 1 above. This will happen from time to time, as there is always some chance of the sample producing out-of-bounds results, even for this test population where the net misstatement is zero. As noted above, auditors have a choice: increase the sample size until enough items are tested to achieve precision that is less than PM, or ask the client to rework the population to reduce the error rate and then retest the population from scratch. Because there are 17 errors in this sample, a 9.7% error rate, one should not encounter any resistance to this request.
The other nine results are quite interesting in that there is no proposed correction for any of them; this is a good thing, given that the population is, in fact, not at all misstated.
Practical Approach 2
Step 1. Input the population into CVS software that will divide the population into two strata, items > x (the 100% layer) and all other items. The 100% layer will include all items ≥ tolerable misstatement (PM) for the test, but x will likely be lower, perhaps much lower, if there are no individual items, or few individual items, ≥ performance materiality. In other words, x can be no greater than PM for the test and will likely be lower.
Step 2. Part of the input to the CVS software requires entering values for PM and risk of incorrect acceptance (the flip side of confidence, which in my experience is usually 80%, 90%, or 95%; i.e., risk of 20%, 10%, or 5%). How one approaches which variables to use has a significant impact on the calculated sample size. For a more thorough discussion, see “Relating Statistical Sampling to Audit Objectives” by Robert K. Elliott and John R. Rogers, Journal of Accountancy, June 1972.
Step 3. Allow the CVS software to calculate the sample size. There will be two components, the 100% layer and the actual sample of the remaining population. The resulting number of items to be tested (both strata) will be more than an MUS test but will likely be fewer than Practical Approach 1. This comes at a price, as shown below.
Step 4. Select the 100% layer items. Select the sample items randomly. Perform the test.
Step 5. Input the results, including the 100% layer, into a CVS evaluation program. What happens next depends on the sample results.
Example 2. The results in Exhibit 2 are for the same 10 tests as in Practical Approach 1, but with only 116 items tested. With the smaller number of items tested, two samples will be rejected (#11 and #20). Four of the tests (#12, #13, #14, and #18) will result in proposed corrections where no correction is actually required. However, if the proposed correction is recorded, the adjusted amount of the population is, in all cases, within PM of the actual amount of the population. Many auditors may not like the idea of proposing correction where none is needed, but in the real world you will never know whether the population is misstated, or by exactly how much, and as long as the planning about PM is sound, it is appropriate to propose such adjustments.
Which Is the Better Approach?
There is no correct answer to this question. The science is inescapable: the more items tested, the more precise the results become, allowing for better decisions. This argues for Practical Approach 1. The problem is that the extra precision comes at the expense of additional time, as compared to Practical Approach 2. Aside from the difference in rejected samples, both approaches yield acceptable audit results.
My recommendation to auditors is to try to find the budget for Practical Approach 1. There will be fewer rejected results, but, even more importantly, there will be fewer projected errors to request that the client correct.
As much as auditors might like to ignore the risk of understatement (however we rationalize it, such as by calling the financial statements “conservative”), the auditor’s report speaks of material misstatement. That is a bidirectional concept. Of course, there is also the point to be made that understatement is “conservative” only for assets. Are the nonsampling and substantive analytical procedures described above that provide some comfort about understatement enough? Ultimately that is a matter of an auditor’s judgment. The practical approaches outlined above should illustrate why CVS is not too much work to use to test inventory valuation.
Howard Sibelman, MBT, CPA, is a director of subscriber services at Crowe Horwath International, New York, N.Y.
Copyright of CPA Journal is the property of New York State Society of CPAs and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.