Figures 1-2, Table 1

CHAPTER IV-I

WHY STATISTICS IS SO DIFFICULT, AND WHY RESAMPLING IS EASIER

Julian L. Simon

A successful statistical inference is as difficult a feat of the intellect as one commonly meets. This is not because of mathematical difficulties. Rather, it is due to the long chain of reasoning connecting the original question with a sound conclusion; indeed, the mathematical operations involved in estimating probabilities once the problem has been correctly specified can be straightforward, especially when one estimates with the experimental resampling method rather than with formulaic sample-space methods. Indeed, perhaps the greatest benefit of the resampling approach is that it clears away the mathematical difficulties so that the difficult philosophical and procedural issues can be seen clearly, and hence may be tackled head on. Ironically, however, this characteristic of reducing the computational difficulties has also been a drawback of the resampling method; by making the necessary mathematical operations so simple as to be accessible to any clear-thinking layperson, resampling has made the assistance of professional statisticians seem less necessary, which naturally elicits resistance from the traditionalists in that profession.

To restate the point: The great challenge presented by the ideas of statistics does not stem from the need for a large body of prerequisite knowledge, or for mathematical sophistication and inclination. The challenge stems, rather, from the inherent difficulty of making sense of a complicated situation. Indeed, if any particular situation is not hard to understand, statistical inference is not needed. The difficulty lies in there being a very long sequence of decisions that must be made correctly about such matters as the nature of the correct hypothetical population, the correct sampling procedure, and so on.
This essential difficulty will be seen more clearly in Chapters 00, where canonical procedures for confidence intervals and hypothesis tests are set forth.

THE ESSENCE OF THE DIFFICULTY OF MATHEMATICS

What is the area of a table that is 6 feet long and 4 feet wide? Twenty-four square feet, you say. But how do you know that the answer is 24? Multiply length times width, you say. But are the 6 and 4 units in the lengths you are multiplying the same as the 24 units in the answer? Notice that a length is in feet and the answer is in square feet. So how can you start talking about one entity and end up talking about another entity? Never mind; that's just a minor complication.

Why do you multiply length times width? Because that's the formula I learned in school, you say. But can you prove that that is the correct formula? Most of us can't. Simple as it is, such a deductive proof is not easy. It includes a chain of definitions and substitutions which is not easy to follow, let alone to produce. So it should not be surprising to us that more complex problems in mathematics are hard to follow.

Do we have to do such difficult intellectual work? No. Let's instead take a piece of graph paper, mark off a length of 6 units and a width of 4 units, and count the number of squares that are included. The formula does the same thing as we do on the graph paper, but more abstractly, and the proof of the formula probably only deduces why the formula is the same as what we have done. So we can do without the formula. And we can use the same device for more complex problems - say, a trapezoid.

One day I sat down to read an essay about the famous Goedel's proof, which shook mathematics in the 1930s by showing that the foundations of mathematics are less intellectually solid than mathematicians had believed.
I had no hope of ever reading Goedel's paper itself, but I tackled an essay by the philosopher Ernest Nagel and the mathematician-journalist James R. Newman (1956) in hopes that they could relate to me enough of the central idea so that I could grasp its spirit. The paper contains almost no formulae, and very little mathematical manipulation. Yet as I tackled it weekend after weekend, I found that I could not follow its logic in a satisfactory manner. So I tried to figure out why I was having such difficulty.

The difficulty for me seems to be that in the series of steps in the logic there is continual substitution of one set of symbols - either a formula, or part of a formula - for another formula. This means that at each substitution one must remember the meaning of the symbols in the previous step as well as in the next step - that is, one must keep in one's mind the meaning of all the many symbols being used. Nagel and Newman reduce the number of symbols, and of sets of them, very sharply - they note that in Goedel's paper "forty-six preliminary definitions together with several important lemmas must be mastered" before even beginning the main work (p. 1688); nevertheless, I realized that my memory could not hold the necessary meanings of even the smaller number of symbols Nagel and Newman use as they make one substitution after another.

You hear and read mathematicians saying that the ability to hold a long chain of reasoning in mind is a characteristic of their field. Let's agree with them about that. But we need not accept the apparent implication of the statement - that being able to do so is the mark of superior intelligence. Rather, handling long chains of reasoning is just one special ability, and one that can be counter-productive. Consider an analogy: A juggler shows you a pile of glass balls atop a tool that you need to do a job.
S/he says: "Watch how I get this tool for you", and proceeds to throw all the glass balls in the air and keep them there, remove the tool and give it to you, then take the balls out of the air and replace them one by one in the pile without breaking any. Then s/he puts the tool back and says, "Here is the pile. Now you lift the balls, remove the tool, and put the balls back in the original spot". The implication is that we, too, should learn how to juggle the balls so as to extract the tool. But there is another way: Take all the balls off the pile and put them in another pile nearby, take the tool away, and again pile the balls on the original spot. That way you do not need to keep the balls in the air at all.

And so it is with statistics and probability. You do not need to obtain the tool you need by expending great amounts of time and effort to learn how to keep all the intellectual balls in your mind - which you probably could not learn to do satisfactorily no matter how hard you try, if you are like me. Instead, you can learn how to get what you need without keeping anything in memory, by using concrete elements - such as the use of a picture for solving the fathers-and-sons puzzle (see Chapter 00, page 000).

Many of us find it difficult to turn away from the mathematician's demand that we carry out the task using his/her deductive method. In contrast, if the juggler brags about his/her skill with hand and eye, we reject the claim that that is a sign that s/he is just plain a better athlete than the rest of us. For example, I doubt my ability to learn juggling well, but my hand-eye coordination hitting a ball to a tiny spot with a squash or racquetball racket was at one time quite good for my age and experience; and juggling is just one among many hand-eye skills. But with formulaic mathematics it is different. We all have to play that game at least a little bit in school and college.
And hence it is harder for us to turn away from the claim that skill in mathematical manipulation is just one special talent, and not a higher one.

It is interesting to note the discussion by Friedrich Hayek - a Nobel-prize winner in economics, and in my view the greatest social scientist of the 20th century - of two kinds of minds among eminent scientists (1978, Chapter 4). One type he calls the "master of the subject", and the other he calls "the puzzler" and sometimes "the muddler"; it is the latter that he said he himself possessed. I would suggest that a "master of the subject" is the type that (among other characteristics) handles formulaic methods with dexterity, and operates comfortably at high levels of abstraction. In contrast, a puzzler constantly feels the need to go back to first principles, if only because s/he cannot well remember what underlies propositions at a high level of abstraction. The puzzler is likely to be more drawn to resampling and experimental methods generally than is a master of the subject.

WHY SIMULATION MAKES STATISTICS EASIER

The intellectual advantage of simulation in general, and of the resampling method in statistics, is that though it takes repeated samples from the sample space, it does not require that one know the size of the sample space, or the number of points in a particular partition of it. To calculate the likelihood of getting (say) 26 or more "4"s in 130 dice throws with the binomial formula requires that one calculate the total number of possible permutations of 130 dice, and the number of those permutations that include 26 or more "4"s. Gaussian distribution-based methods often are used to approximate such processes when sample sizes become large, which introduces another escalation in the level of intellectual difficulty and obscurity for persons who are not skilled professional mathematicians. In contrast, with a simulation approach one needs to know only the conditions of producing a single "4" in one roll.
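The dice example can be made concrete in a few lines of code. The sketch below (Python; the choice of 10,000 trials and the function names are my own, not from the text) simply repeats the single-roll condition 130 times per trial and records how often 26 or more "4"s appear:

```python
import random

def count_fours(num_throws=130):
    """One trial: throw 130 fair dice and count how many show a '4'."""
    return sum(1 for _ in range(num_throws) if random.randint(1, 6) == 4)

def estimate(num_trials=10_000, threshold=26):
    """Estimate the probability of 26 or more '4's in 130 throws."""
    hits = sum(1 for _ in range(num_trials) if count_fours() >= threshold)
    return hits / num_trials

random.seed(1)
print(estimate())   # roughly 0.18 with these settings
```

Nothing here requires knowing how many permutations of 130 dice exist; the program needs only the chance of a "4" on a single roll.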
Indeed, it is the much lesser degree of intellectual difficulty which is the secret of simulation's success, because it improves the likelihood that the user will arrive at a sound solution to the problem at hand - which I hope you will agree is the ultimate criterion.

One may wonder how simulation performs this economical trick of avoiding the complex abstraction of sample-space calculations. The explanation is that simulation substitutes the particular information about how elements in the sample are generated in the specific case at hand, as derived from the facts of the actual circumstance; the analytic method does not use this information. Recall that Galileo solved the problem of why three dice yield "10" and "11" more frequently than "9" and "12" by inventing the concept of the sample space. He listed all 216 possible permutations, and found that there are 27 permutations each that yield "10" and "11", but only 25 each that yield "9" and "12". But it took a Galileo to perform this intellectual feat, simple as it seems to professionals now; lesser theorists had made the error of assuming that the probabilities of all four numbers are the same because the same number of combinations yields each of the four numbers.

That is, the gamblers before Galileo - who from long experience had correctly determined that "10" and "11" had higher chances of success than "9" and "12" - were in effect using simulation, which rests on the assumed facts that three fair dice are thrown, each outcome having an equal chance. They took advantage of the results of a series of such events, performed one at a time stochastically; in contrast, Galileo made no use of the actual stochastic element of the situation, and did not gain information from a sequence of such trials; instead, he replaced all possible sequences by a computation of their number (actually, a complete enumeration). Simulation is not "just" a stochastic shortcut to the results of the formulaic method.
Rather, it is a quite different route to the same endpoint, using different intellectual processes and utilizing different sorts of inputs. As a partial analogy, it is like fixing a particular fault in an automobile with the aid of a how-to-do-it manual's checklist, compared to writing a book about the engineering principles of the auto; the author may be no better at fixing cars than the hobbyist is at writing engineering books, but the hobbyist requires less time to learn the task to which he addresses him/herself - fixing the car's fault - than the author needs to learn to write learned works.

COMPLEXITY, NOT BUILT-IN PERVERSITY, MAKES PUZZLES DIFFICULT

A key implication of the deservedly-famous research on errors in probabilistic judgments by Daniel Kahneman and Amos Tversky (interchangeably, Tversky and Kahneman) is that human thinking is often unsound. And some writers in their school of thought assert that the unsoundness of thinking is hard-wired into our brains; this point of view is expressed vividly in the title of Massimo Piattelli-Palmarini's book Inevitable Illusions; he calls the unsoundness "bias", and says that "we are instinctively very poor evaluators of probability" (1994, p. 3, italics in original).

Another possibility - not necessarily inconsistent with a genetic explanation - is that the reason we arrive at unsound answers to certain types of problems is that the problems are inherently very difficult, especially when they are tackled without the assistance of tools, because the problems require many steps and also because the steps often involve reversals in the path. Without the aid of memory aids such as paper and pencil, and the skill of using them well, the problems are just too difficult for most persons. One piece of evidence against the genetic-bias explanation is that the wrong answers to problems are not all the same; they do not even concentrate at one end of the probability spectrum.
As the work of Kahneman and Tversky amply shows, the errors often are widely distributed among most or all of the simple arithmetical combinations of the numbers involved in the problems. The outstanding characteristic of the answers is that they are wrong, and not the nature of the errors. In following long chains of logic and assessing complex assortments of information, our brains may be weaker than we would like, but we need not think of our brains as twisted.

The two explanations have quite different implications for remediation, and two different remedies are offered; I suggest resorting to simulation, whereas others suggest additional training (especially in probability theory) to improve people's logic. The different remedies are not necessarily connected to the two explanations, however; I believe that the remedy I suggest is implied by the bias explanation as well as by the weakness explanation.

Martin Gardner paraphrases the great American philosopher Charles Sanders Peirce as saying that "in no other branch of mathematics is it so easy for experts to blunder as in probability theory" (1961, p. 220). Even great mathematicians have blundered on simple problems, including D'Alembert and Leibniz. But when you tackle problems in probability with experimental simulation methods rather than with logic, neither simple nor complex problems need be difficult for experts or beginners.

THE LOGICAL STEPS IN THE "THAT MAN'S FATHER..." PUZZLE

Though the main issue is problems in probability, it is illuminating to begin with a famous deterministic (non-probabilistic) problem: A man points to the image of a person and says, "Brothers or sisters have I none. That man's father is my father's son." This puzzle is difficult for most of us to solve in our heads. In his book of puzzles, Raymond Smullyan says (and only about this puzzle): The remarkable thing about this problem is that most people get the wrong answer but insist (despite all argument) that they are right.
I recall one occasion about 50 years ago when we had some company and had an argument about this problem which seemed to last hours, and in which those who had the right answer just could not convince the others that they were right (1978, p. 7).

What is it that causes puzzles to be difficult? There seem to be at least two elements that cause trouble:

1) A large number of logical steps, as will soon be documented. This requires a large memory.

2) Often one must switch directions of thought back and forth several times. This requires that you hold all of the relevant information in your mind, which makes it hard to remember where you are - just as when moving around in a city and making many lefts and rights.

Both of these elements are found in the "That man's father is my father's son" puzzle. When we spell out and number the separate logical steps in the problem, as in Figure 1, there are fully 11 of them (13, including the sub-steps), and they go backwards and forwards. It should not be surprising that one cannot effectively store and sort out all this material in short-term memory.

Figure 1 here

Here is the list of the steps. (The division into steps is not perfectly determinate, and what I list as a single step might be considered two steps by someone else, and vice versa.)

1. Point to a picture of a man and say "That man".

2. Follow the logic backwards a generation from "That man" to "That man's father".

3. Draw a picture frame at a generation earlier than the generation of "That man", and label it "That man's father".

4. Show the equation (taken from the original language of the puzzle) of "That man's father" with "My father's son", by drawing another box at the same generational level as "That man's father", and connecting the two boxes with a sign of equality.

5. From "My father's son" - a complex label for a person who is not yet specifically identified - we pass to the previous generation and to the completely identified framed personage of "My father".

6. With "My [historical] father" identified, we now move again to the next generation and identify the historical person "My father's son".

7. Now equate the two-element concept of "My father's son" with the one-element "I".

8. We can now make clear exactly who "That man's father" is by working back through the two equalities on the generational level of "I", and then equating "I" directly with "That man's father".

9. Now we pass from "I", who is "That man's father", to "The son of I" ("son of me", to grammarians), and picture that concept on the generational level below "I".

10. We can now pass from the logical concept "The son of I" to the same (because logically equal) personage, "That man".

11. Last step: Re-label (because they are equal) the concept "That man" as "My son", the latter a clearly-defined historical person. This is the answer to the riddle.

The lightning thinker with a laser-like mentality may contemptuously dismiss the piecemeal, step-wise reasoning above as obvious and unnecessary, and write "It can be shown that...". But as such great mathematicians and quantifiers as Alfred North Whitehead and Wassily Leontief have noted, there can be an enormous psychological difference between two equivalent logical constructions. And the task at hand is to find safe and efficient psychological routes to sound logical endpoints, even if they are extended rather than compact and therefore unesthetic.

When you show the pictorial analysis in Figure 1, people quickly understand and agree about the right answer. This is unlike Smullyan's experience (cited above) that he could not persuade some people of the right answer. This should be strong evidence of the power of concrete illustration to ease logical difficulties and help obtain the correct answer. Simulation's strength is its concreteness. (This is also the power of mathematical notation, of course.) If you use paper and pencil, or a simulation, some people will say, "But you didn't do the problem in your head".
True. One might well be amused and impressed by someone who can do incredible mathematical/logical feats in his/her head. But let's keep separate the sphere of amusement-cum-awesome-spectacle and the sphere of useful tools. If you want to perform great works in science, business, and the rest of life, it is dexterity with functional instruments you want, not the ability to impress other people with feats akin to sword-swallowing and lifting small autos with your bare hands.

THE LOGICAL STEPS IN BERTRAND'S PUZZLE

Next, let us consider a problem that Piattelli-Palmarini considers a canonical "illusion", the three-chests problem discussed in Chapter III-4. Here it is again: A Spanish treasure fleet of three ships was sunk at sea off Mexico in the 1500s. One ship had a trunk of gold forward and another aft, another ship had a trunk of gold forward and a trunk of silver aft, while a third ship had a trunk of silver forward and another trunk of silver aft. A scuba diver just found one of the ships and a trunk of gold in it, but she ran out of air before she could check the other trunk. On deck, they are now taking bets about whether the other trunk found on the same ship will contain silver or gold. What are fair odds that the trunk will contain gold?

These are the logical steps I distinguish in arriving at a correct answer with deductive logic (portrayed in Figure 2):

Figure 2

1. Postulate three ships, one (call it "I") with two gold chests (G-G), II with one gold and one silver chest (G-S), and III with S-S. (Choosing notation might well be considered one or more additional steps.)

2. Assert equal probabilities of each ship being found.

3. Step 2 implies equal probabilities of being found for each of the six chests.

4. Fact: Diver finds a chest of gold.

5. Step 4 implies that S-S ship III was not found; hence remove it from subsequent analysis.

6. Three possibilities: 6a) Diver found chest I-Ga, 6b) diver found I-Gb, 6c) diver found II-Gc.
7. From step 2, the cases 6a, 6b, and 6c in step 6 have equal probabilities.

8. If possibility 6a is the case, then the other trunk is I-Gb; the comparable statements for cases 6b and 6c are I-Ga and II-S.

9. From step 7: from equal probabilities of the three cases, and no other possible outcome, p(6a) = 1/3, p(6b) = 1/3, p(6c) = 1/3.

10. So p(G) = p(6a) + p(6b) = 1/3 + 1/3 = 2/3.

Now let us list the steps in a simulation that would answer the question:

1. Create three urns, each containing two balls, labeled "0,0", "0,1", and "1,1" respectively (where "0" stands for gold and "1" for silver).

2. Choose an urn at random, and shuffle its contents.

3. Choose the first element in the chosen urn's vector. If "1", stop the trial and make no further record. If "0", continue.

4. Record the second element in the chosen urn's vector on the scoreboard.

5. Repeat steps 2-4, and calculate the proportion of "0"s on the scoreboard.

AN APPLIED BAYESIAN PROBLEM

Now consider this classic Bayesian problem that Tversky and Kahneman quote from Casscells, Schoenberger, and Graboys (1978, p. 999): If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs? Tversky and Kahneman note that among the respondents - students and staff at Harvard Medical School - "The most common response, given by almost half of the participants, was 95%", very much the wrong answer.
To obtain an answer by simulation, rephrase the question above with hypothetical numbers as follows: If a test to detect a disease whose prevalence has been estimated to be about 100,000 in the population of 100 million persons over age 40 (that is, about 1 in a thousand) has been observed to have a false positive rate of 60 in 1200, and never gives a negative result if a person really has the disease, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs?

(Please note in passing that the use of percentages rather than raw numbers is an unnecessary abstraction and is often misleading, in addition to being a barrier to simulation. If the raw numbers are not available, the problem can be phrased in terms of "about 1 case in 1000" or "about 5 cases in 100".)

If one has the habit of saying to oneself, "Let's simulate it", one may then get an answer as follows:

1. Construct urn A with 999 white beads and 1 black bead, and urn B with 95 green beads and 5 red beads. (A more complete problem that also has false negatives would need a third urn.)

2. Pick a bead from urn A. If black, record "T", replace the bead, and end the trial. If white, continue to step 3.

3. If a white bead is drawn from urn A, select a bead from urn B. If red, record "F" and replace the bead; if green, record "N" and replace the bead.

4. Repeat steps 2 and 3 perhaps 10,000 times, and count the proportion of "T"s to "T"s plus "F"s (ignore the "N"s) in the results.

Of course 10,000 draws would be tedious, but even after a few hundred draws a person would be likely to draw the correct conclusion that the proportion of "T"s to ("T"s plus "F"s) would be small. And it is easy with a computer to do 10,000 trials very quickly. Note that the respondents in the Casscells et al. study were not naive; the staff members were supposed to understand statistics. Yet most produced wrong answers.
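The urn-and-bead steps above translate directly into a short program. This is a sketch (Python; the bead counts follow the urns described above, while the function name and trial count are my own assumptions):

```python
import random

def one_trial():
    """Steps 2 and 3: draw from urn A, then (if white) from urn B.
    Returns "T" (true positive), "F" (false positive), or "N" (negative)."""
    # Urn A: 999 white beads and 1 black bead (disease prevalence 1/1000).
    if random.randint(1, 1000) == 1:
        return "T"
    # Urn B: 95 green beads and 5 red beads (false-positive rate 5/100).
    return "F" if random.randint(1, 100) <= 5 else "N"

random.seed(1)
results = [one_trial() for _ in range(10_000)]   # step 4
t, f = results.count("T"), results.count("F")
print(t / (t + f))   # a small fraction - nowhere near 95%
```

Even this crude run shows why "95%" cannot be right: nearly all the positive results come from urn B's false positives, not from urn A's rare black bead.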
If simulation can do better, then simulation would seem to be the method of choice. And only one piece of training for simulation is required: Mastery of the self-reminder "Try it".

THE THREE-DOOR PROBLEM

The now-famous problem of the three doors, discussed in Chapter III-v, is Piattelli-Palmarini's pièce de résistance and "Grand Finale" - his "Super-Tunnel" (p. 161) of an "inevitable illusion" due to a "super blind spot" (p. 7). But it is indeed a complex problem, as one sees if one diagrams it. But as we have seen, hands-on simulation with physical symbols, rather than computer simulation, is a surefire way of obtaining and displaying the correct solution. Not only does the best choice become obvious, but one is likely to understand quickly why switching is better. No other mode of explanation or solution brings out this intuition so well.

And it is much the same with other problems in probability and statistics. Simulation can provide not only answers but also insight into why the process works as it does. In contrast, formulas produce obfuscation and confusion for most non-mathematicians. One may attempt to elucidate the three-door problem by listing the elements of the sample space (all of which have the same probability). But though this sample space is easy to understand after it is placed before you, constructing it requires much more depth of understanding of the problem's logic than does simulation - which does not require that one count the possible outcomes or the number of "successes"; this may explain why it is much easier to err with sample-space analysis in this and other problems.

DISCUSSION

Kahneman and Tversky and others of their school (e.g., Nisbett et al.) infer from their findings the need for better instruction in the logic of probability and statistics.
The role of logic may be seen in these statements by Piattelli-Palmarini:

The ultimate measuring instruments are those offered by pure logic, probability theory, economics, and decision theory. (Piattelli-Palmarini, 1994, p. 5.)

Yet, the mere fact that we intuitively come to see the situation as anomalous is not sufficient to set us right. It requires thought, and thought based on real data (such as those offered by the cases in this book) and on well-constructed theories that can ultimately and persuasively gain our assent. That is how rationality is fostered. The very fact that we turn our backs on such abstract concepts is, however, an unpardonable resistance to the progress of reason. The game is worth it, because these are important matters for all of us; they are fundamental to ourselves and to those whom we love. (Piattelli-Palmarini, 1994, p. 14.)

Through thinking enough about these matters, one day we may come to a certain rationality. (Piattelli-Palmarini, 1994, p. 88.)

After presenting a problem asking the probabilities of RGRRR (5 outcomes), GRGRRR (6 outcomes), and GRRRRR (6 outcomes), he asks the subject: "On which do you bet? Think." (Piattelli-Palmarini, 1994, p. 50.) In contrast, I'd say only "On which do you bet?", and hope that people would try out the possibilities.

But this inference that more instruction in logic is needed flows from the way the cognitive psychologists frame their test questions, forcing a choice among a narrow and specific range of alternatives and thereby excluding the possibility of simulating the problem. If instead one widens the range of alternative answers, we may see that a better choice is to teach people the habit of not relying on the sort of logic that Kahneman and Tversky find to be often defective, but instead resorting to some kind of experimentation - simulation - to seek an answer.
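To make that recommendation concrete, here is what "trying out the possibilities" might look like for the three-door problem discussed earlier. This is a sketch in Python (the trial count and function names are my own assumptions, not from the text), comparing the stick and switch strategies:

```python
import random

def play(switch):
    """One round of the three-door game; returns True if the player wins."""
    doors = [0, 1, 2]
    prize = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the prize.
    opened = random.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, rounds=10_000):
    return sum(play(switch) for _ in range(rounds)) / rounds

random.seed(1)
print(win_rate(switch=False))   # about 1/3
print(win_rate(switch=True))    # about 2/3
```

A few thousand rounds make the advantage of switching visible without any sample-space enumeration at all.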
Indeed, Kahneman and Tversky sometimes find that even persons trained in statistics do little or no better than untrained respondents - and this means doing badly. "[T]he study of research psychologists...reveals that a strong tendency to underestimate the impact of sample size lingers on despite knowledge of the correct rule and extensive statistical training" (1972/1982, pp. 45-46). So training in statistical logic does not teach people what they need to know. Yet the response of Kahneman and Tversky - and even more so, the recommendation of such researchers as Nisbett et al. - is that the appropriate remedy is more conventional statistical training.

This remedy of giving more instruction in statistical logic has a bit of the flavor of: If beating does not produce improvement in the child's behavior, beat the child three times as hard. Or, if drilling for oil produces no results at 1,000 feet and then at 2,000 feet, resolve to drill another 2,000 feet. A better choice of remedy than the instruction that has already failed might be to instruct people in a different fashion - that is, by simulation. It is, of course, an empirical question whether people will with higher probability arrive at a correct answer with simulation or with deductive logic. Studies with other sorts of questions clearly give the palm to simulation (Simon, Atkinson, and Shevokas, 1976).

Though I do not think that a genetic bias is responsible for our inability to deductively calculate probabilities well, I do believe that another (and even more dangerous) genetic bias is implicated in the situation, to wit: the powerful urge to use our brains deductively rather than by producing and assessing experimental data.
Even among scientists whose everyday business is experimentation, it is extremely difficult to get a person to address problems like those alluded to here with actual trials; people generally resist the suggestion of simulation as if it threatens an end to pleasure - which indeed it does. There is one major shortcoming of the simulation approach that must be mentioned: the inadequacy of problem-solution work done by simulation to satisfy the powerful human desire to meet and solve challenges to one's power of reasoning. This is what David Hume had in mind when he compared the excitement of difficult intellectual activity to the hunt and the chase. Further discussion of this crucial topic must be left to another place, however.

REFERENCES

Gardner, Martin, The Second Scientific American Book of Mathematical Puzzles & Diversions (New York: Simon and Schuster, 1961).

Kahneman, Daniel, and Amos Tversky, "Subjective Probability: A Judgment of Representativeness," abbreviated version of a paper originally appearing in Cognitive Psychology, 1972, 3, 430-454; reprinted in Judgment Under Uncertainty: Heuristics and Biases, edited by Daniel Kahneman, Paul Slovic, and Amos Tversky (Cambridge: Cambridge University Press, 1982), pp. 32-47.

Nisbett, Richard E., David H. Krantz, Christopher Jepson, and Geoffrey T. Fong, "Improving Inductive Inference," in Judgment Under Uncertainty: Heuristics and Biases, edited by Daniel Kahneman, Paul Slovic, and Amos Tversky (Cambridge: Cambridge University Press, 1982), pp. 445-459.

Piattelli-Palmarini, Massimo, Inevitable Illusions (New York: Wiley, 1994).

Smullyan, Raymond, What is the Name of This Book?
(Englewood Cliffs: Prentice-Hall, 1978).

SUMMARY

Simulation - and its sub-class, resampling, in statistics - is a much simpler task intellectually than the formulaic method of probabilistic calculation, because it does not require that one calculate a) the number of points in the entire sample space, and b) the number of points in some sub-set, so as to estimate the ratio of the latter to the former. Instead, one directly estimates the ratio from a sample. Even mildly complicated problems involving permutations and combinations are sufficiently difficult as to require advanced training. Similarly, it is much easier to sample the proportion of black grains of sand on a beach than it is to take a census of the total number of grains of each color. The latter is a task of great intellectual as well as great practical difficulty.