CHAPTER I-2
ON STATISTICS TEACHING, TEACHERS, AND CURRICULA
Everyone agrees that an understanding of statistics and
probability is a crucial element in a person's quantitative
skills. But for decades, the public and the relevant professions
have lamented that statistics teaching desperately needs to be
improved. Though various educational palliatives have been
offered, none has shown any evidence of success.
The problem -- really a disease -- in the practice and
teaching of statistics has been diagnosed by many writers:
Obfuscation of teeth-gritting students by teachers who repeat by
rote the body of complex algebra and tables that only a rare
expert understands down to the roots. Years ago, W. Allen Wallis
and Harry Roberts had it right: "The great ideas of statistics
are lost in a sea of algebra" (1956, p. viii).
Just as surely as there is a problem, the resampling method
can contribute mightily to solving it. Indeed, it is the only
scientifically tested and proven cure for the malady.
A quarter century ago, claims for resampling were derided as
ridiculous. Now it is agreed that resampling is theoretically
valid. (Its new respectability is shown by its passage from
"Ridiculous" to "Everyone always knew that.") Some writers
(Edgington, referred to by Manly, 1991, p. 17) even suggest that
in many cases the classical tests are simply approximations to
resampling methods, and that the parametric methods are only a
second-choice substitute for resampling methods. Yet resampling
is still mostly passed by when the curriculum is set, though a
few leaders in the profession have now come out in its favor.
A long-time problem in promoting resampling has been to make
clear that it is a basic tool for researchers and decision-
makers. I emphasize "tool" because resampling often gets
confused with Monte Carlo simulation as a way of teaching
conventional parametric methods. The subject here is resampling
as a substitute and complement to conventional methods, and as
the method of first choice in handling actual everyday problems.
The subject is not a device to improve the standard pedagogy.
THE CRITICISMS OF STATISTICS PRACTICE AND EDUCATION
To set the scene, here are some comments by thoughtful
critics of statistics education. Please forgive me for
multiplying the quotations, but it is only the fact of this
sentiment being widespread, and the view held by respected
statisticians, that lends authority to the criticism.
The introductory statistics course is troublesome. Many
readers will surely confirm that assertion with their own
knowledge of what students and teachers say about the subject.
And there is much written testimony to this effect by thoughtful
critics of statistics education. Here are some examples:
1. Garfield (1991): "A review of the professional
literature over the past thirty years reveals a consistent
dissatisfaction with the way introductory statistics courses are
taught" (p. 1). Garfield asserts (referring to her dissertation,
1981, and to work by Wise) that "It is a well known fact that
many students have negative attitudes and anxiety about taking
statistics courses" (p. 1). "Students enrolled in an
introductory statistics course have criticized the course as
being boring and unexciting... Instructors have also expressed
concern that after completing the course many students are not
able to solve statistical problems..." (1981, quoting Duchastel,
1974).
2. Dallal (1990, p. 266): "[T]he field of statistics is
littered with students who are frustrated by their courses,
finish with no useful skills, and are turned off to the subject
for life."
3. Hey (1983, p. xii):
For more years than I care to recall, I have been
teaching introductory statistics and econometrics to
economics students. As many teachers and students are
all too aware, this can be a painful experience for all
concerned. Many will be familiar with the apparently
never-ending quest for ways of reducing the pain - by
redesigning courses and by using different texts or
writing new ones. But the changes all too often turn
out to be purely cosmetic, with the fundamental problem
left unchanged.
4. Barlow (1989, Preface):
Many science students acquire a distinctly negative attitude
towards the subject of statistics...As a student I was no
different from any other in this respect.
5. Hogg: "[S]tudents frequently view statistics as the
worst course taken in college." He explains that "many of us are
lousy teachers, and our efforts to improve are feeble" (1991, p.
342).
6. Vaisrub (1990) about her attempt to teach medical
students conventional statistical methods: "I gazed into the sea
of glazed eyes and forlorn faces, shocked by the looks of naked
fear my appearance at the lectern prompted."
7. Freedman et al., noting that most students of
probability and statistics simply memorize the rules: "Blindly
plugging into statistical formulas has caused a lot of
confusion" (1991, p. xv).
8. Ruberg (1992):
It seems that many people are deeply afraid of statistics. [They
say] `Statistics was my worst subject' or `All those
formulas'...I wish they had a deeper understanding of the
statistical method...rather than the general confusion about
which formulas are most appropriate for a particular data set.
9. Freedman et al.:
[W]hen we started writing, we tried to teach the conventional
notation... But it soon became clear that the algebra was getting
in the way. For students with limited technical ability,
mastering the notation demands so much effort that nothing is
left over for the ideas. To make the point by analogy, it is as
if most of the undergraduates on the campus were required to take a
course in Chinese history--and the history department insisted on
teaching them in Chinese. (from the introduction to the first
edition)
10. Based on their review of the literature, Garfield and
Ahlgren say that "students appear to have difficulties developing
correct intuitions about fundamental ideas of probability", and
they proceed to offer reasons why this is so (1988, p. 45).
These sorts of negative comments are not commonly heard about
other subjects and other groups of students; both the nature of
the criticisms and their volume with respect to statistics are
unusual, we believe. One of us has been teaching economics,
business, and demography for three decades without hearing such
complaints.
Statisticians have long worried about the unthinking use
of parametric tests whose foundations are poorly understood.
"Students are often given the false impression that `easy-to-use
packages can be a substitute for a proper knowledge of
statistical methodology'" (Dallal, 1990, p. 266, quoting Searle,
1989). And now the readily available computer packages, which
perform conventional tests with a single command, exacerbate this
problem. As the Encyclopedia of Statistics notes:
Use of Inappropriate "Canned Programs"
As statisticians we find all too frequently that an experimenter
takes data directly to a computer center programmer (usually
called an analyst) for "statistical analysis." The programmer
pulls a canned statistical program out of the file and there may
result extensive machine outputs, all of which are irrelevant to
the purpose of the experiment. This deplorable situation can be
avoided only through having competent statistical advice,
preferably in the design stage and certainly in the analysis
stage. ("Computers and Statistics", Vol. 2, p. 95)
Blindly picking formulae for inferential procedures has
always afflicted statistics. But now such computerized routines
have made the problem even worse. The statistics user does not
even feel the need to learn the conditions under which a test may
or may not be appropriate.
To illustrate, SigmaStat's advertisement in Science Magazine
has a researcher say: "Dear SigmaStat Advisor ... I need a
foolproof way to pick the best statistical tests on my research
data." The advertisement then goes on to say, "The SigmaStat
Advisor automatically picks the best test for you. SigmaStat
automatically points you to the appropriate tests. You simply
type in your goals, and SigmaStat's Advisor automatically
suggests the best statistical test for your data."
To confirm that most conventional teaching of statistics is
a miserable failure, check for yourself the state of
understanding of even an advanced student or user of statistics.
Present to a researcher or student a set of data for a standard
situation - say, four groups of rats given different drugs. Ask
a series of questions such as: What kind of statistical analysis
will you do? What test will you use? Why an F test (or
whatever)? What exactly does the F statistic (or whatever) mean?
What is the meaning of the table that you will use? What do the
numbers in the table signify? How are the numbers derived? Is
the Normal distribution relevant here? What has that particular
distribution got to do with your data? What is the formula for a
Normal distribution? What is the reason for each of the elements
in the formula? Why are they combined in the fashion they are?
Unless the user can answer every one of those questions clearly
and confidently, the person is at risk of proceeding
inappropriately with the data for lack of full understanding.
To develop an appropriate resampling test for those data, a
person must likewise have a full intuitive understanding of the
statistical process, and must be able to answer a comparable set
of questions - though those questions are much easier to handle.
A user of resampling methods therefore is less at risk of simply
plugging in an inappropriate procedure. All the more need for
resampling, then.
WHAT IS THE NATURE OF THE DIFFICULTY?
Many writers have discussed the nature of the difficulty
with conventional methods. Hogg says:
Statistics teaching is often stagnant; statistics teachers
resent change. The most popular elementary texts evolve but
slowly over decades. Meanwhile, statistics is progressing
rapidly. (1990, p. 20)
Hollander and Proschan assert that the source of the
difficulty in teaching statistics is "Simple -- the books are
written in a foreign language...The books primarily explain the
mechanics of statistics using the language of algebra" (1984, p.
v). They "aim to narrate in plain English words" the nature of
statistics. But this leads them to completely eliminate from
their instruction the core element, probabilistic inferential
statistics, because plain English does not enable them to present
the conventional apparatus for significance testing and
confidence limits.
In the last decade or so, the discipline's greybeards have
decided that teaching probabilistic-statistics is just too tough
a nut to crack, and have concluded that students should be taught
mainly descriptive statistics -- tables and graphs -- rather than
how to draw inferences probabilistically. For example, Scheaffer
(1990) calls for "a more empirical, data-oriented approach to
statistics, sometimes termed exploratory data analysis" (p. 90).
And Moore and Roberts (1989) and Singer
and Willett (1990) suggest that actual rather than hypothetical
data will increase student interest, and they offer suggestions
about data sets. But probability and inferential statistics
clearly are the heart of the matter. A statistics course without
inferential statistics is like Hamlet without the Prince.
Gardner suggests that all mathematics is inherently
difficult to teach.
A teacher of mathematics, no matter how much he loves his subject
and how strong his desire to communicate, is perpetually faced
with one overwhelming difficulty: How can he keep his students
awake? (Gardner, 1977, p. x)
Efron says conventional statistics is a very
difficult theory.
The theory that's usually taught to elementary
[statistics] students is a watered-down version of a very
complicated theory that was developed in order to avoid a
great deal of numerical calculation... It's really quite a hard
theory, and should be taught second, not first. (Quoted by
Peterson, 1991, p. 56)
Elsewhere Efron (with Tibshirani) says (1993, p. xiv):
"The traditional road to statistical knowledge is blocked, for
most, by a formidable wall of mathematics".
The inherent difficulty of statistical inference is
discussed at greater length in Chapter 00.
Hogg argues that the formal equational approach is
unsound not only because it is difficult, but also
because it points the student away from deep
understanding of scientific-statistical problems.
Statistics is often presented as a branch of
mathematics, and good statistics is often equated with
mathematical rigor or purity, rather than with careful
thinking.
There is little attempt to measure what statistics courses
accomplish (1990).
Hey says (in touting Bayesian statistics as the answer):
I was aware of the real problem for some time, but it was
not until about three years ago that I finally admitted it to
myself. The fundamental malaise with most statistics and
econometrics courses is that they use the Classical approach to
inference. Students find this unnatural and contorted. It is
not intuitively acceptable and does not accord with the way that
people assimilate information... (1983, p. xi).
And Kempthorne writes:
...there has been a failure in the teaching of statistics
that originates with a failure of the teaching of teachers
of statistics,...Part of the malaise that I see occurs, I
believe, because it is easy to think of counting and of
areas and volumes, so rather than teach something about
statistics, one takes the easy route of teaching a species
of mathematics. And one can get a partial justification
because this species of mathematics is a critical part of
the whole area. What must happen is that the ideas and aims
of statistics must determine the mathematics of statistics
that is taught and not vice versa. Mathematics is surely a
beautiful art form (in addition to being useful). If the
statistics that is taught is to have this good form, then
its form is determined by its mathematical form. And then,
I suggest, form wins out over content, and essential ideas of
statistics are lost...(1980, p. 19)
All the above comments may be correct. But I believe that
statistics - as distinguished from probability theory - has some
very special and very great difficulties, and that the core of
the problem is this: There is no way to induce students to enjoy
the body of conventional inferential statistics because there is
no way to make the ideas intuitively clear and perfectly
understood. And even more fundamental than whether the
students enjoy the material is whether they will acquire a set of
techniques that they can put to effective use.
The trouble in statistics teaching is in the product, and
not the packaging and advertising. Sooner or later the
conventional teaching of statistics founders on the body of
complex algebra and tables.
Freedman et al. made the point in the passage quoted earlier:
the algebra gets in the way, and for students with limited
technical ability, mastering the notation demands so much effort
that nothing is left over for the ideas.
The various devices that have been suggested to mitigate the
problem certainly can be valuable. But - and please forgive me
if I am very blunt - such devices are like bandaids on internal
bleeding.
One must note a certain schizophrenia. The very
statisticians who assert that the problem is the "wall of
algebra" themselves proceed to use this tool heavily - even in
discussions of resampling, which renders much (if not all) of the
formulaic approach nugatory (see for example Efron and
Tibshirani, 1993; Hall, 1992; Westfall and Young, 1993).
THE SOLUTION TO THE PROBLEM: RESAMPLING
The resampling approach mitigates the problem, especially in
connection with the facilitating computer program RESAMPLING
STATS.
A physical process necessarily precedes any statistical
procedure. Resampling methods stick close to the underlying
physical process by simulating it, requiring less abstraction
than classical methods. The abstruse structure of classical
mathematical formulas, used with tables based on restrictive
assumptions about data distributions, tends to separate the
user from the actual data or physical process under
consideration; this is a major source of statistical error.
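To illustrate how directly resampling mirrors the physical process, here is a minimal bootstrap sketch of my own (not taken from the RESAMPLING STATS program); the data values and the function name are hypothetical.

```python
# A bootstrap sketch: simulate the physical process of redrawing
# samples from the observed data. The data values are hypothetical.
import random

data = [7, 11, 9, 14, 8, 12, 10, 13]  # hypothetical measurements

def bootstrap_means(data, trials=10_000):
    means = []
    for _ in range(trials):
        # Draw a resample of the same size, with replacement.
        resample = [random.choice(data) for _ in data]
        means.append(sum(resample) / len(resample))
    return means

means = sorted(bootstrap_means(data))
# Rough 90% interval for the mean: middle 90% of the resampled means.
print(means[len(means) // 20], means[-(len(means) // 20)])
```

Every line corresponds to a physical act one could perform with slips of paper in an urn; no distributional assumption intervenes between the user and the data.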
Resampling has most commonly been used when classical
methods are not promising. In contrast, I argue that resampling
should be used, and so taught, as the tool of first resort in
everyday practice of statistical inference, mainly because there
is a greater chance that a wrong classical test will be used than
a wrong resampling test. That is, the likelihood of "Type 4
error" decreases when the user is oriented to resampling.
The situation here is like that of people suffering for
years with a serious disease that has been clearly diagnosed.
There is a treatment that has been shown to work. But people
refuse the treatment for ideological or religious or esthetic
reasons.
A CRITERION FOR CHOOSING A STATISTICAL METHOD AND FOR DECIDING
WHETHER TO TEACH RESAMPLING
For operational comparison of methods we need a criterion.
I suggest "statistical utility." By this I mean a composite of
a) the likelihood that an appropriate test will be used (that is,
avoidance of Type 4 error), plus b) the technical characteristics
of the tests. This criterion is like an over-all cost-benefit
analysis, or a loss-function approach, to choosing methods.
The point of view of Wallis and Roberts squares with
focusing on statistical utility. They worry that "Techniques and
details, beyond a comparatively small range of fairly basic
methods, are likely to do more harm than good in the hands of
beginners...The great ideas...are lost...nonparametric [methods]
involving simpler computations, are more nearly foolproof in the
hands of the beginner" (1956, viii, xi). And Wallis and Roberts
were prepared to accept some loss of power for this purpose. But
the same argument applies even more strongly to resampling,
because it is no less powerful in general, and is even less
subject to error, being entirely rather than partially intuitive.
Singh makes a similar point about operations research:
In operations research, in particular, the danger of
applying unwarily the wrong procedure or method is great because
it is quite likely that the assumptions underlying the methods do
not hold in the case of the problem under study. The only
safeguard against such misapplication is a general understanding
of the ideas underlying operations-research methods. (Singh,
1972, pp. 20-21)
The statistical utility of resampling and other methods is
an empirical issue, and the test population should be non-
statisticians. American Statistical Association president Arnold
Zellner wrote that at a meeting on statistics education
I challenged participants to design and perform controlled
experiments to show that their proposed solutions to the
suffering problem actually work. Perhaps you can do society a
service by developing the methodology and showing that your
resampling approach produces significantly less suffering and
more statistical educational value than other approaches. Such
scientific, positive approaches would be extremely valuable, in
my opinion. (Correspondence, November 12, 1990)
But such studies were done, and long ago, by my colleagues and me
at the University of Illinois, and published in a very widely
read journal. Controlled experiments (Simon, Atkinson, and
Shevokas, 1976) found better results for resampling methods, even
without the use of computers. Students handled more problems
correctly, and liked statistics much better, with resampling than
with conventional methods. This should constitute a prima facie
case for resampling. And as Table I-1-1 in Chapter I-1 shows,
instruction at both the levels of introductory statistics and
graduate courses in statistics, from Frederick (Maryland) Junior
College to Stanford, yields positive evaluations from students.
But more empirical study would be welcome.
Table 1
This historical note is interesting:
When statistics was in its infancy, W.S. Gosset replied
to an explanation of the sampling distribution of the
partial correlation coefficient by R.A. Fisher [from
letter No. 6, May 5, 1922, in Letters From W.S. Gosset
to R.A. Fisher 1915-1936, Arthur Guinness Sons and
Company, Ltd., Dublin. Issued for private
circulation.]:
"...I fear that I can't conscientiously claim to
understand it, but I take it for granted that you
know what you are talking about and thankfully use
the results!
It's not so much the mathematics, I can often say
'Well, of course, that's beyond me, but we'll take
it as correct' but when I come to 'Evidently' I
know that means two hours hard work at least
before I can see why."
Considering that the original "Student" of statistics was
concerned about whether he could understand the mathematical
underpinnings of the discipline, it is reasonable that today's
students have similar misgivings. Lest this concern keep our
students from appreciating the importance of statistics in
research, we consciously avoid theoretical mathematical
discussions." (Dowdy and Wearden, 1991, pp. xv-xvi)
If Gosset himself could not understand even such simple
formulaic material, what should we expect of ordinary students?
WHY RESAMPLING SUCCEEDS
We can see resampling's greatest strength by considering the
now-famous problem of the three doors. Logical mathematical
deduction is grossly inadequate for almost everyone to arrive at
the right answer to that problem. Simulation, however -- and
hands-on simulation with physical symbols, rather than computer
simulation -- is a surefire way of finding and showing the
correct solution (Simon, forthcoming). Furthermore, the
explanation soon appears when one examines the simulation results
for the three-door problem. Important from the mathematician's
point of view, such simulation provides sound insight into why
the process works as it does.
It is much the same with other problems in probability and
statistics. Simulation can provide not only answers but also
insight, whereas for most non-mathematicians, formulas produce
obfuscation and confusion.
The resampling method is not really taught by an instructor.
Rather, it is learned by the students. With a bit of guidance
from the instructor, the students invent - from scratch -
resampling methods of doing statistics problems. Through a
process of self-discovery students develop useful operating
definitions of necessary concepts such as "universe," "trial,"
"estimate," and so on. And together they invent -- after false
starts and then moves in new directions -- sound approaches for
easy and not-so-easy problems in probability and statistics. For
example, with a bit of guidance an average university class can
be brought to reinvent such devices as the resampling version of
Fisher's randomization test.
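A bare-bones sketch of that resampling version of the randomization test, written by me with hypothetical scores for two groups, might look like this:

```python
# Sketch of the resampling version of Fisher's randomization test:
# shuffle the pooled observations, re-split them into two groups, and
# count how often the shuffled difference in means is at least as
# large as the observed difference.
import random

def randomization_test(a, b, trials=10_000):
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(trials):
        random.shuffle(pooled)
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b))
        if diff >= observed:
            count += 1
    return count / trials  # approximate p-value

# Hypothetical scores for two groups of five subjects each.
print(randomization_test([68, 74, 80, 77, 72], [61, 64, 70, 66, 59]))
```

Each step corresponds to a physical act a class can perform with slips of paper, which is why students can reinvent the procedure from scratch.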
The students learn more than how to do problems. They gain
the excitement of true intellectual discovery. And they come to
understand something of the nature of mathematics and its
creation.
Of course, this "discovery" method of teaching causes
difficulties for some teachers. It requires that the teacher
react spontaneously and let the discussion find its own path,
rather than having everything prepared in advance. For some
teachers this requires practice. Others may never find it
congenial. But for the teacher who is open and responsive and a
bit inventive, teaching the resampling method in this fashion is
wonderfully exciting. Perhaps most exciting is to see ordinary
students inventing solutions to problems that conventional
probability theory did not discover for many centuries.
The openness of resampling learning also bothers some
students, especially at first. They miss the comfort students
derive from a notebook full of well-organized cut-and-dried
formulae; the apparent lack of course structure worries some. But
after a few weeks the average student comes to like the
resampling approach better, as the controlled experiments of
Shevokas and Atkinson show.
OTHER BENEFITS OF RESAMPLING
Resampling has benefits beyond its greater statistical
utility.
Resampling has many characteristics that contemporary
educators (correctly, in my view) call for to improve the basic
quality of student learning in mathematics and science education,
as discussed in the guidelines in the National Council of
Teachers of Mathematics' (NCTM) Professional Standards for
Teaching Mathematics.
1. NCTM has urged greater use of simulation in
teaching probability and statistics.
Concepts of probability ... should be taught
intuitively. ... The focus of instructional time should be
shifted from the selection of the correct counting technique to
analysis of the problem situation and design of an
appropriate simulation procedure. ... students should value
both [theoretical and simulation] approaches. What should not be
taught is that only the theoretical approach yields the
"right" solution. (NCTM, 1989)
2. There also are calls for active and hands-on learning
rather than a passive process. The National Research Council
says that "in reality no one can teach mathematics, and that
effective teachers are actually those who can stimulate students
to learn mathematics" (1989, p. 58, quoted from Garfield, 1991,
p. 5). Herbert Simon writes that "learning results from things
the student does, and not (except indirectly) from things a
teacher does" (1991, p. 284) - witness the fact that the single
sentence a student is likely to remember from a course is the
sentence the student him/herself spoke in class (is it true for
you?).
In his famous How to Solve It, Polya writes:
A great discovery solves a great problem but there
is a grain of discovery in the solution of any problem.
Your problem may be modest; but if it challenges your
curiosity and brings into play your inventive
faculties, and if you solve it by your own means, you
may experience the tension and enjoy the triumph of
discovery. Such experiences at a susceptible age may
create a taste for mental work and leave their imprint
on mind and character for a lifetime.
Thus, a teacher of mathematics has a great opportunity. If
he fills his allotted time with drilling his students in routine
operations he kills their interest, hampers their intellectual
development, and misuses his opportunity. But if he challenges
the curiosity of his students by setting them problems
proportionate to their knowledge, and helps them to solve their
problems with stimulating questions, he may give them a taste
for, and some means of, independent thinking. (1957, p. v).
Resampling in the classroom is as active as any learning can
be.
3. One of the arguments given for studying formal deductive
mathematics and logic has been that the material teaches sound
thinking processes. Perhaps so. But there are also reasons to
believe that a person's general intellectual development can
benefit from learning to handle quantitative problems by
empirical methods such as resampling. These are some of the
reasons:
a. The step-by-step resampling process resembles the
historical process in which mathematicians commonly develop more
general abstract ideas. Mathematical invention often is
empirical. That is, one may have a mathematical idea, first try
it out with numerical experiments, and only later generalize and
formalize it. Or, one may solve a set of individual problems
individually by numerical methods, and only later see the thread
that runs through them and then arrive at the general abstract
solution. Resampling has much in common with the first part of
this process.
Littlewood (1986, p. 97) tells us that the great Indian-
English mathematician Ramanujan often worked just that way: "by
empirical induction from particular numerical cases". (He was
not a "Martian" a la Barrow's scenario in Chapter 00, because he
did not stop with the induction in all cases, though sometimes he
did.)
Mathematicians often lament that students see a mathematical
result only in its logical form, sanitized of this production
process, and therefore learners do not know what goes into the
making of mathematics. This lament should produce some sympathy
for teaching resampling when teaching probability.
Consider as an example Mosteller's problem 39 in
his Fifty Challenging Problems.
In a laboratory, each of a handful of thin 9-inch glass rods had
one tip marked with a blue dot and the other with a red. When
the laboratory assistant tripped and dropped them onto the
concrete floor, many broke into three pieces. For these, what
was the average length of the fragment with the blue dot?
If one does not at first see the general answer, one may
approach the problem empirically with the following steps:
1. Write down the numbers 1-900, to model a nearly-continuous
(discrete marks 1/100 inch apart) glass rod nine inches long.
2. Select randomly two numbers from (1), without
replacement (because a break can only take place once at a given
spot). (This makes the same unrealistic assumption implicit in
Professor Mosteller's problem and solution that the probabilities
of break points are the same throughout. I'd guess that very
small pieces are less likely than are pieces somewhat less than
1/3 of rod, say. Working up a simulated approach, or even more
so an actual experiment, is more likely to spotlight such an
unrealistic assumption - if it is indeed so - than proceeding
with formulae. Practical people should care about this.)
3. Consider that the blue dot is at the "high number" end
of the rod. Hence subtract the larger number in step (2) from
900, and record the result.
4. Repeat (1-3) say 1000 times, and take the mean of the
results, to get the answer sought.
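The four steps above can be sketched directly in code (my illustration; the function name is arbitrary):

```python
# The rod-breaking simulation: two random break points on a rod
# marked 1-900 (hundredths of an inch), with the blue dot taken to be
# at the high-number end.
import random

def blue_dot_mean(trials=10_000):
    total = 0
    for _ in range(trials):
        a, b = random.sample(range(1, 901), 2)  # steps 1-2: two distinct breaks
        total += 900 - max(a, b)                # step 3: blue-dot fragment
    return total / trials / 100                 # step 4: mean, in inches

print(blue_dot_mean())  # about 3 inches - one-third of the rod
```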
After doing the simulation, one may notice that all three
pieces average one-third of the total length. By reviewing the
results of the specific trials one may then understand why this
is so, and then find the "explanation" for the result, which
constitutes the more abstractly-based answer.
b. Teachers of problem-solving often focus on "breakthrough"
processes that involve putting aside assumptions that restrict
thinking from getting to the solution; a well-known example is
the invisible boundary around a set of points that keeps one from
going outside a square to connect the nine points with a given
number of straight lines. Another focus of such "creativity"
instruction is increasing the flow of ideas, as in the
brainstorming procedure. But there is reason to think that
teaching people how to overcome the limitations of the amount of
material our unaided brains can handle at once is also crucial.
These are some pieces of evidence:
i) Experience with dynamic programming (a.k.a. backward
induction, and the decision tree) shows that in a very
large proportion of cases where the technique is useful,
the benefit comes not from computation (which gets done
only in about 20 percent of the cases) but rather from
reducing to paper the cloud of vague thoughts about the
issue at hand - the various alternatives, probabilities,
and consequences that float around our brains all at once.
Many of our problems seem too difficult for "rational"
systematic thought - for example, many life decisions such
as what to choose for an occupation - because there are so
many connections and so many interactions. But the "tool"
of pencil and paper and constructing a decision tree can
help greatly.
ii) The famous "Brothers and sisters have I none..."
puzzle illustrates the power of simple techniques for
mastering information that is too complex to handle by
ratiocination alone.
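The decision-tree "tool" mentioned in (i) can be made concrete in a few lines. This is only an illustrative sketch with made-up payoffs and probabilities, not an example from the text: backward induction assigns a value to each node, taking expectations at chance nodes and the best branch at decision nodes.

```python
def expected_value(node):
    """Backward induction on a small decision tree.

    A node is a terminal payoff (a number),
    ('chance', [(probability, subtree), ...]), or
    ('decide', [subtree, ...]) -- choose the best alternative.
    """
    if isinstance(node, (int, float)):
        return node
    kind, branches = node
    if kind == 'chance':
        return sum(p * expected_value(sub) for p, sub in branches)
    if kind == 'decide':
        return max(expected_value(sub) for sub in branches)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical choice: a safe option worth 50, versus a gamble that
# pays 120 with probability 0.4 and 10 with probability 0.6.
tree = ('decide', [50, ('chance', [(0.4, 120), (0.6, 10)])])
print(expected_value(tree))   # the gamble averages 0.4*120 + 0.6*10 = 54
```

The benefit, as the text says, is less the arithmetic than the act of writing the alternatives, probabilities, and consequences down in one structure.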
INTELLECTUAL OBJECTIONS TO RESAMPLING, AND REBUTTALS TO THEM
The resistances to teaching resampling include intellectual
objections, turf problems, and individuals' investments in their
stock of professional knowledge. Indeed, these barriers have
been great enough to prevent it from being adopted widely since
the late 1960s, when I first began teaching it and publishing
about it. However, the intellectual objections are beginning to
melt away, in considerable part because of the more recent
development of the theory of the bootstrap by Bradley Efron and
many, many others in his wake. The opponents of resampling now
are becoming defensive, to the point of dwelling on such
irrelevancies as whether the random number generator is good
enough. I'll consider only the intellectual objections here.
These are some of the root objections to resampling as a
basic tool for everyday use:
1. With card and dice experiments one can make statistical
tests without the mathematical theory of probability. But
figuring the answer analytically is sometimes quicker. For
example, imagine that you want to know how often you will get
two aces if you deal a hand of only two cards from a bridge
deck. A satisfactory simulation would take some time, but one
can figure the answer in a hurry by multiplying 4/52 x 3/51; the
formulaic method quickly tells us that, on the average, we shall
get 2 aces in a hand of 2 cards once in 221 hands.
Though analytical methods may produce an answer more
quickly than the resampling method, a person who does not expect
to use probability and statistics often might find that, in the
long run, it is more efficient to spend a bit of extra time doing
resampling trials rather than studying the analytical methods.
However, people who plan scientific careers might well
eventually study analytic statistics as well as learning
simulation methods because that study deepens one's intuition
about scientific and mathematical relationships.
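The two routes to the two-aces answer can be put side by side. The following Python sketch (again an illustration, not the text's RESAMPLING STATS code) deals two cards without replacement many times and compares the estimate with the analytic 4/52 x 3/51:

```python
import random

def two_aces_rate(trials=200_000):
    """Estimate the chance of dealing two aces in a two-card hand."""
    deck = ['ace'] * 4 + ['other'] * 48    # only ace vs. non-ace matters here
    hits = 0
    for _ in range(trials):
        hand = random.sample(deck, 2)      # deal two cards without replacement
        if hand.count('ace') == 2:
            hits += 1
    return hits / trials

random.seed(1)
estimate = two_aces_rate()
analytic = (4 / 52) * (3 / 51)             # = 1/221, about 0.00452
print(estimate, analytic)
```

The simulation takes a moment where the multiplication takes seconds, which is exactly the trade-off the objection raises.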
2. The resampling method demands that every step be
thought through from first principles. This may seem to be a
disadvantage because such thinking about models is hard work.
However, this discipline reduces the probability of erroneous
calculations that arise from blindly plugging in the wrong
method and formula. Moreover, learning and practicing the
experimental approach aids problem-solving in general.
A side-benefit is that for the decreasing number of
students who come to the class unfamiliar with computers, the
resampling method together with the computer program RESAMPLING
STATS also serves as a painless introduction to computers and
computer programming. Such basic concepts as IF and looping are
learned without special instruction because they are seen to be
necessary for repeated resampling trials. And general notions
such as booting up, menus, and the operating system also are
learned without fuss, as natural parts of the process, simply
because this is what the students find themselves doing. Fear
of computers also is rapidly dispelled in this environment.
3. In earlier years, a frequent objection to resampling --
and the easiest to meet -- was that the estimates are
inaccurate. Resampling sample sizes must, of course, be large
enough for any desired level of accuracy. But in many
situations, an adequate sample of random numbers can be drawn by
hand in a few minutes. If not, it is easy to program any of
these procedures onto a personal computer using RESAMPLING STATS
or other languages, and thereby obtain samples of huge sizes in
seconds. With the computer, even probabilities close to 0 or 1
can be estimated accurately in a short span of time.
Furthermore, students quickly grasp the importance of
inaccuracy due to sampling error as they observe the variation
in their resampling samples. This causes them to worry about
sample variability -- perhaps the most important lesson in all of
statistics. The students then increase the size of their
resampling samples until they reach acceptable levels of
accuracy.
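That the accuracy objection dissolves as trials increase can itself be demonstrated by simulation. This Python sketch (illustrative only; the coin-flip estimator and replication counts are arbitrary choices) repeats a simple estimate at two sample sizes and compares the spread of the estimates:

```python
import random
import statistics

def estimate_heads(trials):
    """One resampling estimate of the probability of heads."""
    return sum(random.random() < 0.5 for _ in range(trials)) / trials

def spread_of_estimates(trials, replications=200):
    """Standard deviation across repeated estimates -- the sampling error."""
    return statistics.stdev(estimate_heads(trials) for _ in range(replications))

random.seed(1)
small_sample_spread = spread_of_estimates(trials=100)
large_sample_spread = spread_of_estimates(trials=10_000)
print(small_sample_spread, large_sample_spread)
# the 10,000-trial estimates cluster far more tightly around 0.5
```

Watching the spread shrink is the very experience described above: the student sees sampling error directly rather than reading about it in a table.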
4. In some statistical testing situations, resampling may
be less effective than conventional methods. Mathematical
statisticians are now busy investigating the conditions under
which the bootstrap is to be preferred by trained statisticians.
But this does not alter very much -- if at all -- the role of
resampling for that large number of persons who will never come
to have a satisfactory command of conventional analytic methods.
5. Resampling may be considered obscurantist, anti-
intellectual, and likely to limit the student's advance into
formal mathematical analysis. There are several counter-
arguments, however:
a. There are many students who will never go forward
with the study of statistics and mathematics, whether or not they
are exposed to this instruction. For these students, this
powerful engine of problem-solving is an important educational
bonus.
b. This instruction interests many students. It
probably pushes far more students into further study than it
keeps out.
c. Perhaps most important, learning resampling is of
great intellectual value to those who will make a further study
of statistics and probability. The procedure by which one must
explicitly structure problems in this method is also necessary
when problems are solved analytically, though when using the
analytic method the structuring process is too often done
implicitly or without awareness, leading to the wrong model or
unsound choice of a cookbook formula. Resampling experience is
therefore of great value in teaching what analytic methods are
good for, and how to use them correctly.
Consider these remarks by Alvan Feinstein,
professor of clinical biostatistics at Yale Medical
School who has spent decades trying to get medical
researchers to use statistics effectively:
"[E]laborate analyses have sometimes obscured rather than
clarified the scientific problems...[they cause people who might
otherwise ask hard questions to be] too baffled or awed by the
intricate mathematics to want to speak up" (1988, p. 476)... The
clinician, forgetting the importance of his own contribution to
the logic and data of the research, becomes mesmerized by what he
does not understand: the statistical analyses. He assumes that
the statistical computations will somehow validate the more basic
activities, rectifying errors in observation and correcting
distorted logic" (1970a, p. 143).
Feinstein also quotes with approval P. G. H. Gell:
"Mathematics has now matured into a sacred cow which as often as
not gets in the way of scientific traffic" (in Feinstein, 1970b,
p. 291).
Similar laments may be heard from wise students of the
research and decision-making process in such fields as business,
psychology, sociology, and economics, as well as biology.
6. After years of attempting to attract the interest of
technically-minded statisticians, and to satisfy them that
simulation is not intellectually inferior, resampling advocates
now hear thunder on the other side. The growing return to
teaching the philosophy and techniques of data analysis in the
schools as well as in universities is itself a welcome
development. An editorial in The American Statistician says "The
battle over whether more of the curriculum should be taught from
a data-analytic point of view has essentially been won"
(Scheaffer, 1990, p. 2). But as discussed earlier, this new
emphasis has led some to downgrade the importance of
probabilistic tools for determining the reliability of
descriptive statistics. The pendulum will surely swing back,
though, because probabilistic statistics is too fundamental to be
pushed to the side for very long.
To their credit, the professional associations - the
National Council of Teachers of Mathematics, and the American
Statistical Association - recognize the problem with the old
methods. They have even given their blessing to the resampling
simulation method, though still in a somewhat muted fashion. But
any innovation takes a long time to be adopted fully, and
educational innovations are perhaps the slowest of all; hence
resampling methods are still fully taught at relatively few
universities and high schools.
7. The "Shiny Toy" Difficulty. In connection with changing
paradigms in economics, Robert Solow comments: "You can only
replace one shiny toy with another" (in Klamer, 1983, p. 144).
He is addressing himself to the behavior of students, but the
remark is equally appropriate with respect to instructors, who
enjoy having shiny toys to present to their classes. Resampling
done by hand was a very dull occupation. But the computer makes
this method a much more shiny new toy, and may help overcome this
barrier to adoption.
[The Battle for Turf. There are two aspects of this barrier
- the battle for intellectual turf, and the battle to keep jobs.
This anecdote is telling: At a seminar to bio-statisticians at
the National Institutes of Health on December 14, 1993, one of
the audience said at the end of the talk: "This is a very
egalitarian method. But if users can understand everything
that's being done, and can create their own methods, what do they
need me for?" That, of course, is one of the main barriers to
the resampling method.]
CONCLUSION
Estimating probabilities with conventional formal mathematical
methods usually is so complex that the process scares many
people. And properly so, because the complexities frequently
cause error. The statistical profession has long expressed grave
concern about the widespread use of conventional tests whose
foundations are poorly understood. But the easy availability of
statistical computer packages that can perform conventional tests
with a single command, irrespective of whether the user
understands what is going on or whether the test is appropriate,
has recently exacerbated this problem. This has led to a call for
emphasizing descriptive statistics and even ignoring inferential
statistics.
Probabilistic analysis is essential, however. Judgments
about whether to allow a new medicine on the market, or whether
to re-adjust a screw machine, require more than eyeballing the
data to assess chance variability. But the conventional practice
and teaching of probabilistic statistics, with its abstruse
structure of mathematical formulas cum tables of values based on
restrictive assumptions concerning data distributions -- all of
which separate the user from the actual data or physical process
under consideration -- will not do the job. Resampling can do
the job. But there are large barriers against its widespread
adoption in everyday use. (See Chapter 00 on the future of
resampling.)
When will it be given the job?
REFERENCES
Barlow, Roger, Statistics (New York: Wiley, 1989).
Dallal, Gerard E., "Statistical Computing Packages: Dare We
Abandon Their Teaching to Others?", The American Statistician,
November 1990, Vol. 44, No. 4, pp. 265-266.
Dowdy, Shirley and Stanley Wearden, Statistics for Research,
Wiley & Sons, 1991.
Edginton, Eugene S., Randomization Tests (New York: Marcel
Dekker, 1980), referred to by Manly, 1991, p. 17.
Efron, Bradley and Tibshirani, Robert J., An Introduction to
the Bootstrap (New York: Chapman and Hall, 1993).
Einstein, Albert, Relativity (New York: Crown Press,
1916/1952).
Encyclopedia of Science, "Computers and Statistics," Vol. 2,
p. 95.
Feinstein, Alvan R., "Clinical Biostatistics - I", Clinical
Pharmacology and Therapeutics, Vol. 11, No. 1, 1970, pp. 135-148.
Feinstein, Alvan R., "Clinical Biostatistics - II:
Statistics Versus Science in the Design of Experiments", Clinical
Pharmacology and Therapeutics, Vol. 11, No. 2, 1970, pp. 282-292.
Feinstein, Alvan R., "Fraud, Distortion, Delusion, and
Consensus: The Problems of Human and Natural Deception in
Epidemiologic Science", The American Journal of Medicine, Vol 84,
March, 1988, 475-478.
Freedman, David, Robert Pisani, Roger Purves, and Ani
Adhikari, Instructor's Manual for Statistics (second edition)
(New York: Norton, 1991).
Gardner, Martin, Mathematical Carnival (New York: Vintage
Books, 1977).
Garfield, B. Joan, "Reforming the Introductory Statistics
Course," paper presented at the American Educational Research
Association Annual Meeting, Chicago, 1991.
Hayek, F. A., New Studies (in Philosophy, Politics,
Economics and the History of Ideas) (Chicago: The University of
Chicago Press, 1978).
Hey, John D., Data in Doubt: An Introduction to Bayesian
Statistical Inference for Economists (Oxford: Martin Robertson,
1983).
Hogg, Robert V., "Statistical Education: Improvements are
Badly Needed", The American Statistician, vol. 45, Nov. 1991,
pp. 342-343.
Hollander, Myles and Frank Proschan, The Statistical
Exorcist (New York: Marcel Dekker, 1984).
Hotelling, H., "The Teaching of Statistics," Annals of
Mathematical Statistics, 11, 1940: 457-72.
Kempthorne, Oscar, "The Teaching of Statistics: Content
Versus Form," The American Statistician, February 1980, vol. 34,
no. 1, pp. 17-21.
Littlewood, J. E., Littlewood's Miscellany, edited by Bela
Bollobas (New York: Cambridge, 1953/1986).
Manly, Bryan F. J., Randomization and Monte Carlo Methods in
Biology (New York: Chapman and Hall, 1991).
Moore, Thomas L., and Rosemary A. Roberts, "Statistics at
Liberal Arts Colleges," The American Statistician, May 1989, vol.
43, no. 2, pp. 80-85.
Mosteller, Frederick, Fifty Challenging Problems in
Probability (New York: Dover, 1965/1987).
Nagel, Ernest, and James R. Newman, "Goedel's Proof", in
Newman, 1956, pp. 1668-1695.
National Research Council (1989, p. 58, quoted from
Garfield, 1991, p. 5).
NCTM, 1989.
Peterson, Ivars, "Pick a Sample," Science News, July 27,
1991, pp. 56-58.
Polanyi, Michael, Knowing and Being, Edited by Marjorie
Grene (Chicago: The University of Chicago Press, 1969)
Polanyi, Michael, Personal Knowledge (Chicago: U of C
Press, 1962).
Polya, G., How To Solve It, A New Aspect of Mathematical
Method ( Garden City, New York: Doubleday & Company, Inc.,1957).
Ruberg, Stephen J., Biopharmaceutical Report, Vol 1, Summer,
1992.
Scheaffer, Richard L., "Toward a More Quantitatively
Literate Citizenry", The American Statistician, 44, February
1990, p. 2.
Simon, Herbert (1991, p. 284).
Singer, Judith D. and John B. Willett, "Improving the
Teaching of Applied Statistics: Putting the Data Back Into Data
Analysis," The American Statistician, August 1990, vol. 44, no.
3, pp. 223-230.
Singh, Jagjit, Great Ideas of Operations Research (New York:
Dover Publications, Inc. 1972).
Solow, Robert, in Klamer, Arjo, Conversations With
Economists (Rowman & Littlefield, MD 1983), p. 144.
Vaisrub, Naomie, Chance, Winter, 1990, p. 53.
Wallis, W. Allen, and Harry V. Roberts, Statistics: A New
Approach (Chicago: Free Press, 1956).
Westfall, Peter H., and S. Stanley Young, Resampling-Based
Multiple Testing (New York: Wiley, 1993).
OUT? IN FUTURE??
As a general matter, the law and the bureaucrats prevent
people from learning from the best teachers in the nation, hence
preventing intellectual progress and productivity gains in
education.
Examples: 1) In twenty-seven states, high school students
may not receive credit for courses taught by television. 2) In
Maryland, the state university may not beam lower-level
undergraduate programs to other institutions, including junior
colleges, a cartel-like scheme that would be illegal if
undertaken by private firms.
Even tougher are the informal barriers against presenting
the best teaching by the best minds through high-tech media -
video cassettes, computer tutorials, and "distance learning" via
tv. For example, at George Mason University, Dr. X finds that
the professors will not participate in "course-sharing" -- that
is, bringing in television programs from other universities.
All the talk, and all the commissions, intended to improve
education are a waste of time if the most important steps that
can be taken to improve education are stymied by organizational
and individual self-interest on the part of the education
establishment.
No one should be surprised at the existence of these
barriers to the production of the finest education. Put yourself
in the place of teachers, and it's easy enough to imagine
yourself not welcoming technical changes which will reduce the
demand for your services. Teachers are no different from weavers
in the eighteenth century, caboose-riding brakemen in the early
twentieth century, and industrial workers right now.
REFS
Bisgaard, Soren, "Teaching Statistics to Engineers", The
American Statistician, vol. 45, no. 4, November 1991, pp. 274-283.
Hall, Peter, The Bootstrap and Edgeworth Expansion (New
York: Springer-Verlag, 1992).
Hogg, Robert V., "Statisticians Gather to Discuss
Statistical Education," Amstat News, November 1990.
Watts, Donald, "Why is Introductory Statistics Difficult to
Learn? And What Can We Do to Make It Easier?", The American
Statistician, vol. 45, no. 4, November 1991, pp. 290-291.