PHI DELTA
THE PROFESSIONAL JOURNAL FOR EDUCATION
ONLINE ARTICLE
The Seventh Bracey Report on the Condition of Public Education
Illustrations © 1997 by Jem Sullivan
By Gerald W. Bracey
So much data, so little time! Mr. Bracey brings a researcher's skeptical eye to the events of a year in which educators found themselves awash in a deluge of new data.
LAST YEAR
I lamented the relative absence of new data bearing on the performance of
American schools. As if in response, the gates opened, and a deluge of new
studies poured forth. So much so that this year space precludes more than
mere mention of several interesting reports. I ignore these reports in the
knowledge that they are first-year accounts of multi-year studies and that
the more important data will be forthcoming later. For instance, the RAND
Corporation's evaluation of the activities of the New American Schools (formerly
the New American Schools Development Corporation), while interesting, will
provide much more information at a later date about whether these schools
are in fact able to "break the mold." Similarly, the U.S. Department
of Education and the Hudson Institute are both conducting large studies
of charter schools, and it is the future data that will tell us if these
schools are actually keeping the promise of improved achievement that earned
them their charters in the first place. Finally, more is likely to become
visible from Achieve, the entity constructed after the 1996 Education Summit,
which is housed by the National Governors' Association. In the meantime,
there still remains a flood of new data for our consideration.
Achieving comparability across international borders is even more problematic. The European nations are currently struggling to pare down a social services safety net that threatens to impede the progress of the European Union toward a single currency. For the purposes of FIALS, though, these safety nets mean that other nations' samples do not contain the extremes of poverty and illness that the U.S. sample does.
Sampling aside, imagine for a moment that you are
a 40-year-old high school graduate. You have probably not taken a reading
test in 22 years. You might have sat through some aptitude or personality
inventory for a would-be employer, but the tests that are so pervasive in
schools have been absent from your life for more than 20 years.
Now consider that the reading assessment before you looks unlike anything
you ever saw -- even during your school days. Some of the prose passages
resemble those on the achievement tests you took in school, but those assessing
what the testers called document and quantitative literacy mostly do not.
For instance, one item presents an entire ratings page from Consumer
Reports. It rates 19 clock radios in three categories, and much of
the page is filled with those colored-in circles and half-empty circles
and check marks and keys to advantages and disadvantages that have to be
matched up against the comments at the bottom of the page. I imagine that,
however well you read your morning paper, if you have never seen one of
these ratings pages, you would be rather intimidated by the sheer quantity
of unusual text and figures to deal with. Of course, most of the text on
the page is irrelevant to your task, but only the testers know that. You
do not.
Although the tasks are all "real-life" tasks, there is some question
about how well they assess people's "real-life" coping skills.
That is, will people taking a reading test with bus schedules on it be as
careful as they would be if they really had to catch a bus to a specific
destination by a certain time? In real life, they might well verify the
schedule with someone standing nearby, particularly if that someone seemed
to be a regular rider. If at home, a person might well not use the schedule
at all but call the company and ask for bus departure times within a certain
range. Many people no doubt possess the requisite skills that allow them
to cope with tasks by means other than reading a complex document.
The results from NALS have been pounced on by school critics -- who, for
some odd reason, tend also to be phonics fanatics. They claim that NALS
shows that many of our people can't read well. This statement is typical:
"In 1993 the NALS reported that 40 million American adults cannot read
or write. Another 50 million are functionally illiterate: that is, they
can only read at an elementary level or can write their names, but little
else."4 Such statements totally misrepresent NALS
and FIALS.
NALS and FIALS both place people in five levels of reading. Consider what
people at the lowest two levels can do. At level 1, people "performed
simple, routine tasks involving brief and uncomplicated texts and documents.
For example, they were able to total an entry on a deposit slip, locate
the time or place of a meeting on a form, and identify a piece of specific
information in a brief news article."5
At the next level -- the level referred to in the above quote as "functionally
illiterate" -- people can "locate information in text, make low-level
inferences using printed material, and integrate easily identifiable pieces
of information. Further, they demonstrated the ability to perform quantitative
tasks that involve a single operation where the numbers are either stated
or can be easily found in text. For example, adults at that level were able
to calculate the total cost of a purchase or determine the difference in
price between two items. They could also locate a particular intersection
on a street map and enter background information on a simple form."6
This is what people at levels 1 and 2 cannot do: "they were apt to
experience considerable difficulty in performing tasks that required them
to integrate or synthesize information from complex or lengthy texts or
to perform quantitative tasks that involved two or more sequential operations
and in which the individual had to set up the problem."7
This is not "Look, Dick. See Spot. See Spot run."
NALS and FIALS did not ask adults to read texts normally seen by, say, fifth-graders.
They used realistic adult reading tasks. Recall that U.S. 9-year-olds finished
second in the world in reading. In spite of that, one wonders how -- or
even what -- these children would do if confronted with the Consumer
Reports document mentioned above, one of whose tasks was classified
as only level 2. Given the complexity of the tasks in FIALS, it is little
wonder that most of the people who have difficulty with some of these tasks
report that they do not have problems coping with reading in their
daily lives.
FIALS claims that these people do have a reading problem and simply do not
recognize it. And some of the data suggest that this claim has some validity.
Contrasting the nature of the skills tested with the skills required in
daily life calls to mind the title of a David Berliner article, "Nowadays
Even the Illiterates Read and Write," a title Berliner borrowed from
a quote by the Italian novelist Alberto Moravia.
It is worth noting in connection with both NALS and FIALS that Berliner
analyzed just who the people in the lowest levels of NALS were. The lowest
levels contained 76% of all people in the study who were more than 75 years
of age, 67% of those who were physically or mentally impaired, 80% of those
who were visually impaired, 66% of those who were hearing impaired, 72%
of those who had learning disabilities, 72% of those who had a mental or
emotional difficulty, 79% of those with speech difficulties, and 70% of
those with long-term illnesses (six months or more). Twenty-five percent
of all those finishing at the lowest level were not born in this country.8
Of course, as one tests older and older adults, one is moving further and
further away from any results directly attributable to schooling. FIALS
acknowledges that "literacy is a fragile skill, one that requires continued
use."9 This wise view of reading stands in stark contrast
to the nutty notions put forward by some phonics enthusiasts, who suggest,
for example, that once children learn to decode they can read anything.
Finally, as regards the nature of NALS and FIALS, it would be wrong to think
that prose, document, and quantitative literacy represent three independent
skills or that the three scales are independent. The correlations between
them are quite high. Still, there are some differences among the countries
that suggest that the three concepts can be separated to some degree.
There are only seven or eight countries in FIALS, depending on whether you
count French-speaking and German-speaking Switzerland as one or two countries.
FIALS counts them as two. With these caveats and considerations in mind,
here is a summary of the FIALS results.
Prose literacy, as understood by FIALS, uses three
kinds of information-processing activities: locating, integrating, and generating
(which often involves making inferences). Illustrative passages consisted
of instructions on how to use medicines or assemble a bicycle or a description
of a particular kind of cultured plant. Document literacy reflected similar
activities for tables, schedules, charts, graphs, and maps. Quantitative
literacy added arithmetic operations into the mix. For example, a person
might be asked to determine the percentage of male teachers in Italy from
a chart that provides only the percentage of women teachers.
There is bad news and good news in the basic results. The U.S. has a higher
proportion of adults at prose level 1 (20.7%) than any other nation except
Poland (42.6%). The U.S. also has a higher proportion of adults at level
5 (3.8%) than any other country except Sweden (6.4%). Most countries do
not have even 1% of their readers at this highest level. The American proportion
at level 4 (17.3%) is exceeded only by Canada (20.0%) and Sweden (26.3%).
We tend to focus on the extremes, but, since the figures must add to 100%,
if the proportion of people at a given level is high, some other level must
perforce be low. And it is difficult to interpret the intermediate levels.
The results for document literacy resemble those for prose literacy except that the proportion of Americans at level 1 is higher than for prose (23.7%). Sweden again has the highest proportion of adults at level 5 (7.7%), and Canada also has a higher proportion of adults at level 5 than the U.S. (5.4% compared to 3.7%). This outcome might reflect American schools' emphasis on literature rather than on technical reading, although the U.S. is back in second place for quantitative literacy, with 5% compared to number-one Sweden's 8.5%.
Looking at the results by immigrant status is revealing. The U.S. has the highest proportion of native-born adults at level 1 for prose literacy (14.0%). Canada (12.9%) and Germany (12.3%) follow close behind. Other than Canada, the rest of the countries have considerably smaller proportions of native-born adults at low levels for document and quantitative literacy, again suggesting that the U.S. concentrates on literature.
The results for immigrants are even more dramatic, with 55.5% of U.S. immigrants scoring at level 1. Save for German-speaking Switzerland, no other nation comes close. Curiously, only Sweden and Canada have higher proportions of immigrants scoring at levels 4 and 5 on prose literacy than the U.S. But the U.S. has by far the lowest proportion of immigrants at levels 4 and 5 on document and quantitative literacy. (Because such small proportions of adults scored at level 5, aside from the basic results reported above, the FIALS analyses combine levels 4 and 5 into a single figure.) This might reflect how much our schools concentrate on literature, but it might also reflect the kinds of skills that other countries are looking for in immigrants.
That the selection of immigrants by skill plays a role in the scores is shown dramatically by the results from Canada. While 31% of Canada's immigrants scored at prose level 1, 26% scored at levels 4 and 5. Only 21.8% of Canada's native-born adults scored this high. This outcome clearly reflects Canada's dual immigration policy: a relatively open door on the one hand and aggressive recruiting of skilled workers on the other. This policy should be kept in mind when making any comparisons of educational outcomes in the two countries.
These results also reveal the general worthlessness of international test scores by themselves. Cultural, social, and economic contexts must be taken into account.
It is when we parse the reading scores by income that we see the condition with which I opened this discussion. The results appear in Table 1.

TABLE 1. Income by Prose Literacy Level in the U.S.

| Income Level | FIALS Level 1 | FIALS Level 2 | FIALS Level 3 | FIALS Level 4/5 |
|---|---|---|---|---|
| No income | 45.2 | 31.3 | 20.6 | 11.9 |
| Quintile 1 | 26.1 | 22.0 | 19.7 | 16.3 |
| Quintile 2 | 17.9 | 20.2 | 19.2 | 16.3 |
| Quintile 3 | 8.0 | 15.5 | 21.3 | 21.6 |
| Quintile 4 | 2.6 | 8.0 | 15.1 | 21.8 |
| Quintile 5 | 0.3 | 3.0 | 4.1 | 12.1 |

Source: Literacy, Economy, and Society: Results of the First International Adult Literacy Survey (Paris: Organisation for Economic Co-operation and Development, 1995). Values for the quintiles do not appear in the FIALS report. They were provided on request by the Census Bureau to Educational Testing Service, which relayed them on to me.

Note the relatively flat distribution of income across reading levels 4
and 5. Knowing how to read well will not guarantee you a living wage in
this country. Almost the same proportion of adults reading at levels 4 and
5 report no income as report income in the upper 20% of wage earners. But
not knowing how to read well will virtually guarantee you a life of poverty:
71.3% of American adults scoring at level 1 report either no income or an
income that puts them in the bottom 20% of all workers. Fewer than 3% of
those who score at level 1 find themselves in the upper 40% of wage earners.
For the year of the study, the upper bound of the first quintile was $6,400;
for the second, $14,560; and for the third, $23,000. This means that 89.2%
of those reading at level 1 earned $14,560 a year or less and that 97% of
all those reading at level 1 earned $23,000 a year or less.
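
For readers who want to check the arithmetic, the percentages just quoted can be rebuilt from the level 1 column of Table 1. What follows is only a sketch: the names `BRACKETS`, `LEVEL_1`, `share_at_or_below`, and `share_above` are my own labels for illustration, not anything that appears in the FIALS report.

```python
# Percentage of U.S. adults at prose level 1 in each income bracket (Table 1).
# Brackets run from "no income" up through the top quintile.
BRACKETS = ["No income", "Quintile 1", "Quintile 2",
            "Quintile 3", "Quintile 4", "Quintile 5"]
LEVEL_1 = [45.2, 26.1, 17.9, 8.0, 2.6, 0.3]


def share_at_or_below(column, bracket):
    """Percentage of a literacy level's adults in `bracket` or any lower one."""
    last = BRACKETS.index(bracket)
    return round(sum(column[: last + 1]), 1)


def share_above(column, bracket):
    """Percentage of a literacy level's adults in brackets above `bracket`."""
    last = BRACKETS.index(bracket)
    return round(sum(column[last + 1:]), 1)


if __name__ == "__main__":
    # No income or bottom quintile
    print(share_at_or_below(LEVEL_1, "Quintile 1"))
    # At or below the second quintile's upper bound ($14,560)
    print(share_at_or_below(LEVEL_1, "Quintile 2"))
    # At or below the third quintile's upper bound ($23,000)
    print(share_at_or_below(LEVEL_1, "Quintile 3"))
    # Upper 40% of wage earners
    print(share_above(LEVEL_1, "Quintile 3"))
```

The four printed values correspond, in order, to the 71.3%, 89.2%, 97%, and "fewer than 3%" figures in the text.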
If President Clinton gets his education plan fully enacted, a lot more highly
skilled readers are going to show up in those lower income quintiles, because
most of the jobs he likes to say that he has created fall in the low-paying
service sector. His plan would increase the supply of good readers (perhaps)
when the demand for them is already not that high (recall that the Second
Bracey Report found 26% of college graduates taking jobs that require no
college). Business and industry must love Clinton's plan because it is a
great way to depress the wages of skilled labor.
While the patterns for most other nations are similar, in none of them is
the separation of income by reading level as stark as in the U.S. Indeed,
some countries show some dramatically different results. In Sweden, for
example, 41% of readers at prose level 1 are in the top two quintiles of
income, compared to 55% of readers at prose levels 4 and 5. Sweden does
not distribute its wealth in the same way that we do. On the other hand,
41% of German readers at prose levels 4 and 5 report either no income or
income that puts them in the bottom quintile. Whether this means that Germany
is less far advanced along the "information highway" than the
U.S. or simply reflects that nation's severe, long-standing recession, or
both, is not clear.
It is probably a combination. A segment of National Public Radio's "Marketplace"
that was broadcast on 4 June 1997 noted that Mercedes and BMW, both of which
have recently built factories in the U.S., are taking the label "Made
in Germany" off their cars. According to one German interviewee, that
label now means "high costs, 100-year-old technology, and lousy service.
'Made in Germany' is no longer a boast. It's a curse." Another German
commentator observed that, if Bill Gates were German, "Bill Gates would
still be a mid-level technician in Siemens. Microsoft would never have happened."
But, as the American Federation of Teachers never tires of pointing out,
a lot more German kids pass the Abitur than American kids pass Advanced
Placement exams. I wonder what happens to them as a result. The AFT has
been silent on this.
Some data in FIALS I do find disturbing. Save for Poland, the U.S. has by
far the largest proportion (23.5%) of level 1 readers between the ages of
16 and 25. Canada has 10.7% of its young adults at level 1, and the rest
of the nations have less than 10% of young adults at this level. With the
exception of Poland again, the U.S. has the lowest percentage of readers
in the 16-25 age range at levels 4 and 5 (12.8%). Between the ages of 26
and 35, 21.6% of American readers are at levels 4 and 5, which compares
favorably with all countries except Sweden at 41.7%. Between the ages of
36 and 45, 29.2% of American readers are at levels 4 and 5, a figure exceeded
only by Canada (31.3%) and Sweden (31.7%). Between the ages of 46 and 55
and 56 and 65, the U.S. proportions at levels 4 and 5 (23.8% and 14.7%)
are surpassed only by those of Sweden (28.2% and 16.2%).
The American age-level pattern differs from that of most countries. It might reflect
a combination of our higher dropout rate and our higher college attendance
rate for those who graduate. Still, even taking into account immigration,
poverty, and the complexity of the FIALS definition of level 1 literacy,
it is disturbing to find so many younger Americans reading at so low a level.
TIMSS
The Third International Mathematics and Science Study was the big hit of
the year. Most educators who hold press conferences consider themselves
lucky if they draw 25 people. Some 300 showed up for the first release of
TIMSS data in November 1996. The number of TV cameras present made the event
seem more like the streets of Los Angeles on Academy Awards night.
For the record, I repeat the major finding of eighth-grade
test results: American eighth-graders placed 25th of the 41 nations in mathematics
and 19th of 41 nations in science. These facts and numerous others have
been discussed in my Research columns for March, April, and June, and I've
made mention of several new findings from TIMSS in connection with the awards
given to Linda Kulman and Pat Wingert (see the appropriate sidebars).
The press almost uniformly pronounced the eighth-graders' performance "mediocre,"
forgetting that "average" is a statistic and "mediocre"
is a judgment that might or might not be accurate. As I noted above, there
are about 30 "mediocre" nations, because the scores of most countries
were closely bunched -- and close to the scores of the U.S.
The U.S. media universally considered the high finishes of the Asian participants
as something to worry about and strive to match. Only The Economist
managed to observe an important point.
Just as western countries are busy seeking to emulate Japanese schools, schools and universities in Japan are coming under pressure from employers to turn out workers with the sort of creativity and individuality that the Japanese associate with western education. And just as American and British politicians are demanding that schools copy their more successful oriental counterparts and set their pupils more homework, the South Korean government is telling schools to give pupils regular homework-free days, so they can spend more time with their families -- just like western children. Perhaps in education there is such a thing as a happy medium.10
The eighth-grade data might have been ho-hum, but
the fourth-grade results were so upbeat that President Clinton made a last-minute
decision to announce them himself. If one considers only the countries that
met all the TIMSS sampling criteria, then U.S. fourth-graders were seventh
of 17 nations in math. If one adds those countries that violated one or
more of the requested sampling procedures, then American students finished
12th of 26. In science, the corresponding numbers are third of 17 and
third of 26.11
If one considers the results in terms of statistically significant differences,
as several reports evidently did, then seven countries had significantly
higher scores in mathematics than the U.S., and five of these seven met
all the sampling criteria. In science, only Korea had a significantly higher
score, which apparently led some reporters to give the U.S. finish as second
place. Japanese fourth-graders also finished ahead of the U.S., however,
with 70% of the science items correct versus 66% for the U.S., but the difference
was not significant.
I prefer to work with the raw percentage correct rather than with the scaled
scores most commonly used in TIMSS documents. While I understand the need
for statisticians to take account of sampling errors and other factors,
these statistical machinations bother me, as does spreading the results
over a 600-point scale -- a scale that makes tiny differences look big.
And some simply odd things happen (statistically explainable, but odd on
their face). For instance, in fourth-grade science, Korean students averaged
4% more items correct than Japanese students, who, in turn, averaged 4%
more items correct than American students. On the scaled scores, though,
the distance between Japan and the U.S. is only nine points, while the distance
between Japan and Korea is 23 points. And the scaled score for the Netherlands
is nine points below that of the U.S., even though Dutch students averaged
1% more items correct than did American students. Aside from anything I
might write for statistics journals, I'll stick with percentage correct.
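
The oddity can be put in numbers. The sketch below uses only the gaps quoted in this paragraph, which are themselves rounded. The names `COMPARISONS`, `scale_points_per_percentage_point`, and `measures_agree` are mine, and the ratio is a rough illustration, not a statistic that TIMSS reports.

```python
# Each entry: (country ahead on items correct, comparison country,
#              gap in percentage points of items correct,
#              gap in scaled-score points), all as quoted in the text.
COMPARISONS = [
    ("Korea", "Japan", 4, 23),
    ("Japan", "U.S.", 4, 9),
    ("Netherlands", "U.S.", 1, -9),  # more items correct, lower scaled score
]


def scale_points_per_percentage_point(pct_gap, scale_gap):
    """Scaled-score points between two countries per percentage point correct."""
    return scale_gap / pct_gap


def measures_agree(pct_gap, scale_gap):
    """True when both measures put the same country ahead."""
    return (pct_gap > 0) == (scale_gap > 0)


if __name__ == "__main__":
    for ahead, other, pct_gap, scale_gap in COMPARISONS:
        ratio = scale_points_per_percentage_point(pct_gap, scale_gap)
        agree = measures_agree(pct_gap, scale_gap)
        print(f"{ahead} vs. {other}: {ratio:+.2f} scale points "
              f"per percentage point, measures agree: {agree}")
```

The same four-point gap in items correct is worth 23 scale points in one comparison and nine in the other, and in the Dutch comparison the two measures disagree about which country is ahead.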
The fourth-grade results caused the President to take to the Rose Garden
to declare that we didn't have to settle for second-class standards and
that the national goal of being first in the world in math and science by
2000 was at least a possibility at some grade levels. While cheering the
results overall, Clinton still emphasized that the U.S. was the only country
that slid from above-average in math at the fourth-grade level to below-average
at the eighth-grade level. Unfortunately, he described the slide in words
that suggested that the fourth-graders and the eighth-graders were the same
youngsters, which sent TIMSS personnel scrambling to explain that the two
grades were tested at the same time. Some observers implied that, since
recent education reforms had been in place for the full four years for the
fourth-graders, they might have benefited more than eighth-graders, who
got their school start before some of the reforms. The implication was that,
if we came back with a FIMSS four years from now, this cohort of fourth-graders
would show better. Of course, this assumes that all other countries would
run in place, which they won't. As one post-TIMSS paper revealed, no matter
what its TIMSS score, virtually no country is satisfied with its math and
science curriculum.12
I am stunned by the TIMSS fourth-grade science results and not necessarily
cheered by them. The word I have used most often to describe elementary
science instruction in this country is "haphazard." And I am not
alone in this view. In an unscientific survey, I asked the people I was
speaking with on the phone what they thought of elementary science instruction
in the U.S. "Hit or miss" was one answer; "flaky" was
another. And why not? Analyzing a school reform effort in a wealthy Boston
suburb, Richard Murnane of Harvard University and Frank Levy of MIT found
that some teachers there believed that we see by emitting light through
our eyes. (I discuss this reform effort in more detail below.) On NPR's
"All Things Considered," TIMSS Director Albert Beaton of Boston
College stated that at least some countries don't make much of science in
the early grades. We may be measuring their neglect more than our own success.
Still, the percentage of correct answers given by American students was
pretty good, and the sample questions don't look like set-ups.
People have offered various reasons for the "slide" from fourth
grade to eighth grade in mathematics. I would contend that it's primarily
because mathematics instruction stops for most students after grade 4 or
5. The TIMSS people found that U.S. students were still reviewing arithmetic
in eighth grade, but I think that, for most kids, the review kicks in even
earlier. I have only circumstantial evidence for this contention, but it
certainly seems to fit. The schools in the First in the World Consortium
score high at both grades (these results are discussed in more detail below).
Consortium students get a challenging curriculum, and more than half of
them take algebra as eighth-graders, compared to about 15% of students nationwide.
The Consortium students provide no evidence of a "slide," and
that makes sense, given what they are studying.
The Asian nations, especially Japan, are often portrayed as trying to make everyone look alike in terms of achievement. "The nail that stands up gets hammered down" is the oft-repeated Japanese saying. Thus it is interesting that the variabilities of Asian nations are not that much smaller than the variability of Americans at grade 4, and, for the most part, this variability increases from grade 4 to grade 8 to become larger than the U.S. variability in mathematics. Only Singapore reduced its variability from grade 4 to grade 8. (I have already alluded to nonschool factors that increase the scores in Singapore; one would imagine that these factors would also serve to reduce variability.)
In math, the standard deviation of fourth-graders
in the U.S. was 85 points on the 600-point scale; in Japan, 81; in Korea,
74; in Hong Kong, 79; and in Singapore, 104. The standard deviation in math
of eighth-graders in the U.S. was 91; in Japan, 102; in Korea, 109; in Hong
Kong, 101; and in Singapore, 88. In science, the standard deviation of fourth-graders
in the U.S. was 95; in Japan, 73; in Korea, 68; in Hong Kong, 79; and in
Singapore, 97. The standard deviation in science of eighth-graders in the
U.S. was 106; in Japan, 90; in Korea, 94; in Hong Kong, 89; and in Singapore,
95.
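
These standard deviations are easier to compare when set side by side. The sketch below tabulates the change from grade 4 to grade 8 and lists the Asian participants whose eighth-grade spread exceeds that of the U.S. The figures are the ones given above; the names `SD`, `grade_change`, and `wider_than_us_at_grade_8` are illustrative choices of mine.

```python
# Standard deviations on the TIMSS scale as (grade 4, grade 8), from the text.
SD = {
    "math": {
        "U.S.": (85, 91), "Japan": (81, 102), "Korea": (74, 109),
        "Hong Kong": (79, 101), "Singapore": (104, 88),
    },
    "science": {
        "U.S.": (95, 106), "Japan": (73, 90), "Korea": (68, 94),
        "Hong Kong": (79, 89), "Singapore": (97, 95),
    },
}


def grade_change(subject):
    """Change in standard deviation from grade 4 to grade 8, by country."""
    return {country: g8 - g4 for country, (g4, g8) in SD[subject].items()}


def wider_than_us_at_grade_8(subject):
    """Countries whose grade 8 standard deviation exceeds that of the U.S."""
    us_g8 = SD[subject]["U.S."][1]
    return [country for country, (_, g8) in SD[subject].items()
            if country != "U.S." and g8 > us_g8]


if __name__ == "__main__":
    for subject in SD:
        print(subject, grade_change(subject))
        print(subject, "wider than U.S. at grade 8:",
              wider_than_us_at_grade_8(subject))
```

In both subjects only Singapore's change is negative. In mathematics, Japan, Korea, and Hong Kong all end up with a wider eighth-grade spread than the U.S., while in science none of the four does.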
One would expect that the U.S., which by conventional wisdom champions individual
differences and tracks students, would show increasing variability over
time. But three of the four Asian nations also show increasing variability.
Indeed, the standard deviations for those three countries are larger for
mathematics at the eighth grade than is the standard deviation in the U.S.
We have so little data on these countries that we cannot even conjecture
what is happening, except to say that it runs counter to the conventional
wisdom about education in those nations. There's a hint of a possible explanation
for Japan in an article by Susan Goya, an American who has taught extensively
in Japanese schools. Goya contended that the curriculum in Japanese high
schools is tough and unbending and that at the advanced levels up to 95%
of the students don't know what is going on. Kazuo Ishizaka made a similar
point in differentiating between the presented curriculum in Japan and the
attained curriculum.13 But both were addressing high school
phenomena. Whether or not the process has begun in the middle school is
not clear. A spot check of five other nations finds similar increases in
variability in Australia, Austria, Hungary, and Norway, but a reduction
from fourth to eighth grade in Portugal.
Various TIMSS commentators and others looking at the TIMSS data have stated
that TIMSS results reveal something close to world consensus about what
a math curriculum should contain and less consensus on the curriculum for
science. (They've also pointed out that the U.S. math curriculum is not
like that of the rest of the world.) On the other hand, TIMSS Director Beaton
reminded the audience at a TIMSS symposium of the old saying that there
are two things one doesn't want to watch being made: sausage and legislation.
Beaton suggested that an international test might be a third. At a 1991
symposium on international comparisons, Tjeerd Plomp, an IEA official, sighed,
"We can only hope that the tests are equally unfair to everyone."
In 1997, Beaton restated Plomp's longing.
TIMSS did attempt to determine whether the test/curriculum mismatches affected
performance. People in all countries reviewed items to determine if they
were taught in that nation's curriculum. This review does not appear to
have constituted a major TIMSS activity, and the degree of accuracy of these
ratings cannot be known. TIMSS then looked to see how well a country did
on items selected as covered by other countries' curricula and how well
the other countries did on items selected as covered by the target country's
curriculum. Overall, there is little variation in performance in either
case.
For instance, Scottish students scored at the international average in science
(56%) on the actual TIMSS test. If we look at the performance of Scottish
students on items selected by other countries as covered by those countries'
curricula, the Scottish percentage correct ranges from a low of 53% on items
chosen by French-speaking Belgium to a high of 59% on items picked by Cyprus.
Similarly, using only the subset of items selected by Scotland, other countries
did not vary by more than one percentage point from their average on all
items. (I did not use the U.S. for this analysis because we selected all
the items, apparently as a statement of faith in the TIMSS endeavor.)
The similarity of performance among countries
on any subset of items holds up at the fourth-grade level for mathematics,
but there are some anomalies in science. All countries perform somewhat
better on the item sets chosen by Hong Kong and Scotland and a lot better
on the item set chosen by Ireland. For instance, American fourth-graders
got 66% of the items right on the whole test but 78% of the items right
using the Irish subset. This differential is typical across the 26 countries.
The above results seem to indicate that the TIMSS tests were "equally
unfair" across nations. However, the data presented for sample items
appear to refute the notion that there is a consensus curriculum. The "p-value"
of an item is the proportion of students who got the item right. Not only
do the p-values vary wildly across the various areas, they vary wildly across
countries. For instance, consider this item from the fourth-grade math test.
[Figure not reproduced]
What fraction of the figure is shaded?
a) 5/4
b) 4/5
c) 6/9
d) 5/9
Eighty percent of American fourth-graders got this item right, compared
to 89% of Japanese, 92% of Koreans, 94% of Singaporeans, and 96% of fourth-graders
in Hong Kong. Students in the Czech Republic finished just behind the Asian
nations in total score, but only 43% of Czech students got this item right.
In Norway, only 25% of the students chose the correct answer, but that was
better than in Portugal, where just 16% of the students picked the correct
option. Why?
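
For readers unfamiliar with the term, here is a minimal sketch of how a "p-value" in this sense is computed and how far the reported values for this one item spread. The five-student response list in the test of `p_value` is hypothetical and serves only to show the calculation; the country percentages are the ones just quoted, and the function names are mine.

```python
def p_value(responses):
    """Proportion of students answering correctly (1 = right, 0 = wrong)."""
    return sum(responses) / len(responses)


# Percentage correct on the shaded-fraction item, as quoted in the text.
FRACTION_ITEM = {
    "U.S.": 80, "Japan": 89, "Korea": 92, "Singapore": 94,
    "Hong Kong": 96, "Czech Republic": 43, "Norway": 25, "Portugal": 16,
}


def spread(percentages):
    """Gap between the highest and lowest percentage correct."""
    return max(percentages.values()) - min(percentages.values())


if __name__ == "__main__":
    # Hypothetical classroom of five students, four of whom answer correctly.
    print(p_value([1, 1, 1, 1, 0]))
    # Range across the eight countries cited for this item.
    print(spread(FRACTION_ITEM))
```

On this single item the countries cited are separated by 80 percentage points, far more than their overall scores would lead one to expect.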
Or consider this item, a similar question but with
the answer alternatives presented as decimals.
[Figure not reproduced]
Which number represents the shaded part of the
figure?
a) 2.8
b) 0.5
c) 0.2
d) 0.02
On this item, American fourth-graders fell to levels as low as Czech fourth-graders
(32% and 31% correct respectively). More interesting and more problematic
is the fact that in eight countries the proportion of correct answers declined
from third grade to fourth. (While the results are usually reported for
grades 4 and 8, there are also data from grades 3 and 7 in the TIMSS documents.)
On yet another proportionality item, a tough one where the international
average was only 21% correct, 25% of the students in the Czech Republic
got the item right, as did 25% of American kids, but only 11% of the students
in high-scoring Hong Kong aced it.
In a data representation item, students were shown a graph depicting the number of cartons of milk sold in school each day for a week. Part one of the question asked how many cartons were sold on Wednesday; the second part asked how many were sold for the week. In both parts, students had to produce the answer. Ninety-six percent of Korean students got the first part right, and 73% could add up the number of sales for the week. In Japan, though, while 94% of the students could pick how many were sold on Wednesday, only 32% could accurately find the sum for the week. American kids showed a similar drop, from 90% correct to 57%.
Such apparently capricious differences in items
from country to country and the capricious changes from third grade to fourth
do not seem to betoken a consensus "world curriculum" in math.
For the statistically oriented reader, I say this: the point-biserial correlation
coefficient between the "p-value" of items and the countries'
average percentage correct would be low for a number of items.
The data for eighth-graders appear more regular (from an "eyeball"
analysis), but instances of capricious variation within a country across
items of the same type do occur, and there are items for which some countries
show substantial increases from seventh to eighth grade, while others show
no gains or even declines in the percentage of students getting the items
right.
If anything, the items in science, where there is supposedly no consensus
curriculum, behave more as if there were than do the items in mathematics.
But items vary oddly here, too. For instance, one item shows a plain bounded
on two sides by mountains. A river flows through the plain, and a farm sits
near the river. Part one of the item requires students to give a good reason
for putting a farm in that location; part two asks for a bad reason. The
item was used in all four grades tested. At the third and fourth grades,
Korea (81%, third; 91%, fourth), the U.S. (66%, 83%), and Singapore (64%,
78%) had the highest percentage of correct answers to the first part of
the question. When asked for a bad reason, though, third- and fourth-graders
in England and Ireland scored best, an advantage these nations maintained
in the seventh and eighth grades as well, even though a smaller proportion
of Irish eighth-graders than Irish seventh-graders got the item right.
On this item, Hong Kong students showed an appreciable gain between third
and fourth grades, but not from fourth to seventh or eighth (45%, 65%, 65%,
70%). Of course, Hong Kong has no rivers that flood, and farms are not common
to the experience of most children there, although there are farms beyond
the mountains that separate the Kowloon Peninsula from the much larger area
of mainland Hong Kong.
On an item asking why mountains can have snow on their tops when the snow
on the lower elevations has melted, students in countries with snow-covered
mountains did better than students in countries without them. (The poor
performance of Austrian students was an anomaly.) TIMSS item selection techniques
are designed to weed out science items that are affected by cultural or
geographic features or by climatological considerations (e.g., items on
seasons), but they seem not to have been wholly successful.
The TIMSS data serve another purpose -- unintended, but useful. In the Second
Bracey Report, I criticized the findings of Harold Stevenson and his colleagues
at the University of Michigan on the grounds that the samples, at least
those that Stevenson bothered to describe, were neither representative of
their respective countries nor comparable to one another.14
I also outlined a number of nonschool factors that would bias results in
favor of Asian nations (e.g., fewer siblings, more parents and grandparents
living with the children).
There was no direct way of showing how biased the Asian samples were, but
consider this. Stevenson administered math tests to five high schools in
Fairfax County, Virginia, a diverse but mostly affluent suburb of Washington,
D.C.15 (Fairfax currently contains 148,000 students with
36% minority, 20% non-native English speakers, and 17% eligible for free
or reduced-price lunch.) Only 7% of the students in the highest-scoring
regular high school in Fairfax County (not counting selective and specialized
Thomas Jefferson High School for Science and Technology) scored as high
as the average student in the highest-scoring school in Sendai, Japan; only
1% scored as high as the average student in the best school in Taipei; only
about one-half of 1% scored as high as the average student in Beijing's
best regular high school (not counting a special school for math prodigies).
The scores of the best schools in the three Asian locales did not greatly
exceed a number of other schools there.
Students in the highest-scoring regular Fairfax County high school scored
at the 84th percentile on a commercial achievement test and at 591 on the
SAT math section. In 1995, an SAT math average of 591 placed the average
student in this high school at the 81st percentile nationally. Could Japanese
students really be scoring that much better? I didn't think so but had no
way of finding out.
Even at the Thomas Jefferson High School for Science and Technology, only
10% of the students scored as high as the average student in the best school
in the Taiwanese sample, and only 5% attained the scores of the average
student in Beijing's best regular high school. To me, this did not make
sense. After all, in 1995 the average student at Thomas Jefferson scored
720 on the SAT math section -- on the old scale.
The data from TIMSS make clear that, while the Japanese students maintain
a sizable edge over U.S. students in mathematics -- though not in science
-- their dominance is not nearly as great as Stevenson's results would suggest.
About 24% of American fourth-graders scored as high as the average Japanese
student, but only about half that many scored as high at the eighth-grade
level. (Stevenson's data were from grades 1 and 5.)
Additional evidence that Stevenson's sample overestimated Japanese attainment
can be found by comparing the TIMSS results for the First in the World Consortium
to the TIMSS results for the various nations.16 The Consortium
is a group of 20 suburban Chicago districts that paid to have their children
take the TIMSS tests. Their average eighth-grade math score was 585, compared
to Japan's 605, a difference that was not statistically significant. (I
report these results in scaled scores because the Consortium documents do
not show percentage of correct answers.) At the fourth-grade level, third-ranked
Japan scored 597 in math; the fourth-ranked Consortium, 591. In Stevenson's
Fairfax County study, that wealthy suburb, even at its specialized high
school, looked awful against supposedly representative schools in Asian
nations. In TIMSS, the suburbs hold their own. Indeed, in science, the Consortium
eighth-graders finished second, and no country had a score that was significantly
higher; at the fourth-grade level, the Consortium outscored all nations
in science, and only second-ranked Korea's score was not significantly lower.
These results make sense.
Stevenson had also compared Asian schools to a sample of 20 Chicago-area
schools, the latter chosen to represent all socioeconomic strata in the
city.17 Yet in a comparison of two grades and two types
of mathematics problems, only once did the average of the best school in
Chicago surpass the average of the worst school in Beijing. In view of TIMSS
data, even though China did not participate, I'd say this doesn't make sense
-- unless the Stevenson samples are not representative.
The performance of the First in the World Consortium raises questions about
generalizations that the TIMSS staff and the U.S. Department of Education
staff made about the American curricula, especially in mathematics. The
math curriculum was characterized as "a mile wide and an inch deep."
Yet the Consortium eighth-graders finished fifth in math, and the fourth-graders
finished fourth. At both grade levels only Singapore had a significantly
higher score. If the American math curriculum is superficial, how could
this be? Like most generalizations about American schools, this one doesn't
hold up well to scrutiny. In science, the Consortium eighth-graders were
second, and no one had a significantly higher score; the fourth-graders were first, and
only Korea did not have a significantly lower score.
The performance of the Consortium is consistent with the earlier math results
from IAEP-2: the top third of American schools finished one point behind
top-ranked Taiwan and one point ahead of second-ranked Korea (these are
NAEP scale points; a report linking TIMSS and NAEP is forthcoming).18
The bottom third of American schools finished below last-place Jordan. There
simply is no "American math curriculum."
New Data: Domestic
The NAEP mathematics scores rose for the third assessment in a row. Ho hum.
At least, that's how the media reacted. Nothing showed up in the Washington
Post or the New York Times. Even Education Week minimized
the results. "More American students have upgraded their math skills,"
according to the page-one story, "but most still lag behind world class
standards."19
ACT (American College Testing program) scores rose for the third year in
a row. Ho hum. The New York Times gave one-fourth of a column to
the story under the heading "National News Briefs." The Washington
Post remained silent.
An early warning signal has arrived from Iowa. Trend lines for the Iowa
Tests of Basic Skills (ITBS) in the state of Iowa and in the nation have
risen and fallen in synch since the ITBS was radically revised in 1955.
Thus it is a little alarming to note that the ITBS scores in Iowa have fallen
the last two years, slipping from the record highs they attained in the
late 1980s.20 National data, which arrive only during the
years when the ITBS is renormed, are not yet available. Why might scores
be falling now? H. D. Hoover, director of the Iowa Testing Programs at the
University of Iowa, thinks the reason is that we're paying too much attention
to self-esteem and not enough to academics. Hoover stresses that this is
only his opinion, but his is certainly an opinion worth listening to.
Education and the Economy
As I noted in giving the award to the OECD, most people now realize that
the link between education and the health of a developed nation's economy
is a loose one at best. Lousy schools and all, the U.S. economy continues
to soar while the economies of Japan and Germany struggle, mired for years
in their worst recessions since World War II. Conditions look grim for South
Korea and its high TIMSS scorers as well. American unemployment has fallen
to a low that economists once thought was impossible -- until it happened.
(Remember, an economist is a person who can explain tomorrow why the predictions
he made yesterday didn't come true today.) According to NPR's "All
Things Considered" broadcast of July 29, workers are so scarce in Iowa
that Iowa businessmen are organizing recruiting trips to Texas.
Some new data do suggest that the productivity advantages reported in these
pages and elsewhere might have been somewhat misleading. As shown in Education
Indicators: An International Perspective, U.S. productivity has exceeded
productivity in Europe and Japan since measures began in 1961.21
Other nations have been gaining. What else would we expect as the world
rebuilt after World War II? But as of 1991, only France had approached the
U.S. level.
New data from the Bureau of Labor Statistics (BLS) indicate that, using
the traditional measure, in 1995 the U.S. maintained its advantage over
the 12 other countries studied.22 The technique arbitrarily
sets U.S. productivity at 100 and then calculates the productivity of other
countries from their Gross Domestic Products (GDPs) using Purchasing Power
Parities (PPPs), a formula for determining how much comparable goods and
services cost in various countries. However, the BLS suggests that the traditional
representation might not be the fairest. Its two new analyses include comparisons
using only employed people and productivity per hour for employed people.
Unemployed people don't contribute to the GDP, and Europe has lots of unemployed
people these days -- 13% overall as of early August 1997.
Looking at productivity per employed person, most countries gain on the
U.S., though nearly all still trail it, while Japan falls even further back.
Using two formulas for calculating PPPs, the BLS finds that only Belgium's
productivity exceeds that of the U.S., while France and Italy come close.
People who might be scratching their heads over Italy should forget the
Italy depicted in movies and think about the highly populous, highly prosperous,
highly industrialized, and highly productive northern region. Think Milan,
not Naples.
If we move from productivity per employed person to productivity per employed
person per hour, Japan recedes even further, but three of the other five
countries for which data exist surpass the U.S. Setting U.S. productivity
at 100 in these computations, Japan finishes at 69.9, Sweden at 84.2, France
at 104.6, Germany (only that which is the former West Germany) at 106.7,
and Norway at 113.1. The BLS warns that, although it considers output per
hour the best productivity measure, it is also the hardest to measure accurately.
Most countries can't provide such data, and the BLS states that, even for
those countries it considered reliable enough to put into its publication,
"comparative figures on GDP per hour should be viewed with a greater
degree of caution than the figures on GDP per employed person."23
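The per-hour comparison can be checked directly from the index values quoted above. The sketch below assumes nothing beyond those five figures and the convention that the U.S. is set at 100; the names are illustrative, and the BLS caution about per-hour data applies to whatever comes out.

```python
# GDP per hour worked, indexed so that the U.S. equals 100 (figures as quoted).
US_BASELINE = 100.0

gdp_per_hour = {
    "Japan": 69.9,
    "Sweden": 84.2,
    "France": 104.6,
    "Germany (former West)": 106.7,
    "Norway": 113.1,
}

# Countries whose index exceeds the U.S. baseline.
above_us = [c for c, v in gdp_per_hour.items() if v > US_BASELINE]

# Distance from the U.S. in index points, rounded to one decimal place.
gap_from_us = {c: round(v - US_BASELINE, 1) for c, v in gdp_per_hour.items()}

print(above_us)
print(gap_from_us)
```

Three of the five countries land above the baseline, which matches the count given in the text.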
The New Basics Redux
The first time I recall seeing the term "new basics" tossed around
was in A Nation at Risk, and, with the exception of computer science,
there was nothing new about them. They were simply a call for more: more
English, more math, more science, more social studies. A few years later,
Jerry Brown, former governor of California, introduced his "new basics,"
which consisted of "the three C's": communication, computation,
and something else I can't remember. Now come Richard Murnane and Frank
Levy with their six "new basics," only two of which are any different
from the earlier lists.24 To those lists, Murnane and Levy
add "the ability to solve semistructured problems where hypotheses
must be formed and tested" and "the ability to work in groups
with persons of various backgrounds."
Murnane and Levy start from a troublesome statistic that indeed requires
our attention. In 1979, a 25- to 34-year-old male with a high school diploma
did earn considerably less per year than a comparable male with a college
degree: $27,000 as opposed to $32,000. By 1993, however, the gap (in constant
dollars) had become a chasm: $20,000 as opposed to $31,000.25
Given the plummeting real wages of people with high school diplomas, Murnane
and Levy argue that a diploma no longer unlocks the door to the middle class.
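The size of the change is worth working through. The sketch below uses only the four constant-dollar figures quoted above; the dictionary layout and function names are illustrative assumptions, not anything taken from Murnane and Levy.

```python
# Annual earnings of 25- to 34-year-old males in constant dollars, as quoted.
wages = {
    1979: {"high_school": 27_000, "college": 32_000},
    1993: {"high_school": 20_000, "college": 31_000},
}


def gap(year):
    """Dollar gap between college and high school earnings in a given year."""
    return wages[year]["college"] - wages[year]["high_school"]


def premium(year):
    """College earnings premium as a fraction of high school earnings."""
    return wages[year]["college"] / wages[year]["high_school"] - 1


def change(group):
    """Fractional change in a group's earnings from 1979 to 1993."""
    return wages[1993][group] / wages[1979][group] - 1


for year in (1979, 1993):
    print(year, gap(year), f"{premium(year):.1%}")

print(f"high school: {change('high_school'):.1%}")
print(f"college: {change('college'):.1%}")
```

On these figures the gap more than doubles, from $5,000 to $11,000, and nearly all of the widening comes from the fall in high school graduates' earnings rather than from any rise in college graduates' earnings.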
What to do? Well, we could send everyone to college. This is President Clinton's
notion. Murnane and Levy, though, argue that a more effective solution is
to alter the curriculum of high schools to teach more of the skills employers
say they need. Murnane and Levy point out not only that the higher-paying
jobs are going to college grads, but also that there is a stronger relationship
today between performance in high school and later earning power even for
those who do not go on to higher education. (Conventional wisdom holds that
employers don't pay any attention to high school performance. Nonetheless,
higher performance in high school already has an impact on later earnings.)
Murnane and Levy accept too much at face value the NAEP performance levels
that have been rejected in many quarters as political tools designed to
maintain a sense of crisis about schools. They use these NAEP results to
contend that many high school graduates now leave schools without the reading
and math skills sufficient to obtain good-paying jobs or to get the kinds
of entry-level jobs that can lead up a wage ladder.
Murnane and Levy contend that, for the "new basics," math need
be only ninth-grade level, which means "the ability to manipulate fractions
and decimals and to interpret line graphs and bar graphs." This poses
a problem for employers because, when it comes to these skills, "many
recent high school graduates don't have them."26
Oh? In the TIMSS sample questions, 90% of American fourth-graders correctly
provided a simple interpretation of a bar graph, and 57% of them successfully
made a complicated interpretation. Asked what proportion of a figure was
shaded, 80% of American fourth-graders got it right, a number exceeded by
only four other nations. On questions where few American students got the
right answer, few kids on any part of the globe got the right answer. Maybe
no one is learning the "new basics," but, when one finds commonality
among fourth-graders in 26 countries, I have to suspect that something other
than the low quality of American schools is at work.
At the eighth-grade level, one TIMSS item showed a line graph with stopping
distance on the ordinate and speed when brakes are applied on the abscissa.
The question asked the students to determine how fast a car was going if
it stopped 30 meters after the brakes were applied. Seventy-two percent
of American kids got the item right, a success rate topped by just eight
of the 40 other nations -- and not by much even in those eight countries.
Given a fraction and asked to write a larger one, 81% of American eighth-graders
succeeded. No nation had 90% or more correct on this item.
Given the size of a gas tank, the rate of fuel consumption, and the distance
driven, only 34% of American eighth-graders could accurately determine how
much fuel would be left at the end of the trip, but only three of the 41
countries had more than half of their eighth-graders succeed on this question.
When one couples this performance of fourth- and eighth-graders with the
data showing large increases in the proportion of high school students who
are taking more math and science courses,27 one must wonder
just what Murnane and Levy mean by "many students" who lack math
skills. Of course, with almost two-thirds of each June's high school graduates
attending some form of higher education in the fall following their graduation,
employers of high school graduates do not see many of the more academically
able students.
Elsewhere, Murnane and Levy have put their thesis thus:
For 15 years, the basic skills of high school seniors have risen slowly while the skills required for a decent job have increased radically. If schools gave tests that measured students' reading, writing, and math skills against employers' requirements, parents would see the problem and demand solutions. But few schools give such tests.28
In their book, they contend that "the most
important problem U.S. schools face is preparing children for tomorrow's
jobs."29 This proposition -- that the conditions of
business have changed, so schools must change -- has become so common since
A Nation at Risk that we have grown numb to its monumental arrogance.
Business conditions have changed. So what? Let 'em cope. Remember, unemployment
is remarkably low. Good jobs are not going abroad because, as New York
Times economics writer Sylvia Nasar wrote a few years back, America
is already the low-cost producer of many goods and services.30
And the TIMSS data now show that only six countries demonstrate much higher
math skills than the U.S.; only one country is much higher in science.
That employers are not coping was shown clearly in the Sandia Report.
The Sandia engineers examined the distribution of jobs according to skill
levels and the distribution of training dollars across those levels. The
jobs were fairly evenly distributed across three levels: college education
required, skilled labor, and unskilled labor. The dollars were not so evenly
distributed. Almost 70% of all training dollars went to people with college
educations. Almost 20% went to provide additional training for already skilled
labor. That left less than 15% of the dollars to help unskilled labor.31
The Sandia engineers noted that, whereas Japanese automakers provided about
325 hours of training for new workers at Japanese factories and almost 300
hours for new workers at their American factories, American automakers provided
fewer than 50 hours of training. A story on "All Things Considered"
in July 1997 observed that the average training provided retail workers
is seven hours.
Clearly, American industry is getting by on the cheap where training is
concerned. But you wouldn't know it to read about the doleful condition
of American labor. One report lamented that Lockheed Martin spends more
than a million dollars a year in basic skills remedial training.32
The report provides no citation for the figure, but let's assume that it
is true. Lockheed Martin's 1995 income was almost $27 billion.33
They're not even spending chump change on such training.
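To put a rough number on "chump change," the two figures can be divided. This is a sketch under stated assumptions: the text gives "more than a million dollars" and "almost $27 billion," so the round values below are approximations and the resulting share is only an order-of-magnitude estimate.

```python
# Round figures standing in for the approximate amounts quoted in the text.
remedial_training = 1_000_000     # "more than a million dollars" a year
income_1995 = 27_000_000_000      # "almost $27 billion"

# Remedial training as a fraction of 1995 income.
share = remedial_training / income_1995

# The same share expressed as cents per $10,000 of income.
cents_per_10k = share * 10_000 * 100

print(f"{share:.4%} of income")
print(f"about {cents_per_10k:.0f} cents per $10,000")
```

Even if the true training figure were several times the stated floor, the share would remain a small fraction of one percent.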
Murnane and Levy's attitude is so common in the
U.S. today that we might be inclined to chisel into the stone over every
high school entrance this inscription: "P.S. 139. A Wholly Owned Subsidiary
of the Business Roundtable." Even we educators seem tragically in danger
of equating two very different processes: education and training.
The solution proposed by Murnane and Levy suffers from another problem:
the youngsters who actually lack the reading and math and "new basic"
skills don't live where the good jobs are anyway. William Julius Wilson,
among others, has amply documented this fact.34 Students
in the urban ghettos who graduate with "the new basics" are going
to be very frustrated. Most poor urban youngsters probably already know
that the good-jobs myth is a hoax as far as they're concerned, so they won't
even have bothered.
Sometimes the good-jobs problem in rural settings pits the school against
the family. In the early stages of its monumental reform efforts, Kentucky
established the Pritchard Commission, named for the man who chaired it.
Among its duties was explaining to the people of Kentucky the good things
that would happen to their children as a result of reform. At one stop,
the audience booed. They knew that there was no market in their area for
the skills being described. They had quickly reached the logical conclusion
that their children would leave. They booed because they didn't want their
families to be broken up.
Murnane and Levy assume that, if the students have the skills, the jobs
will be there. They won't. The FIALS data showed that many people at the
highest levels of literacy had no income or earned only poverty-level wages.
Bureau of Labor Statistics job-creation projections show that the current jobs
boom is in the low-paying service sector: cashiers, sales clerks, janitors,
waiters, and so on. (Various aspects of this situation were discussed in
the Fourth and Fifth Bracey Reports and in the January 1996 Research column.)
Murnane and Levy are half right: wage prospects are bleak for people without
reading and math skills -- but they remain iffy even for those with such
skills.
The final problem with the Murnane and Levy hypothesis is that it might
well be wrong. According to MIT economist Lester Thurow, most of the job
growth in the last 30 years has been in the service sector.35
In the same period in which Murnane and Levy find the wages of high school
and college graduates diverging, the number of hours worked by retail workers
has fallen from slightly over 40 to 28. Retail wages, which were once comparable
to those in manufacturing, have declined, and the retail sector shows negative
productivity growth. The declines in wages and length of work week are both
the result of government policy: the government announced that employers
don't have to pay benefits to part-time workers, and so they don't. The
retail sector accounts for 74% of all part-time workers, including a large
percentage of contingent workers -- i.e., temps.
Thus not only have wages in the service sector declined, but workers in
this sector have lost their benefits packages. Retail jobs account for 25%
of all job generation, says Thurow. The largest single industry
in the country is currently the hospitality industry, another employer of
part-time, benefit-less workers. And, according to Christopher Cameron of
the Southwestern University School of Law, the contingent work force --
temporary, part-time, and contract workers -- now constitutes 25% of all
workers.36 Although some contingent workers make decent
wages as programmers, writers, or editors, most contingent jobs are likely
to be held by those without college degrees. Reductions in the length of
the work week and loss of benefits alone could account for much of the change
in wage differentials documented by Murnane and Levy. There is no need to
invoke a change in the job demands of business.
Murnane and Levy are on much firmer ground when they stick to describing
school reform efforts.37 In a Washington Post
article, they described a seven-year reform effort in a school system in
which all the odds were weighted toward success: average income $90,000
a year, parents active in the schools, 95% of the graduates attending college.
Still, they found that, "in this community as in most, elementary school
science, when it was taught at all, was accomplished through a method dubbed
'chalk and talk.' The teachers would lecture, the kids would take notes,
and a quiz would follow." According to Murnane and Levy, the "chalk
and talk" technique was used for science even by teachers who had larger
repertoires of strategies for other subjects. (Once again, this makes one
wonder about the meaning of the fourth-grade TIMSS data.)
Through a process of trial and error -- and plenty
of both -- this school system changed its elementary science instruction.
(This is the same district mentioned earlier in which some teachers believed
that we see by emitting light from our eyes.) The reform produced a modicum
of success. "As the teachers became more comfortable with the material,
some formed teams to help one another organize the science lessons and assemble
the equipment. They discussed how they could modify experiments to make
them work better." Sounds like what we want. It took seven years.
Murnane and Levy declare the project a success.
A happy ending, but a cautionary tale. Compared to the national landscape, this affluent community has big advantages. But even here, a serious piece of teacher retraining -- a throwaway line in most task force reports on what education reforms are needed -- was a hard slog with a big misstep. Most school districts start from further back.
This article by Murnane and Levy should be the
office wallpaper of those who have tried to find some magic bullet of school
reform, e.g., people such as John Chubb and Terry Moe, who in their book
actually did call choice a panacea.38
Teacher Preparation (Ho Hum!)
Teacher preparation comes close to matching the weather as something everyone
talks about -- mostly complains about -- but does little to change. Proposals
and programs -- Lee Shulman's work at Stanford, the Holmes Group,
and so on -- seem to have difficulty staying on the radar screen. The latest
blip comes from the National Commission on Teaching and America's Future.39
Although that group's report regurgitates the nonsense from A Nation
at Risk in a tone-setting introduction, it does contain some important
pieces of data.
For instance, students in mostly minority schools are much more likely to
be taught by teachers who did not major in the field they teach (42% who
did major in their fields, versus 69% in mostly white schools)
or teachers who have no certification in their field (54% certified, as
opposed to 86% in mostly white schools). Teachers with not even a minor
in the field they teach in are much more likely to be found in schools with
50% or more free-lunch recipients than in schools with fewer than 20% free-lunch
recipients.
Although not new, some of the commission's findings bear repeating: schools
have low expectations for students, there are no enforced standards for
teachers, there are major flaws in teacher preparation, teacher recruitment
is "painfully" slipshod, induction for beginning teachers is inadequate,
and there is little reward for professional development, knowledge, or skills.
Moreover, the report is a tough-talking document.
Although no state will permit a person to write wills, practice medicine, fix plumbing, or style hair without completing training and passing an examination, more than 40 states allow districts to hire teachers who have not met these basic requirements. Most states pay more attention to the qualifications of veterinarians treating America's cats and dogs than those of the people educating the nation's children and youth.40
A Coda
I drifted into education from psychology in the late Sixties. One of the
first words I recall hearing was "portability." It was a problem,
this portability. Educators couldn't seem to carry their successful innovations
and programs from one site to another and repeat their accomplishments.
The programs just weren't portable, and it was a cause of great concern.
It still is.
In 1997 comes a story about how one of our most successful -- and one of
our most famous -- teachers could not carry himself from one setting to
another and how a program he left behind could not sustain its vitality
without him.41 The ultimate in failed portability. Jaime
Escalante -- teacher, mentor, inspiration for the movie Stand and Deliver
-- left his Los Angeles school in 1991. In that year, 143 of his poor, Hispanic
calculus students took the Advanced Placement examination in calculus, and
87 of them scored high enough to earn college credit. Last year, only 37
students at the school took the test, and only seven obtained college credit
for their efforts.
Escalante is now in Sacramento, attempting to replicate his success. Sadly,
he can't. This year he managed to get only 11 students to take the AP calculus exam.
Part of the reason for his earlier success appears to have been the Hispanic
culture and language he shared with his students in L.A. Escalante often
spoke to them in a gruff Spanish. Now, with his classes about equally divided
among blacks, whites, Hispanics, and Asians, the cultural force is mostly
gone.
This should be a cautionary tale for reformers. A more formal study
found the same thing: local conditions prevail.42
What kind of year was it? Well, the year that found George Will mumbling
about federal control of cats also found him discovering that schools don't
have total control over children: "Between birth and age 18, a young
American spends 9% of his or her time in school. What occurs in the other
91% colors, and overwhelms, the 9."43 Bravo, George.
It was a year like all years, filled with events that alter and illuminate
our lives. The U.S. posted better finishes in international studies and
showed some rising test scores. And we were there. Save for Peter Applebome
of the New York Times, though, the media were largely absent. Perhaps
they'll show up next year.
Worst Untested Hypothesis of the Year Award
Denis Doyle garners one prize for the Worst Untested Hypothesis of the Year. In the education chapter for the conservative Heritage Foundation's Issues '96: The Candidate's Briefing Book, Doyle took me to task for noting in an op-ed article for the Washington Post that the proportion of students scoring above 650 on the SAT mathematics section was at an all-time high. In a section of his chapter alluding to me, which he called "Chicken Little in Reverse," Doyle gasped that "[Bracey] does not tell the reader who is pushing the SAT math scores higher (mostly Asian and Asian-American students). . . . Candidates for public office should not be fooled by fatuous assertions that test scores are 'climbing.' "1 Ignoring the possibility of any veiled racism in this remark, what is the point of the pronouncement even if it is true? And it is not.
Although Doyle could easily have tested his hypothesis about Asian students,
he presented no data. A single phone call to the College Board yielded the
requisite data, and it took me about 15 minutes to do the calculations.
From 1981 to 1995, the proportion of students scoring above 650 grew by
almost 75%, from 7.1% to 12.1%. If Doyle's hypothesis were correct, then
removing the Asian students from the test-taking sample should cause this
75% figure to disappear or at least become very small. But it doesn't. With
the Asian youngsters held out of the pool of test-takers, the gain is still
57%.2 In 1995, 57% more black, white, Hispanic, and Native
American seniors scored above 650 on the SAT-M than in 1981. It's a record.
Score one for fatuous claims.
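The arithmetic behind these percentages can be sketched in a few lines of Python. This is only an illustration, assuming the rounded proportions quoted above (7.1% and 12.1%); the function name `relative_gain` is hypothetical, and the College Board counts behind the 57% figure are not reproduced here. With the rounded inputs the gain works out to roughly 70%, a little under the "almost 75%" cited, which may reflect rounding in the published proportions.

```python
def relative_gain(old_share: float, new_share: float) -> float:
    """Return the percentage growth from old_share to new_share."""
    if old_share <= 0:
        raise ValueError("old_share must be positive")
    return (new_share - old_share) / old_share * 100.0


# Rounded proportions of test-takers scoring above 650 on the SAT-M,
# as quoted in the text (1981 and 1995).
share_1981 = 7.1
share_1995 = 12.1

gain_all = relative_gain(share_1981, share_1995)
point_change = share_1995 - share_1981  # change in percentage points

print(f"Change: {point_change:.1f} percentage points")
print(f"Relative gain: {gain_all:.1f}%")
```

The same function applied to the proportions with Asian students held out would yield the 57% figure, given the underlying shares.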

Stiffest Resistance to Data Award
Doyle collects his other award in the category of Stiffest Resistance to Data. In the same Heritage Foundation document, Doyle writes, "Government spending on education has skyrocketed, even as school performance and student achievement have remained static."1 Were all these contentions not in a single sentence, they would each have won a prize. As shown by an Economic Policy Institute study, Where Has the Money Gone?, from 1969 to 1994 new spending for education rose by 61%.2 Hardly a Roman candle, much less a skyrocket. More important, most of that new money was spent in areas that would not lead reasonable people to expect increases in test scores, such as special education.
However, it is not the claim itself that takes the day, but its durability in the face of massive evidence to the contrary. When I debated him five years ago, Doyle made a similar contention about spending and test scores during the 1980s. He was unmoved by the data I presented then and has remained impervious to other data since. Indeed, Reinventing Education, his 1994 collaboration with Louis Gerstner, Jr., and others, made precisely the same claims.3
At the time of the original debate, Doyle mumbled
something about the flatness of the SATs -- but, if demographic changes
in the test-taking pool are taken into account, SAT scores have been rising
since 1975.4 Other critics have pointed to the "stagnant"
scores in the various areas tested by the National Assessment of Educational
Progress (NAEP).5 However, because the NAEP has no impact
on anyone and no one takes it seriously, kids practically sleep through
the NAEP testing, so it's hard to say what it really measures. When the
district I was working in at the time participated in an NAEP state-by-state
tryout, about half of the teachers involved told me that they had difficulty
keeping the children on task. In any case, what some observers have called
"stagnant," others have called "stable." And no one,
to my knowledge, has put forth any specific reasons why NAEP scores should
be rising. Yet NAEP mathematics scores have been climbing, probably
because of the larger numbers of students taking more and more mathematics
courses since 1982.6 The media have ignored these gains:
neither the Washington Post nor the New York Times reported
the 1996 NAEP math results.

Silence of the Sheep Award
David Broder can claim a share of Doyle's Stiffest Resistance Award, not for what he has said, but for what he hasn't. We might call his half-award the Silence of the Sheep Award. For the last six years I have bombarded Broder with data refuting his negative columns about American public schools. Once, on the phone, I asked him what it would take to make him believe that what I was saying was true. He replied, "Other voices." In May 1996 the American Association of School Administrators adorned the cover of School Administrator with photos of David Berliner, Harold Howe II, Iris Rotberg, Harold Hodgkinson, Richard Jaeger, and yours truly. The text on the cover identified us as the leading defenders of the nation's public schools. All of us had articles in that issue, and at the end of each article was a short list of other pieces we had written about the schools. I sent this issue to Broder as evidence of "other voices." He has yet to reply, but then, it has only been 17 months.

Least-Valid Index of Effectiveness Award
Herbert Walberg's prize is the Least-Valid Index of Effectiveness Award. He earned it for his use of the OECD index of "reading progress" to "prove" that the U.S. teaches reading less productively than other OECD nations.1 For its part, the OECD acknowledged from the beginning that the index of progress had nothing to do with effectiveness of teaching reading.
Although it had no definitive data, the OECD noted
that "the fact that children do not all begin school at the same age
may furnish an explanation for the differences among countries in reading
progress, especially for 9-year-olds." In addition, the OECD reckoned
that different countries might emphasize literacy in different grades. And
it mentioned that France and Italy begin extensive education experiences
at early ages (2 and 3 respectively), while students in the Netherlands
begin school at age 4.2 In fact, when I calculated the
rank order correlation coefficient between the countries' ranks at age 9
and their progress, it came in at -.69. That is, countries whose children
look good early don't show much "progress"; countries whose children
show poorly at age 9 show lots of "progress." Or, put another
way, the countries that start school late do poorly at age 9 but catch up
by age 14. Makes sense.
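For readers unfamiliar with the statistic, a rank-order (Spearman) correlation can be sketched as follows. The ranks below are HYPOTHETICAL values for six unnamed countries, invented purely to show the mechanics; they are not the OECD data and are not meant to reproduce the -.69 reported above. The function name `spearman_rho` is likewise an assumption, and the shortcut formula used applies only when there are no tied ranks.

```python
def spearman_rho(ranks_a: list[int], ranks_b: list[int]) -> float:
    """Spearman rank-order correlation for two untied rankings."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of equal length (at least 2)")
    # Sum of squared differences between each country's two ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# HYPOTHETICAL ranks for six unnamed countries (not the OECD data).
rank_at_age_9 = [1, 2, 3, 4, 5, 6]      # 1 = best score at age 9
rank_on_progress = [5, 6, 3, 4, 1, 2]   # 1 = most "progress" by age 14

rho = spearman_rho(rank_at_age_9, rank_on_progress)
print(f"rho = {rho:.2f}")
```

A strongly negative value, as here, is what one would expect if the countries that rank high at age 9 tend to rank low on "progress."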

The Swiss Bitter Chocolate Award
The entirety of E. D. Hirsch's book The Schools We Need: And Why We Don't Have Them presents a strong case for an award. The book's central thesis is an argument that, for the last 75 years, anti-knowledge progressives have held American education in such a stranglehold that they have created a "thoughtworld," as Hirsch calls it, in which educationists cannot imagine any other way of looking at their field. Improbable? Not to Hirsch. "If thousands of Marxist thinkers could have been caught for decades in the grip of a wrong socioeconomic theory, it is not beyond imagination that a cadre of American educational experts could have been captivated by wrong theories over roughly the same period."1
While the whole book is a towering rhetorical performance, one small section of it is so far beyond the pale that it earns the Swiss Bitter Chocolate Award. Hirsch's Swiss neighbors, says he, complain about the low quality of the public schools around the University of Virginia, where Hirsch is a faculty member. Of his neighbors' native schools, Hirsch writes, "Switzerland has one of the most detailed and demanding core curriculums in the world, with each canton specifying in detail the minimum knowledge and skill that each child shall achieve in each grade, and an accountability system that insures the attainment of those universal standards. . . . Each child therefore receives a highly coherent, carefully monitored sequence of early learnings."2
If this be true, then one would expect the Swiss kids to "slam dunk American students" (to borrow one of President Bush's vibrant but not-so-clear images about how our youngsters stack up against those in other countries). Where we have comparative data, though, it doesn't seem to work out that way:
Subject | Age | Switzerland | United States
Reading | 9 | 511 | 547
Reading | 14 | 536 | 535
Math | 14 | 62 | 63
Science | 14 | 56 | 58
The 9- and 14-year-old reading results are on a
600-point scale identical to that of the SAT, while the math and science
data are simply percentages of correct answers. (The Swiss percentages are
simple averages of results for French- and German-speaking areas; the separate
percentages don't differ very much.) The reading data, taken from How
in the World Do Students Read? -- the virtually unknown 1992 study
by the International Association for the Evaluation of Educational Achievement
(IEA) -- show U.S. teenagers in a dead heat with their Swiss counterparts.3
(So little attention was given this study when it was published in 1992
that Secretary of Education Richard Riley tried, with some success, to resuscitate
it with a press conference in 1996.) The Third International Mathematics
and Science Study (TIMSS), which began releasing data in November 1996,
provides the eighth-grade math and science data.4 (Switzerland
did not participate in the fourth-grade TIMSS assessment.) Swiss students
outscore American students in math, but they trail slightly in science.
With regard to the populations of the two nations as a whole, reading data
for ages 16 through 65 are available from the First International Adult
Literacy Survey (FIALS). The U.S. and Switzerland both have high percentages
of people at the lowest levels of literacy (20.7% and 18.5% respectively),
but the U.S. has a much higher percentage of people than Switzerland at
the two highest levels (21.1% versus 9.3%). FIALS and TIMSS results are
discussed in detail in the Seventh Bracey Report. Hardly a slam dunk. Not
even a lay-up.5

The Media's Lack of Institutional Memory Award
Pat Wingert takes home the Media's Lack of Institutional
Memory Award. In 1992 she and Newsweek hailed American students'
math and science performance in the Second International Assessment of Educational
Progress (IAEP-2) as "An 'F' in World Competition."1
She fretted over how students in Taiwan and Korea outperformed our kids.
While concentrating on the mostly low American ranks, she even tried to
explain away the third-place finish of American 9-year-olds in science.
(While U.S. ranks were mostly low, the scores were only slightly below the
international average, and most countries had similar scores.)
One might have thought that, looking at the TIMSS eighth-grade data, Wingert
would have noticed that these ranks were a lot better than those of IAEP-2.
In IAEP-2, American 13-year-olds finished 13th out of 15 countries in science
and 14th out of 15 in math. In TIMSS, American eighth-graders were 25th
out of 41 nations in mathematics and 19th out of 41 in science.
Some of the countries that finished ahead of us in IAEP-2 did not repeat
their success in TIMSS, so it wasn't as if the TIMSS testers had rounded
up a bunch of patsy nations for us to beat up on. Only six or so countries
that participated in TIMSS can be considered developing nations. Of the
industrialized nations of Europe and Asia, only Finland and Taiwan did not
participate. The 41 nations, by far the largest group ever assembled in
such a study, offer a representative view of achievement around the world,
although some of them didn't meet the TIMSS criteria for sampling or student
participation rates. (These countries are presented in separate categories
in most TIMSS documents, but few people have paid much attention to the
distinction.)
It would seem logical for Wingert to check the files and compare the two
studies. But no. There is no mention of IAEP-2 in Wingert's story on TIMSS.2
Even though the TIMSS headline, "The Sum of Mediocrity," is a
little less shrill, it is impossible to distinguish material from the two
stories: "In math, the gap between the American and Asian countries
was especially wide. Even the very best American students didn't measure
up. And that's not even the worst part. Consider this as you try to figure
out which countries will dominate the technology markets of the 21st century:
the top 10% of America's math students scored about the same as the average
kid in the global leader, Singapore." The first two quoted sentences
are from the 1992 story; the last two, from the 1996 edition.
Wingert fears that Singapore will "dominate" the technology markets.
Each grade in American schools contains about 3,000,000 students.
The island nation of Singapore contains 3,000,000 people. At least at night.
In the morning, thousands of poor Malaysians cross into Singapore, do the
dirty work, and return to Malaysia, sparing Singapore the task of educating
their children. Longer-term "guest workers" from Indonesia and
the Philippines must leave their families at home. Even some number of Singapore
families of means, whose children are not making it in the Singapore schools,
send their children to schools in Malaysia. Finally, Singapore will admit
Malaysian students into its schools -- if they score high enough on tests.
Whatever the merits of Singapore's schools, a nation that can "outsource"
its poverty and low achievers while importing academic aces has got a leg
up on the rest of us.3
Meanwhile, an editor at Education Daily phoned in May. She had
just spent a couple of days in a conference with Singapore's minister of
education. He was visiting the U.S. to get ideas about how to teach children
to think. He seemed to feel that he could learn something from us. And Newsweek's
reaction to the stellar TIMSS ranks of U.S. fourth-graders was an authorless
squib inserted on the same page as a larger article on forced summer school
participation of low-achieving students in such diverse places as Chicago,
Denver, Niagara Falls, and Santa Paula, California.4

The Accuracy as a Frill Award
At least Pat Wingert, although accentuating the negative, reported the facts. Over at U.S. News & World Report, people were busy making stuff up. An article by Linda Kulman contended that, in the TIMSS eighth-grade science results, our kids were "on a par with science students in New Zealand, China, Iceland, and Bulgaria."1
A peek at the TIMSS data shows that New Zealand, indeed, had the same rank we did (19th of 41), but Bulgaria was fifth and Iceland 30th. What on earth does Kulman mean by "on a par with"? What's more, China didn't even take part in the study. Details of this sorry reporting appeared in the June 1997 Research column, and I repeat the incident here only to illustrate why Kulman earns the Accuracy as a Frill Award.
For the record, here's an important statistic concerning
Iceland and Bulgaria: the two countries are 25 ranks apart. So, did Bulgarian
kids "slam dunk" their Icelandic peers? Hardly. Bulgarian students
got 62% of the items right; Icelandic students got 52% correct. A difference
of 10 percentage points in the scores meant a 25-country gap in ranks. A
10-point difference in scores often means a difference of one letter grade,
seldom more, and, even in the case of grades, there is some question as to
whether a 10-point difference is a meaningful difference in real achievement.
As was the case in the IEA
reading study mentioned above, the TIMSS data show that there is not a dime's
worth of difference among most nations. The press branded U.S. performance
in TIMSS as "mediocre." If so, virtually the entire industrialized
world is mediocre.
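To make the point concrete, the figures just quoted can be run through a short Python sketch. It uses only the ranks and percentages given in the text; the variable names are illustrative, and the "points per rank" figure is a simple average that assumes the intervening countries are evenly spaced, which is a simplification.

```python
# Figures quoted in the text for TIMSS eighth-grade science.
bulgaria = {"rank": 5, "pct_correct": 62}
iceland = {"rank": 30, "pct_correct": 52}

rank_gap = iceland["rank"] - bulgaria["rank"]
point_gap = bulgaria["pct_correct"] - iceland["pct_correct"]

# Average score difference separating adjacent ranks in this stretch.
points_per_rank = point_gap / rank_gap

print(f"Rank gap: {rank_gap} places")
print(f"Score gap: {point_gap} percentage points")
print(f"Average gap per rank: {points_per_rank:.1f} percentage points")
```

On these numbers, moving up or down one place in the rankings corresponds on average to less than half a percentage point of items answered correctly.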

The Blindest to Data Award
Louis Gerstner, Jr., and Richard Mills narrowly
beat out President Clinton to earn their prize, for all three have unfairly
impugned the reading skills of American students. Clinton announced that
"only 40% of third-graders can read independently." From later
comments by Secretary of Education Riley, this appears to have been a rather
liberal interpretation of the NAEP reading levels (perhaps Clinton's only
"liberal" interpretation of anything during the year). Gerstner
and Mills, though, went him one better. At the Education Summit, hosted
by Gerstner, the CEO of IBM declared, "We can teach them job skills.
What is killing us is having to teach them to read." (I was not at
the Education Summit, of course; Gerstner's remark was relayed to me by
Gov. George Allen of Virginia.) Mills, for his part, declared on National
Public Radio's "All Things Considered" that "what we have
is a shortage of young people who can read." These statements earn
these gentlemen the Blindest to Data Award (which, in future years, might
be convertible to a Stiffest Resistance to Data Award).
Since both Gerstner and Mills live in New York, it is possible, though highly
improbable, that they are accurately reflecting the realities of their home
state. (I leave it to New York educators to let them know whether or not
this is true.) Nationally, though, American students finished second in
the IEA reading study.1 Critics have contended that the
IEA study tested only "basic" reading skills, and they have looked
to the NAEP to provide more credible evidence that American students can't
read. However, in a joint study, the U.S. Department of Education and the
Pelavin Research Institute compared NAEP results with the IEA study and
concluded that, while the NAEP test contains much more comprehensive and
difficult material than the IEA assessment, "It seems reasonable to
conclude that American students would do well as compared with students
in other countries even if the NAEP test were administered [in other nations]."2
It is the NAEP proficiency levels, initially promulgated to sustain the
sense of crisis produced by A Nation at Risk, that lack credibility.

Best Perpetuation of a Discredited Myth Award
The OECD attains the coveted award for Best Perpetuation
of a Discredited Myth for the statement in one of its publications that,
"in recent years, adult literacy has come to be seen as crucial to
the economic performance of industrialized nations. . . . Today, adults
need a higher level of literacy to function well: society has become more
complex and low-skill jobs are disappearing."1 Although
technically not eligible for a prize because it was not published during
the past year, this study has been largely overlooked in the U.S. and so
is granted a special dispensation.
The OECD study contains too few nations to permit correlation between its
outcomes and economic competitiveness. Suffice it to say that my own analysis
shows that the correlation between TIMSS eighth-grade mathematics results
and economic competitiveness -- as judged by the Davos, Switzerland, World
Economic Forum -- is +.09. That is, virtually zero.
The fable perpetuated by OECD gained currency from
that monumental lie, A Nation at Risk. It became a very popular
fantasy during the recession some six years later, a recession for which
the schools were blamed in many quarters.
The myth was dispelled first by Lawrence Cremin, who noted that it was largely
the President, the Congress, and federal agencies that determined U.S. competitiveness.2
Attacking the schools was a frequent dodge of those actually responsible
for competitiveness. Larry Cuban followed up with "The Great School
Scam," observing that, while schools took the blame for the recession
of the late 1980s, they got no credit for the boom of the mid-1990s.3
Cuban could have noted that this scapegoating reprised history from three
decades earlier. When the Russians launched Sputnik in 1957, schools took
the hit; when America put a man on the moon 12 years later, no one felt
that the schools had had anything to do with it.
Today, a wider audience is beginning to catch on. New York Times
education writer Peter Applebome notes that "many educators and economists
are increasingly skeptical of the notion that better schools mean a more
prosperous nation."4 Applebome points out something
I have noted in many speeches and in most of the earlier Bracey Reports.
If schools are linked to the economy and our schools are so bad, while the
German and Japanese schools are so good, how come our economy is booming
and their economies are mired in awful, long-standing recessions? Applebome
also noticed that "most experts now regard A Nation at Risk as
brilliant propaganda."5 The sleepers awaken.
Applebome quotes Peter Cappelli of the University of Pennsylvania as saying,
"The link between education and the national economy is tenuous in
all but the grossest sense -- say the difference between developed and undeveloped
countries." Cappelli's comment stands in stark contrast to one by Gerstner,
who in 1994 told a Vermont television talk show host that, if we didn't
shape up our schools, we'd soon be a Third World economy (a comment that
earned Gerstner the first of his many trophies in these pages). Gerstner
made this comment about the time that the World Economic Forum ranked our
economy number one. The Forum has since changed its formula, and we have
now fallen all the way to fourth. The International Institute for Management Development,
another Swiss operation, has maintained its formula, similar to the Forum's
old one, and has put the U.S. in first place for the last four years.
The OECD statement also overlooks what many people overlook: technology
makes jobs easier, not harder. When a new technology appears, it is true
that it is hard to work with and that only a few specialists know how to use it. As
the technology matures, however, it becomes more user friendly. When I first
started programming computers in 1961, I discovered that I could earn good
money -- about $100 an hour in today's dollars -- for the simplest programming
efforts. And being a somewhat sloppy programmer, I charged only half of
what the more skilled programmers charged. Computers were tough to deal
with, and programmers were in short supply. It took at least six months
and an army of experts to set up and debug a new computer, a far cry from
today's plug-and-play machines. In those days, novices who attempted to
read the documentation accompanying programs were heard to mutter things
like "I don't remember signing up for a foreign language." Today,
the prose in user manuals is straightforward, and, if it still poses a problem,
there is a manual for "dummies" available for virtually every
application.
Similarly, my 1973 Canon F-1 35-millimeter single lens reflex cameras were
far easier to use than the view cameras that preceded them or the other
35-millimeter cameras that had no built-in light meters. And these F-1s,
in turn, were much more difficult to operate than the next generation of
cameras. I still take them out occasionally. Sometimes one needs to control
shutter speed and depth of field, and this calls for a machine that lets
you (actually makes you) adjust shutter speeds and f-stops. Mostly, though,
I use a point-and-shoot with a single lens that zooms from 38 millimeters
to 135 millimeters at the push of a button, flashes when the light is dim,
and does everything else automatically. There is a lot of hokum going around
about how technology makes jobs more difficult.

The Golden Apple Awards
Now to the genuine awards. The award for Most Accurate Perception of a Politician Who Once Impersonated an Educator goes to Michael Lewis for his comments on Monitor Radio on 20 June 1997. Lewis is the author of Trail Fever, a chronicle of the 1996 Presidential campaign. While being interviewed about his book, Lewis declared that candidate and former education secretary Lamar Alexander "did something I didn't think possible in this campaign. He proved you could be too phony. This is why Clinton feared him most of all the candidates. He was so malleable. He even looks a little like putty." Attaboy, Michael.
Lewis' category also features an award for Best
Supporting Column, this by Frank Rich of the New York Times, who
wrote, "What Bill Gates is to software, Lamar Alexander is to hypocrisy."1
Rich drew this analogy in view of Gates' $200-million donation to local
libraries in poor areas while Alexander, according to Rich, was "fronting
for a moneyed 'National Commission on Philanthropy' whose highest priority
is not giving but ideological warfare."
Last updated 22 October 1997
URL: http://www.pdkintl.org/kbra9710.htm
Copyright 1997 Phi Delta Kappan