Standards for Standards-Based Accountability Systems

By Kenneth A. Sirotnik and Kathy Kimball

Who will hold the accountability systems accountable? Mr. Sirotnik and Ms. Kimball propose 11 standards that accountability systems themselves should meet.

This standards-based movement seems to be hanging around a bit longer than we imagined it would. We thought for a while that the movement might have a half-life similar to that of the minimum competency movement back in the Seventies, but not so. Perhaps there is more appeal to the notion of developing challenging standards and the content based on those standards, assessment systems that are a cut above the usual norm-referenced multiple-choice tests, and accountability systems with more "teeth." Whatever the reasons for the durability of the movement -- and for better or for worse, depending on your point of view -- these tougher assessment and accountability systems are having an impact on the schools and on the people in and around them.

Like most movements in education, the standards-based movement has its either/or camps bent on circling the wagons, polarizing the issues, and arguing one side or the other. In the pages of the Kappan, for example, battles have been waged between the champions of testing and accountability and the debunkers of those evil practices. There is merit, of course, in having such arguments -- engaging in the dialectic, as it were -- with the hope of clarifying positions and possibilities. In the meantime, though, contested practices continue to be carried on in the schools, and people continue to be affected, sometimes profoundly.

We are neither advocates nor apologists for testing and accountability. One of us has been quite critical in past writings on these topics;1 the other worked for five years in the policy world trying to help build sensible assessment and accountability systems.2 We are both concerned about the ways in which such systems are often used against people -- be they students, teachers, or parents -- and against schools and school districts. We are sympathetic with the positions taken by the authors in a recent issue of the Kappan that critiqued performance assessment and high-stakes accountability practices.3 Yet the turmoil in the trenches over these matters is palpable, and -- if history is any lesson -- it will continue in some form or other.

We are searching, therefore, for another rhetorical base, another way to persuade the education community -- especially those members who make education policy -- to think very carefully about what should go on in the name of assessment and accountability. We will experiment here with an argument based on a simple and familiar moral imperative, "Do unto others as you would have them do unto you." Ironically, the developers of standards-based accountability systems, while positing standards for others, seem not to have developed standards for their own systems.

We think it's time for these accountability systems to practice what they preach. Therefore, we will propose and discuss 11 standards for them that we think could foster deliberation -- even future action -- by educators and policy makers who sincerely wish to improve public education.

Terminology

But first we must define our terminology. More specifically, we must clarify the differences -- at least as we perceive them -- between testing and assessment and between assessment systems and accountability systems.

Testing. A test is a single way of getting a sample (and thus a limited amount) of information about what a student knows or is able to do in reference to a specified domain of knowledge or behavior. We do not have sufficient space here to discuss all the ramifications of this deceptively simple statement. But most Kappan readers are at least somewhat familiar with such test-related issues as cultural bias, distinctions between norm- and criterion-referenced tests, types of reliability and validity, the impact of accommodations for students with special needs, and tradeoffs between reliability and generalizability in performance assessment. They understand, too, how these issues relate to the standards-based performance tests that play a role in the assessment systems now operating in a number of states.4

There are plenty of criteria, then, for constructing and using good tests for various purposes. The standards that we will be proposing will assume that such criteria have been reasonably well met by any tests used in assessment and accountability systems.

Assessment. This is a synonym for evaluation. It requires both a description of and judgment regarding whatever phenomenon is being assessed, be it students' learning, teachers' teaching, school climate, state-level commitment and support, or any other education-related construct. A test is not an assessment, but the descriptive part of an assessment may be based on the results of a test. What we make of test results, how we appraise them, requires judgment. And, to be defensible, judgment requires relevant criteria for deciding what is better, or worse, or good enough. No statistical or psychometric magic can produce these criteria; ultimately, they are human constructions. The cutoff scores that define performance categories -- e.g., "needs improvement," "proficient," "expert" -- are good examples.

Assessment systems. Adequate description of a given phenomenon typically requires more than one piece of relevant information. To be sure, important information about teaching and learning can be obtained from well-constructed and appropriately used achievement tests. But test scores are just one piece of information in a complex picture. If we really want to understand what goes on in schools and classrooms, other information is also necessary, and the whole corpus of information selected for this purpose makes up an assessment system.

For example, in the state of Washington we have worked with policy makers to construct a four-level, comprehensive assessment system that uses statewide testing as only one of several important data sources. In this system, the state test (e.g., on writing, literacy, or mathematics) can be likened to a dipstick that pierces several important layers to check on the level of student achievement at a particular point in time (e.g., fourth, seventh, or 10th grade) and with all the limitations of a single test. The layers are crucial, however; what is in them and what we can learn from them can help to flesh out the story of teaching, learning, and schooling in ways not possible from test scores alone. Starting in the center, the classroom is filled with evidence about teaching and learning. Often, this evidence is neither recognized nor organized in a fashion that allows it to be communicated, but much of it can be so organized, given proper resources and training. One obvious example is the student portfolio of work that conforms to rigorous standards for what to include and how the whole is to be evaluated.5

An effective assessment system must include evidence on professional development -- by which we do not mean the traditional "inservice," where teachers disappear one day to listen to the latest guru and appear the next day to continue doing what they've always done. We are talking about ongoing, expensive, labor-intensive reflection and relearning, both individual and organizational. We are talking about meaningful individual and organizational change, about critical inquiry, about school renewal -- topics that have received much play in this and other education journals and books.6

Meaning in context -- is there any other kind?7 To ignore the differing conditions and circumstances within which teachers teach and their students learn, to ignore the quality and quantity of professional development provided for educators, and to ignore all that there is to be learned from each year's 180 days of classroom experience renders test data only minimally informative.8 Ignoring such contextual factors is akin to depriving physicians of all sources of information save their thermometers. A doctor can apply her thermometer to the system and get a reading but still know little about the system or the appropriate diagnosis of its problems.

Accountability systems. What is done with test data and the rest of the information in an assessment system is a whole other matter. Accountability systems are not tests; they are not assessment systems. We often hear people bashing or blaming "the test" (or the larger assessment system) when it's the accountability system that really upsets them. Many have noted that test- or assessment-driven reforms don't work. More likely, though, what doesn't work are accountability-driven reforms. We prefer to think in terms of renewal rather than reform,9 and we think that with careful consideration of all the limitations of tests and assessment systems, this information can actually be useful in improving teaching and learning.

Crassly put (since we are not in the mood for euphemisms), accountability systems are ways in which test data (and sometimes other information) are used to assign praise or blame. Put another way, accountability systems are constructed to distribute rewards or punishments to people and institutions -- i.e., students and teachers, schools and school systems. Wouldn't it be nice to replace this pathological way of doing business with a process of informed, adult learning? We could call the new approach education and describe it as using information and knowledge to help us learn from past mistakes in order to improve future practice.

But let's get real. Accountability is the name of the game today and for the foreseeable future. So let's play by some rules. And let's construct those rules in good faith. After all, the public has a right to know how well the public schools are doing, just as the public schools have a right to be judged in reasonable and conscientious ways. If our students and our schools are to be held accountable for meeting "world-class" content and performance standards, then let's ensure that sensible standards exist against which the accountability systems themselves can be held accountable.

Holding accountability systems accountable is only reasonable. Common sense suggests that scores on one test (which itself is only a sample of many possible performances) cannot possibly represent all that is going on in a school, any more than the temperature reading on a thermometer can represent all that is going on in a human body. Making inferential leaps from test results to the causes of those test results is foolish in the extreme. What other data -- even just a minimal set -- might be useful in understanding the average test score posted by a given school? Consider the following.

With regard to the students, we might focus on such factors as race/ethnicity, economic status, attendance rates, mobility, suspension and expulsion rates, placements in tracks, course enrollment patterns, special education placements, scores on other tests, and the usual array of opportunity-to-learn indicators. With regard to the faculty, we might find race/ethnicity, years of teaching experience, number of teachers teaching out of field, and hours of paid in-school time set aside for planning to be useful indicators. Knowing something about the principal's leadership skills and experience and about the frequency of principal turnover at the school would add to our understanding. We might also wish to focus on such things as the level of parent involvement at the school, the instructional resources available per student, and the quality of those instructional resources.

This list of indicators raises an interesting question. Should accountability schemes be applied uniformly to schools regardless of their particular conditions or circumstances? Take the two schools described below. Should they be treated identically?

School A: suburban; middle/upper-middle class; experienced and stable teaching staff; same principal for five years; 90% of students have English as their first language; less than 1% student turnover (transiency) from grade to grade, e.g., from grade 3 to grade 4 (when the statewide test is administered).

School B: urban; lower/middle class; less experienced and less stable teaching staff; third principal in as many years; 40% of students have a first language other than English; 50% transiency rate, e.g., from grade 3 to grade 4.

But what about those studies, you ask -- the ones that report examples of "School B's" that have beaten the odds, that have somehow shown improvements in test scores well beyond what might have been expected given their conditions and circumstances? If they can do it, why not all the other schools that are like them?

We certainly think that schools can do better and that schoolpeople should not excuse their own failure to improve by blaming community demographics and student or family characteristics. But there are two sides to this coin. On the other side, what are the moral and ethical ramifications of exhorting all schools to make improvements based on studies of heroic efforts by just a few schools at discrete points in their histories -- post hoc studies that report what appeared to happen at those few schools, with no mention at all of how to make it happen?10 It is highly unlikely that most schools can make similar improvements without a deliberate infusion of leadership, strong and committed teachers, ongoing professional development, mechanisms for dealing with burnout, significantly greater fiscal resources, and more -- all sustained over years of improvement efforts.

So how do we hold schools reasonably responsible? And what is "reasonable"?

Eleven Standards for Standards-Based Accountability Systems

What follows are really placeholders for sets of ideas. They come closest to what might be called content standards in a performance-based accountability system (although some critics might argue that they are not specific enough). They are not performance standards, since they are not operationalized in a way that would enable users to determine levels of proficiency. We will leave those steps for others because, if people are serious about implementing these standards, they will have to negotiate meanings and specifications with relevant stakeholders. We will suggest some questions and issues, however, that might stimulate such deliberations and start the process of making each standard operational.

1. The accountability system must not be driven by a single indicator (e.g., test scores) and simplistic formulas for rewards or sanctions based on that indicator. Assessments of student learning are an important part of a total system, but they are only a part.

Much of the preceding discussion has already addressed this standard. The key question is, What can we learn about students and schools from the information contained in any single indicator? Since test scores seem to be the single indicator of choice (notwithstanding the loftier principles articulated in many places about the importance of multiple indicators), this question should be directed toward the tests themselves. The intent is not to discount or debunk them but to thoughtfully consider what they can and cannot tell us about teaching and learning.

2. The accountability system must evaluate each school in terms of its own context as well as in comparison to other schools. If "formulas" are developed to gauge school improvement, they must go beyond test scores to include a variety of community-, student-, teacher-, and school-based indicators, and they must have empirical justification. This standard, of course, goes hand-in-hand with the previous one and plays off our discussion about the importance of school contexts. But it raises the issue of using aggregated indicators and combinations of indicators to somehow rate schools or districts.

We see little value in the rather arbitrary formulas developed by some states to combine indicators into a single index of performance. There are no empirical data on how to construct such an index, nor does any evidence support arbitrary criteria for improvement. Though the necessary studies of construct validity have not been and are not now being done -- at least as far as we know -- the accountability bandwagon is rolling over people.

To simplify this discussion (by violating the first standard), suppose that the chosen indicator is one specific test and that the index is the percentage of students reaching or exceeding "proficiency" on whatever that test measures (e.g., competence in mathematics). How many percentage points of improvement will be enough to merit praising a school? How many years should a school be given to demonstrate the desired improvement? (Clearly, identifying a trend requires at least three years.) How much improvement is even possible, given the difficulty of the test and the realities of classrooms? Without reasonable answers to these and similar questions, no determinations regarding a school's accountability can be made.
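The single-indicator index described above can be sketched in a few lines. This is only an illustration under stated assumptions: the cut score, the three years of scores, and the function name `percent_proficient` are all hypothetical and are not drawn from any actual state test.

```python
def percent_proficient(scores, cut_score):
    """Percentage of students scoring at or above the cut score."""
    if not scores:
        raise ValueError("at least one score is required")
    return 100.0 * sum(s >= cut_score for s in scores) / len(scores)


# Hypothetical cut score defining "proficiency" (a human judgment, not a statistic).
CUT_SCORE = 400

# Hypothetical scores for one small school over three consecutive years.
scores_by_year = {
    1: [310, 395, 402, 420, 388, 450, 371, 405],
    2: [325, 401, 410, 399, 430, 455, 380, 415],
    3: [340, 405, 398, 428, 412, 460, 377, 390],
}

# The index: percent at or above "proficiency" in each year.
index = {year: percent_proficient(s, CUT_SCORE) for year, s in scores_by_year.items()}

# Year-over-year change in percentage points.
changes = {y: index[y] - index[y - 1] for y in sorted(index) if y - 1 in index}

print(index)    # {1: 50.0, 2: 62.5, 3: 50.0}
print(changes)  # {2: 12.5, 3: -12.5}
```

In this made-up school of eight tested students, a single student crossing the cut score moves the index by 12.5 percentage points. A two-year comparison would show a large gain; the third year erases it.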

3. The accountability system must include monitoring of and support for equitable and substantial learning opportunities for all students (regardless of race, ethnicity, gender, economic status, or disability). Any lingering hope that authentic performance assessment might erase the perennial correlations of race/ethnicity and economic status with achievement outcomes has probably been dashed by now. These correlations have little or nothing to do with the kinds of tests that are used; they spring instead from societal conditions and school practices that continue to disenfranchise students of color and the poor.11

Practices that severely limit students' opportunities to learn, such as tracking, render accountability systems a sham. All students require substantial opportunities to learn what will be tested, and accountability systems must be able to demonstrate that such opportunities have been available. (Moreover, the evidence will have to hold up in the courts, especially if high-stakes decisions -- such as granting or withholding high school diplomas -- will be based on the assessment outcomes.)

4. The accountability system must be flexible enough to allow for individual differences in pace and style of learning; a "one-size-fits-all" philosophy and strict retention policies have no place in high-quality education systems that are sensitive to the needs of every student. Some of the most enduring findings in educational research include what we know about individual differences in learning and about the failure of retention and other rigid practices that are based on chronological, rather than developmental, levels.12 Yet accountability systems seem bent on having everybody pass tests -- often with high stakes attached -- at exactly the same point in their lives.

Of course, it is easy to criticize this practice but much harder to figure out how to strike a balance between our knowledge of individual growth and development and the need for system-level accountability. At the very least, though, there ought to be ways to aggregate the data that are routinely collected on students without having to use those data to make decisions -- high stakes or otherwise -- about any given youngster.13

The ultimate retention policy is withholding a high school diploma from those students who do not pass an exit examination. We fail to see the wisdom of this kind of punitive action and disagree with the notion that the practice puts "real teeth" into an accountability system. In our view, those "teeth" are likely to bring down a well-intentioned assessment and accountability system, when even larger numbers of students -- those already served least well by the system -- drop out or end up opting for General Education Development (GED) certification.

In the state of Washington, for example, the "certificate of mastery" (achieved by passing a 10th-grade proficiency test that covers several basic subjects) is currently a mandated requirement for high school graduation (with the mandate to be implemented at the discretion of the state board of education). But there is no empirical evidence showing that a passing score on that test correlates with future well-being for high school graduates. Nor is there evidence suggesting that students will profit from being denied diplomas when they have completed high school without flunking out but cannot seem to reach some standard of "proficiency" on a given test. Those who do pass the test might have the words "with academic distinction" appended to their diplomas, but withholding diplomas from students who don't pass the test strikes us as morally indefensible.

5. The accountability system must include support for and monitoring of substantial, long-term professional development opportunities for teachers and administrators to inquire into their disciplines and to review and revise their pedagogical content knowledge and teaching and leadership skills (including evaluation and assessment). Can anybody think of a reasonably complex and important industry that invests less than public education does in ongoing job training for its workers? How can a system that professes to educate the nation's children be so negligent when it comes to continued education for these children's teachers? There are 2.5 million teachers on the job today, and the job is changing even as we write. It is irresponsible to demand that teachers be instantly accountable for challenging new content standards when we have not invested in the relearning that teachers require if they are to meet those standards. We are not talking here about narrowly focused inservice training workshops that help teachers teach to the tests. We are talking about thoughtful new learning, about pedagogical content knowledge, and about critical and reflective practice.

6. The accountability system must include support for and monitoring of ongoing classroom-based assessment by teachers that is aligned with a high-quality curriculum and the content standards of the system. We have already mentioned the wealth of information accumulated in classrooms every day as part of the ordinary process of teaching and learning. Happily, some of the current state-level accountability systems are finally waking up to more authentic ways of assessing valued learning outcomes -- though constraints of time and scoring costs force compromises with regard to "performance items." For example, if you want students to be able to write, they have to write! And you have to assess it. It's much more expensive, but you can find out more (not all) of what you need to find out. Solving math problems is more complicated than merely answering multiple-choice items; you actually need to have students solve problems and explain what they are doing and why.

These are not only better assessment practices; they are better teaching and learning practices (assuming you're a fan of higher-order thinking skills and the like). This is not about "assessment-driven reform," this is about the reciprocal relationship between teaching and learning. It is clear that assessment, teaching, and learning must be coordinated -- curriculum and measurement experts have been telling us this for decades. Not to coordinate the kind of assessment (and teaching and learning) that goes on in the classroom with what goes on in state-level assessment and accountability systems seems foolish, at best.14

Oh my! Did we really write those lines? Are we now advocating state control of the curriculum -- and the manipulation of teachers and teaching practices -- through mandated tests? Well, clearly, how manipulated one feels depends on how closely what is being assessed matches what one values in curriculum and instruction. It also depends on how restricted one feels in terms of being able to address all the other valued aspects of schooling. So read on.

7. The accountability system must be based on high-quality content standards that allow districts, schools, and teachers to be creative, flexible, and thoughtful in constructing and delivering a curriculum that meets the standards but is not so narrow that it limits the rich array of curricular experiences and possibilities for teaching and learning in a multicultural and democratic society. We know for certain that accountability systems that are dictatorial and punitive, ones that rule more by fear than by reason and compassion, will invariably narrow the curriculum and limit the educational opportunities of students by virtue of the choices that teachers feel compelled to make.15 For an education system that seeks to honor both cultural differences and commonalities within a democratic society, "What is tested is what is taught" cannot be the sole guiding principle.

In the state of Washington, assessments of social studies, the arts, and health/fitness have been approved but have not yet come to fruition. Now we are hearing reports about classroom time being taken from these subject areas in order to focus almost exclusively on reading, writing, and mathematics, the curricular areas that are being tested. Assessment of science is in the works. But the poor track record in getting consensus on content standards for social studies and the arts suggests that assessments in those areas may never come about. Thus we ask, How can accountability systems be held accountable for making sure that what is not tested is not ignored? And how can accountability systems be held accountable for ensuring opportunities for educators to exercise their own creativity and flexibility in the classroom, notwithstanding state-level assessment and accountability systems?

In our view, these questions will not be answered by developing more tests for all conceivable subjects and curricula. Wisdom and common sense must prevail; it is just as important to decide what will not be tested as to decide what will be tested. But what is not tested may still be important and deserving of equal classroom time. For example, in our view, spending quality time at all grade levels on cultivating students' respect for one another and their sense of responsibility to participate critically in our social and political democracy is far more important than spending time trying to test them on these matters.

8. The accountability system must not be punitive, either to students or to their teachers and schools. Instead, through injections of human and fiscal resources, the accountability system must nurture and support districts and schools in decline or those making little or no progress. Behavioral psychology has taken a bad rap lately, in light of developments in cognitive psychology and the popularization of constructivism. Yet we recall some pretty sound principles of learning theory that we suspect still hold true today. One is that punishment doesn't work very well in changing behavior. What works a whole lot better is counter-conditioning -- finding ways to reinforce desired behaviors that interfere with less desired ones.

Okay, forget about behaviorism if it bothers you. Ponder instead everything we know about human learning and organizational change, about critical inquiry and reflective practice. Educators and their schools should not get off the hook when it comes to demonstrating student progress in reasonable ways. But, as we noted earlier, progress depends on learning, and that's true for educators as well as their students. Thus accountability systems need to be held accountable for the ways in which they use rewards and sanctions.

9. The accountability system must ensure that the teachers and administrators entrusted to make the desired improvements in teaching and learning are compensated for their efforts at levels commensurate with the critical importance of their work. Does anyone know of an occupation other than teaching in which salary levels are so out of synch with the importance of the work? The ridiculously low salaries that we pay teachers to take intellectual and emotional care of our children for the 13,500 hours or more that each child spends in school during the K-12 years are a national disgrace. We need 2.5 million teachers to do this work -- and about 200,000 new ones each year to take the places of those who leave.16 We try to attract these replacements with a beginning salary of around $25,000 a year; eventually, if they remain in the classroom, they will probably earn the average teacher's salary of $38,500.17

Oh yes, teachers do get summers and holidays off -- nearly three months of "vacation" time. They need it! On average, they report that their jobs require them to work 12 extra hours per week during the school year, either before and after school or on the weekends.18 That's more than 430 extra hours over the course of the school year -- an amount of time nearly equal to the three months they have "off"!
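The arithmetic behind the 430-hour figure can be reconstructed roughly as follows. The 36-week school year is an assumption (the 180 school days mentioned earlier, divided into five-day weeks), and the 40-hour full-time week used for the comparison is likewise an assumption rather than a figure from the survey cited.

```python
# Figure reported in the article: extra hours worked per week during the school year.
EXTRA_HOURS_PER_WEEK = 12

# Assumptions: a 180-day school year made up of five-day weeks.
SCHOOL_DAYS_PER_YEAR = 180
SCHOOL_DAYS_PER_WEEK = 5

# Assumption: a 40-hour week as the yardstick for "full-time" work.
FULL_TIME_HOURS_PER_WEEK = 40

school_weeks = SCHOOL_DAYS_PER_YEAR / SCHOOL_DAYS_PER_WEEK        # 36 weeks
extra_hours_per_year = EXTRA_HOURS_PER_WEEK * school_weeks        # 432 hours
equivalent_full_time_weeks = extra_hours_per_year / FULL_TIME_HOURS_PER_WEEK

print(extra_hours_per_year)        # 432.0
print(equivalent_full_time_weeks)  # about 10.8
```

Under these assumptions, the unpaid extra time comes to roughly 11 full-time weeks, which is close to the length of the summer break.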

We find it morally reprehensible that the bar has now been raised to "world-class standards" for teaching and learning, while "third-world" standards continue to guide teachers' salary schedules. The public and politicians may tolerate this, but teachers won't for long.

10. The goals of the accountability system and the funding required to pay for all the things we have mentioned above must have the support of the public and of the political infrastructure. In other words, pay for it, or don't do it!

11. The public and the political infrastructure must support the accountability system by protecting the educational functions of schools and protecting the school environments within which those functions take place. If the schools must take on many other functions required to sustain the health and welfare of their students, then substantial human and fiscal resources must be added, supported, and sustained. The last two standards are pretty straightforward. They simply ask the public and the policy makers to put their money where their mouths are. They also ask that laypersons and politicians face up to the daily realities faced by many schools.

We have a suggestion. Every legislator and education policy maker should be required to spend three months in an urban school system, teaching (under the supervision of a certified teacher) for a month at each level: elementary, middle school, and high school. (If a month is simply too unrealistic, a day would probably suffice for many.) These classroom experiences should be repeated every three years or so, enabling the decision makers to maintain their credibility. They could thus join their similarly enlightened colleagues to rethink their notions about assessment and accountability systems -- not to abandon them out of frustration and cynicism, but to ponder them critically, optimistically, and responsibly.

 

In proposing these standards for standards-based accountability systems, we have tried to find another way to enter the often-polarized conversation about how to appraise our public schools. But we are not looking to find some middle ground, and we are not talking about compromising principles. Rather, we are talking about finding common ground that supports sound moral and educational practices.

Frankly, we do not believe that any existing standards-based accountability system lives up to our proposed standards. Of course, meeting world-class standards takes time. We would give the states about six years to get their systems up and running and to demonstrate substantial results in meeting the spirit and intent of these standards in authentic ways. (This is about two to six times the amount of time that states are giving the schools to improve.) But if substantial progress is not being made, if some states are still below "proficiency" on the 11 standards, then close down the accountability systems and let the teachers do their jobs in the classrooms as they have always done.

From the statehouse to the schoolhouse, serious discussions of how to help schools do better, keep track of their progress, and use information in caring and just ways for everyone involved will be essential. And those deliberations will not be easy, especially if the standards proposed here are taken seriously. Perhaps there is one of 50 states out there that is willing to take on this challenge and show the rest of us the way.


1. See, for example, Kenneth A. Sirotnik and John I. Goodlad, "The Quest for Reason Amidst the Rhetoric of Reform: Improving Instead of Testing Our Schools," in William J. Johnston, ed., Education on Trial: Strategies for the Future (San Francisco: Institute for Contemporary Studies Press, 1985), pp. 277-98.

2. Kathy Kimball was the assistant executive director of the Commission on Student Learning in Washington State, a body that was charged with developing and implementing the statewide assessment and accountability systems.

3. See the May 1999 issue of the Phi Delta Kappan.

4. See, for example, Robert L. Linn, Eva L. Baker, and Stephen B. Dunbar, "Complex, Performance-Based Assessment: Expectations and Validation Criteria," Educational Researcher, November 1991, pp. 15-21.

5. Portfolios are not panaceas; like all assessment tools, they have their own problems related to reliability and validity. Nonetheless, they can be useful sources of evaluative information for teachers and parents -- perhaps even entire education systems. See, for example, Sheila W. Valencia and Nancy A. Place, "Literacy Portfolios for Teaching, Learning, and Accountability: The Bellevue Literacy Assessment Project," in Sheila W. Valencia, Elfrieda H. Hiebert, and Peter P. Afflerbach, eds., Authentic Reading Assessment: Practices and Possibilities (Newark, Del.: International Reading Association, 1994), pp. 134-65.

6. Individuals have argued from many angles to support these notions. See, for example, Kenneth A. Sirotnik, "The School as the Center of Change," in Thomas J. Sergiovanni and John H. Moore, eds., Schooling for Tomorrow: Directing Reforms to Issues That Count (Boston: Allyn & Bacon, 1989), pp. 89-113; Seymour B. Sarason, The Predictable Failure of Educational Reform: Can We Change Course Before It's Too Late? (San Francisco: Jossey-Bass, 1990); and Mark A. Smylie, "Redesigning Teachers' Work: Connections to the Classroom," in Linda Darling-Hammond, ed., Review of Research in Education, vol. 20 (Washington, D.C.: American Educational Research Association, 1994), pp. 129-77.

7. Elliot G. Mishler first posed that question as the title of his classic article in the February 1970 Harvard Educational Review, pp. 1-19.

8. See Jeannie Oakes, "What Educational Indicators? The Case for Assessing the School Context," Educational Evaluation and Policy Analysis, Summer 1989, pp. 181-99.

9. See the special section on educational renewal in the April 1999 issue of the Phi Delta Kappan.

10. We had plenty of studies like these in the heyday of the school effectiveness movement, and we are beginning to see a new wave of them again today. And from this second wave of studies, the same "stuff" is being relearned: exceptionally good schools seem to have better principals and teachers who are more focused on instructional leadership, time-on-task, assessment of learning outcomes, a safe and orderly environment, and so on.

11. Linn, Baker, and Dunbar, op. cit.

12. See Linda Darling-Hammond and Beverly Falk, "Using Standards and Assessments to Support Student Learning," Phi Delta Kappan, November 1997, pp. 190-99.

13. Unfortunately, we are seeing a new political resistance to the use of matrix sampling, which yields highly accurate estimates of school-, district-, and state-level averages without requiring every student to respond to every question on the test. Indeed, the investment of time and money would probably be only 10% to 20% of current levels if we used such sampling, and we could assess a greater range of authentic, complex performance at the system level. With this method, however, student-level scores become less reliable and not at all useful for high-stakes decisions. Yet why do we need individual-level scores for purposes of evaluating systems like schools? Moreover, why can't we use all the other more useful and diagnostic information that can be obtained for each student when it is appropriate to do so?

14. We are well aware of the problems that dog authentic performance assessment: e.g., the inverse relationship between reliability and generalizability, the negative impact of overly specified rubrics on teaching and student performance. Nonetheless, we believe that a balance can be achieved between high-quality classroom practices and system-level assessments in an accountability environment that is genuinely sensitive to the standards we are proposing.

15. See Lorrie A. Shepard, "The Effects of High-Stakes Testing on Instruction," paper presented at the annual meeting of the American Educational Research Association, Chicago, 1991.

16. See Ann Bradley, "States' Uneven Teacher Supply Complicates Staffing of Schools," Education Week, 10 March 1999, pp. 1, 10-11.

17. These are estimates from data in National Center for Education Statistics, Digest of Education Statistics (Washington, D.C.: U.S. Department of Education, 1997).

18. Reported in National Center for Education Statistics, Findings from the Condition of Education, No. 7, Teachers' Working Conditions (Washington, D.C.: U.S. Department of Education, 1996).


KENNETH A. SIROTNIK is a professor in the College of Education, University of Washington, Seattle, and director of the Institute for the Study of Educational Policy. KATHY KIMBALL is an assistant professor in the College of Education, University of Washington, and director of Administrator Preparation Programs.

 


Kappan Professional Journal
Last updated 8 December 1999
URL: http://www.pdkintl.org/kappan/ksir9911.htm
Copyright 1999 Phi Delta Kappa International