Consumer-Referenced Testing

Many Americans have less accurate information about the tests used to measure their children's academic achievement than they do about a can of beans, Mr. Behuniak argues. If we wish to remedy this situation, we need to focus on the needs of those who are the "consumers" of our tests.



by Peter Behuniak

WHY ARE so many people so in love with testing? As a former teacher turned measurement specialist, I recall having left the classroom to lend my efforts to improving the quality of student assessments. I have long considered assessments as educational tools -- not as solutions to our problems or as substitutes for good instruction. So who are all these folks who are so infatuated with more frequent testing in more subjects with more students? Why are we so emotionally invested in these educational tools while practitioners in other fields respect the utility of the tools of their trades but remain essentially dispassionate about them?

To answer these questions, it is useful to consider each of the major constituencies in education. Educational administrators at the local level are certainly emotional about high-stakes testing. There is an obvious reason for this, since their jobs are often riding on the test results. However, this group hardly falls into the category of ardent testing proponents. Similarly, teachers are almost never advocates of large-scale assessments, though they regularly use both formal and informal measures on a small scale within their own classrooms and schools. Parents are generally supportive of broadly defined assessment programs because they like having the feedback. They tend to bring a bottom-line perspective to the process, skipping much of the rhetoric and detail that educators tend to dwell on -- "Just tell me how my daughter did and what you're going to do as a result." These groups are definitely not the source of the demands that are causing the proliferation of high-stakes testing.

Legislators are the logical choice, since virtually every high-stakes testing program has legislative backing. However, it is naive to think that such widespread legislative support is occurring in a vacuum. Each legislative action promoting high-stakes assessments reflects perceptions in the general population that the large institution of education ought to be held accountable. Kurt Landgraf of the Educational Testing Service, in his testimony supporting President Bush's testing proposal before Congress, commented, "Results from these tests will provide important information that the American people and policy makers need to move this matter forward and to ensure significant education reform."1

Most political and business leaders and many educational leaders would agree. Thus the source of the trend toward high-stakes, large-scale assessments in our schools is Main Street, USA -- the public, supporting through its representatives a perceived need for objective indicators of educational achievement.

There is an interesting footnote to this observation that the culprit in the call for accountability is all of us. Just as parents tend to go to the bottom line with regard to their children's progress, so too does the public have a penchant for skipping the details and demanding the executive summary version about how well our schools are doing. It is usually satisfactory to the public if the presiding body -- usually the local, state or national education agency -- can provide objective information that confirms that reasonable progress is occurring. Members of the public are not often as concerned with the specifics about which educators love to argue, such as content, format, procedural rules, and the methods by which standards are set. This is an important point.

Introducing Consumer-Referenced Testing

The current circumstances pose a real opportunity. Broad support and resources are available for building systems to provide reliable indicators of student progress, and there are many possible ways to proceed. Yet the manner in which testing programs have been implemented has caused protests, boycotts, and criticism. The time is now to approach the issue of large-scale assessment from a new perspective.

The introduction of the concept of Consumer-Referenced Testing is intended to be a reminder of a crucial but elusive element of developing useful educational assessment systems: we need to understand clearly the purpose of the tests we are imposing on our children. In most arenas, consumers' needs drive the creation and delivery of products and services. A high-quality product or service is one that successfully meets a consumer need. If you think of what distinguishes the products and services with which you are most satisfied, the characteristics of dependability, effectiveness, cost-effectiveness, and ease of use come to mind.

Educational assessment systems can be created that exhibit these traits. However, to achieve this aim, it is necessary to focus on the needs of the consumers of the public education system as a means of establishing priorities for our testing programs. The better we articulate our reasons for creating these systems, the better our chances of producing effective programs. A good place to start is to identify the consumer needs that most often form the basis for new large-scale assessments.

One reason for the strong interest in testing is our penchant for quantifying our environment and our experiences. Most people want objective, empirical evidence about their environment. The observation that it is "chilly" outside is not as informative as "It is 18°F with 25-mph gusts of wind." Education is particularly well suited to quantitative analysis, a fact that has helped to shape many of the developments in educational assessments over the past 20 years. Writing about the advent of minimum competency testing, David Cohen and Walter Haney observed: "Schooling is considered in discrete quantitative entities -- years, semesters, test scores -- and in this respect is unlike most other social services. This common quantitative language for discussing achievement makes education particularly susceptible to input-output analysis."2

The danger with using assessment programs for this purpose is overquantifying or oversimplifying educational achievement. Few would advocate trying to summarize student learning based on a single measure or indicator. Yet the development of testing programs with a focus on consumers' needs cannot ignore the relevance of quantification to increasing the public's comfort level.

A second common purpose for assessment programs is to force the education system to be accountable. There are serious reasons to be concerned about this aim. If it is true that different functions are best served by different tests, then mixing the function of accountability with more instructionally related purposes should be considered the educational equivalent of starting a five-alarm fire and should set all warning bells clanging. Lorrie Shepard pointed this out when she subtitled a portion of her 2000 AERA presidential address "Protecting Classroom Assessment from the Negative Effect of High-Stakes Accountability Testing."3 However, despite the problems with this approach, it is a modern-day reality that the need for accountability -- as perceived by education's consumers -- has placed an additional burden on the shoulders of large-scale assessment programs.

Our capacity to develop high-quality assessment systems that serve our need for accountability may depend on achieving a constructive reconciliation of conflicting perspectives. There are two universal truths about applying large-scale assessment for accountability purposes: 1) it makes many educators uneasy, and 2) it sounds like a good idea to noneducators. Both groups have some justification for their perspectives but tend to support their views rather single-mindedly. Educators often cite limitations of the tests, the danger of narrowed curricula, and the need to attend to students' individuality as reasons for their resistance. Noneducators largely consider these arguments to be esoteric whining. Their emphasis is placed on the need for objective indicators to confirm academic progress. They frequently chastise educators for being fearful of an external audit.

A third purpose, one that many consider the central role for assessments, relates to the desire to have measures that guide or focus classroom instruction. The argument is compelling. Identify the most essential academic skills and outcomes, focus the education system appropriately, and regularly monitor student progress via strategically placed assessments. Unfortunately, various aspects of higher-stakes, externally mandated assessments undermine the prospects of attaining this purpose. As the stakes attached to testing rise, so too does the likelihood of distortions of sound instructional practices. James Popham observed: "The nation's educators, the ones who are trying to teach children, find themselves in a no-win situation. They are under increasing pressure to boost achievement scores, yet the tests being used to obtain those scores are large-scale assessments designed specifically to fulfill an accountability function rather than an instructional function."4

This apparent paradox might appear to be hopeless until one realizes how many successful applications of educational assessments occur every day, despite the complications caused by these cross purposes. Every capable teacher can provide numerous examples of ways in which formal and informal means of assessing student achievement helped to diagnose a learning problem, document progress, identify an effective instructional approach, and produce numerous other desirable outcomes. These benefits accrue directly to the consumers -- the students, their parents, and the public at large -- as enhancements to the educational experience.

Meeting the Challenge

The challenge is to design educational assessment systems based on clearly articulated needs of educational consumers and to implement these systems in ways that enhance rather than undermine the efficacy of the teaching and learning going on in the schools using them. The challenge is to move educational assessment from a no-win to a win-win experience. I believe this is possible if educators and policy makers are willing to cooperate to make changes in a number of conceptual and applied areas. We need to recognize that if we allow testing to become a battleground, students will be the victims.

There are five key areas that are ripe for development. The suggestions offered for each of these areas are based on well-established principles of sound assessment practices as well as on practical experience and even common sense. All are worth attempting as positive steps toward a healthier educational climate in the classroom.

Designing Assessment Systems

The design of a testing system is critical to its success. Unfortunately, we tend to design tests, not systems. First, let's look at the problem. The typical pattern with large-scale assessments looks like this. A law or mandate is enacted, and a test is implemented. Soon a new mandate is created (perhaps by a higher policy or legislative body), and a new test is implemented either in addition to or as a replacement for the first test. Either option causes confusion and disruption.

I recently reviewed one "system" of statewide tests (with fairly high stakes attached):

1. criterion-referenced tests: grades 3, 5, and 8;
2. norm-referenced tests: grades 2, 4, and 6; and
3. graduation tests: high school.

The problem with this design is not the mere presence of different types of assessments in the same system, for it is possible for both norm- and criterion-referenced measures to provide useful information. The problem is that complex systems require higher degrees of integration, with the role of each component articulated in relation to the purpose it is intended to fulfill. Do teachers, parents, and students know what to do based on the test results from grades 2, 3, 4, and so on? If they do not, the design is flawed.

Recognizing that assessments must satisfy multiple purposes is a necessary but not sufficient first step. In addition, close attention must be paid to the ways in which components of the assessments interact with one another. A sixth-grade math test of very basic skills with a rigorous standard will not have the same effect in the classroom as a sixth-grade math test covering demanding, more advanced concepts but with a very low passing standard. Both tests may initially produce the same proportion of students meeting the goal, but the use of these tests will produce very different reactions from students, teachers, and parents. It is important to note that there is not just one right answer here. Either of these options might be appropriate, depending on the context. The crucial point is that the implications of the test features need to be studied and used to guide design choices.

The interaction of the test with other educational entities is also important. Tests affect curriculum development, instructional strategies, allocation of instructional time, scheduling, and professional development. Projecting the likely effects of tests on these critical educational functions should be considered a prerequisite for any agency intending to implement an assessment system.

A multi-tiered approach is almost certainly necessary whenever an assessment system must satisfy multiple purposes. For example, a state legislature may want a vehicle for accountability while district educators prefer a lower-stakes measure directed at identifying student strengths and weaknesses. The district superintendent, local board of education, and the state education agency may have additional purposes in mind. An assessment system could be designed that includes a high-stakes, top-tier assessment administered statewide on an annual or biannual basis to satisfy the accountability purpose. A series of smaller, coordinated measures aligned to the same content could be designed, to be used in more frequent but shorter administrations to satisfy those who need data for instructional purposes. Other purposes may require additional tiers of testing or reporting. For example, the district superintendent may want to collect classroom assessment data once annually to produce a school- or district-level report while allowing most classroom results to remain solely in the hands of the teachers.

The assessments required in the No Child Left Behind legislation could become part of such a multi-tiered system. The nationwide mandate for assessment has a clear aura of accountability about it. The challenge will be to build a system that allows legitimate educational purposes to be met at the state, district, and classroom levels without being overwhelmed by the magnitude of the national component. This will require conceptualizing how the pieces fit together in a much more coordinated manner than has typically been done. Catherine Snow and Jacqueline Jones concluded their observations regarding a potentially useful role for the national assessments as follows: "Annual [national] tests should be one piece of an integrated system of ongoing classroom-based assessment and professional development, targeted where the need is greatest."5

One interesting opportunity afforded by the institution of a nationwide assessment component is that models could be designed incorporating the national perspective but using different combinations of options at the state, local, and classroom levels. These models could accommodate the varying political and educational climates that exist in different communities.

Professional Development

Professional development is a serious consumer need that often receives inadequate attention. There are at least three areas in which teacher effectiveness with regard to the use of tests and test results could be enhanced: 1) understanding of the salient features of different tests and basic measurement principles, 2) familiarity with the specific attributes and purposes of assessment programs directly affecting one's students, and 3) facility with making intelligent use of available results.

An achievement test is a useful tool in the hands of a knowledgeable educator. Unfortunately, too many teachers enter the classroom lacking even a rudimentary understanding of how to select and use tests for particular purposes. Basic measurement principles, such as standardization, are widely misunderstood. Then, for the new teacher alone at the head of the class, it gets worse. Some set of tests (often poorly conceived) is adopted as a matter of policy, and the new teacher is expected to muddle through. This is the educational equivalent of "Survivor": everyone is on his or her own and whoever is still standing when the test results are released is considered best. Apparently, this formula is what passes for interesting entertainment, but it is not a very promising way to promote good instruction.

It is tempting to lay most of the need for initial professional development regarding educational assessment in the lap of teacher education institutions. It has long been known that improvements are necessary in both the content and the emphasis of these programs. However, it is probably more realistic to have the responsibility shared between the undergraduate programs that prepare teachers and the districts that employ new teachers. A regional approach that provides high-quality inservice activities combined with district support to release teachers from classroom responsibilities during the first year or two would be one reasonable alternative. It is important that the training and support be provided early in a teacher's career to avoid allowing misunderstandings or unproductive practices to become ingrained.

The second professional development need concerns the specifics of the assessments in use in an educator's school. The focus of this inservice training would be to fully explore the purposes and features of any assessments a teacher (or administrator) could reasonably be expected to use in his or her current assignment. These would, of course, include all assessments whether mandated districtwide (or statewide or nationwide) or available for optional use at the classroom or school level. Such training may seem to be an obvious suggestion, but it is one that is simply not acted on in many places.

One frequent observation is that "Tests drive instruction." This is true in part because teachers are often made to wait until the administration of a test is imminent (or history) before they become aware of the content to be assessed. What choice do they have but to play catch-up? This problem is not the fault of teachers; rather, it reflects the absence of educational leadership in the sanctioning agency. The preparation for implementing a new assessment system ought to include as standard operating procedure the thorough familiarization of all educators with the content represented by the tests, as well as the tests' purposes, formats, and reporting procedures. Teachers should have ample opportunity to discuss and receive training regarding appropriate instructional strategies linked to the assessed content.

The third focus for professional development concerns improving teachers' use of test results. The best test, carefully tailored to meet important purposes, will ultimately fail if teachers leave the results sitting on the shelf. To an extent, the professional development efforts described above will have a positive effect on this aspect of assessment as well. The reasonable and effective use of test results, however, deserves special attention in the training of both new and veteran educators. One subtle but important measurement principle establishes that validity is determined by the use of a test score for a particular purpose or decision rather than for a test per se. This means that each action or decision made by Ms. Jones in her classroom must be justified if the outcome of the testing is to be valid for her students. This further means that the validity of large-scale assessment systems cannot be guaranteed by any central agency at the national, state, or district level. It can only be achieved one school and one classroom at a time.

The Testing Experience

The discussion in this section focuses on how students perceive their participation in testing. It concerns the nature of the tests and how they are delivered. Testing proponents often justify testing as being another learning experience. If that is true, and I believe it is, then we should do all we can to make participation in testing -- not only the actual test administration period, but also the weeks or months leading up to the test -- a positive experience.

Most of the discussion of the impact students experience from testing focuses on major program or life-changing decisions: the denial of a diploma, retention in grade, or enrollment in a special program. There are many other aspects of the student experience that are less obvious but still important in shaping the role testing plays in schools. For example, consider the issue of test preparation. At one extreme, a student in an early elementary grade who receives little or no preparation may experience a formal test period as a confusing or even frightening episode, drawing nothing beneficial from the exposure and producing no usable results. At the other extreme, over-preparation for tests can degenerate into a repetitive, redundant series of mind-numbing exercises that can lead students to the type of resentment and hostility that is the stuff of protests and boycotts, particularly with older students.

A workable compromise needs to be fashioned from a moderate approach that improves the fairness of the assessment by leveling the playing field without usurping undue proportions of classroom time. Improving the testing experience for students has as much to do with the knowledge and attitudes of educators as it does with the format and procedures of the assessment. The departures from good practice illustrated in the above examples typically occur in the absence of the types of professional development described earlier. Teachers and administrators who have been afforded the opportunity to understand the purpose and nature of the assessments they are expected to use will be much better equipped to deliver the measures in a constructive and supportive manner. Clearly, the work of shaping the testing experience needs to begin long before the first day of testing.

A sporting event is a maximum-effort activity for the competing athletes. The event would lose meaning if the athletes didn't try or were half-hearted. Similarly, achievement tests rely on students' demonstrating the best work of which they are capable. Yet it is frequently true that teachers share openly with students their misgivings, complaints, or dislike regarding mandated assessments. These attitudes can have a powerful depressive effect on student performance on the tests, as well as on student attitudes about testing in general.

These comments are not intended as a blanket indictment of educators' complaints. There are many assessments that are poorly conceived or poorly implemented, at least in some respects. It is responsible for educators to initiate steps to correct problematic aspects of mandated assessments. Teachers are being professionally responsible if they work with administrators, policy makers, and other teachers to institute better assessments. However, reducing student motivation can only serve to compound any problems that are already present.

It becomes reasonable to expect teachers to make student participation in a test a positive learning experience only if the attributes of the assessment are appropriate for the stated purposes. Much has been done over the past two decades to improve the nature of educational assessments. Greater care is given to identifying target content and levels of cognitive demands that form the basis for the tests. Multiple-item formats and extended tasks are now common, broadening content coverage and improving validity. The increased use of manipulatives, hands-on experimentation, and portfolios is an enhancement that can provide additional benefits in certain programs.

However, there is a need for greater attention to the manner in which these test formats are integrated within the overall design of the assessment system. It is far too common for certain features to be selected without adequate consideration for how students will ultimately be affected. Some features work better as lower-stakes, informal applications. Other features stand up better to the greater rigor and demands of high-visibility, high-stakes tests. Guided by knowledgeable teachers working within well-designed assessment systems, students can be brought to the point where they experience a series of formal and informal tests as beneficial opportunities to demonstrate their skills and abilities and to receive constructive feedback.

Technology

Technological advances have already had a profound effect on educational assessments, but the developments so far represent just the first step down a path that will eventually revolutionize the field. Examples of existing assessment technology include online and offline computer-based testing (CBT), computer adaptive testing (CAT), response imaging, Internet-based reporting, and CD-ROM and Web-based professional development software. These innovations have increased test delivery options, introduced new formats, improved efficiency, and expanded monitoring and reporting alternatives. One sign of the times is that most major assessment programs make use of a website as a communication tool.

The next few years promise to be critical in determining how the field of assessment embraces the emerging technological developments. Virtually every commercial and not-for-profit organization involved in large-scale assessment is working on how best to use technology to improve educational assessment. As the level of hardware in schools rises, the possible uses expand at an even faster rate. Also, the continued wiring of American schools to the Internet and to various wide and local area networks geometrically increases the opportunities available.

In a forward-looking article titled "How the Internet Will Help Large-Scale Assessment Reinvent Itself," Randy Bennett highlights one of the reasons that we currently are at a crossroads. He discusses CBT and the concepts of sustaining versus disruptive technology.

Historically, most technological advances in any given industry have been sustaining ones (e.g., in the personal computer industry, faster chips and bigger, higher-resolution monitors). Occasionally, disruptive technologies emerge. Companies introduce these technologies hoping their features will provide competitive edge. However, these features characteristically overshoot the market, giving customers more than they need or are willing to pay for. Thus, disruptive technologies result in worse product performance, at least in the near-term. . . .6

The disruption is already apparent. Some institutions are launching large-scale CBT and other technology-based efforts at unprecedented levels, despite the obstacles. Many other institutions are remaining steadfast in their commitment to traditional testing programs, based on either skepticism or lack of resources or both. The challenge to the assessment field is to manage this transition period so that each new technological advance is used to improve how our assessment systems achieve their intended purposes.

Many topics currently warrant consideration and possibly research:

Managing technological developments in these and other areas will require cooperation and a sustained effort. But the potential benefits of laying a solid foundation for this revolution are too important for us not to invest such energy.

Sunshine Mentality

All government agencies and many of the private organizations that provide assessment support to them are required by law to share evidence of their work (e.g., records, documents, minutes) with anyone who might be interested. Yet test questions must be held confidential for security purposes. This necessity creates a tension that operates mostly to the detriment of achieving the maximum benefits and purposes of high-quality assessment systems. We need to ask ourselves, "Why should the public (or teachers, or parents) support the purposes of an assessment system operating in secrecy?"

Let us acknowledge that, while all testing programs make some information available, much more can be done to improve this situation. Take the issue of advance notification when a new system is being implemented. All testing programs issue some announcements, typically in the form of memoranda or brochures. Perhaps meetings or workshops are held. These are good ideas, but they are grossly insufficient. These efforts are generally defended as "the best we can do" because of politically established timetables or limited budgets and staffing.

It is illogical and counterproductive to implement high-stakes assessments before teachers have had a reasonable opportunity to become familiar with the covered content and introduce appropriate instruction in the classroom. The courts long ago indicated the minimum due notice required when student property rights, such as a high school diploma, are at stake. However, we as educators should be driven not by the minimum but by reasonableness and common sense. I suggest that there is only one appropriate implementation sequence to support meaningful educational improvement: 1) define and disseminate target content, 2) align classroom instruction, and 3) initiate assessments. For most practical purposes, other sequences should be abandoned.

The need for the security of the test questions poses some interesting problems. It is clear that test validity will suffer if students enter the test session having memorized or rehearsed their answers based on prior exposure to the questions. However, in many cases, this situation is already occurring, even while the security of the actual test items is generally maintained. Many test specifications are very detailed and are supported by sample items, practice tests, and previously released tests. The coaching activities for the SAT are a good example. If the live test items are mirror images of the practice test questions and students practice repeatedly prior to the live test, it is foolish to ignore the implications for instruction and for interpreting assessment results.

Measurement specialists support the principle that educational consumers should be dealt with fairly.7 Believing in the principle, however, is not the same as delivering on the promise. We can and should do more to bring our assessment programs out of the darkness produced by a black-box mentality and a veil of secrecy. There is a need to shine some light on the appropriate relationship between a well-conceived assessment program and the education system it is intended to serve. Here are some steps that can help to accomplish this.

One use and out. Each administration of a large-scale assessment should be followed by the prompt release of most if not all of the test questions. Other than a few questions needed for linking one year's or one grade's test to the next, the main reason for test reuse is cost. We should spend the money. The secrecy, the breaches in security, the negative perceptions of what is being "hidden" are not worth the cost.

Provide support materials and information. Abundant and timely support materials and workshops need to be provided to all interested parties at least two years before anyone will be held accountable -- and farther in advance for truly high-stakes assessments. The features covered should include content frameworks; test content and formats; appropriate instructional strategies; identification of target populations, including available accommodations; purposes of the program; details of the accountability model; and how to obtain additional resources.

Articulate appropriate test preparation. It is an act of negligence for agencies to implement large-scale assessments and then do little or nothing when teachers spend weeks or months drilling students on very similar items and tasks. This is a matter of professional responsibility and awareness. Once teachers are fully informed about the content and the format embodied by the assessment, they can and should be guided through professional development activities to employ sound instructional strategies that offer their students much more than do repetitive drills. The agency implementing the assessment shoulders the responsibility to see that this is accomplished.

Report results quickly and effectively. A consumer-oriented assessment program must recognize that much of the potential value of any such program is lost if teachers and parents do not understand the results or need to wait too long to receive them. Timely reporting using multiple media for dissemination should be a priority. CBT and websites are two technological developments that offer improved response times and reporting options. A major future need is to build support mechanisms such as CD-ROM and Web-based tutorials and in-person workshops to promote the reasonable interpretation of test results.

Institute disclosure statements. Many years ago the Food and Drug Administration mandated informative labels for all food products. The Securities and Exchange Commission has long required disclosure statements regarding financial instruments, although these are not nearly as user-friendly. As a result, today's consumers have basic nutritional and financial information to guide their choices and decisions. The measurement community should undertake the same enhancement with educational assessments, both to ensure availability of the information and to promote a productive dialogue. The information should be presented briefly and simply, with more extensive details available for the asking. Possible elements that might be included in such a disclosure statement are:

The components of the statements could be recommended by a national, broadly constructed panel. This action would help address the fact that, currently, many Americans have less accurate information about the measures of their children's academic well-being than they have about a can of beans.

Concluding Remarks

This country is in the midst of a several-decades-long expansion of large-scale assessment in terms of both quantity and stakes. In this article, I have argued that we should focus at least as much attention on improving the quality of the new assessments. To accomplish this aim, we must keep a focus on the needs of the consumers being served. These needs dictate more carefully designed, integrated assessment programs; better professional development opportunities for educators; improvements in how students experience educational tests; greater use of technology; and the adoption of a more open approach to dealing with stakeholders. Vernon Law is credited with the maxim "Experience is the worst teacher: it gives the test before presenting the lesson." We should be able to ensure that education does better than this for our students.


1. Kurt M. Landgraf, Using Assessments and Accountability to Raise Student Achievement (Princeton, N.J.: Educational Testing Service, 2001), p. 4.
2. David K. Cohen and Walter Haney, "Minimum Competency Testing and Social Policy," in Richard M. Jaeger and Carol K. Tittle, eds., Minimum Competency Achievement Testing: Motives, Models, Measures, and Consequences (Berkeley, Calif.: McCutchan, 1980), p. 16.
3. Lorrie A. Shepard, "The Role of Assessment in a Learning Culture," Educational Researcher, October 2000, p. 9.
4. W. James Popham, "Where Large-Scale Assessment Is Heading and Why It Shouldn't," Educational Measurement: Issues and Practice, Fall 1999, p. 14.
5. Catherine E. Snow and Jacqueline Jones, "Making a Silk Purse . . . ," Education Week, 25 April 2001, p. 41.
6. Randy E. Bennett, "How the Internet Will Help Large-Scale Assessment Reinvent Itself," Education Policy Analysis Archives, February 2001, pp. 1-24.
7. See Standards for Educational and Psychological Testing (Washington, D.C.: American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999); and Code of Fair Testing Practices (Washington, D.C.: Joint Committee on Testing Practices, 1988).


PETER BEHUNIAK, formerly director of Student Assessment and Testing for the state of Connecticut, is acting chief of Certification and Professional Development for Connecticut and an educational assessment consultant.

 





Kappan Professional Journal
Last updated 22 October 2002
URL: http://www.pdkintl.org/kappan/k0211beh.htm
Copyright 2002 Phi Delta Kappa International