By Robert Ervin
What is evolving in Bangor, Maine, Mr. Ervin points out, is a seamless, continuous-progress performance assessment in reading -- and all because 24 first-grade teachers objected to the use of standardized achievement tests.
IN THE SPRING of 1990, 24 first-grade teachers in Bangor, Maine, objected to the use of standardized achievement tests. These achievement tests were routinely administered in grades 1 through 10 and were the basis for analyzing pupil progress. The issues raised by the teachers focused on familiar problem areas: accountability without validity, instructional irrelevance, and one-shot measurement of a full year's work. Their objections focused especially on reading, a subject over which the teachers believed they had clear territorial rights. The response from the administration was simple: find or build a better assessment.
What was greeted as a victory soon became a daunting challenge. The teachers' initial relief that the standardized test was gone faded when they realized that they were responsible for coming up with a replacement. The teachers worked together, but the energetic and confusing exchanges characteristic of the national reading debate soon took over. After a year of unproductive discussions, standardized testing was back in Bangor. That reintroduction shocked the teachers into reaching consensus and taking action. And so began a five-year process of assessment development in the areas of reading and writing.
Reading instruction in Bangor's five elementary schools exhibited the pedagogical mixture found in many schools. The predictable flow of basal instruction coexisted with the wide-open empowerment of literature-based instruction. Unfortunately, the entire instructional process lacked a coherent assessment of student achievement.
In the spring of 1991, a "literacy assessment team" composed of five first-grade teachers and the assistant superintendent of schools set out to define and consolidate the beliefs about reading and the expectations for performance across the school system. Teacher involvement in this process was critical to the eventual acceptance of the team's efforts, and the members of the team returned to their school faculties again and again to seek input. The team was quite certain that an agreement on the instructional goals for beginning readers was fundamental to any constructive dialogue. It was also the answer to any future questions about validity.
After months of effort, the literacy assessment team published a working draft of the beliefs and expectations of the district and began the process of designing and piloting an assessment. Initially, the team drew inspiration and borrowed actual content from the work of the Upper Arlington (Ohio) School District. Like Bangor, Upper Arlington had implemented Reading Recovery, was having teachers systematically place literature into levels for reading instruction and assessment, and was using a running record of student performance for assessment purposes. What has evolved is the Bangor Assessment of Reading (BAR), a process, conducted three times a year, of assessing fluency, construction of meaning, comprehension, reading strategies, and student attitudes toward literature.
The impact has been astounding. In 1991 a systemwide random assessment using the running record placed the district's approximately 400 first-graders at an instructional reading level (IRL) of 12, as defined by the assessment team. By 1995, three years after the start of extensive standards setting, staff development, and "homegrown" performance assessment, 90% of first-graders were reading at or above IRL 16. Most students were well beyond IRL 20. Of the remaining 10% of students who were below IRL 16, 6% were students with significant handicaps. Data through spring 1998 are in line with these figures. Standardized norm-referenced reading tests have been eclipsed by a more powerful form of performance assessment used in every classroom. Not coincidentally, the BAR has served as a powerful device for personal and systemwide professional development.
Building a Better Assessment
The BAR categorized first-grade reading materials into 30 levels, which can be used to develop a profile of an individual student reader in five categories. Initially, leveled literature was borrowed from other school systems or from publishers and compared with criteria developed at Ohio State University. It was evident that commonly accepted criteria for leveling did not exist. Furthermore, the Bangor schools intended to use only high-quality literature, and too often the publishers' material that was pegged to various levels was deemed unappealing. After numerous field tests and evaluations, the core of the BAR was presented as a kit of books linked to specific levels, with enough alternative choices to permit retesting. Not surprisingly, the assessment experience of seven years has led to the selective replacement of some titles with more accurately leveled literature.
Equally important, the kit also contained a manual to standardize the practice of assessment. If the BAR was to have credibility, teachers had to analyze their students' reading performance with high reliability. This reliability derives partly from specific assessment protocols that are strictly applied in classrooms and partly from prescriptive criteria for evaluation in each of the five assessed areas. For instance, assessment in comprehension involves specific questions -- literal, inferential, predictive, and critical -- for each book, along with a rubric of rated responses. In assessing the construction of meaning, the teacher pursues understanding of the main idea, setting, major events, primary characters, and story sequence. Fluency is not generalized but draws from specific descriptors of volume, enunciation, pronunciation, intonation, and rate. Reading strategies are assessed through the running record, which permits an accurate analysis of cuing systems, self-correction rates, and (in Bangor) graphophonic errors. While the tight requirements of administration and evaluation provide some initial consistency, it is the calibration of the performance analyses of many teachers as part of our ongoing staff development that ensures high reliability.
Teacher Training Is Critical
There is no doubt that the arrival of Reading Recovery in Bangor created a welcome atmosphere for teacher dialogue. The methodology and content of the program focused attention on reading. At long last educators in Bangor had an instructional approach that provided a common vocabulary, stressed data-based decision making, and made use of the inherent appeal of children's literature.
The Bangor School Department and the nearby University of Maine, Orono, started to offer courses specifically designed to support the data-based teaching of literacy. Reading instruction became less art and more science. While the analysis of beginning readers became the focus of faculty conversations, we still lacked the comparative data to launch a systemwide discussion of improvement. With the development of the BAR, teachers had the means they needed to improve the reading achievement of their students.
Nevertheless, teachers' evident enthusiasm for literature-based instruction was tempered by an understanding of the problems of performance evaluation. The track record for similar assessment projects was not good. However, the goal of finally having a response to persistent questions about reading achievement in plain and statistically unadorned language seemed worth pursuing. But an assessment system of this kind requires multiple evaluations and a significant commitment to training. The teachers watched the trial runs of the assessment in early 1993, and in late March the literacy assessment team unveiled the new assessment.
In spite of the tremendous demands on their time, the teachers did not hesitate to embrace the new assessment. They sensed that they were part of a grand experiment. A day was spent in intensive training, and members of the literacy assessment team carried their message into school faculty meetings. Faculty workrooms buzzed with conversations about reading comprehension issues and discussions of student work. The more teachers talked, the more improvements they suggested, and the literacy assessment team members moved quickly to blend the teachers' emerging ideas with the evaluation goals.
By late spring of 1993, all students had full reading profiles, and the teachers were ready for the next stage of implementation: the comparison of instructional reading levels with a developmental reading scale, a parallel assessment instrument that evaluated the characteristics of developing readers and generated a developmental reading level (DRL). At a "teacher academy" held the following fall, teachers were challenged to assess their students by means of this additional set of developmental criteria. When members of the literacy assessment team recognized the value of the intersection of the two assessments, the school system was ready to undertake a grassroots process of standard setting.
Standards for Teachers and Administrators
The availability of solid achievement data fueled discussions of student performance among teachers and administrators. The teachers were moving at lightning speed, and the principals quickly joined their deliberations. Recognizing that expectations for achievement were rising, the central office supported the activities of both groups. Classroom doors opened as teachers sought answers. Comparisons of student work became commonplace. Percentiles and grade equivalents were gone. Old systems of rating teachers evaporated. And questions arose: Where do my students stand? How are others doing? How good is good enough?
From such discussions teachers and principals derived standards, first through an averaging of performance within and between classrooms and finally through school- and systemwide agreement on what students must be able to do. The collaboration led to high reading achievement as measured by the indices of accuracy, comprehension, fluency, and meaning. The efforts of teachers have been rewarded with higher student achievement, and, not surprisingly, the teachers have been eager to include parents in their success.
Educating Parents
The new standards in first grade in Bangor are unequivocally committed to grade-level achievement in reading, and parents are advised of the performance of their children through a systemwide reporting process. From the first day of school, parents receive a steady flow of letters and descriptions of the instructional and developmental scales, and this correspondence continues throughout the year.
Parents learn that assessments are given in September, January, and May and that the first one is designed to coincide with the fall parent/teacher conferences. At the conferences parents review the starting IRL and DRL of their child. They also receive the first comprehensive "literacy assessment profile," as well as a "literacy assessment graph," which defines unambiguously the goal of their child's annual program. Parents can easily find the shaded area on the graph that indicates the minimum performance expected of all Bangor first-graders. Many students far exceed this level by the end of the year, and parents can follow their child's growth from the first assessment in September to the last in late May. In addition, quarterly report cards have been modified to record progress by means of the same performance descriptors as the literacy graph.
Change Continues
Inspired by the successful work of the literacy assessment team in first grade, a second-grade literacy assessment team was formed in 1993. The members understood that placing second-grade literature into levels would require different definitions from those used in first grade. Their challenge was compounded by the need to address multiple genres and the higher levels of comprehension required in the curriculum. Once again, the beliefs and performance expectations that second-grade teachers bring to their reading instruction had to be examined and established. And any assessment had to incorporate an extensive measure of comprehension and independent written responses.
Moreover, a connection with the first-grade assessment was imperative. In fact, after witnessing early developments in second-grade work, the first-grade literacy team quickly created a "transitional reading assessment" to help advanced first-grade readers bridge the gap to the second-grade program. In two years, the development of test protocols, piloting, and staff training were complete. Kits and rubrics were put into final form. And in September 1996 the second-grade assessment was under way and was receiving enthusiastic endorsement. Today, Bangor teachers train other Maine schools in the use of their assessments.
Not surprisingly, the third-grade teachers soon took up the challenge of continuing the development of the assessment system. What is evolving is a seamless, continuous-progress performance assessment in reading that provides teachers with valid feedback on their instruction, that relates information about growth in achievement to parents in clear terms, and that reports the progress of the school system to the public.
Kappan Professional
Journal
Last updated 25 January 1999
URL: http://www.pdkintl.org/kappan/kerv9811.htm
Copyright 1998 Phi Delta Kappa International