Reports
Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.
#371 – Can Portfolios Assess Student Performance and Influence Instruction? The 1991-92 Vermont Experience
Daniel Koretz, Brian Stecher, Stephen Klein, Daniel McCaffrey, and Edward Deibert
CSE Report 371, 1993
Summary
Vermont's statewide assessment initiative has garnered widespread national attention because of its reliance on portfolios of student work. This 145-page report describes the results of a multifaceted evaluation of the program and provides information about the implementation of the Vermont assessment; program effects on educational practice; the reliability and validity of portfolio scores; and tensions between assessment and instructional reform. "Findings from the evaluation," said the research team, "suggest that the assessment program resulted in changes in curriculum content and instructional style." The researchers also noted that the amount of classroom time devoted to problem solving increased, as did the amount of time students worked in small groups. Finally, portfolios seemed to increase teachers' enthusiasm for their subjects and for teaching. While there was widespread support for the reform at the school level throughout the state--nearly one-half of the schools were voluntarily expanding the use of portfolios to other grade levels--substantial problems remained. The mathematics portfolio assessment created new burdens for principals, teachers, and students, including demands on teachers' time and school resources. Over 80% of fourth-grade teachers and over 60% of eighth-grade teachers reported that they often had difficulty covering the required curriculum. The researchers anticipate that some of these demands are likely to decline with experience, although others represent continuing burdens. "The Vermont experience has important implications for reforms that are underway or under consideration in other jurisdictions," wrote the researchers, "but only time and careful scrutiny will show how fully the goals of the program--and of similar reform programs centered on performance assessment--can be met."
#691 – Beyond Summative Evaluation: The Instructional Quality Assessment as a Professional Development Tool
Amy C. Crosson, Melissa Boston, Allison Levison, Lindsay Clare Matsumura, Lauren B. Resnick, Mikyung Kim Wolf, and Brian W. Junker
CSE Report 691, 2006
Summary
In order to improve students' opportunities to learn, educators need tools that can help them reflect on and analyze their own and others' teaching practice. Many available observation tools and protocols for studying student work are inadequate because they do not directly engage educators in core issues about rigorous content and pedagogy. In this conceptual paper, we argue that the Instructional Quality Assessment (IQA)--a formal toolkit for rating instructional quality that is based primarily on classroom observations and student assignments--has strong potential to support professional development within schools at multiple levels. We argue that the IQA could be useful to teachers for analyzing their own and their colleagues' practice; additionally, the IQA could aid the efforts of principals in their work as instructional leaders, identifying effective practitioners to help lead professional development within a school and targeting professional development needs that would require external support. Although the IQA was designed for summative, external evaluation, we argue that the steps taken to improve the reliability of the instrument--particularly the efforts to make the rubric descriptors for gradations of instructional quality as transparent as possible--also serve to make the tool a resource for professional growth among educators.
#372 – Assessment, Equity, and Diversity in Reforming America's Schools
Linda F. Winfield and Michael D. Woodard
CSE Report 372, 1994
Summary
National standards and assessments recently proposed as a strategy for improving schools in the United States have been accompanied by considerable tension between the goals of educational quality and equality of opportunity. "Proposed federal policies for implementation [of new standards and assessments] raise serious concerns about the extent to which national standards and assessments alone will help improve the quality of public education for all," write CRESST researchers Linda Winfield and Michael Woodard in their new report, Assessment, Equity, and Diversity in Reforming America's Schools. The authors question whether some elements of the Goals 2000 program may "serve to deepen the already severe educational and economic cleavages that exist in this nation, especially along racial/ethnic lines." Providing a framework for reviewing equity, diversity, and assessment, the authors present a variety of research findings to support their position. Findings of a national study of promising programs in disadvantaged urban and rural schools, for example, suggest that opportunity to learn is influenced by factors such as level of implementation, budgets, staff development, and administrative support. Winfield and Woodard believe that by omitting these factors from consideration in reform measures such as the Goals 2000 program, existing inequalities will be further exacerbated, creating additional barriers and limiting upward mobility opportunities for minority students. Rather than pursue national standards and assessments, the authors suggest that reformers focus on policies and practices that have a greater probability of improving school learning and achievement, including equitable school financing, improved funding for curriculum development, and increased staff development for both teachers and administrators in content-area assessments.
The authors conclude: "Only when policy makers consider opportunity to learn standards as important as implementing national standards and assessment, will we ensure that those students and individuals historically disenfranchised will share in the American dream of opportunity for educational achievement and economic success."
#406 – Teachers' and Students' Roles in Large-Scale Portfolio Assessment: Providing Evidence of Competency With the Purposes and Processes of Writing
Maryl Gearhart and Shelby Wolf
CSE Report 406, 1995
Summary
From 1992-1994, the California Department of Education and the Center for Performance Assessment of Educational Testing Service were engaged in the development of a new standards-based portfolio component for the California Learning Assessment System (CLAS). Based on interviews with four teachers from different school settings, the researchers sought answers to the following questions: How did teachers participating in trials of the program understand the CLAS Portfolio Assessment Program and how did they use the dimensions of learning to guide their language arts curriculum and assessment practices? How did their students understand the dimensions of learning, and how did they use the dimensions to guide their portfolio choices? What implications do the findings have for large-scale portfolio assessment?
The CRESST researchers found that teachers' curriculum varied, providing students with quite different opportunities to learn about the dimensions of learning measured by the portfolios; teachers also varied in their approach to documentation of students' writing, providing students with different opportunities to demonstrate their competencies with portfolio choices. Findings suggest a need to balance the vision of student choice as a desirable goal for students with what is needed to ensure that portfolio raters are provided appropriate evidence of student performance.
#608 – Effectiveness and Validity of Accommodations for English Language Learners in Large-Scale Assessments
Jamal Abedi, Mary Courtney, and Seth Leon
CSE Report 608, 2003
Summary
As the population of English language learners (ELLs) in U.S. public schools continues to grow, issues concerning their instruction and assessment are steadily among the top national priorities in education. The goal of this study was to examine the effectiveness, validity, and feasibility of selected language accommodations for ELL students on large-scale science assessments. In addition, student background variables were studied to judge the impact of such variables on student test performance.
Both ELL and non-ELL students in Grades 4 and 8 were tested in science under accommodation or under a standard testing condition. Language accommodation strategies (Customized English Dictionary, Bilingual/English Glossary, and Linguistic Modification of test items) were selected based on frequency of usage, nationwide recognition, feasibility, and first-language literacy factors. Students were sampled from different language and cultural backgrounds. We also included a measure of English reading proficiency to control for any initial differences in reading ability.
The effectiveness of accommodation for Grade 8 students differed from the findings for Grade 4 students. In Grade 8, the Linguistic Modification accommodation helped ELL students increase their performance, while the accommodated performance of non-ELL students was unchanged. The non-significant impact of the linguistically modified test on the non-ELL group supports the validity of this accommodation. As for feasibility, this accommodation requires up-front preparation but is easy to implement in the field; therefore, it is feasible for large-scale assessments.
In general, accommodations did not have a significant impact on students’ performance in Grade 4. We believe this may be because of the lower language demand in the lower grades. With an increase in the grade level, more complex language may interfere with content-based assessment. Though language factors still have an impact on the assessment of ELL students in lower grades, other factors such as poverty and parent education may be more powerful predictors of students’ performance in lower grades. Another consideration is that Grade 4 students may be less familiar with glossary and dictionary use, as well as less exposed to science.
The lack of significant impact on Grade 4 non-ELL students is an encouraging result because it suggests that the accommodation did not alter the construct under measurement.
#377 – Engaging Teachers in Assessment of Their Students' Narrative Writing: Impact on Teachers' Knowledge and Practice
Maryl Gearhart, Shelby A. Wolf, Bette Burkey, and Andrea K. Whittaker
CSE Report 377, 1994
Summary
In the past two decades, the ways in which writing has been taught and assessed have shifted from a focus on final products to an emphasis on writing as a process. In this latest report from the Writing What You Read (WWYR) project, CRESST researchers Maryl Gearhart, Shelby Wolf, Bette Burkey, and Andrea Whittaker summarize the impact of the WWYR program, designed to enhance elementary teachers' competencies in narrative writing assessment. This comprehensive report details the project's history, the design and implementation of WWYR, and the research methods used to gain insight into teachers' knowledge and practice. Numerous examples of the WWYR workshop materials, including the narrative rubric used to guide teachers' practice in narrative assessment, are provided. One finding discussed in the report is that the assessments were not typically implemented as recommended. Teachers perceived the in-service program as imposed rather than collaboratively designed. As a result, though teachers in the study were able to see productive possibilities for action and change in their methods of assessment, there were differences among the teachers in the pattern of their changes in understanding and practice. "[O]ur story is not a happily-ever-after tale," conclude the researchers, "but a tale of real research with classroom teachers. A central point in Writing What You Read is to take what you learn from literature and carry it into your own writing. As teachers and researchers, we will take what we have learned from this experience and carry it into our future classrooms and projects, reshaping and learning along the way."
#465 – Model-Based Performance Assessment
Eva Baker
CSE Report 465, 1998
Summary
Performance assessment has been described by its proponents as a major strategy to assist teachers to improve the learning of their students. This piece will describe both the values ascribed to performance assessment and the major criticisms of assessment that have developed in the last few years of exploration.
One approach, model-based performance assessment, will be described as a way to remedy and to avoid criticisms of performance assessment. We have developed models for performance assessment in five learning areas--problem solving, communication, collaboration, metacognition, and content understanding--and in some areas we have more than one approach. Our strategy makes some trade-offs. It emphasizes comparability among different assessments, reasonable cost, technical quality, fairness, and utility for instruction. It gives up a wide, anything-goes approach to assessment and focuses on deeper assessment of fewer interpretations of types of learning. Questions remain about performance assessments, primarily related to how best to schedule them (as they are time consuming) and how to involve teachers in a reasonable and cost-efficient way in the scoring of the assessments. We believe it is important to align content standards, classroom assessment, and external assessment in a practical way, particularly when assessment is used for policy purposes. Our model-based assessment is one way to do it.
#611 – An Evidentiary Framework for Operationalizing Academic Language for Broad Application to K-12 Education: A Design Document
Alison L. Bailey and Frances A. Butler
CSE Report 611, 2003
Summary
With the No Child Left Behind Act (2001), all states are required to assess English language development (ELD) of English language learners (ELLs) beginning in the 2002-2003 school year. Existing ELD assessments do not, however, capture the necessary prerequisite language proficiency for mainstream classroom participation and for taking content-area assessments in English, thus making their assessment of ELD incomplete. What is needed are English language assessments that go beyond the general, social language of existing ELD tests to capture academic language proficiency (ALP) as well, thereby covering the full spectrum of English language ability needed in a school setting. This crucial testing need has provided impetus for examining the construct of academic language (AL) in depth and considering its role in assessment, instruction, and teacher professional development. This document provides an approach for the development of an evidentiary framework for operationalizing ALP for broad K-12 educational applications in these three key areas. Following the National Research Council (2002) call for evidence-based educational research, we assembled a wide array of data from a variety of sources to inform our effort. We propose the integration of analyses of national content standards (National Science Education Standards of the National Research Council), state content standards (California, Florida, New York, and Texas), English as a Second Language (ESL) standards, the language demands of standardized achievement tests, teacher expectations of language comprehension and production across grades, and the language students actually encounter in school through input such as teacher oral language, textbooks, and other print materials. The initial product will be a framework for application of ALP to test specifications including prototype tasks that can be used by language test developers for their work in the K-12 arena.
Long-range plans include the development of guidelines for curriculum development and teacher professional development that will help assure that all students, English-only and ELLs alike, receive the necessary English language exposure and instruction to allow them to succeed in education in the United States.
#378 – Policymakers' Views of Student Assessment
Lorraine McDonnell
CSE Report 378, 1994
Summary
A new report by CRESST/RAND researcher Lorraine McDonnell, Policymakers' Views of Student Assessment, confirms that, despite the trend toward new types of assessments, the debate over the appropriate uses of student assessment continues. "Policymakers," writes McDonnell, "have varying and sometimes conflicting expectations about what assessment can accomplish." McDonnell's study found that policymakers tended to fall into three categories of expectations about student assessment: (a) agreement with testing experts that assessments should primarily inform teaching; (b) belief that testing should be used to hold schools and educators accountable; or (c) perception that assessments should be used to bring greater curricular coherence to schools, motivate students to perform better, and act as a lever to change instructional content and strategies. McDonnell's study was based on interviews with 34 national and state policymakers and focuses primarily on those states where alternative assessments are under development or in use. "As long as policymakers," concludes McDonnell, "see assessments as exerting a powerful leverage over school practice and, at the same time, are constrained by cost and other considerations, they will continue to use the same assessments for multiple purposes--some of which will have serious consequences for students, teachers, and schools."
#762 – Moving to the Next Generation of Standards for Science: Building on Recent Practices
Joan L. Herman
CRESST Report 762, October 2009
Summary
In this report, Joan Herman, director of the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), recommends that the new generation of science standards be based on lessons learned from current practice and on recent examples of standards-development methodology. In support of this, recent promising efforts to develop standards in science and other areas are described, including the National Assessment of Educational Progress (NAEP) 2009 Science Assessment Framework, the Advanced Placement Redesign, and the Common Core State Standards Initiative (CCSSI). Drawing on these key documents, the report discusses promising practices for a national effort to better define science standards. Finally, the report reviews validation issues, including the evidence one would want to collect to demonstrate that national science standards are achieving their intended purposes.
To cite from this report, please use the following as your APA reference:
Herman, J. L. (2009). Moving to the next generation of standards for science: Building on recent practices (CRESST Report 762). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).

