
Reports

Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.

#512 – Professional Development: A Key to Kentucky's Reform Effort
Hilda Borko, Rebekah Elliott, and Kay Uchiyama

Summary
Educational reform leaders generally agree that professional development opportunities for teachers are crucial to the success of any effort to make meaningful, sustainable changes in educational practice. As Fullan (1991) explained, "Continuous development of all teachers is the cornerstone for meaning, improvement, and reform. Professional development and school development are inextricably linked" (p. 315). Kentucky Department of Education (KDE) personnel charged with the responsibility to operationalize the Kentucky Educational Reform Act (KERA) understood this link. They developed an extensive professional development (PD) program to help Kentucky educators achieve the ambitious KERA goals. In this paper we describe the Department's multi-faceted approach to professional development and provide evidence for its impact on schools' achievement of KERA goals. We draw upon data from the exemplary case study component of a larger research project, The Effects of Standards-Based Assessments on Schools and Classrooms.

Perhaps the biggest challenge that KDE faced in providing PD services was geography. Many of Kentucky's school districts are located in remote rural areas, accessible only by mountain roads that are particularly treacherous to travel during the winter months. To reach these districts, KDE relied on a system of nine regional service centers, which provided a wide variety of services to districts, schools, and individual teachers. However, as Ed Reidy, then-Deputy Commissioner of Education, explained, "We have a real commitment that what kids learn should not be a function of geography. You could draw a circle around [the regional service] centers. Most of our audited schools were outside those circles and most were poor." To supplement the work of the centers, KDE developed a variety of materials and activities specifically designed to meet emerging needs of teachers as they worked to achieve KERA goals. This paper focuses on the two major categories of services: school-based professional development and professional development for mathematics and writing portfolios.

All four case study schools exhibited a strong commitment to professional development and a belief in the importance of ongoing support for teacher learning. They used state PD resources to enhance their instructional programs in areas explicitly connected to KERA, such as curriculum alignment and development of materials and activities keyed to the core content standards. Further, teachers at each school served in leadership roles in the KDE Division of Portfolio Initiatives professional development activities. These teachers saw their leadership roles as benefiting their schools, their students, and their colleagues, as well as supporting their own professional growth. Thus, using state resources and opportunities, these four exemplary schools created extensive professional development programs to suit the specific needs of their teachers and students. Through their successful efforts, they provide an existence proof that Kentucky's approach to professional development can provide the resources needed to support statewide, standards-based educational reform. The paper concludes with recommendations for approaches to professional development that seem to hold promise for facilitating statewide standards-based educational reform efforts.

#806 – District Adoption and Implementation of Interim and Benchmark Assessments
Kristen L. Davidson and Greta Frohbieter

Summary

In order to provide more frequent information about student progress during the year, many school districts have been implementing "interim" or "benchmark" assessment programs. To date, little research has examined the implementation of interim assessments or their effects on teaching and learning. This new CRESST report investigates purposes in adopting interim or benchmark assessments, ensuing implementation efforts, and actual assessment uses. The researchers found a number of substantial barriers to success including test questions that were predominantly multiple-choice, lack of professional development for teachers, and minimal coherence in shared understandings of assessment purposes and uses across district, school, and classroom levels. Based on the results, the researchers provide recommendations for a successful interim or benchmark assessment system.


#368 – Cross-Scorer and Cross-Method Comparability and Distribution of Judgments of Student Math, Reading, and Writing Performance: Results From the New Standards Project Big Sky Scoring Conference
Lauren Resnick, Daniel Resnick, and Lizanne DeStefano

Summary
Partially funded by CRESST, the New Standards Project is an effort to create a state- and district-based assessment and professional development system that will serve as a catalyst for major educational reform. In 1992, as part of a professional development strategy tied to assessment, 114 teachers, curriculum supervisors, and assessment directors met to score student responses from a field test of mathematics and English language arts assessments. The results of that meeting, the Big Sky Scoring Conference, were used to analyze comparability across scorers and comparability across holistic and anaholistic scoring methods. "Interscorer reliability estimates," wrote the researchers, "for reading and writing were in the moderate range, below levels achieved with large-scale writing assessments or standardized tasks. Low reliability limits the use of [the] 1992 reading and writing scores for making judgments about student performance or educational programs." However, interscorer reliability estimates for math tasks were somewhat higher than for literacy. For six out of seven math tasks, reliability coefficients approached or exceeded acceptable levels. Use of anaholistic and holistic scoring methods resulted in different scores for the same student response. The findings suggest that the large number and varied nature of participants may have jeopardized the production of valid and reliable data. "Scorers reported feeling overwhelmed and overworked after four days of training and scoring," wrote the researchers. Despite these difficulties, the study provided evidence that scoring of large-scale performance assessments can succeed when ample time is provided for training, evaluation, feedback, and discussion; performance levels and the distinctions between them are clearly defined; and well-chosen exemplars are used.
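To make the interscorer reliability statistic discussed above concrete, the sketch below computes a Pearson correlation between two raters' rubric scores. The rater data here are entirely hypothetical and invented for illustration; they are not drawn from the Big Sky conference.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical 1-4 rubric scores that two raters assigned to the same
# ten student responses (illustrative; not the actual conference data).
rater_a = [3, 2, 4, 1, 3, 2, 4, 3, 2, 1]
rater_b = [3, 3, 4, 1, 2, 2, 4, 3, 1, 1]

print(round(pearson(rater_a, rater_b), 2))  # 0.88
```

A value near 0.9 would count as high agreement; the "moderate range" estimates the researchers describe for reading and writing would fall well below that.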

#450 – Assessment of Transfer in a Bilingual Cooperative Learning Curriculum
Margaret H. Szymanski and Richard P. Durán

Summary
Although existing standardized language proficiency tests can provide reliable information on students' language arts skills, they fail to provide information on how students develop those skills. In this study of third- and fourth-grade bilingual classrooms, CRESST researchers sought to better understand the link between curriculum and language development. Investigating implementation of a bilingual adaptation of the Cooperative Integrated Reading and Composition curriculum, the researchers analyzed differences in students' pre- and posttest performance, focusing on changes in English performance during the school year. The researchers found that increased performance could be attributed to classroom discussions of strategies to answer questions consistent with the curriculum model. "Our results," said Richard Durán and Margaret Szymanski, "suggest the value of studying how assessments of bilingual students' literacy skills might be tied to students' awareness of performance standards."

#359 – Issues in Innovative Assessment for Classroom Practice: Barriers and Facilitators
Pamela Aschbacher

Summary
As the British experience showed, we cannot assume that innovative new assessments will be immediately understood and embraced by American teachers. Implementing performance assessments may demand new roles for teachers and students and require a radical paradigm shift among educators: from a focus on content coverage to a focus on outcomes achieved. This paper, using an action research approach, describes the findings of CRESST researchers who observed, interviewed, and surveyed teachers implementing alternative assessments in their classrooms. Probably the most fundamental barrier to developing and implementing sound performance assessments was the pervasive tendency of teachers to think about classroom activities rather than student outcomes. Teachers who used portfolios, for example, focused on what interesting activities might be documented in the portfolios rather than on what goals these instructional activities would achieve. The study revealed other basic barriers to the development and implementation of alternative assessments, including teacher assessment anxiety, lack of teacher time and training, and teachers' reluctance to change.

#773 – Validity Evidence for Games as Assessment Environments
Girlie C. Delacruz, Gregory K. W. K. Chung, and Eva L. Baker

Summary
This study provides empirical evidence of a highly specific use of games in education—the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also indicate that game performance significantly predicts posttest scores, even when controlling for prior knowledge. These results provide evidence that game performance taps into mathematical understanding.

To cite from this report, please use the following as your APA reference: Delacruz, G. C., Chung, G. K. W. K., & Baker, E. L. (2010). Validity evidence for games as assessment environments (CRESST Report 773). Los Angeles, CA: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
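The report's central analysis, game performance predicting posttest scores even when controlling for prior knowledge, can be mirrored with an ordinary least-squares sketch like the one below. All numbers are fabricated (the posttest is constructed from the other two variables so the regression recovers the generating coefficients); none of this is the study's actual data or model output.

```python
import numpy as np

# Fabricated scores for eight students (not the study's data):
# posttest is built as 10 + 0.4*pretest + 0.5*game, so ordinary
# least squares should recover those coefficients exactly.
pretest = np.array([40.0, 55, 60, 45, 70, 50, 65, 35])
game = np.array([50.0, 70, 62, 48, 85, 55, 72, 41])
posttest = 10 + 0.4 * pretest + 0.5 * game

# Design matrix: intercept, prior knowledge (pretest), game performance.
X = np.column_stack([np.ones_like(pretest), pretest, game])
coef, *_ = np.linalg.lstsq(X, posttest, rcond=None)
intercept, b_pretest, b_game = coef

# A nonzero game coefficient with pretest already in the model is the
# "predicts posttest even controlling for prior knowledge" pattern.
print(round(b_pretest, 3), round(b_game, 3))  # 0.4 0.5
```

In real data the fit would be noisy, and the claim would rest on the game coefficient being statistically significant rather than exactly recovered.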

#364 – Dilemmas and Issues for Teachers Developing Performance Assessments in Mathematics
Roberta J. Flexer and Eileen A. Gerstner

Summary
This report examines some of the dilemmas and issues that arose during the first two terms of work with teachers participating in a mathematics performance assessment development program. Additionally, the authors report on changes in teachers' instruction and assessment as a result of the project. Many of the dilemmas and issues that arose were unique to each of the three schools studied, but the most challenging problems across all schools were deciding what was important to teach, and therefore to assess, and how children could learn what was taught, all within the constraints of limited teacher time. As expected, preliminary results of the project were mixed but hopeful. The researchers believe that future development and implementation of performance assessments in these classrooms hinge on teachers' belief in these assessments as useful and practical tools.

#484 – Instructional Validity, Opportunity to Learn and Equity: New Standards Examinations for the California Mathematics Renaissance
Bokhee Yoon and Lauren Resnick

Summary
In this report, CRESST researchers examined the relationship between professional development opportunities for teachers, the kinds of instruction offered to students, and student performance on the New Standards Mathematics Reference Examination. By comparing teachers (and their students) who had participated in the California Mathematics Renaissance professional development program with teachers and students elsewhere, the researchers were able to evaluate both the effectiveness of the Renaissance program and the instructional validity of the Reference Examination.

#602 – Teachers’ Assignments and Student Work: Opening a Window on Classroom Practice
Lindsay Clare Matsumura and Jenny Pascal

Summary
This report describes four years of CRESST research on developing indicators of classroom practice that have the potential to be used in large-scale settings and that draw attention to important aspects of standards-based learning and instruction. CRESST's method was based on collecting teachers' assignments along with the resulting student work; the assignments were then rated, and the results summarized, to create indicators of classroom practice. Results to date indicated an acceptable level of inter-rater reliability across study years. Obtaining a stable estimate of quality would likely require collecting as many as three or four assignments from each teacher. Additionally, the method was reliable when teachers created their own assignments, but not when teachers submitted assignments created by outside sources. The quality of classroom assignments was associated with the quality of observed instruction, as well as with the quality of students' written work. Students whose teachers created more cognitively challenging assignments and used clearer grading criteria also made greater gains on the Stanford Achievement Test, Ninth Edition (Stanford 9). The teachers' assignments submitted in each study year, however, tended to be of only basic quality. Teachers' reactions to the data collection and implications for using this method in collaborative professional development sessions are also discussed.

#392 – Generalizability of New Standards Project 1993 Pilot Study Tasks in Mathematics
Robert Linn, Elizabeth Burton, Lizanne DeStefano, and Matthew Hanson

Summary
Students may have to take as many as 9-17 "long" performance assessment tasks if educators are to be confident that student performance matches true ability in a given domain, according to this important new CRESST study. Because a long task typically requires students to give complex, multifaceted responses and takes one to three hours to administer, the time and cost implications are significant. The performance tasks analyzed are from the New Standards Project, a joint project of the National Center on Education and the Economy and the Learning Research and Development Center. Robert Linn, Elizabeth Burton, Lizanne DeStefano, and Matthew Hanson conducted the CRESST study. Using a generalizability analysis of New Standards tasks, the researchers analyzed two primary sources of measurement error that typically lead to unreliability in measurement of student performance: performance tasks and raters, and the interactions of pupils with tasks or raters. Because the New Standards raters were carefully trained and monitored, consistency in rating was generally very high. The greatest error, therefore, was due to tasks. Essentially, student performance varied greatly from one performance task to another, suggesting that the tasks may be measuring different skills or that the skills were not measured well by the different tasks. The results confirm findings from several other studies. States or school districts that administer just a few performance tasks and then report individual student scores may face unacceptably large measurement error. The authors make recommendations that may help resolve some of these problems. "Since each task," write the authors, "requires an hour or more to administer, a strategy needs to be developed either for combining some shorter tasks with long tasks or for collecting information about student performance over more extended periods of time." The authors add that researchers in the New Standards Project are pursuing both strategies.
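The generalizability logic behind these findings can be sketched with a one-facet (persons x tasks) variance-component decomposition. The score matrix below is fabricated to echo the study's pattern, in which task-related variance and the person-by-task interaction, rather than rater disagreement, dominate the error; it is not New Standards data, and the coefficient formula shown is the standard relative generalizability coefficient, not the authors' exact computation.

```python
import numpy as np

# Fabricated 5-student x 4-task score matrix (not New Standards data),
# built so scores swing noticeably across tasks for the same student.
scores = np.array([
    [5, 2, 4, 2],
    [3, 3, 1, 1],
    [5, 5, 3, 4],
    [4, 1, 2, 2],
    [5, 3, 5, 2],
], dtype=float)

n_p, n_t = scores.shape
grand = scores.mean()
person_means = scores.mean(axis=1)
task_means = scores.mean(axis=0)

# Mean squares for a two-way crossed design, one observation per cell.
ms_person = n_t * ((person_means - grand) ** 2).sum() / (n_p - 1)
ms_task = n_p * ((task_means - grand) ** 2).sum() / (n_t - 1)
resid = scores - person_means[:, None] - task_means[None, :] + grand
ms_resid = (resid ** 2).sum() / ((n_p - 1) * (n_t - 1))

# Variance-component estimates: person ("true" score variance), task,
# and the person-x-task residual that drives unreliability.
var_person = max((ms_person - ms_resid) / n_t, 0.0)
var_task = max((ms_task - ms_resid) / n_p, 0.0)
var_resid = ms_resid

def g(k):
    """Relative generalizability coefficient when averaging over k tasks."""
    return var_person / (var_person + var_resid / k)

# Averaging over more tasks shrinks the error term, mirroring the
# report's conclusion that a few tasks are not enough.
print(round(g(1), 2), round(g(9), 2))  # 0.4 0.86
```

With one task the coefficient is poor; only with many tasks (here nine) does it approach conventionally acceptable levels, which is the arithmetic behind the "9-17 long tasks" estimate.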