Reports
Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.
#396 – How "Messing About" with Performance Assessment in Mathematics Affects What Happens in Classrooms
Roberta J. Flexer, Kate Cumbo, Hilda Borko, Vicky Mayfield, Scott F. Marion
CSE Report 396, 1995
Summary
When provided with adequate staff development and administrative support, teachers will bring performance assessment and new instructional methods into their classrooms, conclude the authors of How "Messing About" with Performance Assessment in Mathematics Affects What Happens in Classrooms. The researchers conducted an in-depth qualitative study in three urban Denver schools, tape-recording, transcribing, and coding 15 mathematics workshops and interviewing all project teachers. "In short, the introduction of performance assessment," says Roberta Flexer, "provides teachers with richer instructional goals than mere computation and raises their expectations of what their students can accomplish in mathematics and what [teachers] can learn about their students."
The researchers also found that teachers significantly shifted their instructional practices when exposed to performance assessment. Even some of the most text-dependent teachers began to change the way they taught mathematics: "We found holes in the [mathematics] textbook," said one teacher, "so we used a variety of resources in order to build a unit around probability and statistics." Many teachers felt that students were learning more, even if this learning was not necessarily borne out by performance on the Maryland performance tasks. "...I just think they understand it [mathematics] more," said one teacher, "it is not just rote memorization, that they really know what it means when you say 20 times 80 even if they don't know the answer...There is a much deeper understanding."
Teachers felt, at times, overwhelmed trying to implement new assessments and instruction in both reading and mathematics, and it remains unknown whether the changes will be long term. But the study provides further evidence that performance assessment can lead to an integration of instruction with assessment, more hands-on and problem-based activities aligned with the NCTM standards, and greater academic challenges for both teachers and students.
#591 – The Los Angeles Annenberg Metropolitan Project: Evaluation Findings
Joan Herman and Eva Baker
CSE Report 591, 2003
Summary
In the latter part of the 1990s, education in California was caught in a whirlwind of change. Schools scrambled to find enough teachers and enough classroom space to fulfill state-mandated class-size reduction requirements. Voters eliminated bilingual education, leaving schools with no specific classroom tool for teaching English-language learners. And schools were required to administer to students in Grades 2 through 11 a new standardized test each spring, one that was not aligned with classroom work and yet carried great weight for both students and educators.
Amidst this upheaval, a major new school-reform initiative was trying to make headway in Los Angeles County. From 1994 through 2000, the Los Angeles Annenberg Metropolitan Project, or LAAMP, was one of 18 major school improvement initiatives across the country to be funded by the $1.1 billion Annenberg Challenge. Its centerpiece was a new educational structure known as the School Family, which brought together teachers, administrators, and parents from high schools and their feeder middle schools and elementary schools, plus others with an interest in education. LAAMP organizers hoped the School Families would create a stable learning environment for students by encouraging coordination among schools and between grade levels.
Today, the Annenberg Challenge has drawn to a close. A final report released in June called the national effort a partial success. The report credited the program with strengthening urban, rural, and arts education and with raising the quality of teaching. The report also found that school-reform programs must learn to deal with rapid leadership turnover, changes in direction, and other setbacks. And it found that the grant money, while generous, frequently was spread too thin over too many schools.
The national findings parallel conclusions drawn about the 6-year Los Angeles program, which received $53 million from the Annenberg Challenge in December 1994. LAAMP commissioned a group of education researchers from UCLA and USC to evaluate the local project. Known as the Los Angeles Consortium for Evaluation, or LACE, the researchers found that LAAMP accomplished some of what it set out to do. But for a variety of reasons, it did not attain its ultimate goal of improving student performance.
The tumultuous period of California history that siphoned off time, energy, and financial resources from schools and the people working in them bears much of the blame. But researchers identified other explanations as well. Among them:
• School Family teams of teachers, administrators, and parents needed more time than was anticipated to develop the group process skills necessary for success and spent much of their time trying to learn how to collaborate instead of instituting change.
• The teams needed time to learn about and understand the concepts of results- or standards-based school reform and to develop the skills required to analyze available data and use them in the planning process.
• There were insufficient resources to ensure adequate support for teachers attempting to implement programs devised by the School Families. There also was no mechanism for extending the reforms to teachers not directly involved in the reform project.
LACE also acknowledged that the research methodologies it used to evaluate LAAMP, although the best that were available, might not have presented a full and accurate picture of the effects the program had on its participating schools. In addition, researchers suggested the need for more sensitive gauges of student accomplishment that measure the actual curriculum taught. The primary measurement used—California’s Stanford 9 test—may not have been the best tool for detecting the effects of specific changes in teaching and learning.
Overall, the researchers found that the LAAMP reform can claim many achievements that have benefited K-12 education in Los Angeles County, including:
• Creation of the School Family concept, which in many cases was responsible for productive changes that could not have been realized by a single school working alone.
• Strengthening of schools’ acceptance of accountability, their focus on performance, and their capacity for self-evaluation, especially in regard to accessing and using student-achievement data.
• Creation of valuable teacher professional development activities and access to new instructional programs, which were especially helpful for the many new and uncredentialed teachers who were hired to fulfill class-size reduction requirements.
• Encouragement of parental involvement in the schools and in children’s learning at home, which had demonstrable effects on student performance.
• Demonstration of the potential of stable learning communities for curing many of the ills facing urban schools.
Looking at test scores, LACE researchers saw improvement at LAAMP schools over the 3-year period from 1997-1998 to 2000-2001. However, there was no statistically significant difference between LAAMP schools and non-LAAMP schools with regard to student performance on the state’s Stanford 9 standardized test.
Researchers also found no indication that LAAMP has had a wide impact on classroom practices. In other words, its core school-reform principles have not yet permeated participating schools. However, researchers saw signs that LAAMP initiatives were starting to move into the classroom in the later years of the program after so much time and energy were spent initially on developing the School Family structure.
When Walter Annenberg issued his challenge in December 1993 by giving what at the time was the largest gift ever dedicated to improving public education, he called it a “crusade for the betterment of our country.” Nine years later, that crusade has made a difference. The public schools “in most major cities are still not doing the job they must,” the June report said, but they are “better today than they were a decade ago and teachers are better equipped to help children overcome obstacles and achieve higher standards.”
#820 – Validating Measures of Algebra Teacher Subject Matter Knowledge and Pedagogical Content Knowledge
Rebecca E. Buschang, Gregory K.W.K. Chung, Girlie C. Delacruz, and Eva L. Baker
CRESST Report 820, September 2012
Summary
The purpose of this study was to validate inferences about scores from one task designed to measure subject matter knowledge and from three tasks designed to measure aspects of pedagogical content knowledge. Evidence for the validity of inferences was based on two expectations. First, if tasks were sensitive to expertise, we would find group differences. Second, tasks that measured similar types of knowledge would correlate strongly, and tasks that measured different types of knowledge would correlate weakly. We recruited and assessed four groups of participants: 46 experienced algebra teachers (2+ years of experience), 17 novice algebra teachers (0–2 years of experience), 10 teaching experts, and 13 subject matter experts. Results indicate that one task differentiated among levels of expertise and measured several aspects of knowledge needed to teach algebra. Results also suggest that future studies should use a combination of tasks to accurately measure different aspects of teacher knowledge.
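To make the two validity checks concrete, here is a minimal analysis sketch in Python. It is illustrative only: the group labels mirror the study's four participant groups, but the data, score scales, and the choice of a one-way ANOVA and Pearson correlations are assumptions for the sake of the example, not the report's actual methods.

```python
# Illustrative sketch of the two validity checks described above:
# (1) known-groups differences across expertise levels, and
# (2) convergent/discriminant correlations among task scores.
# All data here are simulated; the report's actual analyses may differ.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical scores on one task for the four participant groups.
groups = {
    "experienced_teachers":   rng.normal(70, 10, 46),
    "novice_teachers":        rng.normal(60, 10, 17),
    "teaching_experts":       rng.normal(80, 8, 10),
    "subject_matter_experts": rng.normal(75, 8, 13),
}

# Check 1: if a task is sensitive to expertise, group means should differ.
f_stat, p_value = stats.f_oneway(*groups.values())
print(f"One-way ANOVA across groups: F = {f_stat:.2f}, p = {p_value:.4f}")

# Check 2: tasks tapping similar knowledge should correlate strongly;
# tasks tapping different knowledge should correlate weakly.
n = 86  # total participants across groups
scores = pd.DataFrame({
    "subject_matter_task": rng.normal(0, 1, n),
    "pck_task_1": rng.normal(0, 1, n),
    "pck_task_2": rng.normal(0, 1, n),
    "pck_task_3": rng.normal(0, 1, n),
})
print(scores.corr(method="pearson").round(2))
```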
#615 – Artifact Packages for Measuring Instructional Practice: A Pilot Study
Brian M. Stecher, Alicia Alonzo, Hilda Borko, Shannon Moncure and Sherie McClam
CSE Report 615, 2003
Summary
A number of educational researchers are currently developing alternatives to survey and case study methods for measuring instructional practice. These alternative strategies involve gathering and analyzing artifact data related to teachers’ use of instructional materials and strategies, classroom learning activities, students’ work, and other important features of practice. “The Impact of Accountability Systems on Classroom Practice” is one such effort. The goals of this 5-year project, funded through the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), are to develop artifact collection and scoring procedures designed to measure classroom practice in mathematics and science; validate these procedures through classroom observations, discourse analysis, and teacher interviews; and then use the procedures, in conjunction with other CRESST projects, to conduct comparative studies of the impact of different approaches to school reform on school and classroom practices. The first phase of the project was a set of pilot studies, conducted in a small number of middle school science and mathematics classrooms, to provide initial information about the reliability, validity, and feasibility of artifact collections as measures of classroom practice. This report presents results of these pilot studies.
#627 – The Effects of Teacher Discourse on Student Behavior and Learning in Peer-Directed Groups
Noreen Webb, Kariane M. Nemer, Nicole Kersting, Marsha Ing, and Jeffrey Forrest
CSE Report 627, 2004
Summary
Previous research on small-group collaboration identifies several behaviors that significantly predict student learning. These studies focus on student behavior to understand why, for example, large numbers of students are unsuccessful in obtaining explanations or in applying the help they receive, leaving unexplored the role that teachers play in influencing small-group interaction. We examined the impact of teacher discourse on the behavior and achievement of students in the context of a semester-long program of cooperative learning in four middle school mathematics classrooms. We conclude that student behavior largely mirrored the discourse modeled, and the expectations communicated, by teachers. Teachers tended to give unlabeled calculations, procedures, or answers instead of labeled explanations. Teachers often instructed using a recitation approach in which they assumed primary responsibility for solving the problem, with students providing only answers to discrete steps. Moreover, teachers rarely encouraged students to verbalize their thinking or to ask questions. Students adopting the role of help-giver showed behavior very similar to that of the teacher: doing most of the work, providing mostly low-level help, and infrequently monitoring other students' level of understanding. The relatively passive behavior of students needing help corresponded to expectations, communicated by the teacher, of the learner as a fairly passive recipient of the teacher's transmitted knowledge. Finally, we confirmed previous analyses showing that the level of help received from another student or the teacher, and the level of student follow-up behavior after receiving help, significantly predicted student learning outcomes.
#350 – The Vermont Portfolio Assessment Program: Interim Report on Implementation and Impact, 1991-1992 School Year
Daniel Koretz, Brian Stecher, Edward Deibert
CSE Report 350, 1992
Summary
Vermont is the first state to make portfolios the backbone of a statewide assessment system. Daniel Koretz, Brian Stecher, and Edward Deibert, the authors of this CRESST/RAND report, have been evaluating the Vermont portfolio program for almost two years. The researchers found that support for the program, despite tremendous demands on teacher time, is widespread. "Perhaps the most telling sign of support for the Vermont portfolio program," write the authors, "is that [even in the pilot year] the portfolio program had already been extended beyond the grades targeted by the state." An interesting instructional finding was that over 80% of the surveyed teachers indicated that they had changed their opinion of students' mathematical abilities based on their students' portfolio work. In many cases, teachers noted that students did not perform as well on the portfolio tasks as on previous classroom work. This finding, supported by other performance assessment research, suggests that portfolios may give teachers another assessment tool, one that broadens their understanding of student achievement.
#391 – A First Look: Are Claims for Alternative Assessment Holding Up?
Joan Herman, Davina Klein, Tamela Heath, and Sara Wakai
CSE Report 391, 1994
Summary
Drawing on data from student surveys, demographic data, interviews with students and teachers, and structured classroom observations of students, CRESST researchers studied teachers and students who participated in the 1993 California Learning Assessment System (CLAS) test in mathematics. Among the key findings: alternative assessments stimulate student thinking and problem solving, and students understand that something different and more rigorous is required in open-ended versus multiple-choice questions. "This is not to say," write CRESST researchers Joan Herman, Davina Klein, Tamela Heath, and Sara Wakai, "that students 'like' open-ended items more than multiple choice ones." In fact, add the authors, students prefer multiple-choice problems, perhaps because they are familiar with these types of problems and because they think they perform better on standardized tests. But the results tend to support the idea that students learn from performance assessments.
One of the major research questions was whether students in different types of schools have equal opportunity to learn (OTL) the material being assessed. Researchers surveyed students on a variety of OTL indicators, including whether they had adequate access to calculators and opportunities to work on problems that could be solved in more than one way or that required them to explain their thinking. Surprisingly, the researchers found that urban school students had equal access to many OTL resources, such as calculators and a curriculum that went beyond standard "drill and kill" instruction. More problematic was the finding that urban students tended to have more questions about key concepts in mathematical thinking and less access to current textbooks than their suburban counterparts. Finally, interviews and surveys indicated that suburban students clearly felt better prepared than either urban or rural students for the CLAS assessments. The authors note that their results are preliminary and say that the next part of this study will include actual student performance on the CLAS tests. "We will be looking more closely," say the researchers, "at the interrelationships among and between student demographics, instructional practices, attitudes, and performance."
#773 – Validity Evidence for Games as Assessment Environments
Girlie C. Delacruz, Gregory K. W. K. Chung, & Eva L. Baker
CRESST Report 773, July 2010
Summary
This study provides empirical evidence of a highly specific use of games in education—the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as an assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also indicate that game performance significantly predicts posttest scores, even when controlling for prior knowledge. These results provide evidence that game performance taps into mathematical understanding.
To cite from this report, please use the following as your APA reference: Delacruz, G. C., Chung, G. K. W. K., & Baker, E. L. (2010). Validity evidence for games as assessment environments (CRESST Report 773). Los Angeles, CA: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
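The regression logic summarized above lends itself to a brief sketch: first regress game performance on prior knowledge, then ask whether game performance predicts posttest scores over and above prior knowledge. The following is a hypothetical illustration using simulated data and ordinary least squares via statsmodels; the variable names and data are invented for the example, not the study's actual dataset or analysis code.

```python
# Minimal sketch of the regression logic described above: does game
# performance predict posttest scores after controlling for prior knowledge?
# Variable names and data are simulated, not the study's actual dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 100

pretest = rng.normal(50, 10, n)                      # prior knowledge
game_score = 0.6 * pretest + rng.normal(0, 8, n)     # game performance
posttest = 0.4 * pretest + 0.3 * game_score + rng.normal(0, 6, n)

data = pd.DataFrame({"pretest": pretest,
                     "game_score": game_score,
                     "posttest": posttest})

# Step 1 (predictive validity): prior knowledge -> game performance.
step1 = smf.ols("game_score ~ pretest", data=data).fit()

# Step 2: game performance predicting posttest, controlling for pretest.
step2 = smf.ols("posttest ~ pretest + game_score", data=data).fit()

print(step1.summary().tables[1])
print(step2.summary().tables[1])
```

In this framing, a significant coefficient on game_score in the second model is what would support the claim that game performance taps into understanding beyond what the pretest already captures.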
#723 – Recommendations for Building a Valid Benchmark Assessment System: Interim Report to the Jackson Public Schools
David Niemi, Julia Vallone, Jia Wang, Noelle Griffin
CRESST Report 723, 2007
Summary
Many districts and schools across the U.S. have begun to develop and administer assessments to complement state testing systems and provide additional information to monitor curriculum, instruction, and schools. In advance of this trend, the Jackson Public Schools (JPS) district has had a district benchmark testing system in place for many years. To complement and enhance the capabilities of district and school staff, the Stupski Foundation and CRESST (the National Center for Research on Evaluation, Standards, and Student Testing at UCLA) reached an agreement for CRESST to provide expert review and recommendations to improve the technical quality of the district’s benchmark tests. This report (the first of two deliverables on this project) focuses on assessment development and is consistent with the district’s goal of increasing the predictive ability of the assessments for students’ state test performance, as well as with secondary goals.
#394 – Effects of Introducing Classroom Performance Assessments on Student Learning
Lorrie Shepard, Roberta Flexer, Elfrieda Hiebert, Scott Marion, Vicky Mayfield, and Timothy Weston
CSE Report 394, 1995
Summary
A new CRESST study says that introducing performance assessments into the classroom does not automatically yield achievement improvements for students. "Results in reading showed no change or improvement attributable to the [performance assessment] project," write researchers in Effects of Introducing Classroom Performance Assessments on Student Learning. Additionally, the authors found only small performance gains in mathematics. However, they did find significant qualitative changes in mathematics classrooms that provide cause for optimism. "We noted qualitative changes in students' answers to math problems which suggest that at least in some project classrooms whole groups of students were having opportunities to develop their mathematical understandings that had not occurred previously."