Reports
Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.
#552 – Academic Language and Content Assessment: Measuring the Progress of English Language Learners (ELLs)
Robin A. Stevens, Frances A. Butler, and Martha Castellon-Wellington
CSE Report 552, 2000
Summary
As the nation moves toward inclusion of all students in large-scale assessments for purposes of accountability, there is an urgent need to determine when English language learners (ELLs) are able to express what they know on a standardized content test in English. At stake are the validity and reliability of ELLs' scores and the educational decisions made on the basis of those scores. Because tests are increasingly used to make high-stakes educational decisions, a means of including ELLs in a fair and equitable way is needed. One approach to ensuring the validity of test scores is to determine at what point ELLs are fluent enough to express what they know on a content test in English.
This study investigates the relationships between the language and performance of seventh-grade ELLs on two tests: a language proficiency test and a standardized achievement test. The language of the two tests is analyzed and compared, followed by analyses of concurrent performance on the same two tests. Language analyses indicate that the correspondence between the language of the two tests is limited. Analyses of student performance show that, although there is a statistically significant relationship between performance on the two tests, the correlations are modest. An additional finding indicates that there are statistically significant within-group performance differences, providing evidence that ELLs are not a homogeneous group. Furthermore, item-response analyses provide evidence that, for the highest performing group of ELLs in the study, language may be less of an issue than the content itself.
#823 – On the Road to Assessing Deeper Learning: The Status of Smarter Balanced and PARCC Assessment Consortia
Joan Herman and Robert Linn
CRESST Report 823, January 2013
Summary
Two consortia, the Smarter Balanced Assessment Consortium (Smarter Balanced) and the Partnership for Assessment of Readiness for College and Careers (PARCC), are currently developing comprehensive, technology-based assessment systems to measure students' attainment of the Common Core State Standards (CCSS). The consequences of the consortia's assessments, slated for full operation in the 2014/15 school year, will be significant. The assessments themselves and their results will send powerful signals to schools about the meaning of the CCSS and what students know and are able to do. If history is a guide, educators will align curriculum and teaching to what is tested, and what is not assessed will largely be ignored. Those interested in promoting students' deeper learning and development of 21st century skills thus have a large stake in trying to ensure that consortium assessments represent these goals.
Funded by the William and Flora Hewlett Foundation, UCLA's National Center for Research on Evaluation, Standards, and Student Testing (CRESST) is monitoring the extent to which the two consortia's assessment development efforts are likely to produce tests that measure and support goals for deeper learning. This report summarizes CRESST findings thus far, describing the evidence-centered design framework guiding assessment development for both Smarter Balanced and PARCC, as well as each consortium's plans for system development and validation. The report also provides an initial evaluation of the status of deeper learning represented in both consortia's plans.
Study results indicate that PARCC and Smarter Balanced summative assessments are likely to represent important goals for deeper learning, particularly those related to mastering and being able to apply core academic content and cognitive strategies related to complex thinking, communication, and problem solving. At the same time, the report points to the technical, fiscal, and political challenges that the consortia face in bringing their plans to fruition.
#361 – Sampling Variability of Performance Assessments
Richard Shavelson, Xiaohong Gao, and Gail Baxter
CSE Report 361, 1993
Summary
The authors of this study examined the sources of measurement error in a number of science performance assessments. In one part of the study, 186 fifth- and sixth-grade students completed each of three science tasks: an experiment to measure the absorbency of paper towels; a task that measured students' ability to discover the electrical contents of a black mystery box; and a task requiring students to determine sow bugs' preferences for various environments (damp vs. dry, light vs. dark). The researchers found that the measurement error was largely due to task-sampling variability; in essence, student performance varied significantly from one task to another. Based on their study of both science and mathematics performance assessments, the authors concluded that "regardless of the subject matter (mathematics or science), domain (education or job performance) or the level of analysis (individual or school), large numbers of tasks are needed to get a generalizable [dependable] measure of performance." Considering that science experiments are time consuming, ten tasks may represent a significant cost burden for schools, districts, or even states.

In another part of the study, the researchers evaluated several methods of assessing students on some of the same experiments: a notebook method, in which students conducted the experiment and then described in a notebook the procedures they followed and their conclusions; computer simulations of the tasks; and short-answer problems in which students answered questions about planning, analyzing, or interpreting the tasks. The notebook method and direct observation were the only methods that appeared to be fairly interchangeable. The results from both the short-answer problems and the computer simulations were disappointing. Increasing the number of tasks is costly and time consuming, the authors conclude, but they warn that trying to explain away technical problems is dangerous.
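As a rough illustration of why so many tasks are needed, the minimal sketch below applies the standard generalizability-theory formula for a person-by-task design; the variance components are hypothetical, not values taken from the report.

```python
# Minimal sketch (hypothetical variance components, not from the report):
# dependability of a person-by-task (p x t) design as the number of tasks grows.

def g_coefficient(var_person, var_residual, n_tasks):
    """Generalizability (dependability) coefficient for a p x t design."""
    return var_person / (var_person + var_residual / n_tasks)

var_person = 0.3    # universe-score (true person) variance, hypothetical
var_residual = 1.0  # person-by-task interaction plus residual error, hypothetical

for n_tasks in (1, 3, 10, 20):
    print(n_tasks, round(g_coefficient(var_person, var_residual, n_tasks), 2))
```

When error is dominated by the person-by-task interaction, as in this hypothetical case, dependability rises slowly with each added task, which is why on the order of ten or more tasks can be required for a usable score.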
#702 – English Language Learners and Math Achievement: A Study of Opportunity to Learn and Language Accommodation
Jamal Abedi, Mary Courtney, Seth Leon, Jenny Kao, and Tarek Azzam
CSE Report 702, 2006
Summary
This study investigated the interactions among students' opportunity to learn (OTL) in the classroom, two language-related testing accommodations, and students' English language learner (ELL) status and language proficiency, and how these variables affect mathematics performance. Hierarchical linear modeling was employed to investigate three class-level components of OTL, two language accommodations, and ELL status. The three class-level components of OTL were: (1) student report of content coverage; (2) teacher content knowledge; and (3) class prior math ability (as determined by the average of students' Grade 7 math scores). A total of 2,321 Grade 8 students were administered one of three versions of an algebra test: a standard version with no accommodation, a dual-language (English and Spanish) version, or a linguistically modified version. These students' teachers completed a measure of teacher content knowledge. Additionally, 369 of the students were observed for one class period to record student-teacher interactions. Students' scores from the prior year's state mathematics and reading achievement tests and other background information were also collected.
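As a rough illustration of the kind of multilevel analysis described above, the minimal sketch below fits a random-intercept model of math scores on class-level OTL measures and student-level covariates; the file name and variable names are hypothetical, not those used in the study.

```python
# Minimal sketch (hypothetical data and variable names, not the study's model):
# students (level 1) nested in classes (level 2), with class as the grouping factor.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # hypothetical file: one row per student

model = smf.mixedlm(
    "math_score ~ content_coverage + teacher_knowledge + class_prior_math "
    "+ prior_math + ell_status + accommodation",
    data=df,
    groups=df["class_id"],  # random intercept for each class
)
result = model.fit()
print(result.summary())
```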
Results indicated that all three class-level components of OTL were significantly related to math performance, after controlling for prior math ability at the individual student level. Class prior math ability had the strongest effect on math performance. Results also indicated that teacher content knowledge had a significant differential effect on the math performance of students grouped by a quick reading proficiency measure, but not by students’ ELL status or by their reading achievement test percentile ranking. Results also indicated that the two language accommodations did not impact students’ math performance. Additionally, results suggested that, in general, ELL students reported less content coverage than their non-ELL peers, and they were in classes of overall lower math ability than their non-ELL peers.
While it is understandable that a student's performance in seventh grade strongly determines the content he or she receives in eighth grade, this study provides some evidence that students with lower language proficiency can learn algebra and demonstrate algebra knowledge and skills when sufficient content and skills are delivered by proficient math instructors in a classroom of students who are proficient in math.
#385 – The Evolution of a Portfolio Program: The Impact and Quality of the Vermont Program in Its Second Year (1992-1993)
Daniel Koretz, Brian Stecher, Stephen Klein, and Daniel McCaffrey
CSE Report 385, 1994
Summary
Part of an ongoing evaluation of the Vermont portfolio assessment program by RAND/CRESST researchers, this report presents recent analyses of the reliability of Vermont portfolio scores, along with the results of school principal interviews and teacher questionnaires. The message, especially from Vermont teachers, say the researchers, remains mixed. Math teachers, for example, have modified their curricula and teaching practices to emphasize problem solving and mathematical communication skills, but many feel they are doing so at the expense of other areas of the curriculum. About half of the teachers report that student learning has improved, but an equal number feel that there has been no change. Teachers also reported great variation in how portfolios were implemented in their classrooms, including the amount of assistance provided to students. "One in four teachers," found the authors, "does not assist his or her own students in revisions, and a similar proportion does not permit students to help each other. Seventy percent of fourth-grade teachers and thirty-nine percent of eighth-grade teachers forbid parental or other outside assistance." Consequently, students who receive more support from teachers, parents, and other students may have a significant advantage over students who receive little or no outside help.

Reliability problems continue. "The degree of agreement," wrote the authors, "among Vermont's portfolio raters was much lower than among raters in studies with other types of constructed response measures." The authors suggest that one cause of the low reliability was the diversity of tasks within each portfolio. Because teachers and students are free to select their own pieces, performance on the tasks is much more difficult to assess than if the work were standardized.

Despite these problem areas, support for the portfolio program remains high. Teachers, for example, expressed strong support for expanding portfolios to all grade levels. Seventy percent of principals said that their schools had extended portfolio usage beyond the original Vermont state mandate.
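As a rough illustration of what rater agreement means here, the minimal sketch below computes exact agreement and the correlation between two raters; the scores are invented, and this is not the study's actual analysis.

```python
# Minimal sketch (invented scores, not the study's analysis): two simple ways
# to gauge agreement between two raters scoring the same portfolios on a 1-4 scale.
import numpy as np

rater_a = np.array([3, 2, 4, 1, 3, 2, 4, 3, 1, 2])
rater_b = np.array([2, 2, 3, 1, 4, 1, 4, 2, 2, 2])

exact_agreement = np.mean(rater_a == rater_b)      # proportion of identical scores
correlation = np.corrcoef(rater_a, rater_b)[0, 1]  # linear association between raters

print(f"exact agreement: {exact_agreement:.2f}")
print(f"inter-rater correlation: {correlation:.2f}")
```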
#733 – Testing One Premise of Scientific Inquiry in Science Classrooms: A Study That Examines Students' Scientific Explanations
Maria Araceli Ruiz-Primo, Min Li, Shin-Ping Tsai, and Julie Schneider
CRESST Report 733, 2008
Summary
In this study we analyze the quality of students' written scientific explanations in eight inquiry-based middle school science classrooms and explore the link between the quality of students' scientific explanations and their performance. We analyzed explanations based on three components: a claim, evidence to support it, and reasoning that justifies the link between the claim and the evidence. Quality of explanations was linked with students' performance on different types of assessments focused on the content of the science unit studied. To identify critical features related to high-quality explanations, we also analyzed the characteristics of the instructional prompts that teachers used. Results indicated that: (a) students' written explanations can be reliably scored with the proposed approach; (b) the instructional practice of constructing explanations has not been widely implemented, despite its significance in the context of inquiry-based science instruction; (c) overall, a low percentage of students (18%) provided explanations with the three expected components, and the majority (40%) of the "explanations" found were presented as claims without any supporting data or reasoning; and (d) the correlations between the quality of students' explanations and their performance, all positive but of varied magnitude according to the type of assessment, indicate that engaging students in the construction of high-quality explanations might be related to higher levels of student performance. The opportunities to construct explanations, however, seem to be limited. We also report some general characteristics of instructional prompts that were associated with higher quality written explanations.
#629 – Inclusion of Students with Limited English Proficiency in NAEP: Classification and Measurement Issues
Jamal Abedi
CSE Report 629, 2004
Summary
Research reports major concerns over classification and measurement for students with limited English proficiency (LEP). Among these concerns are a poor operational definition of the English language proficiency construct and questions about the validity of existing language proficiency tests. Decisions about including LEP students in large-scale assessments such as the National Assessment of Educational Progress (NAEP) may be directly influenced by some of these factors. Weak relationships between existing LEP classification codes and both English proficiency and achievement test scores raise concerns over the validity of the LEP classification system, and these factors have contributed to inconsistencies in LEP classification across districts and states. Criteria used for the inclusion of LEP students in NAEP need to be more objectively defined. Based on the recommendations of existing research, the appropriate levels of English language proficiency for participation in NAEP should be determined by reliable and valid English language proficiency measures. With funding awarded through a competitive bidding process authorized under the No Child Left Behind provisions on Enhanced Assessment Instruments, national efforts are currently underway to develop English proficiency tests that can provide valid measures of students' level of English proficiency. These efforts should be guided by relevant theory and research findings; otherwise, past problems relating to the validity of English proficiency tests may recur. Multiple criteria, including valid and reliable measures of students' level of English proficiency, could support a more consistent decision-making process for the inclusion of LEP students.
#440 – Assessing Equity in Alternative Assessment: An Illustration of Opportunity-to-Learn Issues
Joan Herman, Davina C.D. Klein, Sara T. Wakai, and Tamela Heath
CSE Report 440, 1996
Summary
Based on the 1993 California Learning Assessment System (CLAS) Middle Grades Mathematics Performance Assessment, an innovative alternative assessment, the study explores whether all schools, regardless of the cultural, ethnic, or socioeconomic background of the students they serve, provide students an equal opportunity to learn what is assessed. Opportunity to learn was defined to include a range of variables likely to influence student performance, including access to resources, access to high-quality instructional content and processes, extra-school opportunities, and direct preparation for the CLAS. Data collection included teacher interviews, student surveys, student retrospective think-aloud interviews, and classroom observations of the assessment administration. Researchers chose 13 schools across the state to represent three broad school categories: affluent suburban; low-SES urban; and remote, mixed-SES rural. Findings highlight differences between school types on various opportunity-to-learn measures and suggest directions for future research.
#589 – The Early Academic Outreach Program (EAOP) and Its Impact on High School Students’ Completion of the University of California’s Preparatory Coursework
Denise D. Quigley and Seth Leon
CSE Report 589, 2003
Summary
Providing academic development services to high school students is intended to improve students' skills and, in turn, assist them in completing the UC preparatory coursework, which is the first step toward achieving UC eligibility, enrolling in college, and completing a four-year degree. This report tests the hypothesis that the academic development services offered by the University of California through the Early Academic Outreach Program (EAOP) result in more students completing the UC preparatory coursework, the first hurdle to being eligible to apply to and be admitted to the University of California. We analyzed the course-taking behavior of two cohorts of high school students in a large urban school district in California, using student-level district data from their 7th- through 12th-grade years, spanning 1994/95 to 1999/2000, which included student demographics, language information, course-taking behavior, and course grades. The report uses the availability of EAOP at a school to correct for the endogeneity of participation in the program. This technique, known as difference in differences, statistically separates the effect of participation in EAOP on students' subsequent completion of the UC preparatory coursework from the effects of other characteristics of the student or the school. Our results are definitive and suggest that students who participate in EAOP throughout high school are twice as likely to complete the UC preparatory coursework by the end of 12th grade as nonparticipants.
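As a rough illustration of the difference-in-differences logic, the minimal sketch below nets out a common trend from the change in completion rates at schools offering EAOP; all numbers are hypothetical, not results from the study.

```python
# Minimal sketch (hypothetical rates, not the study's data or model):
# the basic difference-in-differences calculation.

# Completion rates of the UC preparatory coursework, before/after EAOP is available
eaop_school_before, eaop_school_after = 0.20, 0.35
other_school_before, other_school_after = 0.22, 0.25

change_eaop = eaop_school_after - eaop_school_before     # change at EAOP schools
change_other = other_school_after - other_school_before  # change at comparison schools

# Subtracting the comparison-school change nets out trends common to both groups.
did_estimate = change_eaop - change_other
print(f"difference-in-differences estimate: {did_estimate:.2f}")
```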
#644 – The Relationship Between School Quality and the Probability of Passing Standards-Based High-Stakes Performance Assessments
Pete Goldschmidt and Jose-Felipe Martinez-Fernandez
CSE Report 644, 2004
Summary
We examine whether school quality affects passing the California High School Exit Exam (CAHSEE), a standards-based high-stakes performance assessment. We use 3-level hierarchical logistic and linear models, taking advantage of the availability of student-, teacher-, and school-level data, to examine students' probabilities of passing the CAHSEE. The indicators of school quality are the Academic Performance Index (API) and magnet school status. The results indicate that both indicators of quality improve the probability of passing the CAHSEE, even after accounting for individual student characteristics. Also, the effect of opportunity to learn on the probability of passing increases with school quality.
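As a rough illustration of a multilevel logistic analysis of this kind, the minimal sketch below models the probability of passing with random intercepts for teachers and schools; the file name, variable names, and estimation approach are assumptions, not the authors' actual specification.

```python
# Minimal sketch (hypothetical data and variables, not the authors' model):
# a logistic model of passing with teacher and school variance components,
# fit approximately via statsmodels' Bayesian mixed GLM.
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

df = pd.read_csv("cahsee.csv")  # hypothetical file: one row per student

model = BinomialBayesMixedGLM.from_formula(
    "passed ~ prior_achievement + ell_status + api_score + magnet_school",
    vc_formulas={
        "school": "0 + C(school_id)",    # school-level random intercepts
        "teacher": "0 + C(teacher_id)",  # teacher-level random intercepts (IDs unique across schools)
    },
    data=df,
)
result = model.fit_vb()  # variational Bayes approximation
print(result.summary())
```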