Reports
Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.
#584 – How Are High School Students Faring in the College Prep Curriculum? A Look at Benchmark Data for UC Partner High Schools in the University of California's School/University Partnership Program
Denise D. Quigley and Seth Leon
CSE Report 584, 2002
Summary
Policymakers and educators are committed to increasing the competitive eligibility of high school students applying to the University of California (UC) and to increasing the representation of economically disadvantaged and underrepresented students on UC campuses. A core element of the University of California's strategy for accomplishing these goals is the School/University Partnership Program (S/UP) and its supportive academic development programs for students. Increasing UC eligibility by strengthening students' ability to complete UC preparatory coursework is both a key programmatic strategy and a primary goal of the Partnerships. The overarching aim of the School/University Partnership Program is to raise the rate at which students graduate from high school with a comprehensive educational background that makes them eligible for the University of California; completion of the A-G required course pattern is the single best indicator of this objective. This report establishes A-G course completion rates and course-taking patterns for a group of UC School/University Partnership schools in a large urban school district in California. These data clarify the nature of the problems that must be systematically addressed and establish baseline trends against which future goals can be realistically set. They are crucial for Partnership, Partner school, and school district staff in understanding the basic issues, and the potential solutions, involved in increasing UC eligibility and UC preparatory course taking. We found that a large majority of students in the UC Partner schools were not successfully completing the college prep curriculum: student mobility and failure to take or complete the A-G courses left very small percentages of students on track to attain A-G completion by the end of 12th grade. The course-taking patterns outlined in this report are a first step toward a set of diagnostic tools for both increasing the number of students on track and keeping students on track toward A-G eligibility by the end of 12th grade.
#807 – How Middle School Mathematics Teachers Use Interim and Benchmark Assessment Data
Lorrie A. Shepard, Kristen L. Davidson, and Richard Bowman
CRESST Report 807
Summary
In recent years, many school districts have adopted interim or benchmark assessments to help inform instruction during the school year. In this study, researchers drew on interviews across seven school districts and analyzed: 1) teachers' understandings of their school districts' purposes for interim assessments; 2) professional development to support teachers' effective use of assessment information; and 3) teachers' reported use of the data to adjust instruction. The researchers found that although many teachers expressed interest in using interim assessment results as intended, they often received minimal professional development and frequently understood the intended use of the assessments differently than district leaders did. The researchers concluded that, in general, the interim assessment data did not give teachers insight into what to do next to help students, other than to reteach.
#368 – Cross-Scorer and Cross-Method Comparability and Distribution of Judgments of Student Math, Reading, and Writing Performance: Results From the New Standards Project Big Sky Scoring Conference
Lauren Resnick, Daniel Resnick, and Lizanne DeStefano
CSE Report 368, 1993
Summary
Partially funded by CRESST, the New Standards Project is an effort to create a state- and district-based assessment and professional development system that will serve as a catalyst for major educational reform. In 1992, as part of a professional development strategy tied to assessment, 114 teachers, curriculum supervisors, and assessment directors met to score student responses from a field test of mathematics and English language arts assessments. The results of that meeting, the Big Sky Scoring Conference, were used to analyze comparability across scorers and comparability across holistic and anaholistic scoring methods. "Interscorer reliability estimates," wrote the researchers, "for reading and writing were in the moderate range, below levels achieved with the use of large-scale writing assessment or standardized tasks. Low reliability limits the use of [the] 1992 reading and writing scores for making judgments about student performance or educational programs," concluded the researchers. However, interscorer reliability estimates for math tasks were somewhat higher than for literacy: for six out of seven math tasks, reliability coefficients approached or exceeded acceptable levels. Use of anaholistic and holistic scoring methods resulted in different scores for the same student response. The findings suggest that the large number and varied nature of participants may have jeopardized the production of valid and reliable data. "Scorers reported feeling overwhelmed and overworked after four days of training and scoring," wrote the researchers. Despite these difficulties, the conference provided evidence that large-scale performance assessments can be scored reliably when ample time is allowed for training, evaluation, feedback, and discussion; when performance levels and the distinctions between them are clearly defined; and when well-chosen exemplars are used.
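To make the reliability indices discussed above concrete, here is a minimal sketch, using invented ratings rather than any conference data, of two common interscorer statistics: the correlation between two scorers' ratings and their exact-agreement rate. All names and values are hypothetical.

```python
# Hypothetical illustration of interscorer reliability statistics;
# the ratings below are invented, not data from the study.
import numpy as np

scorer_a = np.array([3, 2, 4, 1, 3, 2, 4, 3])  # scorer A's ratings on a 1-4 rubric
scorer_b = np.array([3, 3, 4, 1, 2, 2, 4, 2])  # scorer B's ratings of the same responses

# Pearson correlation between the two scorers' ratings: one common
# index of interscorer reliability for ordinal rubric scores.
r = np.corrcoef(scorer_a, scorer_b)[0, 1]

# Exact-agreement rate: proportion of responses given the same score.
agreement = np.mean(scorer_a == scorer_b)

print(f"interscorer correlation: {r:.2f}, exact agreement: {agreement:.2f}")
```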
#635 – Consequences and Validity of Performance Assessment for English Learners: Assessing Opportunity to Learn (OTL) in Grade 6 Language Arts
Christy Kim Boscardin, Zenaida Aguirre-Muñoz, Marjorie Chinen, Seth Leon, and Hye Sook Shin
CSE Report 635, 2004
Summary
In response to the growing achievement gap between English Learners (ELs) and non-ELs, standards-based instruction and assessment have been promulgated at the state and federal levels. Yet the consequences of standards-based assessment reforms for ELs have rarely been systematically studied. The work reported here represents the initial study of a 4-year research project investigating how the implementation of standards-based performance assessments and related instructional strategies influences the achievement of ELs. In this study, we were specifically interested in identifying the opportunity-to-learn (OTL) variables that positively impact student performance. We also investigated potential differences in the impact of OTL on performance between ELs and non-ELs. Our study suggested that several factors contribute to students' performance on the Language Arts Performance Assignment (LAPA). At the student level, the analysis suggested that the greatest contributors to individual students' LAPA scores were performance on the Stanford 9 Language test, ethnicity, gender, and language proficiency status. At the teacher level, we found that content coverage was significantly associated with student performance. The study showed that higher levels of content coverage in both writing and literary analysis were associated with higher performance for all students, including ELs. We also found a differential impact of one OTL variable, content coverage-writing, on ELs' performance. This finding indicates that the gap between ELs and non-ELs increases as teacher reports of content coverage-writing increase.
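To illustrate the analytic setup, here is a minimal sketch, under stated assumptions rather than from the report itself, of the kind of two-level (students-within-teachers) model the summary describes, with an EL-by-content-coverage interaction standing in for the reported differential impact. The variable names and the simulated data are hypothetical.

```python
# A hypothetical sketch of a two-level model with students nested within
# teachers; not the authors' actual model, variables, or data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_teachers, n_students = 20, 25

df = pd.DataFrame({
    "teacher_id": np.repeat(np.arange(n_teachers), n_students),
    "el": rng.integers(0, 2, n_teachers * n_students),         # EL status (0/1)
    "stanford9": rng.normal(50, 10, n_teachers * n_students),  # prior test score
})
coverage = rng.normal(0, 1, n_teachers)                        # teacher-level OTL variable
df["coverage_writing"] = coverage[df["teacher_id"]]
df["lapa"] = (0.5 * df["stanford9"] + 2.0 * df["coverage_writing"]
              - 3.0 * df["el"] - 1.5 * df["el"] * df["coverage_writing"]
              + rng.normal(0, 5, len(df)))                     # simulated outcome

# Random-intercept model: the el:coverage_writing term corresponds to the
# kind of differential impact of content coverage on EL performance
# reported in the summary.
model = smf.mixedlm("lapa ~ stanford9 + el * coverage_writing",
                    data=df, groups=df["teacher_id"])
print(model.fit().summary())
```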
#745 – Assessment of Content Understanding through Science Explanation Tasks
Christy Kim Boscardin, Barbara Jones, Claire Nishimura, Shannon Madsen, and Jae-Eun Park
CRESST Report 745, 2008
Summary
Our recent review of content assessments revealed that language expectations and proficiencies are often implicitly embedded within the assessment criteria. In a review of performance assessments used in high school biology settings, we found a recurring discrepancy between assessment scoring criteria and performance expectations. Without explicit scoring criteria for evaluating language performance, it is difficult to determine how much of the overall performance quality can be attributed to language skills versus content knowledge. This is an especially important validity question for English Learners (ELs) under the current state assessment mandates. To date, studies of the validity and consequences of standards-based assessments for ELs have been limited. In the current study, we examined the various cognitive demands, including language skills, associated with successful performance on content assessments. As part of the validity investigation, we also developed a performance-based assessment, constructed to be a more proximal measure of student understanding, and examined its relative sensitivity to instructional differences.
#727 – Developing Academic English Language Proficiency Prototypes for 5th Grade Reading: Psychometric and Linguistic Profiles of Tasks
Alison L. Bailey, Becky H. Huang, Hye Won Shin, Tim Farnsworth, and Frances A. Butler
CRESST Report 727, 2007
Summary
Within an evidentiary framework for operationally defining academic English language proficiency (AELP), linguistic analyses of standards, classroom discourse, and textbooks have led to specifications for the assessment of AELP. The test development process described here is novel in its emphasis on using linguistic profiles to inform the creation of test specifications and guide the writing of draft tasks. In this report, we outline the test development process we have adopted and present the results of studies designed to turn the drafted tasks into illustrative prototypes (i.e., tried-out tasks) of AELP for the 5th grade. The tasks use the reading modality; however, they were drafted to measure the academic language construct rather than reading comprehension per se. That is, the tasks isolate specific language features (e.g., vocabulary, grammar, language functions) occurring in different content areas (e.g., mathematics, science, and social studies texts). Taken together, these features are necessary for reading comprehension in the content areas; students must control all of them to comprehend the information presented in their textbooks. By focusing on individual language features, rather than the subject matter or overall meaning of a text, the AELP tasks are designed to help determine whether a student has sufficient antecedent knowledge of English language features to comprehend the content of a text.
#527 – Evaluation Report/UC Diagnostic Writing Service
Ellen Osmundson and Joan L. Herman
CSE Report 527, 2000
Summary
The University of California Diagnostic Writing Service (DWS), developed collaboratively by the University of California (UC), California State University (CSU), and Educational Testing Service (ETS), offers 11th-grade students and their teachers the opportunity to use college-level writing exams (prior versions of the Subject A exam for UC and the English Placement Test for CSU) and receive diagnostic feedback on their exam essays from university readers. The goal is to strengthen high school students' writing skills and increase general literacy. Both Web-based and paper-based delivery systems were used to administer the essay prompts. This report presents results of a formative evaluation regarding the value of the project for teachers and students and its effectiveness for future use and wider implementation.
#806 – District Adoption and Implementation of Interim and Benchmark Assessments
Kristen L. Davidson and Greta Frohbieter
CRESST Report 806
Summary
To provide more frequent information about student progress during the year, many school districts have been implementing "interim" or "benchmark" assessment programs. To date, little research has examined the implementation of interim assessments or their effects on teaching and learning. This report investigates districts' purposes in adopting interim or benchmark assessments, ensuing implementation efforts, and actual assessment uses. The researchers found a number of substantial barriers to success, including test questions that were predominantly multiple choice, a lack of professional development for teachers, and little shared understanding of assessment purposes and uses across district, school, and classroom levels. Based on the results, the researchers offer recommendations for a successful interim or benchmark assessment system.
#557 – The Validity of Knowledge Mapping as a Measure of Elementary Students' Scientific Understanding
Davina C. D. Klein, Gregory K. W. K. Chung, Ellen Osmundson, Howard E. Herl, and Harold F. O'Neil, Jr.
CSE Report 557, 2002
Summary
Although first popular as an instructional tool in the classroom, knowledge mapping has been used increasingly as an assessment tool. Knowledge mapping is expected to measure deep conceptual understanding and to allow students to visually characterize relationships among concepts in a domain. Our research examines the validity of knowledge mapping as an assessment tool in science.
Our approach to investigating this validity is three-pronged. First, we outline a model for the creation of knowledge mapping tasks, proposing a standard set of steps and using content area and educational experts to ensure the content validity of the measures. Next, we describe a scoring method used to evaluate student performance, including a discussion of the method's reliability and its relationship to other possible scoring systems.
Finally, we present our statistical results, including comparative analyses, multitrait-multimethod (MTMM) validity analyses involving two traits (students' understanding of hearing and of vision) and three measurement methods (knowledge mapping, essay, and multiple-choice tasks), critical proposition analyses, and analyses of students' propositional elaborations.
Results show knowledge maps to be sensitive to students' competency level, though the MTMM findings were mixed. We conclude with a discussion of implications and directions for future work.
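For readers unfamiliar with the MTMM design, the following is a minimal sketch, using invented scores rather than the study's data, of assembling a multitrait-multimethod correlation matrix for two traits crossed with three methods and reading off its convergent and discriminant entries.

```python
# Hypothetical MTMM illustration: two traits (hearing, vision) crossed with
# three methods (knowledge map, essay, multiple choice). Data are simulated,
# not from the study.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 120  # hypothetical number of students
true_hearing = rng.normal(0, 1, n)  # latent understanding of hearing
true_vision = rng.normal(0, 1, n)   # latent understanding of vision

# Each observed score = latent trait + method-specific noise.
scores = pd.DataFrame({
    f"{trait}_{method}": true + rng.normal(0, noise, n)
    for trait, true in [("hearing", true_hearing), ("vision", true_vision)]
    for method, noise in [("map", 0.8), ("essay", 0.9), ("mc", 0.7)]
})

mtmm = scores.corr()  # the full 6 x 6 MTMM correlation matrix

# Convergent validity: the same trait measured by different methods should
# correlate more highly than different traits measured by the same method.
print(mtmm.loc["hearing_map", "hearing_essay"])  # monotrait-heteromethod
print(mtmm.loc["hearing_map", "vision_map"])     # heterotrait-monomethod
```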
#387 – Specifications for the Design of Problem-Solving Assessments in Science
Brenda Sugrue
CSE Report 387, 1994
Summary
In Specifications for the Design of Problem-Solving Assessments in Science, CRESST researcher Brenda Sugrue draws on the CRESST performance assessment model to develop a new set of test specifications for science. Sugrue recommends that designers follow a straightforward approach for developing alternative science assessments. "Carry out an analysis of the subject matter content to be assessed," says Sugrue, "identifying key concepts, principles, and procedures that are embodied in the content." She adds that much of this analysis already exists in state frameworks or in the national science standards. Multiple-choice, open-ended, or hands-on science tasks can then be created or adapted to measure individual constructs, such as concepts and principles, and the links between them. In addition to measuring content-related constructs, Sugrue's model advocates measuring metacognitive and motivational constructs in the context of the content, which permits more specific identification of the sources of students' poor performance. Students may perform poorly because of deficiencies in content knowledge, deficiencies in constructs such as planning and monitoring, maladaptive perceptions of self and task, or some combination of these. The more specific the diagnosis of the source of poor performance, the more targeted the instructional interventions to improve performance can be. Sugrue's model includes specifications for task design, task development, and task scoring, all linked to specific components of problem-solving ability. An upcoming CRESST report will discuss the results of a study designed to evaluate the model's effectiveness in attributing variance in performance to particular components of problem solving and to particular measurement formats.