Reports
Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.
#611 – An Evidentiary Framework for Operationalizing Academic Language for Broad Application to K-12 Education: A Design Document
Alison L. Bailey and Frances A. Butler
CSE Report 611, 2003
Summary
With the No Child Left Behind Act (2001), all states are required to assess English language development (ELD) of English language learners (ELLs) beginning in the 2002–2003 school year. Existing ELD assessments do not, however, capture the necessary prerequisite language proficiency for mainstream classroom participation and for taking content-area assessments in English, thus making their assessment of ELD incomplete. What is needed are English language assessments that go beyond the general, social language of existing ELD tests to capture academic language proficiency (ALP) as well, thereby covering the full spectrum of English language ability needed in a school setting. This crucial testing need has provided impetus for examining the construct of academic language (AL) in depth and considering its role in assessment, instruction, and teacher professional development. This document provides an approach for the development of an evidentiary framework for operationalizing ALP for broad K-12 educational applications in these three key areas. Following the National Research Council (2002) call for evidence-based educational research, we assembled a wide array of data from a variety of sources to inform our effort. We propose the integration of analyses of national content standards (National Science Education Standards of the National Research Council), state content standards (California, Florida, New York, and Texas), English as a Second Language (ESL) standards, the language demands of standardized achievement tests, teacher expectations of language comprehension and production across grades, and the language students actually encounter in school through input such as teacher oral language, textbooks, and other print materials. The initial product will be a framework for application of ALP to test specifications, including prototype tasks that can be used by language test developers for their work in the K-12 arena.
Long-range plans include the development of guidelines for curriculum development and teacher professional development that will help assure that all students, English-only and ELLs alike, receive the necessary English language exposure and instruction to allow them to succeed in education in the United States.
#736 – Assessment Portfolios as Opportunities for Teacher Learning
Maryl Gearhart, Ellen Osmundson
CRESST Report 736, 2008
Summary
This report is an analysis of the role of assessment portfolios in teacher learning. Over 18 months, 19 experienced science teachers worked in grade-level teams to design, implement, and evaluate assessments to track student learning throughout a curriculum unit, supported by semi-structured tasks and resources in assessment portfolios. Teachers had the opportunity to complete three assessment portfolios for two or three curriculum units. Evidence of teacher learning included (a) changes over time in the contents of 10 teachers' portfolios spanning Grades 1–9 and (b) the full cohort's self-reported learning in surveys and focus groups. Findings revealed that Academy teachers developed greater understanding of assessment planning, quality assessments and scoring guides, strategies for analysis of student understanding, and use of evidence to guide instruction. Evidence of broad impact on teacher learning was balanced by evidence of uneven growth, particularly with more advanced assessment concepts such as reliability and fairness as well as curriculum-specific methods for developing and using assessments and scoring guides. The findings point to a need for further research on ways to balance general approaches to professional development with content specific strategies to deepen teacher skill and knowledge.
#503 – On the Cognitive Validity of Interpretations of Scores From Alternative Concept Mapping Techniques
Maria Araceli Ruiz-Primo, Susan Schultz, Min Li, and Richard Shavelson
CSE Report 503, 1999
Summary
The emergence of alternative forms of achievement assessments and the corresponding claims that they measure "higher order thinking" have significantly increased the need to examine their cognitive validity (Glaser & Baxter, 1997; Linn, Baker, & Dunbar, 1991). This study evaluates the validity of the connected-understanding interpretation of scores from three mapping techniques. The study focused on the correspondence between mapping-intended task demands, inferred cognitive activities, and scores obtained. We analyzed the concurrent and retrospective verbalizations of subjects at different levels of competency as they performed the mapping tasks, and compared the directedness of the mapping tasks, the characteristics of verbalization, and the scores obtained across techniques. Our results led to the following general conclusions: (a) Consistent with a previous study, we found that the three mapping techniques provided different pictures of students' knowledge. (b) Inferred cognitive activities across assessment tasks were different and corresponded to the directedness of the assessment task. The low-directed technique seemed to provide students with more opportunities to reflect their actual conceptual understanding than the high-directed techniques.
#562 – Looking Into Students' Science Notebooks: What Do Teachers Do With Them?
Maria Araceli Ruiz-Primo, Min Li, and Richard J. Shavelson
CSE Report 562, 2002
Summary
We propose the use of students' science notebooks as one possible unobtrusive method for examining some aspects of teaching quality. We used students' science notebooks to examine the nature of instructional activities they encountered in their science classes, the nature of their teachers' feedback, and how these two aspects of teaching were correlated with students' achievement. We examined the characteristics of students' science notebooks from 10 fifth-grade classrooms. Six students' notebooks in each classroom were randomly selected. Each entry of each student's science notebook was analyzed according to the characteristics of the activity, the quality of the student's performance as reflected by the notebook entry, and the teacher's feedback in the notebook. Results indicated that (a) raters can consistently classify notebook entries despite the diversity of the forms of communication (written, schematic, or pictorial). They can also consistently score the quality of a student's communication, conceptual and procedural understanding, and the quality of a teacher's feedback to the student. (b) The intellectual demands of the tasks required by the teachers were, in general, low. Teachers tended to ask students to record the results of an experiment or to copy definitions. (c) Low student performance scores across two curriculum units revealed that students' communication skills and understanding were far from the maximum score and did not improve over the course of instruction during the school year. And (d) teachers provided little, if any, feedback. Only 4 of the 10 teachers provided any feedback on students' notebook entries, and when feedback was provided, comments took the form of a grade, checkmark, or a code phrase. We concluded that the benefits of science notebooks as a learning tool for students and as a source of information for teachers were not exploited in the science classrooms studied.
#406 – Teachers' and Students' Roles in Large-Scale Portfolio Assessment: Providing Evidence of Competency With the Purposes and Processes of Writing
Maryl Gearhart and Shelby Wolf
CSE Report 406, 1995
Summary
From 1992-1994, the California Department of Education and the Center for Performance Assessment of Educational Testing Service were engaged in the development of a new standards-based portfolio component for the California Learning Assessment System (CLAS). Based on interviews with four teachers from different school settings, the researchers sought answers to the following questions: How did teachers participating in trials of the program understand the CLAS Portfolio Assessment Program and how did they use the dimensions of learning to guide their language arts curriculum and assessment practices? How did their students understand the dimensions of learning, and how did they use the dimensions to guide their portfolio choices? What implications do the findings have for large-scale portfolio assessment?
The CRESST researchers found that teachers' curriculum varied, providing students with quite different opportunities to learn about the dimensions of learning measured by the portfolios; teachers also varied in their approach to documentation of students' writing, providing students with different opportunities to demonstrate their competencies with portfolio choices. Findings suggest a need to balance the vision of student choice as a desirable goal for students with what is needed to ensure that portfolio raters are provided appropriate evidence of student performance.
#478 – Impact of Selected Background Variables on Students' NAEP Math Performance
Jamal Abedi, Carol Lord, and Carolyn Hofstetter
CSE Report 478, 1998
Summary
This study examined the effects of students' background characteristics on their NAEP math performance. Secured NAEP math items were administered to 1,394 eighth-grade students from schools with large Spanish-speaking student enrollments, sizable LEP student populations, and varying socioeconomic, language, and ethnic backgrounds.
Three test booklets were developed (original English, linguistically modified English, original Spanish) using the 1996 NAEP Grade 8 Bilingual Mathematics booklet. The three booklets were randomly assigned to the students within a given class. All booklets contained the same math items, differing only in their linguistic demands. During the linguistic modification process, only linguistic structures and non-technical vocabulary were modified; mathematics vocabulary and math content were retained.
The results of our analyses suggested that students performed highest on the modified English version, lower on the original English version, and lowest on the Spanish version of the math assessment. Additionally, non-LEP (fluent English proficient, initially fluent in English) students performed better on the math test than LEP students, both in general and across test forms. These results were maintained even after controlling for students' reading proficiency. Finally, students may have performed lower on the Spanish version because, in most cases, the language of instruction was English only or sheltered English. Additional analyses suggested that students tend to perform best on math tests that are in the same language as their math instruction.
The results of this study also indicated that clarifying the language of the math test items helped all students improve their performance. Certain types of linguistic modifications may have contributed more than others to the significant math score differences.
Multiple regression analyses, predicting math and reading scores from students' background questions, indicated that language-related background variables, such as length of time of stay in the United States, students' grade point average, and the number of times the student changed schools, are good predictors of students' performance in math and reading.
#766 – Examining the Effectiveness and Validity of Glossary and Read-Aloud Accommodations for English Language Learners in a Math Assessment
Mikyung Kim Wolf, Jinok Kim, Jenny C. Kao, Nichole M. Rivera
CRESST Report 766, November 2009
Summary
Many states' accommodation policies allow ELL students to use a glossary or have test items read aloud when taking the states' large-scale mathematics assessments. However, little empirical research has been conducted on the effects of these two accommodations on ELL students' test performance, and no research has examined how students actually use the accommodations provided. The present study employed a randomized experimental design and a think-aloud procedure to delve into the effects of the two accommodations. A total of 605 ELL and non-ELL students from two states participated in the experimental component, and a subset of 68 ELL students participated in the think-aloud component of the study. Results showed no significant effect of the glossary, and mixed effects of read-aloud on ELL students' performance. Read-aloud was found to have a significant effect for the ELL sample in one state, but not the other. Significant interaction effects between students' prior content knowledge and accommodations were found, suggesting that the given accommodation was effective for students who had acquired content knowledge. In the think-aloud component, students did not actively use the provided glossary, indicating a lack of familiarity with the accommodation. Implications for the effective use of accommodations and future research agendas are discussed.
To cite from this report, please use the following as your APA reference:
Wolf, M. K., Kim, J., Kao, J. C., & Rivera, N. M. (2009). Examining the effectiveness and validity of glossary and read-aloud accommodations for English language learners in a math assessment (CRESST Report 766). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
#522 – Instructional Variation and Student Achievement in a Standards-Based Education District
Lauren Resnick and Michael Harwell
CSE Report 522, 2000
Summary
This paper, part of a larger study of the links between instructional variation and variation in performance on standards-based assessment, reports on the relations between examination results and instructional variation in a diverse New York City school district. The district has put in place an educational improvement system founded on intensive, school-based professional development that is carefully related to a preferred framework for teaching literacy. This study provides strong evidence that the school district's school-based professional development program improves teaching quality for diverse schools in ways that affect students' achievement scores.
#767 – Measuring Opportunity to Learn and Academic Language Exposure for English Language Learners in Elementary Science Classrooms
José Felipe Martínez, Alison L. Bailey, Deirdre Kerr, Becky H. Huang, & Stacey Beauregard
CRESST Report 767, January 2010
Summary
The present study piloted a survey-based measure of Opportunity to Learn (OTL) and Academic Language Exposure (ALE) in fourth-grade science classrooms that sought to distinguish teacher practices with ELL (English language learner) and non-ELL students. In the survey, participant teachers reported on their instructional practices and the context of their science classrooms. A small subsample was also observed teaching a lesson in their classroom on two occasions. The pilot data were used to investigate basic psychometric properties of the survey: specifically, (a) the dimensions underlying the survey items, in particular whether OTL and ALE are distinct or overlapping dimensions of science instruction, and (b) the match between the information reported by teachers in the survey and that collected by classroom observers. Qualitative analyses of the observations and of teachers' open-ended survey responses informed the interpretation of the quantitative results and provided useful insights for refining the survey instrument to better capture the classroom experiences of ELL students.
To cite from this report, please use the following as your APA reference:
Martinez, J. F., Bailey, A. L., Kerr, D., Huang, B. H., & Beauregard, S. (2010). Measuring opportunity to learn and academic language exposure for English language learners in elementary science classrooms (CRESST Report 767). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
#737 – Recommendations for Assessing English Language Learners: English Language Proficiency Measures and Accommodation Uses
Mikyung Kim Wolf, Joan L. Herman, Lyle F. Bachman, Alison L. Bailey, Noelle Griffin
CRESST Report 737, 2008
Summary
The No Child Left Behind Act of 2001 (NCLB, 2002) has had a great impact on states’ policies for assessing English language learner (ELL) students. The legislation requires states to develop or adopt sound assessments in order to validly measure ELL students’ English language proficiency, as well as their content knowledge and skills. While states have moved rapidly to meet these requirements, they face challenges in validating their current assessment and accountability systems for ELL students, partly due to a lack of resources. Considering the significant role of assessment in guiding decisions about organizations and individuals, validity is a paramount concern. In light of this, we reviewed the current literature and policy regarding ELL assessment in order to inform practitioners of the key issues to consider in their validation process. Drawing on our review of literature and practice, we developed a set of guidelines and recommendations for practitioners to use as a resource for improving their ELL assessment systems. The present report is the last component of the series, providing recommendations for state policy and practice in assessing ELL students. It also discusses areas for future research and development.

