Reports

Please note that CRESST reports were called "CSE Reports" or "CSE Technical Reports" prior to CRESST report 723.

#752 – Validity from the Perspective of Model-Based Reasoning
Robert J. Mislevy

Summary
From a contemporary perspective on cognition, the between-persons variables in trait-based arguments in educational assessment are absurd over-simplifications. Yet, for a wide range of applications, they work. Rather than seeing such variables as independently-existing characteristics of people, we can view them as summaries of patterns in situated behaviors that could be understood at the finer grainsize of sociocognitive analyses. When done well, inference through coarser educational and psychological measurement models suits decisions and actions routinely encountered in school and work, yet is consistent with what we are learning about how people learn, act, and interact. An essential element of test validity is whether, in a given application, using a given model provides a sound basis for organizing observations and guiding actions in the situations for which it is intended. This presentation discusses the use of educational measurement models such as those of item response theory and cognitive diagnosis from the perspective of model-based reasoning, with a focus on validity.

#751 – An Alternative IRT Observed Score Equating Method
Taehoon Kang, Troy T. Chen

Summary
This report develops an alternative item response theory (IRT) observed score equating method. The proposed equating method was illustrated with two real data sets and the equating results were compared to those of traditional IRT true score and IRT observed score equating methods. Using three loss indices, the new method appeared to produce equating equivalents more similar to those of the IRT observed score equating than those of the IRT true score equating. In addition to the conversion relationships between new form scores and their equating equivalents on the old form scale, the bootstrap standard errors of equating were provided and compared for the three IRT equating methods. These methods performed similarly.

#750 – Some Aspects of the Technical Quality of Formative Assessments in Middle School Mathematics
Julia Phelan, Taehoon Kang, David N. Niemi, Terry Vendlinski, Kilchan Choi

Summary
While research suggests that formative assessment can be a powerful tool to support teaching and learning, efforts to jump on the formative assessment bandwagon have been more widespread than those to assure the technical quality of the assessments. This report covers initial analyses of data bearing on the quality of formative assessments in middle school mathematics. Specifically, these data address the question of whether relatively short assessments can provide reliable and useful information on middle school students' understanding of conceptual domains in pre-algebra. Items and test forms were developed and tested in four domains (rational number equivalence, properties of arithmetic, principles for solving equations, and applications of these concepts to other domains), all of which are critical to eventual mastery of algebra. We tested the items with sixth-grade students in classrooms in four districts. We then pared down the items to create eight assessment forms that were further tested alongside instructional support materials and professional development. Results of this study suggest that relatively brief formative assessments focused on key conceptual domains can provide reliable and useful information on students’ levels of understanding and possible misunderstandings in the domain.

#749 – Examining the Relationship between LA's BEST Program Attendance and Academic Achievement of LA's BEST Students
Denise Huang, Seth Leon, Deborah La Torre, Sima Mostafavi

Summary
Researchers and policymakers are increasingly interested in the impact of afterschool programs on youth development. Even though numerous studies have investigated the impact of afterschool participation on academic outcomes, there is limited research on the differential impact of afterschool programs based on students' participation rate. This study bridges that research gap and presents results from a study of the effectiveness of the LA's BEST afterschool program based on different levels of student participation. This research tracked 4 years of the academic histories for two cohorts of students participating in LA's BEST. We separated the students in each cohort into four categories based on their intensity of attendance in LA's BEST and then used a propensity based weighting method to remove existing differences in student background characteristics. Hierarchical growth modeling was employed to analyze the academic outcomes. Results indicate that math achievement outcomes of students vary by intensity of program participation. Student participants who attended LA's BEST over 100 days per year demonstrated greater math achievement growth than students with low program attendance. This finding was consistent, and was statistically significant, for both cohorts of students. In contrast, although the trend for English-language arts achievement growth was positive, and followed a developmental pattern similar to math, it did not vary significantly by intensity of program participation. This finding was also consistent for both cohorts of students.

#748 – Identification of Key Indicators of Quality in Afterschool Programs
Denise Huang, Deborah La Torre, Aletha Harven, Lindsay Perez Huber, Lu Jiang, Seth Leon, Christine Oh

Summary
Researchers and policymakers are increasingly interested in the issue of school accountability. Despite this, program standards for afterschool programs are not as fully developed as they are in other fields. This study bridges that gap and presents the results from a study that identifies benchmarks and indicators for high-quality afterschool programs. This research employed a multi-method approach, including a synthesis of literature on afterschool programs, observations, and a survey data collection of 15 high-quality afterschool program sites. Results of the study suggest that most of the issues emphasized in the afterschool literature can be considered core components of a quality afterschool program. This finding was consistent across the three broad categories of program organization, program environment, and instructional features. This study also revealed that some issues emphasized in the afterschool literature should be considered extra components that can increase quality, but that are not necessary. As a result, this study argues for a checklist strategy in assessing programs in order to meet quality-based standards. With further testing, refinement, and validation from larger study samples, this checklist tool can help evaluate afterschool programs in order to not only obtain basic core standards, but also to assist in identifying and tackling weak and problematic areas.

#747 – The Afterschool Experience in Salsa, Sabor y Salud
Evaluation 2007-08

Denise Huang, Deborah La Torre, Christine Oh, Aletha Harven, Lindsay Huber, Seth Leon, Sima Mostafavi

Summary
In the United States, there is an alarming trend toward obesity and inactivity among children. Minorities and economically disadvantaged children are at an even higher risk. According to the Centers for Disease Control and Prevention, one in two Latino children will become diabetic. As a result, there is a dire need for tailored intervention programs that take into account cultural, dietary, and lifestyle issues of the Latino community. Kraft Foods has partnered with the National Latino Children's Institute and developed a healthy lifestyles education program for Latino families called Salsa, Sabor y Salud (Food, Fun & Fitness). The current study examines the effectiveness of the child-centered version of the Salsa, Sabor y Salud curriculum at three pilot programs in Los Angeles and Chicago. The results of the outcome evaluation revealed that the child-focused Salsa, Sabor y Salud program has made a positive impact on students' healthy behaviors. Positive impacts were also seen in the knowledge and healthy behaviors of the instructors. Furthermore, the Salsa, Sabor y Salud messages have reached parents and families of the participants through the students as they shared their knowledge and encouraged their families to adopt healthier lifestyles.

#746 – Exploring Factors that Affect the Accessibility of Reading Comprehension Assessments for Students with Disabilities: A Study of Segmented Text
Jamal Abedi, Jenny C. Kao, Seth Leon, Lisa Sullivan, Joan L. Herman, Rita Pope, Veena Nambiar, Ann M. Mastergeorge

Summary
This study sought to explore factors that affect the accessibility of reading comprehension assessments for students with disabilities. The study consisted of testing students using reading comprehension passages that were broken down into shorter "segments" or "chunks." The results of the segmenting study indicated that: (a) segmenting did not affect reading performance of students without disabilities, suggesting that it does not compromise the validity of reading assessment; (b) segmenting did not affect reading performance of students with disabilities; (c) the segmented version had a higher reliability for students with disabilities without affecting the reliability for students without disabilities; and (d) no trends were observed with student motivation, general emotions and moods with respect to the segmented assessment. The study also introduced the idea of incorporating some commonly used accommodations for students with disabilities, such as test breaks, into the assessment. Limitations of the study included a disability sample with mostly students with specific learning disabilities and a high number of ELL students, as well as a reading assessment that only tested for reading comprehension and not other components of reading. More research using the methods in this study with different subjects can potentially shed additional light on accessibility issues in reading comprehension tests.

#745 – Assessment of Content Understanding through Science Explanation Tasks
Christy Kim Boscardin, Barbara Jones, Claire Nishimura, Shannon Madsen, Jae-Eun Park

Summary
Our recent review of content assessments revealed that language expectations and proficiencies are often implicitly embedded within the assessment criteria. Based on a review of performance assessments used in high school biology settings, we have found a recurring discrepancy between assessment scoring criteria and performance expectations. Without explicit scoring criteria to evaluate the language performance, it is difficult to determine how much of the overall performance quality can be attributed to language skills versus content knowledge. This is an especially important validity question for English Learners (ELs) under the current state assessment mandates. To date, studies of the validity and consequences of standards-based assessments for ELs have been limited. In the current study, we examined the various cognitive demands including language skills associated with successful performance on content assessments. Also, as part of the validity investigation, we developed and examined the relative sensitivity of performance-based assessment, which is constructed to be a more proximal measure of student understanding and sensitive to detecting instructional differences.

#744 – Examining Differential Item Functioning in Reading Assessments for Students with Disabilities
Jamal Abedi, Seth Leon, Jenny C. Kao

Summary
This study examines performance differences between students with disabilities and students without disabilities using differential item functioning (DIF) analyses in a high-stakes reading assessment. Results indicated that for Grade 9, many items exhibited DIF. Items that exhibited DIF were more likely to be located in the second half of the assessment subscales. After accounting for reading ability using a proxy score from items on the first half of the subscales, students with disabilities consistently underperformed on items located in the second half relative to the items located in the first half, as compared with students without disabilities. These results were seen in Grade 9 for data from two different states. These results were not seen for Grade 3. This study has several limitations. There was no access to information regarding the testing accommodations that students with disabilities might have received, and no access to the type of disabilities. Results of this study can shed light on potential factors affecting the accessibility of reading assessments for students with disabilities, in an ultimate effort to provide assessment tools that are conceptually and psychometrically sound for all students. A companion report is available examining differential distractor functioning for students with disabilities.

#743 – Examining Differential Distractor Functioning in Reading Assessments for Students with Disabilities
Jamal Abedi, Seth Leon, Jenny C. Kao

Summary
This study examines the incorrect response choices, or distractors, by students with disabilities in standardized reading assessments. Differential distractor functioning (DDF) analysis differs from differential item functioning (DIF) analysis, which treats all answers alike and examines all wrong answers against the correct answer. DDF analysis, in contrast, examines only the incorrect answers. If different groups, such as students with disabilities and students without disabilities, selected different incorrect responses to an item, then the item could mean something different to the different groups. Our study found items showing DDF for students with disabilities in Grade 9, but not for Grade 3. Results also suggest that items showing DDF were more likely to be located in the second half of the assessments than in the first half. Additionally, results suggest that in items showing DDF, students with disabilities were less likely to choose the most common distractor than students without disabilities. Results of this study can shed light on potential factors affecting the accessibility of reading assessments for students with disabilities, in an ultimate effort to provide assessment tools that are conceptually and psychometrically sound for all students. A companion report is available examining differential item functioning for students with disabilities.