1.1 Introduction

While it has become commonplace to argue that high-quality teachers are essential to student learning, unfortunately, there is little clarity about the best means for improving teacher effectiveness, with major consequences for policymakers and researchers alike. Over the last several decades, education reformers have attempted to improve the capacity of the teacher labor force and the quality of instructional content, spawning a voluminous research literature in the process. However, the relationships between measures of teacher effectiveness and student outcomes, whether understood as mean achievement or equity, are inconsistent, and this, in turn, has raised serious questions about the best approach for achieving policy goals. In this sense, the ambiguity in the research literature has left policymakers with little direction as to the best approaches to reform. In addition, variation across countries in how various measures of teacher quality are related to student outcomes has made cross-country transfer of educational ideas difficult, since without understanding whether (or why) policies work differently in different national contexts, it is hard to know whether (or when) a particular policy should be adopted (Plucker 2014).

The central aim of this report was to investigate what international comparative assessments can reveal about the role of teachers in influencing student outcomes. While the bulk of existing research from the United States suggests that teacher credentials have a limited impact on student achievement, it remains unclear whether this finding is more broadly applicable. If the limited impact of teacher credentials in the United States is due to the specifics of the country’s teacher preparation system (or educational system more broadly) or statistical issues (for example, more limited variation in how teachers are educated), research would suggest that the United States should seek solutions in other countries. However, if the (weak) relationship holds across educational systems, then policymakers and researchers will have to determine whether formal teacher preparation is an effective lever for improving student outcomes.

1.2 Linking Teaching Quality to Student Outcomes

Ever since its foundation in the late 1950s, the International Association for the Evaluation of Educational Achievement (IEA) has focused on providing high-quality, impartial comparative data on student learning and contexts, aimed at enabling researchers and policymakers to unpack the “black box” of classroom instruction (Schmidt et al. 2018). Beginning with the original pilot study of 12 countries in the late 1960s, and continuing with the First International Mathematics Study (FIMS), the Second International Mathematics Study (SIMS), and the Trends in International Mathematics and Science Study (TIMSS), IEA-sponsored studies have attempted to identify the key mechanisms of classroom instruction. Since 1995, successive cycles of TIMSS have collected extensive information about teacher background and practices across countries; such data can therefore be used to address important questions about the role of teachers in influencing student outcomes, and are well suited to examining the relationship between teacher quality and student outcomes.

The design of TIMSS makes it feasible to focus on the role of teachers. There are two major long-term large-scale international assessments of student mathematics: IEA’s TIMSS, and the Programme for International Student Assessment (PISA) run by the Organisation for Economic Co-operation and Development (OECD). Both TIMSS and PISA randomly select a group of representative schools, but whereas the PISA study selects a group of 15-year-old students from within those schools without linking them to specific teachers, TIMSS selects intact classrooms and collects extensive information about teacher background and practice. As a consequence, TIMSS data provide a unique opportunity to examine the impact of teachers on a representative sample of students in multiple countries across time. Because TIMSS tests students at two different grades (namely grades four and eight), it is also possible to examine the specific impacts of teacher characteristics and teaching practices on students at two different stages of their learning.

1.3 Conceptual Framework and Research Questions

In this study, we took advantage of the design of TIMSS to examine key teacher characteristics (experience, education, and preparedness to teach) and teacher behaviors (instructional time and instructional content), and assessed how these were related to student outcomes using data collected for both grade four and grade eight across multiple past cycles of TIMSS. We used national curriculum data collected by the TIMSS assessments to examine how the match between teacher instruction and national curricular standards (i.e., instructional alignment) was related to educational outcomes. We also focused on the distributional impact of curriculum and instruction on students, with attention paid to overall inequality and inequality based on socioeconomic status, at both the student and country level. We explored the evolution of these associations over time using multiple methods, including regression, fixed-effects, and structural equation modeling. We also paid close attention to the methodological complexities involved in using TIMSS data and their impact on examining the relationships between teacher measures and student outcomes.

Our primary focus was student learning in mathematics. Although international assessments have also examined reading and science achievement, the research literature on the role of instructional content is much better grounded for mathematics, and hence presents a better example of the impact of schools and teachers on student outcomes because learning of mathematics takes place principally inside the classroom. Whereas students may be well exposed to reading, the use of language, or basic scientific concepts outside school, they tend to have much more limited exposure to mathematical concepts before they enter school (Sparks 2017).

We examined teacher effectiveness along multiple dimensions, including both traditional measures (teacher experience and teacher education) and less frequently employed indicators (instructional time and content, and preparedness to teach). Although others have examined the contribution of these elements to student achievement, to our knowledge, there has been no previous research into the joint effects of these factors using a multi-year, multi-country model. The research outlined in this report is therefore novel in assessing the robustness of these relationships across countries, treating each iteration of TIMSS as a separate sample, thereby testing the replicability and reproducibility of analyses based on only one year of data. Our approach also enabled us to examine whether the significant cross-country variation in some of the observed associations (especially the relationships of teacher experience and teacher education with achievement) was more consistent within countries. Examining trends provides a means to evaluate the success of an education system in strengthening teacher quality, improving coherence, and reducing inequality. This report builds on a valuable initial exploration of these topics undertaken by Mullis et al. (2016), in which TIMSS trend data were used to demonstrate considerable national-level curriculum reform (leading to greater instructional coherence), strengthened teacher preparation requirements, and a reduction in standard deviations in student performance, but little change in the amount of time devoted to teacher professional development or to mathematics instruction.

The conceptual model that we used in this report builds on the work of Blömeke et al. (2016), who applied a structural equation modeling approach to the TIMSS 2011 data, analyzing each country separately. Their model included teacher observable characteristics (years of experience, college major, and specialization), professional development (participation in broad mathematics instruction professional development, specific mathematics instruction professional development, and collaborative professional development), and teacher preparedness (using preparedness to teach numbers, geometry, and data indices) as direct predictors of student achievement. The relationship of these predictors with student outcomes is mediated by instructional quality, operationalized as a latent variable derived from clarity of instruction, supportive climate, and cognitive activation indicators. Blömeke et al. (2016) also controlled for student gender and books in the home in their analysis.

While our model is based on the Blömeke model, we incorporated several additional components. Blömeke et al. (2016) noted a weak relationship between instructional quality and student outcomes. As it stands, the Blömeke model addresses the “how” without the “what”; the weak relationships they observed may be explained by the absence of measures of instructional content (what teachers are teaching) and of how long they spend teaching it. Given the strong research base on the effects of opportunity to learn and instructional time on student learning (see Chap. 2), we believed that including such factors as additional mediating variables would influence the estimates of instructional quality and greatly strengthen the model as a whole. We thus also explored the relationship of teacher characteristics with content coverage and instructional time, which, despite its obvious plausibility, has received little attention in the research literature.
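The mediation structure described above can be written down schematically. The following is only a minimal sketch under simplifying assumptions (linear relationships, observed rather than latent mediators, a single classroom-level random effect); the symbols are illustrative notation introduced here and are not taken from Blömeke et al. (2016) or from the models estimated later in this report.

```latex
% Illustrative notation only; none of these symbols appear in the report.
% i indexes students, j indexes classrooms (teachers).
% X_j    : vector of teacher characteristics (experience, education, preparedness)
% M_{kj} : mediator k (k = 1 instructional quality, k = 2 content coverage,
%          k = 3 instructional time)
% Z_{ij} : vector of student controls (e.g., gender, books in the home)
% r_j    : classroom-level residual; e_{ij} : student-level residual
\begin{align}
  M_{kj} &= \alpha_k + \mathbf{a}_k^{\top}\mathbf{X}_j + u_{kj},
           \qquad k = 1, 2, 3 \\
  y_{ij} &= \beta_0 + \sum_{k=1}^{3} b_k M_{kj}
           + \mathbf{c}^{\top}\mathbf{X}_j
           + \mathbf{d}^{\top}\mathbf{Z}_{ij} + r_j + e_{ij}
\end{align}
% Indirect effect of teacher characteristic p through mediator k: a_{kp} b_k
% Total effect of teacher characteristic p: c_p + \sum_{k=1}^{3} a_{kp} b_k
```

In this notation, omitting content coverage and instructional time corresponds to dropping the terms for k = 2 and k = 3, in which case any association they share with instructional quality or with teacher characteristics would be absorbed into the remaining coefficients.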

Bearing all these considerations in mind, we developed four key questions to guide our research:

  (1) Are there identifiable trends in teacher quality and instructional metrics over time?

  (2) What are the relationships between student achievement and different types of teacher quality and instructional metrics?

  (3) How stable are these relationships across time and statistical method?

  (4) What are the relationships between student equity and teacher quality and instructional metrics?

In this report, we aim to address each of these questions in turn. We begin by reviewing the existing research literature on teacher effectiveness in Chap. 2. Next, in Chap. 3, we focus on the variables we used for our analyses and consider some of the methodological issues involved. Turning to our research questions, in Chap. 4, we present descriptive statistics for teacher quality and instructional metrics, establishing changes over time and potential trends in the education-system-level means reported in the TIMSS data. In Chap. 5, we further analyze multiple cycles of the data using ordinary least squares regression, multilevel, and fixed-effects models, concentrating on the relationship between teacher quality variables and student outcomes and paying close attention to the stability of estimates across time and across the different statistical methods. Expanding on this analysis, in Chap. 6 we present a multilevel structural equation model using TIMSS 2011 data that extends the earlier work of Blömeke et al. (2016). In Chap. 7, we shift our focus to our final research question regarding issues of educational equity, examining the associations between teacher characteristics and behaviors and differences among students, rather than average outcomes. Finally, we conclude in Chap. 8 with a discussion of the overall research findings, implications for policymakers and researchers, a review of potential limitations of the work, and suggestions for future research.