
1 Conceptual Framework and Research Questions

Growing up, children are confronted with stimulation from a multitude of sources. From the very beginning, interactions with their social and material surroundings stimulate their senses, provide learning impulses, and foster development. The BiKS-3-18 study favoured this very broad look at child development, starting with an overarching research question to identify environmental effects on child development (see e.g. Lehrl et al. this volume). The primary learning environments affecting early child development are the child’s family, non-family child care (in Germany most typically an age-heterogeneous preschool setting [Footnote 1]), and later on primary school. A child experiences these learning environments concurrently (e.g. family and preschool) or consecutively (preschool and primary school). In past research, each learning environment has generally established its own field, resulting in different research frameworks that distinguish research on one learning environment from research on others. Such idiosyncrasies facilitate stand-alone research for each learning environment but hamper research that takes into account the interactions and mutual dependencies between learning environments. One major research goal in BiKS-3-18 was to study different environmental contexts by creating a common assessment framework that balances generic aspects of learning environments and the particularities of each specific learning environment.

Educational and social monitoring has developed an approach to categorize context factors for child development by applying an analogy of (economic) production of outcomes. Grouping relevant factors heuristically along this production line results in input factors, processes, output and outcomes (e.g. Purves 1987; Scheerens 1991). The BiKS-3-18 study applied this heuristic (see Fig. 1) to the learning environments of family, preschool, and primary school, categorizing context factors made up of inputs (i.e. conditions, structural characteristics of a learning environment) and processes (i.e. interactions between a child and the learning environments). Closely related to both are belief systems and orientations, relating to the values and beliefs of relevant stakeholders. Earlier research suggested that structural background characteristics and orientations shape and condition processes. The latter in turn directly affect child development, the output of interest (immediate results in certain developmental domains) and outcomes (greater life achievements), thereby mediating effects of the other two categories of factors (NICHD 2002). In addition, research has proven that beyond the dosage or quantity of a certain environmental factor, a setting’s educational quality impacts child development (Anders 2013; Tietze et al. 1998; Tietze et al. 2005a). BiKS-3-18 therefore considered both the quantity and quality of stimulation in and across learning environments to be important for child development (see, for example, Anders et al. 2013). However, in the present paper, we will focus on the quality of stimulation.

Differentiating between certain outcomes, such as numeracy, literacy or socio-emotional development, leads to a differentiation on the learning environment side as well. Certain kinds of stimulation (e.g., reading with the child) may foster development in one domain (e.g., literacy) more strongly than in others (e.g., numeracy). Still other environmental factors may support development in different domains with equal strength. Consequently, context factors that were assumed to target a broader range of output domains were labelled “global”, whereas those that targeted only a single domain were labelled “domain-specific”. BiKS-3-18 studied global quality aspects as well as domain-specific facets of quality.

Fig. 1 Heuristic framework of educational quality in learning environments. Three mutually connected context factors (structural characteristics, belief systems and orientations, and processes) lead to child development.

However, the heuristic framework should not be misunderstood in the sense that BiKS follows a simple model of teaching and learning in which the educational quality of an environment by itself determines child development and learning. Development and learning depend not only on the quality of a learning environment; the provision also has to be used by the children, and this use in turn depends on their individual cognitive, social, motivational and affective dispositions and abilities as well as on their active role. Such a model of the supply and use of quality has, in the German context, been proposed by, for example, Fend (1980) (see also Seidel and Reiss 2014, or Kiel 2018). Thus, the heuristic framework in Fig. 1 displays only one side (the environmental side) of children’s effective learning. In analyses of effects on child development in BiKS-3-18, both sides are considered. In addition, the realized quality may also depend on child and family characteristics in the sense that the arrangement of quality adapts to the needs, motivations, and socio-emotional and cognitive conditions of the children and their families (and their teachers). In this regard, children and families influence the arrangement of quality.

Research has emphasised the importance of process quality for developmental outcomes over other context factors (Anders et al. 2016). Therefore, BiKS-3-18 put a strong emphasis on researching the quality of educational processes. One concept that is common in international research on learning and instruction (Pianta and Hamre 2009; Praetorius et al. 2020) distinguishes three dimensions of process quality: student support, structuring and guiding the learning process, and students’ cognitive activation and stimulation. We take this three-dimensional differentiation as a general orientation for our conceptualisation of quality. Furthermore, children are clustered in classes, which raises the question of whether teachers target their educational approach towards a whole class or towards individual children. Thus, individual children within the same classroom may experience different levels of quality. A distinction therefore has to be made between process quality related to the whole class and process quality related to individual children.

Applying this framework to each learning environment (i.e. family, preschool, and school), we pursue two research questions in this paper:

  • Research question 1: What are the levels of process quality across environments and do levels change over time?

  • Research question 2: How is process quality influenced by structural characteristics and beliefs and orientations of the caregivers?

Before presenting and discussing results according to these research questions, the research instruments used, especially with regard to process quality, are introduced.

2 Research Instruments

This section describes the measures applied to answer our research questions. A complete overview of the measures used can be found in Homuth et al. (this volume). The longitudinal design of BiKS is described in more detail in von Maurice et al. (this volume, see Fig. 2).

Measurement of Quality in Preschools

In preschools, BiKS-3-18 measures the quality of educational processes at class level and at single child level in parallel, including global as well as domain-specific aspects. Each of the 97 preschool classes was observed in spring 2006, 2007 and 2008 to cover the entire preschool phase of the BiKS-3-18 children. To assess process quality at class level we used the German version of the Early Childhood Environment Rating Scale-Revised Edition (ECERS-R, Harms et al. 1998; KES-R, Tietze et al. 2005b) for global aspects and the German version of the Early Childhood Environment Rating Scale-Extension (ECERS-E, Sylva et al. 2006a, b; KES-E, Rossbach et al. 2018) for domain-specific aspects. Both observational instruments are high-inference rating scales and were administered by trained observers during one morning in the preschools (four hours) at each measurement point. Compared to the focus on interactional aspects in the Classroom Assessment Scoring System (CLASS, Pianta et al. 2008), ECERS-R and ECERS-E put a stronger focus on the equipment and use of materials and, particularly in the case of the ECERS-E, on domain-specific aspects (Schmidt et al. 2018). Both instruments define environment in a broad sense and guide the observer to assess the arrangement of space, the materials and activities offered to the children, the supervision and interactions that occur in the classroom and the schedule for the day. Thus, the two instruments are particularly well suited to early childhood education practice in Germany, which is characterized by free play situations and open settings, whereas the CLASS instrument follows a more instructional educational approach. The ECERS-R consists of 43 items measuring global aspects of educational processes in seven domains (space and furnishings, personal care routines, language and reasoning, activities, interaction, program structure, and parents and staff).
The ECERS-E extends the ECERS-R to give additional insights into 18 important domain-specific aspects of educational processes in four subscales (literacy, mathematics, science and environment, and diversity). The scores range from 1 to 7, with 1 indicating inadequate quality, 3 minimal quality, 5 good quality, and 7 excellent quality (for more detail, see Kuger and Kluczniok 2008). Observers had to complete a four-day training course successfully before using the observational scales. Observer agreement for the KES-R, based on 20 double-coded observations and measured by Cohen’s kappa, was κ = 0.58; for the KES-E it was κ = 0.44. Both values indicate rather low inter-rater reliability. However, the percentage of agreement (within one scale point) between the two observers was 82% for the KES-R and 80% for the KES-E. The internal consistencies were satisfactory (Cronbach’s Alpha at the three measurement points ranged from α = 0.80 to 0.87 for the KES-R total score and from α = 0.70 to 0.73 for the KES-E total score).
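
The gap between the kappa values and the percentage of agreement within one scale point can be made concrete with a small calculation. The following Python sketch computes unweighted Cohen's kappa and the share of ratings that differ by at most one scale point. The function names and the example scores are hypothetical illustrations, not BiKS-3-18 data, and the use of the unweighted form of kappa is an assumption, since the text does not specify the variant.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of exact agreements actually observed.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def agreement_within(rater_a, rater_b, tolerance=1):
    """Share of paired ratings that differ by at most `tolerance` scale points."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    close = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return close / len(rater_a)


# Hypothetical item scores (1-7) from two observers; NOT BiKS-3-18 data.
observer_1 = [3, 4, 4, 5, 2, 3, 6, 4, 3, 5]
observer_2 = [3, 5, 4, 4, 2, 4, 6, 3, 3, 7]

print(round(cohens_kappa(observer_1, observer_2), 2))
print(agreement_within(observer_1, observer_2))
```

With these illustrative scores, exact agreement is 50% and kappa is about 0.36, while agreement within one scale point is 90%. Unweighted kappa counts a one-point deviation on a 7-point scale as a full disagreement, which is one reason the two indices can diverge.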

To assess process quality at single child level we developed a new instrument, called ZiKiB (target child observation, Kuger et al. 2006 [Footnote 2]; Smidt 2012). During the study preparation years (2004/2005), only a few target child-related instruments were available (e.g., Emerging Academic Snapshot, EAS, Ritchie et al. 2001 [Footnote 3]; Observation of Activities in Preschools, OAP, Palacios and Lera 1995 [Footnote 4]; for an overview of target child-related instruments see Riedmeier 2019; Smidt 2012). Several years later, the inCLASS, another target child-related instrument, was introduced in the U.S. (Downer et al. 2010) and is now increasingly used in German-speaking countries as well (e.g., Kluczniok and Schmidt 2020; Smidt and Embacher 2021; von Suchodoletz et al. 2015). The ZiKiB is to some degree oriented towards the ECERS-R and ECERS-E and the above-mentioned international instruments. It includes global and domain-specific aspects of quality ratings at target-child level, as well as a time sampling observation to document the daily activities of individual target children in preschool classes. Several areas are considered (e.g., activities of the target child, role of the preschool teacher). Moreover, the ZiKiB considers specific domains of support (e.g., of literacy- and numeracy-related abilities). [Footnote 5] Observer agreement for the ZiKiB was assessed in 20 preschools in Bavaria. Cohen’s kappa for the different parts of the quality ratings ranged from κ = 0.40 to κ = 0.81, that is, from low to good observer agreement (for more detail, see Smidt 2012; Smidt and Rossbach 2016).

Furthermore, the preschool teachers completed questionnaires on structural characteristics of the whole preschool setting and the preschool class (e.g., class size, class composition), of the preschool teacher (e.g., teaching experiences, qualification) as well as on their educational beliefs (e.g., educational goals, beliefs about teaching and learning in preschools).

Based on theoretical considerations from classroom research (Klieme et al. 2006), 28 items of the ECERS-R/KES-R and ECERS-E/KES-E were assigned, according to their content, to three scales (Kuger and Kluczniok 2008): classroom management/climate (12 items, Cronbach’s Alpha at the three measurement points: α = 0.63–0.78), stimulation in literacy (9 items, α = 0.72–0.78) and stimulation in numeracy (7 items, α = 0.66–0.77). These three scales are taken up again when reporting the level and stability of quality and the relations between structural characteristics, educational beliefs, and educational processes. For the ZiKiB (Smidt 2012; Smidt and Rossbach 2016), we also refer to one global and two domain-specific scales: support of social competencies (3 items, Cronbach’s Alpha at the three measurement points: α = 0.63–0.77), support of early literacy competencies (2 items, α = 0.53–0.90), and support of early numeracy competencies (2 items, α = 0.30–0.91). Again, some of the alphas are low; however, the small number of items per scale has to be taken into account. [Footnote 6] Applying the ZiKiB required successful completion of a four-day training, during which the data collectors had to achieve 80% agreement with a master rater (Smidt 2012, p. 68).
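
Why scales with few items tend to yield low alphas can be illustrated with the standardized form of Cronbach's alpha, which expresses alpha through the number of items k and the mean inter-item correlation. The mean inter-item correlation of 0.30 used below is a hypothetical value chosen for illustration; it is not taken from the BiKS-3-18 data.

```latex
\[
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\]
\[
k = 2:\quad \alpha_{\text{std}} = \frac{2 \cdot 0.30}{1 + 1 \cdot 0.30} \approx 0.46,
\qquad
k = 12:\quad \alpha_{\text{std}} = \frac{12 \cdot 0.30}{1 + 11 \cdot 0.30} \approx 0.84
\]
```

Under this assumption, the same average inter-item correlation gives a much lower alpha for a 2-item scale than for a 12-item scale.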

Measurement of Quality in Primary School Classes

Instructional quality in primary school classes was measured in the school years from 2009 to 2011/2012. [Footnote 7] At that time, to the best of our knowledge, no observation instrument was available that could readily be used to capture instructional quality in German primary school classes. For this reason, 26 items from various studies (Clausen 2002; Helmke and Schrader 1998 [Footnote 8]; Rakoczy and Pauli 2006) were adopted for an observational instrument and partially adapted to measure global instructional quality in primary school classes. These items can be assigned to the model of the three basic dimensions of teaching quality (for an overview, see Klieme et al. 2009; Praetorius et al. 2018). Items on global quality were rated on a 4-point scale (1 = strongly disagree – 4 = strongly agree) and grouped to form the scales classroom climate (6 items, Cronbach’s Alpha at the different measurement points α = 0.44–0.89), classroom management (6 items, α = 0.80–0.90) and global cognitive activation (14 items, α = 0.84–0.89). To analyse domain-specific aspects of educational classroom processes, another observational instrument was developed, based on current teaching methodology and on the implementation of co-constructive instruction, which refers to cognitive activation in mathematics (Steinweg 2011) and cognitive activation in literacy acquisition (Kammermeyer 2009 [Footnote 9]). The scale of cognitive activation in mathematics includes 5 items (α = 0.71–0.83) and the scale of cognitive activation in literacy acquisition 4 items (α = 0.47–0.63). The domain-specific items were rated on a 7-point scale (1 = inadequate – 7 = excellent). All observation scales (global and domain-specific) were rated after two or three lessons of a regular school day had been observed. Ratings on domain-specific items were based on time periods in which topics of the particular subject were taught.
Before using the observation scales, observers had to complete a three-day training course successfully. Interrater reliability was tested in 22 double ratings. The intraclass correlation ICC_unjust was used to assess observer agreement for the different rating scales and ranged from 0.56 to 0.76 (classroom climate 0.56, classroom management 0.76, global cognitive activation 0.62, cognitive activation in mathematics 0.62, and cognitive activation in literacy acquisition 0.73). Again, the reliabilities of the scales range from low to good (for more detail, see Grosse et al. 2017).

The teachers additionally received questionnaires to capture structural characteristics of the teachers and classes (e.g. job experience, class size, proportion of students whose parents had a native language other than German) and educational beliefs (e.g. transmissive beliefs, constructivist beliefs).

Measurement of Quality in Families—Preschool Phase

Quality of stimulation within the family at preschool age was recorded annually in 2005, 2006, and 2007 based on a multi-method approach. The more global and quantitative aspects were surveyed with the German version of the Home Observation for Measurement of the Environment–Early Childhood (HOME-EC; Bradley and Caldwell 1984), which includes 55 items integrated in written and oral questionnaires as well as in observations. The HOME-EC has been shown to capture the overall quality of stimulation within families reliably and validly, and it differentiates especially among families at the lower end of the quality continuum (Bradley et al. 2001). In addition, to cover more domain-specific aspects of stimulation within the family, the Family Rating Scale (FES-KiGa, Kuger et al. 2005 [Footnote 10]) was developed. The FES-KiGa is a live rating scale applied to a semi-standardised parent-child reading situation in the home environment in order to record the quality of parental support behaviour. The picture books used in this situation were designed within the BiKS-3-18 study and were thus unknown to the parents and the children. The parents were asked to share the book with their child as they usually do in joint picture book situations. A trained observer rated the quality of this interaction on 11 items measuring three dimensions: general language stimulation, stimulation of mathematics and stimulation in literacy. Each item is rated on a 7-point scale. The scale levels 1 (low quality), 3 (minimal quality), 5 (high quality), and 7 (excellent quality) are qualitatively characterized and described to facilitate and standardize the ratings. Beforehand, raters were trained to a criterion of 90% agreement (±1 scale point) with the gold standard of a master rating. Ten percent of observations were double coded by two independent raters (rater agreement ICC = 0.78).
In line with the overall framework of the home learning environment, three cross-instrument scales were formed from the FES-KiGa and HOME-EC. Items addressing global stimulation were grouped under the scale social support (Cronbach’s α = 0.47) and domain-specific ones under the scales stimulation in literacy (Cronbach’s α = 0.75) and stimulation in numeracy (Cronbach’s α = 0.71). Since the individual items included in the scales have different anchors, the items were standardised and transformed so that they take values from 0 = low quality to 1 = excellent quality. Structural aspects of the family and parental beliefs and orientations were measured with questionnaires (for more detail, see Kluczniok et al. 2013; Lehrl 2018; Lehrl et al. 2020).
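
One simple way to bring items with different anchors onto a common 0 to 1 range is a linear (min-max) rescaling by each item's lowest and highest possible score. The Python sketch below assumes this approach; the exact transformation used in BiKS-3-18 is not specified here, and the function names and example values are hypothetical.

```python
def rescale_to_unit(score, low_anchor, high_anchor):
    """Map a score from [low_anchor, high_anchor] linearly onto [0, 1]."""
    if high_anchor <= low_anchor:
        raise ValueError("high_anchor must be greater than low_anchor")
    if not low_anchor <= score <= high_anchor:
        raise ValueError("score lies outside the item's anchors")
    return (score - low_anchor) / (high_anchor - low_anchor)


def scale_score(items):
    """Average the rescaled items.

    `items` holds (score, low_anchor, high_anchor) tuples, one per item.
    """
    if not items:
        raise ValueError("at least one item is required")
    rescaled = [rescale_to_unit(score, low, high) for score, low, high in items]
    return sum(rescaled) / len(rescaled)


# Hypothetical example (not BiKS-3-18 data): one 7-point live rating
# scored 5 and one yes/no item coded 0/1 and scored 1.
example_items = [(5, 1, 7), (1, 0, 1)]
print(round(scale_score(example_items), 3))
```

After rescaling, a scale mean of 0.5 corresponds to the theoretical midpoint regardless of how many scale points the original items had.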

Measurement of Quality in Families—Primary School Age

Quality of stimulation within the family during primary school age was assessed annually in 2009, 2010, 2011 and 2012 (in each primary school year). The observations took place in those families who had participated since preschool age (“initial sample”, Homuth et al. this volume). As at preschool age, the data collection was based on a multi-method approach. The HOME Inventory was again used to record the more global and quantitative aspects at primary school age (German version of the Home Observation for Measurement of the Environment–Middle Childhood, MC-HOME, Bradley and Caldwell 1984; Tietze et al. 2005a). The BiKS-3-18 study used 38 of the 59 MC-HOME items and integrated them in written and oral questionnaires as well as in observations. A Family Rating Scale (FES-GS, BiKS Research Group 2009 [Footnote 11]) was developed analogously to the one used at preschool age. A semi-standardized situation is initiated that evokes parental support behaviour in a kind of homework situation independent of curriculum, subject, and learning level. The BiKS-3-18 research group used the commercially available game “Rushhour” as a puzzle task to initiate a parent-child interaction. Trained observers rated 10 items, again on a 7-point scale similar to the one used in the preschool phase. The 10 items of the live rating scale can be assigned to the basic dimensions of teaching quality according to Klieme and colleagues (2006): clarity of rules and structure, supportive teaching climate, and cognitive activation. The internal consistency of the overall scale is high across all survey time points (Cronbach’s α = 0.87–0.90, 10 items). The internal consistencies of the subscales are somewhat lower (clarity of rules and structure: α = 0.68–0.80, 3 items; supportive teaching climate: α = 0.70–0.79, 4 items; cognitive activation: α = 0.68–0.78, 3 items).
Furthermore, information on family background characteristics (e.g., educational attainment of parents, migration background, household composition and income), educational orientations (e.g., school-related expectations regarding the importance of certain support areas) as well as the children’s activities within the family (homework support of parents) and outside the family (e.g., leisure activities) was collected annually via questionnaires.

As mentioned, some of the developed scales have rather low reliabilities. This limitation has to be kept in mind when stabilities (Sect. 3) and multiple regressions (Sect. 4) in the 97 preschools are analysed.
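
One way to see why low reliabilities matter for stability coefficients is the attenuation formula from classical test theory, which relates an observed correlation to the correlation between error-free scores. The following is only an illustrative sketch: treating Cronbach's alpha as the reliability of a scale and the "true" stability of 0.60 are assumptions, not results from the study. The reliabilities 0.63 and 0.78 are the lowest and highest alphas reported above for the class-level scale classroom management/climate.

```latex
\[
r_{xy}^{\text{obs}} = \rho_{xy}\,\sqrt{r_{xx}\, r_{yy}}
\]
\[
\rho_{xy} = 0.60,\; r_{xx} = 0.63,\; r_{yy} = 0.78:
\qquad
r_{xy}^{\text{obs}} = 0.60 \cdot \sqrt{0.63 \cdot 0.78} \approx 0.42
\]
```

Under these assumptions, measurement error alone would lower an underlying stability of 0.60 to an observed correlation of about 0.42, so observed stabilities are best read as lower bounds.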

3 Level and Stability of Quality in the Different Learning Environments

The first research question concerns the levels and stabilities of process quality that children experience in their preschools, primary schools, and families during preschool and primary school age. How can the mean levels of quality in preschool, primary school, and family be evaluated? How heterogeneous are children’s experiences? Do children experience quality changes during their preschool years within learning environments? The following passages concentrate on process quality as the more proximal aspect for the development of children. For structural characteristics and beliefs of the actors in the respective environments, only selected aspects are reported. First, the level of quality in preschools is reported, followed by the level of quality in primary schools and the level of quality in families at preschool and primary school age.

Levels and Stability of Quality in Preschool Classes

Table 1 displays descriptive statistics of the global and domain-specific aspects of process quality in preschools for the three measurement points at class and child level.

Table 1 Descriptive statistics of process quality in the 97 preschools

Overall, we find only moderate quality at the class level and at single child level in the 97 preschools under study (see Homuth et al. this volume, for the sampling procedure). The more global aspects of educational quality at class level (“classroom management/climate”) as well as the original ECERS-R total scores are in the mid-range of the 7-point scale over the three preschool years, with quite large differences between classes (standard deviations from 0.7 to 1.1). Thus, children experience environments of quite different quality. The medium (mediocre) levels of process quality found here are also reported in other national and international studies. The global educational quality measured by the ECERS-R ranges between the means of 3.7 and 4.4 (LoCasale-Crouch et al. 2007; Sylva et al. 2006a, b; Tietze et al. 2013). Means above 5.0 (= good quality) are very rare in the BiKS-3-18 data: across the measurement points, between 2 and 4% of classes reach this level on the ECERS-R total score, compared to 7% in the NUBBEK study (Nationale Untersuchung zur Bildung, Betreuung und Erziehung in der frühen Kindheit, Tietze et al. 2013).

The domain-specific quality in the BiKS-3-18 study is lower: “stimulation in literacy” and “stimulation in numeracy” as well as the domain-specific ECERS-E total score indicate minimal domain-specific quality, with a slight increase over the three years of observation. This increase is possibly due to the introduction of educational plans (curricular guidelines) at that time in Bavaria and Hesse, which emphasize domain-specific stimulation. The lower level of domain-specific quality compared to global quality is in line with other national and international studies, such as NUBBEK (M = 2.8; Tietze et al. 2013) or The Effective Provision of Pre-School Education (EPPE) Project in England (M = 3.1; Sylva et al. 2006a, b). No preschool class in BiKS-3-18 reaches good or excellent values of domain-specific quality (values above 5 measured by ECERS-E). A study conducted about 10 years later shows that the situation in German preschools is still very similar to that found in the BiKS-3-18 data from spring 2006, 2007 and 2008 (Anders et al. 2021).

Table 1 shows that the global educational quality at single child level (“support of social competencies”) is at a moderate level in each of the three preschool years. By contrast, substantially different findings are revealed for the aspects “support of early literacy competencies” and “support of early numeracy competencies”. The overall pattern of results shows—comparable to the class level—fairly moderate quality in terms of the “support of social competencies” scale and rather inadequate quality in terms of the domain-specific scales “literacy” and “numeracy” over the entire preschool phase (Smidt and Rossbach 2016). It is difficult to compare these results with other studies, due to the lack of studies analysing the educational domain-specific quality at single child level.

Considering process quality, an overall picture emerges: Global process quality is only mediocre, domain-specific quality is even lower. Stimulation in mathematics is lower than in literacy. In all aspects, there is high variation between preschool classes. If process quality at the class level since the beginning of this century is compared to now, no general improvement of quality can be seen (see comparison in Tietze et al. 2013, p. 85). It could be speculated that new challenges for preschool (e.g., the introduction of educational curricula) may have interfered with efforts to improve quality.

In order to give a short impression on key structural characteristics that are of high political interest and therefore intensively discussed, three characteristics are presented for the first measurement point: The average class size (24.3 children, SD = 3.6) and the average staff-child ratio (12.4 children per teacher, SD = 4.6) are almost comparable to official statistics around the time of BiKS assessments (Bundesministerium für Familie, Senioren, Frauen und Jugend 2008). There are high variations in class size—ranging from 9 to 30 children—and teacher-child-ratio—ranging from a group with 4.5 children per preschool teacher to one with almost 28 children per teacher. On average, about a quarter of the children in the classes have a migration background (24.6%, defined by the mother tongue of the parents) with large differences between classes (SD = 26.8%). They range from classes without children with a migration background to classes in which all children have a migration background.

On the one hand, the mean levels of process quality do not change much over the three measurement points from 2006 to 2008. On the other hand, it is possible that the quality of individual classes changes. The question of the stability of educational quality is relevant for politics and practice insofar as one could assume that only stable (high) quality has positive effects on children’s development. However, studies on this topic are rare because few longitudinal studies measure educational quality several times in the same class. With the BiKS-3-18 study, this question can be addressed. The stability of the educational quality at class level over the three preschool years is low to moderate, with correlations ranging from r = 0.13 to 0.48, and is highest on both domain-specific scales. [Footnote 12] In sum, we find higher stability for the domain-specific quality aspects, which at the same time show a lower quality level. Changes in quality do not merely reflect measurement inaccuracy, because the changes in educational quality can partly be explained by changes in structural characteristics of the classes over the preschool years. Growth curve analyses using Bayesian estimation found that classroom composition characteristics (e.g., proportion of children with migration background in class, mean age of children in class) show the strongest relations to process quality in preschools (in terms of effect sizes), whereas allocated resources and characteristics of the staff (e.g., teaching experience) are less strongly related to the overall level of and changes in quality (Kuger et al. 2016). [Footnote 13]
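
Stability coefficients of this kind are correlations between the scores that the same classes receive at different measurement points. The Python sketch below assumes Pearson correlations between waves, which matches the reported r values in form; the function names and the class scores are hypothetical and are not BiKS-3-18 data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return s_xy / math.sqrt(s_xx * s_yy)


def stability_coefficients(waves):
    """Correlate every pair of measurement points.

    `waves` maps a wave label to a list of class scores; the classes must
    appear in the same order in every list.
    """
    labels = list(waves)
    return {
        (first, second): pearson_r(waves[first], waves[second])
        for i, first in enumerate(labels)
        for second in labels[i + 1:]
    }


# Hypothetical quality scores of five classes at three measurement points.
example_waves = {
    "2006": [3.1, 4.0, 3.6, 4.4, 2.9],
    "2007": [3.4, 3.8, 3.9, 4.1, 3.2],
    "2008": [3.9, 3.5, 3.6, 4.2, 3.0],
}

for pair, r in stability_coefficients(example_waves).items():
    print(pair, round(r, 2))
```

A coefficient near 1 would mean that classes keep their rank order in quality from one year to the next, whereas values in the range reported above indicate that the relative position of classes shifts considerably.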

Overall, these findings indicate a need for quality improvement in German preschools. This seems particularly relevant in light of the fact that high quality in particular has a positive effect on child development (e.g., Anders et al. 2012; Lehrl et al. this volume). Thus, in practice and politics, the focus should be on improving high-quality learning opportunities in preschools.

Levels and Stability of Instructional Quality in Primary Schools

The global aspects of instructional quality in primary schools all range, across the years, around values that can be classified as “good quality” (see Table 2). At all measurement points, the rating of the scale classroom climate is higher than the rating of the scale classroom management and, in particular, higher than that of the scale global cognitive activation. Other studies show inconclusive findings: The above pattern was also found in some international studies (e.g. Cadima et al. 2010; Ponitz et al. 2009), which, however, used a different classroom observation instrument. Further studies, using yet other classroom observation instruments, present a different picture (e.g. Lerkkanen et al. 2016; Pakarinen et al. 2014). The BiKS-3-18 study also considered domain-specific aspects of instructional quality. The respective means are in the mid to low range of the scale across the primary school classes. At all measurement points, the rating of the scale of cognitive activation in mathematics (M = 3.2–3.7) is higher than that of the scale of cognitive activation in literacy acquisition (M = 2.5–3.1). Why this is so must remain open at this point. In addition, we find high variation in all aspects of global and domain-specific instructional quality; this is especially true for the domain-specific aspects. As in preschool, the children experience very different levels of instructional quality in the primary school classes.

Table 2 Descriptive statistics of instructional quality in primary schools

The importance of good instructional quality for children’s learning has been shown in many studies (e.g. Decristan et al. 2016; Lerkkanen et al. 2016; Pakarinen et al. 2011). However, studies that investigated the stability of instructional quality in primary school classes over the primary school years are rare. The BiKS-3-18 study examined the stability of instructional quality and revealed low to medium stability (r = −0.02 to 0.50). Possible reasons for these low correlations are the specificity of instruction and the change of teachers over time, which occurred very often, especially in Bavarian primary schools.

To give a brief impression of key structural characteristics and teacher beliefs, we report selected values for the first measurement point (grade 1): the mean class size is 22 children and the mean proportion of students whose parents have a native language other than German is 27%. Similar to the preschool settings, the range of this proportion is very large, between 0 and 96%. Transmissive beliefs of teachers are lower (M = 2.8 on the 4-point scale) than constructivist beliefs (M = 3.4) (for results on selected structural characteristics and teacher beliefs in grade 2, see Grosse et al. 2017).

The results underline the need for efforts to improve cognitive activation in mathematics and in literacy acquisition in primary schools, especially because cognitive activation in mathematics was shown to be predictive of children’s mathematical growth in primary school (Lehrl et al. 2016). The importance of domain-specific aspects of instructional quality for children’s competencies at primary school age was also revealed by further studies (e.g. Fauth et al. 2014; Hanisch 2015). In addition, significant positive associations with later vocabulary status were shown for the global aspects of instructional quality, but not for the dimension cognitive activation in literacy (Grosse et al. 2017). These findings emphasize the importance of considering both global and domain-specific aspects of instructional quality in primary schools.

Levels and Stability of Process Quality at Home During Preschool Age

Table 3 displays the descriptive statistics of the different aspects of process quality at home during preschool age (three measurement points, 2006, 2007, 2008).

Table 3 Descriptive statistics of process quality during preschool age

Across the entire age range, the quality of literacy stimulation and of global stimulation is higher than the quality of interaction regarding mathematical content. Global stimulation, which includes family social support, shows the highest means across the preschool period. The overall quality of stimulation within the home learning environment (HLE), measured with the HOME, is slightly above the theoretical mean of 0.5 at all measurement points and indicates moderate to good quality for most of the children in the BiKS study.

With regard to the overall level of informal language stimulation via contact with books (without table), the BiKS-3-18 study showed that informal language promotion through reading aloud and the provision of books is relatively high on average throughout the preschool period (Lehrl 2018). Nevertheless, it must be pointed out that at all measurement points some children (around 10%) own fewer than 10 (first preschool year) or fewer than 20 (second and third year) books of their own, and that almost 30% of the children are not read to daily. These children miss out on important experiences with language, writing, and grammar in the years before they start school, which could be one reason for the increasing disparities in vocabulary and grammar acquisition found, e.g., by Farkas and Beron (2004) and Weinert et al. (2010).

Examining changes in the home learning environment can provide important information about the dynamics of parent-child interactions that foster young children during the preschool period as they approach school readiness. The level of change may vary across families due to their psychological, educational, or social resources (Son and Morrison 2010). Within the BiKS-3-18 study, the (interpersonal) stability of the educational quality of the home learning environment over the three preschool years is low to moderate, ranging from r = 0.12 to r = 0.54, and is highest on the two domain-specific scales. Compared with global stimulation, the numeracy scale and especially the literacy scale show higher stability across the measurement points at preschool age (r = 0.38–0.43 and r = 0.53–0.54, respectively). The relatively lower stability of global stimulation indicates that children of preschool age have very different experiences of global stimulation from year to year. Even if the levels of domain-specific stimulation are lower, parents seem to focus more consistently on stimulation of numeracy and literacy throughout the early years and before the start of school, whereas general family stimulation processes are subject to more change over the preschool period. We assume that parents may have consistent attitudes about the importance of domain-specific stimulation in the years before school starts.Footnote 14

To give a brief impression of key structural characteristics of the BiKS-3-18 families at the first measurement point (beginning of preschool): 34% of the mothers have 12 to 13 years of school education (entrance certificate for university), and their weekly working hours amount to 11.3 h on average. There are, on average, two children in the families, and the mean socio-economic status of the families lies almost in the middle of the HISEI scale (M = 52.3, SD = 15.5, scale from 16 to 90; Ganzeboom et al. 1992; see Homuth et al. this volume).

Levels and Stability of Quality at Home During Primary School Age

With regard to the stimulation in families during primary school age (see Table 4), the more global stimulation within the family at the beginning of primary school (Mt4 = 0.68, SDt4 = 0.14; measured with the total score of the MC-HOME) is comparable to the level of stimulation at preschool age (measured with the total score of the EC-HOME). The mean scores of the more general quality aspects of the FES-GS consistently range from M = 4.8 to M = 5.7 on the 7-point scales in all four primary school years on the scales of clarity of rules and structure and supportive teaching climate. The cognitive activation scale ranges just under one scale point below this, indicating a lower level of quality. Again, we find large differences between the families (with a maximum of one standard deviation) and, hence, the children experienced quite different learning environments in their families.

Table 4 Descriptive statistics of the MC-HOME and the Family Rating Scale for primary school age (FES-GS)

The means of the quality scale are quite similar across the four measurement points. If we look for (interpersonal) stability, the correlations are significant but relatively small: for the scale “supportive teaching climate” r = 0.16–0.27; for the scale “clarity of rules and structure” r = 0.10–0.23 and for the scale “supportive teaching climate” r = 0.21–0.28. Even if the mean levels of quality are quite stable across the four years in primary school, children experience to some degree different (levels of) family stimulation from year to year. This suggests that parental stimulation varies notably between the primary school years.Footnote 15

4 Relations Between Process Quality and Structural Conditions and Beliefs within Learning Environments

The second research question concerns the mutual relationships between structural characteristics of learning environments and the orientations and beliefs of the acting persons on the one side and process quality on the other. Especially in a political context, it is often assumed that structural conditions of preschool classes, such as class size, staff-child ratio, or teacher training, are of high importance for educational processes. In line with our framework model, it is assumed that improvements in such structural conditions will lead to higher process quality and, subsequently, to improvements in child development (see, for example, the discussions on the German federal law to improve quality in preschools, Bundesministerium für Familien, Senioren, Frauen and Jugend and Jugend- und Familienministerkonferenz 2016). Thus, an important research question is whether these assumptions hold and how strong the respective influences are. In pursuing these questions, this section restricts the analyses to the respective first measurement points of the BiKS-3-18 study in preschool, primary school, and the family at preschool and primary school age.

Conditions of Process Quality in Preschool Classes

First, the relations between process quality at the preschool class level and at the individual child level are reported. For the first measurement point (i.e., the first preschool year), findings from the BiKS-3-18 study show that the total score of the ZiKiB correlates at r = 0.46 with the total score of the ECERS-R and at r = 0.45 with the total score of the ECERS-E (Smidt 2012), indicating about 20% shared variance between the constructs. Referring to other instruments with different foci, Sabol et al. (2018) reported low correlations between the subscales of the CLASS and the inCLASS (r = 0.02 to r = 0.24). Thus, although the findings vary, a separate consideration of both levels of educational quality (i.e., class level and single child level) can be supported in principle. In accordance with previous results, this pattern of findings suggests that both single child-related and class-related educational quality should be considered in order to obtain a comprehensive picture of the educational processes in preschools.
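
As a brief worked illustration (added here for clarity, not taken from the original analyses), the shared variance of two measures can be read as the squared correlation coefficient, assuming a simple bivariate linear relation. The subscripts are labels chosen for this sketch:

```latex
r^2_{\text{ECERS-R},\,\text{ZiKiB}} = 0.46^2 \approx 0.21,
\qquad
r^2_{\text{ECERS-E},\,\text{ZiKiB}} = 0.45^2 \approx 0.20
```

Both values correspond to roughly 20% shared variance, which leaves about 80% of the variance in child-level quality unaccounted for by class-level quality.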

As already stated, an important question for policymakers is to know which structural characteristics and educational beliefs of preschool teachers are associated with process quality and to what degree. Several studies have found (mainly moderate) correlations between process quality, structural characteristics, and educational beliefs of preschool teachers (Pianta et al. 2005; Slot 2018; Tietze et al. 2013) yielding evidence of a complex interaction of structural characteristics that jointly predict process quality. For process quality at the class level, the following relationship pattern at measurement point 1 emerged (see Table 5; for more details see Kuger and Kluczniok 2008). The selection of the reported predictors in the models was oriented towards predictive variables in other comparable analyses (see Mashburn et al. 2008).Footnote 16

Table 5 Results of multiple regression analyses: preschool class-level, first preschool year

In general, the explained variance is less than a third of the total variance in the process quality, indicating only moderate relations between the set of these predictors and the process quality. We find a significantly lower process quality concerning global as well as domain-specific quality in preschool classes with more children with migration background, in classes with a younger mean age of the children, and in smaller classes. It is noticeable that in this set of predictors the staff-child ratio has hardly any significance for the process quality. In addition, educational beliefs of preschool teachers are also related to process quality. Classes with teachers with more cooperative attitudes towards learning and lower conservative goals in education achieve a higher quality in literacy stimulation. The less satisfied the preschool teacher is with his or her job and the more years of job experience he or she has, the worse the global quality. The latter result may be a cautious indication of burnout tendencies among preschool teachers.

We want to highlight the result that quality is lower in smaller classes when the staff-child-ratio is controlled, a result that was also found some years later in the NUBBEK-study (Tietze et al. 2013). In that case, smaller classes do not seem to increase quality. The finding that quality is lower when the mean age of the children in a class is lower may indicate that the concept of preschool classes—as it is measured with our quality instruments—is more related to older children in the age range 3 to 6 and that in younger classes care aspects are more emphasised than educational aspects. The negative relation of quality to the percentage of children with migration background—which was also found in other studies (see, e.g., the NUBBEK-study, Tietze et al. 2013)—indicates more challenges in such classes which are not always met.Footnote 17 However, more detailed analyses are necessary to describe the relationship more precisely (e.g. curvilinear relations). In sum, these findings point to the need for targeted approaches (with additional resources) for specific groups of children, e.g., language education for disadvantaged children and concepts for younger age groups in preschool classes. In this context, approaches that combine targeted approaches for specific children in the daily preschool routines could be promising (Leseman and Slot 2020). The negative relation between the percentage of children with migration background in classes and process quality is striking for Germany (and other countries) in that it is not found in other early childhood education systems that implemented specific interventions for disadvantaged children in preschools. For example, in the Netherlands or UK we find the opposite effect, according to which the process quality is better in classes with more children with migration background (Slot et al. 2015). 
For Germany, some federal states (e.g., Rhineland-Palatinate, Saxony-Anhalt) now provide additional funding for staff positions in disadvantaged social areas, particularly in the context of the “Gute-Kita-Gesetz”, to compensate for the challenging conditions of these settings. It remains to be seen how this additional funding will influence preschool quality. The findings for Germany appear to be particularly important because further analyses by Lehrl et al. (2014) show that children with migration background have a lower chance of attending high-quality settings (similar to Becker 2012), whereas comparable effects cannot be found for children from families with lower educational or less advantaged economic backgrounds. Summing up across global and domain-specific indicators of process quality, the most influential variables in BiKS seemed to be those of classroom composition; allocated resources were less important, and characteristics of the professional staff in the classroom proved least influential.

The relations between structural characteristics, educational beliefs, and domain-specific processes at the single child level are small (without table; for more detail see Smidt 2012). For the support of social competencies, for example, child-related variables (gender, migration background) and class size had no significant effects. The quality of support of social competencies is higher the fewer years of experience the teacher has. In addition, the support of social competencies is somewhat better for children whose teachers understand their role more strongly as hierarchical (i.e., seeing themselves as an expert, advisor, and authority towards children). The quality and quantity of materials in the preschool class relate positively to the support of social competencies. In addition, Linberg and Kluczniok (2020) report that language- and math-related processes are associated not only with classic structural characteristics (e.g. job experience), but also with individual child characteristics (e.g. shyness), and that math-related processes are related to math competencies shortly before school entry. Because comparability with other findings is limited, more research is needed, particularly on predictors of educational quality at the single child level. That child characteristics may have a significant influence on interaction quality is shown by recent studies that assessed interaction quality with the inCLASS (e.g., children’s language skills, Smidt and Embacher 2020; children’s personality, Smidt and Embacher 2023). Taken together, assessing interaction quality at the child level may well be of benefit to research.

Conditions of Instructional Quality in Primary School Classes

Relations between educational processes, structural characteristics, and educational beliefs of the teachers in grade 1 of primary school are depicted in Table 6. The selection of variables was based on structural characteristics and educational beliefs of teachers that have been shown to be relevant for instructional quality in primary school (cf. e.g. for individual structural characteristics NICHD ECCRN 2004; Tietze et al. 2005a), supplemented by findings from research on secondary school instruction. Only a small part of the total variance is explained by the full set of predictors. This result, that structural conditions and teacher beliefs are not strong predictors, is in line with other studies (e.g. Tietze et al. 2005a).

Table 6 Results of multiple regression analysis: primary school class-level, grade 1

In the first year of primary school, teachers with more transmissive beliefs display lower global cognitive activation and lower cognitive activation in literacy acquisition. These results, which have not yet been published, show that teachers’ educational beliefs regarding the design of lessons are manifested in the processes that actually take place. More years of job experience are related to lower global instructional quality in grade 1 when job satisfaction is controlled. The extent to which the date of training or first signs of burnout play a role in this context must remain open at this point. In addition, a negative association can be seen between the proportion of students whose parents have a native language other than German and global cognitive activation (significant at the 10% level). However, compared with the results on quality in preschool classes, the negative associations between the proportion of students with migration background and quality are much weaker and less consistent. This may relate to the fact that schooling is more regulated than preschool education. Another, unexpected finding is that quality differs between the two federal states in the first school year. The extent to which these differences can be attributed to different content in teacher training or to different structural conditions at the teachers’ workplace or in the classes cannot be answered at this time.

Conditions of Quality in the Family at Preschool Age

Turning to the home learning environment (HLE) at preschool age, several studies have shown that different structural characteristics of the family are related to different aspects of the HLE (e.g. Bradley et al. 2001; Tietze et al. 1998). Typical predictors are socio-economic status, maternal education, family income, the language background of the family, and the number of siblings in the household, as well as educational beliefs (see Lehrl 2018, for an overview). However, a systematic investigation of different facets of the HLE, as conceptualised in the BiKS-3-18 study, that includes a broad selection of structural and belief characteristics on a personal, social, and spatial-material level has not yet been conducted. Table 7 displays the results from the first measurement point of the BiKS-3-18 study.

Table 7 Results of multiple regression analysis: HLE, first preschool year

The results show that higher maternal education and the family’s ability to afford to enrol their child in extracurricular activities go along with better overall or global family support at preschool age. Moreover, literacy stimulation is better in families without migration background and in families with a higher socioeconomic status, higher maternal education, higher income, higher expenses for activities, and higher educational orientations. Promotion of numeracy is associated with migrant status, higher SES, and school-oriented educational beliefs. In terms of explained variance, general aspects of the home learning environment (global stimulation) seem to be relatively independent of structural characteristics and parental educational beliefs. Domain-specific aspects of the home learning environment, on the other hand, are more strongly related to structural characteristics and educational beliefs. Literacy stimulation depends more on structural characteristics, whereas stimulation in numeracy depends more on parental beliefs towards education.

Overall, it can be stated that the variation in the HLE at the age of 3 can be predicted differently depending on the dimension under consideration, and that it is therefore not possible to speak of families as “the” promoters or “the” non-promoters. It is nevertheless apparent that children from socially disadvantaged families have fewer language, literacy, and mathematical experiences, suggesting that children from less privileged households are at more developmental risk than their more privileged peers.

Conditions of Quality in the Family at Primary School Age

At the beginning of primary school in grade 1, we find almost no relation between family stimulation quality (total score of the FES-GS) and socio-structural factors (e.g., parents’ school education, income, migration background, parental aspiration; without table, adjusted R² = 0.02). In the second to fourth years of primary school, the educational level of the parents becomes somewhat more important: the higher the parents’ level of education, the higher the quality of stimulation in the family. Migration background does not play a role at any time. At all measurement points, the explained variances are low (R² < 0.10). These results are somewhat unexpected and striking, and may be related to our rather global operationalisation of family quality. It could be that quality that is more closely related to school-based parental involvement depends more on structural conditions of the family and parental beliefs, but results on this are still to be produced.

5 Discussion

The focus of this section will not be on an integration of the results into the research literature. This has already been done in the preceding sections. Rather, we will highlight some results concerning quality in the different learning environments and their importance. For some of the implications, two general limitations have to be considered:

  • The study included only two German federal states, Bavaria and Hesse. Thus, the results may not be representative of all German states.

  • The analysed data were collected from 2006 to 2008 in the preschool phase and from 2009 to 2012 in the primary school phase. Since there have been many developments and reforms in preschools and primary schools since then, the current situation may have changed. Therefore, (political) implications have to be considered with caution. In any case, a current replication of the study that takes recent developments (e.g. new measures of educational quality) into account is needed.

BiKS managed to establish an integrative concept of quality assessment across children’s primary learning environments: family, preschool, and school. The starting point of this project in the BiKS-3-18 study was the development of an integrative concept for recording the educational quality of the early learning environments of preschool, primary school, and family, to allow for a more rigorous analysis of the interrelation of the quality a child experiences in different environments during early childhood. Against this background, a broad set of research instruments was selected and justified. In addition, we developed new instruments: the ZiKiB for measuring preschool quality at the child level, an observational tool for measuring domain-specific quality in primary school classes, and a semi-structured tool for measuring domain-specific quality in families at preschool and primary school age (Family Rating Scale). All instruments have proved their worth and can be applied in other studies. This has already been done with a shortened version of the ZiKiB in studies of interaction quality at the target child level in Austria (Smidt and Embacher 2020) and Germany (Kluczniok and Schmidt 2021).

Process quality in families, preschool settings, and primary classes is low to mediocre, with lower levels for domain-specific quality than for more global aspects. Process quality in preschool classes, in primary school classes, and in the family shows high variation. Thus, children experience not uniform quality but differing opportunities for their development. In the preschool classes, we find only low to mediocre global process quality. This was also found in other studies over the last 20 years, showing that the mean quality level has not changed over time despite many efforts towards improvement in Germany. Domain-specific quality is even lower than global quality. It is expected that domain-specific quality will increase over time because of the political discussions following large-scale national and international studies measuring competencies in schools and because of the introduction of preschool curricula in all German states that emphasize domain-specific stimulation. However, further efforts to improve global and domain-specific process quality are needed. Currently, several programmes for quality improvement are being implemented in Germany; one example is the nationwide federal large-scale programme “Sprach-Kitas” (Language Day Care Centres).Footnote 18 Our study is one of the first in Germany to observe domain-specific instructional quality (especially for the stimulation of mathematics) at the beginning of the primary school phase. One result is that domain-specific stimulation in primary school classes (cognitive activation in mathematics and in literacy acquisition) is rather low. Thus, improvement remains a challenge. It has to be kept in mind that all observational quality rating instruments were developed by experts in the field. The verbalization of scale values, such as “good” or “poor” quality, thus represents the research consensus (e.g. in the didactics of elementary mathematics education or in early childhood education) on what is understood to be good or poor quality in a normative sense.

For process quality in the family at preschool and primary school age, high heterogeneity has been found, again pointing to very different learning opportunities for young children. Interestingly, at preschool age, domain-specific quality during shared book reading is quite high in language stimulation compared with stimulation in mathematics. However, results for the frequency of domain-specific stimulation in the family are reversed: stimulation in mathematics is more frequent than in literacy, but its quality is lower. This result needs further analysis. In addition, global stimulation of the children is less stable across the preschool period than domain-specific stimulation. This points to quite stable parental orientations in the two domains, whereas global stimulation probably depends more on specific situations. For stimulation in the family during primary school age, it is somewhat surprising that clarity of rules and structure is assessed as good throughout primary school, whereas the quality of cognitive activation is only medium. Consequences for family education have still to be discussed, but it seems that parents should be further encouraged in their role as cognitive activators.

Process quality is only weakly related to structural conditions and orientations of the stakeholders. We analysed the relations between structural conditions in the different environments and the orientations and beliefs of the actors on the one side and process quality on the other. A first impression is that structural aspects and beliefs predict process quality only to a small extent. This holds for all analysed environments. For example, structural aspects and beliefs explain only about 30% of the total variance in the process quality scales in preschool classes and only a fifth in primary school classes. In other words, the level of quality realized in settings is only weakly related to structural conditions and beliefs. Consequently, changes in preschools’ structural conditions, such as class size or staff-child ratio, will have only limited effects on quality. Means of improvement other than changing structural conditions are needed (see above). Whereas the quality of stimulation in families at primary school age is almost independent of the studied structural conditions (with the exception of the parents’ educational level) and of parental beliefs, some interesting results emerge for the preschool phase. Stimulation in literacy and numeracy is lower in families with migration background and in families with lower SES, but there are no differences for these families in global stimulation. Maternal education is positively related to global stimulation and stimulation in literacy, but there is no relation to stimulation in numeracy. Depending on the quality dimension considered, we find different important predictors. We do not find parents to be “the” promoters or “the” non-promoters of quality (even if we find that children in socially disadvantaged families have fewer informal language experiences and receive lower quality of language input).
These differential relations are important both for further research and for considerations on practical improvements. One result found especially for preschool classes is the negative relation between the proportion of children with migration background at the class level and both global and domain-specific process quality. This relation is by no means natural or inevitable. Given the contrary relation in England and the Netherlands, it seems to result from our pedagogical approaches, which focus more on stimulating all children in daily preschool routines and less on additional, special enrichment for specific groups of disadvantaged children.

There are two further areas of research results that were not included in this section. First, there are analyses of the relations of levels of quality the children experience in preschool, primary school, and family with the development of children in different domains (see Lehrl et al. this volume). Second, there is some research on the interrelatedness of levels of quality between environments as well as between institutional or family environment(s) across time (e.g. the relations between characteristics of the family and quality in preschool, Lehrl et al. 2014, or the interplay between experiences in family and preschool during preschool age and reading in primary school age, Lehrl and Kuger 2013). Due to limitations in length, they were not included in this section.

The BiKS-3-18 data for the quality in preschools, primary schools, and families form a rich data set for further analyses and also for other research groups. Summing up the most needed further analyses, we mention some selected directions: analyses of intrapersonal stability of quality experiences, more analyses of the level of quality in primary school, analyses of preschool quality at the child level and stability over time, analyses of the relations of change and stability of quality in the family to psychological, educational and social resources of the family and analyses of the interrelations of quality between the learning environments.