The development of students’ critical thinking (CT) as part of schooling is included in national curricula in Norway (Udir 2018) and many other countries (OECD 2020). CT holds the potential to raise the quality of laypeople’s interpretation and evaluation of the information they encounter, of their decision-making, and of the arguments they construct on issues. Osborne nevertheless claims that there is a “virtual absence of critique and critical thinking from science education” (2014, p. 183).

A longstanding aim for science education has been to prepare students for informed participation and thoughtful decision-making on complex socioscientific issues (SSIs) they encounter as citizens, workers, and consumers (Sadler 2011). Participation in SSIs involves interpreting and evaluating views, arguments, and factual claims and judging their quality as part of the process of reaching thoughtful personal decisions. CT has been described as the “capacity to work with complex ideas” (Moon 2008, p. 128) using “reasonable reflective thinking focused on deciding what to believe or do” (Ennis 1987, p. 10). Consequently, thoughtful participation in SSIs includes practices involved in critical thinking. The development of students’ CT is therefore relevant to include in curricula that aim for such thoughtful participation in SSIs.

Many studies have provided insights into how different approaches to teaching CT could affect students’ CT (Abrami, Bernard, Borokhovski, Waddington, Wade and Persson 2015). Typically, these studies have used quantitative tests to measure possible effects. This study seeks to complement these studies by taking another approach. It will focus on critical thinking practices in a specific classroom and discuss their relevance for developing students’ CT. In particular, we seek to inform the debate on how to develop students’ CT by qualitatively describing two teachers’ facilitation of CT practices in a science classroom and students’ resulting CT practices. This implies that our approach is focused on describing the processes by which students' CT seems to develop. This is in line with the proposal from Thayer-Bacon (2000) to study how teachers work with students as opposed to focusing on the effect of short interventions on students’ CT.

One reason for this approach is the challenge that some aspects of CT are seen as difficult to measure using tests (Abrami, Bernard, Borokhovski, Waddington, Wade and Persson 2015). Specifically, this pertains to collective practices and CT in complex authentic contexts, such as discussions about SSI-related texts and arguments, scientific experiments, and concepts from various subjects. An additional reason for choosing this approach is that it can provide insights into the interplay between teachers’ practices and students’ CT practices. Moreover, descriptions of what is possible to achieve in an 8th-grade classroom and potential challenges that warrant further consideration can provide insights for teacher professional development programs.

To fulfil these aims, data were collected from an inquiry-based teaching unit called the Climate project (see, in Norwegian, https://argument.uib.no). The chosen approach implies that it is not the students' thinking per se that is explored but contextualized CT practices where individual and collective reasoning (Moshman and Geil 1998) are expressed through oral use of language. It is therefore mainly oral representations of CT (Moon 2008) that are explored, albeit partly in interaction with representations in written preparations and actions related to practical work.

Our focus on CT practices entails characterizing the interpretations, analyses, evaluations, and inferences made by the students during the Climate project. Furthermore, we will identify the teaching strategies employed by the teachers and discuss how students' CT practices are influenced by these strategies. We will also conduct a detailed analysis of characteristic episodes from the classroom to illustrate how CT practices manifest in various dialogues and situations.

These analyses, coupled with the identified characteristics of students' CT practices and the teaching strategies that foster them, will enable us to discuss the classroom culture in relation to CT practices. In this discussion, we will identify recurring patterns in the discourse within the 18-lesson-long Climate project and explore how these patterns both represent and contribute to a distinctive classroom culture. Our concept of classroom culture is based on Goodenough's (1994) notion of culture, which emphasizes how regularly occurring activities within a group, such as teacher-initiated dialogues in a classroom, entail a shared understanding of how to perform the activity and interact with one another.

Prior studies on approaches to teaching CT and challenging characteristics of students' CT

CT is taught using a variety of instructional approaches. Based on categories originally suggested by Ennis (1989), Abrami, Bernard, Borokhovski, Waddington, Wade and Persson (2015) performed a meta-analysis differentiating between general, infusion, immersion, and mixed approaches. General approaches are characterized by focusing on CT skills and dispositions but not on specific subject matter content. In the infusion and immersion approaches, content plays an important part, but CT is an explicit objective only in the infusion approach. Having CT as an explicit objective does not necessitate the initial teaching of CT concepts but rather involves ensuring that CT principles are somehow explicated in relevant situations that emerge while using an immersion approach. In mixed approaches, CT is taught explicitly in a separate track within a course focusing on subject matter content using an infusion or immersion approach.

In a meta-analysis of studies using quantitative CT tests to measure instructional efficacy, Abrami, Bernard, Borokhovski, Waddington, Wade and Persson (2015) showed that all approaches produced significantly positive average effect sizes. This was also the case in their first meta-analysis (Abrami, Bernard, Borokhovski, Wade, Surkes, Tamim and Zhang 2008), which also showed a clear advantage of explicit approaches (i.e., general [g+ = 0.38], infusion [g+ = 0.54], and mixed [g+ = 0.94]) over implicit approaches (i.e., immersion [g+ = 0.09]). Thus, Abrami, Bernard, Borokhovski, Wade, Surkes, Tamim and Zhang (2008) suggest that “educators must take steps to make CT objectives explicit in courses” (p. 1102).

Abrami, Bernard, Borokhovski, Waddington, Wade and Persson (2015) also studied the possible effects of four main types of instructional approaches: individual study, dialogue (more than two persons in any arrangement), authentic or anchored instruction (students work with genuine problems), and coaching or mentoring (involving two persons, e.g., student and teacher). The meta-analysis found moderate effects for authentic instruction (g+ = 0.25) and dialogue (g+ = 0.23), an increased effect for combining authentic instruction and dialogue (g+ = 0.32) and the highest instructional effects for authentic instruction, dialogue, and mentoring combined (g+ = 0.57).

In their review, Abrami, Bernard, Borokhovski, Waddington, Wade and Persson (2015) included studies from any subject area and grade level (elementary school children to adults). The effects of dialogic and authentic instruction have also been found in studies involving young science students. Investigating the effect of dialogic teaching in a study involving 297 students, Frijters, ten Dam and Rijlaarsdam (2008) found that eighth-grade students following a dialogic lesson series in biology scored higher on generative fluency of reasoning than students following a nondialogic lesson series. Improved CT skills in a project emphasizing group and class discussions were also found in a seminal study by Zohar, Weinberger and Tamir (1994) involving seventh graders.

In science education, authentic instruction typically involves inquiry-based teaching or SSIs. Using an inquiry approach called the Science Writing Heuristic (SWH), Hand, Shelley, Laugerman, Fostvedt and Therrien (2018) found improved scores on CT for disadvantaged groups of fifth-grade students (N = 9963). Interestingly, the SWH template used also involves sharing and discussing interpretive ideas and statements related to students' authentic inquiry. Studies of inquiry-based science teaching that does not use the SWH template have also found significant effects on students' CT, including Hairida's (2016) study of seventh graders who explored food additives and Duran and Dökme's (2016) study of sixth graders who performed studies related to the particle structure of matter. Positive effects on seventh-grade students' CT using different SSIs in combination with collaborative learning have been found by Wang, Chen, Lin, Huang and Hong (2017).

Although they provide information on general strategies and templates, the above quantitative studies provide less insight into the details of tasks, activities and dialogues used in classrooms. Moreover, the qualities of class and group dialogue might vary (Angeli, Valanides and Bonk 2003). In regard to group discussions, Mercer (2008) notes that some students engage critically but constructively with each other's ideas in what he calls exploratory talk, but many students instead use what he denotes as disputational or cumulative talk, which does not involve these qualities. We believe that studies that provide more detailed descriptions and identify cultural features of CT practices in classrooms can provide insights that complement what we know today.

Discussing cultural perspectives in science education, Carlone, Johnson and Eisenhart (2014) argue that ethnographic studies can provide additional insights into traditional topics associated with students’ learning in science. We have not been able to find ethnographic studies focusing explicitly on CT in science classrooms. However, an ethnographic study by Crawford, Kelly, and Brown (2000) documented how teacher strategies that encouraged students to explore their own questions created opportunities for fourth- and fifth-grade students to participate in scientific practices, including argumentation and questioning experimental results.

Concerning challenging characteristics of students’ CT, Simonneaux and Simonneaux (2009) found, in an SSI context that involved questions about the desirability of having bears and wolves in an area, that the greater the students’ proximity to involved parties, the lower their reasoning and critical analysis. Asking students to provide reasons for a set of answers identified as difficult questions on a CT test, Paulsen and Kolstø (2022) found in many students’ answers a lack of differentiation between the trustworthiness of observations and interpretations, and a lack of ability to identify the potential role of conflict of interest when judging trustworthiness. In their summary of findings on students' difficulties in constructing scientific arguments, including the CT involved, such as the use of justifications, Ryu and Sandoval (2012) state that many students do not provide data for their own claims, demand data from each other, make counterclaims when needed, consider references and competing arguments, or justify the connection between claims and data.

Research question

Within the conceptual landscape outlined above, the purpose of this study is to contribute detailed descriptions of students' dialogical CT practices, as such descriptions seem to be lacking. An ethnographic perspective is therefore used since it enables the analysis of situated practices. This means that students’ CT practices are studied in an authentic and complex context, which enables an analysis of how students' practices are related to teachers' facilitation and contextual factors. With a focus on CT practices, as these are expressed in contextualized dialogues in the classroom, the following research questions were formulated:

  1. What characterizes the CT practices found in the students’ oral contributions?

  2. What characterizes the teachers’ ways of facilitating these practices?

Building upon the resulting characterizations, we will investigate how the employed teaching strategies may account for students' CT practices. Furthermore, we will examine recurring patterns in teaching strategies, dialogues, and students' CT practices, exploring how they collectively shape a distinctive classroom culture and identifying strengths and challenges in the CT practices this culture seems to foster.

Conceptualizations of CT on which the analysis is based

Ennis (1987) has proposed that critical thinking consists of specific cognitive skills and affective dispositions. This view of critical thinking is supported by many researchers (Ku 2009). A well-known conceptualization of this view was developed by Facione (1990) using a Delphi consensus panel involving 46 expert scholars. Other researchers have put forth conceptualizations of critical thinking that, while not in contradiction to the skills and dispositions tradition, emphasize core principles that they believe should play a prominent role. Bailin and Battersby (2016) suggest a conception of CT as a practice of inquiry involving “the careful, critical examination of an issue, problem, controversy, or challenge, according to relevant criteria, in order to come to a reasoned judgment” (p. 369). Additional proposals for central principles include commitment to evidence (Siegel 1989), critical thinking as reasoned argument (Kuhn 1991), and the competence both to develop independent opinions and to engage in reflection about the world around us (Jiménez-Aleixandre and Puig 2012).

The analysis of utterances and dialogues in the Climate project is based on the widely used conceptualization of CT (Abrami, Bernard, Borokhovski, Waddington, Wade and Persson 2015) formulated by Facione (1990). This conceptualization identified six cognitive skills (interpretation, analysis, evaluation, inference, explanation, and self-regulation), 16 sub-skills and a set of 19 dispositions. The six cognitive skills describe tasks requiring thinking (Bailin and Siegel 2003), and in short, a critical thinker should be able to do the following (our paraphrasing):

  • Interpret various kinds of information, experiences, data, graphs, statements, beliefs and more by categorization, interpreting significance and clarifying the meaning of its content.

  • Analyze to identify relationships among statements, concepts and more by examining ideas, data, and concepts and by detecting and analyzing arguments and their premises.

  • Evaluate the credibility of statements, arguments, and different representations, including their sources, by determining their degree of accuracy or probability and by assessing the strengths and weaknesses of assumptions and reasoning.

  • Infer possible needs for additional information to be able to assess the quality of arguments and information, and formulate strategies that can provide this; conjecture alternatives by proposing possible solutions, hypotheses, and plans; and draw conclusions by using appropriate ways of thinking.

  • Explain the results of one's reasoning activities by producing accurate statements, descriptions, and representations, by justifying procedures and by presenting arguments for acceptance of claims.

  • Self-regulate one's own cognitive activities, e.g., using metacognitive self-assessment and self-correction.

Incorporating dispositions underscores the significance of motivation and intention in utilizing these cognitive skills and striving for improved judgment. Examples of the 19 dispositions suggested in the Delphi report are inquisitiveness, open-mindedness, concern to be well-informed, and precision. Many of the descriptions of cognitive skills in the Delphi report seem to presuppose knowledge, e.g., of differences between premises, conclusions, and assumptions; observations and inferences; and relevant frameworks to understand and clarify information. Bailin and Siegel emphasize that good CT “requires mastery of context-specific knowledge to evaluate specific beliefs, claims, and actions” (2003, p. 181) and of relevant criteria for evaluating different kinds of statements.

The framework by Facione also incorporates the analysis, evaluation, and presentation of arguments. While the construction of arguments entails coordinating evidence and theory to support or refute a conclusion (Osborne, Erduran and Simon 2004), Facione’s (1990) framework places its emphasis on assessing the quality of arguments, including the evaluation of the information, assumptions, and reasoning involved. Hence, in regard to argumentation, these aspects will be the focus of the current study.

An important historical conceptualization of CT is Dewey’s description of what he denotes reflective thinking and defines as “active, persistent and careful consideration of any belief or supposed form of knowledge in the light of the grounds that support it, and the further conclusions to which it tends” (Dewey 1910, p. 6). A characteristic of Dewey’s conception of reflective thinking is how it is tied to experience and context (Garrison 1996). It covers the critical thinking involved in building, e.g., an explanation or an opinion, and is therefore sometimes categorized as describing constructive thinking (Hitchcock 2020). Dewey thus emphasizes the significance of critical thinking in the process of constructing ideas. This perspective is only indirectly addressed in Facione’s work (i.e., “conjecturing alternatives”) and complements the application of critical thinking to statements and texts made by others.

An additional perspective on CT is introduced by Thayer-Bacon (2000) through her conception of thinking constructively. She argues that constructive thinking is a social practice where the quality of thinking rests on collective reasoning, distributed knowledge and inputs from diverse perspectives, backgrounds, and orientations. She also argues that engagement and emotions are prerequisites for prolonged involvement in constructive thinking and that collective discussion based on diverse perspectives is the way to enlighten perspectives, views, and arguments and provide a basis for individual and collective judgments.

Thayer-Bacon’s (2000) emphasis on collective reasoning aligns with Bailin and Battersby's (2016) conception of CT as a practice of inquiry. According to them, such a practice entails the give and take of reasons and arguments and “is at its core a communal, social practice” (p. 370). Inquiry is defined as involving “the careful, critical examination of an issue, problem, controversy, or challenge, according to relevant criteria, in order to come to a reasoned judgment” (Bailin and Battersby 2016, p. 369).

In summary, this study’s analysis will employ the primary categories from Facione’s (1990) concept of CT when examining scenarios in which students formulate ideas, as well as when evaluating statements and texts produced by others. This implies that our aim is not to identify and characterize all kinds of reasoning that students formulate or to determine whether their utterances represent stages in their CT practice, but only to analyze those utterances that are relevant to the chosen framework. The analysis will encompass situations where students present their own views, as well as those involving group discussions and whole-class dialogues. Finally, considering the potential educational value of explicating general principles of CT as identified by Abrami, Bernard, Borokhovski, Wade, Surkes, Tamim and Zhang (2008), instances of such explication will be incorporated into the analysis.

While we include situations where students participate in scientific inquiry or are attempting to construct knowledge claims, interpretations, definitions, and more, our analysis will not comprehensively cover the entire processes within these endeavors. Instead, we will consider them as situations where students might be engaging in interpretation, analysis, evaluation, and inference.

The above framework encompasses several cognitive skills and dispositions that are also integral to scientific inquiry, as emphasized in conceptualizations of scientific practices (Osborne 2014) and epistemic practices in science (G. Kelly 2008a, b). The rationale for our choice of framework is the recognition that the concept of scientific practices encompasses more than just the cognitive skills and dispositions outlined by Facione (1990). It also involves engaging students in constructing scientific explanations and models, conducting investigations, and obtaining and communicating information (Osborne 2014). As a result, approaches to CT that are rooted in epistemic practices in science might be less suitable for analyzing classrooms where the teacher utilizes teaching strategies other than inquiry-based methods. Importantly, Facione's framework is intended to be applicable not only in scientific contexts but also in diverse situations where one might encounter diverse types of texts and inscriptions encompassing perspectives from various disciplines. Additionally, in these varied contexts, criteria beyond traditional scientific criteria may become pertinent in evaluating the quality and reliability of conveyed perspectives.

Methodological approach

This study used an ethnographic perspective to understand interactive and dialogic aspects of students' CT practices (Kelly and Green 2019) in combination with the inductive development of categories (Merriam 1998). This combination allows for the identification of specific CT practices in the classroom and the description and understanding of these practices in the context of teachers’ facilitation of classroom activities and the wider context of the identified practices. Consequently, we see classroom practice from a sociocultural perspective, where we assume that context and teachers’ and students’ utterances and forms of participation mutually influence each other.

By studying students’ CT practices in a natural and complex context, we aim to provide ecologically valid accounts of what students' CT practices and teachers’ facilitation of these practices might look like in an authentic educational setting, rather than, say, in a test situation where students work individually with constructed issues.

Data collection

Data from the classroom were collected using a nonparticipatory approach and included the following: whole-class video recordings (camera pointing towards the front of the classroom from behind) from 15 lessons, including two videos containing only ten-minute introductions; 17 audio recordings of group discussions from 11 lessons; field notes from observations in classrooms; and field notes from conversations with the two teachers, hereafter called Thor and Lise, during and after the project week. The Climate project lasted one week and included all the lessons the students had that week, for a total of 18 lessons. Because of malfunctions and misunderstandings, recordings from some lessons and groups were empty. However, we do have recordings, whole-class and/or groups, from all lessons but one, for a total of 34 hours of recordings. Together with the field notes, this enables us to develop “thick descriptions” of relevant classroom episodes (Gobo and Molle 2017). One or two recordings from each lesson were transcribed verbatim, yielding 26 transcriptions in total. We ensured that the transcriptions included plenary discussions from each lesson and group talk from one or two groups. We believe that including all phases and available lessons of the Climate project in the analysis makes it possible to see whether the identified CT practices are linked only to particular themes or tasks. The class consisted of 26 students, aged 12 and 13, and it was their first semester at this lower secondary school.

The Climate project was a part of a larger school development project, denoted the Argument project, during which data were collected from five projects conducted in various schools and at different grade levels. The selection of these five projects was based on the willingness of the teachers and the students. Among these five, the Climate project run by Thor and Lise was selected for this study based on the observation that the students in this class made relatively many oral contributions during lessons. Thus, the selection criterion was the accessibility of rich data.

The local context and development of the teaching unit

The Climate project was run in a lower secondary school in one of the major cities in Norway. The school is located in a suburb consisting mainly of high-rise and low-rise apartment blocks, with semi-detached and detached houses further from the school. The class composition reflects the surroundings of the school, and on three occasions the teachers informed us that the class included students experiencing low motivation, weak prior knowledge, and lack of support for schooling at home. In the national test in mathematics the year of the project, the school’s score was two points below the national average, which was defined as 50 (www.udir.no/in-english/). There are no national tests in science.

The Argument project was run by the school authority in the local municipality and involved all science and math teachers at three lower secondary schools. It was run in cooperation with six researchers and four outreach staff at the University of Bergen and the Western Norway University of Applied Sciences as part of an R&D project (see https://argument.uib.no/) funded by the Norwegian Research Council.

By the start of the Climate project, the science and math teachers had attended six workshops as participants in the Argument project. During these workshops, discussions revolved around teaching strategies for SSI-based teaching, effective approaches to engage students through practical work, and the incorporation of inquiry, argumentation, and critical thinking into the teaching process. At all workshops, teachers were working to develop ideas for teaching units together, and they shared teaching experiences.

The teachers were challenged to include all the topics from the workshops in the teaching units to be developed. In general, the teachers involved in the Argument project, including the two teachers who were involved in the analyzed Climate project, expressed positive interest in the inclusion of these aspects. Focusing explicitly on the development of students’ argumentation skills and critical thinking in science and math was a new element for the teachers, and all seemed to agree that this had not been in focus during their teacher education.

Thor and Lise planned the Climate project partly in cooperation with other eighth-grade teachers at their school. Thor, an energetic teacher with fifteen years of teaching experience, was the class’s teacher in science and math and was in charge of 14 of the lessons. At an informal meeting between teachers, the head of department for eighth-grade classes suggested that Thor was the right teacher to choose for video recording, and this was approved by the other teachers at the meeting, including Thor. Lise, a mild and supportive teacher in her first year of teaching, normally taught the class mother tongue and English as a foreign language. She taught four lessons during the Climate project, and in five lessons, both teachers were present.

General features of the teaching in the Climate project

Several teachers challenged the researchers to make inspirational sketches for possible student activities and teaching units. Five such sketches were made, including a climate project, and shared with teachers using Google Docs. Thor and Lise designed their own teaching unit but included several of the activities and tasks in the sketch for a climate project made by the researchers. This resulted in a teaching unit that included activities where students constructed and discussed designs of equipment for measuring millimeters of precipitation; performed calculations based on measurement; interpreted bar graphs showing precipitation; performed and discussed experiments about evaporation, condensation, and humidity; and presented scientific facts and arguments supporting their own views on a question related to climate change. In all activities, students worked in groups or participated in whole-class discussions. (See part 1 in the supplementary materials for details.)

Thor and Lise stated that they wanted the students to develop their CT and ability to build good arguments related to an SSI. However, they also emphasized the importance of students learning key issues, terminology, and explanations related to climate change, and most activities and dialogues were designed with this focus in mind. In several discussions with the researchers, Thor expressed the importance of activating students and including all students in the learning process. In particular, he stated that the most important idea he brought with him from the workshops was the possibility of collecting answers from several students on questions before discussing these together with the class. In general terms, he also stated that he valued the idea of designing a teaching unit around an SSI and helping students use subject knowledge and other sources of information to develop arguments supporting their views and to be critical of information encountered. However, he also warned the researchers that conflict inherent in, for example, climate issues was probably not sufficient to get students with a low interest in schooling to become engaged in the topic. His recommendation was that all teaching units should include practical work at the start and at later phases.

When we inspected field notes, videos and transcripts, a pattern emerged concerning Thor’s teaching strategy. He presents some stimulating material (e.g., video, artefacts, or graphs) and challenges students to interpret, analyze or evaluate its content before suggestions are shared and discussed in class, and he then sums up a lesson to be learned from the activity. Sometimes groups are asked to discuss a second time, based on a first discussion of shared ideas, before final sharing and discussion. He typically challenges all groups to share after such short group discussions and often encourages specific students to report or share their thinking. These strategies enable him to make most students share some ideas during each lesson. As the examples below will show, he seldom responds to students’ suggestions using evaluation or corrections but often repeats their ideas, relates to these in his comments, and sometimes challenges them. Most students provide answers when asked to provide a report from a group discussion or state a point of view.

Our general impression is that the relation between the teachers and the students is relaxed and good and that most of the students do not seem to be afraid to express their ideas to the teachers, especially during group work. Several times, we saw examples where Thor repeats challenges to a student to share his or her point of view, sometimes adding interesting proposals he has heard from the students during the group work. Thor accepts comments on tasks and time schedules but also signals that he is in charge. Sometimes Thor makes small jokes, and sometimes students provide irrelevant but funny comments on tasks and situations, which Thor seems to accept as long as students quickly get back on track. The other teacher, Lise, had only attended one workshop in the Argument project, one arranged at this school, but had attended group meetings where the school's version of the Climate project was planned and had also planned adjustments for their class together with Thor.

At the start of the Climate project, Thor explained to the class that on the last day of the project, they should present their views and supporting arguments on the question of whether global warming is leading to more rainfall.

Strategy of analysis

Although several conceptions of CT concern thinking per se, e.g., cognitive skills, this study analyses students’ and teachers’ utterances and dialogues. Building on Facione's (1990) framework for critical thinking (CT) and how CT can manifest itself in oral discussions during tasks that may potentially trigger CT, the focus of our analysis is the CT practices of students and teachers as present in dialogues in the classroom. The analysis encompasses dialogues that took place during all the various activities within the Climate project, covering instances where students aimed to construct knowledge as well as situations where concepts, media texts and peers’ ideas were scrutinized. In some dialogues based on closed questions, corrective feedback was used, for example, when activating prior knowledge. Due to the authoritative nature of these dialogues, these were not coded.

Using techniques from interactional ethnography, the analysis consisted of three structured stages in addition to the analytical thoughts noted during fieldwork (Erickson 1992). In the first stage, videos and audio-recordings were inspected, and tentative characteristics of students’ CT practices and teachers’ facilitation of CT practices, along with episodes involving CT practices, were noted. Following this, the transcripts were inductively coded (Merriam 1998) by author one and author two. Within the main categories of conceptions of CT (interpret, analyze, examine, infer, and explication of CT principles), we inductively developed codes to describe how students performed these CT practices, thereby following the approach denoted as directed content analysis by Hsieh and Shannon (2005). Importantly, we identified instances where students engaged in interpretations, analyses, evaluations, and inferences as CT practices, irrespective of the quality of the reasoning involved.

To categorize and compare codes, the constant comparative method (Strauss and Corbin 1990) was used in the coding process, and the focus of analytical attention was expanded through several rounds of analysis (Erickson 1992). Teachers’ facilitation of CT practices was coded based on the same framework, and their utterances were also coded for how they communicate CT and engage students in CT practices. Each individual utterance was inspected and coded when it was relevant. In cases where different parts of a student’s utterance required different codes according to the analysis framework, e.g., an interpretation followed by an evaluation, multiple codes were assigned. Similarly, teacher utterances were given more than one code if they contained several relevant elements, e.g., supportive feedback followed by a challenge.

During the coding process, it was noted that the knowledge object in the focus of discussions alternated between the students’ constructed artefacts, graphs presented to the students, textbook science, and students’ own arguments on questions related to climate change. Moreover, it was noted that the teachers alternated between the use of group work and whole-class discussions, challenging students to think and provide inputs and summing up and supporting and challenging students’ ideas.

In the second stage of the analysis, these observations were further explored by re-examining episodes in the data involving many CT practices. While considering the elements from the detailed coding, we identified patterns and variations within and between episodes.

Finally, in the third stage of the analysis, holistic descriptions of episodes, patterns and variations were made, taking into account contextual features and the findings from the initial analysis. Episodes for holistic analysis were selected to ensure variation in the task situation, exposed knowledge, students' use of criteria and the presence of patterns identified during the initial phases of the analysis. Additionally, episodes were selected from all main phases of the Climate project.

The focus on cognition, particularly cognitive skills, in Facione's (1990) framework may seem contradictory to our emphasis on CT practices. However, as Bailin, Case, Coombs and Daniels (1999a) have argued, the cognitive skills in that account describe tasks requiring thinking. Such tasks, i.e., interpret, analyze, evaluate, and infer, can also be included in assignments and questions during dialogues given by a teacher in a classroom. Furthermore, Black (2007, p. 2) comments that the inclusion of decision-making in the definition of CT implies that “critical thinking is exercised and is not just pure skills”, i.e., it includes practices. Additionally, students can take on such tasks on their own in different kinds of situations encountered during lessons. Thus, in this study, we will describe the students’ practices along the four dimensions of interpretation, analysis, evaluation, and inference where this is relevant. We consider the fifth dimension in Facione's account, which involves stating the results of one's reasoning, to be included in the analysis based on the first four dimensions. This is because we are analyzing only orally communicated reasoning, and our aim is to describe this aspect. Additionally, we will not include students’ self-regulation, the sixth dimension of Facione’s framework, in the analysis, as we did not find explicit expressions of it, and it was impossible to detect without such expressions. Concerning dispositions, these are typically not explicitly stated but are implicit in CT practices and thus hard to identify unequivocally.

As discussed earlier, the meta-study by Abrami, Bernard, Borokhovski, Wade, Surkes, Tamim and Zhang (2008) suggests that explicating CT principles is valuable in developing students' CT. Based on this, the analysis also includes the identification of utterances where the students or teachers focus on meta-concepts about argumentation and CT, for example, the concepts of argument and types of support. Additionally, we included in our analysis utterances where a teacher expresses the value of specific dispositions.

The holistic level of the analysis focused on interactive aspects in dialogues involving CT practices. The purpose is to describe ways CT practices are embedded in dialogues and social practices, e.g., whether students built on each other’s inputs or not and whether diverging views were debated. This third phase of the analysis drew from conceptualizations of CT that explicitly highlight the potentially constructive role of dialogue in CT, i.e., collective reasoning, thinking constructively together, and practicing inquiry as a communal social practice. In addition, it involved a focus on the characteristics of student inputs in interplay with the teacher's arrangements for CT practices and follow-ups of student input.

In the following, we will first provide an overview of the categories developed to characterize students’ CT practices in dialogues during the project, as well as the teachers’ ways of challenging the students to think and express their thinking. Thereafter, we present holistic analyses of episodes with CT practices from all main phases of the teaching unit. Each episode from the classroom will expose several of the categories from Table 1 below.

Table 1 Categorizations of students’ CT practices in dialogues during the Climate project

CT practices and teaching strategies in the Climate project

Overview of categories of CT practices developed

The open coding of dialogues in the Climate project resulted in 12 categories (seen in Table 1) describing utterances involving interpretation, analysis, evaluation, and inferences. The coding process also resulted in a category that describes utterances where concepts or ideas related to CT were sought to be explained. In total, 1071 instances of CT practices were identified, distributed over all phases of the Climate project (see part 2 in the supplementary materials for details).

In most cases, the teacher responded to the students' point of view by using the statement, repeating it, or giving a positive comment, regardless of the quality of the statement. Sometimes the teacher explained briefly when an input was not relevant at the time. A few times, a teacher interrupted a student.

The open coding of the teachers' utterances resulted in six categories, as presented in Table 2. Three categories summarize ways in which the teachers orally challenge or encourage students to participate with contributions representing CT practices, for a total of 657 instances during the 17 analyzed lessons. The three additional categories describe how elements involved in CT were communicated through explanations, remarks, or modeling of CT by formulating interpretations, analyses, evaluations, or conclusions themselves, e.g., based on the students' responses to challenges.

Table 2 Categorizations of teachers’ CT practices in dialogues during the Climate project

Examples showing how codes were used to code utterances are included in analyses of episodes presented below, and in part 3 in the supplementary material.

An intercoder reliability test was performed where the third author, who had not participated in the development of codes, coded 17% of the material, involving 465 codes. The resulting Cohen’s kappa was 0.74. Most of the lack of agreement in codes was about challenges in differentiating, e.g., between analyses and assessments in the students' input, and few disagreements related to which statements were relevant to code as CT practices.
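For readers unfamiliar with the statistic, Cohen’s kappa corrects the observed share of agreement between two coders for the agreement expected by chance, kappa = (p_observed - p_chance) / (1 - p_chance). The following is a minimal sketch of that calculation, not the procedure or software used in this study. The function name `cohen_kappa` and the six example codes are hypothetical and serve only as illustration; they are not study data.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same, non-empty set of units.")
    n = len(coder_a)
    # Observed agreement: share of units given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: derived from each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes for six utterances (illustration only, not study data).
original_coding = ["interpret", "analyze", "evaluate", "infer", "analyze", "evaluate"]
independent_coding = ["interpret", "analyze", "analyze", "infer", "analyze", "evaluate"]
kappa = cohen_kappa(original_coding, independent_coding)
print(round(kappa, 2))
```

In this toy example, the single disagreement is between an analysis and an evaluation code, the kind of distinction that accounted for most of the disagreements reported above.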

Day 1: start-up activities about claims and types of support

Interpret and analyze an argumentative speech

As one of several start-up activities in the first lesson, the teacher showed a video with Greta Thunberg’s speech at the UN in 2019, which includes the phrase “How dare you!”. The students were first challenged to practice CT by discussing their interpretations of the speech, then to make an analysis identifying “good arguments” and “what kind of arguments does she use?” Finally, Thor asked them to discuss “Are there any arguments she uses that enable her to express what she wants?” The teacher walks around and challenges students to formulate their thoughts on this. When the teacher asked group 1, “You then, do you have anything,” Magne stated, “That she sort of says that they ruin the future of the children,” thus identifying an argument. Bent adds that “She scares people, so that they get scared and start complaining to the politicians to fix it and such,” thus proposing scaring as a means of persuasion and identifying a possible motive of the speech (i.e., fixing the climate issue).

Magne then asks his peers, “What is that 2-degree goal,” and Bent provides a short explanation, thus clarifying the meaning of a concept. The group then discusses whether a goal of 2 degrees is a lot or a little and compares with high temperatures in Africa and discusses whether it can affect the growth of some crops. They agree to return to the teacher’s question, and Janne states that “But, nothing destroys our future, she exaggerates a lot, it is so sick”, thus stating an unsupported judgment seemingly based on disputable or deficient knowledge and identifying exaggeration as a means of persuasion. Magne replies that “But the reason she exaggerates is to get politicians to do things”, thus suggesting a possible motive for Thunberg’s argumentation. The group never had discussions evaluating the correctness of Thunberg’s claims.

In these discussions, the students did manage to propose possible motives and means of persuasion. At the same time, they did not analyze the speech by identifying facts used in the speech, and only one argument was identified. However, the teacher did not ask explicitly for this or for evaluations of the correctness of facts and arguments. The teacher's question gave room for the students’ own interpretations, which they then developed, shared, and discussed. Magne and Janne’s comments indicate that their knowledge of relevant scientific facts was lacking.

Discussing ways to support an argument

In the second lesson of day one, the second teacher, Lise, runs a sequence of activities where CT meta-concepts were explicitly taught. She starts out by asking the groups to find arguments against the idea, pretended to be from their parents, that from now on, the student should do all the house cleaning. After sharing arguments from all groups, the teacher sums up, stating that argumentation is about convincing other people, thus explaining a CT concept. Next, she names six ways to support an argument and asks them to “talk together in the group and try to find out or think about, what do these different types of support mean.” However, she chooses first to run a whole-class dialogue on this, and excerpt 1 shows the first part of the dialogue that followed.

Excerpt 1 An example of explicit teaching of CT concepts: Steve and Walter practice explication by suggesting characteristics of the argument from the majority

Speaker | Utterance | Code
Teacher | But. If we take the first one together. Steve, did you have a hand up? | Supporting students to share
Steve | I've heard of [the appeal to] the majority argument | -
Teacher | Okay, maybe you can explain what that means? | Challenges to discuss CT concepts
Steve | There are more people arguing like that | Suggests characteristics of argum.
Teacher | Okay. Several people argue, at the same time? Yes. The majority argument. You're close. There are several who argue. Yes? | Supporting students to share
Walter | I have not heard of it, but I think the majority argument is about how the majority wins | Suggests characteristics of argum.
Teacher | Yes. Exactly. The majority wins, kind of. You could say something like, "most people in Norway think so and so". That is a majority argument, then you use the majority to convince someone, kind of. | Explaining CT concepts

In this activity, Lise challenges the students to discuss CT concepts, although she has not yet explained them. She invites Steve and later Walter to share their thoughts, and she challenges Steve to deepen his thoughts. In this excerpt, Lise asks for elaboration, reflects on their suggestions, and adds positive comments, thereby supporting students to share by signaling that she appreciates their suggestions. Lise signals interest in the students' proposals, which may help to explain why the students continued to participate even though the phrase “you are close” implies that she also gave evaluative feedback. Steve and Walter seem not to have prior knowledge about “the argument from majority” but nevertheless suggest alternatives, i.e., characteristics of arguments from majority, apparently by guessing. In the last utterance in excerpt 1, the teacher briefly explains a CT concept, and she also gives such short summary explanations during the follow-up of groups and when ideas were later shared in class.

Discussing trustworthiness of claims

In the next activity, Lise provides a list of twelve short claims related to the climate issue and asks the groups to read each claim and discuss if it is trustworthy and why, thus challenging students to practice CT. In the groups, many students make judgments without justifications, e.g., “Okay, I think that sounds credible.” Some statements, such as “It actually sounded pretty right,” might be based on an unstated comparison with the student’s prior knowledge. Confronted with a claim expressing no need to take immediate action, one student asserted that “It's not very credible since they do not show us why we do not have to worry about it for a long time.” The student’s view seems to be based on an analysis of the claim that resulted in the identification of lacking support, thus evaluating based on reason, and the criterion involved in the evaluation seems to be that it is necessary to give convincing reasons for assertions if they are to be convincing. The challenge provided by the teacher asked for the students’ own examination and evaluation.

Day 2: calculating millimeters of precipitation

Clarifying information

In the second half of the first day, the students designed and built their own measuring equipment for collecting rainwater and placed it on a flat roof. Later, Thor used this measuring activity to focus on the concept of millimeters of precipitation. At the suggestion of the teacher, several groups had used empty plastic bottles from which they cut the upper third to obtain straight-walled cylinders, and some turned the cut part over and used it as a funnel to guide the rainwater into the meter. The second day, they took the meters to the classroom. The groups had used bottles with different diameters, and the teacher arranged for a discussion on this. While drawing two cylindrical meters with different diameters on the blackboard, he challenges students to practice CT by reasoning: “Another gang chose to do it this way. Use one minute now. How much water would be in this meter, compared to that meter?” A student, Ann, spontaneously asks: “It depends on how much more than. Should we not pour any more water?” Apparently, Thor had not explicated the context for the question clearly enough for Ann, leaving her unsure whether the same volume of water was to be poured into both meters, which would give different water levels in the two meters, or whether the imagined context was outdoors during rain showers. Thus, Ann requests an explanation to clarify meaning by seeking to remove “confusing, unintended vagueness or ambiguity” (Facione 1990) to be able to interpret the task correctly.

Reasoning together

Later, Thor focused on a measuring device where the upper diameter of the funnel was 9 cm, and the diameter of the cylinder collecting the water was 6 cm. He also asks them to imagine that the height of water in the cylinder is 2 cm. Then, he arranged discussions to repeat how to calculate the area of a circle. Next, he challenges students to practice CT as he asked them to come up with possible solutions for “how are we going to be able to find out how many millimeters of precipitation have come into it [the cylinder]?”, which turned out to be a very difficult task for the students.

In group 1, however, Sigurd has managed to calculate the bottom area of the cylinder correctly and multiplies by 2 cm to obtain the volume of the water and states the result as 5.652 L. The teacher, listening to the group, challenges Sigurd to clarify what he thinks this number means, i.e., to practice CT as clarification of meaning. In addition to unit conversion issues, Sigurd seems to confuse the amount of water, i.e., its volume, with millimeters of precipitation, i.e., the height of water measured in millimeters, when he answers: “It's how much, eh, how much rain is measured—or in millimeters [precipitation?]—no I do not know.” Thor challenges the group to check it once more and to make a drawing as this might help. The dialogue in excerpt 2 occurs after some recalculation and small talk.

Excerpt 2 Collective reasoning with disagreement and critical questions leading to renewed and improved reasoning

Speaker | Utterance | Code
Sigurd | We are to measure how much water we have | Clarify meaning
John | 28.26 | -
Sigurd | Me too - but as soon as I multiply by 20, I get way too much | Identify a challenge
Aina | Why do you multiply 20 then - why am I the one asking, I'm the dumbest here, okay | Questioning correctness
Sigurd | But what the hell, I must multiply with 20 - no? | Questioning correctness
Aina | Why so? | Questioning correctness
Sigurd | Then, it will not equal rain! | Evaluates based on reasons

Sigurd first repeats his interpretation of the task given but uses the formulation “how much water”, which can indicate that he still is focused on volume. John then states his conclusion, 28.26 [cm²], which is the area of the bottom of the cylinder. Sigurd’s reply indicates that he compares his finding with what is empirically plausible. Using this as an evaluation criterion, he realizes that his result must be wrong and identifies this as a challenge. The change in the group’s thinking, which led to clarification through further discussion, was triggered by Aina’s questioning of the correctness of multiplying by 20. Her evaluation might or might not have been based on an understanding of the calculations performed and of the phrase “how much water”. One possible interpretation of this dialogue is that collective reasoning led to improved thinking and was strengthened by the variety of voices involved, even though their contributions differed in sophistication and knowledge base. Their sharing of views and criticisms is consistent with signals from Thor indicating that he wanted them to reflect and discuss together and is consistent with his respect for Ann’s criticism of lack of information.
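To make explicit the distinction between volume and millimeters of precipitation that the group struggled with, the calculation can be sketched as follows. This is an illustration under stated assumptions rather than the solution presented in class: we assume that all rain falling on the funnel opening (diameter 9 cm) ends up in the cylinder (diameter 6 cm, water height 2 cm) and that nothing evaporates. The function name `precipitation_mm` is hypothetical.

```python
import math


def precipitation_mm(funnel_diameter_cm, cylinder_diameter_cm, water_height_cm):
    """Convert the water height in the collecting cylinder to millimeters of precipitation."""
    # Catchment area: the opening of the funnel, in cm^2.
    funnel_area = math.pi * (funnel_diameter_cm / 2) ** 2
    # Bottom area of the collecting cylinder, in cm^2.
    cylinder_area = math.pi * (cylinder_diameter_cm / 2) ** 2
    # Volume of collected water, in cm^3.
    volume_cm3 = cylinder_area * water_height_cm
    # Depth the same volume would have if spread over the funnel opening, in cm.
    depth_cm = volume_cm3 / funnel_area
    return depth_cm * 10  # 1 cm = 10 mm


rain_mm = precipitation_mm(9, 6, 2)
volume_l = math.pi * 3 ** 2 * 2 / 1000  # 1 L = 1000 cm^3
print(f"{rain_mm:.1f} mm of precipitation, {volume_l:.3f} L collected")
```

Under these assumptions, the 2 cm of water in the cylinder corresponds to about 8.9 mm of precipitation, and the collected volume is about 0.057 L. Precipitation is thus a height (volume divided by catchment area), whereas the product of bottom area and water height gives a volume.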

Day 3: discussing experiments to understand scientific ideas

Challenged to suggest own explanations

On the third day, several small experiments were carried out on evaporation, condensation, humidity and how temperature affects evaporation rates. The results of these experiments were used to discuss how rain is formed and how raised atmospheric temperatures can explain increased precipitation.

In one experiment, students were challenged to find a method for making and then explaining dew. Using equipment such as kettles and glass plates, dew was observed by all groups. Thor keeps to his strategy of challenging groups to suggest possible explanations, i.e., to practice CT, without providing the scientific account first. After approximately twenty minutes, Thor asks all groups to share, "What did you have to do to make dew?" One group explains their method and their hypothesis, stating that “hot air and cold air, when mixed, it becomes dew”. In further discussion on what might explain the formation of dew, none of the students’ suggested alternatives were scientifically correct, although two students included the expression “water particles” in their suggestions. A possible explanation for students’ sharing of their own ideas using their own words is Thor’s facilitation, by inclusion of the phrase “do you think” in his question (Kolstø 2018), and students’ former experiences of Thor’s noncorrecting response to such answers. This way of phrasing tasks and questions and responding to students’ answers, which in practice implied supporting students to share, is a reoccurring pattern during the project.

Challenged to reduce ambiguity

Following the sharing of ideas, Thor asks the groups to “Explain the word water vapor, in your own words!” While Thor is discussing with a group, one of the students suggests opening the window because “There is so much smoke.” Thor takes the opportunity and challenges the student’s choice of using an imprecise word: “Smoke?”. The student immediately replies, “Yes, vapor”. This is one of very few situations where Thor challenges students to reduce “confusing, unintended vagueness or ambiguity”, included in Facione’s (1990) framework.

Challenged to construct and improve explanations

In the last lesson on day three, Thor ran some demonstration experiments using an open transparent box in which he placed a cloth with a bit of water and covered it first with a piece of plywood and later with thin transparent plastic. The box was placed on an overhead projector that served as a source of light and heat, and the groups were challenged to discuss which alternative, plywood or transparent plastic, would give the highest humidity reading.

In the subsequent whole class discussion, Thor runs a poll, asking “Who concluded that using plastic film as cover on top is the smartest thing?”, which more than half the class agreed on. Thor then challenges this half, stating that “Then you should be able to argue why this is smart,” thus challenging them to practice CT by providing reasons for their conclusion. One student responded, “Because it is transparent, the light comes in, but it does not come out again”, thus suggesting an alternative by providing a reason, although vague or ambiguous as the mechanism involved is not explicated.

The teacher then runs a short discussion on the position of the source of the light. As seen in the first line in excerpt 3, he uses this fact to challenge students to think anew. When one student answers “Yes,” the teacher challenges him to practice CT by explaining why he changed his conclusion, which the student did. As often, the teacher repeats the student’s explanation using his own words and asks if this is what the student suggests, thus practicing CT by modelling the CT practice—which is sometimes necessary—of clarifying interpretation before commenting.

Excerpt 3 Example showing how Thor challenges students to think, express and clarify their reasoning and to clarify if he has interpreted a suggestion correctly

Speaker | Utterance | Code
Thor | Does anyone change their mind now, now that they know that the light comes from below? | Challenges students to practice CT
Student | Yes | -
Thor | Ok. Why did you change your mind then? | Challenges students to practice CT
Student | Because the glass is on the other side. [The light] Gets kind of crushed then | Suggests alternative
Thor | But what does it have to do with the fact that plastic is up here? | Challenges students to practice CT
Student | (hard to hear) | -
Thor | Yes, you think that the light comes from outside, through the plastic bottom, goes through the plastic film, and therefore, it does not stay inside the box? | Practices CT: Interpreting; Challenges students to practice CT

Several times on this day, students proposed and discussed possible explanations for the observations identified, and their views were sometimes challenged by the teacher or peers. The students thus practiced CT as communal, social practice. In the whole-class discussions, not all students were challenged by the teacher to formulate a view, relate to an additional fact, and provide arguments for their view. Nevertheless, most of the time, the class was focused on the dialogues, especially when the teacher challenged a student.

The tasks for the groups this day did not include challenges to test their causal explanations. However, the epistemic practice of controlling variables was modeled as Thor on several occasions pointed out that he had used the same procedures, e.g., used the same amount of water, in the experiments to be compared. The students were not included in discussions on the CT involved in this epistemic practice.

Day 4: interpreting precipitation graphs

Evaluates one’s own idea by analyzing data

On the fourth day, the teaching was about graphical representations of precipitation covering many years, involving bar charts, trend lines and bars showing millimeters of precipitation relative to the so-called Normal. In the second lesson that day, all groups were given a bar chart for local precipitation, which showed one bar for each of the last five years and where the bar for the last year, 2018, was in a separate red color. The teacher had not yet explained the concept of arithmetic mean and challenged students to analyze the graph and compare the bars using the following question: “Is it raining more in 2018, or is it raining less in 2018, than it did the four previous years?”

The students in group 3 first seem surprised that it had rained less the last year than the previous ones and seem to have thought that climate change should give more rain. They conclude that “It's not exactly a big change, a tiny little one. It does not matter.” The teacher, spotting the instance of biased thinking, challenges the group to a better analysis: “Okay, but what's the smallest bar here then?” The students examined data in the graph again and stated that it is “2018.”

Next, the teacher challenges the whole class to justify their analyses, explain the method used, and try to compare to what is the normal level of rain (the formal meteorological concept of normal was introduced in the next lesson):

And then the most important thing - there are two important things here. One is your answer: “Yes, it rains less,” and the most important thing is: “Because.” You must try to justify why! […] And what method did you use to say whether it rained more or rained less. […] You must now try to think. We're talking about Normal values, right? We compare other things with the Normal [Precipitation].

As seen in excerpt 4, group 3 responds by stating a conclusion and the method used. The students then talk over each other while examining the data again, pointing to details in the bar chart that they believe support their view, and trying to include their somewhat loose concept of mean.

Excerpt 4 Collective CT practice in a group of students following a challenge from the teacher

Speaker | Utterance | Code
Olai | Eh It rained less | Suggests an alternative
Bjarne | We used a ruler [put horizontally on the bar chart to read measurement values] | Suggests an alternative
Olai | Yes, we used a ruler. And here it rained - the average was really (interrupted) | Examines idea using data
Bjarne | It actually rained more here - or? | Examines idea using data
Olai | It rained more here than in 2016, but a little less than - | Examines idea using data
Bjarne | About the average | Examines idea using data
Olai | Yes, about the average | States a view based on facts

Although with a low level of sophistication, in the dialogue in the excerpt and the discussion leading up to it, the students state the planned method to gather information (i.e., suggest an alternative), examine their idea using data, and state a view based on facts identified. In the fourth utterance, Bjarne adds the question “or?” to his input, which signals to his peer an openness to a possible alternative analysis, a disposition characteristic of CT.

Additionally, on day four, when Thor teaches graphs, he challenges students to suggest and examine ideas without giving introductory explanations. In his dialogue with group three, he is listening to their ideas but also uses information in the graph to challenge the group and, later, the whole class to make more thorough analyses. Group three responds by making further suggestions and by building on and commenting on each other’s ideas, thus practicing collective CT.
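The comparison the students were approaching informally, i.e., the last year's bar against the arithmetic mean of the earlier bars, can be sketched as below. The annual totals are hypothetical and chosen only for illustration, since the values in the bar chart used in class are not reported here. The function name `compare_last_year` is likewise an assumption.

```python
from statistics import mean


def compare_last_year(precipitation_by_year):
    """Compare the most recent year's precipitation with the mean of the earlier years."""
    years = sorted(precipitation_by_year)
    last_year, earlier_years = years[-1], years[:-1]
    # Arithmetic mean of all years before the most recent one.
    baseline = mean(precipitation_by_year[y] for y in earlier_years)
    # Negative difference: the last year was drier than the earlier mean.
    difference = precipitation_by_year[last_year] - baseline
    return last_year, baseline, difference


# Hypothetical annual totals in mm (illustration only, not the chart used in class).
example = {2014: 2400, 2015: 2600, 2016: 2300, 2017: 2500, 2018: 2200}
year, baseline, difference = compare_last_year(example)
verdict = "less" if difference < 0 else "more"
print(f"{year}: {abs(difference):.0f} mm {verdict} than the mean of the earlier years ({baseline:.0f} mm)")
```

Such a comparison yields both an answer and a “because”, in line with the teacher's request that the students justify their conclusion and state the method used.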

Day 5: students present their products

Explains and challenges students in constructing reasoned arguments

Towards the end of the fourth day, Thor explains that the following day, all groups will identify their own views on one of two questions decided by the teacher and make presentations in front of the rest of the class: “First we want you to answer—[the] question: Will there be more rainfall? And—or, that means you can use both or just one of them—will there be more drought?” He continues to explain that they need to provide arguments for their view, use information they have read, be critical of their sources of information and use, for example, graphs they have seen to explain and support claims. Thereby, he provides an explanation of a CT concept, namely what constitutes an argument. He adds that “You may conclude with the opposite of what I believe or mean” if they argue well; thus, he expresses the CT dispositions of critical attitude and openness. Later, he clarifies that in the product to be presented, “you show us that, number 1, that you have learned something about mathematics and science during the week and, number 2, that you argue for the view you conclude on,” thus again challenging them to practice CT and construct reasoned arguments. Furthermore, he states that all students in each group must participate by presenting part of the group's work. The last lesson on day 4 and the two morning lessons on day 5 are allocated for the groups to prepare, and they are also asked to work on it after school.

The next morning, Thor, together with Lise, repeats what the students should aim for in their presentations. In addition to the information given the day before, he tells the students, “You should simply try to convince us that what you have got on the sheet is correct” (last word emphasized); he thus emphasizes correctness as a criterion. Lise tells the class that “I'm so excited” and that she is looking forward to seeing the kind of products they have made, signaling that she values what the students are about to share.

Presents views based on facts

All groups made their own decisions, actively collected additional information, and selected relevant facts to support their arguments. The group presentations, which were supported by slideshows or wall newspapers, all contain 3–6 min of facts and explanations of three to six topics, such as precipitation, greenhouse gases, the 2-degree goal and consequences of global warming for precipitation, animals, health, or food production. Using one to five sentences, they then conclude by stating their view on one of the main questions for the Climate project. All but one group provide support for their conclusion, thereby stating a view based on the facts presented, as in this example:

“So, we came to the conclusion that countries that are already dry receive less rainfall than countries that usually receive more rainfall. Because, in the course of ten years, 10–20% of the world's dry areas have become drier. It rains almost 20% more in wet countries than it did in 1900.”

This group did not explicate their sources, as two other groups did, but they did present a reasoned argument. Neither this group nor any other groups expressed evaluations of sources or correctness of facts and explanations that they presented, although this might have occurred in group discussions that were not audio-recorded.

The group that did not provide support for their conclusion had a complex claim and stated that “We think it is more dangerous with drought in the dry places, Africa for example, than [increased precipitation is for] the wetter places, like Bergen.” However, when challenged by the teacher to explain “why,” they provided an explanation based on several facts.

Thor gave feedback to all groups and explained the characteristics of good arguments in the context of the groups’ presentations. He gave each group a positive comment and a concluding “thank you.” He did not correct or evaluate students’ views, facts or explanations but explained one aspect they might improve next time to strengthen their argumentation, thus practicing CT. His main point remained the need to support claims with relevant evidence and to reference sources.

Summary of findings

This study set out to describe critical thinking (CT) practices in a teaching unit involving 18 lessons in an 8th grade class where the climate issue was used as a context for working on argumentation, critical thinking, and relevant science concepts and data representations. The analysis identified 1071 instances distributed over all phases of the project where the students made interpretations, analyses, evaluations, inferences and suggested alternatives and characteristics of arguments (see Table 1). These instances of CT practices were based on prior knowledge and information currently available and included examinations and evaluations based on reasoning and facts. However, they also included judging without justification and unsupported suggestions of alternative explanations and consequences. In a few instances, students identified possible motives behind utterances and elements in arguments.

The holistic analysis of episodes showed how students’ CT practices were embedded in dialogues where students shared perspectives, built on each other’s contributions, and occasionally questioned their peers’ inputs. Several episodes demonstrated how constructive collective reasoning stemmed not only from sophisticated comments but also from critical comments and ideas that seemed to be based on relatively basic knowledge. Students' CT practices were partially responses to questions and challenges posed by the teacher and sometimes emerged spontaneously during group and whole-class discussions, as evidenced in several dialogues on day 2. The criteria the students apparently used in their evaluations and judgments were relevant.

A main characteristic of the teachers’ classroom practice was to challenge students to interpret and analyze texts and graphs, evaluate claims, and suggest solutions, explanations, consequences, and causes. Such challenges resulted in CT practices during group- and whole-class discussions. Interestingly, the students were given such tasks without first receiving the correct explanation. Additionally, the teachers supported students to share their reflections by encouraging them and incorporating their responses into the ongoing discussion. Although the students’ contributions were at different levels of sophistication, most students contributed suggestions for possible interpretations, evaluations, consequences or explanations or reasons during most of the lessons.

The teachers made interpretations, analyses and evaluations of inscriptions and student contributions, used explicit criteria, and made inferences and thus practiced and modeled CT. The teachers sometimes stated CT dispositions and scientific criteria as reasons for judgments they made, e.g., during demonstrations, thus modeling CT practices. Additionally, the teachers emphasized several times the need to include evidence, explanations and sources when constructing arguments, thus explaining the concept of argumentation. However, they did not involve students in reflections on these or other CT practices experienced during the project.

In two activities, the students evaluated trustworthiness and identified types of support based on lists of claims and arguments. Apart from this, students did not explicate the characteristics of arguments or critical thinking during the project. However, all but one group constructed and presented one or several reasoned arguments at the end of the project week.

Students’ critical thinking practices

Based on students’ CT practices in the Climate project, it is possible to discuss whether these practices can contribute to the enhancement of the quality of their CT skills. The analysis revealed that students were involved and gained experience with CT practices but at varying levels of sophistication. For example, although most examinations and evaluations were supported by a reason, many times students stated a view on a matter without formulating a rationale. The many suggestions made by the students in response to the teachers' challenges might by some be denoted as "only" guesswork. However, the abductive reasoning typically involved in goal-directed guessing as well as in learning processes and scientific research is central in Dewey’s (1910) conception of critical thinking. It is also necessary as an initial step to construct possible alternative interpretations, explanations, analyses, etc. Additionally, many students obtained experiences with collective reasoning. In several of these dialogues, the students actively questioned each other's suggestions and provided alternative perspectives, leading to improvement of ideas.

Bailin and Battersby (2016), among others, emphasize that good CT is thinking that meets certain criteria or standards for judging claims and arguments. The teachers several times stated the importance of providing facts and sources to support claims, implicitly emphasizing empirical adequacy as a criterion. The students asked critical questions and made evaluative comments focusing on the correctness of interpretations, the presence of support for claims, the correctness of the supporting data, the feasibility of solutions proposed, and the empirical plausibility of claims. Moreover, no examples were found in the data of students using questions, evaluations or criteria that seemed misplaced, e.g., group thinking. Nevertheless, no groups argued against anthropogenic climate change in their presentations. This might have been a strategic choice for some groups, hoping that their view would align with their perception of the teacher's view, an opportunistic criterion for weighing arguments. In general, however, the students’ use of criteria seems relevant, although the criteria were never explicitly stated or discussed. In addition, while sharing their views, the students often included phrases such as “I think,” “I have heard,” and “I think I read once.” Thus, they provided epistemic qualifications of their input, which is a prerequisite for making good inferences based on inputs.

Although the students participated in CT practices, their CT practices were mostly based on everyday knowledge and formulated at low levels of sophistication. However, even if the level of sophistication and depth of students’ interpretations, analyses, evaluations, and inferences was low, the students were immersed in challenges, activities and dialogues that involved the construction and discussion of the quality of knowledge claims. The finding in Abrami, Bernard, Borokhovski, Waddington, Wade and Persson’s meta-study (2015) that using group and whole-class dialogues focusing on authentic or situated problems improved the acquisition of CT skills also suggests that the identified CT practices are relevant to developing students' CT. Moreover, the analysis also shows that the teachers challenged the students with the type of tasks and questions that critical thinking seeks to answer. Taking into account that immersion is a strategy for developing students’ CT with documented effects (Abrami, Bernard, Borokhovski, Waddington, Wade and Persson 2015), the results indicate that students' CT practices were relevant as experiences that can contribute to developing their CT practice further. The study thus provides an example of the characteristics of critical thinking in the making.

Inclusive practices as a key element in the classroom culture

It is a common observation that some students do not raise their voice in teacher-guided discussions in the science classroom (see e.g. S. Kelly 2008a, b). According to Belenky, Clinchy, Goldberger and Tarule (1986), developing one’s thinking about the quality of knowledge claims begins with developing confidence in one’s ability to reason and express ideas. The ability to reason about the quality of knowledge claims and reasoning and to communicate and justify one’s reasoning is constitutive of CT. The presence in classrooms of students who do not raise their voices and express their reasoning therefore gives cause for concern.

Thayer-Bacon (2000) emphasizes that some dialogue partners need the support and interests of others to begin to put into words their CT. A characteristic of the teachers’ practices in the Climate project was the many strategies used to include students in group and whole class dialogues. Some questions were anchored in students’ experiences from everyday life or in school activities, thus enabling participation independently of the quality of students' scientific knowledge. Another strategy was to encourage students by informing them that what the teacher had heard during group work was interesting. The teachers typically challenged many groups or students to respond to each question, thus activating more students and making it obvious to all that it was possible to give several different answers to questions.

Another interesting strategy they used was to challenge students to suggest possible interpretations, causes, explanations and consequences of observations, graphs, and views without first presenting the relevant disciplinary knowledge. To a certain extent, this may have reduced differences between students with different knowledge backgrounds, as all students thus had to go beyond their prior knowledge and since high-achieving students could to a lesser extent benefit from their ability to learn from explanatory introductions. Additionally, as evident in the analyzed episodes, students, in general, actively listened to each other's contributions in dialogues and responded in a constructive, although sometimes playful, manner to their peers’ utterances and challenges from their teachers. These patterns of communication and teacher strategies evidently established a classroom culture, i.e., ideas and practices related to knowledge and learning (Goodenough 1994), where students felt it natural to reason and to share unfinished and even incorrect ideas and where students and teachers responded to each other's utterances in ways that enabled and maintained such practices.

At the same time, the teachers often stated learning goals at the beginning of days and lessons and summarized lessons to be learned after discussions. Thus, probably no students were in doubt that the purpose of the activities and dialogues was to learn certain ideas. Thus, the resulting classroom culture was characterized by practices reflecting a view of learning as a partly individual and partly collective process involving reasoning and where students’ own ideas and formulations are relevant and necessary to build solid knowledge. These practices include students constructing, interpreting, analyzing and evaluating tentative knowledge claims, making inferences, and suggesting alternatives, i.e., they practiced CT skills. This is interesting, as other researchers have identified classroom cultures that do not support students' development of critical thinking (Mkimbili and Ødegaard 2020). It also exemplifies a possible response to a challenge of SSI teaching identified by Tal and Kedmi, when they state, “What is needed is creating a whole classroom culture that encourages thinking by using a language of thinking, providing constant feedback and encouraging reflection” (2006, p. 638).

In general, the teachers responded positively to students’ reflections, often repeating students’ ideas and reasons, and they did not make corrective responses to these reflections. In triadic dialogues about facts that were not coded as CT practices, the correctness of the students' contributions was normally clarified, but often several answers were collected first, signaling an interest in the students' knowledge and thinking and not just in the correct answer. Thus, the teachers’ practices signaled that all student contributions were valued. These kinds of responses imply that students’ contributions were considered relevant and legitimate by the teachers. This can potentially be important for students' continued participation in dialogues and therefore important for the continued development of their CT as individual and social practice. Lave and Wenger (1991) described how learning new practices, competencies and values can take place through legitimate peripheral participation in a community of practice. Their studies suggest that development from being peripheral participants, with only simple competencies, to "old timers" is possible but presupposes that such peripheral and mediocre contributions, for example, in CT practices, are considered legitimate by “old timers”, for example, a teacher. In contrast to Stroupe (2014), who describes inquiry classrooms where students' legitimate participation stemmed from being granted responsibility for shaping scientific knowledge, our results suggest that legitimate participation in knowledge-producing processes in the science classroom is also achievable even when the teacher largely maintains scientific authority.

Taken together, the pedagogical strategies employed in the Climate project signaled to the students that their teachers believed in the ability of all of them to learn, reason, construct arguments and think critically. However, the teachers' continuous use of challenges (468 times during the 15 lessons analyzed) and strong classroom leadership left limited space for students to pursue their own questions and engage in spontaneous CT practice. Nevertheless, the Climate project provided the students with experiences of critical reasoning and communicating their thoughts, potentially enhancing their confidence in engaging in discursive practices involving CT. Interestingly, this also seems to have been the case for students with low school motivation.

What the teachers did to a lesser extent was to engage students in reflective dialogues regarding potential insights into critical evaluation based on their experiences during immersion activities—practices reminiscent of the metacognitive monitoring proposed by Halpern (1998). According to Cavagnetto (2010), only immersion experiences provide the opportunity for students to exercise and fully comprehend scientific cultural practices. Nevertheless, the inclusion of reflection dialogues based on relevant student experiences offers an opportunity to enhance students' awareness and, consequently, their ability to critically evaluate principles and practices associated with CT. When asked, the teachers explained that they did not consider and were not used to employing this kind of meta-reflection as a teaching strategy.

Several qualitative studies on student argumentation have also described how various teacher strategies can foster students' CT practices. In a study analyzing a lesson focused on argumentation within an SSI context, McNeill and Pimentel (2010) observed that a high school science teacher's use of open-ended questions and links to previous student comments cultivated an environment in which students considered multiple perspectives, reflected on their own thoughts, and contemplated their classmates' viewpoints, thereby promoting CT practices. Sampson, Grooms, and Walker (2011) implemented the Argument-Driven Inquiry instructional model, which includes sessions where small groups share their arguments and are tasked with determining which claim is the most valid or acceptable. A quantitative analysis of group dialogues conducted before and after the intervention revealed that students displayed increased CT, as evidenced by an increase in oppositional statements of ideas after the intervention. In a review study focused on characterizing immersive argument-based inquiry approaches, Weiss, McDermott and Hand (2022) found that common teacher actions included encouraging argumentation practices, asking questions, modeling dialogue, communicating norms, and practicing shared authority. A general pattern that appears to emerge from these studies is that pedagogical strategies based on sharing authority that encourage students to engage in critical discussions involve structures that promote the sharing of ideas, the presence of multiple viewpoints or ideas, and contexts or tasks that stimulate students to assess the quality of proposed ideas. Variations in these strategies were also employed in the Climate project. In dialogues involving CT practices, students were granted and exercised a degree of epistemic authority, likely contributing to their CT development and learning in science. However, teachers' authority over tasks and questions in classroom activities probably explains much of the students' critical engagement. In science classrooms, students are likely aware of the existence of authoritative scientific theories and the teachers' role as class leaders. Nevertheless, both the current study and the research by Weiss, McDermott and Hand (2022) exemplify that it is indeed possible to facilitate inclusive student engagement in discussions involving CT practices and shared authority. This also implies that the facilitation of shared authority in science projects and its potential impacts on student learning and CT practices are complex issues requiring further research.

In addition to the strategies discussed above, our analysis has also described approaches that facilitate extensive student participation in dialogues involving CT practices and characteristics of students' CT practices in a particular classroom context.

The focus on CT and argumentation in science lessons was new to the students and new to the teachers. Consequently, the strategies used by the teachers and the students’ resulting CT practices provide insight into what CT practices in a classroom can look like in a first attempt to employ an immersion approach.

Knowledge base and critical thinking

A characteristic of students’ CT practices was the limited knowledge base often involved in their utterances. Although covering all main categories in the framework by Facione (1990), the students’ CT practices only reflected a subset of all the elements in the framework. More specifically, only a few simple instances were identified where the students used conceptual frameworks, identified values and criteria, or distinguished between main conclusions and the premises in utterances they encountered. A common characteristic of many of these and other shortcomings in students' CT practices is that improved practice seems to presuppose context-relevant content knowledge. Additionally, some of the knowledge lacking in students' considerations echoes what Bailin, Case, Coombs and Daniels (1999b) denote as key critical concepts for CT (e.g., knowing the difference between value statements, empirical statements and conceptual statements, and between premises, conclusions and assumptions). This is in line with Moon’s (2008) observation that differences in the quality of critical thinking represented in student texts might be due to differences in the quality of the knowledge base of the learner. Additionally, a study by Paulsen and Kolstø (2022) found that test questions in the Cornell Critical Thinking Test (CCTT) that turned out to be particularly challenging for adolescents in fact presupposed operational understanding of certain key concepts.

This pattern, where the students' CT practices echoed the main aspects of critical thinking but with shortcomings in their knowledge bases for these practices, suggests that lacking knowledge might be a limiting factor for the quality of the students' CT practices. The fact that the teachers often challenged students to reason before clarifying the content knowledge involved in the discussions can partly explain this pattern. However, students may well find themselves with weak content and background knowledge when confronted with topical SSIs in the future. Additionally, science learning assumes that students utilize their existing knowledge as a resource for their learning (National Research Council 2005) and engage in the generation and critical evaluation of tentative interpretations of observations and information (Osborne 2014). During science learning activities in the Climate project, the students actively contributed their own answers, questions and reflections based on their existing knowledge. The limited quality of the students' CT practices therefore indicates that, considering the context, students were being activated at an appropriate level, likely enabling their development of scientific knowledge as well as their CT practices.

The idea that knowledge can act as a limiting factor for CT implies that the cognitive skills outlined in Facione's (1990) framework demand a practical understanding of both contextual background knowledge and the underlying knowledge embedded within those skills. However, the teachers' approach of presenting challenges before clarifying concepts, explanations, and procedures in the Climate project likely facilitated students' engagement in CT practices. The dual role of prior knowledge in influencing students' CT practices, both as a supporter of quality and a potential hindrance to participation, poses a dilemma in terms of enhancing the quality of critical thinking in the classroom. This dilemma underscores the need for further studies characterizing the quality of CT practices among students exposed to various teaching strategies and real-world contexts. On the final day of the Climate project, the teacher offered critical feedback on students' argument presentations, signaling heightened expectations for knowledge support at this stage. This progression of raising standards for the knowledge base required for CT practices during projects or schooling might offer a generalizable approach. However, students' acquisition of key critical concepts might necessitate different strategies.

Consequences for science education and further research

Our analysis of the Climate project reveals that the teachers' use of challenging but inclusive teaching strategies for learning scientific concepts and socioscientific argumentation resulted in a classroom culture including rich CT practices among the students. Prior studies have documented that teaching centered around authentic issues and inquiry-based science teaching can enhance students' performance on critical thinking tests (Abrami, Bernard, Borokhovski, Waddington, Wade and Persson 2015). The current study contributes to this body of research by outlining the characteristics of students' CT practices in a classroom that supports such practices. The findings suggest that it is possible to engage students of all achievement levels in practicing CT based on their own perspectives using appropriate teaching methods that blend accessible challenges and support strategies, as well as group and whole-class discussions rooted in concrete texts, experiments, and graphs. Additionally, the findings indicate that relevant teaching methods can extend beyond experiment-based inquiry teaching to encompass discussions based on a variety of materials. This suggests that the general teaching strategies identified are significant and transferable across topics. Interestingly, although the teachers aimed for all groups to develop a reasoned argument by the end of the project, the activities during the project primarily focused on students learning topics related to climate science. This suggests that the presence of CT practices in the classroom was not in conflict with the emphasis on learning science but rather an integral part of the teachers' focus on student learning. This suggestion is further supported by Osborne's (2014) discussion on how CT is an inherent component of science learning.

Inspired by their participation in a school development project, the teachers made a few limited changes in their teaching. In particular, Thor explained that in the Climate project, in addition to his increased focus on encouraging students to engage in CT and argumentation, he supplemented his traditional teaching strategies with a new one: challenging several students to answer his questions before he discussed their answers. Thus, the analysis indicates what students' CT practices might look like under those kinds of circumstances, practices that can contribute to the development of their CT in authentic contexts. For science teachers and science teacher educators, the results offer descriptions of a classroom culture that includes teaching strategies combining challenging tasks and strategies for inclusive education that can stimulate further reflection. The bypassed opportunities to use examples of students’ CT practices for teacher-guided reflection and explication of characteristics of good CT can stimulate reflections on potential ways to use such examples without side-tracking students’ attention away from the main topic of a project. The positive effects of extending the immersion approach through explication of CT, referred to as the infusion approach (Abrami, Bernard, Borokhovski, Wade, Surkes, Tamim and Zhang 2008), support such use of student experiences.

Our analyses demonstrated that the identified classroom culture effectively stimulated students' critical thinking and reasoning. These findings suggest that analyzing students' CT in authentic learning contexts can unveil characteristics of classroom culture relevant to CT learning. Further research on students' CT practices in authentic learning contexts can offer insight into the various ways different classroom cultures can foster students' CT practices. Moreover, it appears relevant to explore the interplay between the quality of students’ CT and their knowledge base in the topic being discussed. Additionally, it may be valuable to investigate ways to empower students with epistemic authority when the teaching goal is to enhance students' CT practices during science learning in SSI and other contexts. Further studies will be necessary to elucidate the extent to which the characteristics of the identified classroom culture, teaching practices, and the resulting CT practices of students contribute to increased quality in students' CT practices.