An increasing number of studies have investigated the effects of basic cognitive skills on reading development, such as visuospatial attention (Gori & Facoetti, 2015), working memory (Peng et al., 2018), executive functions (Follmer, 2018), and temporal processing skills (Farmer & Klein, 1995). Working memory (WM) and visuospatial attention constitute two crucial domain-general cognitive skills that serve as reliable predictors for reading development. Extensive research has consistently shown that WM predicts reading comprehension across alphabetic and logographic scripts (for reviews, see Peng et al., 2018; Savage et al., 2007). Numerous studies have also linked visuospatial attention abilities to comprehension in different writing systems (Liu et al., 2015; Fu et al., 2019; Vidyasagar, 2019). Additionally, some intervention research demonstrates that WM and visual-attentional training can enhance reading performance (Loosli et al., 2012; Peters et al., 2019). Building upon the substantial prior research, this study aimed to provide novel insights into how WM and visuospatial attention influence reading comprehension. By elucidating these relationships in a sample of Chinese-speaking children, the current work advances theoretical understanding of reading development processes. The findings also have potential implications for early identification and prevention, pending replication with varied orthographies and readers.

Studies on reading comprehension have begun to examine whether domain-general cognitive skills influence reading through literacy-specific skills (e.g., Kim, 2017; Taboada Barber et al., 2021; Yang et al., 2019). To develop a theoretical framework for investigating intermediary linguistic skills, we based this study on the simple view of reading (SVR; Gough & Tunmer, 1986; Hoover & Gough, 1990), an influential theory of reading comprehension.

A framework for reading comprehension

The SVR postulates that two interrelated but distinct skills are necessary for successful reading comprehension (Gough & Tunmer, 1986; Hoover & Gough, 1990). Decoding (or word recognition) refers to the ability to “read isolated words quickly, accurately, and silently” (Gough & Tunmer, 1986; p. 7), which allows access to semantic information present in printed inputs. Linguistic comprehension (or language comprehension) is viewed as the ability to interpret lexical information, sentences, and discourses. Both components are multidimensional, encompassing various subcomponents, and deficits in one or both of them result in comprehension difficulties. Many studies have supported the SVR, indicating that decoding and linguistic comprehension contribute to successful reading comprehension among readers of alphabetic orthographies (Hoover & Gough, 1990; Joshi et al., 2015; Tobia & Bonifacci, 2015) and the Chinese language (Ho et al., 2012; see Florit & Cain, 2011 for review).

Working memory, linguistic skills, and reading comprehension

To comprehend a text, readers decode the meanings of printed inputs (words and phrases) and then construct and merge them to form the coherent meanings of sentences. In addition, they are required to integrate adjacent sentences and relevant world knowledge to build a coherent representation of the whole text (Gernsbacher et al., 1990; Kintsch & Rawson, 2005). Working memory provides the capacity to store and manipulate information simultaneously (Baddeley & Logie, 1999). Researchers have claimed that working memory enables readers to hold decoded information and integrate it with subsequently incoming information and relevant knowledge retrieved from long-term memory, facilitating the construction of a coherent mental model of the text (Cain et al., 2004; Kim, 2015).

Abundant evidence suggests that working memory is a unique predictor of reading comprehension ability (Cain et al., 2004; Chrysochoou et al., 2011; Gottardo et al., 1996; Nouwens et al., 2017; Swanson et al., 2006; see Carretti et al., 2009 for a meta-analysis). For instance, a longitudinal study reported that working memory predicted reading comprehension after word decoding, vocabulary, verbal intelligence quotient (IQ), inference making, comprehension monitoring, and story structure knowledge were controlled for (Cain et al., 2004). In some cases, the effect of working memory emerged only at a later developmental stage (e.g., grade 3 but not grades 1 and 2; Seigneuric & Ehrlich, 2005). However, the mechanism underlying this effect remains unclear.

According to the SVR, working memory should affect reading comprehension through decoding and linguistic comprehension. Studies have supported the relationship among working memory, decoding (Kibby et al., 2014; Knoop‐van Campen et al., 2018), and linguistic comprehension (Florit et al., 2009; Was & Woltz, 2007). Working memory is expected to play a similar role in both linguistic and reading comprehension. When comprehending spoken text, oral information is temporarily stored in working memory. As readers process new words and phrases, they actively integrate this information into their current mental model to construct a coherent meaning of the whole text (Cain et al., 2004; Kintsch & Rawson, 2005). Thus, working memory is crucial for linguistic comprehension.

On the other hand, some studies have found a relationship between working memory and word decoding (e.g., Christopher et al., 2012; Kibby & Cohen, 2008; Kibby et al., 2014; but see Keresteš et al., 2019). Learning to read words may involve actively maintaining, matching, and manipulating lexical and sub-lexical information to form connective bonds between the spoken and graphic counterparts of printed words (Christopher et al., 2012; Vellutino et al., 2004).

Studies on Chinese reading seem to provide more consistent evidence for the effect of working memory on decoding (Leong et al., 2008; Liu et al., 2023a, 2023b; Yang & Qiao, 2021; cf. Yang et al., 2020). This finding is in line with the notion that the contribution of cognitive factors to reading is modulated by orthographic depth (e.g., Vaessen et al., 2010). Chinese is considered an opaque orthography due to its complex writing system and lack of phonetic transparency, making reading acquisition challenging for Chinese-speaking children (McBride, 2016). Chinese characters comprise many graphic units, such as strokes and radicals (stroke patterns carrying phonetic or semantic information), and are visually complex (Yang et al., 2013). They generally contain more visual information than English words (Li et al., 2012). The lack of one-to-one correspondence between phonemes and graphemes and high degree of homophony and homography contribute to the difficulty in decoding and understanding written Chinese (Siok & Fletcher, 2001). Working memory is crucial for remembering word sounds when beginning readers process visual–orthographic information contained in print and establish sound-to-print mapping (Yang & Qiao, 2021).

Some studies have provided evidence that working memory has an indirect relationship with reading comprehension, which is mediated by decoding and/or linguistic comprehension (Haft et al., 2019; Jiang & Farquharson, 2018; Kim, 2015). For instance, Kim (2017) examined the pathways of relationships among several cognitive and language skills and found that decoding and linguistic comprehension completely mediated the relationship between working memory and reading comprehension (also see Taboada Barber et al., 2021). However, studies elucidating the mediating mechanisms involved in this relationship are limited (e.g., Jiang & Farquharson, 2018; Yang et al., 2019).

Visual search skill, linguistic skills, and reading comprehension

Reading begins with the visual processing of printed inputs. Visual skills are relevant to reading (Besner et al., 2005). According to visuospatial attention theory, visual search is crucial for reading and its deficit can cause reading difficulties (e.g., Vidyasagar & Pammer, 2010). Visual search refers to the ability to shift the focus of attention sequentially along visual stimuli in the environment; this is necessary for the efficient recognition of one or a few stimuli at a time. When reading, the sequential recognition of letters or letter clusters is related to the perception of their spatial sequences. Therefore, if the serial allocation of attention is disrupted, readers may experience degraded or reversed visual inputs, making reading frustrating and difficult (e.g., Vidyasagar & Pammer, 2010). Studies have provided adequate support for the presence of visual search impairments in alphabetic (e.g., Lallier et al., 2013; Menghini et al., 2010; Sireteanu et al., 2008; Wright et al., 2012) and Chinese readers with dyslexia (Liu et al., 2019).

Visual search is related to reading comprehension. Solan and colleagues (2007) reported that a composite measure of visual attention, which includes visual search, contributed to categorizing children with poor and good comprehension skills with a high degree of sensitivity. In addition, the researchers indicated that a visual attention intervention successfully improved the reading comprehension scores of children with moderate reading difficulties (Solan et al., 2003). In addition to alphabetic reading, Chinese reading comprehension could be independently predicted by visual search skill after controlling for decoding, some cognitive–linguistic skills, and nonverbal IQ (Liu & Liu, 2020; Liu et al., 2015).

The aforementioned studies argue that visual search affects reading comprehension because it facilitates the efficient decoding of words printed in a crowded text (Vidyasagar & Pammer, 2010). Considerable evidence supports the link between visual search and decoding (Casco et al., 1998; Franceschini et al., 2012; Plaza & Cohen, 2007). For example, the visual search skill of Italian-speaking kindergartners predicted their word recognition achievements in grades 1 and 2 after controlling for their age, nonverbal IQ, phonological awareness, and rapid naming (Franceschini et al., 2012). The Chinese script is typically more visually complex than alphabetical scripts. Thus, good visual search ability is required to process and recognize the graphic forms of Chinese words accurately and fluently (Liu et al., 2015). This assumption is supported by previous studies reporting a link between visual search and Chinese word reading (Liu et al., ; Yu et al., 2018).

Visual search ability might affect reading comprehension by facilitating not only individual word recognition but also text-level processing. Reading most languages requires the efficient orientation of visuospatial attention from left to right and line by line, which is not the manner in which the visual search mechanism evolved to operate. It is believed that part of the process of learning to read involves training the visual system to shift in a reading-like manner (Vidyasagar & Pammer, 2010). Efficient attentional shifting may facilitate fluent reading because printed text can be rapidly processed in the correct sequence, making it easier to integrate information extracted from adjacent units. For example, Liu and Chen (2020) found that visual search ability indirectly predicted Chinese reading comprehension through not only word decoding but also word detection, which refers to the ability to detect the characters that form a word in a string of irrelevant characters. Considering the potential effect of visual search skill on text-level processing, we hypothesized that visual search exerts a direct effect on comprehension in addition to an indirect effect through decoding.

The present study

The current study had three objectives. Firstly, we aimed to investigate the potential influence of domain-general cognitive skills on reading comprehension via domain-specific skills. We focused on working memory and visuospatial attention, based on the substantial evidence signifying their effects on reading (see Peng et al., 2018; Vidyasagar, 2019 for reviews). Our study explored whether linguistic skills could be the underlying mechanisms facilitating these effects. Secondly, our study was built on the foundational framework of the Simple View of Reading (Gough & Tunmer, 1986). This theory emphasizes the significance of linguistic comprehension and decoding as key components for reading comprehension, highlighting their independent yet interconnected nature. In concordance with previous studies (e.g., Braze et al., 2016; Ho et al., 2017), we tested whether any domain-general skills have a direct effect, thereby qualifying as a third independent component of reading comprehension. Thirdly, most existing research has examined the effects of visual search and working memory in isolation, leaving their relative contributions to reading acquisition largely unexplored. A notable exception is the study by Plaza and Cohen (2007), which reported that preschool visual search skill contributed to first-grade decoding and spelling abilities above and beyond linguistic skills. However, working memory did not show a unique contribution in their study, a finding that contradicts the evidence supporting the role of working memory. This discrepancy emphasizes the necessity of studying the key cognitive skills simultaneously to better elucidate their relationships with reading development.

The research questions of this study are as follows:

  1. Are working memory and visual search skill uniquely related to Chinese reading comprehension?
  2. How do working memory and visual search skill relate to Chinese reading comprehension?
  3. Do decoding and linguistic comprehension completely or partially mediate the relationship between these domain-general cognitive skills and reading comprehension?

These questions were addressed using 1-year longitudinal data from grade 1 students in China. The collected data were analyzed using structural equation modeling (SEM) to evaluate mediation models. The findings of this study demonstrate how general cognitive skills affect reading comprehension in early elementary grades.

Methods

Participants and procedure

In this 1-year longitudinal study, 202 children who were native speakers of Mandarin Chinese (126 boys, 62.7%; Time 1, mean age = 86 months, standard deviation = 4.9 months) were included as participants. At Time 1, they were attending grade 1 at a public primary school in Shenzhen, China, and they completed Time 2 tasks 1 year later. Parental consent was obtained. According to parental and teacher reports, none of the children had perceptual, attentional, or developmental difficulties. Because we examined the effect of visual search, we screened the children for attention-deficit/hyperactivity disorder based on their scores on the Swanson, Nolan, and Pelham-IV Parent Form (Gau et al., 2008). All of the children had normal or corrected-to-normal vision.

This study was approved by the research ethics committee of the corresponding author’s university. Data were collected at two time points as part of a larger longitudinal study on Chinese reading. All tasks were administered by trained undergraduate students at the participants’ school during class time. Each child was individually tested in one-on-one sessions by a tester in a computer classroom. The study began after we obtained parental consent. We measured domain-general and domain-specific skills at Time 1 and Chinese reading comprehension and nonverbal IQ at Time 2. Every child completed the tests in random order. During the testing sessions, the testers allowed the children to take short breaks and encouraged them to stay focused. Each testing session lasted around one hour. Stationery items were provided to the children as a reward for their participation.

Measures

Verbal working memory

Working memory was measured using a backward digit span task based on the Wechsler Intelligence Scale for Children (Wechsler, 1991). The tester said a string of numbers and asked the children to repeat them in reverse order. The length of the strings of numbers ranged from two to nine digits. A total of 14 strings were presented in increasing length. The task score was the number of strings accurately reproduced by the children. In the present study, Cronbach’s α for the test was .71.

Visual search

We used the visual search task reported by Liu et al. (2023a, 2023b), who followed the procedure of commonly adopted tasks (e.g., Sireteanu et al., 2008). In each trial, the children viewed a visual display in which a predesignated target stimulus might be intermixed with several distractor stimuli. They were asked to indicate whether or not the target was present. Two types of trials were presented. In the conjunction search trial, the target was distinguished from distractors by a combination of orientation and color (i.e., when the target was a red horizontal line, distractors were green horizontal lines and red vertical lines). In the feature search trial, the target (i.e., a white horizontal line) was distinguished from distractors by only one feature, namely orientation (i.e., the distractors were white vertical lines). The children were asked to respond as quickly and accurately as possible. Figure 1 illustrates the task procedure.

Fig. 1 Procedure of visual search tasks

Each trial had 9, 16, or 36 visual stimuli (i.e., three set sizes). A total of 96 conjunction and 96 feature search trials were administered and distributed evenly across the three set size conditions. Across the conditions, the target was present in half of the trials. Throughout the task, the children were asked to place the left index finger on the “F” key of the keyboard and the right index finger on the “J” key. They were asked to press the “F” key when the target was present and the “J” key when the target was absent in half of the trials and do the reverse in the other trials. After successfully completing 10 practice trials, the children began the real trials. The task took around 15 min to complete. Reaction times and accuracy rates were recorded, and accuracy feedback was shown at the end of each trial. The split-half reliability was satisfactory (α = .90 and .89 for the feature and conjunction search, respectively).
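The trial structure described above can be sketched in code. The following is a minimal illustration, not the authors' experiment script: the set sizes and trial counts come from the text, whereas the function name `build_trials`, the balancing of target presence within every set size, and the fully intermixed random order are assumptions made for the example. The response-key mapping is omitted because the text does not say how it was scheduled.

```python
import random
from collections import Counter

SET_SIZES = (9, 16, 36)        # number of stimuli per display (from the text)
TRIALS_PER_SEARCH_TYPE = 96    # 96 conjunction and 96 feature trials (from the text)


def build_trials(seed=0):
    """Return a shuffled list of trial descriptions for one hypothetical session."""
    per_cell = TRIALS_PER_SEARCH_TYPE // len(SET_SIZES)  # 32 trials per set size
    trials = []
    for search_type in ("conjunction", "feature"):
        for set_size in SET_SIZES:
            for i in range(per_cell):
                trials.append({
                    "search_type": search_type,
                    "set_size": set_size,
                    # Assumption: target presence is balanced within each cell.
                    "target_present": i < per_cell // 2,
                })
    # Assumption: all trials are intermixed in one random order.
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials()
cell_counts = Counter(
    (t["search_type"], t["set_size"], t["target_present"]) for t in trials
)
```

Under these assumptions, each combination of search type, set size, and target presence contains 16 trials, for 192 trials in total.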

Expressive vocabulary

We used a vocabulary definition task to measure expressive vocabulary (McBride-Chang et al., 2008). The tester presented 30 words in order of increasing difficulty. After each word was sounded out by the tester, the children were asked to explain its meaning orally. The tester assigned a score of 2 to an accurate answer, 1 to a partially correct answer, and 0 to an irrelevant or wrong answer. Interrater reliability was .95, and disagreements were resolved through discussion. Test reliability was good (α = .85).

Morphological awareness

The morphological awareness task comprised two components (Liu et al., 2017). The first component measured compound production. The tester described a nonexistent object, and the children were asked to create a word to name it. A sample item was “專門用來切石頭的刀應該叫做什麽?” (What should we call a knife particularly used for cutting stones?), and the correct answer was 切石刀 (/cit3 sek6 dou1/, cut stone knife). To create a word to name the object, the children must recognize the right morphemes and arrange them in a proper compounding structure. Answers were scored on a 5-point scale ranging from 0 to 4: a score of 4 was given to answers that included all the key morphemes in the correct positions without redundant morphemes, and a score of 0 was assigned when no answer was given or the answer did not contain the key morphemes. Interrater reliability was .96. This component contained 31 items (α = .87). The second component of the task assessed homophone identification. The tester orally presented a target word (“子女”, /zi2 neoi5/, children) and two optional words (“子孫”, /zi2 syun1/, offspring and “子弹”, /zi2 daan2/, bullet) that share one syllable and one character. The children were asked to select the one of the two options that shared the same morpheme with the target word (the answer was “子孫”). This component included 33 items (α = .82). We transformed the scores of the two components to Z scores and averaged them to produce a morphological awareness score.
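The composite described in the last sentence can be illustrated with a short sketch. The raw scores below are invented for the example, the function names are hypothetical, and the use of the sample standard deviation is an assumption, as the text does not say which variant was used.

```python
from statistics import mean, stdev


def z_scores(raw):
    """Standardize raw scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(raw), stdev(raw)
    return [(value - m) / s for value in raw]


def morphological_awareness(compound_production, homophone_identification):
    """Average the two standardized components child by child."""
    z_compound = z_scores(compound_production)
    z_homophone = z_scores(homophone_identification)
    return [(a + b) / 2 for a, b in zip(z_compound, z_homophone)]


# Hypothetical raw scores for five children (not data from the study).
compound = [60, 72, 85, 90, 101]     # possible range 0-124 (31 items x 4 points)
homophone = [18, 20, 25, 27, 30]     # possible range 0-33
composite = morphological_awareness(compound, homophone)
```

Standardizing before averaging gives the two components equal weight even though their raw score ranges differ.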

Chinese character reading

We used the Chinese character recognition task, in which the children were asked to read 100 Chinese characters aloud (McBride-Chang et al., 2003). The characters were presented in order of increasing difficulty, and the children were asked to read them sequentially. The task stopped when the children failed to read 15 consecutive characters. The task score was the total number of correctly read characters. Test reliability was good (α = .97).
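The discontinuation rule and scoring described above can be expressed as a small function. This is a minimal sketch: the function name and the representation of responses as booleans in presentation order are assumptions made for the example.

```python
def score_character_reading(responses, stop_after=15):
    """Count correct readings until `stop_after` consecutive failures occur.

    `responses` holds one boolean per character in presentation order
    (True = read correctly). Characters after the stopping point are not
    administered and therefore never scored.
    """
    correct = 0
    consecutive_failures = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures == stop_after:
                break
    return correct


# Hypothetical child: 40 correct, then 15 failures in a row, so testing stops.
stopped_early = score_character_reading([True] * 40 + [False] * 15 + [True] * 45)
# Hypothetical child: only 14 failures in a row, so all 100 items are given.
completed = score_character_reading([True] * 10 + [False] * 14 + [True] * 76)
```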

Word reading fluency

In the word reading task, the children were asked to read as many words as possible within 45 s (Pasquarella et al., 2015). The task comprised a set of 104 Chinese words, consisting of 21 single-character words, 76 two-character words, 5 three-character words, and 2 four-character words. All the words were carefully selected to be both frequent and age-appropriate for the children participating in the study. Additionally, prior to the main assessment, the children were provided with eight practice words to familiarize themselves with the task. The tester assigned 1 point for a correctly read word and 0 points when the children skipped, misread, or failed to read a word within 3 s. Because only the total score was recorded, the reliability indicator could not be calculated.

Reading comprehension

The reading comprehension task consisted of two passages (e.g., Liu et al., 2015). The first passage (Mend the Door After the Sheep Have Been Stolen) is a traditional fable story including 139 characters, and the second passage (The Great Wall in China) is an expository text with 246 characters. After each passage, the children were presented with six multiple-choice questions. The answers to these questions were not explicitly stated, requiring the children to infer them from the text content. The task score was the total number of correct answers (α = .50).

Nonverbal intelligence

Raven’s Progressive Matrices-Parallel (Raven et al., 1996) was adopted to measure nonverbal intelligence. To reduce testing time and participant burden, we used Sets A and B only, which consist of 24 items (α = .77).

Results

Overview of data analysis

We conducted a correlation analysis to explore the common variance in all of the measures. As shown in Table 1, most of the variables were significantly correlated with each other, indicating their suitability for SEM analysis. The feature search indicators were not related to Chinese reading skills. Moreover, accuracy in the conjunction search was more strongly correlated with reading skills than reaction time. In line with past research (Liu et al., 2023a, 2023b), we retained only accuracy in the conjunction search to represent visual search skill in the subsequent analysis.

Table 1 Correlation among all variables measured at T1 and T2

We used Mplus 7.4 (Muthén & Muthén, 1998–2011) for SEM to examine whether and how working memory and visual search are related to Chinese reading comprehension. On the basis of the SVR, we modeled linguistic comprehension using measures of expressive vocabulary and morphological awareness and decoding using tests of Chinese character reading and word reading fluency. As shown below, our analysis revealed that within the measurement model, each observed test score loaded significantly onto its corresponding latent factor as hypothesized. Moreover, linguistic comprehension and decoding were modeled as mediating factors through which the two cognitive skills predicted reading comprehension. The goodness of fit of the data was evaluated based on the comparative fit index (CFI), chi-square divided by the degrees of freedom (χ2/df), standardized root mean square residual (SRMR), and root mean square error of approximation (RMSEA). According to previous research, a well-fitting model has values of .95 or greater on CFI, a χ2/df value of 5 or less, an SRMR value of .08 or less, and an RMSEA value of .05 or less (e.g., Hu & Bentler, 1999).
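The cutoff criteria listed above can be collected into a simple check. The sketch below is illustrative only and is not part of the Mplus analysis; the function name `evaluate_fit` is hypothetical, and the example call uses the fit statistics reported for Model 1 in this section.

```python
def evaluate_fit(chi2, df, cfi, rmsea, srmr):
    """Compare fit statistics with the cutoffs named in the text."""
    return {
        "chi2/df <= 5": chi2 / df <= 5,
        "CFI >= .95": cfi >= 0.95,
        "RMSEA <= .05": rmsea <= 0.05,
        "SRMR <= .08": srmr <= 0.08,
    }


# Values reported for Model 1.
model1_fit = evaluate_fit(chi2=21.44, df=16, cfi=0.98, rmsea=0.049, srmr=0.065)
model1_acceptable = all(model1_fit.values())
```

Such cutoffs are rules of thumb rather than strict tests, so the check is best read as a summary of the reported indices, not as a decision procedure.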

Effects of domain-general cognitive skills on reading comprehension

In Model 1, we set linguistic comprehension and decoding as mediators through which working memory and visual search exerted an effect on reading comprehension, with linguistic comprehension exerting an effect on decoding. We controlled for age and nonverbal IQ in the model. The model fit was good, χ2 (16) = 21.44, p = .16, χ2/df = 1.34, CFI = .98, RMSEA = .049, and SRMR = .065. All of the hypothesized paths were significant, except that the direct effects of working memory and visual search skill on reading comprehension were nonsignificant. The absence of the direct effects was supported by an analysis using the bootstrapping procedure. As shown in Table 2, the 95% confidence interval (CI) of the direct effect of working memory included 0 (95% CI [−.12, .19]) and so did that of visual search (95% CI [−.08, .20]).
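The logic of a bootstrapped confidence interval for an indirect effect can be illustrated with a deliberately simplified sketch. It is not the latent-variable model estimated in Mplus: it uses a single observed mediator, ordinary least squares, a percentile interval, and simulated data with no true direct effect. All names, the seeds, and the number of resamples are assumptions made for the example.

```python
import random


def _cov(u, v):
    """Sample covariance of two equally long lists."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_and_direct(x, m, y):
    """Return (a*b, c') for m = a*x + e1 and y = b*m + c'*x + e2 (OLS)."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    return a * b, c_prime


def bootstrap_ci(x, m, y, n_boot=500, seed=1):
    """Percentile 95% CIs for the indirect and direct effects."""
    rng = random.Random(seed)
    n = len(x)
    indirect, direct = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ab, c_prime = indirect_and_direct(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
        )
        indirect.append(ab)
        direct.append(c_prime)

    def percentile_interval(values):
        values = sorted(values)
        return values[int(0.025 * len(values))], values[int(0.975 * len(values)) - 1]

    return percentile_interval(indirect), percentile_interval(direct)


# Simulated data (not the study's data): full mediation, no direct effect.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(202)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]
(ind_lo, ind_hi), (dir_lo, dir_hi) = bootstrap_ci(x, m, y)
```

An effect is treated as nonsignificant when its interval contains 0, which is the criterion applied to the direct effects above.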

Table 2 Standardized indirect and direct effects and confidence intervals of all models

Given the nonsignificant direct effects in Model 1, we first removed the direct effect of working memory in Model 2. The values of the fit indices were χ2 (17) = 21.58, p = .20, χ2/df = 1.27, CFI = .99, RMSEA = .044, and SRMR = .065. Compared with Model 1, the fit of Model 2 did not deteriorate significantly, Δχ2 = .14, Δdf = 1, p = .71, suggesting that the simpler Model 2 fit the data similarly. Moreover, in Model 3, we removed the direct effect of visual search skill on reading comprehension, χ2 (18) = 22.28, p = .22, χ2/df = 1.23, CFI = .99, RMSEA = .041, and SRMR = .066. This model fit the data similarly to Model 2, Δχ2 = .70, Δdf = 1, p = .40. We accepted Model 3 as the final model. Its standardized path coefficients are illustrated in Fig. 2. The results of bootstrapping indicated a full mediation model. Specifically, after the effects of age and nonverbal IQ were controlled for, we observed that working memory predicted linguistic comprehension, followed by decoding and then reading comprehension. Visual search indirectly affected reading comprehension through decoding.
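The nested-model comparisons reported above can be checked by hand. The sketch below assumes an ordinary (unscaled) chi-square difference test, which is appropriate for standard maximum likelihood estimation; if a robust estimator was used, a scaled difference test would be needed instead. The function names are hypothetical, and the code handles only the one-degree-of-freedom case that occurs here.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p) for two nested models."""
    d_chi2 = chi2_restricted - chi2_full
    d_df = df_restricted - df_full
    if d_df != 1 or d_chi2 < 0:
        raise ValueError("this sketch supports only a non-negative 1-df difference")
    return d_chi2, d_df, chi2_sf_df1(d_chi2)


# Model 2 (direct WM path removed) versus Model 1.
d21, df21, p21 = chi2_difference(21.58, 17, 21.44, 16)   # p is about .71
# Model 3 (direct visual search path also removed) versus Model 2.
d32, df32, p32 = chi2_difference(22.28, 18, 21.58, 17)   # p is about .40
```

Both p values are well above .05, which matches the conclusion that removing each direct path did not significantly worsen model fit.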

Fig. 2 Standardized path coefficients of the final model (Model 3). Note. The model was controlled for age (T1) and nonverbal IQ (T2), which were nonsignificant predictors of reading comprehension. For simplicity, these effects are not plotted. **p < .01, *p < .05, †p < .06

A closer look at Model 3 indicated that the effect of linguistic comprehension on reading comprehension was marginally significant (p = .055) and that the indirect effect from working memory to reading comprehension through linguistic comprehension was nonsignificant (Table 2). We believe that this result is partly due to the inclusion of nonverbal IQ (a nonsignificant control variable), which is similar to working memory in terms of tapping into general information processing ability. We removed this control variable in Model 4 and observed an indirect effect of working memory through linguistic comprehension, χ2 (14) = 12.55, p = .56, χ2/df = .90, CFI > .99, RMSEA < .001, and SRMR = .042.

Discussion

The findings of the present longitudinal study revealed that working memory and visual search skill predicted Chinese reading comprehension simultaneously and longitudinally. These two domain-general cognitive abilities affected reading comprehension through distinct pathways. Grounded in the SVR, our model indicated that the effect of working memory was fully mediated by linguistic comprehension but not decoding. However, visual search skill exerted an indirect effect through decoding only. The full mediation model suggested that these general abilities did not have direct effects on reading comprehension beyond those of decoding and linguistic comprehension.

Effect of working memory on reading comprehension

After controlling for nonverbal IQ and age, we observed that working memory had a concurrent correlation with linguistic comprehension and decoding as well as a longitudinal association with reading comprehension. The findings support those of previous studies reporting a link between working memory and linguistic comprehension (Florit et al., 2009; Was & Woltz, 2007), decoding (Christopher et al., 2012; Yang & Qiao, 2021), and reading comprehension (Carretti et al., 2009). The mechanisms through which working memory affects reading comprehension are of particular interest. Working memory was found to affect reading comprehension through linguistic comprehension but not decoding. This result contradicts Kim’s (2017) finding that linguistic comprehension and decoding both mediated the effect of working memory on reading comprehension (also see Jiang & Farquharson, 2018). A potential explanation for the divergent results is that our model included visual search ability, which demonstrated a stronger effect on decoding (β = .21, p < .01) than working memory (β = .16, p < .05), consistent with prior research (Franceschini et al., 2012; Vidyasagar & Pammer, 2010). With visual search ability accounting for more variance in decoding, the indirect pathway from working memory to reading comprehension via decoding was non-significant in our model. The result pattern underlines the importance of examining the relative contributions of multiple cognitive skills within a single comprehensive model. Multivariate structural analysis enables all effects to compete with each other in tested models, revealing dominant contributors to separate outcomes. Our model suggested that working memory affects reading comprehension primarily through linguistic comprehension. In this study, we examined only verbal working memory. If we had evaluated visuospatial working memory, it might have demonstrated a relationship with decoding (Wang & Gathercole, 2013; but see Yang & Qiao, 2021). Further research is warranted to elucidate the influence of working memory on decoding while accounting for the impact of visuospatial attention.

There are some nuances in the conclusion drawn regarding the effect of working memory. On the basis of the SVR, we hypothesized one-layer mediation in which working memory predicts linguistic comprehension, which ultimately impacts reading comprehension. Alternatively, we explored two-layer mediation by adding decoding between linguistic comprehension and reading comprehension. The results showed that the one-layer mediation was non-significant, whereas the two-layer mediation via decoding was significant. Notably, the path from decoding to reading comprehension was strong and positive (β = .36, p < .01), but the path from linguistic comprehension to reading comprehension was only marginally significant (β = .25, p = .05). A plausible explanation for these results lies in the ages of our participants, who were first-grade students. Young readers who are in the early stages of developing their decoding abilities often face greater difficulties in converting written text into spoken language than in comprehending the meaning of the words and sentences they decode. In other words, the ability to decode plays a more significant role than linguistic comprehension in the reading comprehension performance of beginning readers (García & Cain, 2014; Hjetland et al., 2017). Another possibility is that linguistic comprehension was measured using word-level tasks, whereas reading comprehension was measured using text-level assessments. Given this difference in level of analysis, it is possible that linguistic comprehension showed a relatively weak relationship with reading comprehension. A further possible explanation is that the inclusion of nonverbal IQ in the model could have disrupted the expected pattern of results by accounting for overlapping variance across predictors and outcomes. Catts et al. (1999) demonstrated that the selection of different control variables can affect the pattern of relationships and argued that IQ should not be controlled for because it is a highly general measure whose relationship with reading is unclear. Moreover, IQ is believed to be related to working memory because of the demand that people engage attention control while performing these tasks (Shipstead et al., 2015). When IQ was removed (Model 4), both the one- and two-layer mediating effects of working memory were significant.

Effect of visual search on reading comprehension

The present findings provide confirmatory evidence for the visual attention theory of reading (Vidyasagar & Pammer, 2010). Our results indicated that visual search ability predicted decoding and reading comprehension; this finding is in line with those of many previous studies on alphabetic languages (Casco et al., 1998; Franceschini et al., 2012; Plaza & Cohen, 2007; Sireteanu et al., 2008; Solan et al., 2007) and the Chinese language (Liu & Liu, 2020; Yu et al., 2018). Moreover, the effect of visual search was independent of that of working memory. As a cognitive skill in the visual modality, visual search ability plausibly affects reading comprehension through decoding rather than linguistic comprehension. Interestingly, the impact of visual search ability on reading cannot be attributed solely to processing speed, although visual search ability is often assessed through search speed. Consistent with previous research (e.g., Lallier et al., 2013; Liu et al., 2019; Sireteanu et al., 2008), our findings indicate that not all types of visual search are impaired in poor readers. Specifically, while both feature and conjunction search abilities involve processing speed, only conjunction search speed was found to be associated with reading. Therefore, we can infer that visual search plays a role in reading acquisition beyond that of general processing speed.

In addition to the indirect effect of visual search ability through decoding, we expected a direct effect because visual search ability may affect reading comprehension beyond facilitating the processing of individual words. In particular, a study of Chinese reading found that, in addition to decoding, word detection (detecting familiar words embedded in passage-like displays) mediated the effect of visual search on reading comprehension, possibly due to the visual features of Chinese text (Liu & Chen, 2020). However, we identified only an indirect effect, which is consistent with the finding of another study based on the SVR indicating that visual search exerted an indirect effect among typical readers but a direct effect among children with reading disorders (Lancaster et al., 2021). In their model, Liu and Chen (2020) did not include linguistic comprehension, which may be conceptually related to word detection (e.g., both involve word knowledge). Thus, in the presence of linguistic comprehension in the model, no effect of visual search on reading comprehension was observed other than its indirect effect through decoding.

Limitations

One limitation was that our measure of linguistic comprehension focused on vocabulary knowledge and morphological awareness at the word level, without directly assessing listening comprehension via text-level tasks. Prior SVR research supports these constructs as subcomponents of linguistic comprehension (Braze et al., 2016; Hoover & Tunmer, 2020; Kieffer et al., 2016). Our model fit indicates that the current operationalization of linguistic comprehension was sound. Nevertheless, future studies could complement our findings by additionally measuring listening comprehension. We also acknowledge that the reading comprehension task used in our study showed low reliability. The low reliability of this type of reading comprehension task aligns with the findings reported by Ho et al. (2017) and Zhang et al. (2014). Consequently, the findings should be interpreted with appropriate caution.

We measured cognitive and linguistic factors at grade 1 and reading comprehension at grade 2. Ideally, the predictor, mediating, and outcome variables should be measured at three time points to assess mediating relationships, preferably after controlling for autoregressors. Including autoregressive effects would inform whether working memory and visual search skills can predict the growth in reading comprehension ability.

Implications and future directions

The present research has two theoretical implications. Firstly, our findings contribute to the simple view of reading framework by demonstrating that working memory and visuospatial attention represent foundational cognitive processes on which reading comprehension is built. These lower-level skills indirectly influence reading comprehension through their relationships with linguistic comprehension and decoding. Specifically, the SVR posits that decoding and linguistic comprehension predict reading comprehension. The literature has suggested that working memory and visual search ability are also related to reading comprehension. However, how these general cognitive skills fit within the SVR is yet to be fully understood. Our study extends the theory by evaluating whether cognitive skills contribute to reading comprehension within this theoretical model. The findings support the notion of the SVR that the effects of cognitive skills on reading comprehension are fully mediated by decoding and linguistic comprehension. Similarly, previous studies have demonstrated that cognitive abilities are not independent precursors of reading comprehension (Ho et al., 2017; Kershaw & Schatschneider, 2012; Taboada Barber et al., 2021), including working memory (van Wingerden et al., 2018) and visual search (Lancaster et al., 2021), suggesting that certain mediators may fully explain their impacts. Whereas previous studies have mostly assessed the impact of working memory or visuospatial attention separately, our research considered the roles of both cognitive abilities simultaneously. Future research along this line could use multiple measures of working memory and visual search to explore how separate aspects of these constructs relate differently to reading comprehension. For example, semantic working memory was found to be more critical than phonological working memory (Nouwens et al., 2017). Moreover, future studies should simultaneously evaluate visual search and other components of visuospatial attention, such as visual attention span and orientation. Secondly, this study lends support to the broader theoretical perspective that domain-general cognitive processes facilitate reading development by shaping the acquisition of literacy-specific abilities. This stream of research provides insights into how reading-related abilities are built upon the basic cognitive system (Knoop‐van Campen et al., 2018; Yang et al., 2019).

Our findings provide initial evidence that general cognitive skills such as working memory and visuospatial attention contribute indirectly to reading comprehension. If replicated, assessing these fundamental skills could facilitate early identification of children at risk for reading difficulties, enabling timely intervention. Regarding interventions, while preliminary research has explored the potential for working memory and visual search training to improve reading outcomes (Dahlin, 2011; Franceschini et al., 2013; Morrison & Chein, 2011), questions remain about their efficacy compared to targeting literacy-specific processes more directly. Interventions focusing on linguistic comprehension and decoding may yield better outcomes as they strengthen abilities proximal to reading performance. However, for students demonstrating insufficient response to such domain-specific training, practitioners might consider supplemental exercises targeting the underlying general cognitive processes that influence reading development, such as working memory and attention (Arrington et al., 2014). A multi-pronged approach strengthening both foundational cognitive skills and literacy-specific abilities may help optimize intervention effects for struggling readers by bolstering the full range of competencies critical for reading comprehension.

In conclusion, verbal working memory and visual search ability contribute to Chinese reading comprehension. Moreover, their effects are fully mediated by linguistic comprehension and decoding, supporting the SVR and demonstrating that different precursors underlie the development of decoding ability and linguistic comprehension. Furthermore, our study supports the view that general cognitive skills affect reading development by influencing reading-specific skills. In practice, children with poor working memory or visual search ability may require additional support to determine the meaning of written text.