Introduction

The Child Behavior Checklist (CBCL), a parent report checklist, measures various emotional and behavioral problems (Achenbach & Rescorla, 2000). The CBCL is widely used because it allows the gathering of information quickly to determine multiple symptoms in different age groups. Rescorla (1988) identified one of the factors that emerged in her analysis of the CBCL as autistic/bizarre. This finding has strengthened the idea that the CBCL can be implemented in the screening of children with suspected autism spectrum disorder (ASD). Accordingly, subsequent studies have begun to investigate the potential of the CBCL in identifying childhood and adolescent ASD. Some studies have provided evidence that the CBCL helps identify children with ASD (e.g., Ooi et al., 2011; Pandolfi et al., 2014; So et al., 2013).

The CBCL is designed to obtain statements from parents regarding their child’s problems and competencies. The 1991 version of the CBCL was normed for ages 4 to 18 (CBCL/4–18; Achenbach, 1991), and the 2001 revised version was normed for ages 6 to 18 (CBCL/6–18; Achenbach & Rescorla, 2001). The CBCL has been translated into more than 90 languages and has been used in numerous studies (Hartini et al., 2015).

Co-occurring Emotional and Behavioral Disorders (EBD) in Children with ASD

ASD is classified in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) as a heterogeneous group of neurodevelopmental disorders characterized by persistent deficits in social communication and interaction accompanied by restricted, repetitive patterns of behavior, interests, or activities (American Psychiatric Association, 2013). Children with ASD constantly want to maintain their routines and engage in stereotypical behavior (Chebli et al., 2016; Jiujias et al., 2017). Hyperactivity and attentional and behavioral problems are frequently observed in children with ASD (Lian et al., 2022). They may exhibit behaviors such as harming themselves and their surroundings, not accepting rules, not following instructions, having anger outbursts, screaming, yelling, and physically attacking others (O’Connor & Kirk, 2008). At the same time, children with ASD may face difficulties in friendships, depression, anxiety, peer bullying, and loneliness (Helles et al., 2015; Neary et al., 2015; Spain & Blainey, 2015).

Children with ASD commonly present with one or more co-occurring EBDs. These may include specific disorders classified in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (American Psychiatric Association, 2013) and the International Classification of Diseases, 11th Edition (World Health Organisation, 2019). EBD assessment in children with ASD should be an ongoing, broad-based process that includes both child characteristics and contextual variables. The resulting data will inform a comprehensive intervention that includes skills training for youth and families as well as environmental supports and modifications (Matson et al., 2009). Identifying children’s behavioral and emotional risk and providing intervention services can ameliorate symptoms and decrease the likelihood of adverse outcomes (Lane & Menzies, 2005; Walker & Shinn, 2002). Unfortunately, many students with emotional or behavioral problems are not identified and therefore lose the critical opportunity for intervention. Developing strategies to identify and treat students with emotional and behavioral risk remains an important task for researchers in special education (Horwitz et al., 2003, 2007; Kataoka et al., 2002). EBDs may lead to adverse outcomes when they are not diagnosed in childhood (Beard & Sugai, 2004), and they may negatively affect children’s social and academic development. If no intervention is provided for EBDs, both the child and their environment will suffer (Glover & Albers, 2007), and the issues can persist into adulthood (Weikart, 1998). The intensity and continuity of EBDs may vary depending on the situation, so they must be examined by considering their occurrence and frequency (Gimpel & Holland, 2003).

Individuals with ASD often experience additional health and behavior problems (Maskey et al., 2013). These individuals experience high levels of emotional and behavioral difficulties beyond the symptoms of ASD (Brereton et al., 2006), potentially showing emotional, social, peer-related, and behavioral problems (Goodman et al., 2010). Despite differences in definitions and measurements of emotional and behavioral problems, consistent results have been obtained using various measurement tools (Charman et al., 2015). Early detection and intervention to minimize the effects of EBDs in children with ASD increases the probability of achieving positive results. For accurate determination, reliable and valid measurements should be used. This requires a wide variety of methods and measurements that are psychometrically valid and developmentally sensitive. Although more studies are needed, empirical studies support the CBCL/6–18 as one of the best-studied EBD rating scales in individuals with ASD (Pandolfi & Magyar, 2014).

Child Behavior Checklist for Ages 6–18 (CBCL/6–18)

Researchers and clinicians should regularly check for possible behavioral and emotional symptoms in various contexts. Rating scales and behavioral checklists are helpful and practical tools for detecting a child’s symptoms. They can be used for screening and well-informed diagnosis (Lempp et al., 2012). The Achenbach System of Empirically Based Assessment is widely used to determine dimensional psychopathology in school-age children (Achenbach & Rescorla, 2001). The Child Behavior Checklist for Ages 6–18 (CBCL/6–18) is one of the most frequently used dimensional instruments for screening and diagnosing emotional and behavioral problems in children (Braet et al., 2011). The CBCL, which is based on observation of children in relevant situations, comprises different versions of child, parent, and teacher instruments, according to the child's age, to allow for multi-informant assessment (Achenbach & Rescorla, 2007).

Parent-report behavioral rating scales are generally used instead of comprehensive diagnostic interviews owing to their relative brevity and cost-efficiency. The purpose of the CBCL is to gather parent reports regarding their children’s problems and competencies (Dumenci et al., 2004). Using parent reports on the previous six months, it evaluates 118 different emotional and behavioral problems (Berube & Achenbach, 2004). The CBCL includes a correlated eight-factor structure of syndromes that are designated as follows: Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Rule-Breaking Behavior, and Aggressive Behavior (Achenbach & Rescorla, 2001).

The CBCL is a parent report measure that evaluates observed functioning in internalizing and externalizing symptom domains. It offers scores along broad-band scales and narrow-band syndrome scales that were empirically developed and derived through factor analysis (Achenbach & Rescorla, 2001). The CBCL, a reliable and cost-effective parent-rated measurement tool for children and adolescents, is an empirically derived rating scale constructed through a series of quantitative analyses to determine the overlap of behavioral traits to capture specific dimensions of psychopathology. It has shown excellent reliability and validity in clinical and non-clinical populations (Pauschardt et al., 2010).

The CBCL measures emotional and behavioral problems in children and can be used as a screening tool for ASD in clinical settings (Achenbach & Rescorla, 2013). It is being used in an increasing number of studies on children with ASD, mainly to evaluate the types and correlates of emotional and behavioral problems in this population (González & Stern, 2016; Hirata et al., 2016; Ross & Cuskelly, 2006; Samson et al., 2015; Son et al., 2015; Uğurlu & Eratay, 2022; Xu et al., 2014; Wade et al., 2014). The CBCL accurately distinguishes behaviors and characteristics related to ASD (Matson & Cervantes, 2014). In particular, the Withdrawn/Depressed, Social Problems, and Thought Problems syndrome scales are practical and effective in distinguishing school-age children (ages 6–18) with ASD from those without ASD (Biederman et al., 2010; Ooi et al., 2010). At the item level, one study found that ten items were most predictive of ASD: acting young, obsessions, daydreaming, preferring to be alone, being clumsy, repeating acts, speech problems, staring, behaving strangely, and being withdrawn (So et al., 2013). Another study found that nine items were most predictive: acts young, does not get along with other kids, fears specific animals and situations, prefers to be alone, nervous, repeats acts, has speech problems, behaves strangely, and is withdrawn (Ooi et al., 2010).

The Present Study

Some studies have examined the psychometric properties of the CBCL/6–18 scores in different samples (cultures or groups) (Pandolfi et al., 2012; Penelo et al., 2017; Seleem et al., 2023). Most of these studies used Confirmatory Factor Analysis (CFA) or Exploratory Factor Analysis (EFA) in the Structural Equation Modeling (SEM) framework to test psychometric properties, such as factor structures across different groups. The factor structure of the Turkish version of the CBCL/6–18 was validated using the CFA approach (Dumenci et al., 2004). Another framework, Item Response Theory (IRT), offers different approaches for testing the psychometric properties of measurement instruments. The main purpose of the present study was to verify the functioning of the rating scale categories of the CBCL/6–18 in a Turkish parent sample. The Partial Credit Model (PCM; Wright & Masters, 1982) was used to examine the psychometric properties of the CBCL/6–18 items among children with ASD based on parent ratings. The PCM can be considered an extension of the one-parameter logistic model (1PL) in the IRT framework and has Rasch model features such as person-level and item-level parameters (Embretson & Reise, 2000).

SEM approaches (e.g., CFA and EFA) focus on the relationships between observed variables and latent factors, testing predefined factor structures, theory, or measurement invariance among groups (Brown, 2015). The PCM, in contrast, focuses on individual item responses and their ordered categories, modeling the probability of each response. It helps clarify how items function and improves measurement precision (Embretson & Reise, 2000). When researchers want to determine how well the response categories of each item separate different trait levels, the PCM is useful for designing and analyzing measurement instruments.

The PCM is appropriate for analyzing attitude or personality trait items that are polytomously scored with varying category scores (e.g., 0, 1, 2, …). The CBCL/6–18 has ordinal response categories (i.e., 0, 1, and 2) that indicate the level of children’s behavior problems as rated by parents. The PCM was employed to evaluate the probability of a behavior problem being rated based on children’s behavior levels (person ability) and the intensity of the behavior problem that the item is designed to measure (item difficulty). PCM analysis allows for a thorough examination of the alignment between children’s behavior levels and parent ratings, as well as the creation of accurate measures that reflect the underlying construct of behavior problems.

The aim of this study was to investigate the psychometric properties of the subscale scores (internalizing, externalizing, and total problem) of the CBCL/6–18 across Turkish parents by applying the PCM. We assessed assumptions of the PCM (i.e., unidimensionality and local independence), reliabilities of the subscale scores, and item-person map. The research questions are as follows:

  1. To what extent do the CBCL/6–18 subscales meet the assumptions of the PCM?
  2. What are the reliabilities of the CBCL/6–18 subscales?
  3. What are the step difficulty (category intersection) levels of the CBCL/6–18 subscales?

Method

Participants

The study participants comprised 548 parents: 313 females (57%) and 235 males (43%). The ages of the participants ranged from 29 to 53. Of these, 124 parents reported on a daughter and 424 on a son. The children’s ages ranged from 7 to 14. Seventy-six (14%) of the children had multiple disabilities.

Measures and Data Collection

Behavior problems of children with ASD were assessed by parent ratings using the CBCL/6–18. Data were collected with the CBCL/6–18 scale developed by Achenbach and Rescorla (2001) and adapted to Turkish by Erol and Şimşek (2010). Each of the scale's 113 items is rated as 0 (not true), 1 (somewhat true), or 2 (very true) according to the frequency of problem behaviors in the last six months. The scale has eight syndrome sub-factors plus an Other Problems category: Anxious/Depressed (16 items), Withdrawn/Depressed (8 items), Somatic Complaints (3 items), Social Problems (11 items), Thought Problems (10 items), Attention Problems (26 items), Other Problems (7 items), Rule-Breaking Behavior (12 items), and Aggressive Behavior (20 items). Each subscale can be scored separately, and a total score can also be obtained from the scale. We investigated psychometric properties for three subscales (i.e., internalizing, externalizing, and total problems) with 85 items.

In the data collection procedure, a meeting was held with the parents at the schools, and the purpose of the research was explained. Parents who volunteered to participate in the study were informed about the CBCL/6–18 scale. Then, the scale was implemented face-to-face.

Data Analysis

Assumptions of Item Response Theory (IRT) Models

IRT models involve two key assumptions: unidimensionality and local independence (Embretson & Reise, 2000). Unidimensionality means that a single latent trait accounts for each examinee's item responses. Confirmatory Factor Analysis (CFA) with the weighted least squares mean- and variance-adjusted (WLSMV) estimator was used to test the unidimensionality of the eight sub-factors. Model-data fit was assessed with the Comparative Fit Index (CFI), the Tucker–Lewis Index (TLI), the Root Mean Square Error of Approximation (RMSEA), and the Standardized Root Mean Square Residual (SRMR). Cutoff values of CFI/TLI > .90, RMSEA < .08, and SRMR < .10 are recommended for an adequate fit (Hu & Bentler, 1999). Yen (1984) proposed a simple statistic, Q3, to check for large violations of local independence. Q3 is computed as the linear correlation between the residuals of a pair of items, with residual correlations greater than .20 in magnitude indicating problematic local dependence.
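The Q3 screening rule described above can be illustrated with a minimal sketch. It assumes that person-by-item residuals (observed score minus model-expected score) are already available from a fitted model; the function names `pearson` and `flag_q3` and the toy residuals are hypothetical and are not part of the study's analysis, which was run in R.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def flag_q3(residuals, cutoff=0.20):
    """Return item pairs whose residual correlation exceeds the cutoff in magnitude.

    residuals: dict mapping an item label to a list of residuals
               (observed minus expected score), one value per person.
    """
    flagged = []
    for item_a, item_b in combinations(sorted(residuals), 2):
        q3 = pearson(residuals[item_a], residuals[item_b])
        if abs(q3) > cutoff:
            flagged.append((item_a, item_b, round(q3, 2)))
    return flagged


# Hypothetical residuals for three items and six persons (not study data).
toy_residuals = {
    "item_a": [0.4, -0.3, 0.5, -0.6, 0.2, -0.2],
    "item_b": [0.5, -0.2, 0.4, -0.5, 0.1, -0.3],
    "item_c": [-0.1, 0.3, 0.2, -0.2, -0.3, 0.1],
}
print(flag_q3(toy_residuals))
```

In this toy example only the pair whose residuals move together is flagged; in practice the residuals would come from the calibrated model rather than being typed in by hand.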

Reliability was evaluated using three coefficients: Cronbach’s alpha (α), McDonald’s omega (ω), and ordinal alpha (Gadermann et al., 2012; Zumbo et al., 2007), the last of which is designed for ordinal response data. Values greater than .70 indicate adequacy. Additionally, based on the item calibration, the person separation index (PSI; Wright & Masters, 1982) was estimated to evaluate how well the CBCL/6–18 separated children. Higher values (> .70) indicate that the CBCL/6–18 is suited to differentiate between children with different behavior problems.
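As an illustration of the first of these coefficients, the following sketch computes Cronbach's alpha from a person-by-item score matrix using the usual formula, α = k/(k − 1) × (1 − Σ item variances / total-score variance). The function name `cronbach_alpha` and the toy ratings are assumptions made for illustration only; omega and ordinal alpha require factor-analytic or polychoric estimates and are not sketched here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of persons, each a list of item scores."""
    k = len(scores[0])
    # Sample variance of each item across persons.
    item_vars = [variance([person[i] for person in scores]) for i in range(k)]
    # Sample variance of the summed (total) score.
    total_var = variance([sum(person) for person in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1/2 ratings for six children on four items (not study data).
toy_scores = [
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [2, 1, 2, 1],
    [0, 1, 0, 0],
    [2, 2, 2, 2],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2), alpha > 0.70)
```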

Partial Credit Model

The PCM was used for item calibration. Because the PCM is an extension of the Rasch model, discrimination is assumed to be equal across items; thus, this term disappears from the model (Embretson & Reise, 2000). Assume that item i is scored x = 0, …, mi, with Ki = mi + 1 response categories. The PCM specifies the conditional probability that an examinee with latent ability θ obtains category score x on item i as

$$P_{ix}(\theta)=\frac{\exp\left[\sum_{j=0}^{x}(\theta-\delta_{ij})\right]}{\sum_{r=0}^{m_i}\exp\left[\sum_{j=0}^{r}(\theta-\delta_{ij})\right]}$$

The parameter δij is called the step difficulty parameter (De Ayala, 2009). By convention, the j = 0 term, (θ − δi0), is set to zero. A δij term can be directly interpreted as the point on the latent trait scale at which two consecutive category response curves intersect (Embretson & Reise, 2000). The category intersection parameters can be considered step difficulties associated with the transitions from one category to the next, and there are mi step difficulties (intersections) for an item with mi + 1 response categories. The CBCL/6–18 has ordinal response categories (0, 1, and 2), so we estimated two category intersection parameters per item by applying the PCM; these are reported below as β1 and β2 and correspond to the step difficulties δi1 and δi2. These parameters indicate where, on the latent trait scale, a response in one category becomes more likely than a response in the previous category.
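The interpretation of the step difficulties as intersection points can be checked numerically. The sketch below implements the PCM category probabilities for a single item and evaluates them at the two intersection values reported later for item 11 of the total problem subscale (β1 = .05, β2 = .78). The function name `pcm_probs` is hypothetical, and this is an illustration rather than the estimation code used in the study.

```python
from math import exp


def pcm_probs(theta, deltas):
    """Category probabilities for one item under the Partial Credit Model.

    theta:  latent trait level of the person.
    deltas: step difficulties [delta_1, ..., delta_m]; categories are 0..m.
    """
    cumulative = [0.0]  # the sum for category 0 is defined as zero
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]


item_11 = [0.05, 0.78]  # step difficulties quoted in the text for item 11

at_step_1 = pcm_probs(0.05, item_11)
at_step_2 = pcm_probs(0.78, item_11)
print([round(p, 3) for p in at_step_1])  # categories 0 and 1 are equally likely
print([round(p, 3) for p in at_step_2])  # categories 1 and 2 are equally likely
```

At θ = β1 the curves for categories 0 and 1 cross, and at θ = β2 the curves for categories 1 and 2 cross, which is exactly the "category intersection" reading of the parameters.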

The unweighted (i.e., outfit) and weighted (i.e., infit) mean-square (MNSQ) estimates were used to evaluate whether the items contribute efficiently to their own sub-factor. Infit and outfit estimates of 1.0 are ideal, indicating consistency between the data and the model. We considered items as misfitting when outfit or infit statistics were below 0.5 or above 1.5 (De Ayala, 2009). Infit and outfit values exceeding 2.0 indicate that the item distorts or degrades the measurement system (Wright & Linacre, 1994). Violation of unidimensionality may result in item misfit. Thus, the validation of item fit also provides evidence for establishing unidimensionality.
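A minimal sketch of how the two mean-square statistics can be computed for one item is given below, using the standard definitions: outfit is the mean of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of model variances. The person estimates, responses, and function names (`pcm_probs`, `item_fit`, `is_misfit`) are hypothetical; the study itself relied on the eRm and mirt packages in R.

```python
from math import exp


def pcm_probs(theta, deltas):
    """PCM category probabilities for one item (categories 0..len(deltas))."""
    cumulative = [0.0]
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]


def item_fit(responses, thetas, deltas):
    """Return (infit MNSQ, outfit MNSQ) for one item."""
    sq_residuals, variances = [], []
    for x, theta in zip(responses, thetas):
        probs = pcm_probs(theta, deltas)
        expected = sum(k * p for k, p in enumerate(probs))
        var = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        sq_residuals.append((x - expected) ** 2)
        variances.append(var)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(sq_residuals) / sum(variances)
    return infit, outfit


def is_misfit(infit, outfit, low=0.5, high=1.5):
    """Apply the 0.5 to 1.5 acceptance range to both statistics."""
    return not (low <= infit <= high and low <= outfit <= high)


# Hypothetical person estimates and 0/1/2 responses to one item (not study data).
thetas = [-1.2, -0.4, 0.1, 0.6, 1.3, 2.0]
responses = [0, 0, 1, 1, 2, 2]
infit, outfit = item_fit(responses, thetas, [0.05, 0.78])
print(round(infit, 2), round(outfit, 2), is_misfit(infit, outfit))
```

Because outfit is unweighted, a few highly unexpected responses from persons far from the item's location can inflate it sharply, which is why it is described as sensitive to unexpected responses.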

An item-person map was constructed to evaluate the relationship between persons and items. It maps the range of children’s abilities or traits being measured (i.e., behavior problems) against the locations of the item thresholds on the same logit scale. This provides a useful way to determine whether any items have disordered thresholds, which is an indicator that a higher category is not frequently used (Padgett & Morgan, 2020).

Linacre (2002) suggested that each category should contain at least ten observations for stable step calibrations in the Rasch model. Because items 99 and 105 had zero frequency in response category 2 (very true), they were excluded from the analysis.
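The category-frequency screen can be expressed as a short check. The sketch below counts responses per category for each item and reports categories with fewer than ten observations; the function name `sparse_categories` and the toy responses are hypothetical and only illustrate the guideline.

```python
from collections import Counter


def sparse_categories(responses, categories=(0, 1, 2), min_count=10):
    """Report response categories with fewer than min_count observations.

    responses: dict mapping an item label to a list of category scores.
    Returns a dict of item -> {category: count} for the sparse categories only.
    """
    sparse = {}
    for item, scores in responses.items():
        counts = Counter(scores)
        low = {c: counts[c] for c in categories if counts[c] < min_count}
        if low:
            sparse[item] = low
    return sparse


# Hypothetical response vectors (not study data).
toy_responses = {
    "item_x": [0] * 30 + [1] * 15 + [2] * 12,
    "item_y": [0] * 45 + [1] * 12,  # category 2 is never used
}
print(sparse_categories(toy_responses))
```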

PCM statistical analyses were conducted with R (version 4.3.2) (R Core Team, 2023), using eRm (Mair & Hatzinger, 2007) and mirt (Chalmers, 2012) packages. Mplus (version 8) (Muthén & Muthén, 1998–2012) was utilized for CFA.

Results

Unidimensionality and Local Independence

CFA was carried out to check the unidimensionality assumption for the eight sub-factors. Table 1 shows the results of the factor analysis. For the internalizing subscale, the results supported a one-factor solution for each of the three sub-factors and a higher-order factor describing the internalizing subscale (χ2 (df) = 864.95 (290), CFI = .87, TLI = .86, RMSEA = .06, SRMR = .11).

Table 1 Model-data fit indices for CBCL/6–18 subscales from CFA

For the externalizing subscale, the factor loading of item 106 in the sub-factor of rule-breaking problems was below 0.20. Thus, this item was excluded from the subsequent analysis. The results supported a one-factor solution separately in two sub-factors describing the externalizing sub-scale (χ2 (df) = 1437.07 (458), CFI = .93, TLI = .92, RMSEA = .06, SRMR = .11). For the total problem subscale, the results supported a one-factor solution separately in two sub-factors and a higher-order factor describing the total problem sub-scale (χ2 (df) = 1714.57 (586), CFI = .93, TLI = .92, RMSEA = .06, SRMR = .11).

The local independence assumption was inspected using the critical value for item residual correlations. Residual correlations for all item pairs in the three subscales were below .20 in magnitude, except for those involving item 33 in the internalizing subscale, items 20 and 81 in the externalizing subscale, and item 70 in the total problem subscale. These items were therefore excluded from the subsequent analyses, after which local independence was established for the subscales.

Reliability

The internalizing subscale reliabilities ranged from .70 to .72 for Cronbach’s α, .70 to .73 for McDonald’s ω, and .80 to .86 for ordinal alpha. The externalizing subscale reliabilities ranged from .77 to .89 for Cronbach’s α, .83 to .91 for McDonald’s ω, and .88 to .93 for ordinal alpha. The total problem subscale reliabilities ranged from .70 to .81 for Cronbach’s α, .70 to .84 for McDonald’s ω, and .78 to .86 for ordinal alpha. It can be concluded that the items of the CBCL/6–18 subscales show adequate internal consistency.

PSI was estimated at .73 for the internalizing subscale, .86 for the externalizing subscale, and .87 for the total problem subscale. These findings suggest that the CBCL/6–18 subscales are suited to differentiate between children with different levels of behavior problems based on parent ratings.
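For readers unfamiliar with the PSI, the sketch below shows one common way of computing a person separation reliability: the share of observed variance in the person estimates that is not attributable to measurement error. This is an assumption about the formula rather than a statement of exactly what the software computed, and the person estimates, standard errors, and function name `person_separation` are hypothetical.

```python
from statistics import variance


def person_separation(theta_hat, std_errors):
    """Person separation reliability: (observed variance - mean error variance) / observed variance."""
    observed_var = variance(theta_hat)  # sample variance of person estimates
    mean_error_var = sum(se ** 2 for se in std_errors) / len(std_errors)
    return (observed_var - mean_error_var) / observed_var


# Hypothetical person estimates (logits) and their standard errors (not study data).
theta_hat = [-1.5, -0.5, 0.0, 0.5, 1.5]
std_errors = [0.5, 0.5, 0.5, 0.5, 0.5]
print(round(person_separation(theta_hat, std_errors), 2))
```

Under this definition, values approach 1 when the spread of person estimates is large relative to their standard errors, which is the sense in which a higher PSI indicates better differentiation between children.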

Item Calibration with PCM

Tables 2, 3, and 4 display item fit values and category intersections (step difficulties) of items for the internalizing, externalizing, and total problem subscales, respectively. For the internalizing subscale (see Table 2), infit (i.e., ranging from .83 to 1.21) and outfit MNSQ (i.e., ranging from .61 to 1.35) values for all items were between .5 and 1.5, indicating adequate model fit. This result is considered reasonable for defining children’s behavior observations by parent ratings.

Table 2 Item fit statistics for the internalizing subscales (25 items)
Table 3 Item fit statistics for the externalizing subscales (30 items)
Table 4 Item fit statistics for the total problem subscales (35 items)

Category intersection parameters were between −.85 and 3.76 logits for step one (β1), and between −.41 and 3.08 logits for step two (β2). For example, item 14 has ordered category intersection parameters (β1 = .97 and β2 = 1.50), which indicates that each response option is the most likely response over some range of the trait. By contrast, the category intersection parameters of item 29 are not ordered (β1 = 1.22 and β2 = .86). Although the PCM models ordered response categories, the category intersections themselves are not necessarily ordered (De Ayala, 2009), and disordered intersections do not imply that the definitions of the categories are disordered (Linacre, 2002).
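The practical difference between ordered and disordered intersections can be seen by asking which categories are ever the single most probable response. The sketch below scans a grid of trait values for items 14 and 29, using the intersection values quoted above; the grid limits and the function names `pcm_probs` and `modal_categories` are assumptions made for illustration.

```python
from math import exp


def pcm_probs(theta, deltas):
    """PCM category probabilities for one item (categories 0..len(deltas))."""
    cumulative = [0.0]
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]


def modal_categories(deltas, lo=-6.0, hi=6.0, steps=1201):
    """Categories that are the most probable response somewhere on a theta grid."""
    modal = set()
    for s in range(steps):
        theta = lo + (hi - lo) * s / (steps - 1)
        probs = pcm_probs(theta, deltas)
        modal.add(probs.index(max(probs)))
    return modal


print(modal_categories([0.97, 1.50]))  # item 14: ordered intersections
print(modal_categories([1.22, 0.86]))  # item 29: disordered intersections
```

With ordered intersections, category 1 (somewhat true) is the most likely response between the two intersection points. With disordered intersections, category 1 is never the most likely response at any trait level, even though it remains a valid, ordered category.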

For the externalizing subscale (see Table 3), infit (ranging from .70 to 1.41) and outfit MNSQ (ranging from .50 to 1.50) values were between .50 and 1.50 for all but two items, indicating adequate model fit. The two exceptions, items 63 and 96 from the rule-breaking behavior sub-factor, had outfit MNSQ values greater than 1.50. This result may reflect noise in the data because the outfit MNSQ is sensitive to unexpected responses. Thus, these items were excluded from the subsequent analyses. Category intersection parameters were between −.96 and 4.77 logits for step one (β1), and between .34 and 3.20 logits for step two (β2). For example, item 2 has ordered category intersection parameters (β1 = −.63 and β2 = .34), whereas the category intersection parameters of item 39 are not ordered (β1 = 4.77 and β2 = 3.20).

For the total problem subscale (see Table 4), infit (ranging from .75 to 1.32) and outfit MNSQ (ranging from .71 to 1.49) values were between .50 and 1.50 for all but one item, indicating adequate model fit. The exception, item 64 from the social problem sub-factor, had an outfit MNSQ value greater than 1.50. This result indicates that it failed to define the same construct as the other items in the social problem sub-factor. Thus, this item was excluded from the subsequent analyses. Category intersection parameters were between −1.22 and 3.27 logits for step one (β1), and between −.92 and 2.29 logits for step two (β2). For example, item 11 has ordered category intersection parameters (β1 = .05 and β2 = .78), whereas the category intersection parameters of item 12 are not ordered (β1 = 2.55 and β2 = 1.91). Some other items (for example, 34, 38, and 79) had similarly disordered category intersections.

Person-Item Map

The person-item map displays the relationships between the estimates for persons and items. These figures offer a useful representation of how the difficulty of items corresponds to the person parameters for the fitted PCM (Padgett & Morgan, 2020). The person-item map for internalizing items is shown in Fig. 1. Category intersections are represented along with the item locations, and items with disordered category intersections are marked with a (*) on the right vertical axis. Person-item maps for the other subscales are in the Appendix (Figs. 2 and 3).

Fig. 1 Person-item map for internalizing scores

Figure 1 displays the person-item map for internalizing items. The person parameter distribution is presented at the top of the figure. Items toward the top of the person-item map were more likely to be endorsed by parents, whereas those that were less likely to be endorsed were located toward the bottom. Item 52 (sense of guilt), appearing at the bottom of the map, was the most difficult item for parents to endorse. Conversely, item 49 (constipated), at the top of the map, was the easiest to endorse. Items marked with (*) (e.g., 65, 42, 56…) had disordered category intersections. These items should be examined further (i.e., revised or removed).

Discussion

Measurement and evaluation techniques provide researchers with various ways to examine the degree to which ordinal rating scales function in psychometrically meaningful ways. When the scale operates as expected, researchers can interpret ratings in the intended way, differentiate between categories, and compare ratings across individuals (Wind, 2023). The purpose of the study was to examine the psychometric properties of the CBCL/6–18 used for defining children with ASD within a sample of Turkish parents. PCM, which is one of the polytomous IRT models, was used for a detailed examination of the parent version of the CBCL/6–18 subscales (internalizing, externalizing, and total problem).

PCM results identified two misfitting items in the externalizing subscale and one in the total problem subscale; these were excluded from subsequent analyses to improve the psychometric structure of the two subscales. After removing a total of three items, unidimensionality and local independence could be assumed for the remaining 25 items of the internalizing subscale, 28 items of the externalizing subscale, and 34 items of the total problem subscale. Reliabilities of the CBCL subscales were adequate (> .70). High PSI values (> .70) indicate that the CBCL subscales are reliable for distinguishing between different behaviors of children with ASD.

One important contribution of this study is that it provides detailed information on the step difficulty of each item from the three subscales. Category intersection parameters (step difficulties) for some items from the CBCL/6–18 subscales were disordered. Step disordering does not imply that the definitions of the categories are disordered (Linacre, 2002). Rather, it can indicate that category 1 represents a narrow segment of the latent variable (behavior problem) or that the transition from category 1 (somewhat true) to category 2 (very true) is relatively easy in parent ratings. Disordering generally arises when the frequencies of category usage exhibit an irregular pattern. For example, item 51 (dizziness) from the internalizing subscale had the highest frequency for response category 0 (not true) and the lowest frequency for category 2 (very true), resulting in step disordering. In the extreme case, items with an empty response category (two items from the externalizing subscale) were excluded from the analysis. It seems that Turkish parents in this study might not observe some problem behaviors frequently.

The person-item map presents the range of the person parameter distribution against the difficulty level of each item threshold on the same logit scale. Among the internalizing items, items from the anxious/depressed scale, for example, “sense of guilt” or “suicidality”, were less likely to be endorsed by parents (the most difficult items). One possible explanation is that parents could not easily recognize emotional disorders. The items that were easier to endorse (more likely to be endorsed by parents) were from the withdrawn scale; for example, “do not speak” and “little energetic”. The most difficult externalizing items were from the rule-breaking behavior scale, including “use bad language”, “cheating”, and “bad friends”, which were behaviors less frequently observed by parents. The most difficult total problem items were from the social and thought problem scales; for example, “hears things”, “teased”, and “lonely”, which were individual-specific behaviors less likely to be observed by parents.

Another important contribution is that all three CBCL/6–18 subscales were examined. Although the CBCL/6–18 is a widely used standardized parent rating scale, to our knowledge this is the first study to examine the psychometric properties of the parent rating version of the CBCL/6–18 subscales in a Turkish parent sample. Researchers have tested versions of the CBCL (different age groups or parent/teacher forms) in different cultures. Ivanova et al. (2007) studied the factor structure of parent ratings of the CBCL/6–18 in 30 societies, including Turkey, and the results support the use of the scale in diverse societies. Al-Hendawi et al. (2016) confirmed the factor structure of the CBCL/6–18 in a sample from Qatar, and Seleem et al. (2023) did the same in a sample of Egyptian children. Hsiao et al. (2023) examined the psychometric properties of the CBCL/1½–5 in low-income families using Rasch analysis. Another study by Pandolfi et al. (2012) examined the psychometric properties of parent or caregiver ratings on the CBCL/6–18 in a sample of youth with ASD. The present study adds to this literature by testing the scale's factor structure and examining the sensitivity of the item response categories in distinguishing children’s problem behaviors in a Turkish parent sample.

Early profiling of emotional and behavioral problems in children and adolescents enables earlier detection and more timely and less expensive intervention (Braet & van Aken, 2006; Tick et al., 2007). To recognize children and adolescents with ASD, professionals need reliable and valid screening methods for behavioral and emotional problems. However, screening through observation or interview is a significant burden, and these methods leave room for improvement in reliability and cost-effectiveness (Jensen & Weisz, 2002). Therefore, rating scales such as the CBCL/6–18 with clear cutoff points are essential for monitoring children with ASD who have emotional and behavioral problems and for selecting potential cases for further evaluation.

Limitations

This study has limitations that need to be acknowledged, and these also suggest directions for future research. The sample included children from 7 to 14 years, whereas the CBCL/6–18 is designed for ages 6 to 18. Future studies should extend the use of the CBCL/6–18 to larger and more heterogeneous samples to replicate the psychometric evaluations. The generalizability of the findings to children from other age groups and backgrounds is also limited.

Another limitation is that this study only includes data from parent ratings. The CBCL/6–18 is also available for teacher ratings. Using different versions of the CBCL would make it possible to examine each subscale for potentially non-invariant items across parent and teacher ratings. Testing measurement invariance between different raters (i.e., parents and teachers) will provide useful information to eliminate potential sources of bias and improve agreement between teacher and parent ratings. We recommend further research on the CBCL/6–18 to improve its practical usability and support inferences from its measures.

Conclusion

The current study provides validity evidence for the factor structure of the CBCL/6–18 based on parent ratings. The results indicate that the CBCL/6–18 subscales (internalizing, externalizing, and total problem) meet the assumptions of the PCM in a psychometrically satisfactory way. However, some items from different subscales are more difficult for parents to endorse. Overall, the psychometric evaluation shows that the CBCL/6–18 subscales are appropriate for detecting children’s behavior problems, especially in children with ASD.