Abstract
Elementary education is critical as it lays the foundation for learning, critical thinking, social and emotional development, individual growth, and self-confidence. Hence, studying the elementary students’ educational progress is immensely important. This manuscript aims to investigate the factors that impact elementary students’ academic performance, predict their academic performance, and use the above factors to identify the student’s appropriate skilled level group based on their academic performance. In this study, the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) score, a measurement to assess students’ early literacy skills for K-6 graders, is used to quantify students' academic performance. A sample of 185 first and second graders and their features, including student’s BMI (body mass index), amount of time the student spends watching television, student’s gender, number of hours that a student sleeps each night, amount of time that the student spends reading books, student’s race, and amount of time that the student spends on physical activity are used. Based on the regression analysis, second graders’ academic performances are significantly impacted by their BMI values (β = − 4.002, p < 0.05) and the amount of time students spend reading books (β = 29.14, p < 0.05). The first graders’ academic performances are significantly impacted by the amount of sleeping time (β = 41.89, p < 0.01) and their gender (β = − 37.129, p < 0.05). Furthermore, the experimental findings indicate that machine-learning techniques accurately predict the students' appropriate academic group. In the naive Byers classifier, students in the lowest academic performance group can be identified successfully with a sensitivity of 92%, and the students in the highest academic group can be identified with a specificity value of 100%.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
1 Introduction
For many reasons, understanding the factors that influence children’s academic skills is imperative. Research findings show that children’s cognitive and non-cognitive development influence their achievements and decisions [1].
In each generation, people’s behavior changes with the development of technology. Children at present utilize more and more time with technological gadgets than in the past; this is true even for infants [2]. In the literature, some studies investigate the relationship between the amount of time spent watching television, the amount of sleeping time, a child’s physical fitness, weight, and academic achievement [3, 4]. Anderson [5] revealed that 26% of all US children watch four or more television hours daily and that 67% watch at least two hours daily. Gortmaker [6] found that for children between the ages of 10 and 15, television viewing is a significant contributing factor for a child to be overweight. In addition, the research indicates that reducing viewing time can keep children from developing some common chronic health conditions.
Body Mass Index (BMI) is an indicator that identifies the level of obesity. Studies conducted by Trembly [7], Wingfield [8], and Cawley [9] have studied the relationship between BMI and physical fitness. Study conducted by Cawley [9] found that physical exercise lowers the BMI z-score and reduces obesity among 5th graders, especially boys. While [7] and [8] found the existence of a negative relationship between BMI and physical activity, the findings of [8] indicate that the negative relationship between BMI and physical fitness is significant across grade levels and gender. Wingfield [8] Datar [10], Taras and Potts-Datema [11], Castelli [12], and Trembly [7] considered the impact of weight and physical fitness on academic skills. Both [10] and [11] found a negative association between academic skills and obesity in school-aged children. Specifically, [10] found that the performance in mathematics and reading scores of overweight children is lower than that of non-overweight children in kindergarten. In a study using 259 public school third and fourth graders, [12] found an inverse relationship between pre-adolescents BMI and academic skills and a positive correlation between physical fitness and academic skills. Wingfield [11] revealed a correlation between the academic skills of 5th-grade females and BMI and physical fitness. In another study, [7] examined the relationships between 6th-grade students’ physical activity level, BMI, self-esteem, reading scores, and mathematics scores. In the same study, physical activity and self-esteem were positively related, and there was a trivial negative relationship between physical activity and academic achievement. Using a longitudinal study, Benner and Crosnoe [13] show a strong association between children’s academic and school racial/ethnic composition. More studies on students’ academics and race can be found [7,8,9,10,11,12,13,14]. Another study conducted by Stroebele [15] using self-reported students’ related factors of elementary students, overweight, less physical activity, food consumption habits, and the number of hours of television watching negatively impact students' academic success. Furthermore, students who slept at least 9 h showed higher academic performances.
According to [16] and [17], students’ academic performance is influenced by the motivational effects of the family, peers, and the classroom atmosphere. The influence of peers can negatively impact other students. As Ogbu [18] states, at times, some groups of students tend to interpret their academic failures as accomplishments rather than their failures. Another study conducted by Johnson [19] shows that peer effect strongly influences elementary students’ academic achievements, while family background is also an influential factor. In addition, [19] states that girls’ academic performance is better at reading and writing, while boys perform better than their counterparts in more analytical subject disciplines.
Furthermore, [19] states substantial differences in academic achievement among students of different races, though this difference narrows down as age progresses.
One of the educational practices is to group students according to their educational level. The practice of grouping students emerged as students' preparation for their future workforce [20]. Grouping students based on their education level increases their academic achievements as the instructors have the freedom to adjust the pace of the instructions to the group of nearly uniform students. Furthermore, a more challenging and higher-level curriculum can be implemented for a higher level of kids. According to [21] elementary kids' grouping becomes maximally effective when applied only to one or two subjects. Some researchers find adverse effects of within-class ability grouping. Therefore, classifying students based on their education level is valuable for educators and policymakers. A comprehensive study about grouping practices is conducted by [22].
Grouping students or classifying them based on educational performance was benefitted by the development of machine-learning techniques. In this direction, [23] utilizes various machine-learning techniques to develop a model to classify students based on their academic levels. In another study, [24] use some of the demographic characteristics of the students together with their written assignment grades to predict students’ performance in a distance learning class. Ahadi [24] conducted a study to investigate how to use machine-learning techniques to identify students who are underperforming and high-performing. In another study, Peterson [25] used machine learning algorithms to assess students’ reading levels. More applications of machine-learning techniques related to students’ academic performances can be found in [27] and [28]. Though there are numerous studies related to student’s academic performance using machine-learning techniques at various academic levels, there is a vacuum of such studies related to elementary students.
As pointed out before, it is essential to find out the relationship among the factors that influence the academic success of elementary students. Furthermore, gathering early information about the student’s educational level is helpful for many viewpoints, including the parents, instructors, and administrators.
Studying this direction, we answer this manuscript’s research questions: (1) Which variables impact first and second-graders academic success? (2) How can we predict students' academic success at an early stage? (3) How can students be classified into appropriate academic groups based on student-related variables? Therefore, the outcome of a study like this will be helpful to the parents, instructors, administrators, and literature.
2 Materials and methods
2.1 Sample
This dataset contains 185 students, including 91 first and 94 s graders. Out of the first graders, 51% are boys, and 49% are girls, while for the second graders, these percentages are 56 and 44%, respectively. As far as the racial percentages are concerned, out of the first graders, 66% of the students were white, 13% black, and the rest were from other groups, while out of the second graders, 69% were white, 24% were black, and the rest were from other groups. For this study, we recorded the following variables: the student's grade, gender, race, weight, height, the number of hours students sleep, involvement in physical activities, and reading books.
2.2 Study design
This study used a cross-sectional quantitative research methodology [29]. Data for this investigation was implemented at an elementary school in rural New Mexico, USA. The school has classes from grade 1 to grade 2, starting at 8:00 am and ending at 3:30 pm. First and second-graders curriculum consists of disciplines including Mathematics, Language and Arts, Social Studies, Science, and Physical Education.
The data for this study was collected with the approval of the students, parents, and the school. Students who provided consent were considered for this study, while others were discarded. Students, parents, teachers, and the nurses at the school assist with data collection for this study.
3 Measuring instruments
3.1 DIBELS score
Dynamic Indicators of Basic Early Literacy Skills (DIBELS) score is a measurement to evaluate the acquisition of early-stage literacy skills for K-6 grade students. Furthermore, this score measures the students’ early literacy skills development and is monitored periodically at the academic year’s beginning, middle, and end. The DIBELS score reflects students’ phonemic awareness, alphabetic principle, accuracy, fluency, and comprehension. Out of the three DIBELS scores, we use the end-of-the-year DIBELS scores to represent students’ academic performance indicators. Further information about the DIBELS score can be found in [29].
3.2 Data collection using a questionnaire
After obtaining consent from all the parties involved in this study, a questionnaire was developed and distributed to the parents of the students through the school teachers. The questionnaire tool employed consisted of the following personal and Gender: Student’s Gender: Student’s gender was recorded as 0-Female and 1-Male.
Race: Race was recorded as American Indian or Alaskan Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. Due to the fewer number of students from certain race groups, the final representation of the race data was recorded using the three categories of white (1), black (2), and other (3).
TV: Each student’s average number of hours spent on watching TV in a day Sleep: Each student’s average hours of sleep in a day. Though this variable has a continues nature, in the later part of the analysis, the variable was categorized as LESS (sleeping time is less than 9 h) and GOOD (sleeping time is between 9 and 12 h) as the research has shown higher academic results with students who make at least 9 h of sleep [15].
Books: Each student’s average number of hours of reading books in a day.
Physical: This includes each student’s duration of doing physical activities such as playing, cycling, running, etc.
Height and Weight: Each student’s height and weight were measured with the help of the school nurse.
BMI: BMI values were calculated using the heights and weights obtained according to the following equation provided by [31].
4 Data analysis
4.1 Statistical indicators
We calculate descriptive statistics such as measures of central tendency, variability, and measures of the relative position of each variable collected on the students. In addition, we observe the shape of distributions of the critical variables related to this dataset. Some of the descriptive statistics we computed are as follows. Mean, standard deviation, DIBELS scores for both boys and girls are symmetrically distributed centered about 225, though girls’ scores show less variability than the boys. The distributions of BMI values of both boys and girls are skewed to the right. Like the DIBELS scores, BMI values for girls show less variability than for boys. In addition, the distribution of sleeping times also are approximately normally distributed around 8.5 h.
4.2 Two-way ANOVA and post-hoc methods
Two-way ANOVA (Analysis of Variance) is a statistical method used to analyze the effects of two categorical independent factors on a continuous dependent variable. It determines whether there are significant interactions between the two factors and whether each factor has a significant main effect on the dependent variable. In this study, we use two-way ANOVA to identify the impact of gender and sleep duration.
Further analysis is needed to identify the impact of gender and sleep duration on the students’ DIBELS scores. Though the sleeping duration is a continuous variable, here, for the ANOVA analysis, we discretize it by categorizing it into categories, namely LESS and GOOD, as explained before.
The null hypotheses of this test are as follows:
-
The means of observations grouped by gender are the same.
-
The means of observations grouped by the duration of sleeping are the same.
-
There is no interaction between the above two factors.
The alternative hypotheses are that the means of the above groups are not the same and that there is an interaction effect.
In order to make a decision with the ANOVA test, we use the F value and p-values.
As ANOVA does not determine which levels of the main factors are the highest and the lowest, further analysis is required to identify the above. This step is called a post-hoc analysis. We use a post-hoc analysis called Tukey’s HSD to determine which factor levels are significantly different from each other.
4.3 Regression analysis
Multiple linear regression is a statistical technique used to understand the linear association between a dependent variable, which is continuous, and two or more independent variables, which are either discrete or continuous. It extends the concept of simple linear regression, where only one independent variable is used to predict the dependent variable. In this study, we used the DIBELS score as the dependent variable, and the rest of the variables we collected about students, such as gender, the time they spend reading books, sleeping, and doing physical activities, and BMI. Even with the regression analysis, we use p-values to determine the impact of each independent variable.
4.4 Machine learning models
Machine learning, a subset of artificial intelligence, comprises computer systems capable of learning and adapting without predefined instructions. In contrast to the standard statistical procedures, machine learning does not follow parametric assumptions, which enhances the applicability to various data sets from various domains. This study uses two machine learning techniques called Naïve Bayes and Decision Tree algorithms.
4.4.1 Naïve bayes (NB) model
In this study, we use the NB model to group students into their appropriate academic groups based on their characteristics belonging to the students. This classification model is constructed using the data we collected about the students. Some popular machine learning classifiers are support vector machines, k-nearest neighbors, random forests, neural networks, and NB. The NB classifier is one of the most straightforward and most accurate data-classifying algorithms. The base of this classifier is the well-known Bayes theorem, used in probability theory. The simplicity, accuracy, and robustness of NB have made NB a popular classifying technique with various applications [32]. As the literature indicates, NB is one of the top-performing classifiers used in data mining [33].
4.4.2 Decision tree algorithm
A decision tree is a sequential technique used in machine learning. It is made of three parts: decision nodes, branches, and the end nodes, called leaves, representing the outcomes. The decision tree operates like a flow chart to make the decision-making process of classifying a new object to its appropriate type or class variable. This classification process starts from the top of the tree to the bottom. Decision trees have been used as a classification tool across many domains. This model has numerous applications in students' performance prediction and classifications.
A study conducted by Pandey and Sharma [34] used various decision tree models to predict student's performances. In a similar study, [35] utilized several decision tree models to predict students' academic success. The literature indicates various applications of machine learning models in educational data mining, but it is hard to see the application in elementary student's performance analysis. In this study, given the students' characteristics, we use the decision tree model to predict whether a student can meet the benchmark for the DIBELS score.
4.4.3 Implementation of machine learning models
Machine learning models are applied in a standard pattern, as illustrated in Fig. 1. As it shows, the data set is first preprocessed, where things like handling missing values and data scaling are implemented. After that, the dataset is divided into training and testing data. Usually, 70% of the data are randomly selected for the training, and the remaining 30% are used for the testing data. Next, a candidate model is proposed and trained on the training data before it is tested using the testing data. The candidate model is fine-tuned by using hyper-parameterization, and finally, the best model is selected.
4.4.4 Implementation of the multi-level classification
This section aims to develop a machine-learning model to classify students into their respective academic groups based on student-related features. As we do not have existing groups, students are classified into four groups based on their DIBELS scores, and the model is developed to predict the new students’ group. The student is classified into the “Well Below (WB)” group if the student’s score is lower than the first quartile. If the score is between the first and the second quartiles, the student is classified into the group "Below, (B).” Similarly, if the student’s score is between the second and the third quartile, the student is classified into the “Above, (A)” group, while those who have a score above the third quartile are classified in the “Well Above, (WA)” group. We use the NB classifier to develop this classification model.
After randomly splitting the data into training and testing, the training dataset is used to train the model. In the testing dataset, the class variable (appropriate group) is predicted using the student-related variables and compared with the actual values in the class variable. Unlike the binary classification, the class has four levels (WB, B, A, and WA). Therefore, assessing the model’s accuracy is slightly different due to the multi-class situation. Hence, sensitivity and specificity are defined, as shown in Tables 1 and 2. In addition, each group does not have an equal number of subjects, which means we have an unbalanced group here. The balanced accuracy can be used as a performance indicator in such a situation.
4.4.5 Performance indicators of the classification model
When evaluating machine learning models, the confusion matrix plays a significant role. Let us assume the class variable has two possible values (binary classification), namely the positive and negative. As Fig. 2 shows, the confusion matrix summarizes the values of the actual and the predicted class variables (positive/negative). Most performance indicators of machine learning models are defined on the confusion matrix. This study uses the following performance indicators: accuracy, balance-accuracy, sensitivity, and specificity.
Accuracy is a metric to quantify the classification of a model, which represents the ratio between the total positive predictions and the total instances. The precision indicates the proportion of true positive values and the total positives. Recall, or the sensitivity, quantifies the ratio between predicted and actual positives. Another indicator, specificity, represents the ratio between the total predicted and the actual negatives.
In this dataset, the proportion of positives (failing students) and negatives (passing students) are not equally distributed. This situation is considered an unbalanced situation. When working with unbalanced data, the balance accuracy, the average of the sensitivity and specificity, is regarded as a better performance indicator than sensitivity and specificity. Therefore, we calculate the balance accuracy as well.
5 Results
-
A.
Influential Factors on Elementary Students’ Academic Performance
-
A
Descriptive statistics
For the first graders, the number of hours students watch TV per day is 1.42 h (SD = 0.77), the average number of hours the students read books is 1.30 h (SD = 0.62), the average number of hours students sleep per day is 8.48 h (SD = 0.55), the average number of hours students involve in physical activates is 1.82 h (SD = 0.94). In addition, the average BMI value is 17.88 (SD = 4.05) and the average DIBELS score is 160.29 (SD = 66.64). As far as the average DIBELS score is concerned, white students have maintained 159.74 (SD = 71.03). For the black and other students the values were > 154.40 (SD = 45.82) and 163.61 (SD = 61.08) respectively.
For the second graders, the number of hours students watch TV per day is 1.49 h (SD = 0.62), the average number of hours the students read books is 1.24 h (SD = 0.51), the average number of hours students sleep per day is 8.60 h (SD = 0.65), the average number of hours students involve in physical activates is 3.12 h (SD = 1.39). In addition, the average BMI value is 18.52 (SD = 3.54), and the average DIBELS score is 226.19 (SD = 79.76). Furthermore, the average DIBELS score for white, black, and other students is 227.08 (SD = 69.56), 219.11 (SD = 89.13), and 233 (SD = 97.22), respectively.
When the distribution of DIBELS score is considered, it is approximately symmetrically distributed, while BMI scores have a skewed to right distribution. These behaviors are the same for both graders. More descriptive statistics can be seen in the Table 1. According to the collected data, 54% of the students are boys, and about 56% of the first graders are boys. Among the students, 67% are white, 14% are black, and 19% are others. More descriptive statistics can be found in Tables 3 and 4.
-
2.
ANOVA results
In order to identify the impact of gender and duration of sleep for the first grader and second graders, two-way ANOVA was conducted. In this calculation, we converted the duration of sleep to a categorical variable as LESS (sleeping time is less than 9 h) and GOOD (sleeping time is more than 9 h), as explained earlier. As Table 5 indicates, for the first graders the main effects of the Gender (F(1,88) = 5.160), p < 0.05) and Sleeping time (F(1,88) = 7.415), p < 0.01) on DIBELS were observed.
Table 6 indicates that LSD post hoc test results revealed that the girls have significantly higher DIBELS scores compared to the boys. Furthermore, Students who sleep more than 9 h (Good) show significantly higher DIBELS scores than the other group of first grades who sleep less than 9 h.
As Table 7 indicates, for the second graders the main effects of BMI (F(2,90) = 2.68), p < 0.1) on DIBELS were observed. According to the Table 8, it indicates that LSD post hoc test results revealed that students in the Healthy group show higher DIBELS scores than the other two groups though they are not significant.
-
3.
Results of the multiple regression model.
For first graders, the number of hours they sleep significantly predicted their DIBELS score, β = 41.89, p < 0.01. Gender also explained a significant proportion of variance in DIBELS scores, β = 37.13, p < 0.05. Further information can be seen in Table 9.
Figure 3 shows the impact of gender and sleep duration on the DIBELS is illustrated. First graders who sleep more than 9 h show higher DIBELS scores than the students who sleep less than 9 h. In the meantime, girls ‘performance on the DIBELS exam is higher than boys.
For the second graders, the number of hours they read books significantly predicted their DIBELS score, β = 29.14.89, p < 0.05. Students BMI value also explained a significant proportion of variance in DIBELS scores, β = − 4.00, p < 0.05. Table 10 indicates the findings of the regression analysis.
Figure 4 shows the relationship among the DIBELS score, gender, and BMI class. Based on this figure, second graders whose BMI class is in a healthy state show higher performance on the DIBELS exam than the other two classes of Obese and Close to obese.
-
4.
Results of the decision tree model.
In order to PREDICT success with DIBELS, we proposed a decision tree model. This model aims to predict whether students can meet the benchmark level of the DIBELS exam or not. Two models for grade 1 and grade 2 were constructed based on the decision tree algorithm. The model for first graders was tested on the testing dataset for grade 1, and an accuracy of 60% and recall of 71% were observed. The recall value tells us how well the model accurately identifies students who pass the DIBELS out of all test takers. The final decision tree for grade 1 is represented in Fig. 5.
Similarly, the other model for the second graders was trained and tested using the second grader's data. This model showed an accuracy of 83% and a recall of 75%. The final optimal decision tree for grade 2 is represented in Fig. 6.
-
5.
Results of the Naïve Bayes (NB) Model
In order to predict students’ appropriate DIBELS level (group), an NB model was constructed. This NB classifier was trained using a randomly sampled 70% of the collected dataset. The remaining 30% of the collected data was used for the testing data. According to the outcomes, the WB group has the highest sensitivity of 92%, meaning 92% of the students who belong to the WB group can be identified accurately. The specificity of the WA group is 100%. This states that 100% of the students who do not belong to the WA group can be correctly identified as not belonging to the WA group. Similarly, the specificity percentages for B, WB, and A groups are 95, 75, and 81%, respectively.
In this dataset, the number of students belonging to each group (class) is not equal. Therefore, for unbalanced classes, a better measurement of classification accuracy is balanced accuracy. According to the outcomes, WB, B, A, and WA's balanced accuracy is approximately 84, 73, 65, and 63%, respectively.
6 Discussion
This study had three main objectives. First, we wanted to investigate the impact of various elementary students’ related factors on their academic performance. Secondly, to develop a model to forecast their academic performance. Finally, we devised a mechanism to classify these students into appropriate skilled-level groups using the identified influential factors. In this study, we used the DIBELS score as an indicator for elementary students to quantify their academic performance. This score not only quantifies an individual student’s basic literacy skills but can also be used as an indicator to compare an individual’s academic level beyond the school level.
According to the regression analysis, both the number of hours a student sleeps, and their gender statistically influence their DIBELS score. While higher hours of sleep indicate higher academic success, girls show higher performance than boys as far as the DIBELS score is concerned. Further analysis using ANOVA procedures consolidates the above relationship. Furthermore, post hoc comparisons using the Tukey HSD test indicate that the difference between the mean DIBELS score for girls and boys is significantly different (M = 4.497, p < 0.026). In addition, using the Tukey HSD test shows that there is a significant statistical difference in DIBELS scores between those who sleep more than 9 h and those who sleep less than 8 h (M = 43.056, p = 0.007). The literature indicates that [36] found that sleep problems and patterns affect elementary students’ school performance. According to the report released by the World Health Organization in 2004 [37], lack of sleeping time can impede the development of the brain and the acquisition of cognitive and intellectual capabilities. Furthermore, [38] report that inadequate sleep can negatively impact students' academics and physical and psychosocial health.
According to the regression analysis, second graders, both the number of hours students read books and the BMI values significantly impact their DIBELS score. While the duration of book reading positively (β = 29.139, p < 0.05) impacts the DIBELS score, the BMI value negatively (β = − 4.002, p < 0.05) impacts the DIBELS score. Further analysis using ANOVA indicates that.
For the second graders, the main effects of the BMI (F(1,90) = 2.68), p < 0.1) on DIBELS were observed. As Kuczmarski [39] state, a child of this age group is considered “Healthy” if the BMI range is between 13.6 and 17.5 or in the 5th to 85th percentile of the BMI values. Furthermore, a child is considered “Close to obese” if the BMI values are between 17.5 and 19 (or 85th to 95th percentiles). If the BMI value is more than 19 (or over the 95th percentile), the child is considered “Obese.” According to the results of the ANOVA procedure, though there is no statistically significant difference among the three above groups, students in the Health group showed higher performance in DIBELS scores than the other two. This proves the findings of a study conducted by [40] about a negative association between BMI scores and student's academic achievements.
Furthermore, according to the World Health Organization (WHO), childhood obesity influences premature death and disability in adulthood [41]. As the Centers for Disease Control and Prevention (CDC) indicates, childhood obesity is a complex health issue that harms the body in various ways. Morrison [42] reveals that obesity has caused higher anxiety, depressive symptoms, and poorer quality of life. Furthermore, [43] and [44] state that low self-esteem, lower self-reported quality of life, and social problems, such as bullying and stigma, are other consequences of childhood obesity.
The second objective of this study was to predict student's success with the DIBELS exam. Using their related variables, we used the decision tree approach to predict whether students can meet the benchmark of the DIBELS exam or not. According to the experimental outcomes, if students’ related variables such as gender, the number of hours they spend sleeping, reading books, and watching TV, their race, and their BMI values are known, our constructed model can accurately predict students' success with the DIBELS exam.
Our final objective of this study was to forecast students’ appropriate academic groups at an early stage. We used another machine learning model, Naïve Bayes, for this classification, and the experimental findings show promising results. In this process, we group the students into four levels, WB, B, A, and AB, based on the students’ DIBELS score. According to the trained Naïve Bayes model, students’ appropriate academic groups can be identified accurately using the students’ related information.
This model can be beneficial for administrators in identifying students who need extra attention to succeed in their studies and identifying students who are currently performing well. In either case, the administrators can allocate resources effectively to improve the student’s success. Furthermore, the improvement in the placement of a student into an appropriate learning group enables teachers to use relevant pedagogy and pacing to improve student learning.
7 Conclusion
This study was conducted with three aims: to observe the relationship between students' related variables and academic performances, to develop a model to predict students meeting the level of academic success, and to group students into uniform groups based on their academic levels. Students’ academic success was quantified using the DIBELS score, which represents student's early literacy skills.
Using the collected dataset of first and second-graders student-related variables (characteristics), we devised techniques to achieve the above three aims. According to the experimental findings, first-grade students’ academic success is significantly impacted by the number of hours of sleep and their gender. This shows the association of higher academic success with a higher duration of sleeping time. In addition, the second graders’ academic performance is significantly influenced by the BMI value and the number of hours of book reading. These results confirm some of the findings of the literature as well as the recommendations given by the World Health Organization regarding the required number of hours of sleep for elementary students.
Another objective of this study was to develop a model to predict whether students can meet the benchmark level of academic success at the end of the year. Using the machine learning model based on decision trees, we constructed a model to predict the students’ end-of-year academic performance successfully. Finally, we constructed another accurate machine learning model called Naïve Bayes to classify students into similar groups based on their endemic levels. These models are significantly crucial for parents, teachers, and school administrators. Parents can pay attention to the variables associated with the kids that impact their academic performances, such as sleeping, book reading times, and BMI values. Furthermore, teachers and administrators can use the developed models to predict students’ future academic success before they reach the end of the semester. The above-constructed model can be effectively used if the administrators want to identify students of similar academic levels to form separate classes.
The authors have identified some limitations of a study like this. The data of this type of study can show some inaccuracies due to the parent-reported data such as sleeping time, TV watching time, reading time, and physical activity time. In this study, we used DIBELS values as the indicator for academic success, but if there are alternative measurements, this proposed model can also be applied to them. Unfortunately, we could not collect data from other graders from elementary school. Analyzing other elementary students’ data in a future study would be interesting.
Data availability
The data used in this study is not publicly available but is available with the corresponding author upon reasonable request.
References
Gözün-Kahraman Ö, Ceylan Ş, Korkmaz E. An analysis of the developmental status of children aged 0–3 in terms of various variables. Mersin Üniversitesi Sağlık Bilimleri Dergisi. 2016;9(2):60–9.
Günüç S, Atli S. Parents’ views on the impact of technology on 18 to 24-month old infants. Addicta: Turkish J Addict. 2018;5(2):205–26.
Reale L, Guarnera M, Mazzone L. The effects of sleep disturbance on school performance: a preliminary investigation of children attending elementary grades. School Psychol Int. 2014;35:398–404. https://doi.org/10.1177/0143034313485850.
Unalan D, Ozturk A, Ismailogullari S, Akgul N, Aksu M. The effect of sleep duration and quality on academical success of the elementary school children in Kayseri Turkey. J Pak Med Assoc. 2013;63(5):576–80.
Anderson RC, Fielding LG, Wilson PT. Growth in reading and how children spend their time outside of school. Read Res Q. 1988;23:285–304.
Gortmaker SL, Must A, Sobol AM, Peterson K, Colditz GA, Dietz WH. Television viewing as a cause of increasing obesity among children in the United States, 1986–1990. Arch Pediatr Adolesc Med. 1996;150:356–62.
Trembly MS, Inman JW, Willms JD. The relationship between physical activity, self-esteem, and academic achievement in 12-year- old children. Pediatr Exerc Sci. 2000. https://doi.org/10.1123/pes.12.3.312.
Wingfield RJ, McNamara JPH, Janicke DM, Graziano PA. Is there a relationship between body mass index, fitness, and academic performance? mixed results from students in a southeastern united states elementary school. Curr Issues Educ. 2011;14(2):1–12.
Cawley J, Frisvold D, Meyerhoefer C. The impact of physical education onobesity among elementary school children. J Health Econ. 2013;32(4):743–55.
Datar A, Sturm R, Magnabosco JL. Childhood overweight and academic performance: national study of kindergartners and first-graders. Obes Res. 2004;12(1):58–68.
Taras H, Potts-Datema W. Obesity and student performance at school. J Sch Health. 2005;75(8):291–5.
Castelli DM, Hillman CH, Buck SM, Erwin HE. Physical fitness and academic achievement in third- and fifth-grade students. J Sport Exerc Psychol. 2007;29:239–52.
Benner AD, Crosnoe R. The racial/ethnic composition of elementary schools and young children’s academic and socioemotional functioning. Am Educ Res J. 2011;48(3):621–46. https://doi.org/10.3102/0002831210384838.
Quinn DM, Cooc N. Science achievement gaps by gender and race/ethnicity in elementary and middle school: trends and predictors. Educ Res. 2015;44(6):336–46. https://doi.org/10.3102/0013189X15598539.
Stroebele N, McNally J, Plog A, Siegfried S, Hill JO. The association of self-reported sleep, weight status, and academic performance in fifth-grade students. J Sch Health. 2013;83(2):77–84.
Wentzel KR, Wigfield A. Academic and social motivational influences on students’ academic performance. Educ Psychol Rev. 1998;10:155–75.
Phelan P, Davidson AL, Cao HT. Students’ multiple worlds: negotiating the boundaries of family, peer, and school cultures. Anthropol Educ Quart. 1991;22:224–50.
Ogbu J. Cultural ecology of competence among inner-city blacks. In: McAdoo H, McAdoo J, editors. Black children’s social, educational, and parent environments. Newbury Park: Sage; 1985.
Johnson KA. (2000). The Peer Effect on Academic Achievement among Public Elementary School Students. A Report of the Heritage Center for Data Analysis.
Cooper R. Detracking in an urban California high school: improving the schooling experiences of African-American students. J Negro Educ. 1996;65(2):190–208.
Slavin RE. Ability grouping and student achievement in elementary schools: a best-evidence synthesis. Rev Educ Res. 1987;57(3):293–336. https://doi.org/10.3102/00346543057003293.
Boaler J, Wiliam D, Brown M. Students’ experiences of ability grouping: disaffection, polarisation and the construction of failure. Br Edu Res J. 2000;26:631–48.
Aggarwal D, Mittal S, Bali V. Prediction model for classifying students based on performance using machine learning techniques. Int J Recent Technol Eng. 2019;8(2S7):496–503.
Kotsiantis S, Pierrakeas C, Pintelas P. Predicting students’ performance in distance learning using machine learning techniques. Appl Artif Intell. 2004;18(5):411–26.
Ahadi A, Lister R, Haapala H, Vihavainen A. (2015). Exploring machine learning methods to automatically identify students in need of assistance. In Proceedings of the eleventh annual international conference on international computing education research (pp. 121–130).
Petersen SE, Ostendorf M. A machine learning approach to reading level assessment. Comput Speech Lang. 2009;23(1):89–106.
Goldhaber D, Wolff M, Daly T. (2020). Assessing the accuracy of elementary school test scores as predictors of students’ high school outcomes.
Chung JY, Lee S. Dropout early warning systems for high school students using machine learning. Child Youth Serv Rev. 2019;96:346–53.
Schmidt NA, Brown JM. Evidenced-based practice for nurses: appraisal and application of research. 4th ed. Burlington, MA: Jones and Bartlett; 2019.
Good RH, Kaminski RA. (2009). What are DIBELS. Dynamic Indicators of Basic Early Literacy Skills.
Prentice AM. Body mass index standards for children. Are useful for clinicians but not yet for epidemiologists. BMJ. 1998;317(7170):1401–2. https://doi.org/10.1136/bmj.317.7170.1401.
Arar ÖF, Ayan K. A feature dependent naive bayes approach and its application to the software defect prediction problem. Appl Soft Comput. 2017;59:197–209.
Wu X, Kumar V, Ross Quinlan J, et al. Top 10 algorithms in data mining. Knowl Inf Syst. 2008;14:1–37. https://doi.org/10.1007/s10115-007-0114-2.
Pandey M, Sharma VK. A decision tree algorithm pertaining to the student performance analysis and prediction. Int J Comput Appl. 2013;61(13):1–5.
Alsalman YS, Halemah NKA, Al Nagi ES, Salameh W. (2019). Using decision tree and artificial neural network to predict students academic performance. In 2019 10th international conference on information and communication systems (ICICS) (pp. 104–109). IEEE.
BaHammam A, Al-Faris E, Shaikh S, Saeed AB. Sleep problems/habits and school performance in elementary school children. Sleep Hyp. 2006;8:12–8.
WHO technical meeting on sleep and health, Sleeping is required for every living organism, Bonn Germany, 22–24 January 2004. http://www.euro.who.int/__data/assets/pdf_file/0008/114101/E84683.pdf
Mellon P, Montemurro G, Sulz L, Torrance B, Storey K. “Your kid has potential, but they need sleep”: teacher perspectives on school‐based sleep promotion in Alberta. J School Health. 2024. https://doi.org/10.1111/josh.13439.
Kuczmarski RJ, Ogden CL, Guo SS. CDC growth charts for the United States: methods and development. Vital Health Stat. 2000;246:1–190.
Yu B, Kelly S. Untangling the relationship between BMI and academic achievement in the elementary years. Br J Nutr. 2024;131(2):351–8.
Alswat KA, Al-Shehri AD, Aljuaid TA, Alzaidi BA, Alasmari HD. The association between body mass index and academic performance. Saudi Med J. 2017;38(2):186–91. https://doi.org/10.1553/smj.2017.2.16320.
Morrison KM, Shin S, Tarnopolsky M. Association of depression and health related quality of life with body composition in children and youth with obesity. J Affect Disord. 2015;172:18–23. https://doi.org/10.1016/j.jad.2014.09.014.
Halfon N, Larson K, Slusser W. Associations between obesity and comorbid mental health, developmental, and physical health conditions in a nationally representative sample of US children aged 10 to 17. Acad Pediatr. 2013;13:6–13.
Beck AR. Psychosocial aspects of obesity. NASN Sch Nurse. 2016;31(1):23–7.
Funding
This study was initially funded by the Internal Research Grant of Eastern New Mexico University. The support for publication was given by Prairie View A&M University’s Faculty Enhancement Program (FEP).
Author information
Authors and Affiliations
Contributions
This research protocol was approved by Human Subjects Ethics Committee of Eastern New Mexico University and the involved School.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
This research protocol was approved by Human Subjects Ethics Committee of Eastern New Mexico University in accordance with the guidelines employed for screening are those set forth by DHHS and the ethical standards of the APA.
Competing interests
The authors alone are responsible for the content and writing of the paper. The authors report no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wickramasinghe, I., Aragon, R. & Valles, J. Influential factors on elementary students’ academic performance and identifying the appropriate performance group. Discov Educ 3, 76 (2024). https://doi.org/10.1007/s44217-024-00167-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s44217-024-00167-x