1 Introduction

1.1 State of the art on manufacturing modelling

Recent advances in information and communication technology have revolutionized the way industrial systems are monitored and managed and the way their data are retained [1]. This transformation has significantly impacted the manufacturing sector, particularly production systems, by allowing real-time capture of operational information on the manufacturing activities involved in producing value-added goods. Achieving precise tolerances is crucial in critical sectors such as aerospace and super boiler applications, particularly when working with nickel (Ni)-based superalloys such as Inconel 617 (IN617). To meet these stringent requirements, an accurate cutting method is imperative. Electric discharge machining (EDM) is a widely employed technique for such materials and applications [2, 3]. However, electrode degradation poses a primary challenge to the seamless application of EDM. This wear adversely impacts the control over dimensional accuracy in the machined cavity, leading to an increased rate of part rejection and compromising overall process productivity. Electrode wear not only hampers cutting performance by disrupting sparking control but also produces inconsistent geometrical features in the machined impressions. Addressing this concern is essential to enhance the dimensional precision and overall effectiveness of EDM. Traditional machining methods struggle with the high strength, rapid strain hardening, and low density of IN617; hence, this research opted for EDM. The critical qualities and data required for customized and problem-specific analyses are integrated into manufacturing industry data repositories. This improvement extends to enhanced intelligent functioning, efficiency, operational excellence, and informed decision-making in these systems.

The production sector commonly relies on traditional techniques, such as physical prototypes and regression models, to extract operating statistics efficiently from its manufacturing practices [4]. However, accurately representing these systems is challenging due to their intricate and interconnected nature, featuring nonlinear and mutually dependent interactions among input–output variables. This complexity makes it difficult to construct precise physical models based on limited assumptions [5]. Moreover, the computational complexities involved limit the feasibility of modelling and optimizing these physical representations [6]. Regression models, in turn, may produce less-than-optimal results when confronted with the intricate and nonlinear output patterns found in large-scale production systems. This is attributed to their difficulty in capturing the inherent information and sophisticated relations hidden within the extensive dataset [7].

Nonetheless, recent scholarly works [8, 9] underscore the lack of a thorough analytical framework that explains the requisite stages and integrates data-driven algorithms for carrying out evaluations in the manufacturing sector. Therefore, this research introduces an analytical framework to determine the best possible operating parameters for industrial systems (Table 1). This framework encompasses diverse stages, including the collection and visualization of data, the formulation of data-driven algorithms, the validation of the predictive model, the scrutiny of variable sensitivity, and, ultimately, the integration of the model into an optimization framework. The framework is applied to simulating and alleviating electrode wear rate (EWR) and overcut (OC) in EDM operations, specifically addressing the challenges associated with the IN617 material.

Table 1 Analysis of the state-of-the-art techniques for modelling of EDM

1.2 State of the art on electric discharge machining

EDM is selected as the case study because of its complex parametric control and the absence of mechanical forces during the machining of high-strength materials, including Ni-based superalloys [1, 38]. In contrast to conventional machining techniques, EDM employs a tool and workpiece that do not come into direct physical contact with each other. Instead, the process initiates an electrical discharge between the workpiece and the electrode, resulting in the melting and evaporation of the material [39, 40]. Despite its inherent benefits, EDM exhibits comparatively low cutting rates. Consequently, scholars have sought to address this limitation by incorporating nano-powders such as graphene, copper, alumina (Al2O3), silicon carbide (SiC), and graphite, with the aim of augmenting cutting rates [41, 42]. Nano-powders enhance the efficiency of the EDM process by serving as conductive bridges between the tool and work electrodes. Furthermore, surfactants have been added to the dielectric to act as surface-active agents, mitigating agglomeration and enhancing machining performance [43]. Prior studies have examined the integration of nano-powders, surfactants derived from WCO, and cryogenic treatment (CT) in conjunction with AI-based predictive modelling, with a specific focus on IN617.

The process of selecting the suitable powder for machining is a pivotal decision requiring a comprehensive assessment of various variables. Prioritizing the selection of nanomaterials based on their electrical conductivity and suitability for the dielectric fluid is crucial [25, 44]. Considerations such as particle size and morphology are crucial, with smaller nanoparticles holding the potential to enhance precision in terms of dimensional accuracy and reduce EWRs. Additionally, factoring in the thermal conductivity and erosion resistance of the nanomaterial is vital to minimize tool wear and improve machining outcomes. Striking a balance between cost-effectiveness, quality requirements, and process control is indispensable to confirm that the chosen material meets quality standards [45]. Thorough testing and validation protocols are imperative to assure the desired machining results [46]. Taking these factors into consideration, it can be inferred that alumina nanomaterial stands out as the optimal choice for EDM of IN617, particularly when paired with WCO as the dielectric fluid.

A comprehensive framework has been developed utilizing AI to effectively tackle the challenges associated with simulating complex systems. This framework provides a more cost-effective approach by leveraging the existing computer capacity [6, 47]. As a result, models are custom-designed and can be efficiently utilized in similar fields for adaptable functionality [48]. The utilization of AI in various fields, including energy, material discovery, process design, and improvement, is highly beneficial due to its advanced modelling and feature recognition capabilities [9]. The current state of the art reveals a notable gap in providing a thorough analytical framework that outlines crucial analytical methods while also incorporating data-driven algorithms for the scrutiny of manufacturing activities. To address this research gap, the present investigation introduces a comprehensive analytical structure covering multiple phases. These stages include data collection and visualization, process modelling framework establishment, accuracy verification of the predictive model, input responsiveness assessment, and AI model adjustment within an optimization framework. The overarching objective of this framework is to identify the most effective operational parameters for manufacturing systems. The primary aim is to enhance the efficiency of machining, especially for hard-to-machine alloys, with a key focus on diminishing EWR and OC.

Machining procedures such as EDM and WEDM hold significant importance in the manufacturing operations of the automotive and aerospace sectors [49, 50]. Nonconventional manufacturing technologies, such as EDM, provide accurate material cutting and enable the production of delicate components for use in many industries, including automotive and aerospace [51]. In the current manufacturing context, it has become essential to model these different machining variations in order to improve process efficiency and promote skill development. EDM is widely recognized for its exceptional capability to effectively cut materials that are tougher, stronger, and more resistant to wear. In addition, EDM demonstrates superior proficiency in shaping intricate forms with enhanced accuracy when compared to conventional machining techniques [52, 53]. The literature contains a substantial body of work on AI-based modelling and optimization of EDM systems. Artificial neural networks (ANNs) have become increasingly prominent in process modelling and optimization because they can achieve higher accuracy than traditional statistical models. These high-dimensional models possess the capability to accurately approximate any function and effectively handle nonlinear datasets [54]. Pradhan and Das [55] introduced an ANN model to analyse the fluctuations in radial OC during EDM. The radial OC plays an essential role in determining the precision and accuracy of the final product. The developed ANN models demonstrate a high level of accuracy in approximating the EDM process responses. The findings suggest that these proposed models can be effectively utilized for predicting the outcomes of the intricate and stochastic EDM process.

Meena et al. [56] focussed on process parameter optimization in the micro-machining of commercially pure titanium. The performance criteria observed for verifying the optimum include material removal rate (MRR), EWR, and OC. Parameters such as current, frequency, and Pon are systematically evaluated using Taguchi method-based grey relational analysis (GRA). Pradhan and Biswas [57] used two neuro-fuzzy models and a neural network model to model MRR, EWR, and OC in the die-sinking process. Comparative analysis indicates that both the neuro-fuzzy and ANN models exhibit accurate and efficient results. The findings suggest that the proposed models are effective tools for successfully predicting MRR, TWR, and OC in the complicated and stochastic EDM process. Bhandare and Dabade [40] performed dry EDM on Inconel 718 and modelled MRR, EWR, and SR through a feed-forward backpropagation ANN model. A comparison of experimental and predicted values reveals a correlation coefficient of 0.94455, indicating robust prediction accuracy and the efficacy of ANN. Manikandan et al. [58] presented a comprehensive exploration of the EDM process and introduced a novel hybrid grey ANN model. GRA was harnessed to ascertain the Grey Relational Grade (GRG), encapsulating a multi-attribute optimization model. A neural network was then developed to predict GRG, with the input being the Grey Relational Coefficient (GRC) values.

Rao [59] introduced predictive models for achieving precise dimensions in micro-holes machined on Inconel 718 superalloy through the µEDM process. An analysis of the estimated statistical parameters reveals that the ANFIS-PSO model demonstrates superior accuracy in predicting µEDM responses compared to other models. Walia et al. [60] evaluated tool shape changes using the inputs Ip, Pon, and Poff, which significantly influenced impression roundness. Employing the random forest technique, the authors demonstrated more compelling predictive results, leading to the inference that it outperforms other methods in accurately forecasting tool shape alterations. Dey et al. [61] evaluated surface quality and dimensional accuracy and proposed the utilization of four metamodeling techniques. Drawing insights from past experimental datasets, they found that gene expression programming (GEP) stands out by delivering more precise and stable predictions for both considered EDM processes. Shirguppikar and Patil [62] machined titanium alloy using a coated tungsten carbide tool. The study revealed that the TiN-coated tungsten carbide tool exhibited superior performance, yielding a 16.32% reduction in EWR and a 12% decrease in OC compared to the uncoated electrode.

To attain optimal process parameters, the utilization of an efficient optimization technique is crucial [63]. Optimization algorithms, as iterative methods, systematically evaluate various solutions until an optimum or satisfactory solution is identified. In this study the authors delve into the exploration of nontraditional optimization techniques, employing them to achieve both single and multiple objective optimizations for SR and OC values. Furthermore, a comparative analysis is conducted to assess the performance of various optimization techniques, considering factors such as prediction accuracy and convergence characteristics. Abhilash and Chakradhar [64] used a hybrid method, combining GRA with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) and utilizing entropy weights. The validation tests demonstrated that the entropy-weighted GRA-TOPSIS approach enhances overall process performance in comparison to conventional TOPSIS.

Pourasl et al. [65] studied the EDM of AISI D6 material, employing both an ANN and an ANFIS to predict the machining outputs. The results revealed that increased Pon values led to higher MRR and SR, while concurrently reducing EWR. Farooq et al. [39] investigated the potential for minimizing geometrical errors when machining complex features. Yadav et al. [66] identified the optimal settings of the process parameters as a current of 15 A, a gap voltage of 30 V, and a Pon of 100 µs for achieving the desired machining responses. According to many researchers, the Taguchi approach is a well-defined statistical method for optimizing processing variables in terms of performance, cost, and quality. Additionally, it is a robust design methodology that quickly identifies the most influential parameters, saving time and resources while delivering consistent and efficient results [3, 67, 68].

2 Research gap and directions

Previous studies have utilized nonconventional machining techniques to evaluate EWR and OC on different materials. Traditional analytical simulations were constrained to basic systems and did not encompass the integration of intricate simulations. In this study, different algorithms are utilized to examine their influence on the performance of EDM. For instance, the incorporation of nano-powder into the dielectric fluid has introduced complexity to the system, posing challenges that conventional analytical methodologies cannot fully capture. Therefore, the use of an AI modelling framework, specifically an ANN, has demonstrated its efficacy in tackling this intricate issue, primarily in the examination of the EWR and OC of engineering materials such as IN617. The novelty of the study lies in applying an AI optimization framework to boost the performance of the EDM process. The framework is utilized to simulate the EDM of materials that are difficult to cut. This work utilizes AI computational techniques to model the performance of EDM in order to reduce EWR and OC. The findings of the simulation are then validated experimentally to bridge the detected gap. A parametric optimization technique is employed to ascertain the most favourable input parameters that result in reduced EWR and dimensional error. This particular aspect has not been adequately addressed in existing research; however, it holds significant importance for industrial applicability and competitiveness in enhancing the performance of EDM and optimizing resource utilization. The study is in line with the dimensions of Industry 4.0, as it facilitates the intelligent functioning of industrial systems, the digitalization of processes, and the effective management of operations.

3 Materials and methods

The comprehensive procedure used in the study is depicted in Fig. 1. The first stage of the research process entails conducting a comprehensive assessment of existing literature, identifying any gaps in knowledge, and formulating the research problem. During the second phase, the input parameters are selected on the basis of trial experiments, after which response data (EWR and OC) are collected for the EDM process of IN617. This encompasses results pertaining to both the inputs and outputs of the process. The EWR (mg/min) was measured through the weight difference methodology, while the OC (mm) was assessed using a coordinate measuring machine (CMM). This study examined three input factors, namely the electrode treatment (SCT and DCT), Cp, and Sc. The treatment consisted of two levels, while the other parameters were characterized by three levels. Specifically, the Cp parameter had levels of 0.5, 1.0, and 1.5 g/L, while the Sc parameter had levels of 6, 8, and 10%/L. The third phase of the process entails the examination and graphical representation of the dataset. To determine the appropriate parametric values, a comprehensive examination of the existing literature was undertaken. This evaluation encompassed the application of EDM to a range of challenging materials under various input parametric configurations. Following this, a particular set of processing parameters was picked for the first experiments. From these trials, specific parameter values were chosen that resulted in improved output parameters, especially a decrease in EWR and OC. The experiments in this investigation were performed on an EDM machine, namely the RJ230 model, as illustrated in Fig. 2. Additionally, the CMM is depicted in Fig. 3.
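The factor levels listed above can be enumerated programmatically. The following minimal Python sketch assumes a full-factorial layout, in which every treatment is paired with every Cp and Sc level; the source does not state the actual run plan or the number of replicates, and the names `full_factorial` and `runs` are hypothetical.

```python
from itertools import product

# Factor levels as stated in the text.
# ASSUMPTION: a full-factorial layout; the actual run plan is not given in the source.
treatments = ["SCT", "DCT"]      # electrode treatment (two levels)
cp_levels = [0.5, 1.0, 1.5]      # Cp in g/L (three levels)
sc_levels = [6, 8, 10]           # Sc in %/L (three levels)


def full_factorial(*factors):
    """Return every combination of the given factor levels as a list of tuples."""
    return list(product(*factors))


runs = full_factorial(treatments, cp_levels, sc_levels)

for run_id, (treatment, cp, sc) in enumerate(runs, start=1):
    print(f"run {run_id:2d}: treatment={treatment}, Cp={cp} g/L, Sc={sc} %/L")
```

If every combination were run once, this would give 2 × 3 × 3 = 18 parameter settings.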

Fig. 1 Flowchart of present state of the art

Fig. 2 EDM setup for collecting the data

Fig. 3 Setup for measuring dimensional error

Box plots have proven to be an effective means of visually representing EWR and OC data with respect to the input factors in EDM. When identifying independent variables, the first and primary step is to check for linear dependence. An efficient AI-based model can predict the output variables of a hyperdimensional and intricate system using independent variables. The linear dependence between variables can be determined through the Pearson correlation coefficient as per Eq. 1.

$${R}_{xy}= \frac{\sum_{i=1}^{N}\left({x}_{i}- \overline{x }\right)\left({y}_{i}- \overline{y }\right)}{\sqrt{\sum_{i=1}^{N}{\left({x}_{i}- \overline{x }\right)}^{2}}\sqrt{\sum_{i=1}^{N}{\left({y}_{i}- \overline{y }\right)}^{2}}}$$
(1)

Here, \(x\) denotes the input, \(y\) the output, and \(i\) the observation index, which ranges from 1 to N, the total number of observations in the dataset. The values \(\overline{x }\) and \(\overline{y }\) represent the averages of \(x\) and \(y\), respectively. \({R}_{xy}\) takes values between -1 (perfect negative correlation) and +1 (perfect positive correlation), whereas \({R}_{xy}\) = 0 indicates the absence of any linear relationship.
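As an illustration of this linear-dependence check, the sketch below computes the Pearson coefficient in its standard sum-of-products form using only the Python standard library. The function name `pearson_r` and the sample values are hypothetical placeholders, not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Numerator: sum of products of the deviations from the means.
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    x_dev = math.sqrt(sum((xi - x_mean) ** 2 for xi in x))
    y_dev = math.sqrt(sum((yi - y_mean) ** 2 for yi in y))
    if x_dev == 0 or y_dev == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / (x_dev * y_dev)


# HYPOTHETICAL placeholder values, for illustration only.
powder_concentration = [0.5, 1.0, 1.5]
response = [0.011, 0.009, 0.007]
print(pearson_r(powder_concentration, response))
```

Because the placeholder response decreases linearly with the input, the coefficient is close to -1.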

Following data processing and visualization, the fourth phase involves training an AI framework, such as an ANN, with meticulous hyperparameter optimization. As indicated in the literature, AI-based models have the capacity to grasp the nonlinear and intricate characteristics of a system, with detailed insights into the functioning of ANNs provided in [69]. Three statistical measures are employed to assess the trained AI models and evaluate their performance. The training, testing, and validation of the response measures were carried out in MATLAB (R2018a) on a 7th-generation Core i7 laptop. The performance criteria encompass the coefficient of determination (R2), the root mean square error (RMSE), and the mean absolute error (MAE) [70]. The equations for these performance parameters (2–4) are outlined below:

$${R}^{2}={\left(\frac{\sum_{\text{i}=1}^{n}({y}_{\text{i}}-\overline{{y }_{\text{i}}})(\widehat{{y}_{\text{i}}}-{\overline{\widehat{y}} }_{\text{i}})}{\sqrt{\sum_{\text{i}=1}^{n}{({y}_{\text{i}}-\overline{{y }_{\text{i}}})}^{2}\sum_{\text{i}=1}^{n}{(\widehat{{y}_{\text{i}}}-{\overline{\widehat{y}} }_{\text{i}})}^{2}}}\right)}^{2}$$
(2)
$$\text{RMSE}=\sqrt{\frac{1}{\text{n}}\sum_{\text{i}=1}^{\text{n}}{\left({\widehat{y}}_{\text{i}}-{y}_{\text{i}}\right)}^{2}}$$
(3)
$$\text{MAE}=\frac{1}{n}\sum_{\text{i}=1}^{\text{n}}\left|{y}_{\text{i}}-{\widehat{y}}_{\text{i}}\right|$$
(4)

Here, n denotes the sample size, and \({\widehat{y}}_{i}\) and \({y}_{i}\) stand for the estimated and real values, respectively. The averages of the actual and predicted values are denoted by \(\overline{{y }_{i}}\) and \({\overline{\widehat{y}} }_{i}\), respectively. R2 measures goodness of fit; its values range from zero, which denotes very poor predictive accuracy, to one, which denotes perfect agreement between predictions and observations. In contrast, RMSE and MAE are error measures that quantify the disparity between the outcomes predicted by the model and the real observations.
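The three indicators can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB routine used in the study; R2 is taken here as the squared correlation between actual and predicted values, which is an assumption about how Eq. 2 is meant to be read, and the function names and sample values are hypothetical.

```python
import math


def r_squared(y_true, y_pred):
    """Squared correlation between actual and predicted values (assumed reading of Eq. 2)."""
    n = len(y_true)
    t_mean = sum(y_true) / n
    p_mean = sum(y_pred) / n
    cov = sum((t - t_mean) * (p - p_mean) for t, p in zip(y_true, y_pred))
    t_ss = sum((t - t_mean) ** 2 for t in y_true)
    p_ss = sum((p - p_mean) ** 2 for p in y_pred)
    return cov ** 2 / (t_ss * p_ss)


def rmse(y_true, y_pred):
    """Root mean square error (Eq. 3)."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error (Eq. 4)."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# HYPOTHETICAL actual and predicted values, for illustration only.
actual = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.2]
print(r_squared(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

For any set of predictions, RMSE is at least as large as MAE, which gives a quick sanity check when the two are reported side by side.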

In the fifth stage, an external validation (Valext) test is carried out to scrutinize the predictive capability of the trained models. This evaluation involves predicting unseen data, which serves as a pivotal gauge of their effectiveness in a broader context. Moving to the sixth phase, evaluation indicators such as R2, MAE, and RMSE come into play. These indicators serve as benchmarks for evaluating the predictions and determining which model is superior. In the seventh step, a parametric approach is employed to analyse the input variables. Following the identification of the best-performing model, the eighth stage of the study involves conducting an SA. The goal of this analysis is to determine the relative importance of the input characteristics that are crucial for predicting EWR and OC in the context of EDM. The optimization procedure has a central role in the final phase, in which the EDM input variables are optimized using PO in order to simultaneously minimize both EWR and OC.

4 Results and discussion

4.1 Descriptive statistics

The data related to EWR and OC are gathered, and box plots are used as a visual tool to illustrate the data for a comprehensive analysis. Figure 4a, b displays the training data for EWR and OC in EDM while machining IN617 with different input variables. A thorough examination of Fig. 4a reveals that certain input factors, specifically increased Cp and Sc, show a noteworthy impact on the EWR during the EDM process for the superalloy used. Similarly, Fig. 4b indicates that the parameters Cp and Sc significantly impact the OC during machining of IN617. To briefly recap the attributes of these input variables, the workpiece is a Ni-based superalloy, namely IN617. The variable Cp designates three different concentrations of alumina powder, specifically 0.5 g/L, 1.0 g/L, and 1.5 g/L. The variable Sc represents three separate concentrations of surfactant, namely 6%/L, 8%/L, and 10%/L. Furthermore, the investigation was made more complex by the inclusion of two different CTs, namely SCT and DCT, applied to the tool material.

Fig. 4 Average magnitude of response measures representing the boxplots for EWR

The box plots showcased in Figs. 4 and 5 yield valuable insights into the observed variations among the three input parameters utilized in the experimentation. It is important to highlight that employing the DCT electrode led to a meaningful decrease in both EWR and OC. Likewise, the study demonstrated that higher values of Cp and Sc exhibited a negative correlation with the deterioration of EWR in the case of IN617. Conversely, a moderate Cp value and the highest Sc value were associated with smaller OC values. The distribution of EWR and OC data, within a range of 1.5 times the interquartile range (IQR), indicates considerable variability in the response parameters. Specifically, EWR values ranged from 0.007 mg/min to 0.011 mg/min, and OC varied from 0.08 mm to 0.11 mm. Nonetheless, it is crucial to underscore that the average values of the responses were situated below the median.
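The box-plot quantities referred to here (quartiles, the interquartile range, and the 1.5 × IQR limits) can be computed as in the sketch below. It is a minimal illustration using the Python standard library; the choice of the "inclusive" quantile method is an assumption, since the source does not say how the quartiles were obtained, and the sample values are hypothetical placeholders spanning the reported EWR range.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, IQR, 1.5 x IQR fences, mean, and points outside the fences."""
    # ASSUMPTION: 'inclusive' quartiles; other conventions give slightly different values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "mean": statistics.mean(values),
        "outliers": outliers,
    }


# HYPOTHETICAL EWR values in mg/min, for illustration only.
sample_ewr = [0.007, 0.008, 0.009, 0.010, 0.011]
summary = box_plot_summary(sample_ewr)
for name, value in summary.items():
    print(name, value)
```

Comparing the `mean` and `median` entries of the summary is one way to check a statement such as the mean lying below the median.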

Fig. 5 Average magnitude of response measures representing the boxplots for OC

4.2 Development of AI-based framework

The main goal of this study is to develop an ANN model of the complex machining process, focussing particularly on its ability to attain the intended results. Furthermore, Sub Sect. 3 is devoted to a comprehensive explanation of the procedures entailed in constructing the ANN models, performing external validation, implementing SA, and carrying out the evaluations for PO.

4.3 Artificial neural network (ANN)

ANN has emerged as a state-of-the-art modelling tool in the field of AI, finding extensive use in many engineering industries such as automotive, construction, manufacturing, and energy [71, 72]. The chosen ANN variation for this research is the MLP, which is widely recognized for its various benefits compared to alternative machine learning methods. The MLP demonstrates a notable proficiency in tackling intricate engineering problems that involve nonlinearity and temporal variability. This is exemplified by its efficacy in handling large and diverse datasets.

ANNs are highly effective algorithms for approximating functions, demonstrating their proficiency in constructing functional links between parameters. A conventional ANN model consists of three distinct layers, namely the input, hidden, and output layers. The input layer integrates the different input properties, receiving the data and transferring them to the next layer. The entire operational architecture of the ANN used in the EDM process is shown in Fig. 6.

Fig. 6 Working schematic of ANN model

The choice of an ANN topology with a single hidden layer is based on the following primary benefits in comparison to more intricate topologies, as elucidated by Cruz et al. [73]: (1) a single hidden layer enhances simplicity, hence facilitating the training process and improving the comprehensibility and explainability of the model; (2) the engagement of several neurons within a single hidden layer enables the network to construct intricate data representations while promoting faster training; (3) prediction is quicker following the training phase; and (4) the accuracy ratio is noteworthy, especially when working with a small dataset.

The choice of the number of neurons in the hidden layer is critical. According to earlier research, the number of neurons in the hidden layer is often chosen to be between 1× and 2.5× the number of variables in the input layer. Alternatively, the number of hidden layers itself can be varied. This option provides versatility by allowing one, two, or more hidden layers to process the data, depending on convenience. However, to achieve maximum performance in modelling with an ANN, it is considered satisfactory to use a single hidden layer containing a suitable number of neurons [57].
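The rule of thumb above can be turned into a short calculation. The sketch below lists candidate hidden-layer sizes for a given number of inputs; rounding the bounds outward to whole neurons is an assumption, and the function name `hidden_neuron_candidates` is hypothetical.

```python
import math


def hidden_neuron_candidates(n_inputs, low=1.0, high=2.5):
    """Candidate hidden-layer sizes between low x and high x the number of inputs.

    ASSUMPTION: the bounds are rounded outward (floor for the lower bound,
    ceiling for the upper bound) so that fractional limits are still covered.
    """
    smallest = max(1, math.floor(low * n_inputs))
    largest = math.ceil(high * n_inputs)
    return list(range(smallest, largest + 1))


# Three input factors are examined in this study (electrode treatment, Cp, Sc).
print(hidden_neuron_candidates(3))  # [3, 4, 5, 6, 7, 8]
```

With three inputs, the bounds are 3 and 7.5 neurons, so rounding outward yields candidate sizes from three to eight.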

The transmission of information from the hidden layer to the output layer enables the computational operations that reproduce the output variable. The process entails error backpropagation, which facilitates the modification of weights and biases. The purpose of these adjustments is to improve the agreement between the response simulated by the model and the observed value. In mathematical terms, the functioning of an ANN can be described as follows [1].

$${Y}_{\text{i}}={f}_{2}\left(\sum {W}_{2}\left[{f}_{1}\left(\sum {X}_{i}{W}_{1}+{b}_{1}\right)\right]+{b}_{2}\right)$$
(5)

The value \({Y}_{i}\), produced by the ANN at the output layer, is determined from the input vector \({X}_{i}\), which is defined using the input variables for each observation \(i\) in the dataset (\(i\) = 1,2,3,…, N). The biases \({b}_{1}, {b}_{2}\), along with the activation functions \({f}_{1}, {f}_{2}\), are applied to the hidden and output layers of the ANN, respectively. The weight matrices \({W}_{1}\) and \({W}_{2}\) represent the weighted connections from the input layer to the hidden layer and from the hidden layer to the output layer, respectively.
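The forward computation of Eq. 5 can be sketched in plain Python as follows. The activation functions are assumptions (a hyperbolic tangent for \({f}_{1}\) and the identity for \({f}_{2}\)), since the source does not name them, and all weights and inputs shown are hypothetical placeholders rather than trained values.

```python
import math


def forward(x, w1, b1, w2, b2, f1=math.tanh, f2=lambda z: z):
    """Single-hidden-layer forward pass: Y = f2(W2 * f1(W1 * X + b1) + b2).

    w1 holds one row of input weights per hidden neuron;
    w2 holds one row of hidden weights per output neuron.
    ASSUMPTION: tanh hidden activation and identity output activation.
    """
    hidden = [
        f1(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(w1, b1)
    ]
    return [
        f2(sum(w * h for w, h in zip(row, hidden)) + bias)
        for row, bias in zip(w2, b2)
    ]


# HYPOTHETICAL example: 3 inputs, 2 hidden neurons, 1 output.
x = [1.0, 0.5, 8.0]
w1 = [[0.2, -0.1, 0.05], [-0.3, 0.4, 0.01]]
b1 = [0.1, -0.2]
w2 = [[0.7, -0.5]]
b2 = [0.05]
print(forward(x, w1, b1, w2, b2))
```

In a trained network, the weights and biases would come from the backpropagation procedure described above rather than being set by hand.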

In this investigation, the EDM’s performance, specifically EWR, and OC, is modelled via a three-layered ANN. The process of model deployment involves training, testing, and validation, with data split ratios of 0.8, 0.1, and 0.1, respectively. The ANN is equipped with 3–8 hidden layer neurons for effective learning. PO of the ANN employs the Levenberg–Marquardt algorithm and focuses on minimizing the sum-of-squared error. The performance of the trained ANN is assessed using indicators such as R2, MAE, and RMSE.
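The 0.8/0.1/0.1 partition described above can be illustrated with a short Python sketch. It assumes that samples are assigned at random, which the source does not specify, and the function name `split_dataset` and the fixed seed are hypothetical choices made only for reproducibility.

```python
import random


def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the samples and split them into training, testing, and validation sets.

    ASSUMPTION: random assignment; the seed only makes the example reproducible.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(ratios[0] * n)
    n_test = round(ratios[1] * n)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# HYPOTHETICAL dataset of 20 sample indices.
train, test, validation = split_dataset(range(20))
print(len(train), len(test), len(validation))
```

Each candidate network, with three to eight hidden neurons, would then be trained on the first subset and scored on the other two.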

Figure 7 depicts the contour plots that showcase the effectiveness of the ANN model in terms of training, testing, and validation. The number of hidden layer neurons ranges from three to eight, specifically for the EWR. During the training phase, the R2 values regularly surpass 0.98. Furthermore, for both testing and validation, each ANN attains an exceptional R2 value of 1.0 for the response measures. It is worth mentioning that the ANN model, which consists of eight neurons in the hidden layer, exhibits improved R2 values. Specifically, the R2train value is 0.99, while both R2test and R2val are equal to 1.0 for the EWR. Furthermore, the EWR, as illustrated in Fig. 7, exhibits MAE values for the three stages of ANN development. Specifically, the MAE values are as follows: MAEtrain = 1.03E-04 mg/min, MAEtest = 0.011 mg/min, and MAEval = 0.00026 mg/min. Furthermore, the RMSE values for the training, testing, and validation sets are as given: RMSEtrain = 1.28E-04 mg/min, RMSEtest = 0.011 mg/min, and RMSEval = 0.00028 mg/min. The graphical representation presented in Fig. 7 highlights the notable effectiveness of a trained ANN consisting of eight hidden layer neurons across the various stages of EWR.

Fig. 7 The ANN model built for R2, MAE, and RMSE in the training, testing, and validation phase for EWR

Figure 8 depicts the results of the ANN model in terms of all stages. The number of hidden layer neurons for the OC scenario is varied between three and eight. During the training phase, the R2 values routinely surpass 0.99. Additionally, for both testing and validation, each ANN achieves an exceptional R2 value of 1.0 for the response measures. It is worth mentioning that the ANN model, which incorporates a hidden layer consisting of five neurons, exhibits improved R2 values. Specifically, the R2train value is 0.99, while both R2test and R2val are equal to 1.0 for the OC. The ANN model with five hidden layer neurons for the OC demonstrates the lowest MAE and RMSE. Regarding the observed OC as illustrated in Fig. 8, it is noteworthy to mention that throughout the three stages of ANN development, the MAE values are as follows: MAEtrain = 0.0018 mm, MAEtest = 0.03 mm, and MAEval = 0.0002 mm. The RMSE results for the training, testing, and validation sets are as given: RMSEtrain = 0.0029 mm, RMSEtest = 0.045 mm, and RMSEval = 0.00038 mm. The graphical representation presented in Fig. 8 highlights the notable efficacy exhibited by a trained ANN consisting of five hidden layer neurons across the various stages of OC.

Fig. 8 The ANN model built for R2, MAE, and RMSE in the training, testing, and validation phase for OC

5 External validation test

The ANN-trained model undergoes a thorough validation procedure using a comprehensive dataset that covers all possible operational cases of the system under review. Because this dataset is distinct, it challenges the trained neural networks and serves as a valuable resource for assessing how accurately they predict. Further clarification regarding this validation approach is provided in the following sections. A critical step in developing a reliable function approximator is determining the number of neurons to include in the hidden layer. This involves a sequence of training and testing iterations in which several ANNs with different numbers of neurons are used. After these iterations, independent validation is performed on the trained networks to select the number of neurons needed for high-quality results.

The verified ANNs are subjected to a Valext assessment in which the number of neurons in the hidden layer is varied between 3 and 8. The ANNs are evaluated using statistical metrics, which are shown in Fig. 9a, b. The ANNs generally demonstrate strong predictive ability, as evidenced by R2 values > 0.98 for both EWR and OC. For EWR, the ANN with eight neurons gives the most favourable error metrics, with an MAE of 0.00026 mg/min and an RMSE of 0.00028 mg/min, as depicted in Fig. 9a. For OC, the design with five hidden layer neurons gives the lowest error magnitudes, with MAE and RMSE values of 0.00029 mm and 0.00038 mm, respectively, as depicted in Fig. 9b. These results remain consistent when the ANN is applied to the Valext dataset, giving an R2 value that surpasses 0.98. Figures 7 and 8 provide visual evidence that the ANN models with eight and five hidden layer neurons outperform the other models in the Valext test. Therefore, the ideal number of neurons for accurate performance estimation in the context of EDM is eight for EWR and five for OC, resulting in satisfactory performance metrics.

Fig. 9

External validation test for ANN a EWR; b OC
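
The selection of the hidden layer size described above can be sketched as a comparison of external-validation errors across the candidate networks. The text does not state the exact selection rule, so this sketch assumes that the network with the lowest RMSE is chosen, with MAE used to break ties; the function name and all error values are hypothetical.

```python
def select_hidden_neurons(scores):
    """Pick the neuron count with the lowest RMSE, breaking ties by MAE.

    scores maps neuron count -> (rmse, mae) on the external validation set.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # Sort key: RMSE first, then MAE, then the smaller network
    return min(scores, key=lambda n: (scores[n][0], scores[n][1], n))

# Hypothetical error values for 3 to 8 neurons, for illustration only
example_scores = {
    3: (0.0040, 0.0031),
    4: (0.0022, 0.0017),
    5: (0.0035, 0.0029),
    6: (0.0019, 0.0015),
    7: (0.0027, 0.0020),
    8: (0.0009, 0.0007),
}
best = select_hidden_neurons(example_scores)
```

Preferring the smaller network on an exact tie is a design choice of this sketch, made to favour the simpler model.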

5.1 Parametric analysis

5.1.1 Electrode wear rate

Compared with the SCT Cu electrode, the DCT Cu electrode performed exceptionally well, producing the lowest EWR in all permutations of the factorial design. CT confers particular benefits by making electrodes harder and more resilient, and increased hardness leads to improved wear characteristics and extended tool life. The better performance of the DCT Cu electrode in this investigation is consistent with the more pronounced reduction in grain size obtained with DCT than with SCT. This grain refinement leads to better wear characteristics because harder materials usually wear less. The higher hardness of the DCT Cu electrode also reduces discharge heat during the pulse duration and prevents irregular sparking. The decreased discharge heat in turn lessens melting and evaporation of the electrode material.

Notable results are obtained from the analysis of three different Cp values for EWR when WCO is used during the EDM of IN617. The results show that a lower Cp is correlated with a higher EWR. Alumina powder creates an essential conductive bridge between the tool and the base material. At the minimum Cp, the powder serves as an efficient conductive bridge and increases spark concentration. This raises the EWR because more heat is released during the discharge, melting and vaporizing the electrode material. The creation of the plasma channel with and without the nano-powder is shown in Fig. 10. Figure 10 shows that the addition of nano-powder increased the production of the plasma channel and eroded the tool to the same level. At a Cp of 1 g/L, additional powder agglomerates, obstructing current flow and reducing discharge heat, which led to a lower EWR. Similarly, a greater Cp in WCO results in a lower EWR: when Cp exceeds the ideal threshold, agglomeration restricts current flow and hinders spark concentration on the base alloy and electrode material, which lowers EWR.

Fig. 10

Phenomenon of plasma channel generation

The effect of adding the surfactant Span-80 to the dielectric fluid (WCO) was evaluated during EDM of the Ni-based superalloy using cryo-treated electrodes. The EWR increased when a lower Sc (6%/L) was used. This can be explained by there being too little Span-80 to mix the nanoparticles adequately into the dielectric. As a result, the alumina agglomerated, and a large share of the spark density produced during the pulse-on period was not transferred efficiently to the workpiece material, which increased EWR. The second highest EWR was observed with an Sc of 8% in the WCO with CT electrodes. At this moderate Sc, the alumina powder was more evenly distributed in the dielectric, and the surfactant reduced the fluid’s viscosity, which made it harder for eroded material to stay in the machining zone. The WCO with the highest Sc (10%/L) produced the lowest EWR of all concentrations. At this concentration, the alumina powder was distributed still more evenly, which lessened agglomeration and encouraged a proper distribution of spark density on the workpiece surface.

5.1.2 Dimensional error

Grain refinement is responsible for the better performance of the DCT Cu electrode compared to the SCT Cu electrode, since DCT produces noticeably more pronounced grain refinement. The absence of erratic sparking during machining is therefore attributed to the refined grain structure of the DCT Cu electrode. The diametric error produced by the DCT Cu electrode (Fig. 11a) is smaller than that produced by the SCT Cu electrode (Fig. 11b). The improved microstructure brought about by CT is directly linked to this better performance: the refined microstructure prevents over-sparking, so the DCT Cu electrode yields more accurate dimensions.

Fig. 11

Diametric error in machining profiles taken by CMM a DCT Cu electrode; b SCT Cu electrode

Notable insights are obtained from examining three distinct Cp values during the EDM of IN617 in WCO. The findings show that lower Cp values are associated with decreased OC. Alumina powder provides an essential conductive link between the tool and the base material. At the lowest Cp, agglomeration is avoided, which improves conductivity and reduces diametric error. In contrast, a Cp of 1 g/L is associated with higher OC, because aggregated powder particles create a larger plasma channel. At a higher alumina Cp in WCO, OC is reduced again: above the ideal threshold, agglomeration reduces current flow and prevents spark concentration on the base alloy, which lowers the OC.

The impact of adding surfactant to the dielectric fluid (WCO) was assessed during EDM of the Ni-based superalloy. OC increased when a lower Sc (6%/L) was used. This can be explained by there being too little surfactant to mix the nanoparticles properly into the dielectric. The alumina therefore clumped together, and a large share of the spark density produced during the pulse-on period was not properly transferred to the workpiece material, which increased OC. Using an 8% Sc in the WCO produced the second highest OC. At this moderate Sc, the alumina powder was distributed effectively in the dielectric, and the surfactant reduced the viscosity of the dielectric fluid, which made it difficult for eroded material to stay in the machining zone. The highest concentration of Span-80 in the WCO (10%/L) produced the lowest OC of all concentrations. By properly dispersing the alumina powder, this Sc reduced agglomeration and encouraged a good distribution of spark density on the workpiece surface.

6 Sensitivity analysis

To gain a deeper understanding of the system, it is crucial to evaluate how the output parameters respond to changes in the input variables. An SA is run on the developed ANN model to evaluate the impact of the input parameters on the EWR and OC in the EDM process. The analysis varies the input variable of interest systematically across its complete operating range, from its smallest to its highest value, while the other input parameters are held constant at their mean values. The resulting total shift in the output parameter is quantified and normalized, so that the importance of each input variable can be expressed as a percentage.
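
The procedure described above can be sketched as a one-at-a-time sweep. This is a minimal illustration rather than the implementation used in the study: it assumes that the "shift" is the difference between the largest and smallest output seen during the sweep, and the stand-in model, input ranges, and mean values are hypothetical placeholders.

```python
def one_at_a_time_sensitivity(model, bounds, means, steps=21):
    """Percentage importance of each input from a one-at-a-time sweep.

    model  : callable taking a list of inputs and returning one output
    bounds : list of (low, high) operating ranges, one per input
    means  : list of mean input values used for the inputs held constant
    """
    shifts = []
    for i, (low, high) in enumerate(bounds):
        outputs = []
        for k in range(steps):
            x = list(means)  # all other inputs stay at their mean values
            x[i] = low + (high - low) * k / (steps - 1)
            outputs.append(model(x))
        # Total shift in the output over the full range of input i
        shifts.append(max(outputs) - min(outputs))
    total = sum(shifts)
    if total == 0:
        return [0.0 for _ in shifts]
    # Normalise so that the importances sum to 100%
    return [100.0 * s / total for s in shifts]

# Hypothetical stand-in for the trained ANN (not the study's model)
def toy_model(x):
    treatment, cp, sc = x
    return 0.5 * treatment + 1.0 * cp + 0.25 * sc

# Placeholder ranges and means, for illustration only
bounds = [(0.0, 1.0), (0.5, 1.5), (6.0, 10.0)]
means = [0.5, 1.0, 8.0]
importance = one_at_a_time_sensitivity(toy_model, bounds, means)
```

Because each input is varied while the others stay at their means, this kind of sweep does not capture interactions between inputs.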

Figure 12 illustrates the percentage importance of the input factors for the EWR and OC of IN617. For EWR, Sc contributes the largest share, accounting for 52.41%, followed by Cp at 33.14% and treatment at 14.43%. For OC, Sc is again the most significant variable, accounting for 72.67%, followed by Cp at 21.25% and treatment at 6.06%. Moreover, the notable contribution of DCT can be ascribed to the refinement of grain size, leading to a heightened level of homogeneity and regularity in the electrical discharge machining procedure.

Fig. 12

Percentage SA of process parameters on EWR and OC

6.1 Parametric optimization

For a particular set of input parameters, parametric optimization (PO) is a methodical and efficient process design technique that produces better results within a specific range of input data. Because the EDM process is inherently nonlinear, PO is a promising technique for improving EDM cutting efficiency and determining the ideal input parameter values. To achieve PO, a unique value is found for each level of an input parameter. The optimized values for each input parameter are calculated for the EDM of the superalloy used and are then applied to determine the EWR and OC.

Figure 13 shows the optimized input parameter values for the EDM process. These values were determined by a comprehensive study using the PO method to minimize both EWR and OC. The optimized input parameters are as follows: the value for the treatment (DCT) is 1.003, the Cp is 1.49 g/L, and the Sc is 8.57%/L. With these settings, the EDM process achieves an EWR of 0.0017 mg/min and an OC of 0.07 mm. These findings indicate considerable improvements: EWR is reduced by 47.05% and OC by 85.00% compared to the maximum values of EWR and OC recorded in the original dataset.

Fig. 13

PO analysis for EWR and OC
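
The text does not specify the algorithm behind PO, so the following is only an illustrative sketch of one simple way to minimize two responses at once: an exhaustive grid search over the input levels that scores each point by the equally weighted sum of min-max scaled responses. The surrogate functions, grids, and names are hypothetical and do not reproduce the study's ANN models. The helper `percent_reduction` shows how a reduction relative to a reference value, such as the dataset maximum, is expressed as a percentage.

```python
from itertools import product

def grid_search_minimize(objectives, grids):
    """Return the grid point minimising the sum of min-max scaled objectives."""
    points = list(product(*grids))
    values = [[f(p) for f in objectives] for p in points]
    lows = [min(col) for col in zip(*values)]
    highs = [max(col) for col in zip(*values)]

    def score(row):
        # Scale each objective to [0, 1] so both responses weigh equally
        return sum(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )

    best_index = min(range(len(points)), key=lambda i: score(values[i]))
    return points[best_index], values[best_index]

def percent_reduction(reference, optimized):
    """Reduction of `optimized` relative to `reference`, in percent."""
    return 100.0 * (reference - optimized) / reference

# Hypothetical surrogate responses standing in for the trained ANNs
def toy_ewr(p):
    cp, sc = p
    return (cp - 1.5) ** 2 + (sc - 9.0) ** 2

def toy_oc(p):
    cp, sc = p
    return (cp - 1.5) ** 2 + (sc - 8.0) ** 2

# Placeholder grids, for illustration only
cp_grid = [0.5, 1.0, 1.5]
sc_grid = [6.0, 7.0, 8.0, 9.0, 10.0]
best_point, best_values = grid_search_minimize([toy_ewr, toy_oc], [cp_grid, sc_grid])
```

A grid search only returns values that lie on the grid, so a continuous optimizer would be needed to reach intermediate settings.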

7 Conclusions

Conventional analytical techniques and classical process modelling are commonly used to assess the effectiveness of production processes and pinpoint areas for improvement. In this work, a comprehensive AI-driven framework for process analysis has been deployed with a specific focus on the manufacturing sector, with the aim of improving the performance of the system under evaluation. The primary objective of this application is to reduce the EWR and OC for IN617 in the EDM process. The main findings of the case study are summarized below.

  1. The training of the AI model, specifically the ANN employed in this study, involves thorough hyperparameter fine-tuning. Subsequent evaluation indicates that the constructed ANN demonstrates exceptional predictive proficiency for both EWR and OC in the EDM of the superalloy used. For EWR, the model achieves an R2 value surpassing 0.98, with corresponding MAE and RMSE values of 1.03E−4 mg/min and 1.28E−4 mg/min, respectively. This underscores the outstanding generalization capabilities of the trained ANN model.

  2. For OC, an R2 value greater than 0.99 is likewise achieved, along with MAE and RMSE values of 0.0018 mm and 0.0022 mm, respectively. These results confirm the exceptional generalization abilities of the trained ANN model.

  3. The investigation shows that a Cu electrode subjected to DCT significantly influences both EWR and OC because its grain size is substantially more refined than that of the Cu electrode subjected to SCT. A higher concentration of alumina powder is associated with lower EWR, and a higher Sc likewise contributes to a lower EWR. The DCT Cu electrode also has the more pronounced influence on achieving a superior EWR and dimensional accuracy.

  4. To unveil the hierarchy of variable significance in the EDM process, an SA is undertaken. The findings indicate that, for EWR, Sc is the most crucial factor with a 52.41% contribution, followed by Cp (33.14%) and treatment (14.43%) for the IN617 material. For OC, Sc also leads with the highest percentage significance at 72.67%, followed by Cp (21.25%) and treatment (6.06%). Furthermore, DCT consistently exhibits greater contributions to both response measures.

  5. PO processes are carried out, resulting in the determination of optimal input parameters that align with minimizing EWR and OC in the EDM process. The identified optimal input values are as follows: treatment = DCT; Cp = 1.49 g/L; Sc = 8.57%/L.

  6. With the optimal parameter configuration obtained through PO, the EWR and OC values in the EDM process improve significantly: they are 47.05% and 85.00% lower, respectively, than the highest EWR and OC values observed in the machining of IN617.