1 Introduction

1.1 Traditional quality monitoring of gas metal arc additive manufacturing

Gas metal arc additive manufacturing (GMA-DED) [1], or wire arc additive manufacturing (WAAM), is now regarded as a viable and cost-effective alternative to subtractive technology and casting for the production of relatively large and complex metallic components. It is effectively an automated welding deposition process, and it is subject to similar quality considerations as other welding processes, requiring optimization of process parameters, procedure qualification, and monitoring.

In the case of GMA-DED, product value is generated progressively as the component is deposited, and early identification of quality issues during build-up is important to allow intervention at an early stage, reducing the waste of hours of deposition time and kilograms of deposited metal. High-speed transient data acquisition systems have already been developed for online welding monitoring; these systems can therefore also be used for GMA-DED process monitoring.

With these systems, data can be readily obtained from several sensors, but this raw data must be processed to indicate potential quality anomalies. For welding, a range of predetermined tolerance limits, statistical analysis, numerical signature recognition, and rule-based techniques have been applied to analyze the data and indicate the possible anomalies and their location [2,3,4].

In most cases, the relationship between signal features and defects is indirect, and the system is often trained by inducing defects, for example, by restricting gas flow to induce porosity—in which case the arc instability is taken as an indirect indication of porosity. It is important to note that, although several studies have aimed to monitor and control the welding process, the layer-by-layer nature of the GMA-DED process introduces several new challenges, due to the multiple, repetitive heating of the layers and the importance of maintaining the layer geometry within specified parameters [5, 6]. Therefore, despite the applicability of these techniques in online monitoring systems, the development of quality inspection procedures specifically directed at GMA-DED has more recently gained attention, as reflected in the increasing number of research papers aimed at enhancing final product quality [7, 8].

1.2 Brief survey on state-of-the-art monitoring techniques for gas metal arc additive manufacturing

A Scopus survey targeting titles, abstracts, and keywords related to WAAM, monitoring, and defect detection yielded 63 relevant documents. The review indicated that in 2017, Artaza et al. retrofitted a CNC machine with a plasma arc source and employed a combination of sensors, including a pyrometer, welding camera, laser, and voltage sensor, for real-time process monitoring [9]. In this application, all the sensor data were presented to users, giving them the possibility to monitor the process. Furthermore, a threshold on welding voltage variation was used to identify anomalies during deposition, based on previous research on welding technology [10,11,12].

Subsequent developments in WAAM were made by Zhang et al. [13], who introduced computer vision for measuring wire deflection during the process using a welding camera, while addressing variations in process parameters.

In 2020, Wang et al. incorporated artificial intelligence to analyze welding pool images for bead width and height estimation [14], which was claimed to be useful for feedback controller development.

Chabot et al. in 2021 employed frequency response analysis using the fast Fourier transform (FFT) to correlate the short circuit frequency during cold metal transfer WAAM with the contact tip to workpiece distance (CTWD) [15]. This offered an indirect measurement useful for feedback controllers. Hebert et al. further expanded this approach in 2022 by employing feature extraction from audio signals and a random forest ML analysis [16], and claimed improved estimation performance compared with linear regression.

The integration of image-processing techniques for defect detection has gained prominence in subsequent years. He et al. in 2021 employed convolutional neural networks (CNN) for magneto-optical image analysis [17]. Lee et al. in the same year developed a binary classifier for anomaly detection using a CNN and welding pool images [18]. A similar approach was presented by Xia et al. [19, 20], who employed a supervised multiclass approach aiming to classify instabilities and defects. Cho et al. in 2022 demonstrated the feasibility of these methodologies for monitoring molybdenum alloys [21].

The shift towards using advanced techniques to process acoustic signals for defect detection emerged in 2023. Surovi et al. demonstrated the extraction of frequency features from audio signals, achieving a high F1 score of 0.89 for classifying normal and defective depositions during Inconel 718 production [22].

Alcaraz et al. extended this approach by combining audio signals with welding current and voltage signals, utilizing long short-term memory (LSTM) classifiers for detecting porosity [23]. Rajesh et al. employed a combination of welding current, voltage, and audio sensors, applying random forest and artificial neural networks to solve anomaly detection tasks using a supervised binary classification approach [24]. Rohe et al. introduced a unique time-frequency approach using spectrograms and deep learning for detecting anomalies in audio emission signals [25].

However, the employment of audio signals is not new. In fact, the literature reveals a comprehensive exploration of both non-machine learning (non-ML) and machine learning (ML) acoustic approaches for defect detection. Time domain, frequency domain, and wavelet analyses have been extensively employed to identify various defects, with features such as peak amplitude, kurtosis, energy, and decomposition coefficients playing crucial roles [22]. While the utilization of audio signals for monitoring is not a novel concept, it is widely acknowledged that the acoustic emission of the welding process is intricately linked to both welding current and welding voltage [26]. Therefore, welding voltage and welding current are also important parameters to monitor for online process quality assessment, given that alterations in the operating characteristics of the arc can have a profound impact on the quality of the fabricated components [27]. Nevertheless, the existing body of research in this domain remains constrained to the extraction of time-domain features from the waveforms of welding current and voltage signals and to the use of supervised machine and deep learning to develop monitoring applications [28, 29].

1.3 Gap of knowledge and work proposal

The correlation between audio signals and welding voltage and welding current, particularly within the repetitive process of controlled droplet dip-transfer, suggests that exploring the frequency domain content of transient voltage and current may be used to derive more significant features associated with both normal and anomalous conditions [30].

Despite this potential, the existing research landscape lacks comprehensive insights into this aspect, indicating a gap in our current understanding.

Moreover, while the integration of supervised learning methods has demonstrated promising results in the realm of online process monitoring for WAAM, several challenges arise when attempting to implement these approaches in an industrial environment. One of the primary challenges lies in the manual labelling process itself. Creating a labelled dataset for training supervised learning models requires experts to meticulously annotate data points, specifying whether they represent normal or defective instances. This process is not only time-consuming but also demands a deep understanding of the specific defects and anomalies that may occur during the WAAM process. Another challenge is the need for a balanced dataset to avoid problems such as poor generalization. In fact, as mentioned earlier, defects are deliberately induced within components, leading to additional waste of material, given the large number of defects that must be generated to solve a data-driven classification task.

Furthermore, industrial environments are inherently dynamic, with manufacturing processes subject to variations over time, for example, due to wear of equipment. The need for continuous retraining of models to accommodate changes in the production environment adds another layer of complexity. In this context, unsupervised learning techniques offer a solution to these challenges; however, the work related to this problem remains restricted to a few studies [31,32,33].

Drawing from the knowledge presented above, the current study aims to introduce a new strategy based on frequency analysis of welding signals and unsupervised learning to develop a monitoring application for a GMA-DED process. In particular, motivated by recent standards such as ISO/ASTM 52943-2:2024 [34], the proposed methodology makes it possible to develop an anomaly detection application with good accuracy using a small dataset composed only of defect-free deposition data.

2 Methodology

2.1 Experimental setup

In this study, 49 beads on a plate of 70-mm length were deposited using process parameters that resulted in defect-free deposition. More precisely, in this proof-of-concept study, “defect-free” deposits were identified as those having a uniform bead profile, absence of humping or discontinuities, absence of surface-breaking cracks, lack of spatter, and no surface porosity. The experimental setup, illustrated in Figure 1, consisted of a 9-axis Yaskawa welding robot station equipped with a Lincoln Electric PowerWave 500 welding system. During these defect-free deposition experiments, the welding voltage and welding current signals were collected at a sampling frequency of 5 kHz using an NI-6009 device, facilitating the tracking of high-frequency variations in voltage and current during deposition. The raw material used was a 1.2-mm diameter AWS A5.18 ER70S-6 wire, with a shielding gas mixture of 75% argon, 20% CO2, and 5% oxygen for process protection. Additionally, aiming to collect data related to a multi-layer deposition process, two wall structures of ten layers each, namely Wall 1 and Wall 2, were printed while maintaining the interpass temperature around 30 °C, and data were collected. The data collection process for the raw data and the deposition strategy for the wall structures are illustrated in Figure 2. This defect-free data collection procedure aligns with the recent ISO/ASTM 52943-2:2024 standard [34], which mandates the use of welding procedure type tests; such tests could be the primary source of training data for innovative ML algorithms in the field of welding.

Fig. 1
figure 1

The 9-axis Yaskawa welding robot cell with the PowerWave 500 power source is used for this experiment. For software integration, an AI computer is used to acquire and manipulate data, communicating through TCP/IP with the DX200 robot controller

Fig. 2
figure 2

Employed experimental data collection procedure

Surface tension transfer (STT) waveform-controlled short circuit welding technology was utilized to control the deposition process. This technology, which controls the output current waveform during the short circuit phase through a high-speed inverter, results in lower heat input during deposition. This approach reduces spatter, final component distortion, and residual stresses, ultimately enhancing the mechanical properties of WAAM components [35].

The experimental campaign, detailed in Table 1, involved varying the welding speed, gas flow rate, and contact tip to workpiece distance (CTWD) while keeping the waveform parameters, namely the background current and peak current, and the wire feed speed constant at 120 A, 320 A, and 4.5 mm/min, respectively. In fact, the fixed parameters are mainly related to the wire material and dimension and are subject to procedure qualification, while the others may be varied within defined ranges during the deposition process by feedback controllers.

Table 1 Experimental campaign beads on plate in which welding speed (WS), gas flow rate (GFR), and contact to workpiece distance (CTWD) have been varied

Walls 1 and 2 were printed using the same parameters employed for the deposition of beads 45 and 46. Specifically, Wall 2 was printed with feedback compensation to control the CTWD, while for Wall 1, this feedback compensation was not used, and the CTWD was controlled based on the bead model output. However, due to complex thermal phenomena, the actual layer height often differs from the planned height, leading to defects such as porosity or excessive spatter during the building process. Therefore, the data associated with the Wall 1 building process (the only dataset containing data from parts with defects) were classified by experts in welding technology via the surface appearance and the sound emitted during the process, and used to test the anomaly detection performance of the ML algorithms.

2.2 Dataset generation and methodology

Once the welding current and welding voltage data had been collected at a high sampling frequency of 5 kHz, allowing the capture of instantaneous variations related to the presence of defects, a windowing approach [2] was adopted, aiming to develop an online process monitoring application for WAAM.

The collected raw data were segmented into 1-s windows, each consisting of 5000 samples, to simulate a buffer used during the welding process for both welding current and voltage. From these raw data windows, a total of 257 training samples were obtained, each containing 5000 samples of welding current and voltage, giving a dataset of shape (257, 5000, 2). Aiming to tune the hyperparameters of the machine learning algorithms used to learn the complex patterns of normality in the data, we split this dataset into training and validation sets, with 90% (231 samples) allocated for training and 10% (26 samples) for validation. The hyperparameters were tuned based on accuracy in normal behavior detection.
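The buffering and splitting steps described above can be sketched as follows. This is an illustrative numpy reconstruction, not the authors' code: the stream length and the random signals are placeholders standing in for the recorded defect-free depositions.

```python
import numpy as np

def make_windows(current, voltage, fs=5000, win_s=1.0):
    """Segment synchronized current/voltage streams into non-overlapping
    1-s buffers, giving an array of shape (n_windows, fs*win_s, 2)."""
    n = int(fs * win_s)
    n_win = min(len(current), len(voltage)) // n
    stacked = np.stack([current[:n_win * n], voltage[:n_win * n]], axis=-1)
    return stacked.reshape(n_win, n, 2)

# Placeholder streams standing in for the recorded defect-free depositions.
rng = np.random.default_rng(0)
current = rng.normal(200.0, 30.0, size=260 * 5000)
voltage = rng.normal(22.0, 2.0, size=260 * 5000)

windows = make_windows(current, voltage)   # shape (260, 5000, 2)
split = int(0.9 * len(windows))            # 90% training, 10% validation
train, val = windows[:split], windows[split:]
```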

Finally, the test phase, used to compare the different proposed methodologies, comprises 150 samples associated with the ten layers deposited during the Wall 1 (W1) construction, in which every sample has been labelled as normal or anomalous with the help of experts in welding technology.

As stated in the introduction of this work, different methodologies for feature extraction are proposed and compared using several unsupervised machine learning algorithms. Before being used as input to the machine learning algorithms, the extracted features were normalized using the formula reported in Equation 1. This normalization scales each feature (denoted as j) based on its minimum and maximum values over the training dataset. This step is important to enhance ML algorithm performance [36].

In this study, statistical and process-based features were extracted from time-domain signals, which is a well-established method for analyzing welding current and voltage signals. However, the frequency domain of these signals may contain additional information about physical phenomena involved in the welding process, such as droplet detachment, which is linked to process stability and defect occurrence. In order to explore the results of employing frequency domain analysis in WAAM anomaly detection, we compared the results obtained using standard time-domain features with those from an innovative frequency-domain feature extraction technique specifically designed for welding current and voltage signals. The overall approach of the proposed methodology is illustrated in Figure 3.

Fig. 3
figure 3

General scheme of the presented methodology. Once the data have been collected using a window approach, features are extracted from both buffers of welding current and voltage which contain 5000 samples each. The extracted features are used in an unsupervised machine learning algorithm that produces feedback to the Yaskawa cell every second

$${x}_{i}^{j}=\frac{{x}_{i}^{j}-{x}_{\text{min}}^{j} }{{x}_{\text{max}}^{j}-{x}_{\text{min}}^{j}}$$
(1)
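Equation 1 amounts to a min-max scaler whose minima and maxima are learned from the training set only. A minimal sketch follows; the small epsilon guarding against constant features is our addition, not part of the paper.

```python
import numpy as np

def fit_minmax(train_features):
    """Per-feature minima and maxima learned from the training set (Eq. 1)."""
    return train_features.min(axis=0), train_features.max(axis=0)

def apply_minmax(features, fmin, fmax, eps=1e-12):
    # eps avoids division by zero for constant features (our addition).
    return (features - fmin) / (fmax - fmin + eps)

# Toy 3-sample, 2-feature training matrix.
train = np.array([[0.0, 10.0],
                  [2.0, 30.0],
                  [4.0, 20.0]])
fmin, fmax = fit_minmax(train)
scaled = apply_minmax(train, fmin, fmax)
# Training features land in [0, 1]; unseen test features may fall outside,
# which is itself a hint of anomalous behavior.
```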

2.3 Unsupervised machine learning methods and anomaly detection metrics

As discussed in the introduction, machine learning (ML) methods have been extensively used in arc welding and WAAM processes due to their capability to address complex data-driven tasks such as process modelling, monitoring, and anomaly detection. As outlined in the introduction, ML methods can be categorized into supervised and unsupervised techniques (see Figure 4). Supervised methods utilize labelled datasets to map inputs, such as features extracted from welding signals to specific labels like defects (e.g., humping, spattering) or bead geometry characteristics. In contrast, unsupervised methods analyze the input data without additional labels to identify patterns, cluster similar data points, and detect anomalies that deviate from normal patterns.

Fig. 4
figure 4

A graphic high-level classification of machine learning methods

While supervised learning methods have shown promise for online monitoring in WAAM, they face challenges such as the time-consuming manual labelling process, the need for a balanced dataset for effective generalization, and the necessity for continuous retraining due to dynamic manufacturing environments. In contrast, unsupervised learning algorithms can be more effective for anomaly detection, as they identify anomalies as deviations from normal behavior without needing labelled data. In this work, we employ unsupervised learning to detect anomalies using only defect-free deposition data. Among various methods, isolation forest, one-class support vector machine, and local outlier factor represent the state-of-the-art techniques, and we applied these methods to process features extracted from normal deposition data, as presented in the previous section.

2.3.1 Isolation forest

Isolation Forest [37] is a tree-based algorithm that isolates anomalies by randomly partitioning data points into subsets, recursively splitting the subsets until each point is in a separate subset. Anomalies are identified as points that require fewer splits to isolate from the rest of the data, as they are less representative of the data distribution. The resulting anomaly scores may also be used for root cause detection. In this case, several decision trees (DTs) are used to isolate the training data, e.g., using normal behavior data, and the output scores of the individual trees are aggregated to give an overall anomaly score.

2.3.2 One-class support vector machine

One-class SVM (OC-SVM) [38] is a variant of the support vector machine (SVM) developed specifically for anomaly detection. It is a type of unsupervised learning algorithm that aims to learn a decision boundary separating the normal data from the anomalies in a dataset. The decision boundary is learned by maximizing the margin around the normal data, which is the distance between the boundary and the closest normal data points. After defining the hyperplane that maximizes the margin with respect to the closest normal data points, the algorithm returns +1 if a point X is inside the boundary and −1 otherwise.

2.3.3 Local outlier factor

Local outlier factor (LOF) is a commonly used density-based technique that identifies local outliers in the dataset by comparing the local density of each data point with that of its neighborhood [39]. Given a point O of the dataset, described by the 12 extracted features, its k-nearest neighbors are considered. The local reachability density (LRD) is then evaluated as the inverse of the average distance of O to its neighbors, measured by the Euclidean distance. The LOF is defined as the ratio between the LRD of point O and the average LRD of its neighbors. If the LOF is greater than 1, an anomaly is found. As for all the presented algorithms, only the training dataset was used during the training phase, and a k value of 5 was employed.

2.3.4 Hyperparameters employed in this work

In this work, once the features have been extracted following the different methodologies (time domain and time-frequency domain), the proposed algorithms were trained. The validation accuracy was used to tune the hyperparameters, and the results were then compared, aiming to assess the improvement in performance introduced by frequency analysis of the welding signals. Specifically, the Isolation Forest was constructed with 20 trees utilizing three random features to split the normal data, with each tree using 75% of the training data and 75% of the total features to mitigate overfitting. A contamination rate of 1% was applied. For the LOF algorithm, a contamination rate of 1% and ten neighbors for score evaluation were employed. The OCSVM utilized a radial basis function kernel and a ν factor (an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors) set to 0.01. Finally, to overcome the need to manually extract features, a deep learning approach was also used and compared with the machine learning results.
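With scikit-learn, detectors matching the hyperparameters listed above can be instantiated roughly as follows. This is a sketch, not the authors' code: the synthetic feature matrices are placeholders, and mapping "75% of the training data and 75% of the total features" onto `max_samples`/`max_features` is our assumption.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X_train = rng.normal(0.0, 1.0, size=(231, 12))   # normalized normal-only features

# 20 trees, 75% of samples and features per tree, 1% contamination.
iso = IsolationForest(n_estimators=20, max_samples=0.75, max_features=0.75,
                      contamination=0.01, random_state=0).fit(X_train)
# Ten neighbors, 1% contamination; novelty=True allows scoring unseen windows.
lof = LocalOutlierFactor(n_neighbors=10, contamination=0.01,
                         novelty=True).fit(X_train)
# RBF kernel with nu = 0.01.
ocsvm = OneClassSVM(kernel="rbf", nu=0.01).fit(X_train)

X_test = np.vstack([rng.normal(0.0, 1.0, (10, 12)),   # normal-like windows
                    rng.normal(8.0, 1.0, (5, 12))])   # far-off anomalies
# Each model returns +1 for inliers and -1 for outliers.
preds = {name: m.predict(X_test) for name, m in
         [("iforest", iso), ("lof", lof), ("ocsvm", ocsvm)]}
```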

2.3.5 Metrics for anomaly detection

When dealing with unbalanced datasets, accuracy alone is not a reliable performance metric, especially in anomaly detection where anomalies are rare. Instead, metrics like precision (p) and recall (re) should be evaluated. Therefore, the performance of the proposed methodologies has been evaluated using precision, recall, and the F-score, as outlined in Equations 2, 3, and 4. In these equations, TP represents the number of windows correctly labelled as anomalies, TN represents correctly labelled normal cases, and FP and FN represent false alarms and undetected anomalies, respectively.

In the context of anomaly detection, it is often preferable to detect more anomalies, even if some are false positives, to avoid missing defects. This is particularly crucial in additive manufacturing, where undetected defects can have significant consequences, such as wasted deposition time and materials, especially in WAAM, which is used for large-scale components. To evaluate the performance of algorithms, we focus on F2-score, which emphasizes recall, aligning with the goal of minimizing missed anomalies.

$$p=\frac{\text{TP}}{\text{TP}+\text{FP}}$$
(2)
$$re=\frac{\text{TP}}{\text{TP}+\text{FN}}$$
(3)
$$F_{\beta }=\frac{\left(1+{\beta }^{2}\right)\times p\times re}{{\beta }^{2}\times p+ re}, \quad {F}_{2}=\frac{5\times p\times re}{4\times p+ re}$$
(4)
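These metrics can be computed directly with scikit-learn; the toy labels below are illustrative only (1 = anomaly, 0 = normal), with one false alarm and one missed anomaly.

```python
from sklearn.metrics import precision_score, recall_score, fbeta_score

# Toy labels for ten 1-s windows: 1 = anomaly, 0 = normal.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]   # one false alarm, one miss

p  = precision_score(y_true, y_pred)       # TP / (TP + FP) = 3/4
re = recall_score(y_true, y_pred)          # TP / (TP + FN) = 3/4
f2 = fbeta_score(y_true, y_pred, beta=2)   # beta = 2 weights recall over precision
```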

2.4 Time domain features extraction

Concerning time domain feature extraction, statistical and process-based features were derived from the 1-s windows of the welding current and voltage signals; specifically, mean, standard deviation, kurtosis, and skewness values were extracted from both welding current and voltage signals. To capture the characteristics of the standard waveform in the STT process, which should have a maximum value equal to the peak current and a minimum value equal to the background current, two additional process-based features were extracted from the buffer containing the current signals. These features represent the total time during which the current value exceeds the peak value (Equation 5) and falls below the background value (Equation 6).

$$\sum_{\text{on window}}I>{I}_{\text{peak}}$$
(5)
$$\sum_{\text{on window}}I<{I}_{\text{background}}$$
(6)

A current consistently above the peak value suggests a higher heat input to the material, potentially increasing the likelihood of defects such as porosity, humping, and layer collapse. On the other hand, a current consistently below the background value can lead to issues like delamination, lack of fusion, and spatter during the welding process. At the end of the feature extraction step, ten features associated with each buffer were extracted and used to detect anomalies and defects every 1 s of deposition. As discussed, these extracted features are normalized before their use in the machine learning algorithms.
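A sketch of the ten-feature extraction for one 1-s buffer is given below. The two-level waveform is a synthetic stand-in for a real STT current trace; the 320 A peak and 120 A background set points follow the text, while the 330 A/110 A toy levels are our assumptions.

```python
import numpy as np
from scipy.stats import kurtosis, skew

FS = 5000                                  # sampling frequency, Hz
I_PEAK, I_BACKGROUND = 320.0, 120.0        # STT set points from the text

def time_domain_features(current, voltage):
    """Ten features per 1-s buffer: mean, std, kurtosis and skewness of each
    signal, plus the over-peak and under-background durations (Eqs. 5-6)."""
    feats = []
    for sig in (current, voltage):
        feats += [sig.mean(), sig.std(), kurtosis(sig), skew(sig)]
    feats.append(np.sum(current > I_PEAK) / FS)        # time above peak, s
    feats.append(np.sum(current < I_BACKGROUND) / FS)  # time below background, s
    return np.array(feats)

# Toy dip-transfer-like current: 100 periods alternating 330 A / 110 A.
period = np.r_[np.full(25, 330.0), np.full(25, 110.0)]
current = np.tile(period, 100)
t = np.arange(FS) / FS
voltage = 22.0 + 2.0 * np.sin(2 * np.pi * 100 * t)

f = time_domain_features(current, voltage)   # shape (10,)
```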

2.5 Frequency domain features extraction

Frequency domain analysis of signals is an important task in data processing and machine learning for several reasons, including signal denoising and feature extraction from time series. The most common frequency domain analysis techniques are the fast Fourier transform (FFT) and wavelet analysis. The FFT converts a waveform from the time domain to the frequency domain, a representation of the signal in terms of its constituent sine wave frequencies. The FFT can be used to identify the different frequencies present in a signal and their relative magnitudes. However, the FFT is limited in its ability to analyze signals whose frequency content varies over time. To address this limitation, the discrete wavelet transform (DWT) can be used. This technique involves applying a series of high-pass (detail coefficients) and low-pass (approximation coefficients) filters to the signal using the same wavelet, which results in predefined levels of decomposition, as shown in Figure 5.

Fig. 5
figure 5

Applying a DWT, the input signal may be decomposed into several detail coefficients which, as with the results of an FFT, may be used to identify anomalous behavior

Concerning feature extraction in the frequency domain, in this work both FFT and DWT were employed to represent the 1-s windows in the feature space. The aim is to investigate the improvement in anomaly detection via unsupervised learning enhanced by frequency domain analysis of the signal.

To extract features from the 1-s-long signals composed of 5000 samples of welding current and welding voltage, frequency features derived from FFT analysis and DWT were obtained. Before performing the FFT analysis and decomposing the signal using DWT, the influence of the DC content on the signal was minimized by subtracting the mean values, as expressed by Equation 7. The frequency analysis is then conducted on the \(\check I\) and \(\check V\) signals.

$$\check I=I-{I}_{\text{window mean}}; \check V=V-{V}_{\text{window mean}}$$
(7)

Subsequently, the data were utilized to extract features such as the peak frequency (PF), the corresponding peak amplitude (PA), and the energy of the spectrum (E) from the FFT analysis for both welding current and welding voltage, resulting in a total of six features. In the three-level decomposition, features were extracted from the three detail coefficients and the final approximation coefficient. Specifically, two features were derived from each decomposition level: the standard deviation per level, offering insight into the distribution of wavelet coefficients across the decompositions, and the average energy of each level, providing a measure of the overall importance of that component in the signal. In total, 22 features were extracted, collecting information about the general frequency response through the FFT and details in the frequency bands of 2500–1250 Hz (level 1), 1250–625 Hz (level 2), 625–312.5 Hz (level 3), and 312.5–0 Hz (approximation, level 3), using a 3rd-order Daubechies wavelet. The extracted features are reported in Table 2, while a typical decomposition result is illustrated in Figure 6. To compare the results obtained using frequency domain analysis, the same algorithms used with time domain features were applied with the same hyperparameters. In particular, the same training, validation, and testing datasets were used for the metrics evaluation.
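The FFT part of this feature set can be sketched with numpy as below. The DC content is removed per Equation 7 before the transform; the 85 Hz test tone is an arbitrary stand-in for the dominant short-circuit ripple, not a measured value. The DWT features could be obtained analogously, e.g. with PyWavelets' `wavedec(signal, 'db3', level=3)`.

```python
import numpy as np

FS = 5000  # Hz

def fft_features(sig, fs=FS):
    """Peak frequency (PF), peak amplitude (PA) and spectrum energy (E)
    of one 1-s window, after removing the DC content (Eq. 7)."""
    sig = sig - sig.mean()                      # Eq. 7: subtract window mean
    spec = np.abs(np.fft.rfft(sig)) / len(sig)  # single-sided magnitude
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    k = spec.argmax()
    return freqs[k], spec[k], float(np.sum(spec ** 2))

t = np.arange(FS) / FS
# 85 Hz stand-in for the dominant short-circuit ripple on a 220 A offset.
current = 220.0 + 50.0 * np.sin(2 * np.pi * 85 * t)
pf, pa, e = fft_features(current)   # pf = 85.0 Hz; pa = 25.0 (half the amplitude)
```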

Table 2 Extracted frequency features from welding current and welding voltage using FFT and DWT for a total of 26 features
Fig. 6
figure 6

An example of decomposition levels obtained from a 1-s length window of welding current signal. a 0–312.5 Hz. b 312.5–625 Hz. c 625–1250 Hz. d 1250–2500 Hz

2.6 Convolutional autoencoder in spectrogram analysis

Utilizing FFT and DWT can facilitate the extraction of a general frequency response of the welding current and welding voltage signals within a selected window of 5000 samples. However, this approach involves a trade-off between time and frequency content: small, rapid events within the window, like those related to defects during deposition, may be overshadowed by the dominant normal characteristics. A window that is too small inadequately captures the signal’s frequency representation, while a larger window yields a smoother frequency response at the cost of time resolution. In cases where both time and frequency domain information are crucial, an extension of FFT and DWT can be employed, leading to time-frequency domain analysis. The short-time Fourier transform (STFT) serves as a method for analyzing the time-frequency content of signals. It entails applying multiple FFTs to segments of the total signal, utilizing windows of 128 samples or multiples thereof. To enhance frequency content analysis, an overlap between adjacent windows can be incorporated. The general scheme of STFT applied to STT welding signals is illustrated in Figure 7. The outcome is known as a spectrogram, a two-dimensional representation of the signal’s time-frequency content: the x-axis denotes time, and the y-axis illustrates the frequency content of the signal in each time window.

Given that this output can be treated as an image, image processing emerges as a potent tool for the frequency analysis of signals. The fundamental idea in image processing involves convolving a kernel or filter with a raw image, generating another image that highlights specific aspects of the data; the nature of the highlights depends on the filter constructed by the developer. In contemporary approaches, deep learning techniques are increasingly employed to learn the optimal filter structures, aiming to minimize goal-dependent cost functions. In the realm of image processing, these deep learning techniques eliminate the need for manual feature extraction, which traditionally involves selecting filters for convolution.

Fig. 7
figure 7

The output of an STFT on a welding voltage signal. Given a defined window length, an FFT is applied to each segment, and the frequency domain results over time are stored in an image called a spectrogram

In the context of deep learning for anomaly detection tasks, a specific structure known as an autoencoder (AE) is commonly employed [31], as illustrated in Figure 8. A typical AE model employs ANNs to process the input feature space with the objective of compressing the information into a latent space. This latent space automatically extracts features from the input, utilizing them to reconstruct the signal. By employing a standard loss function, such as the mean squared error, and backpropagation algorithms, the model can learn optimal features [32] and uncover hidden patterns in normal signals. Deviations from what the algorithm learned during reconstruction indicate the presence of defects. In the case of images, a 2D convolutional network can be used to process the image, eliminating the need for complex manual filters to extract valuable information, such as features, from the images.

Fig. 8
figure 8

A typical autoencoder architecture. The input is compressed in a latent space using a non-linear transformation, and another non-linear transformation is used to reconstruct the input aiming to minimize the reconstruction error

In this approach, once the data had been collected, spectrograms were obtained using the STFT method, performing FFTs on windows of 128 samples with an overlap of 64 samples, resulting in an output spectrum of dimensions 77 by 65. The two resulting images, corresponding to the spectrograms of welding voltage and current, are then resized to a consistent shape of (64, 64) using bicubic interpolation and concatenated into two channels. A 2D convolutional autoencoder (2D-CAE) is employed to process the images, featuring a symmetrical architecture composed of three hidden convolutional layers. Each layer consists of two stacks of n filters, each with a size of (3×3), applied consecutively; the objective is to capture a higher receptive field with a reduced number of parameters, following the principles outlined in [33]. The three layers in the encoding part are configured with filters in the sequence (8, 8, 16, 16, 64, 64). After each repetition, a stride of 2 is used to halve the input image dimensions. The decoding part mirrors this structure in reverse order. To train the network, the Adam algorithm is used with a learning rate of 0.001 and a batch size of 16 samples for 2000 epochs. During the inference phase, the reconstruction error, namely the mean squared error of the reconstruction, is used to detect anomalies. The threshold THR is selected as the 99.9th percentile of the errors obtained at the end of the training phase. The visual representation of this scheme is illustrated in Figure 9.
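The spectrogram generation and thresholding steps can be sketched as follows. The convolutional autoencoder itself is omitted; the random voltage buffer and the gamma-distributed "training errors" are placeholders for real signals and for the MSEs a trained 2D-CAE would produce.

```python
import numpy as np
from scipy.signal import stft

FS = 5000
rng = np.random.default_rng(2)
voltage = 22.0 + rng.normal(0.0, 1.0, FS)   # stand-in 1-s voltage buffer

# 128-sample FFT windows, 64-sample overlap, no padding: 65 frequency bins
# by 77 time frames per 1-s buffer, matching the dimensions in the text.
f, t, Z = stft(voltage - voltage.mean(), fs=FS,
               nperseg=128, noverlap=64, boundary=None, padded=False)
spectrogram = np.abs(Z)                      # shape (65, 77)

# Threshold: 99.9th percentile of the training reconstruction errors.
train_errors = rng.gamma(2.0, 0.01, size=2000)   # placeholder CAE MSEs
THR = np.quantile(train_errors, 0.999)
is_anomaly = lambda mse: mse > THR
```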

Fig. 9
figure 9

The proposed methodology consists of training a 2D-CAE to reconstruct the spectrograms associated with good deposition. After the training phase, the threshold (THR) is chosen as the 99.9th percentile of the reconstruction error (RE) obtained at the end of the training phase
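The threshold selection and inference rule sketched in Figure 9 can be written in a few lines of NumPy; the function names here are illustrative.

```python
import numpy as np

def fit_threshold(train_errors, q=0.999):
    """THR = 99.9th percentile of reconstruction errors on the training set."""
    return np.quantile(train_errors, q)

def is_anomaly(x, x_rec, thr):
    """Flag a sample as anomalous if its reconstruction MSE exceeds THR."""
    return float(np.mean((x - x_rec) ** 2)) > thr
```

Because THR is fitted on defect-free depositions only, roughly 0.1% of normal windows are expected to exceed it by construction, which bounds the false-alarm rate on data resembling the training distribution.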

3 Results and discussion

3.1 Time domain approach

The performance metrics for various anomaly detection algorithms are presented in Table 3. The standout performer is the Isolation Forest, showcasing a robust validation accuracy of 100%. With a precision score of 0.922, it correctly labels normal instances most of the time. The recall, standing at 0.76, and an F2-score of 0.784 underscore its effectiveness in detecting anomalies. The LOF algorithm achieves the same validation accuracy and a higher precision of 0.94. However, its lower recall of 0.69 and F2-score of 0.735 suggest slightly lower performance in detecting anomalies. The test accuracy is higher, but in unbalanced dataset problems it cannot be used as a comparative metric, since it is more influenced by the precision, owing to the larger number of normal depositions with respect to anomalies. The one-class SVM achieves the lowest validation accuracy of 88.4%. Its precision of 0.9 indicates a good ability to correctly classify normal instances. However, the lower recall of 0.47 and F2-score of 0.503 reveal strong limitations in identifying abnormal instances, potentially leading to false negatives.

Table 3 Summary of results of the proposed methodology based on time domain features
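For reference, the F2-score used throughout these tables is the Fβ measure with β = 2, which weights recall four times as heavily as precision, reflecting that missed defects are costlier than false alarms in this application. A minimal sketch:

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With the Isolation Forest values above (precision 0.922, recall 0.76), `f_beta` gives roughly 0.79, consistent with the reported 0.784 up to rounding of the inputs.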

3.2 Frequency domain approach

The introduction of frequency-based feature extraction significantly improves the performance of all the algorithms presented. Each algorithm maintains a high precision value, with Isolation Forest and LOF achieving 0.89 and OCSVM reaching 0.96. The slightly lower precision is offset by a substantial increase in recall: Isolation Forest, LOF, and OCSVM exhibit recall increments of 10%, 17%, and 30%, respectively. Notably, LOF shows the highest improvement in F2-score, with a 14% increase. Nevertheless, Isolation Forest remains the top-performing algorithm, with an F2-score of 0.85, an 8% increment with respect to the 0.784 reached using time domain features. The results are summarized in Table 4.
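The exact frequency features are not restated here, but a typical frequency-domain feature vector for signals like welding current and voltage can be sketched as band powers of the Welch power spectral density. This is a hypothetical illustration assuming SciPy; the band layout and function name are assumptions for the example.

```python
import numpy as np
from scipy.signal import welch

def band_power_features(signal, fs, n_bands=5, nperseg=256):
    """Summed PSD power in n_bands equal-width frequency bands."""
    f, psd = welch(signal, fs=fs, nperseg=nperseg)
    edges = np.linspace(0.0, f[-1], n_bands + 1)
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (f >= lo) & (f < hi)
        feats.append(psd[mask].sum())  # energy in the band [lo, hi)
    return np.array(feats)
```

Feature vectors of this kind can be fed to the same Isolation Forest, LOF, or OCSVM detectors used for the time-domain features, which is what makes the frequency-domain pipeline directly comparable.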

Table 4 Summary of results of the proposed methodology based on frequency domain features

3.3 Time-frequency domain approach

The final results in terms of validation accuracy, precision, recall, and F2-score of the proposed CAE-based algorithm are compared with the best results obtained with the other methodologies in Table 5. In general, each algorithm exhibits distinct strengths. The time domain Isolation Forest demonstrates robust performance in detecting normal cases, indicating a degree of overfitting on normal data. When leveraging frequency domain features, it excels particularly in recall and F2-score, but at a lower precision. This suggests it is effective at detecting all defects but also tends to generate more false alarms. On the other hand, the CAE presents a well-balanced performance, with a precision of 0.966 and a recall of 0.88, both about 4% higher than the best results of the other approaches. While it may not identify every defect, it generates few false alarms. The F2-score is also more than 5% higher than the best case, the frequency domain Isolation Forest, and the test accuracy is more than 12% higher. The choice between these approaches depends on software requirements. Notably, the CAE approach introduces two key advantages: first, its more balanced performance, and second, the avoidance of manual feature extraction from the signal. While hand-crafted features can be advantageous for a specific dataset, they must be tailored to the nuances of different welding processes, as the features crucial for one process may differ from those pertinent to another. In contrast, frequency features can be extracted with the same technique across processes, especially when deep learning is used to analyze time-frequency domain data as spectrograms, since this approach is data-driven.

Table 5 Summary of the best results obtained with the same STT-WAAM dataset with the proposed methodologies

3.4 Development and industrial integration

The algorithms presented have been developed in Python, using libraries such as TensorFlow, NumPy, Pandas, and nidaqmx for communication with the NI board. Communication between the AI-PC and the DX200 controller occurs via TCP/IP, exchanging information on the arc-on command state. When the robot reaches its designated position, the arc-on variable is activated from the robot program. The welder starts the welding process, and the acquisition system begins recording. Because the current and voltage values consistently fall out of range during arc ignition, data collection is configured to start with a half-second delay after the ignition event, avoiding spurious anomaly detections at the beginning of the process. The output of the monitoring system, indicating the presence of anomalies, is communicated through the PC terminal for user visualization. Both welding signals and anomaly detection output are saved in a CSV file for further exploration during the cooling phase. An internal variable in the robot PLC is specifically designated to halt the welding process if the algorithm detects one or more anomalies, but it was not enabled during the experimental campaign reported in this work. To illustrate the results visually, Figure 10 shows the last testing result obtained from the algorithm and saved in CSV format.
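The half-second ignition blanking and CSV logging described above can be sketched as follows; the sampling rate, column names, and function names are illustrative assumptions, not the authors' actual implementation.

```python
import csv

def trim_ignition(samples, fs_hz, delay_s=0.5):
    """Discard the first delay_s seconds of samples acquired after arc-on,
    where current and voltage are out of range during ignition."""
    return samples[int(fs_hz * delay_s):]

def log_results(path, rows):
    """Save (time, voltage, current, anomaly_flag) rows to a CSV file
    for post-deposition exploration during the cooling phase."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["time_s", "voltage_V", "current_A", "anomaly"])
        writer.writerows(rows)
```

In an integration like the one described, `trim_ignition` would run once per deposition pass, triggered by the arc-on state received over TCP/IP.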

Fig. 10

The outcome of the CAE method developed in this study reveals instabilities that result in porosity within the component. These porosity issues are linked to an anomaly in the frequency content observed during the STT deposition

3.5 Summary

Monitoring the Wire Arc Additive Manufacturing (WAAM) process is crucial for future manufacturing systems. Several studies have indicated the applicability of machine learning to developing monitoring applications, but more attention needs to be given to defect detection. Detecting defects demands additional layers of abstraction, with anomaly detection being the initial step: supervised learning can then be applied after anomaly detection, facilitating the classification of defects on low-dimensional and more balanced datasets.

Employing unsupervised approaches, as in this work, eliminates the need to generate a large, balanced dataset with equal amounts of normal deposition and deposition with various types of defects. A practical application of this technique would use welding procedure qualification tests as a source of training data, supported by more extensive mechanical and non-destructive testing. In fact, the recent ISO/ASTM 52943-2:2024 standard [34] mandates welding procedure type tests, which could provide exactly this defect-free training data.

While this methodology appears promising, particularly in industrial environments where cost reduction and efficient defect localization are paramount, unsupervised machine learning algorithms face challenges in identifying anomalies that closely resemble normal deposition, because these algorithms are specialized in detecting rare events. This limitation has so far restricted research on the topic.

This research demonstrates that better anomaly detection performance can be achieved, using only defect-free deposition data, by leveraging features extracted from the frequency content of welding signals such as current and voltage. These signals are more robust and less affected by environmental noise than audio signals, and less costly to acquire than welding camera imagery. In particular, this study shows that traditional state-of-the-art techniques relying on time-domain feature extraction achieved an F2-score of only 0.784, limiting their success in this field. In contrast, the proposed time-frequency domain feature extraction and the proposed data-driven spectrogram processing via convolutional autoencoder improved the F2-score to 0.85 and 0.895, respectively. This result is comparable to the performance obtained by supervised anomaly detection techniques. The findings suggest that real-world anomaly detection software should focus on the time-frequency response of these signals rather than the time domain alone, paving the way for broader application of this underexplored data analytics approach in WAAM, which enables well-performing applications to be developed from small datasets.

4 Conclusion

Various machine learning tools have been evaluated in this work, and they demonstrate the potential effectiveness of unsupervised learning approaches to anomaly detection in WAAM process monitoring. These approaches require further investigation to determine their effectiveness in a real industrial environment. This remains an indirect process anomaly detection technique, which can be used to target post-deposition non-destructive evaluation (NDE). While it is indirect, it utilizes robust, simple sensors, unlike the more direct ultrasonic, radiography, and vision-based approaches, and it could readily be linked to basic temperature sensing and laser measuring devices to supplement thermal control and bead profile measurement. It also has the potential to allow interruption of the build process for rectification, avoiding high-cost remediation. For complete quality assurance, it would likely be used in combination with procedure qualification and monitoring.