
1 Introduction

Consider a machine that performs a simple operation in the production cycle of a product. On a typical day, this machine performs the same motions over and over, with very little deviation. Then one day, one of its motors develops a problem and no longer functions properly. As a result, the machine performs its operations incorrectly and damages the product. The longer this continues, the more damage accrues, potentially causing significant losses and production delays. Detecting anomalous behavior like this is important for quality control and safety. Fortunately, with the rise of digitization and advanced analytics, industry has developed automated methods to do just that.

We begin this chapter by introducing the concept of anomaly detection in more detail. For this purpose, we summarize different perspectives on that topic. We then discuss statistical methods to detect anomalies. The widely used techniques discussed in this section will offer a solid foundation for detecting anomalies and serve as a starting point for further research.

Next, we explore deep learning, a more advanced approach that employs artificial intelligence to detect anomalies. Deep learning has found success in many areas of machine learning, including anomaly detection. Here we present a case study from the EU project knowlEdge [3], in which an autoencoder was used to detect anomalies in the manufacturing process of fuel tanks. The autoencoder architecture is explained in depth and further illustrated by the case study.

Finally, we stress the importance of human involvement in the anomaly detection process. While AI has many capabilities, humans are essential in interpreting the results, refining the models and making informed decisions.

2 Anomaly Detection in Industry

As digitization progresses, manufacturing companies increasingly desire more transparency about their machine and plant landscape. Once the data are available, suitable processes are needed to gain deeper insight into the production processes. With the help of data and process analysis, irregularities and disturbances occurring in the production process flow can be examined and analyzed in detail. These irregularities are referred to as anomalies, although there is no standard definition of the term in the literature [8]. For example, Zheng et al. [24] define an anomaly as “a mismatch between a node and its surrounding contexts,” Lu et al. [13] as a “data object that deviates significantly from the majority of data objects,” and Su et al. [18] as an “unexpected incidence significantly deviating from the normal patterns formed by the majority of the dataset.”

Anomalies can be further specified depending on the context. Hasan et al. [10], for instance, divide the instances occurring in IoT datasets into eight classes: Denial of Service (DoS), Data Type Probing (D.P), Malicious Control (M.C), Malicious Operation (M.O), Scan (SC), Spying (SP), Wrong Setup (W.S), and Normal (NL). Wu et al. [23], on the other hand, classify anomalies into three types that are more descriptive in nature, namely punctual, contextual, and collective anomalies. Here, a punctual anomaly represents a single reference point carrying anomalous information, a contextual anomaly is a data record that is anomalous only within its specific context, and a collective anomaly characterizes a set of data records that are anomalous compared to the rest of the collected data. A continuous anomaly is a special case of a collective anomaly whose considered time period extends from a specific starting point onward [8].

The occurrence of anomalies can have many causes. One common cause is a change in the environment, such as a sudden increase in temperature, which the sensor registers as an abnormal condition [8]; another may simply be a sensor error [8]; or it may be a malicious attack intended to weaken the computing power of an IoT network and thus intentionally cause the sensor to malfunction [15].

With the help of machine learning (ML) tools for anomaly detection, a variety of algorithms and methods are available to enterprises to identify anomalies [14]. Agrawal and Agrawal [2] divide the process of anomaly detection into three major phases, as shown in Fig. 1. Parameterization describes the preprocessing of data into a previously defined, acceptable format, which in turn serves as input for the training phase. In the training phase, a model is created based on the normal or abnormal behavior of the system; depending on the type of anomaly detection considered, different methods can be chosen, which can be either manual or automatic. The last phase is the detection phase, in which the model is compared against the parameterized datasets. If, for example, a predefined threshold is exceeded, an alarm can be triggered to draw attention to the anomalies.

Fig. 1 Methodology of anomaly detection: a flow diagram from the monitored environment through parameterization, training, the resulting model, and detection to the intrusion report
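
As a toy illustration of these three phases, the following minimal sketch (with hypothetical data and function names invented for the example) parameterizes raw readings, trains a simple model of normal behavior, and raises an alarm when a threshold is exceeded:

```python
import numpy as np

# Hypothetical three-phase pipeline following Fig. 1:
# parameterization -> training -> detection.

def parameterize(raw_readings):
    """Phase 1: convert raw sensor readings into a clean numeric array."""
    data = np.asarray(raw_readings, dtype=float)
    return data[~np.isnan(data)]          # drop missing values

def train(normal_data):
    """Phase 2: model 'normal' behavior, here simply by mean and standard deviation."""
    return normal_data.mean(), normal_data.std()

def detect(model, new_data, k=3.0):
    """Phase 3: flag readings more than k standard deviations from the mean."""
    mean, std = model
    return np.abs(new_data - mean) > k * std

model = train(parameterize([20.1, 19.8, 20.3, 20.0, 19.9]))
alarms = detect(model, parameterize([20.2, 35.7, 19.9]))
print(alarms)  # [False  True False] -> the second reading triggers an alarm
```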

Toshniwal et al. [21] further elaborate on the basic idea of ML algorithms. According to them, ML algorithms require input data for training, where each feature represents one dimension, and in turn generate output labels for test instances. The input data represent batch or real-time data instances, where each data instance can be considered a data point. Depending on the type of data, the input data are labeled or unlabeled. The output of the algorithm, in turn, consists of classes with which the instances are associated.

Clustering is one method used for anomaly detection. Here, data are divided into groups of similar objects, where each group (cluster) consists of objects that are similar to each other and can be distinguished from the objects in other groups [4]. Various clustering methods [1] can be used for anomaly detection, including partitional methods such as k-means or probabilistic methods such as expectation-maximization (EM).
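
As a minimal sketch of clustering-based detection, assuming synthetic data and scikit-learn's KMeans, one can score each point by its distance to the nearest cluster center and flag the most distant points:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
normal = rng.normal(loc=[0, 0], scale=0.5, size=(200, 2))  # two synthetic operating modes
normal[100:] += [5, 5]
outlier = np.array([[10.0, -3.0]])                          # one injected anomaly
X = np.vstack([normal, outlier])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Anomaly score: distance of each point to its nearest cluster center.
dist = np.min(kmeans.transform(X), axis=1)
threshold = np.percentile(dist, 99)        # flag the most distant 1% as anomalies
print(np.where(dist > threshold)[0])       # the injected outlier should appear here
```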

In addition, outlier detection algorithms find patterns in data that do not conform to expected behavior. There are a variety of outlier detection schemes, such as the distance-based approach [4]. This is based on the nearest-neighbor algorithm and uses a distance-based metric to identify outliers [19].
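
A hedged sketch of such a distance-based scheme, using the distance to the k-th nearest neighbor as an outlier score (synthetic data, scikit-learn's NearestNeighbors):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
X = np.vstack([X, [[8.0, 8.0]]])          # inject one far-away point

# Score each point by the distance to its k-th nearest neighbor.
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1 because each point is its own neighbor
distances, _ = nn.kneighbors(X)
scores = distances[:, -1]                 # distance to the k-th real neighbor

threshold = np.percentile(scores, 99)
print(np.where(scores > threshold)[0])    # indices of suspected outliers
```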

Another possibility for identifying anomalies is based on classification, which describes the problem of determining the category of new instances using a classification model learned on a training dataset containing observations with known category memberships. The category represents the class label, and different observations may belong to different class labels. In machine learning, classification is considered an instance of supervised learning. An algorithm that uses the method of classification is called a classifier. It predicts class labels and, in the case of anomaly detection, distinguishes between normal and abnormal data [2]. Well-known classification methods include the Classification Tree, Fuzzy Logic, the Naïve Bayes network, the Genetic Algorithm, and Support Vector Machines.

The Classification Tree is also called a decision tree because it resembles the structure of a flow chart. Here, the internal nodes represent tests on an attribute, each branch represents a test result, and the leaves represent the classes to which an object belongs [22]. Fuzzy Logic, derived from fuzzy set theory, deals with approximate inference. In this method, data are characterized using various statistical metrics and classified as normal or abnormal according to Fuzzy Logic rules [12].
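
To illustrate, a minimal decision tree classifier on hypothetical labeled sensor readings might look as follows; the feature names and values are invented for the example:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical labeled readings: [temperature, vibration], 0 = normal, 1 = abnormal.
X = [[60, 0.2], [62, 0.3], [61, 0.2], [90, 1.5], [88, 1.4], [91, 1.6]]
y = [0, 0, 0, 1, 1, 1]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(clf.predict([[63, 0.25], [89, 1.5]]))   # expected: [0 1]
print(export_text(clf, feature_names=["temperature", "vibration"]))  # the learned flow chart
```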

The Naïve Bayes network is based on a probabilistic graph model where each node represents a system variable and each edge represents the influence of one node on another [2]. The Genetic Algorithm belongs to the class of Evolutionary Algorithms and generates solutions to optimization problems based on techniques inspired by natural evolution such as selection and mutation. The Genetic Algorithm is particularly robust to noise and is characterized by a high anomaly detection rate [12].

A Support Vector Machine (SVM) is a supervised learning method used for classification and regression. It is widely used, particularly in the field of pattern recognition. A one-class SVM is trained only on examples that belong to a specific class and does not require negative examples [20].
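
A minimal sketch of one-class classification, assuming scikit-learn's OneClassSVM and synthetic "normal" training data, could look like this:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
X_train = rng.normal(size=(500, 2))       # only "normal" examples, no negatives

ocsvm = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale").fit(X_train)

X_test = np.array([[0.1, -0.2], [6.0, 6.0]])
print(ocsvm.predict(X_test))              # +1 = normal, -1 = anomaly; expected: [ 1 -1]
```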

3 Feature Selection and Engineering

As with all machine learning methods, feature selection plays an important role in anomaly detection. One important category of features in the context of manufacturing is sensor data from monitoring systems. For example, the operation of a robotic arm might create measurable vibrations during motor operation [16]. These vibration data could be used to detect anomalous behavior, which in turn might trigger maintenance on the arm.

The importance of feature selection comes from the fact that irrelevant features can degrade the performance of the model. In addition, computational complexity and data storage costs increase with the number of features. When developing models for anomaly detection, a large portion of development time is therefore typically allocated to the selection and transformation of features.

Not all data can readily be used by a model. Continuing the example of our robotic arm, there may be noise in the data from other vibrations or inaccurate measurements. In these cases, we may have to preprocess or engineer features. This could involve running a noise filter on our vibration data to produce a clearer signal. It may also be the case that information remains hidden from the model due to a lack of complexity in its architecture. In the example of our robotic arm, performing frequency analysis on the vibration signal yields engineered features that may be more informative for a given model.
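
As an illustration of such feature engineering, the following sketch (with a simulated vibration signal and an assumed sampling rate) extracts the dominant frequency and its amplitude via a Fourier transform:

```python
import numpy as np

fs = 1000                                  # sampling rate in Hz (assumed)
t = np.arange(0, 1, 1 / fs)
# Simulated vibration: a 50 Hz motor component plus measurement noise.
signal = np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.default_rng(3).normal(size=t.size)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)

# Engineered features: dominant frequency and its amplitude.
peak = np.argmax(spectrum[1:]) + 1         # skip the DC component
features = {"dominant_freq_hz": freqs[peak], "peak_amplitude": spectrum[peak]}
print(features)                            # dominant_freq_hz should be close to 50
```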

4 Autoencoder Case Study

In this case study, we take a look at a real manufacturing process from an industry partner in the knowlEdge project, an EU project funded by the Horizon 2020 initiative that advances AI-powered manufacturing processes and services [3]. The scenario described here represents the production of fuel tanks for combustion engines in cars. More precisely, it concerns the blow molding procedure during tank manufacturing, which comprises a variety of individual steps controlled and observed by high-precision sensors for metrics such as temperature, position, and energy consumption.

In Fig. 2, we see an overview of the blow molding process as exemplified by a water bottle. Essentially, it involves melting plastic and forming it into a preform, which resembles a test tube with a threaded neck. This preform is then placed into a mold cavity and air is blown into it, forcing the plastic to expand and take the shape of the mold. Once the plastic has cooled and solidified, the mold is opened and the finished bottle is ejected. This process can be highly efficient, allowing for the mass production of objects with consistent shapes and properties.

Fig. 2 An illustration of the blow molding process, showing (from left to right) the mold, the pressurized air and stretch rod, the molded part inside the mold, and the final molded part. The plastic is melted and shaped into a tube-like preform. The preform is then placed in a mold, inflated with air to take the mold’s shape, and cooled. Once solidified, the finished bottle is ejected from the mold

Unfortunately, as blow molding of complex shapes is a very sensitive process, not every production cycle is successful, and produced tanks can show defects indicated by quality measures that are out of tolerance. To catch such erroneous cycles at an early stage and thus reduce the resulting costs, we present a solution for anomaly detection that was implemented during the project. Although different kinds of detection methods were considered in this context, including supervised approaches for already identified anomalous behaviors, in the following we focus on autoencoders [9] as a representative method for unsupervised anomaly detection that is applicable to a large number of potential use cases.

4.1 Autoencoders

An autoencoder is a neural network architecture that encodes and decodes data to learn a compact and efficient representation. For simplicity, imagine a neural network with several layers that simply tries to output whatever was put in (the identity function). This is of course an easy task for small datasets, provided the layers in between are big enough to retain all the information. But what if you need a representation of many individual data instances? Since memorizing every instance is infeasible for large datasets, the idea of autoencoders is to find a reduced description that generalizes well, while still maintaining the essential information needed to reproduce given inputs. Considering the aforementioned architecture of a multi-layered neural network (as illustrated in Fig. 3), autoencoders split the propagation of information into an encoder and a decoder part. In the encoder, information flows through a chain of shrinking layers ending in a bottleneck layer, which forces the network to learn a compressed version of the data. The subsequent layers, representing the decoder, are responsible for restoring the information from the generated encoding. The quality of the model is given by the reconstruction error, a measure of how much the original data differ from the approximation reconstructed from the learned encoding.

Fig. 3 A simplified example of an autoencoder with one hidden layer: information flows from the input layer through the hidden (encoding) layer to the output (decoding) layer. There are fewer neurons in the middle than at the start and the end, which forces the model to learn a compressed representation of the data
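
To make the architecture concrete, here is a minimal sketch of a dense autoencoder in Keras, trained on synthetic data; the layer sizes are illustrative and not those used in the project:

```python
import numpy as np
from tensorflow import keras

# Toy data: 1,000 samples with 20 correlated features standing in for sensor readings.
rng = np.random.default_rng(4)
latent = rng.normal(size=(1000, 3))
X = latent @ rng.normal(size=(3, 20)) + 0.05 * rng.normal(size=(1000, 20))

# Encoder shrinks 20 -> 8 -> 3 (bottleneck); decoder mirrors it back to 20.
autoencoder = keras.Sequential([
    keras.layers.Input(shape=(20,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(3, activation="relu"),     # compressed representation
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(20, activation="linear"),
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=20, batch_size=32, verbose=0)   # target = input

# Reconstruction error per sample: how much the output differs from the input.
errors = np.mean((X - autoencoder.predict(X, verbose=0)) ** 2, axis=1)
```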

4.2 Anomaly Detection for Blow Molding

The original task of fuel tank manufacturing comprises many steps, including continuous and cyclic sub-processes such as extrusion or blow molding. Focusing on the cyclic blow molding process, more than 100 associated data attributes were analyzed. The attributes mainly corresponded to predefined and observed values for machine positions, temperatures, energy consumption, and pressures. As this information is recorded over time, the problem becomes a multivariate anomaly detection task on timeseries data. To better capture the inherent structure of timeseries data, a specific variant of autoencoders was used, called the Long Short-Term Memory Autoencoder (LSTM-Autoencoder) [11].

In the first step, all relevant process information was assigned to its associated machine cycle—indicated by a combination of two binary machine events—to produce input data for model training and evaluation. The resulting dataset was then used to learn a compressed representation of individual machine cycles. Since autoencoders are used to find a suitable representation of the provided information, the model tries to find an encoding that reflects the essential cycle information. Because anomalies are by definition rare events, deviations from common behavior are not regarded as characteristic information and are therefore poorly encoded. As explained in the previous section, this leads to higher reconstruction errors for cycles that do not follow the usual patterns.

A problem with unsupervised models is the lack of a ground truth. While one can theoretically make the model ever more complex to capture additional behaviors of a system, this carries the risk of treating anomalies not as rare events but as valid data entries. However, the question remains: how do we draw a line to distinguish which reconstruction errors indicate an anomaly and which do not? One potential solution, which was also used in the presented approach, is to use probability distributions—usually the normal distribution—to estimate the distribution of the reconstruction error for each data attribute. The reconstruction error can then be compared against a predefined threshold derived from the overall dispersion, e.g., errors that are higher than three times the standard deviation. This simple method also regularizes the accuracy, as a poorly fitted model automatically leads to an increase in the overall reconstruction error and the associated spread. A representative illustration is given in Fig. 4. It shows individual machine cycles during a blow molding process, including modeled approximations and identified anomalies based on exceeded thresholds with respect to the reconstruction error.

Fig. 4 A snippet from the original process data showing five attributes, including identified anomalies. Each row presents a different attribute over time, with the original signal shown in black. Differently colored lines indicate distinct cycle approximations generated by the autoencoder (reproduced signals). Two cycles, highlighted by red-shaded rectangles, are regarded as anomalies, since the reconstruction error is too high for one or more dimensions
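
The following sketch illustrates the general idea with a small LSTM-Autoencoder in Keras and a three-sigma threshold per attribute; the data, cycle length, and layer sizes are invented for the example and do not reflect the actual project model:

```python
import numpy as np
from tensorflow import keras

# Assumed shape: cycles of 50 time steps with 4 process attributes each.
timesteps, n_attrs = 50, 4
rng = np.random.default_rng(5)
X = rng.normal(size=(200, timesteps, n_attrs)).cumsum(axis=1) * 0.01  # smooth toy cycles

model = keras.Sequential([
    keras.layers.Input(shape=(timesteps, n_attrs)),
    keras.layers.LSTM(16),                         # encoder: whole cycle -> one vector
    keras.layers.RepeatVector(timesteps),          # repeat encoding for every time step
    keras.layers.LSTM(16, return_sequences=True),  # decoder
    keras.layers.TimeDistributed(keras.layers.Dense(n_attrs)),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, X, epochs=10, batch_size=16, verbose=0)

# Per-attribute reconstruction error for each cycle.
recon = model.predict(X, verbose=0)
err = np.mean((X - recon) ** 2, axis=1)            # shape: (cycles, attributes)

# Three-sigma rule per attribute: a cycle is anomalous if any attribute's error
# exceeds mean + 3 * std of the errors observed on (mostly normal) data.
thresholds = err.mean(axis=0) + 3 * err.std(axis=0)
anomalous_cycles = np.where((err > thresholds).any(axis=1))[0]
```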

4.3 Human-Enhanced Interaction

The lack of a ground truth, as is the case for unsupervised anomaly detection, leaves room for further enhancements to improve the accuracy of the system. One solution is to integrate expert feedback, an approach also known as the “human-in-the-loop.” Domain-specific verification of generated results enables us to label former predictions and subsequently reuse this information as input for supervised approaches. In addition, human feedback can help to fine-tune the predefined thresholds for the reconstruction error. If a domain expert notices that the approach produces too many false positives (i.e., identified anomalies that are not anomalies according to the expert), the thresholds for the associated attributes can be raised to make detection less sensitive. The same holds for the opposite case: the sensitivity of the system can be increased by lowering thresholds if no anomalies are identified although a manual inspection would reveal existing ones. Further information on how human feedback enhances approaches for unsupervised anomaly detection can be found in the literature [5, 6, 7, 17].
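
As a toy illustration of such feedback-driven tuning, the following sketch (hypothetical logic, not the project implementation) raises per-attribute thresholds after confirmed false positives and lowers them after missed anomalies:

```python
import numpy as np

def adjust_thresholds(thresholds, errors, expert_labels, step=0.1):
    """Nudge per-attribute error thresholds based on expert feedback.

    thresholds    -- current per-attribute reconstruction error thresholds
    errors        -- per-attribute reconstruction errors of reviewed cycles
    expert_labels -- True if the expert confirms the cycle as anomalous
    """
    thresholds = thresholds.copy()
    for err, is_anomaly in zip(errors, expert_labels):
        flagged = err > thresholds
        if flagged.any() and not is_anomaly:
            thresholds[flagged] *= 1 + step    # false positive: raise thresholds
        elif not flagged.any() and is_anomaly:
            thresholds[:] *= 1 - step          # missed anomaly: lower thresholds
    return thresholds
```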

5 Conclusions

In the fifth industrial revolution, human-centered AI (HCAI) is becoming a key part of manufacturing processes. Anomaly detection can be used to improve quality control and elevate the role of the human operator. There are several machine learning approaches to detecting anomalies; among those we discussed are clustering, which groups similar data objects, and classification, which categorizes new instances based on known categories. In the scope of the EU project knowlEdge, we used autoencoders to detect manufacturing flaws in the production of fuel tanks. We have seen in detail how autoencoders work and how they can efficiently address anomaly detection for large datasets. The involvement of human expertise, or the “human-in-the-loop” approach, is crucial for improving model performance and managing false-positive detections.