1 Introduction

Knowledge graphs (KGs) are increasingly recognized as a valuable tool in data-driven domains like healthcare [1], finance [2], and manufacturing [3], and have gained considerable popularity in recent research. They are commonly employed to represent and integrate both structured and unstructured data, providing a standardized approach to encode domain knowledge [4]. Built on ontologies that conceptualize domain classes, relations, and logical inference rules, KGs represent specific instantiations of ontological models and their inherent semantic characteristics. Typically, KGs are divided into two modules: a terminological TBox containing concepts (such as the class of a manufacturing process) and an assertional ABox containing real-world instances (such as unique executions of a manufacturing process).

We adopt the notion of a (standard) KG \(\mathcal {G}=(V,E)\) as described in [5], which is represented by a set of nodes V  (also referred to as vertices) and a set of triples \(E \subseteq V \times R \times V\) consisting of directed and labeled edges. Here, R denotes the set of valid relation types defined in the underlying ontology. Thus, an edge in the form of a triple \((s,p,o) \in E\) implies an outgoing relation from the subject \(s \in V\) to the object \(o \in V\) via the predicate \(p \in R\). Given such a KG, embedding techniques aim to exploit the topology of the graph to generate latent feature representations

$$\displaystyle \begin{aligned} \gamma: V \rightarrow \Gamma {} \end{aligned} $$
(1)

of its nodes V in a latent representation space \(\Gamma \), e.g., \(\Gamma = \mathbb {R}^d\) with \(d \in \mathbb {N}\), thereby enabling their utilization in downstream applications, e.g., graph-based machine learning (ML). Note that the findings of this work can be applied almost analogously to the most well-known KG extensions, such as labeled property graphs as implemented in Neo4j [6].
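To make this notation concrete, the following minimal sketch (in Python, with purely illustrative node and relation names) represents a small KG as a set of triples and realizes an embedding \(\gamma\) as a simple lookup table of randomly initialized vectors:

```python
import numpy as np

# Nodes V, relation types R, and directed labeled edges E ⊆ V × R × V
V = {"Task 1", "Task 2", "Product 1"}
R = {"output", "input"}
E = {
    ("Task 1", "output", "Product 1"),
    ("Product 1", "input", "Task 2"),
}

# An embedding γ: V → Γ with Γ = R^d, here simply initialized at random
d = 4
rng = np.random.default_rng(42)
gamma = {v: rng.normal(size=d) for v in V}

print(gamma["Task 1"])  # latent feature representation of the node "Task 1"
```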

In addition to the improved applicability of graph-based data in tasks like recommendation systems [7] or question answering [8], embedding formalisms have also proven to be valuable as intrinsic complements to graph-structured data. This is due to their ability to provide an empirical approach for enhancing the expressivity of graph topologies by means of downstream tasks like entity linking [9] and link prediction [10]. Consequently, related areas such as relational ML are receiving significant attention in both literature and applications [11].

In this chapter, we first provide a brief overview of representation learning as the enabler of KG embeddings, addressing state-of-the-art embedding formalisms for generating lean feature representations and describing their functionalities. An analysis of the advantages and drawbacks of employing KG embeddings is provided, along with a discussion of associated open research questions. We focus specifically on potential challenges and risks that may hinder the usage of KG embeddings in the highly dynamic manufacturing domain. Accordingly, we present the methodologies developed within the Teaming.AI project to address these problems. In this context, we describe the applicability and potential benefits of KG embeddings in the human–AI-based manufacturing use cases of the project. Furthermore, we showcase the Navi approach as an enabler of dynamic KG embeddings that allows for real-time and structure-preserving computation of new or updated node representations.

2 Knowledge Graph Embeddings

The generation of KG embeddings as per Eq. (1) denotes a subdiscipline of representation learning. In the context of KGs, representation learning is applied to determine lean feature representations that are able to capture inherent semantic relationships between KG elements. Thus, we first provide a general overview of representation learning to subsequently describe its application in KG embeddings.

3 Representation Learning

Representation learning comprises techniques for the automatic detection of appropriate feature representations that can be employed by downstream models or tasks, such as machine learning models [12]. Thus, the main objective of representation learning is to eliminate the need for manually engineering features from raw input data. Given a set of observable variables V with semantic representations \(\pi :V \rightarrow \Pi \) within an inherent representation space \(\Pi \) (which is not necessarily compatible with the downstream model), these techniques aim to generate an alternative feature mapping \(\gamma :V \rightarrow \Gamma \) into a representation space \(\Gamma \) that satisfies the requirements of the desired task.

Representation learning can be performed in a supervised, unsupervised, or self-supervised manner. One example of a supervised approach for learning latent feature representations is the training of deep neural networks on labeled input data. Namely, given an input feature \(\pi (v)\) for some \(v \in V\), the hidden layer outputs (and also the output layer) obtained from the forward pass of the network can be considered as alternative representations \(\gamma (v)\), as illustrated in Fig. 1.

Fig. 1 Deep neural networks as supervised representation learning formalisms
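As an illustration of this idea, the following sketch (using PyTorch as an assumed dependency, with the layer sizes taken from Fig. 1) reads off the hidden activations of a small feed-forward network as candidate representations \(\gamma (v)\):

```python
import torch
import torch.nn as nn

# A small supervised network; the hidden activations obtained during the
# forward pass can serve as alternative representations γ(v) of π(v).
net = nn.Sequential(
    nn.Linear(3, 5), nn.ReLU(),   # hidden layer 1
    nn.Linear(5, 4), nn.ReLU(),   # hidden layer 2
    nn.Linear(4, 2),              # output layer (e.g., two target classes)
)

pi_v = torch.randn(3)             # initial representation π(v)
h1 = net[1](net[0](pi_v))         # hidden representation after layer 1
h2 = net[3](net[2](h1))           # hidden representation after layer 2, a candidate γ(v)
output = net[4](h2)               # network output used for supervised training
```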

In contrast, unsupervised representation learning techniques can be utilized for unlabeled representations \(\pi (v)\). Methods like principal component analysis or auto-encoders aim to reduce the dimensionality of high-dimensional input features. Accordingly, the goal of these algorithms is to determine alternative, low-dimensional representations without considering any target feature except the input feature \(\pi (v)\) itself. For example, auto-encoders feed a representation \(\pi (v) \in \mathbb {R}^{d'}\) into a deep neural network and attempt to reconstruct it, i.e., \(\pi (v)\) also serves as the output feature. The hidden layers, however, are chosen to be low-dimensional so that they can serve as alternative representations \(\gamma (v) \in \mathbb {R}^{d}\) of \(v \in V\) with \(d \ll d'\), as depicted in Fig. 2.

Fig. 2 Auto-encoders as unsupervised representation learning formalisms
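The bottleneck principle can be sketched as follows (again using PyTorch as an assumed dependency; the dimensions are illustrative): the network is trained to reproduce \(\pi (v)\), and the low-dimensional bottleneck output is used as \(\gamma (v)\).

```python
import torch
import torch.nn as nn

d_in, d_latent = 6, 2  # illustrative dimensions with d_latent << d_in

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 4), nn.ReLU(), nn.Linear(4, d_latent))
        self.decoder = nn.Sequential(nn.Linear(d_latent, 4), nn.ReLU(), nn.Linear(4, d_in))

    def forward(self, x):
        z = self.encoder(x)          # γ(v): low-dimensional bottleneck representation
        return self.decoder(z), z    # reconstruction of π(v) and the embedding

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
pi_v = torch.randn(32, d_in)         # a batch of initial representations π(v)

for _ in range(100):                 # train the network to reconstruct π(v) from itself
    recon, z = model(pi_v)
    loss = nn.functional.mse_loss(recon, pi_v)
    opt.zero_grad(); loss.backward(); opt.step()
```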

Finally, self-supervised representation learning aims to leverage the underlying structure \(\mathcal {S}_V\) of unlabeled data that contains the variables \(v \in V\) and allows for deriving meaningful initial representations \(\pi (v)\). For example, a word \(v \in V\) may appear in a set of sentences \(\pi (v)\) within a shared text corpus \(\mathcal {S}_V\), as exemplified in Fig. 3. While state-of-the-art NLP models like BERT [13] usually split words into frequently occurring subword tokens via subword segmentation algorithms such as Wordpiece [14], the underlying methods can be applied analogously to sets of complete words. In the course of training such NLP models, numerical embeddings \(\gamma (v) \in \mathbb {R}^d\) are assigned to the domain variables \(v \in V\) with respect to their original representations \(\pi (v)\). These alternative representations are optimized by backpropagating the output of the language model for at least one element of its initial representation \(\pi (v)\).

Fig. 3 Extract from the abstract in [15]. The semantics of the word "products" is encoded within the sentences that contain it

Analogously, most NLP techniques can be applied to KG structures \(\mathcal {G} = (V,E)\) by characterizing directed graph walks \((v_1,p_{1},v_2,p_{2},v_3,\ldots ,v_{l-1},p_{l-1},v_l)\) of depth \(l-1 \in \mathbb {N}\) as sentences that are composed of edges \((v_i, p_{i}, v_{i+1}) \in E\). For instance, the sample manufacturing KG depicted in Fig. 4 contains the 4-hop walk

Fig. 4 Sample KG containing process flows within a production process

(John, executes, Task 1, output, Product 1, input, Task 2, output, Product 2).

One of these transfer approaches is RDF2Vec [16], which utilizes random graph walks to generate input data for the NLP-based Word2Vec algorithm [17]. By doing so, a mapping \(\overline {\gamma }: V \cup R \rightarrow \mathbb {R}^d\) is trained, yielding alternative representations not only for the graph nodes in V but also for the relation types in R. Node embeddings can then be derived via \(\gamma (v) := \overline {\gamma }(v)\). Besides transfer approaches like RDF2Vec, various embedding algorithms exist that are specifically tailored to KG structures. These are further discussed in the following.
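The following sketch illustrates this transfer under simplifying assumptions: directed random walks over a small set of triples (taken from the walk example above) are treated as sentences and passed to the Word2Vec implementation of the gensim library. It is not meant to reproduce the exact RDF2Vec pipeline of [16].

```python
import random
from gensim.models import Word2Vec

E = [
    ("John", "executes", "Task 1"),
    ("Task 1", "output", "Product 1"),
    ("Product 1", "input", "Task 2"),
    ("Task 2", "output", "Product 2"),
]

# index outgoing edges per node to sample directed walks
out_edges = {}
for s, p, o in E:
    out_edges.setdefault(s, []).append((p, o))

def random_walk(start, depth=4):
    """Sample a directed walk (v1, p1, v2, ..., vl) of at most `depth` hops."""
    walk, node = [start], start
    for _ in range(depth):
        if node not in out_edges:
            break
        p, o = random.choice(out_edges[node])
        walk += [p, o]
        node = o
    return walk

walks = [random_walk(v) for v in out_edges for _ in range(10)]

# Word2Vec treats the walks as sentences; nodes and relations become "words"
model = Word2Vec(walks, vector_size=8, window=4, min_count=1, sg=1, epochs=50)
gamma_john = model.wv["John"]  # node embedding γ(John)
```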

3.1 Representation Learning for Knowledge Graphs

KG embedding techniques denote a subdiscipline of representation learning, taking into account KG structures as initial input data. Given a KG \(\mathcal {G} = (V,E)\), these approaches intend to provide numerical representations \(\gamma : V \rightarrow \Gamma \) as per Eq. (1). However, as exemplified by RDF2Vec, KG embeddings may contain alternative representations of graph elements \(y \not \in V\) as well, such as embeddings of relations, but also edges or subgraphs. Thus, in general, a KG embedding is a mapping \(\overline {\gamma }: \Omega \rightarrow \Gamma \), where \(\Omega \) represents a collection of KG elements pertaining to \(\mathcal {G}\). The node embedding of some \(v \in V\) is accordingly obtained by restricting \(\overline {\gamma }\) to V , i.e., \(\gamma (v) = \overline {\gamma }(v)\).

Based on the research conducted in [10], KG embedding methods can be categorized into three model families, namely tensor decomposition models, geometric models, and deep learning models. We adopt this subdivision in the following.

3.1.1 Tensor Decomposition Models

Tensor decomposition models for KG embeddings are based on the concept of tensor decompositions within the area of multilinear algebra [18]. These attempt to characterize tensors via sequences of simplified tensor operations. For a KG \(\mathcal {G}\), this approach is applied to its unique adjacency tensor \(\mathcal {A} \in \left \{ 0,1 \right \}^{k \times n \times n}\), defined as

$$\displaystyle \begin{aligned} \mathcal{A}_{h,i,j} = 1 \iff \left( v_i, r_h, v_j \right) \in E. \end{aligned}$$

Here, \(k \in \mathbb {N}\) denotes the cardinality of the underlying relation set R and \(n \in \mathbb {N}\) is the number of nodes in V . Accordingly, without loss of generality, we may assume labeled sets \(R = \left \{ r_1,\ldots ,r_k \right \}\) and \(V = \left \{ v_1,\ldots ,v_n \right \}\), as exemplified in Fig. 5.

Fig. 5 Sample KG with \(n=4\) nodes and \(k=2\) relations \(r_1\) (blue) and \(r_2\) (red), including their respective adjacency matrices \(\mathcal {A}_1\) and \(\mathcal {A}_2\)
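A minimal sketch of how such an adjacency tensor can be constructed from a list of triples is given below; the triples are illustrative and do not reproduce Fig. 5.

```python
import numpy as np

# Illustrative labeled sets V = {v1, ..., vn} and R = {r1, ..., rk}
V = ["v1", "v2", "v3", "v4"]
R = ["r1", "r2"]
E = [("v1", "r1", "v2"), ("v1", "r1", "v3"), ("v2", "r2", "v4")]

n, k = len(V), len(R)
node_idx = {v: i for i, v in enumerate(V)}
rel_idx = {r: h for h, r in enumerate(R)}

# adjacency tensor A ∈ {0,1}^{k×n×n} with A[h, i, j] = 1 iff (v_i, r_h, v_j) ∈ E
A = np.zeros((k, n, n), dtype=int)
for s, p, o in E:
    A[rel_idx[p], node_idx[s], node_idx[o]] = 1
```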

Accordingly, tensor decomposition-based KG embedding methods intend to approximate \(\mathcal {A}\) by a sequence of lower dimensional tensor operations. Among these methods, RESCAL [19] is considered to be the first work to apply this methodology for determining KG embeddings. Regarding \(\mathcal {A}\), it proposes a rank-d factorization

$$\displaystyle \begin{aligned} \mathcal{A}_h \approx \mathcal{X} \cdot \mathcal{R}_h \cdot \mathcal{X}^T \end{aligned}$$

of its h-th slice \(\mathcal {A}_h \in \left \{ 0,1 \right \}^{n \times n}\) by means of matrices \(\mathcal {X} \in \mathbb {R}^{n \times d}\) and \(\mathcal {R}_h \in \mathbb {R}^{d \times d}\) with \(d \ll n\). Therefore, the i-th row of the matrix \(\mathcal {X}\) contains an alternative representation \(\gamma (v_i) := \left (\mathcal {X}_{i,1},\ldots ,\mathcal {X}_{i,d}\right ) \in \mathbb {R}^d\) of \(v_i \in V\). The optimization of the matrices \(\mathcal {X}\) and \(\left (\mathcal {R}_h\right )_{1 \leq h \leq k}\) is accordingly achieved by solving the minimization problems

$$\displaystyle \begin{aligned} \min\nolimits_{\mathcal{X}, \mathcal{R}_h} f(\mathcal{X}, \mathcal{R}_h) \text{ ~for ~} f(\mathcal{X}, \mathcal{R}_h) =\frac{1}{2} \left( \displaystyle \sum\nolimits_{h=1}^k \| \mathcal{A}_h - \mathcal{X} \cdot \mathcal{R}_h \cdot \mathcal{X}^T \|{}_F^2 \right), \end{aligned}$$

with the Frobenius norm \(\|\cdot \|{ }_F\) and the corresponding element-wise operations

$$\displaystyle \begin{aligned} f(h,i,j) = \frac{1}{2} \bigg( \mathcal{A}_{h,i,j} - \gamma(v_i)^T \cdot \mathcal{R}_h \cdot \gamma (v_j) \bigg)^2. \end{aligned}$$

To reduce the complexity of these optimizations, DistMult proposes to use diagonal matrices \(\left ( \mathcal {R}_h \right )_{1 \leq h \leq k}\) [20]. However, by doing so, DistMult is limited to symmetric relations. ComplEx solves this problem by employing \(\mathbb {C}\)-valued embedding spaces [21]. In addition to the mentioned models, numerous other tensor decomposition models for KG embeddings exist, including ANALOGY [22], SimplE [23], and HolE [24].
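The following sketch contrasts the RESCAL and DistMult score functions with randomly initialized parameters; in practice, \(\mathcal {X}\) and \(\left ( \mathcal {R}_h \right )_{1 \leq h \leq k}\) are optimized against the adjacency tensor as described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, d = 4, 2, 3

# latent node representations (rows of X) and relation matrices R_h
X = rng.normal(size=(n, d))
R_full = rng.normal(size=(k, d, d))   # RESCAL: dense d×d matrices
R_diag = rng.normal(size=(k, d))      # DistMult: diagonal matrices only

def score_rescal(h, i, j):
    # approximates A[h, i, j] via γ(v_i)^T · R_h · γ(v_j)
    return X[i] @ R_full[h] @ X[j]

def score_distmult(h, i, j):
    # diagonal R_h reduces the bilinear form to an element-wise product;
    # note that this score is symmetric in i and j (symmetric relations only)
    return np.sum(X[i] * R_diag[h] * X[j])

print(score_rescal(0, 0, 1), score_distmult(0, 0, 1))
```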

3.1.2 Geometric Models

Geometric KG embedding models represent semantic relations as geometric transformations within a corresponding embedding space. In contrast to tensor decomposition models, embeddings are not determined based on characteristics of the unique adjacency tensor \(\mathcal {A}\), but with respect to individual facts \((s,p,o) \in E\).

As outlined in [10], transformations \(\tau _{p} (s) := \tau \left (\gamma (s), \overline {\gamma } (p)\right ) \in \Gamma \) are applied for subject nodes \(s \in V\) regarding predicates \(p \in R\). Accordingly, based on a distance measure \(\delta : \Gamma \times \Gamma \rightarrow \mathbb {R}_{\geq 0}\), KG embeddings are computed via score functions

$$\displaystyle \begin{aligned} f(s,p,o) := \delta \left(\tau_{p} (s), \gamma (o) \right). \end{aligned}$$

Among the family of geometric KG embedding methods, TransE [25] constitutes the most famous approach. As a translational model, it approximates object representations \(\gamma (o)\) via \(\gamma (o) \approx \tau _{p} (s) = \gamma (s) + \overline {\gamma } (p)\). Various geometric KG embedding models build upon the idea of TransE, improving the representation of nodes and relations by introducing additional components or transformations, such as

  • Relationship-specific hyperplanes to capture complex interactions between nodes and relationships more effectively (TransH) [26]

  • Relationship-specific node projection matrices to handle entities and relationships with different characteristics more flexibly (TransR) [27]

  • Adaptive projection matrices regarding differing node-relation-pairs (TransD) [28]

  • Relationship clustering to group similar relations (TransG) [29]

For a comprehensive overview of these methods, we refer to [10]. This work also discusses negative sampling, which addresses a common obstacle of KG embedding formalisms: due to the open-world assumption of KGs, \((s,p,o) \not \in E\) does not necessarily imply that the fact is false. Rather, it means that the KG does not contain information about its validity. Thus, negative sampling is applied to create a set of false facts \(E_{neg} \subseteq V \times R \times V\) with \(E \cap E_{neg} = \emptyset \) to train the embeddings in a supervised way.
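As a rough sketch of the translational idea and of negative sampling (with random instead of trained embeddings, and index-based triples for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, d = 4, 2, 3
gamma_V = rng.normal(size=(n, d))   # node embeddings γ(v_i)
gamma_R = rng.normal(size=(k, d))   # relation embeddings for each r_h

def score_transe(i, h, j):
    # δ(τ_p(s), γ(o)) with τ_p(s) = γ(s) + γ̄(p) and the Euclidean distance δ;
    # small scores indicate plausible triples (v_i, r_h, v_j)
    return np.linalg.norm(gamma_V[i] + gamma_R[h] - gamma_V[j])

E = {(0, 0, 1), (1, 0, 2), (2, 1, 3)}  # observed (index-based) triples

def negative_sample(s, p, o):
    """Corrupt the object of a true triple to obtain a (presumably) false fact."""
    while True:
        o_neg = int(rng.integers(n))
        if (s, p, o_neg) not in E:
            return (s, p, o_neg)

pos = (0, 0, 1)
neg = negative_sample(*pos)
# margin-based training would push score(pos) below score(neg)
print(score_transe(*pos), score_transe(*neg))
```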

3.1.3 Deep Learning Models

Graph-based deep learning (DL) approaches, also referred to as Graph Neural Networks (GNNs), have existed for some time, especially in the context of complex network systems and their underlying undirected graph structures [30]. However, the application of such algorithms to directed and labeled KGs may lead to a loss of relevant information. To address this issue, Graph Convolutional Networks (GCNs) were first introduced to account for directed edges [31]. Furthermore, to accommodate different relation types, Relational Graph Convolutional Networks (RGCNs) were elaborated as extensions of GCNs [32], which were subsequently extended by means of attention mechanisms [33] in Relational Graph Attention Networks (RGATs) [34].

In contrast to geometric KG embedding models that apply score functions to individual triples and tensor decomposition models that intend to reduce the dimensionality of the adjacency tensor \(\mathcal {A}\), DL-based models perform predictions for labeled nodes \(v \in \overline {V} \subseteq V\), taking into account the node itself and its neighbors

$$\displaystyle \begin{aligned} N (v) := \left\{ y \in V~|~\exists (s,p,o) \in E : \left( s=y \land o = v \right) \lor \left( s=v \land o = y \right) \right\}. \end{aligned}$$

These labels can be derived from the KG itself via node assertions or link assignments, or they can be external, such as numerical or nominal node attributes. Adjacent node representations are aggregated to obtain a composite node representation of v. By backpropagating a suitable loss, the initial embeddings of v and its neighbors are optimized. This process is repeated for each labeled training node to generate latent feature representations for all \(v \in \overline {V} \cup \bigcup _{w \in \overline {V}} N(w)\). The formalism proposed in [32] subdivides \(N (v)\) into relation-specific neighborhoods

$$\displaystyle \begin{aligned} N_r (v) := \left\{ y \in V~|~\exists (s,p,o) \in E : \left( \left( s=y \land o = v \right) \lor \left( s=v \land o = y \right) \right) \land p=r \right\}, \end{aligned}$$

regarding relation types \(r \in R\). Thus, given a matrix of (initial) feature representations \(\mathcal {X} \in \mathbb {R}^{n \times d}\) (i.e., the i-th row of \(\mathcal {X}\) is an embedding of \(v_i \in V\)), embeddings of outgoing neighbors can be incorporated in the forward pass of a GNN via

$$\displaystyle \begin{aligned} \mathcal{A}_h \cdot \mathcal{X} \in \mathbb{R}^{n \times d}, \end{aligned}$$

where \(\mathcal {A}_h\) denotes the h-th slice of \(\mathcal {A}\). For instance, in the context of the KG from Fig. 5, the composite representation of \(v_1\) regarding the relation \(r_1\) equals the sum of the initial embeddings of \(v_2\) and \(v_3\). To account for differing impacts of incoming and outgoing edges, R is typically extended via inverse relations \(r'\) for each \( r \in R\). Some works also consider a self-relation \(r_0\). Accordingly, by taking into account the adjacency matrices \(\widehat{\mathcal {A}}_0=Id\), \(\widehat{\mathcal {A}}_{2h-1}=\mathcal {A}_{h}\), and \(\widehat{\mathcal {A}}_{2h} = \mathcal {A}^T_{h}\) for \(1 \leq h \leq k\) (i.e., the original relation \(r_h\) is assigned index \(2h-1\) and its inverse index \(2h\)), we extend the set R via

$$\displaystyle \begin{aligned} \widehat{R} := R \cup \left\{ r'~|~r \in R\right\} \cup \left\{ r_0 \right\} \text{ ~with ~} r_h^{\prime} = r_{2h}. \end{aligned}$$

By doing so, GNN models capture the semantics of directed and labeled graphs by summing up weighted composite representations to obtain a convolved matrix

$$\displaystyle \begin{aligned} \displaystyle \sum_{h=0}^{2k} \widehat{\mathcal{A}}_h \cdot \mathcal{X} \cdot \mathcal{W}_h \in \mathbb{R}^{n \times d'}, \end{aligned}$$

including relation-specific weight matrices \(\mathcal {W}_h \in \mathbb {R}^{d \times d'}\). Moreover, the extended adjacency tensor \(\widehat {\mathcal {A}} \in \mathbb {R}^{(2k+1) \times n \times n}\) is not necessarily \(\{0,1\}\)-valued. Rather, it is intended to contain normalization constants or attention scores to encode the significance of individual nodes and relations to the forward pass of a GNN. However,

$$\displaystyle \begin{aligned} \left( v_i, r_h, v_j \right) \not\in E \Rightarrow \widehat{\mathcal{A}}_{h,i,j} = 0 \end{aligned}$$

still holds. If no normalization constants or attention mechanisms are to be implemented, this tensor can be directly derived from \(\mathcal {A} \in \left \{ 0,1 \right \}^{k \times n \times n}\) by means of matrix transpositions and the insertion of an additional identity matrix. Finally, by introducing an activation function \(\sigma : \mathbb {R} \rightarrow \mathbb {R}\) such as ReLU, the generalized forward pass of a GNN layer (including RGCNs and RGATs) can be defined as

$$\displaystyle \begin{aligned} \sigma \left( \displaystyle \sum_{h=0}^{2k} \widehat{\mathcal{A}}_h \cdot \mathcal{X} \cdot \mathcal{W}_h \right) =: \mathcal{X}' \in \mathbb{R}^{n \times d'}. {} \end{aligned} $$
(2)
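A compact sketch of this generalized forward pass with randomly initialized inputs is given below; the ordering of the slices of \(\widehat {\mathcal {A}}\) is immaterial as long as each slice is paired with its own weight matrix \(\mathcal {W}_h\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, d, d_out = 4, 2, 3, 2

A = rng.integers(0, 2, size=(k, n, n))        # adjacency tensor A ∈ {0,1}^{k×n×n}

# extended tensor: self-relation (identity), original slices, and inverse slices
A_hat = np.concatenate([np.eye(n)[None], A, A.transpose(0, 2, 1)], axis=0)

X = rng.normal(size=(n, d))                   # initial node representations
W = rng.normal(size=(2 * k + 1, d, d_out))    # relation-specific weight matrices

def relu(x):
    return np.maximum(x, 0.0)

# generalized forward pass of a single GNN layer as per Eq. (2)
X_next = relu(sum(A_hat[h] @ X @ W[h] for h in range(2 * k + 1)))
print(X_next.shape)  # (n, d_out)
```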

4 Industrial Applications of Knowledge Graph Embeddings

The lack of use case scenarios poses a significant challenge to the application of KGs and corresponding KG embeddings in the manufacturing domain. Without specific applications, it becomes difficult to identify the relevant data sources, design appropriate KG structures, and create meaningful embeddings that capture the intricate relationships within manufacturing processes. Thus, the absence of concrete use cases hinders the exploration of the full potential of KGs and KG embeddings in improving efficiency, decision-making, and knowledge sharing within this domain.

As a result of the research conducted within the Teaming.AI project, which aims to enhance flexibility in Industry 4.0, while prioritizing human involvement and collaboration in maintaining and advancing AI systems, we identified several application scenarios within the manufacturing domain that can be leveraged by introducing industrial KGs and KG embeddings. These are introduced in the following.

Data Integration and Fusion

Manufacturing involves diverse and complex data from various sources, such as sensors, process logs, or maintenance records. While KGs can integrate these heterogeneous data sources, KG embeddings map them into a shared representation space. By representing KG nodes and their relationships in this shared embedding space, it becomes easier to combine and analyze data from different sources, leading to enhanced data fusion capabilities.

Semantic Similarity and Recommendation

KG embeddings allow for quantifying the semantic similarity between nodes. In the manufacturing domain, this can be useful for recommending similar products, materials, or processes based on their embeddings. For example, embeddings can help to identify alternative materials with desired properties or characteristics, thereby aiding in material selection.

Supply Chain Management

Effective supply chain management is crucial for manufacturing. KGs and corresponding KG embeddings can help model and analyze complex supply chain networks by representing suppliers, products, transportation routes, and inventory levels as graph entities. By considering their semantic relations, embeddings can facilitate supply chain optimization, demand forecasting, and identifying potential risks in the supply chain.

Decision Support Systems

KG embeddings and relational ML techniques can serve as a foundation for developing decision support systems in manufacturing. By learning from empirical semantic observations, these systems can provide recommendations, insights, and decision-making support to operators, engineers, and managers. For example, based on the current state of the manufacturing environment, the system can suggest optimal operating conditions or maintenance actions. Moreover, such systems can learn to recommend suitable ML models for AI activities, given the current manufacturing environment.

Fault Detection and Diagnosis

KG embeddings combined with relational ML techniques can aid in fault detection and diagnosis in manufacturing systems. By analyzing historical data and capturing the relationships between machines, process variables, and failure events, embeddings can be used to build systems that identify faults or failures in advance. This facilitates proactive maintenance, reduces downtime, and improves overall effectiveness.

In conclusion, KGs allow for representing manufacturing concepts and entities (such as processes, machines, and human workers) and their semantic relationships. KG embeddings, on the other hand, capture inherent semantics in lean numerical representations which facilitate (i) the analysis of existing manufacturing knowledge and (ii) the extraction of new manufacturing knowledge based on empirical observations. As a powerful tool for representing domain knowledge in a human- and machine-interpretable way, KGs enable the combination of human comprehensibility with the computational capabilities of machines. This synergy of human and machine intelligence enables effective collaboration, decision-making, and efficient problem solving in the manufacturing domain. Moreover, it represents a step toward optimized human-in-the-loop scenarios [35] and human-centric Industry 5.0 [36].

However, the manufacturing domain is inherently dynamic, with continuous changes in its processes, equipment, materials, and market demands. Therefore, it is crucial to incorporate these dynamics into KG embeddings, which are typically designed for static snapshots of a domain (cf. Sect. 3.1). In the end, KG embeddings should be able to capture the evolving relationships, dependencies, and contextual information, preferably in real time. By incorporating dynamics, the embeddings can adapt to changes in manufacturing operations, such as process modifications, equipment upgrades, or variations in product requirements. This enables the representations to accurately reflect the current state of the manufacturing system and to capture the evolving aspects of runtime observations and data.

5 The Navi Approach: Dynamic Knowledge Graph Embeddings via Local Embedding Reconstructions

Most of the existing works on dynamic graph embeddings do not account for directed and labeled graphs. Rather, they are designed to be applicable to undirected and/or unlabeled graphs [37, 38], or they aim to embed temporally enhanced snapshots of non-dynamic graphs [39, 40]. Moreover, approaches like the one proposed in [41] exist that intend to perform an online training of KG embeddings by focusing on regions of the graph which were actually affected by KG updates. However, the overall embedding structure is still affected, leading to a need for continuous adjustments of embedding-based downstream tasks, such as graph-based ML models. Thus, we require a dynamic KG embedding formalism that (i) can produce real-time embeddings for dynamic KGs and (ii) is able to preserve the original structure of KG embeddings to allow for consistent downstream applications.

We propose to utilize the dynamic Navi approach [42], which is based on the core idea of GNNs as per Eq. (2). Given an initial KG \(\mathcal {G}_{t_0} = (V_{t_0},E_{t_0})\) at timestamp \(t_0\), we assume an embedding \(\widetilde {\gamma }_{t_0} : V_{t_0} \rightarrow \mathbb {R}^d\) based on some state-of-the-art KG embedding method from Sect. 3.1. Accordingly, a dynamic KG is defined as a family of stationary snapshots \(\left ( \mathcal {G}_t \right )_{t \in \mathcal {T}}\) with respect to some time set \(\mathcal {T}\). Given a future timestamp \(t > t_0\), the Navi approach provides a consistent embedding \(\gamma _{t} : V_{t} \rightarrow \mathbb {R}^d\) so that previously trained downstream models can still be employed.

Since we leverage the idea of GNNs to reconstruct \(\widetilde {\gamma }_{t_0}(v)\) through local neighborhoods, these reconstructions are based on the unique adjacency tensors \(\left ( \mathcal {A}(t) \right )_{t \in \mathcal {T}}\) with \(\mathcal {A}(t) \in \mathbb {R}^{k \times n_t \times n_t}\). Here, \(n_t = \left | \bigcup _{\tau \leq t} V_\tau \right |\) denotes the number of nodes that have existed at some point since the graph's initialization, and thus \(n_{t} \geq n_{t_0}\) holds. Accordingly, we assume an initial embedding matrix \(\widetilde {\mathcal {X}}_{t_0} \in \mathbb {R}^{n_{t_0} \times d}\) that contains the initial embeddings as per \(\widetilde {\gamma }_{t_0}\). This matrix is then reconstructed based on itself via a single-layer GNN

$$\displaystyle \begin{aligned} \sigma \left( \widehat{\mathcal{A}}(t_0)_{0} \cdot \Theta_{t_0} \cdot \mathcal{W}_{0} + \sum\nolimits_{h=1}^{2k} \widehat{\mathcal{A}}(t_0)_h \cdot \widetilde{\mathcal{X}}_{t_0} \cdot \mathcal{W}_h \right) =: \mathcal{X}_{t_0} \approx \widetilde{\mathcal{X}}_{t_0} \end{aligned}$$

by taking into account the extended adjacency tensor \(\widehat {\mathcal {A}}({t_0})\) (cf. Sect. 3.1.3). During the training process, a global embedding \(\gamma _{r_0} \in \mathbb {R}^d\) is implemented regarding the self-relation \(r_0\) so that \(\Theta _{t_0} \in \mathbb {R}^{n_{t_0} \times d}\) contains \(n_{t_0}\) copies of \(\gamma _{r_0}\). Moreover, instead of zero-value dropouts, overfitting is prevented by randomly replacing node embeddings with \(\gamma _{r_0}\) in the input layer, simulating the semantic impact of nodes that are not known at time \(t_0\). The global embedding is also used to represent self-loops, enabling reconstructions that are independent of the (potentially unknown) initial representations. A detailed overview, including training settings and benchmark evaluation results, can be found in [42]. The evaluation shows that, given a timestamp \(t > t_0\), this approach allows for high-quality and consistent embeddings \(\gamma _t : V_t \rightarrow \mathbb {R}^d\) that are computed via

$$\displaystyle \begin{aligned} \sigma \left(\widehat{\mathcal{A}}(t)_{0} \cdot \Theta_{t} \cdot \mathcal{W}_{0} + \sum\nolimits_{h=1}^{2k} \widehat{\mathcal{A}}(t)_h \cdot \widetilde{\mathcal{X}}_{t} \cdot \mathcal{W}_h \right) =: \mathcal{X}_{t}, \end{aligned}$$

i.e., the i-th row of \(\mathcal {X}_{t}\) represents the embedding \(\gamma _t (v_i)\) of the node \(v_i \in V_t\). In the case of new nodes, \(\widetilde {\mathcal {X}}_{t}\) and \(\Theta _{t}\) are obtained by extending \(\widetilde {\mathcal {X}}_{t_0}\) and \(\Theta _{t_0}\) with copies of \(\gamma _{r_0}\), respectively. Moreover, the update of the adjacency tensor can be performed via

$$\displaystyle \begin{aligned} \mathcal{A}({t})_h = I({t_0,t})^T \cdot \mathcal{A}(t_0)_h \cdot I({t_0,t}) + \mathcal{B}({t_0,t})_h. \end{aligned}$$

First, the matrix \(I({t_0,t}) \in \{0,1\}^{n_{t_0} \times n_{t}}\) accounts for newly inserted nodes, i.e.,

$$\displaystyle \begin{aligned} I({t_0,t})_{i,j} = 1 \iff i = j. \end{aligned}$$

Second, the update matrices \(\mathcal {B}({t_0,t})_h \in \{ -1,0,1 \}^{n_{t} \times n_{t}}\) encode the KG updates via

$$\displaystyle \begin{aligned} \left(\mathcal{B}({t_0,t})_h\right)_{i,j} = \begin{cases} 1 & \text{if the edge } (v_i, r_h, v_j) \text{ was inserted between }t_0\text{ and }t,\\ -1 & \text{if the edge } (v_i, r_h, v_j) \text{ was deleted between }t_0\text{ and }t,\\ 0 & \text{otherwise.} \end{cases} \end{aligned}$$

After a KG update, a synchronizing assistant is to provide (i) the number of nodes \(n_{t}\) and (ii) the update tensor \(\mathcal {B}({t_0,t}) \in \{ -1,0,1 \}^{k \times n_{t} \times n_{t}}\). For instance, given an Apache Jena Fuseki KG, existing logging tools like rdf-delta can be extended to serve as such synchronizing assistants. Moreover, while we focus on a single update at time \(t \in \mathcal {T}\), transitions between arbitrary timestamps can be handled as well, i.e.,

$$\displaystyle \begin{aligned} \mathcal{A}({t'})_h = I({t,t'})^T \cdot \mathcal{A}(t)_h \cdot I({t,t'}) + \mathcal{B}({t,t'})_h \text{ ~for ~} t_0 < t < t'. \end{aligned}$$
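The following sketch combines the adjacency tensor update and the subsequent Navi reconstruction under simplifying assumptions: the trained weight matrices and the global embedding \(\gamma _{r_0}\) are replaced by random placeholders, and the dimensions are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 1, 3
n_t0, n_t = 3, 4                           # one node was inserted between t_0 and t

# adjacency tensor at t_0 and its update via I(t_0, t) and B(t_0, t)
A_t0 = np.zeros((k, n_t0, n_t0), dtype=int)
A_t0[0, 0, 1] = 1                          # e.g., the edge (v_1, r_1, v_2) exists at t_0

I = np.zeros((n_t0, n_t), dtype=int)       # pads the adjacency matrices for new nodes
np.fill_diagonal(I, 1)

B = np.zeros((k, n_t, n_t), dtype=int)     # +1 for inserted, -1 for deleted edges
B[0, 2, 3] = 1                             # the edge (v_3, r_1, v_4) was inserted
B[0, 0, 1] = -1                            # the edge (v_1, r_1, v_2) was deleted

A_t = np.stack([I.T @ A_t0[h] @ I + B[h] for h in range(k)])

# extended adjacency tensor: self-relation slice, original slices, inverse slices
A_hat_t = np.concatenate([np.eye(n_t)[None], A_t, A_t.transpose(0, 2, 1)], axis=0)

# placeholders for the trained Navi parameters and the initial embeddings
X_tilde_t0 = rng.normal(size=(n_t0, d))    # initial embeddings as per γ̃_{t_0}
gamma_r0 = rng.normal(size=d)              # global embedding of the self-relation r_0
W = rng.normal(size=(2 * k + 1, d, d))     # relation-specific weight matrices

# extend X̃ and Θ with copies of γ_{r_0} for the newly inserted node
X_tilde_t = np.vstack([X_tilde_t0, gamma_r0])
Theta_t = np.tile(gamma_r0, (n_t, 1))

def relu(x):
    return np.maximum(x, 0.0)

# Navi reconstruction: row i of X_t is the consistent embedding γ_t(v_i)
X_t = relu(A_hat_t[0] @ Theta_t @ W[0]
           + sum(A_hat_t[h] @ X_tilde_t @ W[h] for h in range(1, 2 * k + 1)))
print(X_t.shape)  # (n_t, d)
```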

In conclusion, the late shaping of KG embeddings via Navi reconstructions represents a promising approach for incorporating dynamic KG updates and semantic evolutions into KG embeddings as lean feature representations of domain concepts and instances. Besides allowing for consistent embeddings, the results in [42] even showed that the reconstruction of existing embeddings often leads to improved performance in downstream tasks like link prediction and entity classification as key enablers of the industrial use case applications outlined in Sect. 4.

6 Conclusions

In this work, we highlighted the increasing importance of representing and exploiting semantics, with a specific emphasis on the manufacturing domain. While industrial KGs are already employed and utilized to integrate and standardize domain knowledge, the generation and application of KG embeddings as lean feature representations of graph elements have been largely overlooked. Existing KGs lack either domain dynamics or contextuality, limiting the applicability of context-dependent embedding algorithms. Thus, we provided an overview of state-of-the-art KG embedding techniques, including their characteristics and prerequisites. In this context, we emphasized the need for dynamic embedding methods and their implementation in concrete manufacturing scenarios, describing potential KG embedding applications in industrial environments, which were identified as a result of the Teaming.AI project. Furthermore, we introduced the concept of Navi reconstructions as a real-time and structure-preserving approach for generating dynamic KG embeddings.

To summarize, KGs and KG embeddings offer significant advantages for the manufacturing domain. The structured representation of complex relationships in KGs enables context-awareness, dynamic analysis, and efficient information retrieval. Furthermore, the utilization of KG embeddings promotes process optimization, leading to improved product quality, reduced errors, and an increased overall productivity.