
1 Introduction

Advancements in Artificial Intelligence (AI) have enabled automation, prediction, and problem-solving, leading to increased productivity, adaptability, and efficiency in both the service and industrial sectors. In the latter, the fourth industrial revolution, commonly referred to as Industry 4.0 [19], represents a significant shift in industrial production. Driven by the integration of digital technology, industrial advancements increasingly hinge on data, which has opened up new possibilities beyond traditional applications.

A key goal toward Industry 5.0 is to combine human adaptability with machine scalability. Knowledge Graphs (KGs) provide a foundation for developing frameworks that enable such integration because they make it possible to dynamically integrate human decision-making with AI-generated recommendations and decisions [34]. KGs represent knowledge in a graph-based structure that connects entities through their relationships. In the context of hybrid human-AI intelligence, KGs can represent shared conceptualizations between humans and AI components, which facilitates their collaboration.

Furthermore, KGs provide a critical abstraction for organizing semi-structured domain information. By utilizing KGs, decision-making can be improved, knowledge management can be enhanced, personalized interactions can be enabled, predictive maintenance can be supported, and supply chain operations can be optimized. Therefore, KGs serve as a foundation for creating a shared knowledge space for AI components and humans, an environment for the representation of policies that govern the interactions between agents, and a view of real-world physical processes on the shop floor by extracting and integrating relevant events.

Thus, KGs hold enormous potential for facilitating collaboration in this field, making production lines more efficient and flexible while producing higher quality products. As such, companies seeking to achieve the goals of Industry 5.0 find that KGs can realize their vision [20]. However, research in this area is still in its early stages, and further studies are required to analyze how KGs might be implemented. This overview aims to provide a review of the current state of research in this field, as well as the challenges that remain open.

The rest of this work is structured as follows: Sect. 2 reviews the current usage scenarios of KGs in industrial settings, Sect. 3 outlines the research questions and search strategy, Sect. 4 presents the major findings obtained when analyzing these research questions, together with the lessons learned and the open problems for future research, and Sect. 5 concludes the chapter.

2 Antecedents and Motivation

The industrial landscape has been revolutionized by the emergence of collaborative paradigms between human–machine systems, such as the Internet of Things (IoT), Internet of Services (IoS), and Cyber-Physical Systems (CPS), resulting in the so-called Industry 5.0. This has led to a shift in focus toward enhancing collaboration between humans and machines [13]. The creation and connection of newly available devices generate enormous amounts of data with significant potential value, which can be used to extend the product’s life cycle and to support on-demand manufacturing, resource optimization, machine maintenance, and other logistical arrangements [8].

KGs have recently gained much attention due to their potential to boost productivity across various sectors. KGs can primarily empower industrial goods and services and their development process in two areas. First, they may save time and labor costs while enhancing accuracy and efficiency in domain information retrieval for requirements gathering, design and implementation, and service and maintenance management, by offering a semantic-based and in-depth knowledge management approach. Second, the development of KGs has made it possible to build a data architecture suitable for corporate use by combining intelligent discovery and knowledge storage, utilizing KG embedding techniques to gather more information from KGs. KGs are most often employed to model a particular and frequently complex domain semantically, explicitly capturing domain knowledge that supports and increases the accuracy of tasks performed further down the pipeline. Furthermore, advanced methodologies founded on KGs have become essential for knowledge representation and business process modeling.

In recent years, there has been a rise in interest in analyzing KGs using Machine Learning (ML) techniques, for tasks such as predicting missing edges or classifying nodes. Because most ML models take feature vectors as input, much research has been devoted to developing methods for building embeddings from KGs. KG embedding transforms nodes and, depending on the technique, edges into a numerical representation that can be fed directly into an ML model [2].
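To make the idea of a KG embedding concrete, the following minimal sketch learns one vector per entity and per relation with a translation-based objective in the spirit of TransE. It is deliberately simplified: it trains on positive triples only, without negative sampling or normalization, so it illustrates the mechanics rather than a production method. The triples, names, dimension, and hyperparameters are all hypothetical.

```python
import random

# Toy triples (head, relation, tail); all names are hypothetical.
TRIPLES = [
    ("press_01", "located_in", "line_A"),
    ("robot_07", "located_in", "line_A"),
    ("press_01", "produces", "door_panel"),
]

DIM = 8                 # embedding dimension (assumed for illustration)
rng = random.Random(0)  # fixed seed so the run is reproducible


def random_vector():
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


entities = sorted({t[0] for t in TRIPLES} | {t[2] for t in TRIPLES})
relations = sorted({t[1] for t in TRIPLES})
ent_vec = {e: random_vector() for e in entities}
rel_vec = {r: random_vector() for r in relations}


def score(h, r, t):
    """Squared distance ||h + r - t||^2; lower means more plausible."""
    return sum(
        (a + b - c) ** 2
        for a, b, c in zip(ent_vec[h], rel_vec[r], ent_vec[t])
    )


def total_loss():
    return sum(score(h, r, t) for h, r, t in TRIPLES)


def train(epochs=200, lr=0.05):
    """Plain stochastic gradient descent on the positive triples only."""
    for _ in range(epochs):
        for h, r, t in TRIPLES:
            for i in range(DIM):
                diff = ent_vec[h][i] + rel_vec[r][i] - ent_vec[t][i]
                grad = 2.0 * diff
                ent_vec[h][i] -= lr * grad
                rel_vec[r][i] -= lr * grad
                ent_vec[t][i] += lr * grad


loss_before = total_loss()
train()
loss_after = total_loss()

# The learned vector of a node can now serve as an ML feature vector.
features = ent_vec["press_01"]
```

After training, `features` is a plain list of `DIM` numbers that could be passed to any downstream classifier or link predictor.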

Furthermore, KGs are widely regarded as a tool for optimizing supply chain operations, reducing costs, and improving overall efficiency. Manufacturers can model their supply chain using KGs to fully understand how their suppliers, customers, and operations depend on one another, enabling them to make real-time data-based decisions.
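As an illustration of how such dependencies can be explored, the sketch below stores a small supply chain as triples and uses a breadth-first traversal to find every party affected by a disruption at one supplier. The entity names and the `supplies` relation are hypothetical.

```python
from collections import deque

# (subject, relation, object) triples; all names are hypothetical.
SUPPLY_CHAIN = [
    ("supplier_steel", "supplies", "plant_stamping"),
    ("plant_stamping", "supplies", "plant_assembly"),
    ("plant_assembly", "supplies", "customer_oem"),
    ("supplier_paint", "supplies", "plant_assembly"),
]


def downstream(start):
    """Breadth-first search for every node that depends on `start`."""
    successors = {}
    for subject, relation, obj in SUPPLY_CHAIN:
        if relation == "supplies":
            successors.setdefault(subject, []).append(obj)

    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        for nxt in successors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


affected_by_steel = downstream("supplier_steel")
affected_by_paint = downstream("supplier_paint")
```

Here a steel shortage propagates through stamping and assembly to the customer, whereas a paint shortage only affects assembly and the customer.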

In summary, many KGs have been built, both open to the public and closed for internal company use. Enterprise KGs are closed applications that can only be used by authorized personnel, while Open KGs are usually academic or open-source projects available for use by anyone on the Web. By applying modeling and ML to these KGs, organizations can gain insights and make data-based decisions. This research work describes the current state of Open KGs.

3 Research Questions and Search Strategy

KGs serve as semantic representations of various aspects involved in the manufacturing process. These aspects include all phases of system engineering, such as the phases of development (e.g., layouts), organizational development (e.g., collaboration and worker roles), and operational development (e.g., user stories). These KGs can improve processes by combining the data coming from industrial monitoring and from human work with additional contextual data and knowledge sources. Examples of such sources include technical documentation about the process, questionnaires about maintenance cases, constraints and rules representing standards and policies for safety or ethical issues, protocols about teaming workflows, logs of process states, and user feedback. This chapter investigates the present state of KGs in manufacturing, identifies potential application areas, and highlights opportunities for future work. The following are the primary research questions that guided this study.

3.1 Research Questions

We propose some research questions to provide specific insights into how KGs are used in manufacturing. These Research Questions (RQs) consider that the two most popular KG types are the Resource Description Framework (RDF) and the Labeled Property Graph (LPG). Our RQs are designed to cover the essential aspects of bibliometric facts and application scenarios.

RQ1: Which areas within manufacturing are most interested in KGs?

The purpose of RQ1 is to demonstrate the significance and relevance of the topic by providing an overview of bibliometric facts from previously published studies on the applications of KGs in manufacturing.

RQ2: Which manufacturing domains commonly employ KGs?

RQ2 investigates KG application scenarios within manufacturing. Specifically, we will examine the manufacturing domains in which KGs have been used, the specific use cases, and the types of systems developed.

RQ3: What is the popularity of RDF and LPG as KG types?

RQ3 aims to evaluate the degree to which KG applications have matured by investigating some research aspects such as the format and standards used.

RQ4: How are industrial KGs currently used?

RQ4 discusses which building, exploitation, and maintenance procedures are commonly followed in manufacturing-related KGs. This provides insight into the structure of KGs that is vital for researchers and practitioners.

3.2 Dataset

To address these RQs, we analyze a significant sample of literature published in recent years. The search scope covers gray literature, such as professional forums and publications, as well as peer-reviewed academic work published in journals, conference proceedings, or books. In total, we have identified 40 publications using KGs that appeared between 2016 and 2022. The authors of these items come from a diverse range of academic disciplines and represent institutions from various parts of the world. Overall, the sample of publications provides a comprehensive and diverse set of perspectives on the research questions at hand. The following is an analysis of the main characteristics of these sources.

3.3 Subject Area

The use of KGs in manufacturing is an emerging field that has drawn considerable attention from both industry and academic communities. The current body of research primarily originates from Computer Science. By contrast, there is a significant gap in research output from the fields of Engineering and Business, even though they are the next two most represented areas of knowledge. The scope of research in other areas, such as Chemistry, Materials Science, and Physics and Astronomy, remains limited, with only a marginal number of proposals. Figure 1 presents a comprehensive categorization of the research venues considered in this chapter. The classification scheme is based on the self-description provided by each venue where the research works have been published.

Fig. 1 Research communities that have published the most research on KGs in manufacturing. Horizontal bar chart of the percentage of papers per research community: Computer Science has the largest share, followed by Engineering and Business, with smaller and almost equal shares for Chemistry, Materials Science, and Physics and Astronomy.

3.4 Manufacturing Domain

The presented findings illustrate the prevailing domains explored in the literature on applying KGs in manufacturing, as summarized in Fig. 2. To determine whether a given paper pertains to the manufacturing domain, the North American Industry Classification System (NAICS) was employed. However, most of the examined literature does not specify any particular application domain.

Fig. 2 Manufacturing domains that have carried out the most research work around KGs. Horizontal bar chart of the percentage of papers per domain: the general domain has the largest share, followed by machinery, materials, chemistry, and automotive, with smaller and almost equal shares for textile, operations, mining, aerospace, and additive manufacturing.

Machinery is identified as the second most frequently represented domain, after which Materials, Chemistry, and Automotive follow. Furthermore, Additive Manufacturing, Aerospace, Mining, Operations, and Textile, albeit less frequently investigated, are also observed in the literature. The identified domains highlight the diverse industries that benefit from leveraging KGs.

Most reviewed works employ knowledge fusion techniques in general scenarios where KGs combine data from multiple sources. Additionally, KGs are applied to automate the merging of isolated production processes, generate digital twins based on KGs, and utilize them for automated source code development. These findings demonstrate the versatility of KGs in manufacturing and their potential to revolutionize various aspects of industrial production.

3.5 Kinds of KGs

A KG may be modeled as either an RDF graph or an LPG, depending on the data requirements. As shown in Fig. 3, RDF-based solutions currently dominate the field. However, a considerable proportion of solutions are also represented by LPGs.

Fig. 3 Knowledge Graph adoption in the manufacturing industry by representation paradigm. Horizontal bar chart of the percentage of papers per paradigm: almost 80 percent for RDF and 20 percent for LPG.

RDF is a recommended standard from the World Wide Web Consortium that provides a language for defining resources on the Web. The representation of resources is accomplished using triples that consist of a subject, predicate, and object. RDF Schema, commonly referred to as RDFS, defines the vocabulary used in RDF descriptions. The RDF data model is specifically designed for knowledge representation and is used to encode a graph as a set of statements. By standardizing data publication and sharing on the Web, RDF seeks to ensure semantic interoperability. The semantic layer of the available statements, along with the reasoning applied to it, forms the foundation of intelligent systems in the RDF domain.
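The triple model and the role of RDFS-style reasoning can be sketched in a few lines. The example below keeps triples in a plain Python set instead of using an RDF library, and applies only the RDFS subclass rule (a resource typed with a class is also an instance of its superclasses). The `rdf:type` and `rdfs:subClassOf` IRIs are the standard ones; the `http://example.org/factory#` namespace and its resources are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/factory#"  # hypothetical namespace

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "press_01", RDF_TYPE, EX + "HydraulicPress"),
    (EX + "HydraulicPress", RDFS_SUBCLASS, EX + "Machine"),
    (EX + "press_01", EX + "locatedIn", EX + "line_A"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_types(graph):
    """Apply the RDFS subclass rule until no new triple is produced."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for subject, _, cls in match(inferred, p=RDF_TYPE):
            for _, _, parent in match(inferred, s=cls, p=RDFS_SUBCLASS):
                new_triple = (subject, RDF_TYPE, parent)
                if new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


explicit_machines = match(triples, p=RDF_TYPE, o=EX + "Machine")
machines = [s for s, _, _ in match(infer_types(triples), p=RDF_TYPE, o=EX + "Machine")]
```

No statement says directly that `press_01` is a `Machine`, so the pattern query on the raw triples is empty; after inference, the same query returns it.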

On the other hand, LPG representation primarily emphasizes the graph’s structure, properties, and relationships. This highlights the unique characteristics of graph data, opening new opportunities for data analysis and visualization. It also brings a window of opportunity for developing ML systems that use graphs to infer additional information.
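For comparison, a minimal sketch of the labeled property graph model follows. Nodes carry labels and key-value properties, and, unlike a plain triple, each edge can carry its own properties as well. The classes, identifiers, and property names are hypothetical and do not correspond to any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


nodes = {
    "press_01": Node("press_01", frozenset({"Machine"}), {"vendor": "ACME", "year": 2019}),
    "line_A": Node("line_A", frozenset({"ProductionLine"}), {"shift_count": 3}),
    "op_17": Node("op_17", frozenset({"Operator"}), {"certified": True}),
}

edges = [
    Edge("press_01", "line_A", "LOCATED_IN", {"since": "2020-01-15"}),
    Edge("op_17", "press_01", "OPERATES", {"shift": "night"}),
    Edge("op_17", "line_A", "ASSIGNED_TO", {"shift": "night"}),
]


def targets(source, rel_type, **edge_props):
    """Follow edges of one type from `source`, filtering on edge properties."""
    return [
        nodes[e.target]
        for e in edges
        if e.source == source
        and e.rel_type == rel_type
        and all(e.properties.get(k) == v for k, v in edge_props.items())
    ]


night_machines = [n.node_id for n in targets("op_17", "OPERATES", shift="night")]
day_machines = [n.node_id for n in targets("op_17", "OPERATES", shift="day")]
```

The `shift` value lives on the relationship itself, which is the kind of structure-plus-properties view that the LPG representation emphasizes.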

Different approaches to KG representation have a significant impact on the user experience. When developers and analysts work with RDF data, they use statements and the SPARQL query language to query and change the graph. LPGs, on the other hand, are commonly queried with languages such as Cypher, which provides a more intuitive way to interact with nodes, edges, and their properties within the graph structure.

3.6 Different Approaches for KG Creation

Compared to the broader scope of research on KGs, the development of KGs in an industrial context more often employs a knowledge-driven approach. This trend may stem from the practical advantages of a more closed-world approach, which is better suited to the constraints and contingencies inherent in a production environment. It also suggests that the manufacturing industry remains cautious about adopting the latest advancements in KG embeddings to enhance their analytical capabilities.

Figure 4 depicts the distribution of popularity between the two distinct approaches for building KGs. Currently, the knowledge-driven approach prevails, but recent years have witnessed a significant surge in the number of data-driven solutions. These solutions are better equipped to deal with ML and other computational intelligence techniques.

Fig. 4 Manufacturing industry Knowledge Graphs by form of creation. Horizontal bar chart of the percentage of papers per approach: almost 75 percent in the knowledge-driven category and 25 percent in the data-driven category. Values are approximate.

4 Insights

This section summarizes the results obtained from our analysis and highlights potential areas for future research on the use of KGs in the manufacturing domain. The findings are structured according to the RQs addressed earlier in the study.

4.1 Answers to the Research Questions

Based on our study, we can deduce the most active research communities in the field of KGs. The answer to RQ1 (AQ1) is as follows:

AQ1. The majority of primary research in the field of KGs is conducted in the discipline of Computer Science. Research in KGs is less common in other areas of knowledge.

This could be because computer scientists have been developing new representation models since the beginning. Today, KGs are considered the natural progression of such models to make them more adaptable to new platforms and emerging methods for managing large amounts of data.

Regarding the answer to RQ2 (AQ2), it is unsurprising that the most common case is the preference for proposing generic models that can be easily adapted to various domains.

AQ2. The literature primarily covers the manufacturing industry as a general concept. In most of the works examined, no specific application domain was provided.

The machinery domain is the next most represented, followed by materials, chemistry, and automotive. Finally, some KGs have also been developed in the aerospace, additive manufacturing, mining, operations, and textile fields.

Regarding the representation of KGs, the two most commonly used data models are RDF and LPG. However, in answering RQ3 (AQ3), we seek to identify the current prevailing choice for representing KGs.

AQ3. In the industrial domain, RDF is the preferred format for building KGs. This is due to RDF’s ability to represent complex data and relationships in a structured and interoperable manner, which allows for the building of integrated knowledge spaces for both humans and AI components.

RDF is beneficial for industrial applications as it facilitates the integration of diverse sources and a more comprehensive understanding of the data. Moreover, the ability to query across multiple sources makes it easier for people to analyze relevant information for their specific needs.

Regarding the question of the predominant approach to constructing industrial KGs, it has been observed that knowledge-driven approaches are most commonly used, as stated in Answer to RQ4 (AQ4):

AQ4. Knowledge-driven approaches are predominant. However, new developments using data-driven approaches are expected to be increasingly incorporated into the existing body of literature as new solutions are proposed in combination with more mature techniques.

It is worth noting that existing knowledge-driven methods still encounter several general challenges, such as the interoperability and heterogeneity of data, incompleteness, and other specific challenges that arise from the goal of integrating them as active components rather than passive artifacts or mere data stores.

4.2 Additional Lessons Learned

In light of our study, the utilization of KGs within the manufacturing industry has experienced substantial growth in recent years as manufacturers seek to enhance their operational efficiency and decision-making capabilities. The structural design of KGs facilitates a more intuitive and comprehensive representation of data than traditional database models, rendering KGs well suited for the manufacturing industry.

Additional Lesson Learned #1. Although still nascent, the application of KGs within the manufacturing industry has garnered substantial interest from academia and industry.

One of the primary reasons for this keen interest is that by modeling relationships between suppliers, manufacturers, and customers, organizations can better understand the flow of goods, services, and information through their supply chain. This, in turn, can assist them in identifying bottlenecks, optimizing production processes, and ensuring product delivery to customers.

Additional Lesson Learned #2. The majority of the studies examined have been published in conference proceedings. In many instances, this indicates that the subject of investigation is still in the developmental stages. The state of the art is gradually maturing in almost every research area, leading to more journal publications with archival significance.

KGs can aid manufacturers in enhancing their ability to predict and respond to shifts in demand. This can help reduce waste, optimize production processes, and boost efficiency. However, most of the research is a work in progress, and there is still a long way to go to consolidate the results of archival value.

4.3 Open Problems

As a result of our study, we have identified several issues that limit the adoption of KGs in manufacturing and production environments. Some of the most critical issues are described below. The first issue concerns tabular data. This kind of data is frequently stored as comma-separated values (CSV). It is one of the most common input formats in industrial environments because it can model a wide variety of data associated with temporal aspects (timestamps) and spatial aspects (coordinates). However, better solutions for handling it still need to be proposed.

Problem 1 (Dealing with Tabular Data) Most solutions today are created to deal with information that is predominantly textual in its presentation. Although this information category is crucial in the sector, it is not dominant in manufacturing settings, which involve working with machinery and equipment that generate numerical data in tabular form.
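To illustrate what handling tabular input involves, the following minimal sketch lifts the rows of a small CSV sensor log, with a timestamp and coordinates, into triples that could be loaded into a KG. The column names, the `obs_<n>` identifiers, and the predicate names are hypothetical; a real pipeline would map them to an agreed vocabulary.

```python
import csv
import io

# Hypothetical sensor log with temporal and spatial columns.
CSV_TEXT = """timestamp,machine,sensor,value,x,y
2022-03-01T08:00:00,press_01,temperature,71.5,12.0,4.5
2022-03-01T08:00:05,press_01,temperature,72.1,12.0,4.5
"""


def rows_to_triples(text):
    """Turn each CSV row into one observation node described by triples."""
    triples = []
    for index, row in enumerate(csv.DictReader(io.StringIO(text))):
        obs = f"obs_{index}"  # one node per row
        triples.append((obs, "observedBy", row["machine"]))
        triples.append((obs, "measures", row["sensor"]))
        triples.append((obs, "hasValue", float(row["value"])))
        triples.append((obs, "atTime", row["timestamp"]))
        triples.append((obs, "atPosition", (float(row["x"]), float(row["y"]))))
    return triples


observation_triples = rows_to_triples(CSV_TEXT)
```

Each row becomes its own observation node, so numerical readings keep their temporal and spatial context instead of being flattened into text.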

Another fact that is taken for granted by both researchers and practitioners is that it is possible for KGs to effectively deal with information of varying types that may arrive via a variety of channels and sources. However, our research has not found a large number of papers concerned with the temporal component of processing KGs.

Problem 2 (Real-time and Synchronization) Because many of the processes involved in manufacturing are automated and must have a high degree of synchronization, the manufacturing industry demands solutions that can perform adequately in environments with substantial time constraints and synchronization needs.

Last but not least, according to the results of our investigation, work still needs to be done in compiling best practices for manufacturing KGs. In particular, we found little work on designing and proposing best practices for the sector.

Problem 3 (Lack of Standardized Procedures) A substantial obstacle still exists in identifying reference architectures to build, implement, and use KGs in industrial and production settings. A compilation of best practices can be of genuine benefit in several ways, including high standards of quality results and resource saving while developing new systems or making changes to existing ones.

KGs are suitable for the manufacturing industry because they can provide systems with contextual data to achieve efficient and effective solutions. This contextual data includes human experience, environmental knowledge, technical conventions, etc. Creating such solutions becomes critical when they directly affect human life, as in the case of a factory that employs human workers.

5 Conclusion

In this chapter, we have seen how the large amount of data generated at high velocity in the industrial sector brings new challenges. For example, this data emanates from multiple sources, each utilizing distinct formats and standards. Consequently, integrating these divergent pieces of information is essential. Contextualizing data elements through relevant relationships is imperative to ensure consistency and high data quality.

The study also examines KGs as multifaceted knowledge bases that capture interlinked descriptions of entities. KGs facilitate the smooth integration and structuring of information at large scale, even from heterogeneous sources. Unlike other knowledge bases, KGs are not homogeneous and do not require rigid schemas. This makes KGs highly scalable and suitable for integrating and connecting diverse data representations.

Semiautomatic methods, which combine available data sources with manual effort, are used to construct manufacturing KGs. However, manual KG construction is only practical for small-scale KGs, and automated methods are necessary for large-scale KGs. Therefore, automating the construction and maintenance of KGs in the manufacturing domain is essential for successful implementation.

In conclusion, utilizing KGs in the manufacturing industry can offer several advantages, including better decision-making processes and the ability to predict and respond to changes in demand. With the manufacturing industry evolving at an unprecedented rate, KGs will likely play an increasingly critical role in driving operational efficiency and competitiveness.