Standard Components Query System Based on Logical Filtering and Semantic Retrieval

Huang, Ziyan; Bian, Yongming; Yang, Meng

doi:10.1007/978-981-97-1876-4_91

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Included in the following conference series:

International Conference on Advances in Construction Machinery and Vehicle Engineering

Abstract

The establishment of a robust standard components database is essential in various industries to streamline product development and ensure quality. This paper presents a system for querying standard components data, leveraging the power of logical filtering and semantic retrieval. The structured approach of this system includes a well-defined database structure, logical filtering capabilities at different data levels, and advanced semantic retrieval techniques. The outputs of the system demonstrate its effectiveness in handling user queries, analysing unstructured data, and providing meaningful feedback based on logical filtering outcomes. This research contributes to the efficient utilization of standard components data through an innovative and powerful digital query system.

1 Introduction

In modern industries, effectively managing and utilizing standard components is crucial for achieving high-quality products, ensuring cost-effectiveness, and meeting project deadlines. The foundation for this is a comprehensive standard components database. Because collecting comprehensive component data is difficult to achieve in a single effort, this paper focuses on developing a query system for the database that can be expanded so that new data is integrated with little effort. Logical filtering and semantic searching are integrated to enhance the system’s functionality. Data on SKF [1] hydraulic seals is used as an illustrative example throughout the system development process.

A robust standard components database acts as a centralized repository, promoting consistency across projects, reducing duplication, and facilitating team collaboration. It plays a role similar to that of MEGARes 2.0 [2], which aids in identifying antimicrobial resistance genes in metagenomic data for epidemiological investigations. Such a database speeds up the design process by eliminating the need for manual handbook searches, making it indispensable for the application of AI in industrial settings.

Creating a database for standard components, akin to PubChem’s [3] inter-linked Substance, Compound, and BioAssay databases, requires a well-organized data structure, robust search functionalities (including logical and semantic filtering), and accessibility through APIs for programmatic use.

The query system for the database should contain the following parts as shown in Fig. 1. The user interface is the entry point for users to interact with the query system. It comprises two main components:

A flow chart of the Query system's functional architecture. It presents a user interface consisting of user input and result presentation, a query processor consisting of a logical filtering module and semantic retrieval module, and standard component data consisting of structured data and unstructured data. — **Fig. 1**

User Input: Takes two forms, structured queries and content-based queries. Structured queries allow users to specify attributes such as dimensions, materials, and performance metrics in a structured format. Content-based queries leverage natural language input, enabling users to describe their needs in more intuitive terms.
Result Presentation: Showcases the retrieved standard components. Users can explore the results, compare components, and select the most suitable ones for their projects.

The query processor is responsible for handling user inputs and transforming them into requests that the database can process. It comprises two key modules:

Logical Filtering Module: Allows users to filter components based on attributes such as size, material, or other technical specifications. Supports cross-logical filtering, combining criteria from different data tables to identify components that meet complex requirements.
Semantic Retrieval Module: Leverages advanced natural language processing, as demonstrated in vitrivr [4] and SOSRepair [5]. Instead of relying on precise keyword matches, it interprets query descriptions based on content, delivering results that better match the user’s intended context. It is used to explore standard component descriptions and related textual materials.

The standard components database is the core of the architecture, housing a collection of standardized parts, specifications, and related data. It is structured to accommodate different data types:

Structured Data: Includes basic data tables, which store the structured information about standard components, such as technical specifications, part numbers, and dimensions.
Unstructured Data: Contains additional information about the components in various formats, such as documentation, images, CAD drawings, and other multimedia elements.

2 Standard Components Database

2.1 Structured Data

The standard components database contains three types of structured data tables: basic data tables, associated data tables, and multilevel data tables. Figure 2 provides a visual representation of how these three types of data tables are interconnected, covering all the structured data related to the components in this comprehensive framework.

A diagram of the structured data tables. It presents structured data consisting of a basic data table and an associated data table, and a multilevel data table consists of the basic data table. — **Fig. 2**

Basic Data Tables: The fundamental building blocks of the database, containing essential structured information about individual standard components.

Associated Data Tables: Provide additional context for components across different basic data tables, improving the ability to cross-reference components and to filter them based on associated data.

Multilevel Data Tables: Capture the hierarchical relationships within standard components data, which enables representing assemblies, sub-components, and other intricate structures, and retrieving information at various levels of detail.

2.1.1 Basic Data Tables

A basic data table is designed as a simple two-dimensional table, in which columns stand for distinct features or attributes, while rows correspond to individual standard components. For instance, Table 1 displays a basic data table illustrating the types and basic features of seal parts.

Table 1 Seal types and basic features

2.1.2 Associated Data Tables

An associated data table is a set of basic data tables, each focusing on a specific set of attributes, that are related to one another during component filtering.

For example, Table 1 records seal types and materials, while Table 2 records hydraulic fluids and seal material compatibility. The synergy between these tables is crucial in choosing an appropriate seal for a specific application, as the compatibility of hydraulic fluids with seal materials (from Table 2) and the corresponding seal types (from Table 1) collectively determine the optimal choice. When a particular hydraulic fluid is specified, the system utilizes associated data from both Tables 1 and 2 to deliver logical filtering outcomes. This ensures that the chosen seal not only aligns with the hydraulic fluid but also correlates with other attributes detailed in the associated data tables.

Table 2 Hydraulic fluids and seal material compatibility

2.1.3 Multilevel Data Tables

A multilevel data table is a basic data table and its sub-tables, capturing how different attributes of the component interact with one another.

For instance, consider a scenario where Table 3 represents basic seal installation dimensions, while Table 4 serves as a sub-table, capturing installation dimensions that are specifically related to pressure considerations. While Table 3 provides fundamental installation dimensions, it is Table 4 that refines these dimensions in the context of pressure requirements. Tables 3 and 4 collaboratively define the complete installation dimensions of a seal under varying pressure conditions.

Table 3 Basic seal installation dimensions

Table 4 Maximum extrusion gap

2.2 Unstructured Data

The standard components database contains two main types of unstructured data: component descriptions and multimedia assets. Component descriptions consist of textual narratives that offer detailed information about the characteristics, features, and possible uses of standard components. Multimedia assets encompass visual and multimedia resources like pictures, videos, and CAD drawings, which help users visualize physical attributes of standard components.

2.2.1 Component Descriptions

JSON is employed to store textual descriptions that cannot fit into an SQL data table. Python can be used to manipulate the JSON files, supporting programmatic interaction and the implementation of advanced semantic search. Here is an example describing the temperature condition of the hydraulic seals.

A set of programming codes. The written text consists of components, objects, and descriptions.
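As a rough illustration of how such a JSON description might be handled from Python, the following is a minimal sketch. The keys `component`, `object`, and `description` are assumptions modelled on the figure description above, and the description text is a placeholder rather than actual SKF data.

```python
import json

# Hypothetical record layout: the keys "component", "object" and
# "description" are assumptions, and the text is a placeholder.
raw = """
[
  {
    "component": "hydraulic seal",
    "object": "temperature",
    "description": "Placeholder text: seal performance depends on the operating temperature of the application."
  }
]
"""

def load_descriptions(text):
    """Parse the JSON text and index descriptions by (component, object)."""
    records = json.loads(text)
    return {(r["component"], r["object"]): r["description"] for r in records}

descriptions = load_descriptions(raw)
print(descriptions[("hydraulic seal", "temperature")])
```

Indexing by a (component, object) pair is only one possible choice; it keeps lookups simple when each component has several described aspects.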

2.2.2 Multimedia Assets (Images, CAD Drawings, Etc.)

Document-oriented NoSQL databases are well-suited for handling multimedia assets. They excel in managing and storing unstructured data in flexible, JSON-like documents. Storing multimedia files alongside associated metadata and unique identifiers ensures easy retrieval, categorization, and access control.
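The idea of storing multimedia files alongside metadata and unique identifiers can be sketched without committing to a particular NoSQL product. In the example below, a plain Python list stands in for a document collection; the field names (`component_id`, `kind`, `path`, `tags`) and the file paths are hypothetical.

```python
import json
import uuid

def make_asset_document(component_id, kind, path, tags):
    """Build a JSON-like metadata document for one multimedia file."""
    return {
        "_id": str(uuid.uuid4()),      # unique identifier for retrieval
        "component_id": component_id,  # link back to the structured tables
        "kind": kind,                  # e.g. "image", "cad", "video"
        "path": path,                  # where the binary file is stored
        "tags": list(tags),            # free-form labels for categorization
    }

def find_assets(store, **criteria):
    """Return the documents whose fields equal all given criteria."""
    return [d for d in store if all(d.get(k) == v for k, v in criteria.items())]

# A list stands in for a document collection (an assumption for this sketch).
store = [
    make_asset_document("seal-001", "image", "assets/seal-001.png", ["profile"]),
    make_asset_document("seal-001", "cad", "assets/seal-001.step", ["3d"]),
    make_asset_document("seal-002", "image", "assets/seal-002.png", ["profile"]),
]

cad_files = find_assets(store, component_id="seal-001", kind="cad")
print(json.dumps(cad_files, indent=2))
```

Keeping only a path in the document and the binary file elsewhere is a common arrangement, though a real deployment might instead use the database's own file storage facility.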

3 Query Processor

3.1 Logical Filtering Module

In the query system design, all structured information finds its place in basic data tables. These tables comprise ‘Attributes’ as column names, ‘Entries’ as unique row values, and ‘Values’ as the contents within. In essence, structured data can be represented as triples denoted as (A, E, V), where A stands for Attributes, E for Entries, and V for Values. Logical filtering is the process of completing triples from these tables based on user-query conditions, such as (A, E, ?), (A, ?, V) and (?, E, V).

For logical filtering of basic data tables, the data of an (A, E, V) triple can be obtained from a single table using SQL query commands.

For an associated data table, logical filtering requires a series of (A, E, V) triples to be completed across the different basic data tables in the set. Take, for example, selecting the types of seals whose material is compatible with a given hydraulic fluid according to Tables 1 and 2, as described in Sect. 2.1.2. As shown in Fig. 3, one basic table corresponds to one triple, where the user-query condition is (A2, ?, V2), the user-query target is (A1, E1, V1), and the associated table’s matching condition is E2 \(\iff\) (A1, V1). The mathematical description of the logical filtering is given in Eq. (1).
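The three query forms (A, E, ?), (A, ?, V) and (?, E, V) can be sketched against a single basic table using Python's built-in `sqlite3` module. The table name `seal_types`, its columns, and all row values below are hypothetical placeholders, not data from Table 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seal_types (entry TEXT PRIMARY KEY, material TEXT, motion TEXT)"
)
conn.executemany(
    "INSERT INTO seal_types VALUES (?, ?, ?)",
    [
        ("type_a", "material_x", "reciprocating"),
        ("type_b", "material_y", "reciprocating"),
        ("type_c", "material_x", "static"),
    ],
)

# Column names cannot be bound as SQL parameters, so they are checked
# against a whitelist before being placed into the query text.
ATTRIBUTES = ("material", "motion")

def _check(attr):
    if attr not in ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attr}")

def value_of(attr, entry):
    """(A, E, ?): the value stored under attribute `attr` for row `entry`."""
    _check(attr)
    row = conn.execute(
        f"SELECT {attr} FROM seal_types WHERE entry = ?", (entry,)
    ).fetchone()
    return row[0] if row else None

def entries_with(attr, value):
    """(A, ?, V): all entries whose attribute `attr` equals `value`."""
    _check(attr)
    rows = conn.execute(
        f"SELECT entry FROM seal_types WHERE {attr} = ? ORDER BY entry", (value,)
    )
    return [r[0] for r in rows]

def attributes_with(entry, value):
    """(?, E, V): all attributes of row `entry` that hold `value`."""
    return [a for a in ATTRIBUTES if value_of(a, entry) == value]

print(value_of("material", "type_a"))
print(entries_with("material", "material_x"))
print(attributes_with("type_c", "static"))
```

Each function fills in the one missing element of the triple, which is the sense in which filtering "completes" it.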

A diagram of the associated data table. It presents Basic Data Tables 1 and 2. Basic Data Table 2 has 2 columns labeled A and A 2. Basic Data Table 1 has 3 columns and 3 rows. — **Fig. 3**

(1)
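One way to picture the associated-table filtering is as a join in which the entries of the second table (seal materials) are matched against the values of the material attribute in the first table. The sketch below assumes the compatibility table is stored in long form, one row per material and fluid pair; all table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE seal_types (seal_type TEXT, material TEXT);
    CREATE TABLE compatibility (material TEXT, fluid TEXT, rating TEXT);
    INSERT INTO seal_types VALUES
        ('type_a', 'material_x'),
        ('type_b', 'material_y'),
        ('type_c', 'material_x');
    INSERT INTO compatibility VALUES
        ('material_x', 'fluid_1', 'suitable'),
        ('material_y', 'fluid_1', 'unsuitable'),
        ('material_x', 'fluid_2', 'unsuitable'),
        ('material_y', 'fluid_2', 'suitable');
    """
)

def seal_types_for_fluid(fluid, rating="suitable"):
    """Complete (A2, ?, V2) in the compatibility table, then use the
    matching condition E2 <=> (A1, V1) to reach entries of seal_types."""
    rows = conn.execute(
        """
        SELECT s.seal_type
        FROM compatibility AS c
        JOIN seal_types AS s ON s.material = c.material
        WHERE c.fluid = ? AND c.rating = ?
        ORDER BY s.seal_type
        """,
        (fluid, rating),
    )
    return [r[0] for r in rows]

print(seal_types_for_fluid("fluid_1"))
print(seal_types_for_fluid("fluid_2"))
```

The `JOIN ... ON s.material = c.material` clause plays the role of the matching condition, while the `WHERE` clause carries the user-query condition.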

For a multilevel data table, logical filtering also involves completing a series of (A, E, V) triples, but the matching condition between basic tables is different. As exemplified by parent Table 3 and its sub-table, Table 4, in Sect. 2.1.3, Fig. 4 shows that the matching condition is (A3′, V3′) \(\Rightarrow\) E4 and V4 \(\Rightarrow\) (A3, V3). The goal is to obtain the correct value for Table 3 from Table 4. Therefore, the user-query condition is (A3, E3, ?) and (A4, ?, ?), the user-query target is (A3, E3, V3), and the mathematical description of the logical filtering is given in Eq. (2).

(2)
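The two-step lookup for a multilevel table can be sketched with nested dictionaries. Here a hypothetical attribute `gap_class` in the parent table plays the role of A3′, and its value selects a row of the sub-table. The attribute names and all numeric values are placeholders chosen for illustration, not dimensions from Tables 3 and 4.

```python
# Parent table (stand-in for Table 3): rows are seals. The "gap_class"
# column plays the role of A3' and points at a row of the sub-table.
TABLE_3 = {
    "seal_001": {"groove_width": 5.0, "gap_class": "class_1"},
    "seal_002": {"groove_width": 7.5, "gap_class": "class_2"},
}

# Sub-table (stand-in for Table 4): rows are gap classes (E4) and
# columns are pressure levels (A4). Values are placeholders.
TABLE_4 = {
    "class_1": {"low_pressure": 0.5, "high_pressure": 0.25},
    "class_2": {"low_pressure": 0.75, "high_pressure": 0.5},
}

def max_extrusion_gap(entry, pressure_level):
    """Resolve the pressure-dependent value for one parent-table entry."""
    parent_row = TABLE_3[entry]
    sub_entry = parent_row["gap_class"]         # (A3', V3') => E4
    return TABLE_4[sub_entry][pressure_level]   # V4 => (A3, V3)

print(max_extrusion_gap("seal_001", "high_pressure"))
print(max_extrusion_gap("seal_002", "low_pressure"))
```

The user supplies the parent entry and the sub-table attribute, and the value returned from the sub-table completes the parent triple.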

An illustration of a multilevel data table. It presents Basic Data Tables 3 and 4. The Basic Data Table 3 has 3 columns labeled A, A 3 dash, and A 3. The Basic Data Table 4 has 3 columns labeled A, A 4, and an empty column. — **Fig. 4**

In the query system, all the basic data tables should be connected using the described relationships, forming a complete ontology. This means that starting from any point within the ontology, the user should be able to obtain a clear query result with sufficient conditions. Figure 5 illustrates an example ontology from Tables 1 to 4.

An illustration of the structured data ontology. It presents an associated data table consisting of Basic Tables 1 and 2, and a multilevel data table consisting of Basic Tables 2, 3, and 4. — **Fig. 5**

3.2 Semantic Retrieval Module

In the query system design, the semantic retrieval module is used for content-based analysis of user queries and for analysis of textual descriptions.

Content-based query analysis aims to convert unstructured textual queries into a structured format for the query system’s comprehension. This involves initial text parsing to identify relevant elements like keywords, phrases, and entities, using techniques like tokenization, part-of-speech tagging, and named entity recognition. The next steps include structuring the query by extracting subject, predicate, and object information, which often results in the creation of (A, E, V) triples. Entity resolution is then performed to link entities to specific database tables or records, ensuring the system knows where to retrieve data. Finally, the structured query conditions and targets are used to generate a formal query, typically in SQL or a similar query language, for execution against the database. Logical filtering can be applied to further refine the results based on user-defined criteria or constraints.
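The pipeline from free text to a formal query can be illustrated with a deliberately simple stand-in: regular-expression tokenization and a vocabulary lookup replace part-of-speech tagging and named entity recognition. The vocabulary, the table name `seal_types`, and the column names are hypothetical.

```python
import re

# Hypothetical vocabulary: each known value maps to the attribute
# (column) it belongs to. A real system would use NLP models instead.
VOCABULARY = {
    "material_x": "material",
    "material_y": "material",
    "reciprocating": "motion",
    "static": "motion",
}

def tokenize(text):
    """Lower-case the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def extract_conditions(text):
    """Return (attribute, None, value) triples; None marks the unknown entry."""
    return [(VOCABULARY[t], None, t) for t in tokenize(text) if t in VOCABULARY]

def to_sql(conditions, table="seal_types"):
    """Build a parameterized SELECT from (A, ?, V) conditions."""
    where = " AND ".join(f"{attr} = ?" for attr, _, _ in conditions)
    params = [value for _, _, value in conditions]
    sql = f"SELECT entry FROM {table}" + (f" WHERE {where}" if where else "")
    return sql, params

conditions = extract_conditions("I need a static seal made of material_x")
sql, params = to_sql(conditions)
print(conditions)
print(sql, params)
```

Attribute names are taken only from the fixed vocabulary and user-supplied values are passed as bound parameters, so the generated statement stays safe to execute.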

Textual description analysis is used to analyse component descriptions described in Sect. 2.2.1. It can extract valuable information from text, and generate meaningful natural language responses. It involves identifying key information, such as facts, entities, and relationships through techniques like sentiment analysis and topic modelling. It enables the query system to handle unstructured data and generate contextually appropriate responses by assembling information and providing human-like answers.
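As a very small stand-in for semantic matching over component descriptions, the sketch below ranks stored descriptions by token overlap (Jaccard similarity) with the user's question. This is a lexical technique rather than true semantic retrieval, and the description texts and keys are placeholders.

```python
import re

# Placeholder descriptions keyed by topic; not actual catalogue text.
DESCRIPTIONS = {
    "temperature": "Placeholder text about how operating temperature affects seal material choice.",
    "pressure": "Placeholder text about how system pressure affects the extrusion gap.",
    "fluid": "Placeholder text about hydraulic fluid compatibility with seal material.",
}

def tokens(text):
    """Return the set of lower-case alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(query):
    """Return the key of the description most similar to the query."""
    q = tokens(query)
    return max(DESCRIPTIONS, key=lambda k: jaccard(q, tokens(DESCRIPTIONS[k])))

print(best_match("Which temperature can the seal operate at?"))
```

A production system would more plausibly compare embedding vectors, but the retrieval step (score every description, return the best) has the same shape.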

4 User Interface

An example of the user interface for the query system is shown in Fig. 6, featuring logical filtering and semantic search as inputs, along with a preview of structured data and unstructured data (models and figures) from the standard components database.

A screenshot of the user interface. It presents a condition, value, function, position, code, pressure, and value. An inverted U shaped illustration is also highlighted, along with a figure. — **Fig. 6**

References

[1] Hydraulic seals | SKF. https://www.skf.com/us/products/industrial-seals/hydraulic-seals. Accessed 13 Aug 2023
[2] Doster E et al (2020) MEGARes 2.0: a database for classification of antimicrobial drug, biocide and metal resistance determinants in metagenomic sequence data. Nucleic Acids Res 48(D1):D561–D569
[3] Kim S et al (2016) PubChem substance and compound databases. Nucleic Acids Res 44(D1):D1202–D1213
[4] Rossetto L, Gasser R, Heller S, Amiri Parian M, Schuldt H (2019) Retrieval of structured and unstructured data with vitrivr. In: Proceedings of the ACM workshop on lifelog search challenge, pp 27–31, June 2019
[5] Afzal A, Motwani M, Stolee KT, Brun Y, Le Goues C (2021) SOSRepair: expressive semantic search for real-world program repair. IEEE Trans Software Eng 47(10):2162–2181

Acknowledgements

This research was supported by the National Natural Science Foundation of China (52205279), the Open Foundation of the National Engineering Technology Research Center for Prefabrication Construction in Civil Engineering (2021CPCCE-K02), and the Top Discipline Plan of Shanghai Universities-Class I.

Author information

Authors and Affiliations

School of Mechanical Engineering, Tongji University, Shanghai, 201804, China
Ziyan Huang, Yongming Bian & Meng Yang
National Engineering Technology Research Center for Prefabrication Construction in Civil Engineering, Tongji University, Shanghai, 200092, China
Ziyan Huang, Yongming Bian & Meng Yang
Shanghai Engineering Research Center for Safety Intelligent Control of Building Machinery, Shanghai, 200032, China
Yongming Bian

Corresponding author

Correspondence to Meng Yang.

Editor information

Editors and Affiliations

School of Electrical Mechanical and Infrastructure Engineering, University of Melbourne, Melbourne, VIC, Australia
Saman K. Halgamuge
College of Electronic and Information Engineering, Tongji University, Shanghai, China
Hao Zhang
Yanshan University, Qinhuangdao, Hebei, China
Dingxuan Zhao
College of Mechanical Engineering, Tongji University, Shanghai, China
Yongming Bian

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Huang, Z., Bian, Y., Yang, M. (2024). Standard Components Query System Based on Logical Filtering and Semantic Retrieval. In: Halgamuge, S.K., Zhang, H., Zhao, D., Bian, Y. (eds) The 8th International Conference on Advances in Construction Machinery and Vehicle Engineering. ICACMVE 2023. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-97-1876-4_91

DOI: https://doi.org/10.1007/978-981-97-1876-4_91
Published: 29 June 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1875-7
Online ISBN: 978-981-97-1876-4
eBook Packages: Engineering, Engineering (R0)
