
1 Introduction

The field of manufacturing is increasingly interested in adopting Artificial Intelligence (AI) due to its ability to revolutionize operations, improve efficiency, and drive innovation. AI algorithms can effectively analyze and extract valuable insights from data, enabling manufacturers to optimize processes, detect anomalies, and make data-driven decisions. In addition, they bring automation and predictive capabilities to tasks such as quality control and predictive maintenance, leading to faster inspections, reduced downtime, and improved operational efficiency.

However, despite all these benefits, the adoption of AI on the manufacturing side is not happening as quickly as expected [1], and it comes with its own set of challenges and difficulties. While it holds great promise, several challenges need to be addressed for successful integration into manufacturing processes. These challenges include issues related to data availability and quality, integration with existing infrastructure, shortage of skilled personnel, ethical and regulatory considerations, and change management [2]. Manufacturing environments often entail complex data ecosystems, requiring proper data collection and preparation processes to ensure the availability of high-quality and relevant data for building intelligent models. Furthermore, integrating these models with existing infrastructure may require system upgrades and compatibility assurance. The shortage of skilled professionals with simultaneous expertise in both AI and manufacturing processes hinders adoption, while ethical and regulatory concerns require robust governance policies. Change management is also crucial, requiring cultural shifts and addressing employee concerns to foster a positive AI adoption environment.

To overcome these challenges, AI providers should also facilitate adoption through the development of user-centered tools, fostering partnerships between experts and manufacturing professionals, and showcasing the tangible benefits of implementing machine and deep learning algorithms. By making AI accessible, tailored, and demonstrably valuable, the manufacturing industry can overcome barriers and embrace these technologies to unlock their full potential for improving operational efficiency, productivity, and competitiveness.

In this matter, there are two research areas that can play significant roles in nurturing this adoption: Automated Machine Learning (AutoML) [3] and Explainable Artificial Intelligence (XAI) [4].Footnote 1 AutoML simplifies the process of developing models by automating tasks such as feature engineering, model selection, and hyperparameter tuning. This automation reduces the need for extensive data science expertise, making it easier for manufacturing professionals to leverage AI capabilities. By providing user-friendly interfaces, pre-built algorithms, and automated workflows, AutoML tools enable manufacturers to quickly and efficiently build accurate and robust intelligent models tailored to their specific requirements.

On the other hand, XAI addresses the critical need for transparency, interpretability,Footnote 2 and trust in AI systems. Manufacturing environments require clear explanations of model decisions, especially when it comes down to critical factors such as quality control, predictive maintenance, and process optimization. XAI techniques allow manufacturers to understand the inner workings of models, identify the factors influencing predictions, and detect potential biases or risks. By providing interpretability and explanations, XAI helps build trust in AI systems and facilitates their adoption by manufacturing professionals.

Currently, there are some large-scale proprietary solutions that aim to tackle the aforementioned issues by providing users with end-to-end solutions that operationalize the AI cycle with very limited prior technical knowledge. The most popular come from big tech companies, e.g., Google AutoML-Zero [5], Microsoft Azure AI [6], Amazon SageMaker [7], and the open-source H2O [8].

Although using these types of platforms has been shown to be beneficial [9], they bring about some drawbacks that organizations should seriously consider. One major setback is the potential lack of transparency and control over the underlying algorithms and models. These platforms often abstract away the details of the machine learning process, making it difficult to understand and interpret how the models arrive at their predictions. This lack of transparency can be a concern, especially in regulated industries or situations where interpretability is crucial for decision-making.

Another inconvenience is that in many cases using these tools requires contracting their providers’ services and infrastructure. The user is then subject to the limitations and potential downtime of third-party software and hardware. Unexpected disruptions or system issues are out of the control of the user and can impact the availability and performance of AutoML platforms. Additionally, organizations may be locked into specific pricing models and contracts with the cloud provider, limiting flexibility and potentially leading to increased costs in the long run.

Data privacy and security are also important considerations when using AutoML platforms in the cloud. Organizations need to ensure that sensitive or proprietary data used for training models are protected and handled in compliance with relevant regulations. The transfer of data to the cloud, data storage, and data access controls should all be carefully evaluated to mitigate any risks associated with data privacy and security breaches.

Finally, we should consider vendor lock-in issues. Once an organization builds and deploys models on a specific platform, migrating those models to another platform or bringing the infrastructure in-house can be challenging and time-consuming. This can limit flexibility and hinder the ability to switch providers or adapt to changing business needs in the future.

To tackle previous obstacles and motivate the use of AI in manufacturing, we introduce our framework called the AI Model Generation.

This framework is part of a European project called knowlEdge—Toward AI-powered manufacturing services, processes, and products in an edge-to-cloud-knowlEdge continuum for humans in-the-loop and is intended to be used by people with no background in AI, including manufacturing engineers, plant managers, quality control specialists, and supply chain managers. Throughout the chapter, we will refer to it as AMG. AMG is responsible for the automatic creation of supervised AI models and is able to solve tasks based on various scenarios and input variables. Its pipeline is organized into stages, each of which constitutes a submodule in itself. After describing the main functionalities of the system (Sect. 2), we present the architecture (Sect. 3) and the use cases (Sect. 4), and then describe each of the submodules that make up the framework (Sect. 5).

2 AI Model Generation Framework

This component is able to automatically generate AI models capable of solving user-defined tasks based on an initial configuration in which the data source, the type of problem, and the algorithm are specified. In addition, it enables the computation of the algorithms' training costs using a set of heuristics. Models can be efficiently deployed in different layers of the computing continuum (cloud, fog, and edge) and saved using standards such as ONNX (Open Neural Network Exchange) and PMML (Predictive Model Markup Language).Footnote 3

The initial configuration is defined in terms of data source, task, task setup, and strategy. Any dataset is made up of a number of variables and can come from three different data sources (local, static database, broker). See Sect. 5.1 for more information on data sources. The problem to be solved is formalized in a task object. A task has a name, a type (classification, regression, or optimization), an associated performance metric, an optional risk function, a set of input and output variables, and one or more associated execution settings. The configuration of a task is formalized in a task setup object, which contains information about the type of validation, training, and evaluation datasets, random seeds, and implemented strategies. A task configured with a task setup can train or run inference using one or more algorithms. The algorithm and the hyperparameters to use are encapsulated in an object called strategy. A strategy consists of the method name, the hyperparameters, the initial state, and the loss function of a given algorithm. As a result, models are generated, and a model is associated with a strategy, a set of metrics describing its performance, and the Docker image tag that deployed it.
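
To make the relationships among these objects concrete, the following Python sketch mirrors the description above with simplified dataclasses; the class and field names are illustrative assumptions and do not necessarily match the actual AMG implementation.

from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, simplified view of the configuration objects described above;
# the actual classes and field names in AMG may differ.

@dataclass
class Strategy:
    method_name: str                      # e.g., "randomForestClassifier"
    hyperparameters: dict = field(default_factory=dict)
    initial_state: Optional[int] = None   # e.g., a random seed for the algorithm
    loss_function: Optional[str] = None

@dataclass
class TaskSetup:
    validation_type: str                  # e.g., "SPLIT"
    validation_percentage: float
    random_seed: int
    strategies: List[Strategy] = field(default_factory=list)

@dataclass
class Task:
    name: str
    task_type: str                        # "classification", "regression", or "optimization"
    performance_metric: str               # e.g., "accuracy"
    input_variables: List[str]
    output_variables: List[str]
    risk_function: Optional[str] = None
    setups: List[TaskSetup] = field(default_factory=list)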

On top of that, the user can run inference with existing models or train new models by defining an initial configuration in JSON format. The following listing is an example of a configuration file that uses a local dataset to perform training and inference.

{
    "task": "classification",
    "task_name": "mushroom_classification",
    "method": {
        "strategy_list": ["randomForestClassifier"],
        "arguments": {
            "validation_type": "SPLIT",
            "validation_percentage": 0.2,
            "random_seed": 24,
            "risk_function": "risk_function",
            "performance_metric": "accuracy"
        }
    },
    "processing": {
        "arguments": {
            "dataset_name": ""
        },
        "orders": [
            {
                "order": 1,
                "action": "train",
                "read": {
                    "url": "~/home/datasets/mushrooms.csv",
                    "type": "static",
                    "source_type": "tabular",
                    "connector": {
                        "name": "local",
                        "arguments": {}
                    },
                    "input_attributes": [],
                    "target_attributes": ["class"],
                    "from_i": 0,
                    "to_i": 0
                }
            },
            {
                "order": 2,
                "action": "predict",
                "read": {
                    "url": "~/home/datasets/mushrooms.csv",
                    "type": "static",
                    "source_type": "tabular",
                    "connector": {
                        "name": "local",
                        "arguments": {}
                    },
                    "input_attributes": [],
                    "target_attributes": ["class"],
                    "from_i": 0,
                    "to_i": 0
                }
            }
        ]
    },
    "modelrepo": {
        "url": "~/home/model_descriptors/"
    }
}

To specify what kind of operations we want to perform, we create different orders. An order can perform either training or inference. We can also specify the columns to use, or leave the field blank to use all of them. At the end of the process, the model output is provided in a separate JSON file and converted into a standard format for future reuse. This file is stored in the directory specified in modelrepo.

3 System Architecture

At the structural level, the component is divided into the following elements depicted in Fig. 1:

Fig. 1 High-level architecture of the AMG component (flow chart: requests flow from the producer through the message broker of the transport queue, the web service API, and the relational SQL database toward Celery, the result backend, and the Docker images)

  • Python RESTful API: It is implemented through a Flask web service and allows the user to make requests about existing models, tasks, task configurations, and strategies.

  • Core functionalities module: It is responsible for implementing the functionalities of the component. In turn, each step of the AI cycle is implemented as a submodule.

  • PostgreSQL Database: Relational database that stores information about datasets, tasks, task configurations, strategies, and models in order to reproduce results and keep a history of the models’ evolution.

  • Redis and Celery: Celery is a popular asynchronous task queue library in Python that allows for the distribution and execution of tasks across multiple workers. Redis is used as the result backend in Celery, which means it stores the results of completed tasks. After a task is executed by a Celery worker, the result is stored in Redis for retrieval by the Celery client (a minimal wiring sketch follows this list).

  • RabbitMQ broker: It is a robust and feature-rich message broker that implements the Advanced Message Queuing Protocol (AMQP). After configuring the broker, several consumers subscribed to a queue can receive asynchronous messages. In order to establish a connection to a specific queue, the broker configuration must be set.

  • Edge Embedded AI Kit: This component is responsible for the deployment of models. All the models are containerized to enhance portability and scalability allowing their deployment across different platforms, such as local machines, cloud infrastructure, or edge devices.
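
As a rough illustration of how these elements fit together, the following sketch wires a Flask endpoint to a Celery application that uses RabbitMQ as the broker and Redis as the result backend. The connection URLs and the run_pipeline task are placeholders for this sketch, not the actual AMG code.

from flask import Flask, jsonify, request
from celery import Celery

# Minimal wiring sketch: RabbitMQ as the Celery broker (transport queue) and
# Redis as the result backend, exposed through a small Flask web service.
celery_app = Celery(
    "amg",
    broker="amqp://guest:guest@localhost:5672//",   # RabbitMQ transport queue (placeholder URL)
    backend="redis://localhost:6379/0",             # Redis stores completed task results
)

@celery_app.task
def run_pipeline(config: dict) -> dict:
    # Placeholder for the core functionalities module (training/inference).
    return {"status": "done", "task_name": config.get("task_name")}

api = Flask(__name__)

@api.route("/tasks", methods=["POST"])
def submit_task():
    # The client posts a JSON configuration; the work is executed asynchronously
    # by a Celery worker, and the returned task id is used to fetch the result later.
    async_result = run_pipeline.delay(request.get_json())
    return jsonify({"task_id": async_result.id}), 202

@api.route("/tasks/<task_id>", methods=["GET"])
def get_result(task_id):
    async_result = celery_app.AsyncResult(task_id)
    return jsonify({"state": async_result.state,
                    "result": async_result.result if async_result.ready() else None})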

4 Use Cases

In order to validate the component, several use cases have been defined for three pilot partners participating in the project:

  • Dairy products company: A constraint optimization model has been implemented for production scheduling. They are interested in predicting the orders to be produced in a one-week window, taking into account working machines, current orders, and product–machine compatibility.

  • Manufacturing plastic fuel company: Several anomaly detection models are currently under testing. The goal is to predict anomalies in the production chain using manufacturing and quality data.

  • Power transmission and drives company: Using an image dataset, they want to improve the automatic quality controls of their assembly procedure and thus reduce the rate of possible failures. Several defect detection models based on CNNs have been implemented.

Before testing these models in customer environments, a test machine is available in the LINKSFootnote 4 infrastructure in order to validate these components and to correctly estimate the host specifications.

In addition, several configurations have been defined for the deployment of the component along the continuum depending on the pilot needs. We find two kinds of deployments:

  • Training cloud-based/inference fog-based: In this scenario, AI models are trained in the cloud infrastructure of the client. This means that the training process, which often involves computationally intensive tasks such as processing large datasets and training complex models, is handled in their remote cloud environment. Once the model is trained, it is deployed and executed on fog devices or edge nodes for real-time inference in the manufacturing environment. The fog nodes, being closer to the edge devices, are responsible for performing the inference tasks on local data. This deployment scenario offers several advantages. First, cloud-based training allows for the efficient utilization of computational resources and can handle large-scale datasets and complex model architectures. It provides flexibility in terms of accessing various tools, libraries, and computing power required for model development and training. Additionally, centralized management in the cloud simplifies the process of model training, deployment, and updates. However, there are some drawbacks to this deployment scenario. The latency introduced by transmitting data from fog devices to the cloud for training and then back to the edge for inference may not be suitable for real-time or time-sensitive applications. It also relies on reliable and high-bandwidth network connectivity between the edge/fog devices and the cloud, which may not always be available or practical. Furthermore, fog devices may have limited offline capabilities, as they may not be able to access the latest models or perform updates if they are disconnected from the cloud.

  • Training and inference in fog: Both training and inference occur at the fog or edge devices themselves. Fog nodes have sufficient computational resources to handle training tasks, and the trained models are directly executed on the same devices. This scenario offers low latency as data processing and decision-making happen locally, without the need to communicate with the cloud. It also provides offline capability, making it suitable for environments with intermittent connectivity. Furthermore, fog-based deployment enhances privacy and security as sensitive data remain on the local devices, reducing the need to transmit them to external servers. However, the limited computational resources of fog devices can pose challenges for complex and large-scale training tasks. Managing and updating models across a distributed network of edge devices can also be more complex compared to cloud-based deployment. For this reason, techniques such as split learning along with the application of privacy-preserving encryption methods are advisable. By leveraging split learning, the fog devices can benefit from more efficient utilization of their limited resources. The data remain on the edge device, minimizing the need for data transmission and reducing latency. Only the model updates or gradients are transferred between the edge device and the server, significantly reducing the bandwidth requirements. This approach enables fog devices to participate in the training process without being overwhelmed by the computational demands. A minimal sketch of this split training step is shown after the list.
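
The following sketch illustrates the split-learning mechanics mentioned above with TensorFlow/Keras: the fog device keeps the first half of the network, the server keeps the second half, and only the cut-layer activations and their gradients are exchanged. The layer sizes, loss function, and optimizers are toy choices for illustration only, not the project's actual models.

import numpy as np
import tensorflow as tf

# Toy split of a small network: the fog device keeps the first half, the server the second.
client_model = tf.keras.Sequential([tf.keras.Input(shape=(32,)),
                                    tf.keras.layers.Dense(16, activation="relu")])
server_model = tf.keras.Sequential([tf.keras.Input(shape=(16,)),
                                    tf.keras.layers.Dense(1, activation="sigmoid")])
loss_fn = tf.keras.losses.BinaryCrossentropy()
client_opt = tf.keras.optimizers.Adam()
server_opt = tf.keras.optimizers.Adam()

def split_train_step(x, y):
    # Client-side forward pass up to the cut layer; only the activations travel onward.
    with tf.GradientTape() as client_tape:
        activations = client_model(x, training=True)
    # Server-side forward and backward pass on the received activations.
    with tf.GradientTape() as server_tape:
        server_tape.watch(activations)
        predictions = server_model(activations, training=True)
        loss = loss_fn(y, predictions)
    grads = server_tape.gradient(loss, [activations] + server_model.trainable_variables)
    activation_grads, server_grads = grads[0], grads[1:]
    server_opt.apply_gradients(zip(server_grads, server_model.trainable_variables))
    # Only the gradient at the cut layer returns to the client, which backpropagates
    # it through its own half of the network.
    client_grads = client_tape.gradient(activations, client_model.trainable_variables,
                                        output_gradients=activation_grads)
    client_opt.apply_gradients(zip(client_grads, client_model.trainable_variables))
    return loss

x_batch = np.random.rand(8, 32).astype("float32")
y_batch = np.random.randint(0, 2, size=(8, 1)).astype("float32")
print(float(split_train_step(x_batch, y_batch)))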

5 Core Components

The overall logic of model generation can be broken down into the various steps that make up the AI life cycle: data retrieval module (data acquisition), automatic pre-processing module, cost computation module (estimation of the training cost for a specific algorithm), automatic hyperparameter tuning module, automatic training, inference, and standardization, explainability module (generation of local and global explanations), pipeline execution module (the main program that calls the rest of the modules), and edge embedded AI kit (builds Docker images for model deployment). The implementation of the individual submodules is described in detail in the following sections.

5.1 Data Retrieval Module

The data retrieval stage involves the acquisition and collection of relevant data required for training and testing AI models. It is an essential step, as the quality and comprehensiveness of the data directly impact the performance and effectiveness of the AI system. Typically, this process in turn includes data identification, data collection, and data storage. At this point, we assume that the above processes have been performed and the data are ready to be consumed. Accepted data types include tabular data, images, or time series. Different types of connectors are offered to allow the user to upload their data:

  • Local data connector: This allows files to be uploaded from the user's local file system. Files in CSV format or training and evaluation directories for image datasets are taken into account. However, it is not mandatory for image directories to be pre-partitioned into training, evaluation, and optional validation sets. If only one directory is uploaded, the component takes care of the partitioning (the data are split into 70% training, 10% validation, and 20% evaluation; see the sketch after this list).

  • Broker connector: This connector is used to get real-time data from other databases, APIs, or sensors through a RabbitMQ broker. The user must format the data into an appropriate MQTT message format, such as JSON or plain text, and then publish the formatted data to the appropriate topic on the RabbitMQ broker. The MQTT clients subscribed to this topic can then receive the published data. Only certain topics are currently considered, but the component is easily scalable to add new ones.

  • Connector for Edge/Cloud Apache Database: This connector has support for retrieving data from the Apache IoTDB (Database for Internet of Things), an IoT native database with high performance for data management and analysis, deployable on the edge and the cloud. Because of its lightweight architecture, high performance, and rich feature set, as well as deep integration with Apache Hadoop, Spark, and Flink, Apache IoTDB can meet the needs of massive data storage, high-speed data ingestion, and complex data analysis in IoT industrial fields. Using the connection details, i.e., IP address, port and login information, data can be easily queried in the form of SQL statements.
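
As referenced in the local data connector description, the sketch below shows one way to reproduce the default 70/10/20 partitioning with pandas and scikit-learn; the file path and column names follow the mushroom example from Sect. 2 and are purely illustrative.

import pandas as pd
from sklearn.model_selection import train_test_split

# Illustrative sketch of the local connector's default partitioning
# (70% training, 10% validation, 20% evaluation).
df = pd.read_csv("~/home/datasets/mushrooms.csv")
X, y = df.drop(columns=["class"]), df["class"]

# First split off the 20% evaluation set, then carve 10% of the total
# (12.5% of the remaining 80%) out as the validation set.
X_trainval, X_eval, y_trainval, y_eval = train_test_split(
    X, y, test_size=0.20, random_state=24, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.125, random_state=24, stratify=y_trainval
)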

Once the data are collected, they often require pre-processing to ensure their quality and usability. This process involves tasks such as removing noise, normalizing, or standardizing variables. Other tasks, such as fixing formatting issues, handling inconsistencies and outliers, and missing value analysis, are not performed in this component because the management of missing data and outliers is highly user-dependent and has been handled by other components.

5.2 Automatic Pre-processing Module

Overall, the pre-processing stage aims to transform raw data into a clean, structured, and optimized form that is suitable for analysis by AI algorithms. As mentioned above, data cleaning is out of our scope; our focus is on data transformation, feature selection, feature engineering, and class imbalance handling.

It should be noted that pre-processing is highly dependent on the type of data you are working with. For tabular data, you can choose between normalization and standardization. In addition, label encoding is also applied. For feature selection and engineering, we use the AutoGluon [10] framework, specifically the AutoMLPipelineFeatureGenerator. This pipeline is able to handle most tabular data including text and dates adequately.
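
A minimal sketch of this feature-engineering step with AutoGluon is shown below; the toy DataFrame is only for illustration, and the exact way AMG invokes the generator may differ.

import pandas as pd
from autogluon.features.generators import AutoMLPipelineFeatureGenerator

# Illustrative tabular feature engineering: the generator is fitted on training
# data and then reused unchanged at inference time.
train_df = pd.DataFrame({
    "cap_shape": ["bell", "convex", "flat"],
    "description": ["edible mushroom", "poisonous", "unknown sample"],
    "odor_score": [0.1, 0.9, 0.5],
})
feature_generator = AutoMLPipelineFeatureGenerator()
train_features = feature_generator.fit_transform(X=train_df)   # fit on training data
new_features = feature_generator.transform(train_df)           # reuse on new data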

Pre-processing for image datasets includes, but is not limited to, data augmentation, label encoding, normalization, and rescaling. Since these processes depend on the neural network model to be used, they are only executed after the model has been defined, i.e., before training. This process was implemented using the Keras framework. Specifically, it is worth noting that all Keras models with predefined architectures have a built-in method called preprocess_input, which is responsible for pre-processing a tensor or NumPy array representing a stack of images into the right format for the corresponding model. On top of that, data augmentation parameters are defined by the HyperImageAugment class implemented by the KerasTuner.Footnote 5
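
The snippet below illustrates the preprocess_input helper for one predefined architecture (ResNet50 is an arbitrary choice here); the random image batch is a stand-in for real data.

import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

# A batch of raw RGB images is converted into the format expected by the chosen backbone.
images = np.random.randint(0, 255, size=(4, 224, 224, 3)).astype("float32")
model_ready = preprocess_input(images)          # model-specific scaling/mean subtraction
backbone = ResNet50(weights=None, classes=2)    # weights=None avoids downloading pretrained weights
predictions = backbone.predict(model_ready)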

Regarding time series data, we have generalized the steps to scaling and data transformation for stationarity conversion. This last step is important since working with stationary data in time series analysis simplifies the modeling process, enhances interpretability, and ensures reliable and accurate analysis. Stationarity is tested by applying the augmented Dickey–Fuller (adfuller) test to each feature. If the data are non-stationary, the following transformations are considered: differencing, detrending, moving average, and power transformation. In order to ensure that the previous transformations can be applied, autocorrelation, non-zero mean, seasonality, and heteroscedasticity (Breusch–Pagan Lagrange Multiplier test) are tested for each feature. Statistical tests are imported from the stattools module of statsmodels.Footnote 6
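
The following sketch shows the kind of stationarity check and differencing step described above, using the adfuller test from statsmodels; the 0.05 significance threshold and the synthetic random-walk series are assumptions made for illustration.

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Synthetic random walk standing in for one non-stationary feature.
rng = np.random.default_rng(24)
series = pd.Series(np.cumsum(rng.normal(size=200)))

adf_stat, p_value, *_ = adfuller(series.dropna())
if p_value > 0.05:
    # Non-stationary: apply first-order differencing and re-test.
    series = series.diff().dropna()
    adf_stat, p_value, *_ = adfuller(series)
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}")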

To overcome class imbalance issues when dealing with tabular datasets, we use a combination of over- and under-sampling methods implemented by the library imbalanced-learn [11]: SMOTETomek and SMOTEENN [12].
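
A minimal example of this re-balancing step with imbalanced-learn might look as follows; the synthetic dataset and class weights are illustrative.

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.combine import SMOTETomek

# SMOTE over-sampling followed by Tomek-link cleaning; SMOTEENN is used analogously.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=24)
X_resampled, y_resampled = SMOTETomek(random_state=24).fit_resample(X, y)
print(Counter(y), Counter(y_resampled))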

5.3 Cost Computation Module

This module is responsible for computing the cost of training a model and ensures that executed AI models are running with the desired behavior and performance, providing enough data to expose inefficiencies or poorly managed resources. The cost is computed using the performance and runtime history of previous models trained with the same algorithm. For this purpose, the execution of a number of representative models of the different algorithms was analyzed manually using tools such as Extrae, Paraver, and Dimemas, commonly used in the field of parallel computing and performance analysis. The scalability of each model has been analyzed on four machines with different architectures (MN4,Footnote 7 CTE-ARM,Footnote 8 CTE-AMD,Footnote 9 CTE-PowerFootnote 10) of the Barcelona Supercomputing Center infrastructure.Footnote 11 The overall process is depicted in Fig. 2.

Fig. 2 Overview of the cost computation process (flow chart involving the clients, the web service API, and a relational SQL database storing performance metrics, models, and metrics along with their programming language)

5.4 Automatic Hyperparameter Tuning Module

The Automatic Hyperparameter Tuning module provides automated methods for finding the combination of hyperparameters that optimizes the performance of machine learning and deep learning models. Hyperparameters are configuration settings involved in the training process that cannot be learned directly from the training data, such as the learning rate, the regularization strength, or the number of hidden layers in a neural network; they are set by the practitioner or determined through a process called hyperparameter tuning or optimization. However, determining their optimal values can be a time-consuming and computationally expensive process that requires extensive experimentation and testing. AutoML algorithms for hyperparameter tuning automate this process by automatically searching through a predefined range of hyperparameters and selecting the best combination based on performance metrics, such as accuracy or loss.

In the case of machine learning algorithms, hyperparameter tuning is performed with the RayTune library [13]. It allows tuning models from several machine learning frameworks (PyTorch, XGBoost, Scikit-Learn, TensorFlow, Keras, etc.) by running state-of-the-art search algorithms such as population-based training (PBT) [14] and HyperBand [15]/ASHA [16]. In addition, RayTune integrates with a wide range of additional hyperparameter optimization tools. However, it is worth noting that not all optimizers are available for all frameworks, and image datasets must be treated as NumPy arrays, without allowing the use of TF.data.DatasetFootnote 12 or other dynamic objects.

The machine learning models included in the component are implemented using the scikit-learn [17] library or similar libraries such as lightgbm [18] and XGBoost [19], which is why we chose Tune-sklearn. Depending on the type of search, it implements TuneGridSearchCV (grid search) and TuneSearchCV (random search). The latter, with the Bayesian search method, was preferred over grid search because of its advantages [20]. Bayesian optimization is a method for hyperparameter optimization that uses Bayesian inference to efficiently search for the optimal set of hyperparameters. It is particularly useful when the evaluation of the objective function (e.g., model performance) is time-consuming or computationally expensive. After the search algorithm has been defined, the model and the parameter grid with the value range of the hyperparameters to be examined must be passed to the tuner along with other optional parameters, e.g., n_trials (the number of parameter settings tried by the tuner). All machine learning models are encapsulated in a class called SklearnModelBuilder. See Sect. 5.5 for more details. The best set of hyperparameters can be retrieved, after fitting the tuner, through the tuner.best_params_ attribute.
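
A hedged sketch of such a Bayesian search with Tune-sklearn is shown below; the estimator, parameter ranges, and n_trials value are illustrative rather than AMG defaults.

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from tune_sklearn import TuneSearchCV

# Illustrative Bayesian hyperparameter search over a random forest.
X, y = make_classification(n_samples=500, random_state=24)
param_distributions = {
    "n_estimators": (50, 300),   # (low, high) ranges for the Bayesian search
    "max_depth": (2, 20),
}
tuner = TuneSearchCV(
    RandomForestClassifier(random_state=24),
    param_distributions,
    search_optimization="bayesian",   # Bayesian optimization instead of random search
    n_trials=10,
    scoring="accuracy",
)
tuner.fit(X, y)
print(tuner.best_params_)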

In the case of deep learning models, the difficulty of passing complex data to the Ray tuner led us to use tuners specifically designed for hyperparameter tuning of neural networks, that is, the KerasTuner mentioned above. The tuner manages the hyperparameter search process, including model creation, training, and evaluation. KerasTuner provides different kinds of tuners based on the optimization strategy used to select hyperparameters. In addition, some tuners can be combined or aggregated to leverage the strengths of each. In our specific case, we define an instance of the BayesianOptimization tuner with the metric to be optimized (called the objective) and the seed. At this point, we also define the hyperparameters and their parameter space to carry out the search, by means of the KerasTuner HyperParameters class. Additionally, we modify the tuner training flow so that we can also fine-tune training-associated hyperparameters such as the batch size or the number of epochs. In this flow, the hyperparameters for data augmentation, as implemented by the KerasTuner HyperImageAugment class mentioned before, can also be tuned. After calling the search method, the best selection of hyperparameters is retrieved with the get_best_hyperparameters call. Furthermore, it is also possible to obtain the trained model that maximizes the objective using the get_best_models call.
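
The sketch below outlines the same flow with KerasTuner; the toy data, architecture, search space, and trial counts are illustrative, and the actual AMG wrapper classes are not shown.

import numpy as np
import keras_tuner as kt
import tensorflow as tf

# Toy data standing in for a pre-processed dataset.
x_train = np.random.rand(256, 32).astype("float32")
y_train = np.random.randint(0, 2, size=(256,))

def build_model(hp: kt.HyperParameters) -> tf.keras.Model:
    # The search space (units, learning rate) is illustrative only.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(hp.Int("units", min_value=16, max_value=128, step=16),
                              activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(hp.Float("learning_rate", 1e-4, 1e-2, sampling="log")),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.BayesianOptimization(build_model, objective="val_accuracy",
                                max_trials=3, seed=24, overwrite=True)
tuner.search(x_train, y_train, validation_split=0.2, epochs=2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
best_model = tuner.get_best_models(num_models=1)[0]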

In the end, both tuners are encapsulated in two different methods of the HyperparameterTuner class: tune_ml and tune_dl.

5.5 Automatic Training, Inference, and Standardization

This module builds a model that can make accurate predictions or classifications based on pre-processed data and the optimal choice of hyperparameters. Once a model has been trained, it can be deployed to a production environment where it can process new data and generate predictions or classifications. All models are wrapped following the convention of Sklearn objects to enable interoperability. Once a model has been trained, it can be directly converted into ONNX or PMML. PMML and ONNX are both model interchange formats that enable the portability and interoperability of machine learning models across different platforms and frameworks. They serve as standard representations for sharing, deploying, and executing machine learning models. PMML is a more general-purpose model interchange format that supports a broad range of predictive models beyond deep learning and is used for various machine learning algorithms and techniques. ONNX, on the other hand, focuses specifically on deep learning models and their interchange between frameworks.
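
As an illustration of the ONNX path, the following sketch exports a fitted scikit-learn model with skl2onnx; whether AMG relies on this exact converter internally is an assumption, but the resulting file follows the ONNX standard.

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Train a small model and serialize it to an ONNX file for later reuse.
X, y = make_classification(n_samples=200, n_features=10, random_state=24)
model = RandomForestClassifier(random_state=24).fit(X, y)

onnx_model = convert_sklearn(model, initial_types=[("input", FloatTensorType([None, X.shape[1]]))])
with open("random_forest.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())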

All machine learning models are instantiated within the method _build_model inside the SklearnModelBuilder class. This method passes the best selection of hyperparameters to the model. In order to perform any operation on a machine learning model, an instance of SklearnModelBuilder is created. A SklearnModelBuilder object can execute the following public methods:

  • fit: Trains the algorithm using the training dataset. The builder already has defined the best selection of hyperparameters and the name of the algorithm to be trained.

  • predict: Executes the inference process on the evaluation dataset.

  • export: Saves the model in the corresponding standard format (ONNX or PMML).

  • load: Loads the model from the filename passed as argument. The filename must be a PMML or ONNX file.

  • explain_model: Provides insights and explanations behind the model predictions or actions, enabling users to trust, validate, and use the model properly.

  • explain_instance: Focuses on explaining individual predictions or decisions made by the model, providing insights into which features or factors contributed most to a particular output. More information can be found in this matter in Sect. 5.6.

Deep learning models are implemented following two approaches. In case the user wants to perform hyperparameter tuning, the training is done according to the process described above. On the contrary, when the set of hyperparameters is initially clear, the system creates an object of the KerasModelBuilder class. All objects of this class have the same format as the objects of the SklearnModelBuilder class. Both classes inherit from the abstract ModelBuilder class. This allows us to keep a more homogeneous format while making the code more intuitive for the user.
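
A simplified skeleton of this class hierarchy is sketched below; the constructor arguments and method bodies are hypothetical and only mirror the public interface described above.

from abc import ABC, abstractmethod

# Hypothetical skeleton of the builder hierarchy; not the actual AMG implementation.
class ModelBuilder(ABC):
    def __init__(self, algorithm_name: str, hyperparameters: dict):
        self.algorithm_name = algorithm_name
        self.hyperparameters = hyperparameters
        self.model = None

    @abstractmethod
    def _build_model(self):
        """Instantiate the underlying estimator with the best hyperparameters."""

    @abstractmethod
    def fit(self, X_train, y_train): ...

    @abstractmethod
    def predict(self, X_eval): ...

    @abstractmethod
    def export(self, path: str): ...        # save as ONNX or PMML

    @abstractmethod
    def load(self, path: str): ...

    @abstractmethod
    def explain_model(self): ...

    @abstractmethod
    def explain_instance(self, instance): ...

class SklearnModelBuilder(ModelBuilder):
    def _build_model(self):
        ...  # e.g., a scikit-learn estimator built from self.hyperparameters

class KerasModelBuilder(ModelBuilder):
    def _build_model(self):
        ...  # e.g., a compiled tf.keras model built from self.hyperparameters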

5.6 Explainability Generation Module

Toward the end of the pipeline, we find the application of Explainable Artificial Intelligence (XAI). Its application is of utmost importance as it addresses key challenges in AI adoption. XAI enhances transparency, fostering trust between users and AI systems by providing insights into the decision-making process. It promotes ethical practices and accountability, ensuring that AI models operate with fairness and avoid biases or discriminatory outcomes. Furthermore, XAI supports compliance with regulatory standards that demand interpretability and explainability. Despite its significance, XAI is often skipped in the AI pipeline due to several factors. These include the complexity of implementing XAI techniques, the focus on achieving high performance without considering interpretability, and the trade-off between model complexity and explainability. Additionally, limited awareness and understanding of XAI among practitioners, along with the perception that explainability compromises predictive accuracy, may contribute to its omission.

To overcome these obstacles and motivate its application, explainability is embedded in the model, so it can be run once the model has been trained by calling the methods previously introduced: explain_model and explain_instance. The XAI methods are implemented with the library Dalex [21].Footnote 13 All the explanations are model-agnostic, which allows us to provide information about these models without relying on their internal structure. As an example, the Dalex explainer generates, in JSON or PNG format, feature importance analyses and visualizations such as partial dependence plots [22] and accumulated local effects (ALE) [22]. These visualizations offer intuitive representations of the model's behavior and enable users to explore the relationships between input features and the model's predictions. In order to explain concrete predictions, the explainer generates visualizations of the SHAP values [23], the ceteris paribus profile, and interactive breakdown plots. Although the number of XAI methods applied may seem sufficient because of the possibilities that Dalex offers, we are considering expanding the explainer further.
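
The sketch below shows how such global and local explanations can be produced with Dalex; the dataset and model are placeholders for whatever AMG has trained.

import dalex as dx
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer

# Placeholder model and data standing in for a trained AMG model.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target
model = RandomForestClassifier(random_state=24).fit(X, y)

explainer = dx.Explainer(model, X, y, label="rf")
# Global explanations (explain_model): permutation feature importance and PDP/ALE profiles.
importance = explainer.model_parts()
ale_profiles = explainer.model_profile(type="accumulated")   # "partial" for PDP, "accumulated" for ALE
# Local explanations (explain_instance): SHAP values, break-down and ceteris paribus profiles.
observation = X.iloc[[0]]
shap_values = explainer.predict_parts(observation, type="shap")
ceteris_paribus = explainer.predict_profile(observation)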

5.7 Pipeline Execution Module

The above steps are part of the so-called AI life cycle or AI pipeline. To deploy the model later, all these steps are encapsulated in the execute_pipeline method. This method is responsible for receiving the input configuration and creating the appropriate objects to perform data loading, pre-processing, hyperparameter optimization (optional), training and/or evaluation, and explainability (optional). As a result of executing the pipeline, a Docker image responsible for deploying the model is generated if it did not previously exist. Afterward, using the image, the container can be created to execute the pipeline by passing it the input configuration. In turn, the pipeline returns a JSON file with the execution result.
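
Conceptually, invoking the pipeline then reduces to something like the sketch below; the import path and the exact signature of execute_pipeline are assumptions made for illustration.

import json

from amg.pipeline import execute_pipeline  # hypothetical module path

with open("mushroom_classification_config.json") as f:
    config = json.load(f)                  # the JSON configuration shown in Sect. 2

result = execute_pipeline(config)          # assumed to return a JSON-serializable execution result
print(json.dumps(result, indent=2))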

5.8 Edge Embedded AI Kit

This module is in charge of the model deployment. In this stage, it is assumed that the AI model is available and operational for use in a production environment. After developing and training a model, model deployment involves implementing the model into a system where it can generate predictions or perform specific tasks in real time.

The complete AI pipeline can be integrated seamlessly with the existing software or system architecture of the client using Docker images for the deployment. Docker helps to simplify the deployment process, improves flexibility, and ensures consistent and reliable execution of AI models. More precisely, it provides portability, allowing the model to run consistently across different environments without compatibility issues. Overall, there are two types of images: those in charge of deploying machine learning models and those for deep learning models; both use the Python image as a base but with different dependencies. As the endpoint, we use the execute_pipeline method described above.

Once the image has been created, it is stored in a private Docker registry. The creation and configuration of the Docker registry must be done by the user. The username, password, and IP address, along with the port, are part of the configuration of this component. In order to upload and download images, the Edge Embedded AI Kit has an API with push and pull methods, respectively. These methods have been implemented using the Docker SDK for Python.Footnote 14
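
A minimal sketch of these push and pull operations with the Docker SDK for Python is shown below; the registry address, credentials, and image tag are placeholders.

import docker

# Connect to the local Docker daemon and authenticate against the private registry.
client = docker.from_env()
client.login(username="amg_user", password="amg_password", registry="registry.example.com:5000")

# Push an already-built model image to the private registry...
client.images.push("registry.example.com:5000/amg/mushroom-classifier", tag="1.0")
# ...and pull it back on the target (edge/fog/cloud) host before running the container.
client.images.pull("registry.example.com:5000/amg/mushroom-classifier", tag="1.0")
container = client.containers.run("registry.example.com:5000/amg/mushroom-classifier:1.0", detach=True)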

6 Conclusions and Future Work

In this chapter, we presented the AI Model Generation (AMG) framework, which enables the creation of AI models by non-expert users. The AMG component has been designed to solve general-purpose problems, supporting different types of data and allowing the deployment of machine learning and deep learning models. To ensure the reproducibility of any type of analysis, model results and configurations are saved in a PostgreSQL database. In addition, model descriptors are generated that allow easy loading and exporting of models by means of standardization (ONNX and PMML). All the models are interoperable, which will make it possible to add new models or integrate other AutoML systems in the future. Furthermore, this enhances scalability, promotes interpretability, and simplifies the long-term maintainability of the code.

On the other hand, it should be noted that the models are also containerized to ensure that they can be run in any environment. This gives the customer the freedom to transfer the model and use it wherever they want. Explainability is easy to apply, and all models have a set of explanatory techniques that can provide insight into the model's decision-making process. In that direction, since the platform is easily scalable, a possible future research direction would be to analyze the performance of new models and to apply new explainability methods. It would also be worth comparing different search algorithms beyond Bayesian optimization. Additionally, it would make sense to consider new approaches such as Federated Learning (FL)Footnote 15 and Split Learning (SL)Footnote 16 in cases where privacy, data distribution, or network restrictions play an important role. However, building a general federated system that can effectively support most machine learning and deep learning models is challenging due to the inherent heterogeneity of models, varying data distributions, and the need to address communication and privacy concerns. Models used in different domains or tasks have distinct architectures, training algorithms, and requirements, making it difficult to create a single framework that caters to all models. Furthermore, ensuring efficient communication and privacy preservation across participants adds complexity to the design and implementation of a general federated system.