1 Introduction

In the era of Industry 5.0, the integration of collaborative robots, commonly known as cobots, is revolutionizing the landscape of various industries, particularly in manufacturing [1, 2]. These kinds of robots are designed to work alongside humans, emphasizing a collaborative approach to enhance overall efficiency and productivity [3]. As cobots are central to the vision of Industry 5.0, it becomes imperative to study the implications of human–robot collaboration (HRC) to fully exploit the potential benefits of this paradigm shift [4].

A critical aspect demanding attention in HRC is the user experience during close interaction with cobots [5]. For optimal collaboration, it is essential to comprehend how certain factors, such as the physical size of cobots, influence the user’s perception during the interaction [3].

In one of the first works exploring this aspect, Rahimi and Karwowski [6] found that the safe speed selected by operators for a traditional industrial robot depended on the size of the robot, leading to the conclusion that the motion of larger robots is perceived as more dangerous. Karwowski et al. [7] investigated the effects of exposure to a simulated accident, in which a robot hit a mannequin placed inside the work envelope, while also considering industrial robot size. Surprisingly, the results showed that industrial workers felt safer in the presence of the arm of the larger robot than that of the smaller one. Hirori and Ito [8] explored how the size of a mobile robot, in terms of height (small, medium, and high), influences the anxiety or threat felt by a human when the robot approaches, in order to determine the appropriate human–robot distance. They observed a tendency for the human–robot distance to increase with the size of the robot but, surprisingly, the medium-height robot was the one that caused the least anxiety.

From the literature, it is not entirely clear how the size of the robot affects the user experience, especially in HRC. This research aims to address this crucial aspect by investigating the relationship between cobot size – specifically, comparing a small cobot (UR3e) with a large one (UR10e) – and user experience in HRC contexts in terms of both subjective and physiological responses. To provide a comprehensive understanding, this exploration is conducted in conjunction with other factors, namely cobot movement speed and product assembly complexity. Two main research questions are addressed: (i) How does the cobot’s physical size affect user experience? (ii) Do the joint effects of the cobot size with movement speed or product complexity affect user experience?

The study employs a methodology incorporating controlled experiments with 32 participants. Data collection encompasses both physiological measures, including electro-dermal activity, heart activity, and eye-tracking metrics, as well as subjective responses obtained through questionnaires addressing perceived workload, interaction quality, and affective state. Through this multifaceted approach, the dynamics that modulate user experience in the context of cobots of different sizes are explored. The main novel elements of this study reside in (i) the comparison of two morphologically identical cobots that differ only in size, effectively eliminating possible other confounding elements that can affect human perception; (ii) the joint analysis of multiple physiological signals to provide a broader view on the cognitive aspects involved in the HRC.

The significance of this research extends beyond conceptual considerations, offering practical insights that can significantly impact the implementation of cobots in real-world scenarios. By understanding how users perceive and interact with cobots of different sizes, we aim to provide valuable guidance for designers, engineers, and industry stakeholders seeking to optimize HRC for enhanced effectiveness and acceptance. The subsequent sections present detailed findings and analyses derived from our experiments, shedding light on the intricate interplay between cobot size, movement speed, and product assembly complexity in shaping the user experience.

The paper is structured as follows. Section 2 presents the methodology used to explore the research questions, describing the experimental setup, collected data, and data processing. Section 3 illustrates the results obtained by analyzing the data collected through the experiment. Section 4 discusses the implications of the cobot size and movement speed on user experience. Finally, future work and conclusions are presented in Section 5.

2 Methodology

In this section, the methodology implemented to explore the effect of cobot size is explained.

2.1 Experimental setup

In the “Mind4Lab” (Manufacturing Industry 4.0 Laboratory) at the “Politecnico di Torino,” collaborative assembly tasks were implemented in an HRC setting. To explore the effects of cobot size (Size) on user experience, two collaborative workstations were used: one with the UR3e cobot (Small) and the other with the UR10e cobot (Large) (Fig. 1). These cobots were chosen for their similarity in terms of esthetic features, differing only in physical size, and both were mounted overhead, i.e., hanging above the operator (Fig. 2). The UR3e weighs 11.2 kg, has a reach of 0.5 m, and a payload of 3 kg, whereas the UR10e weighs 33.5 kg, has a reach of 1.3 m, and a payload of 12.5 kg. Two types of products with different assembly complexity were considered (Product). A mechanical component (Fig. 3a), consisting of a base, two oval flanges, and one square flange, was considered a product of low assembly complexity (Low), whereas a tile cutter was considered a more complex product (High) (Fig. 3b).

Fig. 1 Size comparison between cobots UR3e and UR10e [9]

Fig. 2 Collaborative workstation with overhead UR3e setup

Fig. 3 The mechanical component (a) and the tile cutter (b)

The operations for assembling the mechanical component can be divided into the following phases:

  1. Positioning the base. The cobot picks the base and brings it closer to the operator.

  2. Assembling the square flange. The cobot picks the square flange and places it in the correct position on the base. The operator takes the screws, inserts them into the holes, and tightens them.

  3. Assembling the first oval flange. The cobot takes the oval flange and places it correctly on the base of the square flange. The operator takes two screws, inserts them into the holes, and tightens them.

  4. Assembling the second oval flange. The cobot takes the remaining oval flange and places it correctly on the base. The operator takes the final two screws, inserts them into the holes, and tightens them. The cobot takes the assembled mechanical component and places it in another work area.

The operations for assembling the tile cutter can be broken down into four macro-phases:

  1. Assembly of the side supports. The cobot brings the base of the tile cutter closer to the operator, who assembles the side supports for the rail rods. When the operation is finished, the cobot sets the assembled component aside.

  2. Assembly of the cutting mechanism. The cobot takes the main component of the cutting mechanism and holds it in an ergonomic position in front of the operator. The operator positions and assembles the round blade and the remaining two components of the mechanism. When the operations are finished, the cobot releases the cutting mechanism close to the operator.

  3. Assembling the base with the cutting mechanism. The cobot brings the base with assembled side supports back to the operator; the operator picks up the cutting mechanism, inserts the rail rods through the appropriate slots of the cutting mechanism, and joins the rail rods to the base through the side supports.

  4. Completion of the tile cutter. The operator takes the handle and inserts it into the threaded slot of the cutting mechanism. When finished, the cobot takes the completed tile cutter and sets it aside.

Two levels of the robot’s joint speed (Speed) were implemented: 90°/s (Normal) and 270°/s (High). These values represent the maximum speed that all the robot’s joints could reach.

A within-subject experimental design was implemented in this study to examine the effects of two fixed factors (i.e., cobot size and cobot movement speed) on perceived interaction quality, affective state, workload, and physiological response. A between-subject approach was used to explore the effect of product complexity in order to reduce the duration of the experimental session per participant (Fig. 4).

Fig. 4 Flowchart of the experimental process

Thirty-two participants, with no prior experience with cobots, were recruited from the “Politecnico di Torino” and the surroundings for the study (27.3% females and 72.7% males). After randomly selecting the product to be assembled, each participant performed the assembly task twice in all four possible configurations (i.e., small cobot – normal speed, small cobot – high speed, large cobot – normal speed, and large cobot – high speed) in random order.
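The session structure described above (one randomly selected product per participant, four Size × Speed configurations in random order, two trials per configuration) can be sketched in Python as follows. This is only an illustrative sketch and not the procedure actually used to generate the trial order: the function name, the dictionary keys, and the seed argument are hypothetical, and the sketch assumes that the two trials of a configuration are performed consecutively, which the text does not state explicitly.

```python
import itertools
import random

SIZES = ("Small", "Large")    # UR3e vs. UR10e
SPEEDS = ("Normal", "High")   # 90 deg/s vs. 270 deg/s
PRODUCTS = ("Low", "High")    # mechanical component vs. tile cutter


def build_session(participant_id, seed=None):
    """Return the list of trials for one participant (hypothetical helper).

    Assumption: the two exposures to a configuration are performed back to back.
    """
    rng = random.Random(seed)
    product = rng.choice(PRODUCTS)                    # between-subject factor
    configs = list(itertools.product(SIZES, SPEEDS))  # four within-subject configurations
    rng.shuffle(configs)                              # random configuration order
    trials = []
    for size, speed in configs:
        for exposure in (1, 2):                       # each configuration is performed twice
            trials.append({
                "participant": participant_id,
                "product": product,
                "size": size,
                "speed": speed,
                "exposure": exposure,
            })
    return trials


if __name__ == "__main__":
    for trial in build_session("P01", seed=1):
        print(trial)
```

Each participant thus contributes eight trials, all with the same product, which is what makes Size, Speed, and the repetition within-subject factors and Product a between-subject factor.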

At the end of each trial, three different questionnaires were administered. An interaction quality questionnaire (Table 1) composed of seven items was used to collect participants’ perceptions on different dimensions related to the interaction with the cobot. The items were evaluated on a seven-point scale (from “strongly disagree” to “strongly agree”).

Table 1 Questionnaire for interaction quality [10]

The self-assessment manikin (SAM) [11, 12] was administered to participants to gather their affective state in the different task configurations by evaluating on a nine-point scale three dimensions: valence, arousal, and dominance (Fig. 5). Valence represents the pleasantness relative to a stimulus (for instance, happiness and relaxation are associated with a high valence, whereas anxiety or anger with a low valence). Arousal refers to the intensity of emotion provoked by a stimulus (e.g., fear and anger are usually associated with a high arousal, whereas relaxation and boredom with a low arousal). Dominance represents the degree of control felt relative to a stimulus (e.g., relaxation or anger is usually associated with a high dominance, while fear or anxiety with low dominance).

Fig. 5 Self-assessment manikin (SAM) [11]

The commonly used NASA-TLX [13] was implemented to assess operator workload during the collaborative assembly (Fig. 6). NASA-TLX is composed of the following six dimensions:

  • Mental demand, referring to the amount of mental activity required by the task.

  • Physical demand, which represents the amount of physical activity required by the task.

  • Temporal demand, which refers to how much time pressure is perceived due to the task pace.

  • Performance, concerning the degree of success and satisfaction with the results obtained in performing the task.

  • Effort, which represents how hard one has to work (both mentally and physically) to achieve a certain level of performance.

  • Frustration, referring to the amount of annoyance, frustration, or irritation felt during the task.

Fig. 6 NASA-TLX questionnaire [13]

Each NASA-TLX dimension is rated on a 0–100 scale with five-point steps (see Fig. 6), and the final workload score is obtained by averaging the dimension ratings.
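The unweighted scoring described above can be illustrated with a short Python sketch. The dimension keys and the function name are hypothetical, and the ratings in the usage example are invented for illustration only; they are not data from the experiment.

```python
TLX_DIMENSIONS = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


def raw_tlx(ratings):
    """Average the six NASA-TLX dimension ratings (0-100, five-point steps)."""
    missing = [d for d in TLX_DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dim in TLX_DIMENSIONS:
        value = ratings[dim]
        # Ratings must lie on the 0-100 scale and fall on a five-point step.
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"invalid rating for {dim}: {value}")
    return sum(ratings[d] for d in TLX_DIMENSIONS) / len(TLX_DIMENSIONS)


if __name__ == "__main__":
    example = {  # invented ratings, for illustration only
        "mental": 40, "physical": 20, "temporal": 35,
        "performance": 25, "effort": 45, "frustration": 15,
    }
    print(raw_tlx(example))  # 30.0
```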

In addition to subjective evaluations, physiological signals were collected to gain deeper insight into the participants’ state during the experimental trials. Electro-dermal activity (EDA) data and heart data from the photoplethysmogram (PPG) were acquired at 4 Hz and 64 Hz, respectively, using the non-invasive Empatica E4 wristband biosensor (Fig. 7a). From EDA and PPG signals, stress and arousal indicators were derived for each HRC configuration, as explained in the following sub-section. Eye-tracking data were also collected at 100 Hz using the Tobii Pro Glasses 3 (Fig. 7b); pupillometry and metrics related to fixations and saccades were analyzed to provide information on cognitive processes and user experience.

Fig. 7 The Empatica E4 [14] (a) and the wearable eye-tracking Tobii Pro Glasses 3 [15] (b)

2.2 Data processing and modeling

Table 2 provides a summary of all the dependent and independent variables included in the analysis.

Table 2 Summary of independent and dependent variables

Potential artifacts were identified and removed from the physiological signals. Using the MATLAB-based software “Ledalab,” EDA signals were decomposed into tonic and phasic components through continuous decomposition analysis (CDA) [16]. The tonic component is characterized by the skin conductance level (SCL), which represents the long-term fluctuations in EDA that are not directly derived from external stimuli. The phasic component represents short-term EDA fluctuations elicited by an external stimulus. From the phasic component, skin conductance responses (SCRs), which are amplitude differences between the SCL and response peaks, are detected. In the present study, the average SCR (EDA – average SCR) and SCL (EDA – average SCL) were calculated for each HRC configuration, representing stress response and mental strain indicators, respectively. Regarding heart data, NN intervals (i.e., time intervals between systolic peaks) were obtained from PPG. As an HRV measure for stress, the standard deviation of NN intervals (SDNN) was included due to its widespread usage [17, 18].
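As an illustration of the HRV indicator, SDNN can be computed from a sequence of systolic peak times as in the following Python sketch. The function names are hypothetical, peak detection on the PPG signal is assumed to have been done beforehand, and the sample standard deviation is used here as an assumption, since the text does not specify which estimator was adopted.

```python
import statistics


def nn_intervals_ms(peak_times_s):
    """Convert systolic peak times (seconds) into NN intervals (milliseconds)."""
    return [(t1 - t0) * 1000.0 for t0, t1 in zip(peak_times_s, peak_times_s[1:])]


def sdnn_ms(nn_ms):
    """Standard deviation of NN intervals (sample estimator, assumed here)."""
    if len(nn_ms) < 2:
        raise ValueError("at least two NN intervals are required")
    return statistics.stdev(nn_ms)


if __name__ == "__main__":
    # Invented peak times, for illustration only.
    peaks = [0.00, 0.80, 1.62, 2.40, 3.21]
    intervals = nn_intervals_ms(peaks)
    print(intervals)
    print(sdnn_ms(intervals))
```

A lower SDNN over a trial corresponds to less variable beat-to-beat timing, which is the property exploited when SDNN is used as a stress indicator.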

Eye-tracking data were processed using the “Tobii Pro Lab” software, which filtered raw data and extracted the metrics of interest reported in Table 2. Pupillary dilation is influenced by mental and cognitive processes, and an increase in it is often attributable to a stress response or increased use of mental resources [19, 20]. Fixations represent periods of stable gaze during which the eyes remain relatively still and gain detailed visual information. A longer duration of fixations may be attributable to increased interest in a certain area of interest or increased mental effort resulting from fatigue or difficulty in processing information [19, 20]. Saccades are rapid movements of the eyes, shifting the gaze from one point of interest to another. Analysis of saccade amplitude and velocity provides additional insight into mental effort. A decrease in amplitude as well as velocity of saccades is often related to an increase in mental effort [20, 21].

A series of mixed-effect ordinal logistic regression (MOLR) models was implemented to investigate the relationship of the fixed factors of the experiment and their interactions with the subjective responses (i.e., interaction quality and SAM dimensions). The MOLR model was chosen for its suitability in (i) modeling dependent variables defined on an ordinal scale and (ii) handling the participant effect as a random block effect [22, 23]. MOLR models are based on cumulative link mixed models (CLMMs) and can be specified in terms of cumulative logits as follows (assuming K levels of the ordinal response variable, n observations, and J participants):

$$\text{logit}\left({\mathbb{P}}\left({Y}_{ij}\le k\right)\right)=\text{log}\left(\frac{{\mathbb{P}}\left({Y}_{ij}\le k\right)}{1-{\mathbb{P}}\left({Y}_{ij}\le k\right)}\right)={\theta }_{k} - {{\varvec{x}}}_{ij}^{\text{T}}{\varvec{\beta}}- {u}_{j} \qquad \text{for} \; k=1, \dots , K-1$$
(1)

with i = 1, …, n and j = 1, …, J. The term \({\mathbb{P}}\left({Y}_{ij}\le k\right)\) is the probability that observation i of participant j is associated with a rating below or equal to k. Note that the logit is not defined for k = K since \({\mathbb{P}}\left({Y}_{ij}\le K\right)=1\). The coefficients \({\theta }_{k}\) are called threshold parameters and act as model intercepts. Data of the independent variables of observation i of participant j are contained in the vector \({{\varvec{x}}}_{ij}^{\text{T}}\), whereas the model parameters are in vector \({\varvec{\beta}}\). The term \({u}_{j}\) represents the random effect for participant j, where \({u}_{j} \sim N(0, {\sigma }^{2})\). Note that the negative sign in front of \({{\varvec{x}}}_{ij}^{\text{T}}{\varvec{\beta}}\) ensures that positive parameters are associated with increased probability for higher levels as the independent variables increase. The “ordinal” package from the software R was used to fit MOLR models [24].
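To make the role of the thresholds and of the negative sign in Eq. (1) concrete, the following Python sketch converts the cumulative logits into the probability of each rating level, using P(Y = k) = P(Y ≤ k) − P(Y ≤ k − 1). The function names are hypothetical and the numerical values are invented for illustration; they are not estimates from the fitted models, and the sketch does not reproduce the estimation performed by the “ordinal” package.

```python
import math


def logistic(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(thresholds, eta, u=0.0):
    """Return [P(Y = 1), ..., P(Y = K)] according to Eq. (1).

    thresholds: theta_1 < ... < theta_{K-1}
    eta:        linear predictor x^T beta
    u:          random effect of the participant
    """
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(Y <= k) for k = 1..K-1, plus P(Y <= K) = 1.
    cumulative = [logistic(theta - eta - u) for theta in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


if __name__ == "__main__":
    theta = [-1.0, 0.0, 1.0]  # invented thresholds for a four-level response
    print(category_probabilities(theta, eta=0.0))
    print(category_probabilities(theta, eta=1.0))  # mass shifts toward higher levels
```

With a larger linear predictor, every cumulative probability decreases, so the probability of the highest level increases; this is the sense in which positive parameters favor higher ratings.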

Since the NASA-TLX dimension scores and selected indicators for physiological response can be considered continuous variables, linear mixed models (LMMs) were implemented to explore the relationship with the considered factors and to handle the participant effect as a random block effect. LMMs can be represented as follows:

$${Y}_{ij}={{\varvec{x}}}_{ij}^{\text{T}}{\varvec{\beta}}+ {u}_{j} \qquad \text{with} \ i=1, \dots , n \; \text{and} \; j=1,\dots , J$$
(2)

where \({{\varvec{x}}}_{ij}^{\text{T}}\) is the vector containing data of the independent variables for observation i of participant j, \({\varvec{\beta}}\) the vector for the model parameters, and \({u}_{j}\) the random effect for participant j with \({u}_{j} \sim N(0, {\sigma }^{2})\). Models were fitted using the “lmerTest” package from the software R.

The formula used for the models is reported below using the Wilkinson notation [25], where Exposure denotes the first or second experience with the same configuration:

$$Y \sim 1+\text{Size}*\text{Speed}*\text{Exposure}+\text{Size}*\text{Product}+\text{Speed}*\text{Product}+\text{Exposure}*\text{Product}+(1|\text{Participant})$$
(3)

The significance of the model terms will be analyzed and discussed in the following section for each response variable.
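To make explicit which fixed-effect terms are contained in formula (3), the Wilkinson “*” operator can be expanded into main effects and interactions. The following Python sketch is only illustrative: the helper names are hypothetical, and it handles only the “*” operator rather than the full formula syntax implemented in R.

```python
from itertools import combinations


def expand_star(factors):
    """Expand A*B*C into main effects and all interactions (A, B, ..., A:B:C)."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


def fixed_effect_terms(star_groups):
    """Merge the expansions of several '*' groups, dropping duplicate terms."""
    seen = []
    for group in star_groups:
        for term in expand_star(group):
            if term not in seen:
                seen.append(term)
    return seen


# The four '*' groups appearing in formula (3); the intercept and the
# random intercept (1|Participant) are not part of this expansion.
FORMULA_3 = [
    ("Size", "Speed", "Exposure"),
    ("Size", "Product"),
    ("Speed", "Product"),
    ("Exposure", "Product"),
]

if __name__ == "__main__":
    for term in fixed_effect_terms(FORMULA_3):
        print(term)
```

The expansion yields four main effects, six two-way interactions, and the single three-way interaction Size:Speed:Exposure; no three-way interaction involving Product is included.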

3 Results

In this section, the results of the fitted regression models are described and analyzed. See the Appendix for additional details on the resulting models.

3.1 Perceived interaction quality

The relationships between the experimental factors and the different aspects of the perceived quality of interaction with the robot are presented here. Table 3 reports the results of the analysis of deviance (ANODE) for each MOLR model, showing the significance of each term. During the interaction, the large cobot was perceived as slightly safer, more efficient, more fluid, and more trustworthy. In addition, when the cobot movement speed was high, the cobot was perceived as more helpful, efficient, and fluid, but also slightly more uncomfortable. The second experience with the same configuration tended to result in an overall improved interaction quality: the cobot was perceived to be safer, more natural, efficient, fluid, and trustworthy. However, it was also perceived as slightly less helpful and less comfortable when moving slowly. When a more complex product was assembled, the cobot was perceived in the second experience as even more helpful, natural, efficient, fluid, and comfortable. In addition, assembling a more complex product with higher movement speed resulted in a slight decrease in perceived efficiency and fluency.

Table 3 ANODE table with resulting p-values for quality of interaction

3.2 Perceived workload

Table 4 contains the results of the analysis of variance (ANOVA) for each linear mixed model of the NASA-TLX dimensions, showing the significance of each model term. In general, the perceived workload decreased in the second experience, especially with the more complex product, due to a learning effect. Looking in detail at the various dimensions of workload, mental demand, physical demand, temporal demand, and effort decreased, whereas performance increased. Especially with the more complex product, the second experience resulted in a further decrease in perceived mental demand and effort and an increase in performance. Regarding perceived performance, a decrease was observed with the more complex product, especially at the higher speed. Concerning frustration, a slight increase was observed in the second experience, especially when the cobot was moving slower. In fact, a decrease in frustration was observed with higher speed in the second experience.

Table 4 ANOVA table with resulting p-values for NASA-TLX

3.3 Affective state

Table 5 contains the results of the ANODE for each MOLR model for the SAM dimensions, showing the significance of each model term. Initially, the higher speed with a complex product resulted in more negative emotions due to more initial effort on the part of the participant. However, in the second experience, the higher speed was more appreciated once they became familiar with the product assembly. In terms of arousal, an increase was observed with the highest speed and in the second experience. However, there was also a significant decrease in the second experience with the more complex product. The larger cobot implied an initial feeling of less dominance, but this increased during the second experience, especially with the more complex product.

Table 5 ANODE table with resulting p-values for SAM dimensions

3.4 Physiological parameters

Analysis of multiple physiological parameters allows for a broader representation of the mental and cognitive aspects involved during the process. At the same time, finding concordances between multiple physiological signals can provide stronger evidence of the observed phenomena. Table 6 contains the results of the ANOVA for each linear mixed model of the physiological measures considered. Looking at HRV, no significant terms emerged. Focusing on EDA, only a significant reduction in average SCR and SCL was noted in the second experience, mainly due to a learning effect. Regarding eye-tracking metrics, interesting effects emerged in pupillary dilation, peak saccade velocity, and saccade amplitude. The larger cobot and higher speed resulted in an increase in pupillary diameter, likely due to higher user engagement. A decrease in pupillary diameter was also observed when the user interacted with the larger cobot while assembling the more complex product. Another decrease emerged in the second experience, denoting a learning effect. Having to assemble a more complex product resulted in a slightly smaller saccade amplitude, indicative of slightly higher cognitive effort. However, in the interaction with the larger cobot and in the second experience, increases in the peak velocity of saccades were observed with the more complex product, indicative of an alleviation of cognitive load.

Table 6 ANOVA table with resulting p-values for physiological measures

4 Discussion

The quantitative analysis revealed the influence of the configuration factors, i.e., cobot size, speed, and product complexity, on the various response variables related to user experience, highlighting some interesting relationships.

Perceived workload was mainly influenced by product complexity and learning effect, meaning that the cobot size and movement speed did not have a significant influence. Therefore, the workload was influenced mainly by the product to be assembled.

Cobot size and movement speed were most influential in the perceived quality of interaction. When the movement speed was higher, the cobot was perceived as more helpful, efficient, and fluid, especially during the second experience of the user. Surprisingly, the larger cobot generally resulted in better interaction quality, with higher perceived safety, efficiency, fluency, and trustworthiness. Although there was lower initial dominance with it, the learning effect totally compensated for the loss.

From a physiological point of view, pupillometry, SCR, and SCL confirmed the learning effect, which resulted in a general decrease in mental strain. Pupillary dilation also indicated a higher activation of the participants with the large cobot and the high speed, which can be interpreted as higher engagement. Saccade-related metrics showed that product complexity slightly increased cognitive load and that this increase was slightly mitigated by the large cobot. No significant effect emerged in terms of HRV.

The participants’ unstructured feedback was in line with the results obtained from the questionnaires. A large proportion of participants showed a slight preference toward the larger cobot, perceiving it to be more reliable and stable during movements. This result is in line with the one observed by Karwowski et al. [7], in which industrial workers felt safer in the presence of an arm of a larger robot than a smaller one. Initially, before starting the experiment, some participants had the impression that interacting with the large cobot might result in more anxiety. However, they noticed that during the assembly process, their attention toward the cobot was mainly focused on its end-effector (i.e., the gripper), which was the same for both cobots. As a result, during the interaction, they did not pay special attention to the entire size of the cobot, which suggests that the characteristics of the end-effector may have a greater influence. This aspect needs to be explored further in future work. Another aspect that will be interesting to explore in future work is to compare, in terms of cobot size, the effects of unexpected movements by the cobot on the user experience of the operator.

5 Conclusions

The proposed study aimed to explore the effect of cobot size, in conjunction with robot movement speed and product complexity, on user experience during an HRC assembly process.

Experimental results revealed that the larger cobot was the most preferred during collaborative assembly and did not induce significant stress responses or negative mental strain. The large cobot was perceived as slightly safer, more efficient, fluent, and reliable. Initially, a lower sense of dominance and higher activation in terms of pupillary dilation were found with the large cobot, but with the second exposure to the same configuration, these effects were compensated.

This result can be particularly useful when a company decides to implement collaborative robotics in a production process and needs to choose the cobot model while also taking human factors into account. A larger cobot, such as the UR10e, offers the opportunity to work with larger payloads, potentially facilitating reconfigurability and deployment in production processes. Additionally, no significant interaction was observed between cobot speed and size. A possible explanation is that, during interaction, the operator’s attention was more focused on the cobot’s end-effector (which was the same for both cobots) than on the entire body of the cobot. This suggests that it would be beneficial, in future work, to explore the effect of different end-effectors on user experience in HRC. The results of this study should be interpreted as an initial exploration of the effect of cobot size, which needs further investigation of other factors that may intervene during HRC. Future work will focus on strengthening the results obtained by evaluating the effect on user experience of a cobot larger than the UR10e and of other cobot settings, e.g., with the base resting on the floor or the work surface.