
1 Introduction

Common ways to experience cultural heritage today include visiting sites (physical and online), registering for museum tours, actively participating in guided hands-on activities, acquiring memorabilia and taking photos of historical sites, and immersing oneself in virtual reality exhibits [1].

Virtual heritage has been disseminated to the public at large in various forms, starting with multimedia provided on CDs and websites [2], up to dedicated desktop solutions [3, 4] and virtual and augmented reality installations [5].

The latter offer users the opportunity to visit realistically reconstructed historical sites that, by means of novel technologies, are augmented with 3-D content [6]. Among commonly used interaction modalities, haptic interaction has begun to be explored [7] in installations that promote cultural values through interaction with virtual artifacts.

User experience inside a virtual heritage environment may also be enhanced by supporting natural interactions; consequently, many researchers have proposed multimodal metaphors that go beyond haptics [8], e.g., mouse pointing, click and drag [9], head, eye, and body tracking [10, 11], and face, gesture, and speech recognition [12].

In this work, we focus on natural gesture interaction, as it has no language barriers and can rapidly become reflexive due to its naturalness. Moreover, dealing with cultural heritage artifacts poses a specific challenge because of the intrinsic fragility, inaccessibility, or even the lost meaning of these artifacts. Consequently, when users face technological barriers, they quickly become overwhelmed by the technology and may lose their original interest in the artifacts, a manifestation of the gulf of execution that is amplified for technologies driving virtual environments [13].

To this end, we focused on low-cost and accessible technology, namely the Myo [19] and Leap Motion [20] devices. Both the Myo wireless armband and the Leap Motion desktop device enable the user to control computer-generated content using various hand gestures and motions, but they rely on different gesture-recognition technologies. The Myo uses a set of electromyographic sensors that detect activity in the forearm muscles, together with a gyroscope, an accelerometer, and a magnetometer that help recognize arm motion and orientation, while the Leap Motion uses infrared cameras and infrared LEDs to observe a roughly hemispherical working area in which the user's hands are detected.

Morgan et al. consider Myo arm-swinging as a way to explore a virtual environment, comparing it to joystick locomotion and physical walking [14]. They concluded that people made fewer errors when they explored the virtual environment physically or with Myo arm-swinging than with the joystick, and that participants performed equally well, in terms of errors, in the walking and Myo arm-swinging conditions.

A similar approach was taken by Mulling et al., who considered a 2D map [15]. According to them, navigating interactive maps through hand and arm movements via the Myo showed that improvements are still needed both in the device and in the graphical user interface (GUI). The use of hand and arm gestures to control interactive maps can be optimized, starting with the design of native applications for the Myo, in order to exploit the device at its maximum performance.

Another attempt uses a mixed-reality interaction system for digital heritage exploration [16], which demonstrated the advantages of adding a scale 3D-printed replica of the architecture to the user interface. Based on a user study, Bugalia et al. established the ease of interaction and found that even novices are comfortable with the proposed light-based interface and find it intuitive to use. They suggest it would be interesting to explore the effectiveness of adding a gyroscope to the pointing devices (which the Myo, for example, does have) to support more powerful controls during the walk-through.

Aslan et al. considered how challenges related to many closely positioned (expanding) targets can be addressed [17]. Prototypes were used as probes to foster discussion of the results of a driving-simulator study. They concluded that by combining mid-air gestures (provided by a Leap Motion controller) with touch, it is possible to improve in-car touch-based interaction in situations that rely on visual attention, and thereby increase user safety while driving.

Moser explores touch versus mid-air gesture input in physics-based gaming [18]. The study showed that, although the developers adapted the game to suit mid-air gestures, several playability problems occurred that should be considered in future game development. The observations revealed accuracy difficulties with small and precise gestures, and only partial recognition of very fast mid-air swipes. Another problem was that players lost their orientation and moved towards the monitor when performing mid-air gestures (i.e., the signal was lost). Consequently, mid-air gestures were rated as more complex and difficult than touch gestures (which were rated rather easy to use).

2 From Virtual Environment Exploration to Artifact Manipulation

Experiencing a virtual heritage environment usually means letting the user freely navigate the 3D replica of the artifact, while bounding their movements to the dimensions of the virtual environment. To this end, we adopt hand-controlled navigation [9] using a Leap Motion device or a Myo, depending on the system's mode of operation [20].

In our approach, the focal task users perform is exploring the environment by interacting with a target artifact; we divide this task into (a) approaching the target, (b) touching/reaching the target, and (c) manipulating the artifact. Touch becomes possible once the user's avatar is close enough to the target and begins when physical contact between the virtual artifact and the user's virtual hand takes place; the Myo device signals this event [19]. Grabbing the object is possible by squeezing the hand, an action detected by the Leap Motion device [20]. From this point forward, the user can manipulate the object.
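As a minimal sketch of how the grab event might be detected, assuming the Leap Motion C# SDK running under Unity, one can poll the current tracking frame for the hand's grab strength; the GrabDetector class name and the 0.9 threshold are our own illustrative choices, not part of the system described above.

```csharp
using Leap;
using UnityEngine;

// Sketch: detect a "squeeze" (grab) gesture with the Leap Motion C# SDK.
// The 0.9f threshold is an illustrative assumption.
public class GrabDetector : MonoBehaviour
{
    private Controller controller = new Controller();

    public bool IsGrabbing()
    {
        Frame frame = controller.Frame();   // latest tracking frame
        foreach (Hand hand in frame.Hands)
        {
            // GrabStrength ranges from 0 (open hand) to 1 (fist/squeeze)
            if (hand.GrabStrength > 0.9f)
                return true;
        }
        return false;
    }
}
```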

2.1 Switching Between Navigation and Manipulation Tasks – First Use-Case

The default task state is navigation. However, when users approach an artifact that may be the subject of interaction, navigation switches to manipulation. Interactions may take place locally inside the user's action area (denoted "A" in Fig. 1), in the form of observing the artifact (denoted by the small "a" in Fig. 1) or manipulating it ("a1"). Should the user decide to manipulate the artifact outside its interaction area, transportation must occur ("a2" in Fig. 1).

Fig. 1. Manipulation scenarios for different users (A, B) and artifacts (a, b) under various constraints (a1, a2).

Switching between navigation and manipulation becomes relevant as the distance between the user's virtual hand and the artifact decreases. Observation may be achieved if the user is close enough to the artifact during navigation, as for the avatar in position "B" with respect to the artifacts located in "a". But observation may also be considered a special case of manipulation, as for the avatar in location "B" with the artifact in "a".

Deciding whether control should be given to the avatar's movement (Fig. 2a) or to the hand (Fig. 2c) depends on the distance to the target and on the angle relative to the user's visual focus (i.e., the amount of focus given to a certain target). These conditions must be validated simultaneously; we achieved this behavior by employing a coordinate system composed of the following components: the 3-D distance to the target (the X axis), the yaw or horizontal angle (the Y axis), and the pitch or vertical angle (the Z axis). Distances in this coordinate system represent the ratio of interest in manipulating a virtual artifact (Fig. 3).
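The paper's exact formula is not spelled out, but a minimal sketch of such an interest measure, assuming illustrative normalization bounds for distance, yaw, and pitch, could look as follows: the closer the point (distance, yaw, pitch) lies to the origin of this coordinate system, the higher the interest in manipulation.

```csharp
using UnityEngine;

// Sketch: combine distance, yaw, and pitch into a single "interest" ratio.
// maxDistance, maxYaw, and maxPitch are illustrative normalization bounds.
public static class ManipulationInterest
{
    public static float Compute(Transform head, Vector3 target,
                                float maxDistance = 3f,
                                float maxYaw = 45f, float maxPitch = 30f)
    {
        Vector3 toTarget = target - head.position;

        // X axis of the coordinate system: 3-D distance to the target
        float d = Mathf.Clamp01(toTarget.magnitude / maxDistance);

        // Y axis: yaw, the horizontal angle between gaze and target direction
        Vector3 flatGaze = Vector3.ProjectOnPlane(head.forward, Vector3.up);
        Vector3 flatTarget = Vector3.ProjectOnPlane(toTarget, Vector3.up);
        float yaw = Mathf.Clamp01(Vector3.Angle(flatGaze, flatTarget) / maxYaw);

        // Z axis: pitch, the vertical (elevation) angle towards the target
        float pitch = Mathf.Clamp01(Vector3.Angle(flatTarget, toTarget) / maxPitch);

        // A small distance in (d, yaw, pitch) space means high interest
        float distanceInSpace = new Vector3(d, yaw, pitch).magnitude / Mathf.Sqrt(3f);
        return 1f - Mathf.Clamp01(distanceInSpace);
    }
}
```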

Fig. 2. Transition from navigation to manipulation depending on the position of the user's object of interest.

Fig. 3. The result of a transportation action for several virtual artifacts.

By constraining the total sum of the interests in manipulation and navigation to be constant, we achieve a seamless transition between the two tasks, without requiring users to explicitly switch between them (Fig. 2b). In scenarios where multiple targets are present in the environment, the closest one is selected by default according to the distance measure described above (Fig. 2).
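One possible reading of this constraint, sketched below under the assumption that both interests are normalized to [0, 1], is to let the navigation interest be the complement of the manipulation interest and to pick the closest artifact by default; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: keep the two task interests summing to a constant (here 1),
// so control blends from navigation to manipulation without an explicit switch.
public class TaskBlender : MonoBehaviour
{
    public Transform head;              // first-person camera
    public List<Transform> artifacts;   // interactable targets in the scene

    public Transform ClosestArtifact()
    {
        Transform best = null;
        float bestDist = float.MaxValue;
        foreach (Transform a in artifacts)
        {
            float dist = Vector3.Distance(head.position, a.position);
            if (dist < bestDist) { bestDist = dist; best = a; }
        }
        return best;
    }

    // interestManipulation + interestNavigation == 1 by construction
    public void Blend(out float interestManipulation, out float interestNavigation)
    {
        Transform target = ClosestArtifact();
        interestManipulation = target == null
            ? 0f
            : ManipulationInterest.Compute(head, target.position); // see sketch above
        interestNavigation = 1f - interestManipulation;
    }
}
```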

2.2 Switching Between Navigation and Manipulation Tasks – Second Use-Case

Our first solution for switching between navigation and manipulation was tested and deemed hard to use accurately. Therefore, we developed a second, more reliable option.

When the user's character is close enough to an interactable object and is looking at it, the Myo vibrates briefly to notify the user that the object can be interacted with, and the virtual hand changes color (Fig. 4b). At this point the user can grab the object and move it at the same pace as the character. This is achieved by using the distance between the first-person camera and the objects the user can interact with.

Fig. 4. The virtual hand changing color when it gets close to an interactable object.

A ray cast from the first-person camera in the direction the camera is pointing returns the distance between the camera and the object it hits, if that object is interactable. The returned value is compared with a given average hand length; if the distance is the lesser of the two, we consider the user in range to interact with the object.
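In Unity terms, this range check might be implemented as sketched below; the "Interactable" tag and the average hand length of 0.7 m are assumptions for illustration.

```csharp
using UnityEngine;

// Sketch: raycast from the first-person camera and compare the hit
// distance with an average hand length to decide if the user is in range.
public class InteractionRange : MonoBehaviour
{
    public Camera firstPersonCamera;
    public float averageHandLength = 0.7f;   // illustrative value, in meters

    public bool InRange(out GameObject target)
    {
        target = null;
        Ray ray = new Ray(firstPersonCamera.transform.position,
                          firstPersonCamera.transform.forward);
        if (Physics.Raycast(ray, out RaycastHit hit) &&
            hit.collider.CompareTag("Interactable"))   // hypothetical tag
        {
            target = hit.collider.gameObject;
            return hit.distance < averageHandLength;   // lesser value = in range
        }
        return false;
    }
}
```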

When the user turns the left palm towards the face, an interface appears near the hand (Fig. 5a); if its button is pressed, the system switches to manipulation mode and the button changes color (Fig. 5b). In this mode, any interactable object loses gravity and navigation is blocked until the same button is pressed again, at which point navigation and the objects' gravity are switched back on and the button returns to its original color.
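A minimal sketch of this toggle, assuming Unity rigidbodies for the interactable objects, follows; the field and method names are our own.

```csharp
using UnityEngine;

// Sketch: toggle between navigation and manipulation mode. While in
// manipulation mode, interactable objects become kinematic (no gravity)
// and navigation input should be ignored.
public class ModeToggle : MonoBehaviour
{
    public Rigidbody[] interactables;
    public bool manipulationMode { get; private set; }

    // Called by the virtual UI button shown when the left palm faces the user
    public void OnModeButtonPressed()
    {
        manipulationMode = !manipulationMode;
        foreach (Rigidbody rb in interactables)
        {
            rb.useGravity = !manipulationMode;   // objects "lose gravity"
            rb.isKinematic = manipulationMode;   // easier to grab and rotate
        }
        // Navigation scripts should check manipulationMode and block movement
    }
}
```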

Fig. 5. Using the virtual UI to switch between the navigation and manipulation modes.

In manipulation mode the user can grab and rotate objects more easily, because the objects are no longer subject to gravity. This allows the user to inspect the heritage artifact more closely until deciding to leave manipulation mode (Fig. 6).

Fig. 6. User in manipulation mode, interacting with a barrel that is kinematic (has no gravity).

3 Technical Aspects

Our solution is based on three main components: a visualization module responsible for real-time rendering of the 3D environment, based on Unity [21]; a hand-oriented interaction module responsible for user navigation inside the 3D virtual environment and manipulation of virtual artifacts, based on the Leap Motion device [20]; and an arm-oriented interaction module responsible for warning the user when approaching virtual artifacts, based on the Myo device [19].

In the Leap Motion use case, moving the hand forwards and backwards makes the user's avatar move with a velocity directly proportional to the offset between the hand and the Leap Motion's center. The same applies when the hand is moved left/right and up/down, which controls rotation and orientation.
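A sketch of this mapping, assuming the Leap Motion C# SDK and Unity's CharacterController, might read as follows; the gain constants are illustrative.

```csharp
using Leap;
using UnityEngine;

// Sketch: map palm offsets from the Leap Motion origin to avatar motion.
// Forward/backward offset drives velocity; left/right offset drives rotation.
public class LeapNavigation : MonoBehaviour
{
    public CharacterController character;
    public float moveGain = 2f;        // m/s per meter of palm offset, assumed
    public float turnGain = 60f;       // degrees/s per meter of offset, assumed
    private Controller controller = new Controller();

    void Update()
    {
        Frame frame = controller.Frame();
        if (frame.Hands.Count == 0) return;

        Vector palm = frame.Hands[0].PalmPosition;   // millimeters, device origin
        float forward = -palm.z * 0.001f;            // -z points away from the user
        float lateral = palm.x * 0.001f;

        character.transform.Rotate(0f, lateral * turnGain * Time.deltaTime, 0f);
        character.Move(character.transform.forward * forward * moveGain * Time.deltaTime);
    }
}
```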

In the Myo use case (Fig. 7), making a fist moves the character frontward (Fig. 7a), while waving left/right moves the character left/right (Fig. 7b, c). An interactable object can be grabbed by making a fingers-spread pose while facing it at a distance close enough to make the Myo vibrate (Fig. 7d). Double tapping recalibrates the origin of the local coordinate system relative to the user's hand (Fig. 7e).
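Assuming the Thalmic Myo Unity plugin, whose ThalmicMyo component exposes the current pose, this mapping might be sketched as follows; the speed constant and the commented hooks are hypothetical.

```csharp
using UnityEngine;
using Pose = Thalmic.Myo.Pose;

// Sketch: map Myo poses to the actions described above, assuming the
// Thalmic Myo Unity plugin (ThalmicMyo component with a `pose` field).
public class MyoNavigation : MonoBehaviour
{
    public ThalmicMyo myo;
    public CharacterController character;
    public float speed = 1.5f;   // illustrative movement speed, m/s

    void Update()
    {
        switch (myo.pose)
        {
            case Pose.Fist:              // move frontward (Fig. 7a)
                character.Move(character.transform.forward * speed * Time.deltaTime);
                break;
            case Pose.WaveIn:            // move left (Fig. 7b)
                character.Move(-character.transform.right * speed * Time.deltaTime);
                break;
            case Pose.WaveOut:           // move right (Fig. 7c)
                character.Move(character.transform.right * speed * Time.deltaTime);
                break;
            case Pose.FingersSpread:     // grab when in range (Fig. 7d)
                // TryGrab();            // hypothetical hook into the range check
                break;
            case Pose.DoubleTap:         // recalibrate the local origin (Fig. 7e)
                // Recalibrate();        // hypothetical
                break;
        }
    }
}
```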

Fig. 7. Myo poses.

Moreover, for displaying the 3D environment we built a holographic pyramid that opens our system to both single-user and multi-user real-time cultural heritage exploration (Fig. 8).
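On the software side, such a pyramid is typically fed by rendering the scene from four cameras placed at 90-degree intervals, each mapped to one region of the screen. The following Unity sketch shows one possible setup; the quadrant layout and the orbit radius are illustrative assumptions, not our exact configuration.

```csharp
using UnityEngine;

// Sketch: render four views of a target at 90-degree intervals, one per
// screen quadrant, as typically done for a Pepper's-ghost style pyramid.
public class PyramidViews : MonoBehaviour
{
    public Camera[] cameras = new Camera[4];   // four cameras around the target
    public Transform target;
    public float radius = 2f;                  // illustrative orbit radius

    void Start()
    {
        // One screen quadrant per view (illustrative layout choice)
        Rect[] viewports = {
            new Rect(0f, 0f, 0.5f, 0.5f),
            new Rect(0.5f, 0f, 0.5f, 0.5f),
            new Rect(0f, 0.5f, 0.5f, 0.5f),
            new Rect(0.5f, 0.5f, 0.5f, 0.5f)
        };
        for (int i = 0; i < 4; i++)
        {
            float angle = i * 90f * Mathf.Deg2Rad;
            cameras[i].transform.position = target.position +
                new Vector3(Mathf.Sin(angle), 0f, Mathf.Cos(angle)) * radius;
            cameras[i].transform.LookAt(target);
            cameras[i].rect = viewports[i];
        }
    }
}
```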

Fig. 8. Multimodal-based experience with a 3D virtual artifact.

In the single-user option, one user wears the Myo and uses the Leap Motion at the same time. The Leap Motion is used for moving and orienting the character, and the Myo armband for warnings and feedback in the form of vibrations. Although the Myo has gesture recognition and gyroscopes that could be used for interacting with the environment, we decided to rely more on the Leap Motion because our testing showed it to be more reliable and precise.

Figure 9a shows the user navigating the environment and, in Fig. 9b, grabbing a heritage object using the grab motion recognized by the Leap Motion.

Fig. 9. Approaching and grabbing virtual artifacts.

The multi-user option is based on our holographic pyramid, which displays the two users' perspectives; the other two faces of the pyramid may be used by spectators to watch the users interact with the environment. Two users can explore the heritage environment and interact in real time. One user wears the Myo armband as a controller, moving and interacting by making certain gestures and moving the arm wearing it. The other user controls the character through the Leap Motion's interpretation of hand movements; this user can also naturally grab objects and inspect them more thoroughly by entering the manipulation state.

4 Discussion

In order to evaluate the usability of our solution, we began conducting an experiment to verify the following hypothesis about natural gesture-based interaction in cultural environments.

4.1 Hypothesis and Premises

For a user who comes to experience a cultural heritage virtual reconstruction, natural gesture-based interaction is easier to accept than any conventional interaction device.

For our evaluations we adopted the following two premises:

(1) Users know neither the structure nor the topology of the environment beforehand. This premise means that users may become disoriented at the start of their virtual experience and make little sense of the things around them.

(2) Users are not aware a priori either of the actions allowed inside the virtual environment or of the metaphors for engaging in these actions. Simply browsing a new world is not enough to deliver the feeling of being part of that world. Instead, it is the exploration, discovery, and participation in that world's specific cultural activities that deliver the cultural immersive feeling.

4.2 Apparatus

We target about one hundred volunteers to take part in the study, selected from among university students and visitors.

We conducted our study using the virtual world platform of the TOMIS project [22], which enables users to engage in the discovery of the reconstructed historical site of the city of Tomis, a Greek colony situated on the west coast of the Black Sea.

For the moment, we have tested our solution on only a very few volunteers in a lab setup. Preliminary results showed that natural gestures can provide good guidance for user navigation towards a target, implemented as either a place in the virtual world or a virtual object to grasp and manipulate, if and only if the system responds coherently to the user's gestures.

Although it took a few minutes for the subjects to adjust to the more delicate controls of the system, they learned them quickly because the movements and gestures came naturally.

Given that our previous study involved only a few users, we decided to conduct another one. Its target audience were visitors who came to our faculty for presentations regarding potential enrollment. Considering the time needed to learn the controls and get used to them, and the high-density audience, we were unable to present the full application without disturbing the timetable of the presentations. With that in mind, we built a demo for our holographic display with a few gesture controls using the Leap Motion, and used a questionnaire to gather the opinions of the users who tried the demo.

The demo consists of the holographic display showing four perspectives of a building from the virtual world platform of the TOMIS project [22]. The building rotates slowly so that it can be seen from all angles, and the user can control the rotation by moving a hand above the Leap Motion along the X and Z axes in order to better explore the building.
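A minimal sketch of this control loop, assuming the Leap Motion C# SDK, might map the palm's X/Z offsets to rotation as follows; the sensitivity and auto-rotation constants are assumed.

```csharp
using Leap;
using UnityEngine;

// Sketch: let the user steer the building's rotation with the palm's
// X/Z offset above the Leap Motion; constants are illustrative.
public class HologramRotation : MonoBehaviour
{
    public Transform building;
    public float autoSpeed = 10f;      // degrees/s when no hand is present
    public float sensitivity = 0.5f;   // degrees/s per millimeter of offset
    private Controller controller = new Controller();

    void Update()
    {
        Frame frame = controller.Frame();
        if (frame.Hands.Count == 0)
        {
            // Default behavior: slow spin so all angles remain visible
            building.Rotate(0f, autoSpeed * Time.deltaTime, 0f);
            return;
        }
        Vector palm = frame.Hands[0].PalmPosition;   // millimeters
        building.Rotate(palm.z * sensitivity * Time.deltaTime,
                        palm.x * sensitivity * Time.deltaTime, 0f, Space.World);
    }
}
```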

This study involved 42 people aged between 17 and 24; 26 of them had never seen a holographic display before testing the application. Figure 10 shows their answers when asked how often they use gesture-based technology.

Fig. 10. How frequently our users employ gesture-based technology (legend: 1 = never, 2 = rarely, 3 = sometimes, 4 = often, 5 = all the time).

In the questionnaire we focused on the following topics: ease of use, image quality, responsiveness, enjoyment of using the application, and final comments. The questionnaire asked for a 1-to-5 rating of the application's attributes on these topics. For responsiveness, one of the most important topics, the results are presented in the histogram in Fig. 11. The other topics had the following most frequent ratings: ease of use, 4 (54%); image quality, 4 (57%); enjoyment of use, 5 (52%).

Fig. 11. Rating of our application's responsiveness.

We received a few complaints about the image not moving fluently, but this was caused by the application running for a long time and our computer overheating.

Under the "What did you like the most?" section of the questionnaire, the most common answers concerned the holographic display, the gesture-based controls, and the 3D model.

5 Conclusion and Future Work

In this paper, we have presented several approaches to natural, gesture-based interaction with a virtual heritage environment. We have explored the advantages and disadvantages of using two different gesture-recognition technologies for virtual environment exploration. The interface setup is also simple and cost-effective, consisting of inexpensive materials for the holographic setup and a desktop monitor.

Although, with our resources, we were unable to conduct a formal study on a broader user population in order to determine the best approach, we believe we came close to an easy-to-learn and easy-to-use interface and configuration. It would be interesting to further explore the effectiveness of adding, say, a virtual reality headset with a Leap Motion device mounted on it for a better user experience, or even VR controllers.

We could also add mini-games based on the human interactions that took place in the heritage environment's prime, for better heritage understanding and immersion. Further, we could add a few non-player characters with whom users can interact; for instance, users could help a character make an item or even participate in events. These characters could be personalities of that time, about whom users could learn more through interaction.

Adding a haptic device that recreates the sense of touch by applying forces, vibrations, or motions could make users feel more immersed and help them better understand the texture of the artifacts.