1 Introduction

Tourism is a major industry in the world economy and plays an important role in our lives. While tourism can bring significant economic benefits to destinations, large numbers of visitors can cause overcrowded attractions and traffic congestion. Public facilities such as restrooms, parking areas, and visitor centers can become overwhelmed by the influx of tourists, which results in a negative impact on both tourists and local residents.

For local governments, a holistic and sustainable approach to route recommendation can balance the economic benefits of tourism against the congestion caused by growing tourist numbers. Tourists, on the other hand, focus primarily on finding a route that meets their individual conditions and maximizes their personal gains. Congestion-aware route recommendation can therefore lead to a potential Hawk-Dove game [1]: if a sufficient number of tourists share the same preference, some of them have to be recommended relatively unfavorable routes to avoid excessive crowding at attractions. Because tourists act selfishly, those who are assigned relatively unfavorable routes may feel dissatisfied and unfairly treated.

However, most existing methods are designed from an overall planner’s perspective, and their evaluations are primarily based on simulation data. No studies have investigated the actual personal experience of tourists regarding congestion-aware route recommendations. To evaluate congestion-aware methods from tourists’ personal perspectives, we conducted a user survey on congestion-aware route recommendations in Kyoto, Japan. We selected five state-of-the-art route recommendation methods with varying degrees of consideration for congestion and tourism diversification. While these methods have demonstrated promising results on their respective datasets, their performance has not been verified from the perspective of actual users. Respondents evaluated routes recommended by these methods from various aspects in different scenarios and provided comments on each method within each scenario. We compared the methods using scores from different aspects to reveal their strengths and weaknesses. Furthermore, based on text comments from respondents, we conducted a cluster analysis to uncover distinctions among various user types. Additionally, we used a series of regressions to estimate the effects of scheduling, visiting order, distances between attractions, and travel comfort on the overall feeling of routes. We compared the coefficients for different groups and applied a bootstrapping method to determine the significance of the differences observed between groups.

The contributions of this work are summarized as follows.

  • We conducted a user study to investigate the evaluations of five route recommendation methods, with varying degrees of consideration for congestion and tourism diversification. To the best of our knowledge, this is the first study that compares congestion-aware route recommendation methods from tourists’ personal perspectives.

  • We examined the effect of scheduling, visiting order, distances between attractions, and comfort on the overall feeling of routes. Moreover, we compared the effect of variables on overall feelings among groups and revealed that the sociodemographic factors exert a significant influence over the users’ evaluation for each method.

  • We applied clustering to identify patterns and similarities between users’ responses. The demographic profiles and the effect of variables on overall feelings from each cluster are compared to reveal the differences in evaluation of route recommendation methods between clusters.

2 Related Work

Sustainability, green tourism, and ecotourism have attracted increasing attention in recent years [2, 3]. With an increasing number of tourists flocking to particular popular attractions, congestion has emerged as a significant problem. Marsiglio [4] studied the determination of the optimal number of visitors in a tourism-based economy. Albaladejo et al. [5] conducted an empirical study of tourism demand in the most visited destinations in Spain, with special emphasis on the role of congestion. Takeuchi et al. [6] used causal inference to estimate the effects of crowd movement guidance from a policy-making perspective.

To address overtourism, there has been growing interest in congestion-aware route recommendation. It is a variant of the Orienteering Problem [7, 8], but it is more challenging because of the potential congestion caused by multiple tourists arriving at the same attractions.

Cheng et al. [9] proposed a congestion-aware rescheduling method focusing on multigroup travelers with multiple destinations. Varakantham et al. [10] tackled the issue of crowd congestion at particular attractions by providing route guidance to multiple selfish users moving simultaneously. Kong et al. [11] proposed a multi-agent reinforcement learning approach with a dynamic reward mechanism to tackle the multi-user route recommendation problem. Kong et al. [12] introduced a multi-agent reinforcement learning approach to address both traffic congestion and spot congestion. In [13], the Orienteering Problem with Time Windows is addressed by taking into account the reward and required stay duration at a spot, considering its congestion level.

However, most existing studies are conducted from an overall planner or policy-making perspective, and the evaluation of these methods is based on simulation data, leaving the actual experience of tourists hardly discussed. To bridge this gap, we conduct a user study to investigate the evaluation of congestion-aware route recommendation methods from the personal perspective of tourists.

3 Methodology

We conducted a user study in Kyoto, Japan, one of the world’s most famous destinations. A screening questionnaire was distributed to 41 respondents. We investigated the following five route recommendation methods, which have varying degrees of consideration for congestion and tourism diversification. The methods are trained using the dataset of Kyoto Sightseeing Map 2.0 [14] and trajectory data provided by Yahoo Japan Corporation [15].

  • MARLRR: A multi-agent reinforcement learning approach proposed in [11] for solving the congestion-aware route recommendation task. This method assumes that several tourist groups are moving simultaneously, and a congestion penalty is introduced into the reward function to avoid overconcentration at attractions.

  • RPMTD: A multi-agent reinforcement learning based route recommendation method proposed in [12] that considers both multiple users accessing simultaneously and the congestion at attractions. This method introduces a dual-congestion mechanism, in which both the local congestion at visited spots and the global distribution of tourists affect tourists’ satisfaction.

  • Non-Dual RPMTD: An alternative version of RPMTD that does not consider the global distribution of tourists in the dual-congestion mechanism.

  • Pointer-NN: A reinforcement learning approach to the Orienteering Problem with Time Windows proposed in [16], which considers neither multiple agents nor congestion.

  • TRGCSC: A reinforcement learning approach proposed in [13], which is based on [16] and introduces two novel concepts, “dynamic stay duration” and “environmental tax metaphor.” The former concept estimates the necessary stay duration at a spot based on its congestion, and the latter concept assigns dynamic rewards depending on congestion at spots.

We investigated five different scenarios in the questionnaire.

  • Start from Kyoto Tower and end at Kawaramachi, 4-hour time budget.

  • Start from Kyoto Tower and end at Kawaramachi, 6-hour time budget.

  • Start from Kyoto Tower and end at Kawaramachi, 8-hour time budget.

  • Start from Kawaramachi and end at Arashiyama Station, 4-hour time budget.

  • Start from Kawaramachi and end at Kinkakuji Temple, 6-hour time budget.

All participants were asked to evaluate routes from the following aspects for each method under each scenario.

  • Scheduling: The evaluation of time scheduling that includes the time spent at attractions and the time spent moving between attractions.

  • Visiting Order: Whether the visiting order of attractions in the recommended route is reasonable.

  • Distance: Whether the distances between attractions in the recommended route are too far.

  • Comfort: The evaluation of comfort, considering the congestion at spots and the traffic congestion during movement between spots.

  • Overall Feeling: The overall evaluation for each route.

Fig. 1. The web page that displays the information about the recommended spots, distances between spots, and congestion levels.

Table 1. Demographic profile of respondents.

Moreover, all respondents were asked to provide comments on routes recommended by each method under each scenario.

To ensure that respondents were well-informed about the recommended route details, we created a webpage displaying the information, as illustrated in Fig. 1. The webpage consists of three parts from left to right: route information, map display, and spot information. In the route information section, respondents can see the designated visitation time for each spot. The map display section illustrates the route and congestion level on a map, providing a visual representation of the journey. The spot information section displays photos, aesthetics scores of photos, congestion information, and reviews for each spot based on Kyoto Sightseeing Map 2.0 data [14].

Fig. 2. Clustering of users.

Fig. 3. Word clouds of each cluster.

4 Characteristics of Respondents

4.1 Demographic Profile of Respondents

The respondents’ demographic characteristics are listed in Table 1. Female respondents (60.98%) outnumbered male respondents (39.02%) by approximately 22 percentage points. The frequency of travel varied from person to person, with 73.17% of the respondents traveling no less frequently than twice a year. The duration of travel also shows variability among individuals. We also asked about the factors that influence the selection of travel plans. Congestion is the most considered factor (80.49%), followed by price (78.05%), scheduling (63.42%), and season (56.10%). In the comparison of popular attractions and less-known places, 63.42% of the respondents preferred less-known spots, while 29.27% favored popular spots. Concerning congestion, 90.24% of the respondents expressed concern about traffic congestion, and 97.56% were concerned about congestion at spots.

4.2 Clustering Analysis

To explore the similarities and differences between respondents, we cluster the respondents using their text comments on the routes under each scenario. We collected a total of 7,708 word comments from 41 users on the route recommendation methods under the five scenarios. We remove punctuation, convert case, reduce words to their base forms, and filter stop words using spaCy [17].

We use Skip-Gram Word2Vec [18] to represent the words as vector embeddings. The vector representation of each comment is calculated by taking the mean of the vectors of the words in the comment. The vector representation of each user is then obtained by taking the mean of the vectors of the comments from that user.
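The two averaging steps can be illustrated with a minimal sketch. The embedding table below contains hypothetical two-dimensional toy values rather than trained Word2Vec vectors, the function names are ours, and skipping words or comments without a known embedding is an assumption that the procedure above does not specify.

```python
from statistics import fmean

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [fmean(dim) for dim in zip(*vectors)]

def comment_vector(tokens, embeddings):
    """Mean of the word vectors of the tokens that have an embedding.

    Assumption: tokens missing from the table are skipped."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    return mean_vector(known) if known else None

def user_vector(comments, embeddings):
    """Mean of the comment vectors of one user.

    Assumption: comments without any known token are skipped."""
    vectors = [comment_vector(c, embeddings) for c in comments]
    vectors = [v for v in vectors if v is not None]
    return mean_vector(vectors) if vectors else None

# Hypothetical toy embeddings and tokenized comments of a single user.
toy_embeddings = {"route": [1.0, 0.0], "time": [0.0, 1.0], "order": [1.0, 1.0]}
toy_comments = [["route", "time"], ["order"], ["unknown"]]

print(user_vector(toy_comments, toy_embeddings))  # [0.75, 0.75]
```

In this toy case the first comment averages to [0.5, 0.5], the second to [1.0, 1.0], and the third is skipped, so the user vector is [0.75, 0.75].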

We use the K-means method [19] to cluster the respondents, selecting \(K=3\) clusters based on the elbow method. Figure 2 shows the results of clustering after dimensionality reduction by t-SNE [20]. Furthermore, to gain insights into the features of each cluster, we create word clouds to visualize the most common words within each cluster, as illustrated in Fig. 3.
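As a rough illustration of the clustering step, the sketch below implements plain Lloyd-style K-means and prints the within-cluster sum of squares (inertia) for several values of K, which is the quantity inspected in the elbow method. The deterministic farthest-point initialization, the function names, and the toy two-dimensional points are assumptions made for this example; the initialization and implementation used in the study are not specified here.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic initialization (an assumption of this sketch): start at the
    first point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points]

def kmeans(points, k, max_iter=100):
    """Returns (centroids, labels, inertia)."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append([sum(d) / len(members) for d in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Hypothetical 2-D "user vectors" forming three well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
              (20, 0), (20, 1), (21, 0)]

# Elbow method: look for the K after which the inertia stops dropping sharply.
for k in range(1, 6):
    print(k, round(kmeans(toy_points, k)[2], 2))
```

On these toy points the inertia falls steeply up to K = 3 and only marginally afterwards, which is the pattern the elbow method looks for.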

In Cluster 1 (26.8%), 73% of the respondents are female, and 27% are male. The respondents in Cluster 1 are the least frequent travelers but have the longest duration: 72.7% of them select “Twice a year” or less for the frequency, and 63.6% select “7 days” or “More than 7 days” for the duration. Respondents in Cluster 1 show a higher concern about time cost (54.5%), scheduling (72.7%), season (63.6%), and popularity (63.6%) compared to the other clusters. Respondents in Cluster 1 focus on “scenic spot,” “route,” and “order.”

In Cluster 2 (41.5%), 18% of the respondents are female, and 82% are male. Cluster 2 respondents are more concerned about price (88.2%) and congestion (94.1%). Cluster 2 respondents focus on “time,” “route,” and “scenic spot.”

In Cluster 3 (31.7%), 31% of the respondents are female, and 69% are male. Respondents in Cluster 3 are the most frequent travelers but have the shortest duration of travel. Specifically, 61.5% of them select “Once every quarter” or more for the “Frequency of travel,” and 84.6% select “5 days” or less for the “Duration of travel,” with 30.8% of them choosing “Daytrip.” Additionally, they have a stronger preference for popular spots compared to the other clusters. Respondents in Cluster 3 focus on “travel,” “software,” and “time.”

The differences among clusters demonstrate that respondents have varying points of emphasis and preferences, which may influence their evaluation of route recommendation methods.

5 Comparison of Methods

5.1 Survey Results

Figure 4(a) demonstrates the evaluation of scheduling for each method. Pointer-NN received a total of 63% “Very satisfied” or “Satisfied,” followed by TRGCSC (60%), Non-Dual RPMTD (45%), RPMTD (33%), and MARLRR (32%).

Figure 4(b) illustrates the evaluation of the visiting order for each method. The routes recommended by TRGCSC are the most satisfactory on visiting order for respondents, with a total of 62% of the respondents selecting “Very satisfied” or “Satisfied,” followed by Pointer-NN (60%), MARLRR (51%), Non-Dual RPMTD (41%), and RPMTD (40%).

Figure 4(c) shows the evaluation of distance for each method. TRGCSC received a total of 60% “Very satisfied” or “Satisfied,” followed by Pointer-NN (53%), Non-Dual RPMTD (39%), RPMTD (32%), and MARLRR (30%).

Figure 4(d) displays the evaluation of comfort for each method. TRGCSC received the highest satisfaction, with a total of 65% of respondents selecting “Very satisfied” or “Satisfied,” followed closely by Pointer-NN (63%) and then by Non-Dual RPMTD (44%), RPMTD (39%), and MARLRR (39%).

Figure 4(e) illustrates the overall feeling. Pointer-NN is the most satisfactory method for respondents, with a total of 67% of the respondents selecting “Very satisfied” or “Satisfied.” This is followed by TRGCSC, with a total of 51% of the respondents selecting “Very satisfied” or “Satisfied.” Non-Dual RPMTD (40%), RPMTD (41%), and MARLRR (40%) achieved similar performance.
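The percentages reported above are shares of responses in the two most favorable categories. A minimal sketch of that calculation follows; the function name and the toy responses are hypothetical, and the category labels other than “Very satisfied” and “Satisfied” are assumptions.

```python
from collections import Counter

def top_two_share(responses, top=("Very satisfied", "Satisfied")):
    """Percentage of responses that fall in the two most favorable categories."""
    counts = Counter(responses)
    return 100.0 * sum(counts[label] for label in top) / len(responses)

# Hypothetical responses for one method and one aspect.
toy_responses = ["Very satisfied", "Satisfied", "Neutral", "Dissatisfied", "Satisfied"]

print(f"{top_two_share(toy_responses):.0f}%")  # 60%
```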

Fig. 4. Survey results for each method across scenarios.

5.2 Discussion

Compared to Pointer-NN, which does not consider congestion, the congestion-aware methods score lower on overall feeling, and all of them except TRGCSC also score lower on distances between spots. This result supports the idea mentioned previously that considering congestion does not always improve tourists’ satisfaction. To avoid overcrowding, the congestion-aware methods are more likely to lead tourists to less-known spots rather than popular spots. Although local residents and local governments may benefit from this, tourists might be dissatisfied with unusual routes, particularly when the recommended less-known spots are far from other spots, resulting in low scores on distance and overall feeling.

Notably, the differences between Non-Dual RPMTD and RPMTD support this interpretation. Because it also considers places that no tourists visit, RPMTD tends to recommend more unpopular spots than Non-Dual RPMTD. However, Non-Dual RPMTD outperforms RPMTD in scheduling, visiting order, distance, and comfort, while the two are nearly tied on overall feeling (40% vs. 41%), indicating that considering the overall distribution of tourists does not improve the recommended routes and makes them worse in most aspects. Hence, the consideration for congestion and tourism diversification is not a case of “the more, the better.” Determining the level of consideration for congestion is a crucial concern for congestion-aware route recommendation methods.

6 Factors Influencing Overall Feeling

6.1 Regression Estimation Results for All Responses

In this section, we consider the effects of scheduling, visiting order, distance, and comfort on overall feeling. Coefficients of the following regression equation are obtained using ordinary least squares (OLS).

$$\begin{aligned} Feeling_i = \beta_0 + \beta_1 Scheduling_i + \beta_2 Order_i + \beta_3 Distances_i + \beta_4 Comfort_i + \varepsilon_i \end{aligned}$$

Regression estimates for all responses are shown in Table 2. All of the variables are statistically significant, which indicates that each of them influences the overall feeling. In terms of their impact on the final score, scheduling appears to have the greatest influence, followed by visiting order, and then distances between attractions. Comfort seems to be the least influential variable. We use the variance inflation factor (VIF) to detect multicollinearity. The VIFs of all variables are smaller than 5, which suggests that the multicollinearity between variables is moderate.
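For readers who wish to reproduce this kind of analysis, the following is a minimal sketch of OLS estimation through the normal equations and of the VIF, defined as 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j on the remaining predictors. The helper names and the toy data with two predictors are hypothetical; the sketch is not the code used in this study, and a statistics package would normally be used instead.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def r_squared(rows, y):
    beta = ols(rows, y)
    fitted = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def vif(rows, j):
    """VIF of predictor j: regress it on the other predictors."""
    target = [r[j] for r in rows]
    others = [[v for i, v in enumerate(r) if i != j] for r in rows]
    return 1.0 / (1.0 - r_squared(others, target))

# Hypothetical scores with two predictors; y = 1 + 2*x1 + 0.5*x2 exactly.
toy_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
toy_y = [4.0, 5.5, 9.0, 10.5, 13.5]

print([round(b, 6) for b in ols(toy_rows, toy_y)])  # [1.0, 2.0, 0.5]
print(round(vif(toy_rows, 0), 3))                   # 2.778
```

In the toy data the two predictors have a correlation of 0.8, so R_j^2 = 0.64 and the VIF is 1 / 0.36, which is below the threshold of 5 used above.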

In decreasing order of importance, scheduling, visiting order, distance, and comfort all have significantly positive effects on the overall feeling. These results support the previously mentioned view: the degree of consideration for congestion and tourism diversification should be balanced with other factors, especially when congestion consideration negatively impacts scheduling, visiting order, and distance between spots.

Table 2. Regression results for all responses. *** denotes \(p<0.01\).

6.2 Regression Estimation Results for Groups

As mentioned previously, differences in user demographics and preferences have been observed. We perform regressions to compare the effects of factors on overall feelings across user groups. The VIFs of variables for each regression, which are not reported here, are smaller than 5, suggesting moderate multicollinearity.

To determine the significance of observed differences in coefficient estimates, a bootstrapping method proposed in [21] is used to calculate empirical p-values that estimate the likelihood of obtaining the observed differences in coefficient estimates if the true coefficients are, in fact, equal. Observations are pooled from the two groups whose coefficient estimates are to be compared. We denote the number of observations from each group as \(n_1\) and \(n_2\). For each simulation, \(n_1\) and \(n_2\) observations are randomly selected from the pooled \(n_1+n_2\) observations and assigned to group 1 and group 2, respectively. Coefficient estimates are then determined for each group using these observations, and this procedure is repeated 50,000 times. The empirical p-value is the percentage of simulations where the difference between coefficient estimates exceeds the actual observed difference in coefficient estimates. For example, a p-value of 0.01 indicates that only 500 out of 50,000 simulated outcomes exceeded the sample result, which implies the sample difference is significant.
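The resampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of [21]: the pooled observations are split without replacement by shuffling (drawing with replacement is the other possible reading of the description), the difference is taken as group 1 minus group 2, far fewer simulations than 50,000 are run, and the function names and single-predictor toy data are hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def empirical_p_value(x1, y1, x2, y2, coef, n_sim=300, seed=0):
    """Share of simulated coefficient differences that exceed the observed one."""
    rng = random.Random(seed)
    observed = ols(x1, y1)[coef] - ols(x2, y2)[coef]
    pooled = list(zip(x1 + x2, y1 + y2))
    n1 = len(x1)
    exceed = 0
    for _ in range(n_sim):
        # Assumption: reassign pooled observations to the groups without replacement.
        rng.shuffle(pooled)
        g1, g2 = pooled[:n1], pooled[n1:]
        b1 = ols([r for r, _ in g1], [t for _, t in g1])[coef]
        b2 = ols([r for r, _ in g2], [t for _, t in g2])[coef]
        if b1 - b2 > observed:
            exceed += 1
    return exceed / n_sim

# Hypothetical groups whose slopes clearly differ (3.0 versus 0.5).
x_a = [[float(i % 10)] for i in range(200)]
y_a = [3.0 * r[0] for r in x_a]
x_b = [[float(i % 10)] for i in range(200)]
y_b = [0.5 * r[0] for r in x_b]

# coef=1 selects the slope (index 0 is the intercept).
print(empirical_p_value(x_a, y_a, x_b, y_b, coef=1))
```

Because the p-value is computed from a signed difference, values close to 0 and values close to 1 both indicate that the coefficients of the two groups differ, in opposite directions.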

Table 3. Regression results for popular group and less-known group. Reported coefficients are estimated using OLS. Standard errors are in the parentheses. *** denotes \(p<0.01\), ** denotes \(p<0.05\), * denotes \(p<0.1\).

Popular Group vs. Less-Known Group. We divided the respondents into a popular group and a less-known group based on their responses to the question “prefer to visit popular places or less-known places.” Subsequently, we compared the effects of variables on overall feeling across the groups to explore the differences between different types of tourists.

Regression results for the popular group and the less-known group are presented in Table 3. As demonstrated by the empirical p-values, compared to respondents who prefer less-known spots, the popular group focuses more on visiting order (at a significance level of 95.55%) and distance between spots (at a significance level of 82.93%), and less on comfort (at a significance level of 1.86%) and scheduling (at a significance level of 22.49%).

Table 4. Regression results for male group and female group. Reported coefficients are estimated using OLS. Standard errors are in the parentheses. *** denotes \(p<0.01\), ** denotes \(p<0.05\), * denotes \(p<0.1\).

Male Group vs. Female Group. Similarly, we compare the effects of variables on overall feeling between the male group and the female group (Table 4).

The p-value of 0.0254 in the scheduling column suggests that the scheduling coefficient for the male group is smaller than that for the female group at a significance level of 2.54%, which indicates that the overall feeling score of the female group is more sensitive to scheduling than that of the male group. In other words, the female respondents are more concerned about scheduling than the male respondents.

Regression Estimation Results for Clusters. Regression estimation results for the clusters are presented in Table 5. The respondents in Cluster 1 focus more on comfort (87.90%) and distance (83.50%), but less on scheduling (16.71%) and visiting order (0.88%) compared to Cluster 2. The respondents in Cluster 2 are more sensitive to scheduling (69.24%) and visiting order (87.65%) but less sensitive to comfort (25.73%) compared to Cluster 3. The p-value of 0.8512 in the visiting order column for Cluster 3 vs. 1 suggests that the visiting order coefficient for Cluster 3 is greater than that for Cluster 1 at a significance level of 85.12%.

Table 5. Regression results for each cluster. Reported coefficients are estimated using OLS. Standard errors are in the parentheses. *** denotes \(p<0.01\), ** denotes \(p<0.05\), * denotes \(p<0.1\).

Discussion. The comparison of user groups reveals that tourists have significantly varying concerns about comfort among different groups, indicating that their tolerance for congestion varies and could be inferred from their demographic profiles and historical records. Therefore, a further study with more focus on personalized congestion awareness is suggested. For instance, incorporating personalized congestion penalties into the reward function based on tourists’ demographic profiles and preferences may yield better results than fixed penalties.

7 Conclusion

The purpose of this study is to determine the evaluation of congestion-aware route recommendation methods from tourists’ personal perspective through an experimental questionnaire survey. We investigated respondents’ evaluations of five route recommendation methods with varying degrees of consideration for congestion and tourism diversification. Respondents were asked to evaluate these methods based on five aspects. Moreover, we conducted a series of regression estimations to explore the differences among respondent groups.

Although congestion-aware methods are preferred in terms of visiting order scores and comfort scores, the method that does not consider multiple users or congestion tends to provide the highest overall feeling satisfaction for tourists. The results of regression estimation also support the notion that consideration for congestion is necessary but needs to be balanced with other factors. In decreasing order of importance, scheduling, visiting order, distance between spots, and comfort all exhibit a significantly positive effect on overall feeling.

Additionally, this study did not consider other stakeholders, such as local residents, travel agencies and local governments. These entities typically benefit more from congestion-aware route recommendation methods than tourists. To determine an appropriate degree of consideration for congestion and, consequently, sustainable tourism, the opinions of these stakeholders are also important. We are planning to conduct a survey that takes into account both tourists and local stakeholders in future work.