Introduction

Rapid advancements in technology, particularly in information technology, artificial intelligence, and the Internet of Things, have endowed cities with tools to collect data, analyze them, and make data-driven decisions for urban planning and management. These decisions often rely on parameter values that are not definitively known at the time they are made. Forecasting models that predict the most probable realization of uncertain parameters can be employed. However, for decisions with enduring implications over the medium to long term, it is essential to account not only for the most likely scenario but also for a range of potential outcomes. In such cases, scenario generation procedures are necessary to assemble a comprehensive representation of potential future developments. This allows the decision model to mitigate risks and strengthen the resilience of the chosen strategies, enabling decision-makers to adapt to a variety of future circumstances and to anticipate and hedge against uncertainty.

When dealing with transportation management problems, it is crucial to consider various possible traffic conditions and generate scenarios for travel speed across the road network. In smart cities, hundreds of thousands of strategically placed sensors provide the capability to collect substantial data on travel speed under different conditions. Several studies, such as Lv et al. (2015), Geng et al. (2019), and Yao et al. (2019), have successfully employed machine learning techniques to analyze this data and develop speed forecasting models. Unlike these previous works, our objective is to develop a generative machine learning model that can generate scenarios by effectively capturing the joint distribution of speed variables throughout a road network. This approach accounts for complex dependencies among speed variations along road links. It prioritizes the generation of scenarios that approximate the actual joint distribution of the variables, rather than solely focusing on forecasting the most likely scenario. This will enable us to generate scenarios for consideration in decision models under uncertainty, such as those based on stochastic programming or robust optimization.

Speed measurements on road networks are influenced not only by proximity but also by exogenous factors, including human behavior, weather conditions, and the geographical distribution of Points of Interest (POIs), such as theaters, schools, hospitals, and shopping centers. Learning the joint distribution and dependencies among these variables from sensor data holds the potential to acquire more robust and generalizable representations. Undoubtedly, this task is challenging, as the learned models must not only account for intricate topological relationships but also effectively handle the influence of traffic attractors.

In this study, we propose a model based on generative adversarial networks (GANs) (Goodfellow et al. 2020), a deep learning (DL) architecture. GAN models are self-supervised and have demonstrated their potential in capturing multivariate joint distributions across various domains, including image generation (Bao et al. 2017; Gao et al. 2018; Reed et al. 2016), molecular structure generation (Maziarka et al. 2020), and multi-modal information integration (Sutter et al. 2020). To enhance the performance of our GAN-based model, we suggest pre-training the generator using a Variational AutoEncoder (VAE) (Kingma and Welling 2013a). The paper is structured as follows: in the next section, we review related work. In Sect. 3, we formalize the problem. The subsequent section outlines our proposed model. The penultimate section provides details on the datasets used for model validation and presents the results compared to state-of-the-art methods. Finally, we offer conclusions and suggest possible directions for future research.

Related Work

Scenario generation is an algorithmic strategy for approximating the stochastic process that describes the random parameters affecting a decision problem with a finite discrete distribution, so that computational solution techniques can be applied. The quality of the solution obtained naturally depends on the scenarios used. In traffic engineering, scenario generation is a sophisticated approach to enhance the planning and development of urban transportation systems. Using scenarios, it is possible to create and analyze a wide range of traffic conditions in order to optimize the design and functionality of infrastructure such as roads and urban spaces. Scenario generation allows transport and urban planners to implement solutions that take uncertainty into account, for example, to identify potential vulnerabilities in road networks, as proposed in Ciplyte et al. (2014), and to develop strategies to minimize traffic pollution, as in Ejercito et al. (2017). To make resilient and robust decisions, traffic scenario generation techniques can be used to generate traffic flow scenarios, as in Cervellera et al. (2022) and Wu et al. (2020), and travel time scenarios, as proposed in Chen et al. (2017), Yu et al. (2020), and Meiping et al. (2019). Our proposed model focuses on scenario generation of speed values on a road network.

In recent years, several approaches have been proposed to generate scenarios whose distributions approximate the original multivariate data distributions, especially in the fields of asset management (Kouwenberg and Zenios 2001) and renewable energy production (Ma et al. 2013). Scenario generation methods are generally based on sampling (Kaut and Wallace 2003) and forecasting (Kaut 2017; Lucheroni et al. 2019) techniques. Sampling-based methods generate scenarios by iteratively drawing samples from the underlying data distributions. Within this category, we can mention techniques such as Monte Carlo sampling (Dong et al. 2019; Xie et al. 2018), Markov Chain Monte Carlo sampling (Papaefthymiou and Klockl 2008), and Latin Hypercube sampling (Yu et al. 2009). Forecasting-based methods rely on models trained on historical data without making any assumptions about the underlying distribution functions. Examples include Auto-Regressive Moving Average (ARMA) models (Meibom et al. 2011), Auto-Regressive Integrated Moving Average (ARIMA) models (Chen et al. 2010), and, more recently, generative models based on neural networks (Vagropoulos et al. 2016; Stappers et al. 2020).

In a multivariate setting, sampling-based approaches must rely on a joint probability distribution model that, in real cases, when a large number of variables must be considered, is not easy to obtain. One popular technique for modeling a joint probability distribution employs a family of multivariate distributions with uniform marginals called copulas (Becker 2018; Valizadeh Haghi and Lotfifard 2015), a statistical model grounded in Sklar’s theorem (Chen et al. 2013). This theorem states that a joint distribution can be described by the marginal distribution functions of its random variables together with a copula that models the dependencies between them. The copula function takes the marginal distribution functions as input and can be sampled to generate new scenarios (Kaut and Wallace 2011).
The copula sampling method avoids constructing the joint probability distribution directly, but requires marginal distributions that resemble the normal distribution. Commonly used copula families include the elliptical and Archimedean ones. However, when considering real case studies, empirical copulas (copulas estimated from real data), consisting of mixtures of copulas commonly modeled with parametric methods, are often used. These methods, however, tend to become computationally challenging when applied to high-dimensional data.

More recently, generative models have been proposed for statistical modeling. These learn to generate data with the same statistics as a given training dataset, effectively learning its distribution; the trained model can then output new samples that could plausibly have belonged to the original dataset. GAN architectures (Jiang et al. 2018; Chen et al. 2018) are examples of this approach in the context of scenario generation. However, generative methods are mainly used for forecasting rather than for learning data distributions. Forecasting the most probable scenario is a different task: it disregards low-probability scenarios that could nonetheless have a large impact on decisions under uncertainty. For this reason, we propose a new GAN-based model aimed at learning the joint probability distribution directly from data, in order to generate a finite discrete distribution of scenarios.

Problem Statement

Given a set of \(N\) roads indexed by \(k \in \{1, \ldots , N\}\), let \(D^k = (x^k_1, x^k_2, \ldots , x^k_n) \in \mathbb {R}^n\) be a collection of \(n\) observations of the average speed on road \(k\). The sample size for each dataset is \(n\), and each sample has been collected simultaneously for all roads. In other words, each data point is an \(N\)-tuple of observations \(s_i = (x^1_i, x^2_i, \ldots , x^N_i) \in \mathbb {R}^N\), where \(i \in \{1, \ldots , n\}\). Let the observation set \(S = \{s_1, s_2, \ldots , s_n\} \in \mathbb {R}^{N \times n}\) be the collection of all such \(N\)-tuples. It represents a sample of realizations of an unknown multivariate random variable \(X\), while \(D^k\) collects the realizations of the marginal variable \(X^k\) for all \(k \in \{1, \ldots , N\}\).

A random variable is a formalization of a quantity (in this case, average car speeds measured on the road network) that depends on random events. Formally characterizing a random variable is typically a complex process, as discussed in the previous section, where various approaches from the literature have been presented. In this study, we propose to approximate \(X\) using a deep neural network \(G_\theta : \mathbb {R}^Z \longrightarrow \mathbb {R}^N\). This network takes as input a sample \(\xi \sim \mathcal {N}(0, 1)^Z\) drawn from a \(Z\)-dimensional uncorrelated Gaussian distribution and outputs an \(N\)-tuple of real values representing the speeds on the considered roads. The network’s weights \(\theta \) are obtained through an adversarial training process and backpropagation over the observation set \(S\).
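To make the mapping concrete, the following minimal PyTorch sketch shows a network of this form; the hidden size and all layer choices are illustrative placeholders, not the architecture used in the experiments (which is detailed in the Proposed Approach section).

```python
import torch
import torch.nn as nn

Z, N = 12, 16  # illustrative noise and output dimensions

# A minimal sketch of G_theta: maps xi ~ N(0, 1)^Z to an N-tuple of speeds.
generator = nn.Sequential(
    nn.Linear(Z, 64),
    nn.ReLU(),
    nn.Linear(64, N),
)

xi = torch.randn(1000, Z)   # 1000 samples from the uncorrelated Gaussian prior
scenarios = generator(xi)   # each row is one generated N-tuple of road speeds
```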

Fig. 1 Architectural view of the proposed generative approach

Fig. 2 PEMS-BAY detectors and main roadways

Proposed Approach

This research employs deep learning techniques, specifically a Generative Adversarial Network (GAN) and a Variational AutoEncoder (VAE), for traffic speed scenario generation. A GAN comprises two distinct neural networks: a Generator (Gen), responsible for creating synthetic instances from random noise, and a Discriminator (Dis), which is trained to differentiate between real instances (referred to as \(I_{\text {R}}\) in the following) and artificially generated ones (synthetic instances, \(I_{\text {S}}\)). This architecture implements a minimax two-player game, in which the Discriminator aims to enhance its ability to distinguish real from artificially generated instances, while the Generator strives to produce synthetic instances that closely resemble real ones. The GAN architecture employs the binary cross-entropy (\(\mathcal {L}_{\text {BCE}}\)) loss function, presented below in vectorized form:

$$\begin{aligned} \mathcal {L}_\text {{BCE}} = - \frac{1}{|I|} \left[ L \cdot \log (P) + (1 - L) \cdot \log (1 - P)\right] . \end{aligned}$$
(1)

Here, \(P \in [0,1]^{|I|}\) represents the vector of probabilities assigned by the Discriminator to a set of instances I, and \(L\in \{0,1\}^{|I|}\) is the vector of binary labels associated with real instances (1) and synthetic instances (0).
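For illustration, Eq. 1 corresponds to the standard binary cross-entropy readily available in deep learning frameworks; the sketch below, with made-up probabilities and labels, shows the computation in PyTorch.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # batch-averaged form of Eq. (1)

# Hypothetical Discriminator outputs P and labels L (1 = real, 0 = synthetic).
P = torch.tensor([0.9, 0.2, 0.7, 0.1])
L = torch.tensor([1.0, 0.0, 1.0, 0.0])

loss = bce(P, L)  # -(1/|I|) * sum(L*log(P) + (1-L)*log(1-P))
```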

The training process of a GAN is structured into three distinct phases, iteratively applied:

1. The Discriminator is trained only on real instances (\(I_{\text {R}}\)). During this phase, the loss function is minimized, affecting only the Discriminator’s weights. The Generator is kept idle, meaning that its weights are not updated.

2. The Discriminator is trained on synthetic instances generated by the Generator from uncorrelated Gaussian noise \(z\), that is, \(\text {Gen}(z)=I_{\text {S}}\). In this phase, the Generator is also kept idle.

3. The Generator is trained with the objective of generating synthetic instances capable of deceiving the Discriminator. The loss function from Eq. 1 reads as follows:

    $$\begin{aligned} \mathcal {L}_{\text {BCE}} = -\frac{1}{|I_{\text {S}}|} \log (1 - \text {Dis}(\text {Gen}(z))). \end{aligned}$$
    (2)

    Here, \(\text {Dis}(\text {Gen}(z)) \in [0,1]^{|I_\textrm{S}|}\) represents the vector of probabilities assigned by the Discriminator to synthetic instances \(\text {Gen}(z) = I_{\text {S}}\), and \(L = \{0\}^{|I_\textrm{S}|}\). During this phase, the loss function is maximized by updating the Generator’s weights while the Discriminator is kept idle.
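Putting the three phases together, the following minimal PyTorch sketch illustrates one training iteration. All names (gen, dis, the optimizers, the noise dimension Z) are illustrative assumptions; the third phase uses the common non-saturating formulation, which labels synthetic instances as real instead of literally maximizing Eq. 2.

```python
import torch
import torch.nn as nn

def train_step(gen, dis, real_batch, opt_gen, opt_dis, Z):
    """One GAN iteration; gen and dis are assumed nn.Module networks,
    with dis outputting probabilities in [0, 1] of shape (batch, 1)."""
    bce = nn.BCELoss()
    batch = real_batch.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Phase 1: Discriminator on real instances I_R (Generator idle).
    opt_dis.zero_grad()
    bce(dis(real_batch), ones).backward()
    opt_dis.step()

    # Phase 2: Discriminator on synthetic instances Gen(z) (Generator idle).
    opt_dis.zero_grad()
    fake = gen(torch.randn(batch, Z)).detach()  # detach keeps Gen idle
    bce(dis(fake), zeros).backward()
    opt_dis.step()

    # Phase 3: Generator tries to fool the Discriminator (Discriminator idle).
    opt_gen.zero_grad()
    # Non-saturating variant of Eq. (2): label synthetic instances as real.
    bce(dis(gen(torch.randn(batch, Z))), ones).backward()
    opt_gen.step()
```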

GAN-based architectures have demonstrated their effectiveness in various fields but often suffer from convergence issues (Kodali et al. 2017). To address this challenge and mitigate instability, we propose pre-training the Generator using a Variational AutoEncoder (VAE) (Kingma and Welling 2013b). VAE models extend the AutoEncoder architecture (Goodfellow et al. 2017; Hsieh 2001), which comprises two key components: an Encoder, responsible for generating a compact representation (referred to as the latent representation or latent space) of the input, and a Decoder, which reconstructs the model’s input starting from the latent representation. In VAE architectures, the encoder produces a regularized distribution over the latent space representing the input data, and the decoder samples from this distribution to reconstruct the provided input. The choice of a VAE for pre-training the Generator is grounded in two fundamental considerations: (i) it is possible to design a VAE whose primary objective is to learn how to reconstruct the input distribution, rather than the individual instances provided to the encoder, and (ii) it is possible to feed the decoder with samples extracted from an uncorrelated multivariate Gaussian distribution to generate synthetic instances consistent with a specific data distribution. Figure 1 presents an architectural overview of the proposed approach. To achieve goals (i) and (ii), we propose the following specialized loss function:

$$\begin{aligned} \mathcal {L}_{\text {VAE}}=\alpha \cdot \mathcal {L}_{\text {JSD}} + \beta \cdot \mathcal {L}_{\text {M}} + \gamma \cdot \mathcal {L}_{\rho _\textrm{s}} + \delta \cdot \mathcal {L}_{\perp }. \end{aligned}$$
(3)

Here, \(\mathcal {L}_{\text {JSD}}\), \(\mathcal {L}_{\text {M}}\), \(\mathcal {L}_{\rho _\textrm{s}}\), and \(\mathcal {L}_{\perp }\) are different loss terms and \(\alpha \), \(\beta \), \(\gamma \), and \(\delta \) are their relative scalar coefficients. The first three loss terms are introduced to steer the training process and penalize differences in terms of shape (\(\mathcal {L}_{\text {JSD}}\)), median (\(\mathcal {L}_{\text {M}}\)), and correlation (\(\mathcal {L}_{\rho _\textrm{s}}\)) between the input and reconstructed random variables (Goal i), while \(\mathcal {L}_{\perp }\) is responsible for decorrelating the dimensions of the latent space (Goal ii). It is worth noting that, to capture statistical features of both the input and reconstructed distributions, the computation of the loss function’s gradient considers the instances within the input batch collectively, rather than aggregating the contribution of each instance individually.

Fig. 3 METR-LA detectors and main roadways

Fig. 4 Chengdu road network maps

Fig. 5 The plot displays the correlations between a subset of 4 PEMS-BAY-16 latent space components. The distribution of the marginal components (with a mean of 0 and variance of 1) is shown on the diagonal

Table 1 Model architectures for the PEMS-BAY-16/METR-LA-16, PEMS-BAY-32/METR-LA-32, and PEMS-BAY-48/METR-LA-48 datasets

Going into more detail, the \(\mathcal {L}_{\text {JSD}}\) term (presented in Eq. 4) is included to force the model to generate a reconstructed joint distribution in which each constituent marginal distribution closely aligns with the corresponding input one. To achieve this, we employ the Jensen–Shannon Divergence (JSD) (Lin 1991), a metric quantifying the similarity between two distributions. To elaborate further, in our approach, we calculate the JSD to assess the dissimilarity between the kth input marginal variable (\(X^k_{\text {in}}\)) fed to the encoder and its reconstructed counterpart (\(X^k_{\text {out}}\)). We subsequently aggregate these measurements across all marginal variables (\(k=1\dots N\)). The \(\mathcal {L}_{\text {JSD}}\) term is calculated as follows:

$$\begin{aligned} \mathcal {L}_{\text {JSD}}=\sum _{k=1}^{N}{\sqrt{\frac{1}{2}\cdot KL(X_{\text {out}}^k,m_k)+\frac{1}{2}\cdot KL(X_{\text {in}}^k,m_k)}}. \end{aligned}$$
(4)

Here, \(m_k = \frac{1}{2}\cdot (X_{\text {in}}^k + X_{\text {out}}^k)\) and KL represents the Kullback–Leibler divergence (Kullback and Leibler 1951). The loss term \(\mathcal {L}_{\text {M}}\) (presented in Eq. 5) measures the aggregated distance between the medians of each input marginal variable (\(\widetilde{X}_{\text {in}}^k\)) and its reconstructed counterpart (\(\widetilde{X}_{\text {out}}^k\)):

$$\begin{aligned} \mathcal {L}_{\text {M}}=\sum _{k=1}^{N}{|\widetilde{X}_{\text {in}}^k- \widetilde{X}_{\text {out}}^k|}. \end{aligned}$$
(5)
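For illustration, the sketch below approximates \(\mathcal {L}_{\text {JSD}}\) and \(\mathcal {L}_{\text {M}}\) over a batch using per-marginal histograms. The bin count and all names are assumptions; hard histograms are not differentiable, so an actual training implementation would require a smooth relaxation.

```python
import torch

def jsd_loss(x_in, x_out, bins=50):
    """Sketch of Eq. (4): sum over marginals of the JS divergence between
    input and reconstructed empirical distributions (x_*: (batch, N))."""
    total = torch.tensor(0.0)
    for k in range(x_in.size(1)):
        lo = float(torch.min(x_in[:, k].min(), x_out[:, k].min()))
        hi = float(torch.max(x_in[:, k].max(), x_out[:, k].max()))
        p = torch.histc(x_in[:, k], bins=bins, min=lo, max=hi) + 1e-8
        q = torch.histc(x_out[:, k], bins=bins, min=lo, max=hi) + 1e-8
        p, q = p / p.sum(), q / q.sum()
        m = 0.5 * (p + q)
        kl_pm = (p * (p / m).log()).sum()   # KL(X_in^k, m_k)
        kl_qm = (q * (q / m).log()).sum()   # KL(X_out^k, m_k)
        total = total + torch.sqrt(0.5 * kl_qm + 0.5 * kl_pm)
    return total

def median_loss(x_in, x_out):
    """Sketch of Eq. (5): summed distance between per-marginal medians."""
    return (x_in.median(dim=0).values - x_out.median(dim=0).values).abs().sum()
```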

The loss term \(\mathcal {L}_{\rho _\textrm{s}}\) is introduced to penalize the dissimilarity between the input and reconstructed variables in terms of their correlation, as quantified by the Spearman correlation. Like the Pearson correlation (Pearson 1895), the Spearman correlation (\(\rho _\textrm{s}\)) (Spearman 1904; Schober et al. 2018) measures the statistical dependence between two random variables, but it does so through their ranks. Whereas the Pearson correlation captures only linear relationships, the Spearman correlation captures any monotonic relationship, whether linear or non-linear. The Spearman correlation is calculated as follows:

$$\begin{aligned} \rho _\textrm{s} {(X,Y)} = \frac{\text {cov}(rg_X, rg_Y)}{\sigma _{rg_X} \sigma _{rg_Y}}. \end{aligned}$$
(6)

Here, \(rg_X\) and \(rg_Y\) represent the ranks associated with the random variables X and Y. Going into more detail, in our approach, we compute Spearman correlation matrices for every possible pair of marginal variables in both the input and reconstructed batches. Subsequently, the loss term \(\mathcal {L}_{\rho _\textrm{s}}\) is computed as the aggregated distance between the input and reconstructed correlation matrices, as follows:

$$\begin{aligned} \mathcal {L}_{\rho _\textrm{s}} = \sum _{k=1}^{N}{\sum _{l=1}^{N}{\left| \rho _\textrm{s}{(X^k_{\text {in}},X^l_{\text {in}}})-\rho _\textrm{s}{(X^k_{\text {out}},X^l_{\text {out}})} \right| }}. \end{aligned}$$
(7)

Here, \(\rho _\textrm{s}{(X^k_{\text {in}},X^l_{\text {in}})}\) represents the Spearman correlation calculated for the kth and lth input marginal variables, while \(\rho _\textrm{s}{(X^k_{\text {out}},X^l_{\text {out}})}\) is the Spearman correlation of their reconstructed counterparts. Finally, to drive the training process to generate an uncorrelated latent space, the term \(\mathcal {L}_{\perp }\) is incorporated into the loss function, drawing inspiration from Yoo et al. (2021). The rationale behind this choice lies in the idea that a latent space comprising uncorrelated variables compels the decoder to acquire an understanding of the relationships between its constituent components, as these relationships are no longer encoded in the latent space. Consequently, it becomes possible to generate verisimilar instances, i.e., instances that are consistent with the joint distribution used to train the VAE, by providing the decoder with uncorrelated Gaussian noise. The \(\mathcal {L}_{\perp }\) loss term is calculated as follows:

$$\begin{aligned} \mathcal {L}_{\perp }=\sum _{k=1}^{Z}{\sum _{i=1}^{Z}{\left| \frac{1}{M}\left( \sum _{j=1}^{M}{z_j^k z_j^i}\right) -\frac{1}{M^2}\left( \sum _{j=1}^{M}{z_j^k} \sum _{j=1}^{M}{z_j^i}\right) \right| }}, \end{aligned}$$
(8)

where Z represents the dimension of the latent space, M is the batch size, and \(z^k_j\) and \(z^i_j\) denote the values of the kth and ith latent space variables for the jth batch instance, respectively.
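The two correlation-related terms can be sketched as follows; ranking is non-differentiable, so a soft ranking would be needed to backpropagate through \(\mathcal {L}_{\rho _\textrm{s}}\) in practice, while Eq. 8 is directly differentiable. All function names are assumptions.

```python
import torch

def decorrelation_loss(z):
    """Sketch of Eq. (8): penalizes covariance between the latent
    dimensions across the batch; z has shape (M, Z)."""
    M = z.size(0)
    cross = z.t() @ z / M                # entries (1/M) * sum_j z_j^k z_j^i
    means = z.mean(dim=0, keepdim=True)  # entries (1/M) * sum_j z_j^k
    return (cross - means.t() @ means).abs().sum()

def spearman_matrix(x):
    """Spearman correlation matrix of a (batch, N) sample: rank each
    marginal, then take the Pearson correlation of the ranks."""
    ranks = x.argsort(dim=0).argsort(dim=0).float()
    return torch.corrcoef(ranks.t())

def spearman_loss(x_in, x_out):
    """Sketch of Eq. (7): elementwise distance between the input and
    reconstructed Spearman correlation matrices."""
    return (spearman_matrix(x_in) - spearman_matrix(x_out)).abs().sum()
```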

Experimental Evaluation

This section shows how the proposed GAN-based model for scenario generation can learn a multivariate distribution and generate scenarios that follow the same distribution as the original dataset. Specifically, the generated scenarios are consistent and coherent with the dataset instances.

Datasets

To evaluate our proposed model, we consider three datasets, PEMS-BAY, METR-LA, and CHENGDU.

PEMS-BAY (Li et al. 2017) is a dataset collected by the California Transportation Agencies Performance Measurement System (PeMS). The dataset contains average vehicle speed values computed every 5 min for each of the 325 speed detectors placed on the urban road network (Fig. 2) of Santa Clara (California, USA), between January 1st, 2017 and June 30th, 2017. The total number of observed traffic data points is 16,937,179.

METR-LA (Jagadish et al. 2014) is a dataset consisting of average vehicle speed values collected by 207 detectors on the highways of Los Angeles County (Fig. 3). The data spans 4 months, from March 1st, 2012, to June 30th, 2012, and encompasses a total of 6,519,002 observed traffic data points.

Both the PEMS-BAY and METR-LA datasets provide speed readings for different sections of 8 main roadways in both directions. For our experiments, we initially considered 16 marginal variables, obtained by averaging the data from all sensors associated with each direction of the same road.

Table 2 CHENGDU dataset architecture model—16 roads

The CHENGDU dataset consists of data on the Chengdu road network (Gao et al. 2021), as described in Guo et al. (2019). Chengdu is a megacity in western China. Recorded at 2-min intervals across five representative time horizons (3:00–5:00, 8:00–10:00, 12:00–14:00, 17:00–19:00, and 21:00–23:00), the dataset encompasses 5943 individual road segments (Fig. 4). To be consistent with the previous datasets, we considered the 16 main road arteries.

Fig. 6 Correlation matrix plots for the three datasets analyzed, where the ith row and jth column refer to the ith and jth marginal variable, respectively. On the left, we report the correlation matrices of the empirical data and on the right, those of the generated data

Fig. 7 Data distribution plots for marginal variables: the data distribution of the considered marginal variable for real data is shown in blue, while the distribution for generated data is shown in yellow. Original images and data distributions for PEMS-BAY-32, PEMS-BAY-48, METR-LA-32, and METR-LA-48 can be found on GitHub (Carbonera 2023)

Model Architecture

The PEMS-BAY and METR-LA datasets exhibit similar structures and data volumes, allowing for the use of the same general architecture with consistent hyperparameters. In contrast, the CHENGDU dataset necessitates a specialized configuration due to its different structure. The specifics of the model structures are presented below.

For the PEMS-BAY-16 and METR-LA-16 datasets, the Variational AutoEncoder (VAE) component of the proposed approach features three linear layers of heterogeneous dimensions in the encoder, mirrored in the decoder to ensure symmetry in the model’s representation learning. The input and output dimensions are \(\mathbb {R}^{16\times 1}\), and the latent space size is \(\mathbb {R}^{12\times 1}\). In addition, the latent space layer is preceded by a batch normalization layer, so that the resulting Generator in the GAN architecture operates on normally distributed data with mean 0 and variance 1. Figure 5 illustrates the pairwise correlations among the components of the VAE latent space when fed with the PEMS-BAY-16 dataset. The decoder generates scenarios that approximate the real data distribution starting from an uncorrelated, Gaussian-distributed latent space, suggesting that it learns the distributions and correlations of the original data. In this way, we can use uncorrelated Gaussian-sampled values as input to the decoder to generate scenarios. It is worth noting that the decorrelation observed in the figure results from the specific loss function employed during model training.
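A simplified, deterministic sketch of this mirrored structure is shown below; the hidden sizes (64 and 32) are illustrative placeholders rather than the experimentally selected values of Table 1, and the full VAE machinery is omitted for brevity.

```python
import torch
import torch.nn as nn

N, Z = 16, 12  # input/output dimension and latent size for the -16 datasets

encoder = nn.Sequential(
    nn.Linear(N, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, Z),
    nn.BatchNorm1d(Z),  # keeps the latent space close to mean 0, variance 1
)
decoder = nn.Sequential(  # mirrors the encoder
    nn.Linear(Z, 32), nn.ReLU(),
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, N),
)

# After training, the decoder doubles as the GAN Generator: feeding it
# uncorrelated Gaussian noise yields speed scenarios.
scenarios = decoder(torch.randn(1000, Z))
```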

The generative component (GAN) is composed of a Discriminator featuring three linear layers, while the Generator corresponds to the Decoder of the Variational AutoEncoder (VAE), as previously described. The architectural configuration of both the VAE and GAN components for the PEMS-BAY-16 and METR-LA-16 datasets, where network parameters such as the number of layers and neurons were chosen experimentally, is presented in the first column of Table 1. The model architectures for the PEMS-BAY-32/METR-LA-32 and PEMS-BAY-48/METR-LA-48 datasets are reported in the second and third columns of Table 1.

Since the CHENGDU dataset comprises a significantly larger number of samples, it allowed us to explore deeper parameterizations of the generative model proposed in this study. The availability of large data volumes enables deeper networks to capture hierarchical features and abstractions within the data, which proves particularly advantageous when dealing with complex patterns or nuanced information inherent in traffic-related phenomena. Furthermore, training with large sample collections helps mitigate concerns about overfitting and promotes generalization. As a result, both the VAE and the GAN Generator have five hidden layers (instead of three), while the latent space of the VAE remains the same size (\(\mathbb {R}^{12\times 1}\)). Additional details are provided in Table 2.

Computational Results

The analysis of the results obtained on the PEMS-BAY-16, METR-LA-16, and CHENGDU datasets reveals our approach’s proficiency in learning and replicating the actual correlation among random variables. To qualitatively assess our approach, we present in Fig. 6 the correlation matrices of both the original and generated distributions for PEMS-BAY-16, METR-LA-16, and CHENGDU. The visualizations reveal strong similarities in patterns, suggesting that our proposed model effectively captures the correlations.

Table 3 PEMS-BAY-16 dataset
Table 4 METR-LA-16 dataset

In Fig. 7, we show a visual comparison between the original and generated distributions of different marginal variables, each associated with the speed on the corresponding road. It can be observed that empirical distributions of different shapes are well represented.

To benchmark our method against the state of the art, we conducted a comparative analysis between our proposed approach and Gaussian copulas, a widely used technique for scenario generation. Tables 3, 4, and 5 display a comprehensive summary of statistical measures for each random variable. Specifically, we present the mean and standard deviation of the actual empirical distribution, alongside the distributions derived from 1000 instances generated by each approach.
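For reference, the Gaussian-copula baseline can be sketched as follows: each marginal is mapped to normal scores through its empirical CDF, the correlation of the scores is estimated, correlated normals are sampled, and the samples are pushed back through the empirical quantiles. Function and variable names are assumptions.

```python
import numpy as np
from scipy import stats

def gaussian_copula_sample(data, n_samples=1000, seed=0):
    """Sketch of the Gaussian-copula baseline; data: (n, N) observed speeds."""
    rng = np.random.default_rng(seed)
    n, N = data.shape
    # Empirical CDF values in (0, 1), then normal scores.
    u = stats.rankdata(data, axis=0) / (n + 1)
    z = stats.norm.ppf(u)
    corr = np.corrcoef(z, rowvar=False)
    # Sample correlated normals and push them back through the marginals.
    sims = rng.multivariate_normal(np.zeros(N), corr, size=n_samples)
    u_new = stats.norm.cdf(sims)
    samples = np.empty((n_samples, N))
    for k in range(N):
        samples[:, k] = np.quantile(data[:, k], u_new[:, k])
    return samples
```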

Table 5 CHENGDU dataset
Table 6 Wasserstein distances—10 repetitions

For our comparative analysis of distribution shapes, we also computed the Wasserstein distance between the original marginal distributions and the ones generated by our model and by the Copula model, respectively. Table 6 shows that our model performs better than or comparably to the Copula method in generating the less common instances in the distribution tails, with significantly lower computational effort. Further insights are gained by examining these tables in conjunction with the marginal distributions depicted in Fig. 7. The results indicate that, when a marginal distribution deviates from the Gaussian, the Copula-based model becomes less effective at replicating the true distribution, reporting a higher Wasserstein distance than our model. This underscores our model’s substantial capacity to learn and faithfully replicate complex joint probability distributions directly from data, without making any assumptions about the distribution of the underlying variables. Notably, this ability is evident in scenarios such as bimodal distributions, as seen in marginal variables #9 and #11 in the METR-LA-16 dataset (Fig. 7b), distributions characterized by outliers, like marginal variables #9 and #11 in the PEMS-BAY-16 dataset (Fig. 7a), or distributions exhibiting pronounced skewness, for instance, marginal variables #0 and #14 in the METR-LA-16 dataset (Fig. 7b), and marginal variables #5 and #10 in the CHENGDU dataset (Fig. 7c).
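The per-marginal comparison uses the one-dimensional Wasserstein distance, available in SciPy; a minimal sketch with synthetic stand-in arrays:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
real = rng.normal(60, 8, size=5000)       # stand-in for observed speeds on road k
generated = rng.normal(60, 8, size=1000)  # stand-in for 1000 generated instances

d = wasserstein_distance(real, generated)  # lower = closer marginal shapes
```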

To study the ability of our model to learn spatial correlations with an increased number of marginal components, we split each road of the PEMS-BAY and METR-LA datasets into sub-segments, obtaining models with 32 marginal variables (by splitting each road into 2 segments) and 48 marginal variables (by splitting each road into 3 segments). The resulting datasets are denoted PEMS-BAY-\(N\) and METR-LA-\(N\), where N is the number of road sections considered. The results demonstrate that the model effectively captures and reproduces the correlations present in the original data. Figure 8 visually presents the correlation matrices for both original and generated data from the PEMS-BAY-\(N\) and METR-LA-\(N\) datasets, with 32 and 48 road segments. Notably, the correlation patterns observed in the original data matrices (on the left) persist in the generated data (on the right).

For the models with 32 and 48 marginal variables, we also performed a comparative analysis with the Copula model. To enhance the robustness of our results, we performed 10 runs of the training and generation processes. The outcomes are presented in Tables 7, 8, 9, and 10.

Finally, in Fig. 9 we show, for each dataset, the 2D t-SNE (t-Distributed Stochastic Neighbor Embedding) plots (Maaten and Hinton 2008) of the data distributions, representing the real data (blue points), the GAN-generated data (orange points), and the Copula-generated data (green points); for comparison, the Wasserstein distances are reported in Tables 6 and 9.

t-SNE is a widely used method for visualizing multidimensional data by reducing it to two dimensions. t-SNE plots allow us to evaluate the ability of our model to learn features and patterns from the real data: the closer the generated data are to the real data, the more their t-SNE representations overlap (Tai et al. 2023). The t-SNE representations of the GAN-generated data reveal a substantial congruence with those derived from the real data. Examining the Wasserstein distance in the METR-LA-16 dataset (see the second column of Table 6), one can observe a higher distance for marginal variables #3, #6, and #7. Similarly, in the METR-LA-32 dataset (see Table 9), marginal variables #10, #14, and #25 exhibit higher Wasserstein distances. Moreover, the mismatch between real and generated data in the t-SNE plots is more pronounced in the smaller METR-LA datasets, specifically those with 16 and 32 road segments, than in the larger METR-LA-48 dataset with 48 road segments. This seemingly counterintuitive phenomenon can be explained by the dataset creation process, which aggregates average speed information at different scales across segments belonging to the same roads. When the aggregation scale is larger, the resulting road segments are, on average, longer, leading to heightened variability in average speeds and longer-tailed distributions. In contrast, shorter segments (as in the METR-LA-48 dataset) result in reduced variability. Furthermore, considering shorter segments allows the model to more effectively discern latent patterns associated with exogenous factors that may influence traffic continuously or periodically, such as Points of Interest.
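A sketch of how such a plot can be produced with scikit-learn and Matplotlib, using random stand-ins for the three data sources:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical stand-ins for real, GAN-generated, and copula-generated batches.
rng = np.random.default_rng(0)
real, gan, cop = (rng.random((300, 16)) for _ in range(3))

# Embed all three sources jointly so their 2D coordinates are comparable.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.vstack([real, gan, cop]))

for block, color, label in [(slice(0, 300), "blue", "real"),
                            (slice(300, 600), "orange", "GAN"),
                            (slice(600, 900), "green", "copula")]:
    plt.scatter(emb[block, 0], emb[block, 1], s=5, c=color, label=label)
plt.legend()
plt.show()
```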

Conclusion and Future Works

This work introduces a generative model for realistic traffic scenarios. The model aims to capture the marginal variable distributions and their correlations found in real data. Key contributions include: (i) a GAN model with a pre-trained VAE-based generator for scenario creation, (ii) a specialized loss function prompting the VAE to learn both the overall distributions and the variable correlations, and (iii) empirical evidence of the model’s ability to accurately replicate the underlying marginal distributions and correlations. This approach outperforms existing methods in faithfully reproducing complex distributions, ensuring that the generated instances are consistent with the real datasets. A thorough analysis, employing statistical indices and the Wasserstein distance to compare the generated and real distributions, has been conducted to assess the performance of our model against a Gaussian Copula-based approach. The findings indicate that our model outperforms the Copula-based model without requiring assumptions about the actual marginal distributions. The model architecture presented in this work can be used to solve logistics problems under road speed uncertainty: by capturing the correlations between roads, robust solutions, resilient to the inherent uncertainties in traffic data, can be found. Future advancements may focus on exploring Deep Learning architectures and techniques that incorporate the graph structure during both the training and generation processes. Finally, the proposed model presents opportunities for incorporating the evolution of temporal correlations, potentially through temporal Neural Networks based on attention mechanisms.

Fig. 8 Correlation matrix plots of the PEMS-BAY datasets for the versions with (a) 32 road segments and (b) 48 road segments, where the ith row and jth column refer to the ith and jth marginal variable, respectively. On the left we show the correlation matrices of the empirical data and on the right those of the generated data

Table 7 PEMS-BAY-32 dataset—Wasserstein distances—10 repetitions
Table 8 PEMS-BAY-48 dataset—Wasserstein distances—10 repetitions
Table 9 METR-LA-32 dataset—Wasserstein distances—10 repetitions
Table 10 METR-LA-48 dataset—Wasserstein distances—10 repetitions
Fig. 9 t-SNE visualization of real data (blue), data generated by our approach (orange), and data generated by the copula method (green) for the PEMS-BAY-16/METR-LA-16, PEMS-BAY-32/METR-LA-32, and PEMS-BAY-48/METR-LA-48 datasets