Abstract
In this chapter, we will consider the case where \(x_{k}\) evolves with time following a slightly more complicated state equation and gives rise to both a binary observation \(n_{k}\) and a continuous observation \(r_{k}\). Prior to looking into the equation derivations, however, as in the previous chapter, we will again first consider a few example scenarios where the need for such a model arises.
In the previous chapter, we considered the estimation of a continuous-valued learning state \(x_{k}\) based on correct/incorrect responses in a sequence of experimental trials. Based on a state-space model consisting of \(x_{k}\) and the binary observations \(n_{k}\), the cognitive learning state of an animal could be estimated over time [4]. Note, however, that it is not just the correct/incorrect responses that contain information regarding the animal’s learning state. How fast the animal responds also reflects changes in learning. For instance, as an animal gradually begins to learn to recognize a specific visual target, not only do the correct answers begin to occur more frequently, but the time taken to respond in each of the trials also starts decreasing (Fig. 4.1). Thus, a state-space model with both a binary observation \(n_{k}\) and a continuous observation \(r_{k}\) was developed in [5] to estimate learning. This was an improvement over the earlier model in [4].
This particular state-space model is not limited to cognitive learning; it can be adapted to other applications as well. Human emotion is typically accounted for along two different axes known as valence and arousal [55]. Valence denotes the pleasant–unpleasant nature of an emotion, while arousal denotes its corresponding activation or excitement. Emotional arousal is closely tied to the activation of the sympathetic nervous system [56, 57]. Changes in arousal can occur regardless of the valence of the emotion (i.e., arousal can be high when the emotion is negative, as in the case of rage, or when it is positive, as in the case of excitement). As we saw in the earlier chapter, skin conductance is a sensitive index of arousal. Changes in emotional valence, on the other hand, often cause changes in facial expressions. Information regarding these facial expressions can be captured via electromyography (EMG) sensors attached to the face. The state-space model with one binary observation \(n_{k}\) and one continuous observation \(r_{k}\) was used in [27] for an emotional valence recognition application based on EMG signals. In [27], Yadav et al. extracted both a binary feature and a continuous feature based on EMG amplitudes and powers from data in an experiment where subjects watched a series of music videos meant to evoke different emotions. Based on the model, they were able to extract a continuous-valued emotional valence state \(x_{k}\) over time. The same model was also used in [28] for detecting epileptic seizures. Here, the authors extracted a binary feature and a continuous feature from scalp EEG signals to detect the occurrence of epileptic seizures. Based on the features, a continuous-valued seizure severity state could be tracked over time. These examples serve to illustrate the possibility of using physiological state-space models for a wide variety of applications.
4.1 Deriving the Predict Equations in the State Estimation Step
Let us now consider the state-space model itself. Assume that \(x_{k}\) varies with time as
\[x_{k} = \rho x_{k - 1} + \varepsilon _{k},\]
where \(\rho \) is a constant (forgetting factor) and \(\varepsilon _{k} \sim \mathcal {N}(0, \sigma ^{2}_{\varepsilon })\). As in the previous chapter, we will, for the time being, ignore that this is part of a state-space control system, and instead view the equation purely as a relationship between three random variables. As before, we will also consider the derivation of the mean and variance of \(x_{k}\) using basic formulas. We first consider the mean.
\[E[x_{k}] = E[\rho x_{k - 1} + \varepsilon _{k}] = \rho E[x_{k - 1}] + E[\varepsilon _{k}] = \rho E[x_{k - 1}]\]
Next we consider the variance. Since \(x_{k - 1}\) and \(\varepsilon _{k}\) are independent,
\[\text {var}(x_{k}) = \text {var}(\rho x_{k - 1}) + \text {var}(\varepsilon _{k}) = \rho ^{2}\text {var}(x_{k - 1}) + \sigma ^{2}_{\varepsilon }.\]
Now that we know the mean and variance of \(x_{k}\), we can use the fact that it is also Gaussian distributed to state that
\[x_{k} \sim \mathcal {N}\big (\rho E[x_{k - 1}],\; \rho ^{2}\text {var}(x_{k - 1}) + \sigma ^{2}_{\varepsilon }\big ).\]
When \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \varepsilon _{k}\), the predict equations in the state estimation step are
\[x_{k|k - 1} = \rho x_{k - 1|k - 1}\]
\[\sigma ^{2}_{k|k - 1} = \rho ^{2}\sigma ^{2}_{k - 1|k - 1} + \sigma ^{2}_{\varepsilon }.\]
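As a concrete illustration, the predict step amounts to two lines of code. The sketch below is in Python rather than the chapter's MATLAB, and the function and variable names are our own:

```python
def predict(x_prev, v_prev, rho, ve):
    """One predict step for the AR(1) state x_k = rho * x_{k-1} + eps_k.

    x_prev, v_prev -- filtered mean x_{k-1|k-1} and variance sigma^2_{k-1|k-1}
    rho            -- forgetting factor
    ve             -- process noise variance sigma^2_eps
    """
    x_pred = rho * x_prev             # x_{k|k-1} = rho * x_{k-1|k-1}
    v_pred = rho ** 2 * v_prev + ve   # sigma^2_{k|k-1} = rho^2 sigma^2_{k-1|k-1} + sigma^2_eps
    return x_pred, v_pred
```

Note that setting \(\rho = 1\) recovers the simpler predict equations of the previous chapter.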
4.2 Deriving the Update Equations in the State Estimation Step
In the current model, \(x_{k}\) gives rise to a continuous-valued observation \(r_{k}\) in addition to \(n_{k}\). We shall assume that \(x_{k}\) is related to \(r_{k}\) through the linear relationship
\[r_{k} = \gamma _{0} + \gamma _{1}x_{k} + v_{k},\]
where \(\gamma _{0}\) and \(\gamma _{1}\) are constants and \(v_{k} \sim \mathcal {N}(0, \sigma ^{2}_{v})\) is sensor noise. Our sensor readings \(y_{k}\) now consist of both \(r_{k}\) and \(n_{k}\). What would be the best estimate of \(x_{k}\) once we have observed \(y_{k}\)? Just like in the previous chapter, we will make use of the result in (2.16) to derive this estimate. First, however, we need to note that
\[r_{k}|x_{k} \sim \mathcal {N}\big (\gamma _{0} + \gamma _{1}x_{k},\; \sigma ^{2}_{v}\big ).\]
This can be easily verified from (4.15). We are now ready to derive the best estimate (mean) for \(x_{k}\) and its uncertainty. We will need to make use of \(p(r_{k}|x_{k})\), \(p(n_{k}|x_{k})\), and \(p(x_{k}|n_{1:k - 1}, r_{1:k - 1})\) to derive this estimate. Note that we now have an additional exponent term for \(r_{k}\) in \(p(x_{k}|n_{1:k}, r_{1:k})\). Using (2.16), we have
\[p(x_{k}|n_{1:k}, r_{1:k}) \propto p(r_{k}|x_{k})\,p(n_{k}|x_{k})\,p(x_{k}|n_{1:k - 1}, r_{1:k - 1}).\]
Taking the log on both sides, we have
\[q = n_{k}\log p_{k} + (1 - n_{k})\log (1 - p_{k}) - \frac {(r_{k} - \gamma _{0} - \gamma _{1}x_{k})^{2}}{2\sigma ^{2}_{v}} - \frac {(x_{k} - x_{k|k - 1})^{2}}{2\sigma ^{2}_{k|k - 1}} + \text {constant}.\]
The mean and variance of \(x_{k}\) can now be derived by taking the first and second derivatives of q. Making use of (3.39), we have
\[\frac {dq}{dx_{k}} = (n_{k} - p_{k}) + \frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k})}{\sigma ^{2}_{v}} - \frac {(x_{k} - x_{k|k - 1})}{\sigma ^{2}_{k|k - 1}} = 0.\]
We will use a small trick to solve for \(x_{k}\) in the equation above. We will add and subtract the term \(\gamma _{1}x_{k|k - 1}\) in the term involving \(r_{k}\).
\[\frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k})}{\sigma ^{2}_{v}} = \frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k|k - 1})}{\sigma ^{2}_{v}} - \frac {\gamma _{1}^{2}(x_{k} - x_{k|k - 1})}{\sigma ^{2}_{v}}\]
This yields the mean update
\[x_{k|k} = x_{k|k - 1} + \frac {\sigma ^{2}_{k|k - 1}\sigma ^{2}_{v}}{\sigma ^{2}_{v} + \gamma _{1}^{2}\sigma ^{2}_{k|k - 1}}\left [\frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k|k - 1})}{\sigma ^{2}_{v}} + (n_{k} - p_{k})\right ].\]
Again, to clarify the explicit dependence of \(p_{k}\) on \(x_{k}\) and the fact that this is the estimate of \(x_{k}\) having observed \(n_{1:k}\) and \(r_{1:k}\) (the sensor readings up to time index k), we shall say
\[x_{k|k} = x_{k|k - 1} + \frac {\sigma ^{2}_{k|k - 1}\sigma ^{2}_{v}}{\sigma ^{2}_{v} + \gamma _{1}^{2}\sigma ^{2}_{k|k - 1}}\left [\frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k|k - 1})}{\sigma ^{2}_{v}} + \big (n_{k} - p_{k|k}\big )\right ],\]
where \(p_{k|k} = \big [1 + e^{-(\beta _{0} + x_{k|k})}\big ]^{-1}\).
We next take the second derivative of q similar to (3.40). This yields
\[\frac {d^{2}q}{dx_{k}^{2}} = -p_{k}(1 - p_{k}) - \frac {\gamma _{1}^{2}}{\sigma ^{2}_{v}} - \frac {1}{\sigma ^{2}_{k|k - 1}}.\]
Based on (2.21), the uncertainty or variance associated with the new state estimate \(x_{k|k}\), therefore, is
\[\sigma ^{2}_{k|k} = \left [\frac {1}{\sigma ^{2}_{k|k - 1}} + \frac {\gamma _{1}^{2}}{\sigma ^{2}_{v}} + p_{k|k}(1 - p_{k|k})\right ]^{-1}.\]
When \(x_{k}\) gives rise to a binary observation \(n_{k}\) and a continuous observation \(r_{k}\), the update equations in the state estimation step are
\[x_{k|k} = x_{k|k - 1} + \frac {\sigma ^{2}_{k|k - 1}\sigma ^{2}_{v}}{\sigma ^{2}_{v} + \gamma _{1}^{2}\sigma ^{2}_{k|k - 1}}\left [\frac {\gamma _{1}(r_{k} - \gamma _{0} - \gamma _{1}x_{k|k - 1})}{\sigma ^{2}_{v}} + \big (n_{k} - p_{k|k}\big )\right ]\]
\[\sigma ^{2}_{k|k} = \left [\frac {1}{\sigma ^{2}_{k|k - 1}} + \frac {\gamma _{1}^{2}}{\sigma ^{2}_{v}} + p_{k|k}(1 - p_{k|k})\right ]^{-1}.\]
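Because \(p_{k|k}\) depends on \(x_{k|k}\), the mean update is implicit and is typically solved numerically. Below is a minimal sketch using Newton's method, in Python rather than the chapter's MATLAB; the function and variable names are our own, and the logistic form \(p_{k} = [1 + e^{-(\beta _{0} + x_{k})}]^{-1}\) is assumed as in the previous chapter:

```python
import math

def update(x_pred, v_pred, n_k, r_k, beta0, g0, g1, vr, n_iter=20):
    """Solve the implicit mean update by Newton's method, then the variance.

    x_pred, v_pred -- predicted mean and variance (x_{k|k-1}, sigma^2_{k|k-1})
    n_k, r_k       -- binary and continuous observations at time k
    g0, g1, vr     -- gamma_0, gamma_1, and sigma^2_v
    """
    gain = v_pred * vr / (vr + g1 ** 2 * v_pred)
    x = x_pred  # initial guess
    for _ in range(n_iter):
        p = 1.0 / (1.0 + math.exp(-(beta0 + x)))
        # f(x) = 0 restates the mean update; its root is x_{k|k}
        f = x_pred + gain * (g1 * (r_k - g0 - g1 * x_pred) / vr + n_k - p) - x
        df = -gain * p * (1.0 - p) - 1.0
        x -= f / df
    p = 1.0 / (1.0 + math.exp(-(beta0 + x)))
    v = 1.0 / (1.0 / v_pred + g1 ** 2 / vr + p * (1.0 - p))
    return x, v
```

At the returned root, the stationarity condition \(dq/dx_{k} = 0\) holds, which is a convenient way to verify an implementation.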
4.3 Deriving the Parameter Estimation Step Equations
In the previous chapter, we only needed to derive the update equation for the process noise variance \(\sigma ^{2}_{\varepsilon }\) at the parameter estimation step. In the current model, we have a few more parameters. Thus we will need to derive the update equations for \(\rho \), \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) in addition to the update for \(\sigma ^{2}_{\varepsilon }\).
4.3.1 Deriving the Process Noise Variance
The derivation of the process noise variance update is very similar to the earlier case in the preceding chapter. In fact, the only difference from (3.62) is that we will now have \(\rho x_{k - 1}\) in the log-likelihood term instead of \(x_{k - 1}\). We shall label the required log-likelihood term \(Q_{1}\).
\[Q_{1} = -\frac {K}{2}\log \big (2\pi \sigma ^{2}_{\varepsilon }\big ) - \frac {1}{2\sigma ^{2}_{\varepsilon }}\sum _{k = 1}^{K}E\big [(x_{k} - \rho x_{k - 1})^{2}\big ]\]
We take the partial derivative of \(Q_{1}\) with respect to \(\sigma ^{2}_{\varepsilon }\) and set it to 0 to solve for the parameter estimation step update.
The parameter estimation step update for \(\sigma ^{2}_{\varepsilon }\) when \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \varepsilon _{k}\) is
\[\sigma ^{2}_{\varepsilon } = \frac {1}{K}\sum _{k = 1}^{K}E\big [(x_{k} - \rho x_{k - 1})^{2}\big ] = \frac {1}{K}\sum _{k = 1}^{K}\big (E[x_{k}^{2}] - 2\rho E[x_{k}x_{k - 1}] + \rho ^{2}E[x_{k - 1}^{2}]\big ),\]
where the expectations are evaluated using the smoothed estimates.
4.3.2 Deriving the Forgetting Factor
We will take the partial derivative of \(Q_{1}\) in (4.32) with respect to \(\rho \) and set it to 0 to solve for its parameter estimation step update.
The parameter estimation step update for \(\rho \) when \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \varepsilon _{k}\) is
\[\rho = \frac {\sum _{k = 1}^{K}E[x_{k}x_{k - 1}]}{\sum _{k = 1}^{K}E[x_{k - 1}^{2}]}.\]
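Both of these updates depend only on the smoothed posterior moments \(E[x_{k}^{2}]\) and \(E[x_{k}x_{k - 1}]\). A small Python sketch (the naming and the way the moments are stored are our own assumptions, not the chapter's MATLAB code):

```python
def update_rho_ve(W, W_lag):
    """Parameter updates for rho and sigma^2_eps from smoothed second moments.

    W     -- E[x_k^2] for k = 0..K (index 0 holds the initial state), length K+1
    W_lag -- E[x_k x_{k-1}] for k = 1..K, length K
    """
    K = len(W_lag)
    rho = sum(W_lag) / sum(W[:-1])  # ratio of summed cross moments to lagged moments
    # sigma^2_eps = (1/K) sum( E[x_k^2] - 2 rho E[x_k x_{k-1}] + rho^2 E[x_{k-1}^2] )
    ve = sum(W[k] - 2.0 * rho * W_lag[k - 1] + rho ** 2 * W[k - 1]
             for k in range(1, K + 1)) / K
    return rho, ve
```

As a sanity check, feeding in moments from a noiseless sequence \(x_{k} = 0.5x_{k - 1}\) recovers \(\rho = 0.5\) and a process noise variance of zero.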
4.3.3 Deriving the Constant Coefficient Terms
We will next consider the model parameters that are related to \(r_{k}\). Recall from (3.57) that we need to maximize the expected value of the log of the joint probability
In the current state-space model, \(y_{k}\) comprises both \(n_{k}\) and \(r_{k}\). The probability term containing \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) is
\[\prod _{k = 1}^{K}p(r_{k}|x_{k}) = \prod _{k = 1}^{K}\frac {1}{\sqrt {2\pi \sigma ^{2}_{v}}}\exp \left [-\frac {(r_{k} - \gamma _{0} - \gamma _{1}x_{k})^{2}}{2\sigma ^{2}_{v}}\right ].\]
Let us first take the log of this term followed by the expected value. Labeling this as \(Q_{2}\), we have
\[Q_{2} = -\frac {K}{2}\log \big (2\pi \sigma ^{2}_{v}\big ) - \frac {1}{2\sigma ^{2}_{v}}\sum _{k = 1}^{K}E\big [(r_{k} - \gamma _{0} - \gamma _{1}x_{k})^{2}\big ].\]
To solve for \(\gamma _{0}\) and \(\gamma _{1}\), we have to take the partial derivatives of \(Q_{2}\) with respect to \(\gamma _{0}\) and \(\gamma _{1}\), set them each to 0, and solve the resulting equations. We first take the partial derivative with respect to \(\gamma _{0}\).
This provides one equation containing the two unknowns \(\gamma _{0}\) and \(\gamma _{1}\). We next take the partial derivative with respect to \(\gamma _{1}\).
This provides the second equation necessary to solve for \(\gamma _{0}\) and \(\gamma _{1}\).
The parameter estimation step updates for \(\gamma _{0}\) and \(\gamma _{1}\) when we observe a continuous variable \(r_{k} = \gamma _{0} + \gamma _{1}x_{k} + v_{k}\) are
\[\gamma _{1} = \frac {K\sum _{k = 1}^{K}r_{k}E[x_{k}] - \big (\sum _{k = 1}^{K}r_{k}\big )\big (\sum _{k = 1}^{K}E[x_{k}]\big )}{K\sum _{k = 1}^{K}E[x_{k}^{2}] - \big (\sum _{k = 1}^{K}E[x_{k}]\big )^{2}}\]
\[\gamma _{0} = \frac {1}{K}\left (\sum _{k = 1}^{K}r_{k} - \gamma _{1}\sum _{k = 1}^{K}E[x_{k}]\right ).\]
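The pair of normal equations amounts to a least-squares fit of \(r_{k}\) against the smoothed state estimates. A Python sketch (our own naming, not the chapter's MATLAB listing):

```python
def update_gammas(r, ex, exx):
    """Solve the two normal equations for gamma_0 and gamma_1.

    r   -- continuous observations r_k, k = 1..K
    ex  -- smoothed means E[x_k]
    exx -- smoothed second moments E[x_k^2]
    """
    K = len(r)
    sr, sx, sxx = sum(r), sum(ex), sum(exx)
    srx = sum(rk * exk for rk, exk in zip(r, ex))  # sum of r_k * E[x_k]
    g1 = (K * srx - sr * sx) / (K * sxx - sx ** 2)
    g0 = (sr - g1 * sx) / K
    return g0, g1
```

With noiseless data generated as \(r_{k} = 2 + 3x_{k}\), the sketch recovers \(\gamma _{0} = 2\) and \(\gamma _{1} = 3\) exactly.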
4.3.4 Deriving the Sensor Noise Variance
The term \(Q_{2}\) in (4.44) also contains the sensor noise variance \(\sigma ^{2}_{v}\).
We take its partial derivative with respect to \(\sigma ^{2}_{v}\) and set it to 0 to solve for \(\sigma ^{2}_{v}\).
The parameter estimation step update for \(\sigma ^{2}_{v}\) when we observe a continuous variable \(r_{k} = \gamma _{0} + \gamma _{1}x_{k} + v_{k}\) is
\[\sigma ^{2}_{v} = \frac {1}{K}\sum _{k = 1}^{K}E\big [(r_{k} - \gamma _{0} - \gamma _{1}x_{k})^{2}\big ] = \frac {1}{K}\sum _{k = 1}^{K}\big ((r_{k} - \gamma _{0})^{2} - 2\gamma _{1}(r_{k} - \gamma _{0})E[x_{k}] + \gamma _{1}^{2}E[x_{k}^{2}]\big ).\]
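Expanding the square shows that only \(E[x_{k}]\) and \(E[x_{k}^{2}]\) are needed, not the full posterior. A short Python sketch (naming is our own):

```python
def update_vr(r, ex, exx, g0, g1):
    """sigma^2_v = (1/K) * sum E[(r_k - g0 - g1 x_k)^2], expanded so that
    only the smoothed moments E[x_k] and E[x_k^2] are required."""
    K = len(r)
    return sum((rk - g0) ** 2 - 2.0 * g1 * (rk - g0) * m + g1 ** 2 * s
               for rk, m, s in zip(r, ex, exx)) / K
```

For a perfectly fitting noiseless example, the update returns zero, as expected.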
4.4 MATLAB Examples
The MATLAB code examples for implementing the EM algorithm described in this chapter are provided in the following folders:
- one_bin_one_cont\
  - sim\
    - data_one_bin_one_cont.mat
    - filter_one_bin_one_cont.m
  - expm\
    - expm_data_one_bin_two_cont.mat
    - expm_filter_one_bin_one_cont.m
Note that the code implements a slightly different version of what was discussed here in that the state equation does not contain \(\rho \). Code examples containing \(\rho \) and \(\alpha I_{k}\) are provided in the following chapter for the case where one binary and two continuous observations are present in the state-space model. The current code can easily be modified if \(\rho \) is to be included.
The code for this particular state-space model is an extension of the earlier model. It takes in as input variables n and r that denote \(n_{k}\) and \(r_{k}\), respectively. We use r0, r1, and vr for \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\). Shown below is a part of the code where \(\beta _{0}\) is calculated and the model parameters are initialized.
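The original MATLAB listing is not reproduced here. What such an initialization might look like is sketched below in Python; the variable names mirror the MATLAB ones described above, and the logit-based choice of \(\beta _{0}\) from the empirical base probability of \(n_{k}\), along with the starting parameter guesses, are our own assumptions rather than the authors' exact values:

```python
import math

def initialize(n):
    """Hypothetical initialization sketch; n is the list of binary observations n_k."""
    base_prob = sum(n) / len(n)  # empirical probability of n_k = 1
    beta0 = math.log(base_prob / (1.0 - base_prob))  # assumed logit-based choice
    params = {
        "r0": 0.0,   # gamma_0
        "r1": 1.0,   # gamma_1
        "vr": 0.05,  # sigma^2_v (sensor noise variance)
        "ve": 0.05,  # sigma^2_eps (process noise variance)
    }
    return beta0, params
```

The EM iterations then refine these initial guesses at each parameter estimation step.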
Similar to the code examples in the preceding chapter, we also use x_pred, x_updt, and x_smth to denote \(x_{k|k - 1}\), \(x_{k|k}\), and \(x_{k|K}\), respectively. Similarly, v_pred, v_updt, and v_smth are used to denote the corresponding variances \(\sigma ^{2}_{k|k - 1}\), \(\sigma ^{2}_{k|k}\), and \(\sigma ^{2}_{k|K}\). Just like in the earlier case as well, the code first progresses through the time indices \(k = 1, 2, \ldots , K\) at the state estimation step.
The update for \(x_{k|k}\) is calculated using the function shown below based on both \(n_{k}\) and \(r_{k}\).
The smoothing step also remains the same (there would have been a change had \(\rho \) been included).
The updates for \(\sigma ^{2}_{\varepsilon }\), \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) are calculated at the parameter estimation step.
4.4.1 Application to EMG and Emotional Valence
Running the simulated and experimental data code examples produces the results shown in Fig. 4.2. The experimental data example relates to emotional valence and EMG. Emotion can be accounted for along two different orthogonal axes known as valence and arousal [55]. Valence refers to the pleasant–unpleasant nature of an emotion. In [27], this state-space model with one binary and one continuous feature was used to estimate emotional valence using EMG signal features. The binary and continuous features were extracted based on the amplitudes and powers of the EMG signal. The data were collected as a part of the study described in [58] where subjects were shown a series of music videos to elicit different emotional responses.
References
[4] A. C. Smith, L. M. Frank, S. Wirth, M. Yanike, D. Hu, Y. Kubota, A. M. Graybiel, W. A. Suzuki, and E. N. Brown, “Dynamic analysis of learning in behavioral experiments,” Journal of Neuroscience, vol. 24, no. 2, pp. 447–461, 2004.
[5] M. J. Prerau, A. C. Smith, U. T. Eden, Y. Kubota, M. Yanike, W. Suzuki, A. M. Graybiel, and E. N. Brown, “Characterizing learning by simultaneous analysis of continuous and binary measures of performance,” Journal of Neurophysiology, vol. 102, no. 5, pp. 3060–3072, 2009.
[27] T. Yadav, M. M. Uddin Atique, H. Fekri Azgomi, J. T. Francis, and R. T. Faghih, “Emotional valence tracking and classification via state-space analysis of facial electromyography,” in 53rd Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 2116–2120.
[28] M. B. Ahmadi, A. Craik, H. F. Azgomi, J. T. Francis, J. L. Contreras-Vidal, and R. T. Faghih, “Real-time seizure state tracking using two channels: A mixed-filter approach,” in 53rd Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 2033–2039.
[55] J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, p. 1161, 1980.
[56] H. J. Pijeira-Díaz, H. Drachsler, S. Järvelä, and P. A. Kirschner, “Sympathetic arousal commonalities and arousal contagion during collaborative learning: How attuned are triad members?” Computers in Human Behavior, vol. 92, pp. 188–197, 2019.
[57] M.-Z. Poh, N. C. Swenson, and R. W. Picard, “A wearable sensor for unobtrusive, long-term assessment of electrodermal activity,” IEEE Transactions on Biomedical Engineering, vol. 57, no. 5, pp. 1243–1252, 2010.
[58] S. Koelstra, C. Muhl, M. Soleymani, J.-S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “DEAP: A database for emotion analysis using physiological signals,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18–31, 2012.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
Cite this chapter
Wickramasuriya, D.S., Faghih, R.T. (2024). State-Space Model with One Binary and One Continuous Observation. In: Bayesian Filter Design for Computational Medicine. Springer, Cham. https://doi.org/10.1007/978-3-031-47104-9_4