Abstract
The human body is an intricate network of multiple functioning sub-systems. Many unobserved processes quietly keep running within the body even while we remain largely unconscious of them. For decades, scientists have sought to understand how different physiological systems work and how they can be mathematically modeled. Mathematical models of biological systems provide key scientific insights and also help guide the development of technologies for treating disorders when proper functioning no longer occurs. One of the challenges encountered with physiological systems is that, in a number of instances, the quantities we are interested in are difficult to observe directly or remain completely inaccessible. This could be either because they are located deep within the body or simply because they are more abstract (e.g., emotion). Consider the heart, for instance. The left ventricle pumps out blood through the aorta to the rest of the body. Blood pressure inside the aorta (known as central aortic pressure) has been considered a useful predictor of the future risk of developing cardiovascular disease, perhaps even more useful than the conventional blood pressure measurements taken from the upper arm (McEniery et al. (Eur Heart J 35(26):1719–1725, 2014)). However, measuring blood pressure inside the aorta is difficult. Consequently, researchers have had to rely on developing mathematical models with which to estimate central aortic pressure using other peripheral measurements (e.g., Ghasemi et al. (J Dyn Syst Measur Control 139(6):061003, 2017)). The same could be said regarding the recovery of CRH (corticotropin-releasing hormone) secretion timings within the hypothalamus—a largely inaccessible structure deep within the brain—using cortisol measurements in the blood based on mathematical relationships (Faghih (System identification of cortisol secretion: Characterizing pulsatile dynamics, Ph.D. dissertation, Massachusetts Institute of Technology, 2014)). 
Emotions could also be placed in this same category. They are difficult to measure because of their inherently abstract nature. Emotions, however, do cause changes in heart rate, sweating, and blood pressure that can be measured and with which someone’s feelings can be estimated. What we have described so far, in a sense, captures the big picture underlying this book. We have physiological quantities that are difficult to observe directly, we have measurements that are easier to acquire, and we have the ability to build mathematical models to estimate those inaccessible quantities.
Let us now consider some examples where the quantities we are interested in are rather abstract. Consider a situation where new employees at an organization are being taught a new task to be performed at a computer. Let us assume that each employee has a cognitive “task learning” state. Suppose also that the training sessions are accompanied by short quizzes at the end of each section. If we were to record how the employees performed (e.g., how many answers they got correct and how much time they took), could we somehow determine this cognitive learning state and see how it gradually changes over time? The answer is indeed yes: with the help of a mathematical model, we can estimate such a state and track an employee’s progress over time. We will, however, first need to build a model that relates learning to quiz performance. As you can see, the basic idea of building models that relate difficult-to-access quantities to measurements we can acquire more easily, and then estimating those quantities, is a powerful concept. In this book, we will see how state-space models can be used to relate physiological/behavioral variables to experimental measurements.
State-space modeling is a mature field within controls engineering. In this book, we will address a specific subset of state-space models. Namely, we will consider a class of models where all or part of the observations are binary. You may wonder why binary observations are so important. In reality, a number of phenomena within the human body are binary in nature. For instance, the millions of neurons within our bodies function in a binary-like manner. When these neurons receive inputs, they either fire or they do not. The pumping action of the heart can also be seen as a binary mechanism. The heart is either in contraction and pumping out blood or it is not. The secretion of a number of pulsatile hormones can also be viewed in a similar manner. The glands responsible for pulsatile secretion are either secreting the hormone or not. In reality, a number of other binary phenomena exist and are often encountered in biomedical applications. Consequently, physiological state-space models involving binary-valued observations have found extensive applications across a number of fields including behavioral learning [4,5,6,7,8,9], position and movement decoding based on neural spiking observations [10,11,12,13,14,15,16,17], anesthesia and comatose state regulation [18,19,20], sleep studies [21], heart rate analysis [22, 23], and cognitive flexibility [9, 24]. In this book, we will see how some of these models can be built and how they can be used to estimate unobserved states of interest.
1.1 Physiology, State-Space Models, and Estimation
As we have just stated, many things happen inside the human body, even while we are largely unaware that they are occurring. Energy continues to be produced through the actions of hormones and biochemicals, changes in emotion occur within the brain, and mental concentration varies throughout the day depending on the task at hand. Despite the fact that they cannot be observed, these internal processes do give rise to changes in different physiological phenomena that can indeed be measured. For instance, while energy production cannot be observed directly, we can indeed measure the hormone concentrations in the blood that affect the production mechanisms. Similarly, we can also measure physiological changes that emotions cause (e.g., changes in heart rate). Concentration or cognitive load also cannot be observed, but we can measure how quickly someone is getting their work done and how accurately they are performing. Let us now consider how these state-space models relate unobserved quantities to observed measurements.
Think of any control system such as a spring–mass–damper system or RLC circuit (Fig. 1.1). Typically, in such a system, we have several internal state variables and some sensor measurements. Not all the states can be observed directly. However, sensor readings can and do provide some information about them. By deriving mathematical relationships between the sensor readings and the internal states, we can develop tools that enable us to estimate the unobserved states over time. For instance, we may not be able to directly measure all the voltages and currents in a circuit, but we can use Kirchhoff’s laws to derive relationships between what we cannot observe and what we do measure. Similarly, we may not be able to measure all the positions, velocities, or accelerations within a mechanical system, but we can derive similar relationships using Newton’s laws. Thus, a typical engineering system can be characterized via a state-space formulation as shown below (for the time being, we will ignore any noise terms and non-linearities).
\[ {\mathbf {x}}_{k + 1} = A{\mathbf {x}}_{k} + B{\mathbf {u}}_{k} \qquad (1.1) \]
\[ {\mathbf {y}}_{k} = C{\mathbf {x}}_{k} \qquad (1.2) \]
Here, \({\mathbf {x}}_{k}\) is a vector representing the internal states of the system, \({\mathbf {y}}_{k}\) is a vector representing the sensor measurements, \({\mathbf {u}}_{k}\) is an external input, and A, B, and C are matrices. The state evolves with time following the mathematical relationship in (1.1). While we may be unable to observe \({\mathbf {x}}_{k}\) directly, we do have the sensor readings \({\mathbf {y}}_{k}\) that are related to it. The question is, can we now apply this formulation to the human body? In this case, \({\mathbf {x}}_{k}\) could be any of the unobserved quantities we just mentioned (e.g., energy production, emotion, or concentration) and \({\mathbf {y}}_{k}\) could be any related physiological measurement(s).
In this book, we will make use of an approach known as expectation–maximization (EM) for estimating unobserved quantities using state-space models. In a very simple way, here is what the EM algorithm does when applied to state estimation. Look back at (1.1) and (1.2). Now assume that this formulation governs how emotional states (\({\mathbf {x}}_{k}\)) vary within the brain and how they give rise to changes in heart rate and sweat secretions (\({\mathbf {y}}_{k}\)) that can be measured. We do not know \({\mathbf {x}}_{k}\) for \(k = 1, 2, \ldots , K\), and neither do we know A, B, or C. We only have the recorded sensor measurements (features) \({\mathbf {y}}_{k}\). First, we will assume some values for A, B, and C, i.e., we will begin by assuming that we know them. We will use this knowledge of A, B, and C to estimate \({\mathbf {x}}_{k}\) for \(k = 1, 2, \ldots , K\). We now know \({\mathbf {x}}_{k}\) at every point in time. We will then use these \({\mathbf {x}}_{k}\)’s to come up with an estimate for A, B, and C. We will then use those new values of A, B, and C to calculate an even better estimate for \({\mathbf {x}}_{k}\). The newest \({\mathbf {x}}_{k}\) will again be used to determine an even better A, B, and C. We will repeat these steps in turn until there is hardly any change in \({\mathbf {x}}_{k}\), A, B, or C. Our EM algorithm is said to have converged at this point. The step where \({\mathbf {x}}_{k}\) is estimated is known as the expectation-step or E-step and the step where A, B, and C are calculated is known as the maximization-step or M-step. For the purpose of this book, we will label the E-step as the state estimation step and the M-step as the parameter estimation step. What follows next is a basic description of what we do at these steps in slightly more detail.
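To make the alternation concrete, the sketch below runs a toy EM loop on simulated data from a scalar model \(x_{k} = a x_{k-1} + e_{k}\), \(y_{k} = x_{k} + v_{k}\). This particular model, the variable names, and the assumption that the sensor noise variance is known are illustrative simplifications; the full algorithm developed later in the book also differs in details (e.g., it uses smoothed rather than merely filtered state estimates).

```python
import random

def em(y, r, n_iters=30):
    """Toy EM loop: alternate state estimation (E-step) and parameter
    estimation (M-step) for x_k = a*x_{k-1} + e_k, y_k = x_k + v_k.
    r is the (known) sensor noise variance; a and q are estimated."""
    a, q = 0.5, 1.0                                  # initial parameter guesses
    for _ in range(n_iters):
        # E-step: estimate the states by filtering with the current (a, q)
        x, v = y[0], r
        xs = [x]
        for yk in y[1:]:
            x_pred, v_pred = a * x, a * a * v + q    # predict
            g = v_pred / (v_pred + r)                # update gain
            x = x_pred + g * (yk - x_pred)
            v = (1 - g) * v_pred
            xs.append(x)
        # M-step: re-estimate a and q from the current state estimates
        a = sum(xs[k] * xs[k - 1] for k in range(1, len(xs))) / \
            sum(xs[k - 1] ** 2 for k in range(1, len(xs)))
        q = sum((xs[k] - a * xs[k - 1]) ** 2
                for k in range(1, len(xs))) / (len(xs) - 1)
    return a, q

# simulate data with a = 0.8 and run the EM iterations
random.seed(7)
x, y = 0.0, []
for _ in range(400):
    x = 0.8 * x + random.gauss(0.0, 0.2)
    y.append(x + random.gauss(0.0, 0.1))

a_hat, q_hat = em(y, r=0.01)
```

Each pass through the loop is one EM iteration: the E-step filters the data with the current parameter values, and the M-step re-estimates the parameters from the filtered states.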
1.1.1 State Estimation Step
As we have just stated, our EM algorithm consists of two steps: the state estimation step and the parameter estimation step. At the state estimation step, we assume that we know A, B, and C and try to estimate \({\mathbf {x}}_{k}\) for \(k = 1, 2, \ldots , K\). We do this sequentially. Again, look back at (1.1) and (1.2). Suppose you are at time index k and you know what A, B, C, and \({\mathbf {x}}_{k - 1}\) are, could you come up with a guess for \({\mathbf {x}}_{k}\)? You can also assume that you know what the external input \({\mathbf {u}}_{k}\) is for \(k = 1, 2, \ldots , K\). How would you determine \({\mathbf {x}}_{k}\)? First, note that we can re-write the equations as
\[ {\mathbf {x}}_{k} = A{\mathbf {x}}_{k - 1} + B{\mathbf {u}}_{k - 1} \qquad (1.3) \]
\[ {\mathbf {y}}_{k} = C{\mathbf {x}}_{k} \qquad (1.4) \]
If you knew A, B, C, \({\mathbf {x}}_{k - 1}\), and \({\mathbf {u}}_{k - 1}\), and had to determine \({\mathbf {x}}_{k}\) just at time index k, you would encounter a small problem here. Do you see that \({\mathbf {x}}_{k}\) appears in both equations? You could simply plug in the values of \({\mathbf {x}}_{k - 1}\) and \({\mathbf {u}}_{k - 1}\) into (1.3) and get a value for \({\mathbf {x}}_{k}\). Since you are using the past values up to time index \((k - 1)\) to determine \({\mathbf {x}}_{k}\), this could be called the predict step. You are done, right? Not quite. If you determine \({\mathbf {x}}_{k}\) solely based on (1.3), you would always be discounting the sensor measurement \({\mathbf {y}}_{k}\) in (1.4). This sensor measurement is also an important source of information about \({\mathbf {x}}_{k}\). Therefore, at each time index k, we will first have the predict step where we make use of (1.3) to guess what \({\mathbf {x}}_{k}\) is, and then apply an update step, where we will make use of \({\mathbf {y}}_{k}\) to improve the \({\mathbf {x}}_{k}\) value that we just predicted. The full state estimation step will therefore consist of a series of repeated predict, update, predict, update, \(\ldots \) steps for \(k = 1, 2, \ldots , K\). At the end of the state estimation step, we will have a complete set of values for \({\mathbf {x}}_{k}\).
Dealing with uncertainty is a reality with any engineering system model. These uncertainties arise due to noise in our sensor measurements, models that are unable to fully account for the actual physical system, and so on. We need to deal with this notion of uncertainty when designing state estimators. To do so, we will need some basic concepts in probability and statistics. What we have said so far regarding estimating \({\mathbf {x}}_{k}\) can be mathematically formulated in terms of two fundamental ideas in statistics: mean and variance. In reality, (1.3) and (1.4) should be
\[ {\mathbf {x}}_{k} = A{\mathbf {x}}_{k - 1} + B{\mathbf {u}}_{k - 1} + {\mathbf {e}}_{k} \qquad (1.5) \]
\[ {\mathbf {y}}_{k} = C{\mathbf {x}}_{k} + {\mathbf {v}}_{k} \qquad (1.6) \]
where \({\mathbf {e}}_{k}\) is what we refer to as process noise and \({\mathbf {v}}_{k}\) is sensor noise. Therefore, when we “guess” what \({\mathbf {x}}_{k}\) is at the predict step, what we are really doing is determining the mean value of \({\mathbf {x}}_{k}\) given that we have observed all the data up to time index \((k - 1)\). There will also be a certain amount of uncertainty regarding this prediction for \({\mathbf {x}}_{k}\). We quantify this uncertainty in terms of variance. Thus we need to determine the mean and variance of \({\mathbf {x}}_{k}\) at our predict step. But what happens after we observe \({\mathbf {y}}_{k}\)? Again, the idea is the same. Now that we have two sources of information regarding \({\mathbf {x}}_{k}\) (one based on the prediction from \({\mathbf {x}}_{k - 1}\) and \({\mathbf {u}}_{k - 1}\), and the other based on the sensor reading \({\mathbf {y}}_{k}\)), we will still be determining the mean and variance of \({\mathbf {x}}_{k}\). So we need to calculate one mean and variance of \({\mathbf {x}}_{k}\) at the predict step, and another mean and variance of \({\mathbf {x}}_{k}\) at the update step.
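For a scalar version of these equations, the predict and update computations take a particularly simple form. The sketch below is illustrative (the lowercase a, b, c, q, and r are scalar stand-ins for A, B, C and the process and sensor noise variances):

```python
def predict(x_prev, v_prev, a, b, u, q):
    """Predict step: propagate the mean and variance of x_k through
    x_k = a*x_{k-1} + b*u_{k-1} + e_k, with process noise variance q."""
    x_pred = a * x_prev + b * u
    v_pred = a * a * v_prev + q
    return x_pred, v_pred

def update(x_pred, v_pred, y, c, r):
    """Update step: refine the predicted mean and variance using the
    sensor measurement y_k = c*x_k + v_k, with sensor noise variance r."""
    g = v_pred * c / (c * c * v_pred + r)   # weighting (Kalman) gain
    x_post = x_pred + g * (y - c * x_pred)
    v_post = (1 - g * c) * v_pred
    return x_post, v_post

# one predict-update cycle with illustrative numbers
x_pred, v_pred = predict(x_prev=0.0, v_prev=1.0, a=1.0, b=0.0, u=0.0, q=0.0)
x_post, v_post = update(x_pred, v_pred, y=1.0, c=1.0, r=1.0)
```

Note how the update step pulls the predicted mean toward the measurement and shrinks the variance; with a prediction of 0 with variance 1 and a measurement of 1 (with \(c = 1\), \(r = 1\)), the posterior mean and variance both come out to 0.5.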
1.1.2 Parameter Estimation Step
Recall that our EM algorithm iterates between the state estimation step and the parameter estimation step until convergence. Assume that we sequentially progressed through repeated predict, update, predict, update, \(\ldots \) steps for \(k = 1, 2, \ldots , K\) and determined a set of mean and variance (uncertainty) values for \({\mathbf {x}}_{k}\). How could we use all of these mean and variance values to determine what A, B, and C are? Here is how we proceed. We first calculate the joint probability for all the \({\mathbf {x}}_{k}\) and \({\mathbf {y}}_{k}\) values. The best estimates for A, B, and C are the values that maximize this probability (or the log of this probability). Therefore, we need to maximize this probability with respect to A, B, and C. One simple way to determine the value at which a function is maximized is to take its derivative and solve for the location where it is 0. This is basically what we do to determine A, B, and C (in reality, we actually maximize the expected value or mean of the joint log probability of all the \({\mathbf {x}}_{k}\) and \({\mathbf {y}}_{k}\) values to determine A, B, and C).
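As a minimal illustration of this idea, suppose the state followed the scalar equation \(x_{k} = a x_{k-1} + e_{k}\) and the state estimates were known exactly. Setting the derivatives of the joint log probability with respect to \(a\) and the process noise variance to 0 yields the closed-form estimates sketched below (an illustrative simplification of the actual parameter estimation step):

```python
def m_step(xs):
    """Closed-form M-step for x_k = a*x_{k-1} + e_k given known states xs.
    Maximizing the joint log probability (equivalently, setting the
    derivatives with respect to a and q to 0) gives a as a ratio of sums
    and q as the mean squared residual."""
    K = len(xs)
    a = sum(xs[k] * xs[k - 1] for k in range(1, K)) / \
        sum(xs[k - 1] ** 2 for k in range(1, K))
    q = sum((xs[k] - a * xs[k - 1]) ** 2 for k in range(1, K)) / (K - 1)
    return a, q

# states that exactly double each step are fit perfectly by a = 2, q = 0
a_hat, q_hat = m_step([1.0, 2.0, 4.0, 8.0])
```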
1.1.3 Algorithm Summary
In summary, we have to calculate means and variances at the state estimation step and derivatives at the parameter estimation step. We will show how these equations are derived in a number of examples in the chapters that follow. The EM approach enables us to build powerful state estimators that can determine internal physiological quantities that are only accessible through a set of sensor measurements.
What we have described so far is a very simple introduction to the EM algorithm as applied to state estimation. Moreover, for someone already familiar with state-space models, the predict and update steps we have just described should also sound familiar. These are concepts that are found in Kalman filtering. The derivation of the Kalman filter equations is generally approached from the point of view of solving a set of simultaneous equations when new sensor measurements keep coming in. In this book, we will not approach the design of the filters through traditional recursive least squares minimization approaches involving matrix computations. Instead, we will proceed from a statistical viewpoint building up from the basics of mean and variance. Nevertheless, we will use the terminology of a filter when deriving the state estimation step equations. For reasons that will become clearer as we proceed, we can refer to these state estimators as Bayesian filters.
1.2 Book Outline
State-space models have been very useful in a number of physiological applications. In this book, we consider state-space models that give rise, fully or partially, to binary observations. We will begin our discussion of how to build Bayesian filters for physiological state estimation starting with the simplest cases. We will start by considering a scalar-valued state \(x_{k}\) that follows the simple random walk
\[ x_{k} = x_{k - 1} + \varepsilon _{k} \qquad (1.7) \]
where \(\varepsilon _{k} \sim \mathcal {N}(0, \sigma ^{2}_{\varepsilon })\) is process noise. We will consider how to derive the state and parameter estimation step equations when \(x_{k}\) gives rise to a single binary observation \(n_{k}\). We will next proceed to more complicated cases. For instance, one of the cases will be where we have a forgetting factor \(\rho \) such that
\[ x_{k} = \rho x_{k - 1} + \varepsilon _{k} \qquad (1.8) \]
and \(x_{k}\) gives rise to both a binary observation \(n_{k}\) and a continuous observation \(r_{k}\). An even more complicated case will involve an external input so that
\[ x_{k} = \rho x_{k - 1} + \alpha I_{k} + \varepsilon _{k} \qquad (1.9) \]
where \(\alpha I_{k}\) is similar to the \(B {\mathbf {u}}_{k}\) in (1.1), and \(x_{k}\) gives rise to a binary observation \(n_{k}\) and two continuous observations \(r_{k}\) and \(s_{k}\). As we shall see, changes in the state equation primarily affect the predict step within the state estimation step. In contrast, changes in the observations mainly affect the update step.
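As a small illustration, here are predict-step sketches for two of the state equations above (function and variable names are illustrative):

```python
def predict_rw(x_prev, v_prev, var_e):
    """Predict step for the random walk x_k = x_{k-1} + e_k:
    the mean carries over and the variance grows by var_e."""
    return x_prev, v_prev + var_e

def predict_input(x_prev, v_prev, rho, alpha, i_k, var_e):
    """Predict step for x_k = rho*x_{k-1} + alpha*I_k + e_k.
    Only this function changes as the state equation changes;
    the update step is left untouched."""
    return rho * x_prev + alpha * i_k, rho * rho * v_prev + var_e
```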
Note that we mentioned the observation of binary and continuous features. When introducing the concept of physiological state estimation for the first time, we used the formulation \({\mathbf {y}}_{k} = C{\mathbf {x}}_{k}\)
for the sensor measurements. In reality, this represents a very simple case, and the equations turn out to be similar to those of a Kalman filter. Sensor measurements in biomedical experiments can take many forms. They can take the form of binary-valued observations, continuous-valued observations, and spiking-type observations, to name a few. For instance, we may need to estimate the learning state of a macaque monkey in a behavioral experiment based on whether the monkey gets the answers correct or incorrect in different trials (a binary observation), how quickly the monkey responds in each trial (a continuous observation), and how electrical activity from a specific neuron varies over the trials (a spiking-type observation). These types of measurements result in filter equations that are more complicated than in the case of a Kalman filter. We will rely heavily on Bayes’ rule to derive the mean and variance of \(x_{k}\) at the update step in each case.
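To preview what a binary observation does to the update step, suppose \(n_{k} \sim \text{Bernoulli}(p_{k})\) with \(p_{k} = 1/(1 + e^{-(\beta _{0} + x_{k})})\) (this sigmoid observation model and the variable names are assumptions for illustration). Combining this likelihood with a Gaussian prediction via Bayes’ rule no longer gives a closed-form posterior mean; the posterior mode can instead be located numerically, e.g., with Newton’s method:

```python
import math

def binary_update(x_pred, v_pred, n, beta0=0.0, iters=50):
    """Posterior mode and approximate variance for a binary observation
    n (0 or 1) under a sigmoid observation model and a Gaussian
    prediction N(x_pred, v_pred)."""
    x = x_pred
    for _ in range(iters):
        p = 1.0 / (1.0 + math.exp(-(beta0 + x)))
        grad = -(x - x_pred) / v_pred + (n - p)   # d/dx of log posterior
        hess = -1.0 / v_pred - p * (1.0 - p)      # second derivative
        x -= grad / hess                          # Newton step
    p = 1.0 / (1.0 + math.exp(-(beta0 + x)))
    v = 1.0 / (1.0 / v_pred + p * (1.0 - p))      # curvature-based variance
    return x, v

x1, v1 = binary_update(0.0, 1.0, n=1)   # observation = 1
x0, v0 = binary_update(0.0, 1.0, n=0)   # observation = 0
```

A correct response (\(n_{k} = 1\)) pulls the state estimate upward, an incorrect one pulls it downward, and in both cases the observation reduces the variance below its predicted value.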
While the state estimation step relies primarily on mean and variance calculations, the parameter estimation step relies mainly on derivatives. At the parameter estimation step, we take the derivatives of the probability terms (or equivalently, of the log-likelihood terms) to determine the model parameters. For instance, if we use the state equation in (1.8), we will need to derive \(\rho \) at the parameter estimation step. Moreover, we also need to determine the model parameters related to our observations. For instance, we may choose to model a continuous observation \(r_{k}\) as
\[ r_{k} = \gamma _{0} + \gamma _{1} x_{k} + v_{k} \]
where \(\gamma _{0}\) and \(\gamma _{1}\) are constant coefficients and \(v_{k} \sim \mathcal {N}(0, \sigma ^{2}_{v})\) is sensor noise. The three parameters \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) all need to be determined at the parameter estimation step. We could thus divide the parameter estimation step derivations into two parts. First, there will be the derivations for model parameters in the state equation (e.g., \(\rho \), \(\alpha \), and \(\sigma ^{2}_{\varepsilon }\)). And second, there will be the derivations corresponding to each of the observations (features). Choosing to include a continuous-valued observation in a state-space model will necessitate the determination of a certain set of model parameters. Adding a spiking-type observation necessitates a further set of model parameters. We will see examples of these in due course.
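If the state estimates are treated as known, maximizing the likelihood under this continuous observation model reduces to ordinary least squares. A sketch (function name illustrative):

```python
def fit_continuous_obs(xs, rs):
    """ML estimates of gamma0, gamma1, and sigma_v^2 in
    r_k = gamma0 + gamma1*x_k + v_k, treating the states xs as known
    (as happens inside the parameter estimation step)."""
    K = len(xs)
    mx = sum(xs) / K
    mr = sum(rs) / K
    g1 = sum((x - mx) * (r - mr) for x, r in zip(xs, rs)) / \
         sum((x - mx) ** 2 for x in xs)
    g0 = mr - g1 * mx
    s2 = sum((r - g0 - g1 * x) ** 2 for x, r in zip(xs, rs)) / K
    return g0, g1, s2

# data generated exactly by r = 1 + 2*x is recovered with zero residual
g0, g1, s2 = fit_continuous_obs([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```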
Having laid some of the basic groundwork, we will next proceed with our tutorial discussion of how to derive the state and parameter estimation step equations for several different physiological state-space models. Shown below is a list of the state-space models we will look at along with examples of where they have been applied:
- State-space model with one binary observation:
- State-space model with one binary and one continuous observation:
- State-space model with one binary and two continuous observations:
- State-space model with one binary, two continuous, and a spiking-type observation:
  - Sympathetic arousal estimation using skin conductance and electrocardiography (EKG) signals [31]
- State-space model with one marked point process (MPP) observation:
  - Sympathetic arousal estimation using skin conductance signals [32]
- State-space model with one MPP and one continuous observation:
Wearable and smart healthcare technologies are likely to play a key role in the future [34, 35]. A number of the state-space models listed above have applicability to healthcare. For instance, patients suffering from emotional disorders, hormone dysregulation, or epileptic seizures could be fitted with wearable devices that implement some of the state-space models (and corresponding EM-based estimators) listed above for long-term care and monitoring. One of the advantages of the state-space framework is that it readily lends itself to the design of the closed-loop control necessary to correct deviations from healthy functioning. Consequently, state-space controllers can be designed to treat some of these disorders [36, 37]. Looking at the human body and brain from a control-theoretic perspective could also help design bio-inspired controllers that mimic the body’s already built-in feedback control loops [38, 39]. The applications, however, are not limited to healthcare monitoring; determining hidden psychological and cognitive states also has applications in fields such as neuromarketing [40], smart homes [41], and smart workplaces [42].
Excursus—A Brief Sketch of How the Kalman Filter Equations Can be Derived
Here we provide a brief sketch of how the Kalman filter equations can be derived. We will utilize an approach known as recursive least squares. The symbols used within this excursus are self-contained and should not be confused with the standard terminology that is used throughout the rest of this book.
Suppose we have a column vector of unknowns \(\mathbf {x}\) and a column vector of measurements \({\mathbf {y}}_{1}\) that are related to each other through
\[ {\mathbf {y}}_{1} = A_{1}\mathbf {x} + {\mathbf {e}}_{1} \]
where \(A_{1}\) is a matrix and \({\mathbf {e}}_{1} \sim \mathcal {N}(0, \Sigma _{1})\) is noise (\(\Sigma _{1}\) is the noise covariance matrix). In general, we may have more measurements than we have unknowns. Therefore, a solution to this system of equations is given by
\[ {\mathbf {x}}_{1} = (A_{1}^{\intercal }\Sigma _{1}^{-1}A_{1})^{-1}A_{1}^{\intercal }\Sigma _{1}^{-1}{\mathbf {y}}_{1} \qquad (1.13) \]
where we have used \({\mathbf {x}}_{1}\) to denote that this solution is only based on the first set of measurements. Now suppose that we have another set of measurements \({\mathbf {y}}_{2}\) such that
\[ {\mathbf {y}}_{2} = A_{2}\mathbf {x} + {\mathbf {e}}_{2} \]
where \(A_{2}\) is a matrix and \({\mathbf {e}}_{2} \sim \mathcal {N}(0, \Sigma _{2})\). In theory, we could just concatenate all the values to form a single set of equations and solve for \(\mathbf {x}\). However, this would result in a larger matrix inversion each time we get more data. Is there a better way? It turns out that we can use our previous solution \({\mathbf {x}}_{1}\) to obtain a better estimate \({\mathbf {x}}_{2}\) without having to solve everything again. If we assume that \({\mathbf {e}}_{1}\) and \({\mathbf {e}}_{2}\) are uncorrelated with each other, the least squares solution is given by
\[ {\mathbf {x}}_{2} = (A_{1}^{\intercal }\Sigma _{1}^{-1}A_{1} + A_{2}^{\intercal }\Sigma _{2}^{-1}A_{2})^{-1}(A_{1}^{\intercal }\Sigma _{1}^{-1}{\mathbf {y}}_{1} + A_{2}^{\intercal }\Sigma _{2}^{-1}{\mathbf {y}}_{2}) \qquad (1.17) \]
Let us see how this simplifies. We will begin by defining the term \(P_{1} = (A_{1}^{\intercal }\Sigma _{1}^{-1}A_{1})^{-1}\). Now,
\[ A_{1}^{\intercal }\Sigma _{1}^{-1}{\mathbf {y}}_{1} = A_{1}^{\intercal }\Sigma _{1}^{-1}A_{1}{\mathbf {x}}_{1} = P_{1}^{-1}{\mathbf {x}}_{1} \]
based on (1.13). Substituting \(P_{1}^{-1}\) for \(A_{1}^{\intercal }\Sigma _{1}^{-1}A_{1}\) and \(P_{1}^{-1}{\mathbf {x}}_{1}\) for \(A_{1}^{\intercal }\Sigma _{1}^{-1}{\mathbf {y}}_{1}\) in (1.17), we obtain
\[ {\mathbf {x}}_{2} = (P_{1}^{-1} + A_{2}^{\intercal }\Sigma _{2}^{-1}A_{2})^{-1}(P_{1}^{-1}{\mathbf {x}}_{1} + A_{2}^{\intercal }\Sigma _{2}^{-1}{\mathbf {y}}_{2}) \]
We use the matrix inversion lemma to simplify this to
\[ {\mathbf {x}}_{2} = \big [P_{1} - P_{1}A_{2}^{\intercal }(\Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal })^{-1}A_{2}P_{1}\big ](P_{1}^{-1}{\mathbf {x}}_{1} + A_{2}^{\intercal }\Sigma _{2}^{-1}{\mathbf {y}}_{2}) \]
We then perform the multiplication.
For the time-being, we will ignore the terms on the right and make the substitution \(K = P_{1}A_{2}^{\intercal }(\Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal })^{-1}\) for the term on the left. Therefore,
When multiplying the terms on the right, we will define the term \(Q = (\Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal })^{-1}\). Making this substitution, we obtain
Here is where we will use a small trick. We will insert \(QQ^{-1}\) into the third term and then simplify.
Since \(Q = (\Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal })^{-1}\), \(Q^{-1} = \Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal }\). We will substitute this into (1.30) to obtain
Note that \(P_{1}A_{2}^{\intercal }Q = P_{1}A_{2}^{\intercal }(\Sigma _{2} + A_{2}P_{1}A_{2}^{\intercal })^{-1} = K\). Therefore,
\[ {\mathbf {x}}_{2} = {\mathbf {x}}_{1} + K({\mathbf {y}}_{2} - A_{2}{\mathbf {x}}_{1}) \]
What does the final equation mean? We simply take our previous solution \({\mathbf {x}}_{1}\), predict what \({\mathbf {y}}_{2}\) will be by multiplying it with \(A_{2}\), calculate the prediction error \({\mathbf {y}}_{2} - A_{2}{\mathbf {x}}_{1}\), and apply this correction to \({\mathbf {x}}_{1}\) based on the multiplication factor K. These equations, therefore, provide a convenient way to continually update \(\mathbf {x}\) when we keep receiving more and more data.
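The recursion can be checked numerically against the batch solution. In the sketch below (synthetic data; \(\Sigma _{1}\) and \(\Sigma _{2}\) are taken to be identity matrices for simplicity), the recursively updated \({\mathbf {x}}_{2}\) matches the least squares solution computed over all the measurements at once:

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0])

# two batches of noisy measurements y = A x + e
A1 = rng.normal(size=(4, 2))
y1 = A1 @ x_true + 0.1 * rng.normal(size=4)
A2 = rng.normal(size=(3, 2))
y2 = A2 @ x_true + 0.1 * rng.normal(size=3)

# solution from the first batch alone (Sigma_1 = I)
P1 = np.linalg.inv(A1.T @ A1)
x1 = P1 @ A1.T @ y1

# recursive update with the second batch (Sigma_2 = I)
K = P1 @ A2.T @ np.linalg.inv(np.eye(3) + A2 @ P1 @ A2.T)
x2 = x1 + K @ (y2 - A2 @ x1)

# batch least squares over all the measurements, for comparison
A = np.vstack([A1, A2])
y = np.concatenate([y1, y2])
x_batch = np.linalg.lstsq(A, y, rcond=None)[0]
```

The correction form avoids re-inverting the full stacked system each time new measurements arrive, which is exactly the computational advantage described above.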
Excursus—A Brief Sketch of How the EM Algorithm Works
Here we will provide a brief overview of how the EM algorithm works in the kind of state estimation problems that we shall see. Assume that we have a set of sensor measurements \(\mathcal {Y} = \{y_{1}, y_{2}, \ldots , y_{K}\}\) and a set of unobserved states \(\mathcal {X} = \{x_{1}, x_{2}, \ldots , x_{K}\}\) that we need to estimate. We also have the model parameters \(\Theta \) that need to be determined.
Let us begin by asking how we can determine \(\Theta \). In general, we select \(\Theta \) such that it maximizes the probability \(p(\Theta |\mathcal {Y})\). Assuming that we do not have a particular preference for any of the \(\Theta \) values, we can use Bayes’ rule to instead select the \(\Theta \) that maximizes \(p(\mathcal {Y}|\Theta )\). Now,
\[ p(\mathcal {Y}|\Theta ) = \int p(\mathcal {X} \cap \mathcal {Y}|\Theta )\,d\mathcal {X} \qquad (1.36) \]
We do not know what the true \(\Theta \) is, but let us make a guess that it is \(\hat {\Theta }\). Let us now introduce the term \(p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })\) into (1.36).
\[ p(\mathcal {Y}|\Theta ) = \int \frac {p(\mathcal {X} \cap \mathcal {Y}|\Theta )}{p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })}\, p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })\,d\mathcal {X} \]
Take a moment to look carefully at what the integral is doing. It is actually calculating the expected value of the fraction term with respect to \(p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })\). Taking the log on both sides, we have
\[ \log \big [p(\mathcal {Y}|\Theta )\big ] = \log \bigg [\mathbb {E}_{\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }}\bigg [\frac {p(\mathcal {X} \cap \mathcal {Y}|\Theta )}{p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })}\bigg ]\bigg ] \]
Since \(\log (\cdot )\) is a concave function, the following inequality holds true.
\[ \log \big [p(\mathcal {Y}|\Theta )\big ] \geq \mathbb {E}_{\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }}\Big [\log \big [p(\mathcal {X} \cap \mathcal {Y}|\Theta )\big ]\Big ] - \mathbb {E}_{\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }}\Big [\log \big [p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })\big ]\Big ] \qquad (1.41) \]
Recall that we set out to choose the \(\Theta \) that maximized \(p(\mathcal {Y}|\Theta )\), or that equivalently maximized \(\log \big [p(\mathcal {Y}|\Theta )\big ]\). Typically, we would approach this maximization by calculating the derivative of the probability term with respect to \(\Theta \), setting it to \(\mathbf {0}\), and then solving. For instance, if we had a continuous-valued observation \(r_{k}\) in our state-space model, we would have to take the derivatives with respect to \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\), set them each to 0, and solve. Look back at (1.41). Assume we were to calculate the derivative of the term on the right-hand side of the inequality with respect to \(\Theta \). Do you see that the second term does not contain \(\Theta \)? In other words, the derivative would just treat the second term as a constant. If we had to determine \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\), for instance, they would only be present in the first term when taking derivatives. We can, therefore, safely ignore the second term. This leads to an important conclusion. If we need to determine the model parameters \(\Theta \) by maximizing \(\log \big [p(\mathcal {Y}|\Theta )\big ]\), we only need to concentrate on maximizing
\[ \int \log \big [p(\mathcal {X} \cap \mathcal {Y}|\Theta )\big ]\, p(\mathcal {X}|\mathcal {Y} \cap \hat {\Theta })\,d\mathcal {X} \qquad (1.42) \]
We could equivalently write (1.42) as
\[ \mathbb {E}_{\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }}\Big [\log \big [p(\mathcal {X} \cap \mathcal {Y}|\Theta )\big ]\Big ] \]
since this is indeed an expected value. Do you now see the connection between what we have been discussing so far and the EM algorithm? In reality, what we are doing at the state estimation step is calculating \(\mathbb {E}[\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }]\). At the parameter estimation step, we calculate the partial derivatives of the expected value of \(\log \big [p(\mathcal {X} \cap \mathcal {Y}|\Theta )\big ]\) with respect to all of the model parameters. During the actual implementation of the EM algorithm, we keep alternating between the two steps until the model parameters converge. At this point, we have reached one of the localized maximum values of \(\mathbb {E}_{\mathcal {X}|\mathcal {Y} \cap \hat {\Theta }} \Big [\log \big [p(\mathcal {X} \cap \mathcal {Y}|\Theta )\big ]\Big ]\).
References
C. M. McEniery, J. R. Cockcroft, M. J. Roman, S. S. Franklin, and I. B. Wilkinson, “Central blood pressure: current evidence and clinical importance,” European Heart Journal, vol. 35, no. 26, pp. 1719–1725, 01 2014. [Online]. Available: https://doi.org/10.1093/eurheartj/eht565
Z. Ghasemi, C.-S. Kim, E. Ginsberg, A. Gupta, and J.-O. Hahn, “Model-Based Blind System Identification Approach to Estimation of Central Aortic Blood Pressure Waveform From Noninvasive Diametric Circulatory Signals,” Journal of Dynamic Systems, Measurement, and Control, vol. 139, no. 6, 03 2017, 061003. [Online]. Available: https://doi.org/10.1115/1.4035451
R. T. Faghih, “System identification of cortisol secretion: Characterizing pulsatile dynamics,” Ph.D. dissertation, Massachusetts Institute of Technology, 2014.
A. C. Smith, L. M. Frank, S. Wirth, M. Yanike, D. Hu, Y. Kubota, A. M. Graybiel, W. A. Suzuki, and E. N. Brown, “Dynamic analysis of learning in behavioral experiments,” Journal of Neuroscience, vol. 24, no. 2, pp. 447–461, 2004.
M. J. Prerau, A. C. Smith, U. T. Eden, Y. Kubota, M. Yanike, W. Suzuki, A. M. Graybiel, and E. N. Brown, “Characterizing learning by simultaneous analysis of continuous and binary measures of performance,” Journal of Neurophysiology, vol. 102, no. 5, pp. 3060–3072, 2009.
T. P. Coleman, M. Yanike, W. A. Suzuki, and E. N. Brown, “A mixed-filter algorithm for dynamically tracking learning from multiple behavioral and neurophysiological measures,” The Dynamic Brain: An Exploration of Neuronal Variability and its Functional Significance, pp. 3–28, 2011.
N. Malem-Shinitski, Y. Zhang, D. T. Gray, S. N. Burke, A. C. Smith, C. A. Barnes, and D. Ba, “A separable two-dimensional random field model of binary response data from multi-day behavioral experiments,” Journal of Neuroscience Methods, vol. 307, pp. 175–187, 2018.
A. C. Smith, M. R. Stefani, B. Moghaddam, and E. N. Brown, “Analysis and design of behavioral experiments to characterize population learning,” Journal of Neurophysiology, vol. 93, no. 3, pp. 1776–1792, 2005.
X. Deng, R. T. Faghih, R. Barbieri, A. C. Paulk, W. F. Asaad, E. N. Brown, D. D. Dougherty, A. S. Widge, E. N. Eskandar, and U. T. Eden, “Estimating a dynamic state to relate neural spiking activity to behavioral signals during cognitive tasks,” in 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, pp. 7808–7813.
E. N. Brown, L. M. Frank, D. Tang, M. C. Quirk, and M. A. Wilson, “A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells,” Journal of Neuroscience, vol. 18, no. 18, pp. 7411–7425, 1998.
R. Barbieri, L. M. Frank, D. P. Nguyen, M. C. Quirk, V. Solo, M. A. Wilson, and E. N. Brown, “Dynamic analyses of information encoding in neural ensembles,” Neural Computation, vol. 16, no. 2, pp. 277–307, 2004.
M. M. Shanechi, Z. M. Williams, G. W. Wornell, R. C. Hu, M. Powers, and E. N. Brown, “A real-time brain-machine interface combining motor target and trajectory intent using an optimal feedback control design,” PloS One, vol. 8, no. 4, p. e59049, 2013.
M. M. Shanechi, R. C. Hu, M. Powers, G. W. Wornell, E. N. Brown, and Z. M. Williams, “Neural population partitioning and a concurrent brain-machine interface for sequential motor function,” Nature Neuroscience, vol. 15, no. 12, p. 1715, 2012.
M. M. Shanechi, G. W. Wornell, Z. M. Williams, and E. N. Brown, “Feedback-controlled parallel point process filter for estimation of goal-directed movements from neural signals,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 21, no. 1, pp. 129–140, 2012.
X. Deng, D. F. Liu, K. Kay, L. M. Frank, and U. T. Eden, “Clusterless decoding of position from multiunit activity using a marked point process filter,” Neural Computation, vol. 27, no. 7, pp. 1438–1460, 2015.
K. Arai, D. F. Liu, L. M. Frank, and U. T. Eden, “Marked point process filter for clusterless and adaptive encoding-decoding of multiunit activity,” bioRxiv, p. 438440, 2018.
A. Yousefi, M. R. Rezaei, K. Arai, L. M. Frank, and U. T. Eden, “Real-time point process filter for multidimensional decoding problems using mixture models,” bioRxiv, p. 505289, 2018.
Y. Yang and M. M. Shanechi, “An adaptive and generalizable closed-loop system for control of medically induced coma and other states of anesthesia,” Journal of Neural Engineering, vol. 13, no. 6, p. 066019, 2016.
Y. Yang, J. T. Lee, J. A. Guidera, K. Y. Vlasov, J. Pei, E. N. Brown, K. Solt, and M. M. Shanechi, “Developing a personalized closed-loop controller of medically-induced coma in a rodent model,” Journal of Neural Engineering, vol. 16, no. 3, p. 036022, 2019.
Y. Yang and M. M. Shanechi, “A generalizable adaptive brain-machine interface design for control of anesthesia,” in 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, pp. 1099–1102.
M. J. Prerau, K. E. Hartnack, G. Obregon-Henao, A. Sampson, M. Merlino, K. Gannon, M. T. Bianchi, J. M. Ellenbogen, and P. L. Purdon, “Tracking the sleep onset process: an empirical model of behavioral and physiological dynamics,” PLoS Computational Biology, vol. 10, no. 10, p. e1003866, 2014.
R. Barbieri and E. N. Brown, “Application of dynamic point process models to cardiovascular control,” Biosystems, vol. 93, no. 1–2, pp. 120–125, 2008.
R. Barbieri and E. N. Brown, “Analysis of heartbeat dynamics by point process adaptive filtering,” IEEE Transactions on Biomedical Engineering, vol. 53, no. 1, pp. 4–12, 2006.
A. Yousefi, I. Basu, A. C. Paulk, N. Peled, E. N. Eskandar, D. D. Dougherty, S. S. Cash, A. S. Widge, and U. T. Eden, “Decoding hidden cognitive states from behavior and physiology using a Bayesian approach,” Neural Computation, vol. 31, no. 9, pp. 1751–1788, 2019.
D. S. Wickramasuriya, C. Qi, and R. T. Faghih, “A state-space approach for detecting stress from electrodermal activity,” in 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2018.
D. S. Wickramasuriya, M. R. Amin, and R. T. Faghih, “Skin conductance as a viable alternative for closing the deep brain stimulation loop in neuropsychiatric disorders,” Frontiers in Neuroscience, vol. 13, p. 780, 2019.
T. Yadav, M. M. Uddin Atique, H. Fekri Azgomi, J. T. Francis, and R. T. Faghih, “Emotional valence tracking and classification via state-space analysis of facial electromyography,” in 53rd Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 2116–2120.
M. B. Ahmadi, A. Craik, H. F. Azgomi, J. T. Francis, J. L. Contreras-Vidal, and R. T. Faghih, “Real-time seizure state tracking using two channels: A mixed-filter approach,” in 53rd Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 2033–2039.
D. S. Wickramasuriya and R. T. Faghih, “A Bayesian filtering approach for tracking arousal from binary and continuous skin conductance features,” IEEE Transactions on Biomedical Engineering, vol. 67, no. 6, pp. 1749–1760, 2020.
D. S. Wickramasuriya and R. T. Faghih, “A cortisol-based energy decoder for investigation of fatigue in hypercortisolism,” in 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), July 2019, pp. 11–14.
D. S. Wickramasuriya and R. T. Faghih, “A mixed filter algorithm for sympathetic arousal tracking from skin conductance and heart rate measurements in Pavlovian fear conditioning,” PloS One, vol. 15, no. 4, p. e0231659, 2020.
D. S. Wickramasuriya and R. T. Faghih, “A marked point process filtering approach for tracking sympathetic arousal from skin conductance,” IEEE Access, vol. 8, pp. 68499–68513, 2020.
D. S. Wickramasuriya, S. Khazaei, R. Kiani, and R. T. Faghih, “A Bayesian filtering approach for tracking sympathetic arousal and cortisol-related energy from marked point process and continuous-valued observations,” IEEE Access, 2023. [Online]. Available: https://doi.org/10.1109/ACCESS.2023.3334974
P. J. Soh, G. A. Vandenbosch, M. Mercuri, and D. M.-P. Schreurs, “Wearable wireless health monitoring: Current developments, challenges, and future trends,” IEEE Microwave Magazine, vol. 16, no. 4, pp. 55–70, 2015.
W. Gao, H. Ota, D. Kiriya, K. Takei, and A. Javey, “Flexible electronics toward wearable sensing,” Accounts of Chemical Research, vol. 52, no. 3, pp. 523–533, 2019.
H. F. Azgomi, D. S. Wickramasuriya, and R. T. Faghih, “State-space modeling and fuzzy feedback control of cognitive stress,” in 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019, pp. 6327–6330.
H. F. Azgomi and R. T. Faghih, “A wearable brain machine interface architecture for regulation of energy in hypercortisolism,” in 53rd Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 254–258.
R. T. Faghih, M. A. Dahleh, and E. N. Brown, “An optimization formulation for characterization of pulsatile cortisol secretion,” Frontiers in Neuroscience, vol. 9, p. 228, 2015.
H. Taghvafard, M. Cao, Y. Kawano, and R. T. Faghih, “Design of intermittent control for cortisol secretion under time-varying demand and holding cost constraints,” IEEE Transactions on Biomedical Engineering, vol. 67, no. 2, pp. 556–564, 2019.
W. M. Lim, “Demystifying neuromarketing,” Journal of Business Research, vol. 91, pp. 205–220, 2018. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0148296318302716
L. Angioletti, F. Cassioli, and M. Balconi, “Neurophysiological correlates of user experience in smart home systems (SHSs): First evidence from electroencephalography and autonomic measures,” Frontiers in Psychology, vol. 11, p. 411, 2020.
E. Whelan, D. McDuff, R. Gleasure, and J. V. Brocke, “How emotion-sensing technology can reshape the workplace,” MIT Sloan Management Review, vol. 59, no. 3, pp. 7–10, Spring 2018. [Online]. Available: http://search.proquest.com.ezproxy.lib.uh.edu/docview/2023991461?accountid=7107
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
Cite this chapter
Wickramasuriya, D.S., Faghih, R.T. (2024). Introduction. In: Bayesian Filter Design for Computational Medicine. Springer, Cham. https://doi.org/10.1007/978-3-031-47104-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47103-2
Online ISBN: 978-3-031-47104-9
eBook Packages: Biomedical and Life Sciences (R0)