Invited talk at Medical College of Wisconsin Department of Biostatistics
11/14/23
The raw EEG can indicate general state, e.g., awake or asleep, but it can’t tell us much about specific neural processes that underlie perception, cognition or emotion.
Grand ERP obtained by averaging across subjects: subject-level information is lost
no uncertainty quantification
Filtering to remove drifts or noise: no clear rule for which cutoff frequency/bandwidth to use
Component search window specification
Separate ANOVA for control-experimental comparison
We address these issues by
A Gaussian process (GP) is a process f(\cdot): \mathbb{R}^D \rightarrow \mathbb{R} whose finite-dimensional distributions are all Gaussian: for any n and x_1, \dots, x_n \in \mathbb{R}^D,
f(x_1), \dots, f(x_n) \sim N_n\left(\boldsymbol \mu, \boldsymbol \Sigma\right)
Write f(\cdot) \sim GP to denote that the function f is a GP.
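As a concrete illustration of the definition (a Python sketch for exposition only; function names are mine, not from the slam package), a finite set of GP evaluations is just a draw from a multivariate normal:

```python
import numpy as np

def sq_exp_kernel(x1, x2, tau=1.0, h=1.0):
    # k(x, x') = tau^2 * exp(-(x - x')^2 / (2 h^2))
    d = x1[:, None] - x2[None, :]
    return tau**2 * np.exp(-0.5 * (d / h) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
K = sq_exp_kernel(x, x)
# By definition, (f(x_1), ..., f(x_n)) ~ N_n(mu, Sigma); jitter for stability:
f = rng.multivariate_normal(np.zeros(x.size), K + 1e-8 * np.eye(x.size))
```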
The underlying true ERP waveform is assumed to be a GP.
\text{E}(f(x)) = m(x), \qquad \text{Cov}(f(x), f(x')) = k(x, x')
Popular choices of m(\cdot) are m(x) = 0 or \text{const} for all x, or m(x) = \beta'x
We care more about the covariance function, or kernel, as it governs how the process behaves by defining the similarity between data points.
k(x, x' \mid \tau, h) = \tau^2 \exp\left(-\frac{(x - x')^2}{2h^2} \right)
k(x, x') = \exp\left(-\frac{1}{2}(x - x')^2 \right)
k(x, x') = \exp\left(-\frac{1}{2}\frac{(x - x')^2}{ {\color{red}{0.25}}^2} \right)
k(x, x') = \exp\left(-\frac{1}{2}\frac{(x - x')^2}{ {\color{red}{4}}^2} \right)
k(x, x') = {\color{red}{10^2}}\exp\left(-\frac{1}{2}(x - x')^2 \right)
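The role of the highlighted hyperparameters can be checked numerically (a sketch, not slam code): the length-scale h controls how fast correlation decays with distance, while \tau^2 sets the marginal variance.

```python
import numpy as np

def sq_exp(x, xp, tau=1.0, h=1.0):
    # k(x, x') = tau^2 * exp(-(x - x')^2 / (2 h^2))
    return tau**2 * np.exp(-0.5 * ((x - xp) / h) ** 2)

# A shorter length-scale h makes correlation decay faster with distance,
# giving wiggly sample paths; a longer h gives smooth, slowly varying paths:
corr_short = sq_exp(0.0, 1.0, h=0.25)  # essentially uncorrelated at distance 1
corr_long = sq_exp(0.0, 1.0, h=4.0)    # still highly correlated at distance 1
# A larger tau scales the marginal variance tau^2, not the correlation shape:
var_big = sq_exp(0.0, 0.0, tau=10.0)   # 10^2 = 100
```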
\begin{aligned} y_i &= f(x_i) + \epsilon _i, ~~ \epsilon _i \sim N(0, \sigma^2), ~~ i = 1, \dots, n, \\ f'(t_m) &= 0, ~~m = 1, \dots, M,\\ f(\cdot) &\sim GP \end{aligned}
[k_{01}(\bm{x}, \bm{t})]_{ij} := \frac{\partial }{\partial t_j}k(x_i, t_j), ~~[k_{11}(\bm{t}, \bm{t})]_{ij} := \frac{\partial ^2 }{\partial t_i \partial t_j}k(t_i, t_j)
k_{10}(\bm{x}, \bm{t}) = k_{01}^T(\bm{x}, \bm{t})
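For the squared-exponential kernel these derivative cross-covariances have closed forms; the sketch below (with \tau = h = 1 as defaults, names mine) implements them, and they can be verified against finite differences.

```python
import numpy as np

def k01(x, t, tau=1.0, h=1.0):
    # [k01]_{ij} = d/dt_j k(x_i, t_j) for the SE kernel; k10 = k01^T
    d = x[:, None] - t[None, :]
    return tau**2 * np.exp(-0.5 * (d / h) ** 2) * d / h**2

def k11(t, tau=1.0, h=1.0):
    # [k11]_{ij} = d^2/(dt_i dt_j) k(t_i, t_j) for the SE kernel
    d = t[:, None] - t[None, :]
    return (tau**2 / h**2) * (1.0 - (d / h) ** 2) * np.exp(-0.5 * (d / h) ** 2)
```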
t_1, \dots, t_M are parameters to be estimated.
In ERP studies, M is usually known.
Use a univariate prior t \sim \pi(t), which corresponds to assuming that f has at least one stationary point, and use the posterior of t to infer all stationary points.
A univariate prior on t leads to
Efficient computation:
Avoids possible misspecification of M, which would distort the fitted result.
\begin{aligned} y_i &= f(x_i) + \epsilon_i, ~~ \epsilon_i \sim N(0, \sigma^2), ~~i = 1, \dots, n, \\ f &\sim GP(0, k(\cdot, \cdot; \boldsymbol \theta)) , ~~ f'(t) = 0,\\ t &\sim \pi(t), ~~ \sigma^2 \sim \pi(\sigma^2), ~~ \boldsymbol \theta\sim \pi(\boldsymbol \theta) \end{aligned}
For iteration i = 1, 2, \dots until convergence:
Monte Carlo E-step: Metropolis-Hastings on t and Gibbs sampling on \sigma^2.
M-step: Given samples of t and \sigma^2, update \boldsymbol \theta to \hat{\boldsymbol \theta}^{(i+1)} by setting \begin{aligned} \hat{\boldsymbol \theta}^{(i+1)} &= \arg \max_{\boldsymbol \theta} \frac{1}{J} \sum_{j=1}^J \log p(\bm{y} \mid t_j^{(i)}, (\sigma_j^2)^{(i)}, \boldsymbol \theta) \end{aligned}
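The MCEM structure can be sketched on a toy conjugate model (y \mid t \sim N(t, 1), t \sim N(\theta, 1), so the marginal MLE is \hat\theta = y). This is not the ERP model, and the Gibbs step on \sigma^2 is omitted; only the loop shape, a Metropolis-Hastings E-step followed by an averaging M-step, carries over.

```python
import numpy as np

rng = np.random.default_rng(1)
y = 2.5       # single observation in the toy model
theta = 0.0   # initial hyperparameter value

def log_post_t(t, theta):
    # log p(t | y, theta) up to a constant: y|t ~ N(t,1), t ~ N(theta,1)
    return -0.5 * (y - t) ** 2 - 0.5 * (t - theta) ** 2

for _ in range(50):  # EM iterations
    # Monte Carlo E-step: random-walk Metropolis-Hastings on the latent t
    t, samples = theta, []
    for _ in range(500):
        prop = t + 0.5 * rng.standard_normal()
        if np.log(rng.uniform()) < log_post_t(prop, theta) - log_post_t(t, theta):
            t = prop
        samples.append(t)
    # M-step: maximize the averaged complete-data log-likelihood over theta;
    # for this toy Gaussian model the maximizer is the posterior-sample mean
    theta = np.mean(samples[100:])  # drop burn-in
```

The iterates converge toward the marginal MLE \hat\theta = y = 2.5, up to Monte Carlo noise.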
Estimates the amplitude and latency as well as the ERP waveform in a unified framework.
Supports a multi-factor ANOVA setting that is capable of examining how multiple factors or covariates affect ERP components’ latency and amplitude.
A Bayesian approach that provides probabilistic statements about the amplitude and latency of ERP components.
The hierarchical nature of the model provides estimates at both the grand group and individual subject levels.
y_{igs} = f_{gs}(x_i) + \epsilon_{igs}, ~~ \epsilon_{igs} \stackrel{\rm iid}{\sim}N(0, \sigma^2).
Time point i = 1, \dots, n
Factor level g = 1, \dots, G
Subject s = 1, \dots, S_{g}
f_{gs} \mid \bm{t}_{gs} \sim \text{DGP}(0, k(\cdot, \cdot; \boldsymbol \theta), \bm{t}_{gs}), where \bm{t}_{gs} = \{t_{gs}^m\}_{m = 1}^M.
\begin{aligned} t_{gs}^m \mid r_{g}^m, \eta_{g}^m &\stackrel{\rm iid}{\sim}\text{gbeta}\left(r_{g}^m\eta_{g}^m, \, (1 - r_{g}^m)\eta_{g}^m, \, a^{m}, \, b^{m}\right), \, s = 1, \dots, S_{g}. \end{aligned}
Subject-level t_{gs}^m \in [a^{m}, b^{m}]
Group-level r_{g}^m \in (0, 1): location of prior mean in [a^{m}, b^{m}]
{\mathbb E}\left(t_{gs}^m\right) = (1 - r_{g}^m)a^{m} + r_{g}^mb^{m}
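The gbeta prior and its mean formula can be checked numerically. Here gbeta(\alpha, \beta, a, b) is taken to be a Beta(\alpha, \beta) variable rescaled to [a, b]; the function name and window values below are illustrative, not from the slam package.

```python
import numpy as np

def rgbeta(n, r, eta, a, b, rng):
    # Generalized beta on [a, b]: a Beta(r*eta, (1-r)*eta) draw rescaled
    return a + (b - a) * rng.beta(r * eta, (1 - r) * eta, size=n)

rng = np.random.default_rng(2)
r, eta, a, b = 0.6, 20.0, 0.06, 0.14  # illustrative window, e.g. 60-140 ms
t = rgbeta(100_000, r, eta, a, b, rng)
# Empirical mean should match (1 - r) * a + r * b = 0.108
```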
With the link function \phi: (0, 1) \rightarrow (-\infty, \infty), and z_j \in \{0, 1\}, j = 1, \dots, G-1,
\phi\left(r_{g}^m\right) = \beta_0^m + \beta_{1}^mz_{1} + \cdots + \beta_{G-1}^mz_{G-1}.
Accommodates any linear model with categorical and numerical variables
Can describe interactions
Any link function with domain (0, 1) can be used: logit, probit, complementary log-log, etc.
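With a logit link and G = 2 groups, the regression on r_g^m works as follows (a sketch with illustrative coefficient values; the inverse link maps the linear predictor back into (0, 1)):

```python
import numpy as np

def inv_logit(x):
    # Inverse of the logit link: maps (-inf, inf) back to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# Two groups, dummy z1 in {0, 1}; illustrative coefficients:
beta0, beta1 = 0.3, -0.5
r_group1 = inv_logit(beta0)          # z1 = 0: phi(r) = beta0
r_group2 = inv_logit(beta0 + beta1)  # z1 = 1: phi(r) = beta0 + beta1
```

A negative beta1 shifts the prior-mean location of the latency earlier in the window for group 2 relative to group 1.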
\begin{aligned} \beta_{j}^m &\sim N(\mu_j^m, (\sigma_j^m)^2), ~~ j = 0, 1, \dots, G-1\\ \eta_{g}^m &\sim Ga(\alpha_{\eta}, \beta_{\eta}),\\ \pi(\sigma^2, \tau^2, h) &= \pi(\sigma^2) \pi(\tau^2) \pi(h) = IG(\alpha_{\sigma}, \beta_{\sigma})IG(\alpha_{\tau}, \beta_{\tau})Ga(\alpha_{h}, \beta_{h}). \end{aligned}
For iteration i = 1, 2, \dots until convergence:
Monte Carlo E-step:
M-step: Given samples of \mathbf{t} and \sigma^2, update \boldsymbol \theta to \hat{\boldsymbol \theta}^{(i+1)} by setting \begin{aligned} \hat{\boldsymbol \theta}^{(i+1)} &= \arg \max_{\boldsymbol \theta} \frac{1}{L} \sum_{l=1}^L \log p(\bm{y} \mid \mathbf{t}_l^{(i)}, (\sigma_l^2)^{(i)}, \boldsymbol \theta) \end{aligned}
where
\bm{y}_{gs} \mid \mathbf{t}_{gs}, \hat{\boldsymbol \theta}^{(i)} \stackrel{\rm ind}{\sim}N\left(\boldsymbol \mu_{gs}, \boldsymbol \Sigma_{gs} +\sigma^2\mathbf{I}\right).
\begin{aligned} f_{1s}(x) &= -2\sin(2 \pi x + s/15 - 0.3),\\ f_{2s}(x) &= \cos(2\pi x + s/10 + 1.2) - 3x \end{aligned}
Method | Sine | Cosine |
---|---|---|
SLAM-posterior mean | 0.0037 (0.0000) | 0.0068 (0.0002) |
SLAM-median | 0.0036 (0.0001) | 0.0068 (0.0002) |
LOESS1 | 0.0119 (0.0021) | 0.0251 (0.0036) |
Method | Sine | Cosine |
---|---|---|
SLAM-posterior mean | 0.0421 (0.0004) | 0.0582 (0.0016) |
SLAM-median | 0.0421 (0.0005) | 0.0581 (0.0016) |
LOESS1 | 0.0661 (0.0021) | 0.0654 (0.0066) |
Parameter | True value | Mean (SD) of posterior mean, 10 subjects | Mean (SD) of posterior mean, 20 subjects | Mean (SD) of RMSE, 10 subjects | Mean (SD) of RMSE, 20 subjects
---|---|---|---|---|---
r_1^1 | 0.57 | 0.54 (0.033) | 0.56 (0.027) | 0.077 (0.016) | 0.050 (0.012) |
r_1^2 | 0.43 | 0.46 (0.028) | 0.44 (0.024) | 0.077 (0.012) | 0.051 (0.011) |
r_2^1 | 0.45 | 0.45 (0.035) | 0.45 (0.029) | 0.075 (0.010) | 0.051 (0.009) |
r_2^2 | 0.67 | 0.63 (0.033) | 0.65 (0.025) | 0.084 (0.017) | 0.052 (0.010) |
\beta_0^1 | 0.3 | 0.17 (0.135) | 0.24 (0.109) | 0.314 (0.064) | 0.205 (0.049) |
\beta_0^2 | -0.3 | -0.17 (0.114) | -0.24 (0.100) | 0.317 (0.049) | 0.210 (0.043) |
\beta_1^1 | -0.5 | -0.38 (0.276) | -0.44 (0.222) | 0.463 (0.115) | 0.319 (0.100) |
\beta_1^2 | 1 | 0.72 (0.244) | 0.88 (0.203) | 0.538 (0.127) | 0.351 (0.093) |
\sigma | 0.5 | 0.57 (0.003) | 0.57 (0.003) | 0.074 (0.002) | 0.074 (0.002) |
Experiment on speech recognition.
The N100 component (60–140 ms) captures phonological (syllable) representation.
The P200 component (140–220 ms) represents higher-order perceptual processing.
Novel Bayesian SLAM estimates the latency and amplitude of ERP components with uncertainty quantification.
Solves the search window specification problem by producing a posterior distribution of the latency.
Incorporates the ANOVA and possibly a latent generalized linear model structure in a single unified framework.
Examines both individual- and group-level differences, facilitating comparison across characteristics or factors such as age and gender.
R package slam at https://github.com/chenghanyustats/slam
Work in progress: