class: left, inverse, middle, title-slide

# Bayesian Modeling of Complex-valued fMRI 🧠

## Spatiotemporal modeling via kernel convolution

### Dr. Cheng-Han Yu

### Mathematical and Statistical Sciences
Marquette University

### Statistics group, KAUST

March 21, 2022

---

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  TeX: {
    Macros: {
      var: "{\\mathrm{Var}}",
      cov: "{\\mathrm{Cov}}",
      bsbeta: "{\\boldsymbol{\\beta}}",
      bsalpha: "{\\boldsymbol{\\alpha}}",
      bsep: "{\\boldsymbol{\\epsilon}}",
      bseta: "{\\boldsymbol{\\eta}}",
      ga: "{\\boldsymbol{\\gamma}}",
      bfX: "{\\mathbf{X}}",
      X: "{\\mathbf{X}}",
      I: "{\\mathbf{I}}",
      D: "{\\mathbf{D}}",
      y: "{\\mathbf{y}}",
      ind: "{\\stackrel{\\rm ind}{\\sim}}",
      iid: "{\\stackrel{\\rm iid}{\\sim}}",
      bfS: "{\\mathbf{S}}",
      bfpsi: "{\\boldsymbol{\\psi}}",
      Lam: "{\\boldsymbol{\\Lambda}}"
    }
  }
});
</script>

<!-- `$$\newcommand{\bfX}{\mathbf{X}}$$` -->
<!-- `$$\newcommand{\ga}{\boldsymbol{\gamma}}$$` -->
<!-- `$$\newcommand{\bsbeta}{\boldsymbol{\beta}}$$` -->
<!-- `$$\newcommand{\bseta}{\boldsymbol{\eta}}$$` -->
<!-- layout: true -->

<!-- ## small figure caption -->
<!-- https://stackoverflow.com/questions/45018397/changing-the-font-size-of-figure-captions-in-rmarkdown-html-output -->

<style>
p.caption {
  font-size: 0.6em;
}
</style>

<!-- <div class="my-footer"> -->
<!-- <span> -->
<!-- <a href="https://d2l.mu.edu/d2l/home/431031" target="_blank">D2L Website</a> -->
<!-- </span> -->
<!-- </div> -->

<!-- <style type="text/css"> -->
<!-- .caption { -->
<!--   font-size: x-small; -->
<!--   text-align: center; -->
<!-- } -->
<!-- </style> -->

---

## (Task-based) Functional Magnetic Resonance Imaging?

- fMRI is a noninvasive neuroimaging method that measures blood-oxygen-level-dependent (BOLD) signals in the brain.
  + **Large-dimensional**
  + **Low signal-to-noise ratios** (SNRs)
  + Sophisticated **spatio-temporal dependence** across voxels

<div class="figure" style="text-align: center">
<img src="./img/fmri_ts.png" alt="Source: Martha Skup (2010)" width="80%" />
<p class="caption">Source: Martha Skup (2010)</p>
</div>

---

## Why Complex-valued Models

- Raw fMRI data are **complex-valued** (CV) after Fourier transform and inverse Fourier transform image reconstruction.

- Most fMRI studies use **magnitude-only** (MO) data, and the **phase information is discarded**.

<div class="figure" style="text-align: center">
<img src="./img/cv_ks.png" alt="Source: Lindquist (2008) and Adali et al. (2011)" width="60%" />
<p class="caption">Source: Lindquist (2008) and Adali et al. (2011)</p>
</div>

---

## Complex-valued fMRI (CV-fMRI) Data

<img src="./img/cplximg400.png" width="100%" style="display: block; margin: auto;" />

---

## Why Bayesian Models of CV-fMRI

- Sophisticated Bayesian spatiotemporal models have been developed for MO-fMRI (Zhang, Guindani, et al., 2016).

--

> *... the most common software packages for fMRI analysis can result in false-positive rates of up to 70% --- Eklund et al. (PNAS 2016)*

--

.question[
Can we use the full information in CV-fMRI while keeping the advantages of the Bayesian framework?
]

--

- Start with the most common objective: **detecting brain activations in task-based fMRI**.
<div class="figure" style="text-align: center"> <img src="./img/fmri.jpg" alt="Source: https://blog.applysci.com" width="50%" /> <p class="caption">Source: https://blog.applysci.com</p> </div> --- ## Goal: Computationally Effcient Models for CV-fMRI Propose - computationally efficient models/algorithms that + *jointly* analyze real and imaginary parts of CV-fMRI data + better detect activation *at the voxel level* + infer brain connectivity (in preparation) -- <br> <br> - **Complex-valued Expectation-Maximization Variable Selection with Autoregressive Processes (CV-EMVS-AR)** - **(Multi-subject) Bayesian Spatiotemporal Model via Kernel Convolution and Autoregressive Processes (CV-KC-AR)** --- ## Background: Rowe-Logan Constant Phase Model From Rowe and Logan (2004), for time `\(t = 1:T\)`, voxel `\(v = 1:V\)` and `\(p\)` tasks, `$$\begin{align*} y_t^v &= \rho_t^v\cos(\phi^v) + i\rho_t^v\sin(\phi^v)+\eta_t^v,\nonumber \\ \rho_t^v &= \beta_0^v + \beta_1^vx_{1, t} + \cdots + \beta_{p}^vx_{p, t} \end{align*}$$` -- `$$\begin{bmatrix} y_1^v \\ \vdots\\ y_T^v \end{bmatrix}=\begin{bmatrix} 1 & x_{11} & \cdots & x_{p1} \\ \vdots & \vdots & \ddots &\vdots \\ 1 & x_{1T} & \cdots & x_{pT} \end{bmatrix} \begin{bmatrix} \beta_0^v \cos(\phi^v) + i \beta_0^v \sin(\phi^v) \\ \vdots \\ \underbrace{ \beta_p^v \cos(\phi^v)}_{\ga_{Re}^v}+ i \underbrace{ \beta_p^v \sin(\phi^v)}_{\ga_{Im}^v} \end{bmatrix}+\begin{bmatrix}\eta_1^v \\ \vdots\\ \eta_T^v \end{bmatrix}$$` - `\(\mathbf{y}^v = \bfX\ga_{Re}^v + i\bfX\ga_{Im}^v + \bseta ^v\)` - `\(\mathbf{y}^v = \bfX\ga^v + \bseta ^v\)`, `\(\, \, \bseta ^v \sim CN_T(\mathbf{0}, \mathbf{\Gamma}_v, \mathbf{C}_v)\)`. `\(\mathbf{\Gamma}_v \in \mathcal{C}^{T \times T}\)`, `\(\mathbf{C}_v \in \mathcal{C}^{T \times T}\)` - `\(\bseta ^v\)` is [*circular* normal](https://en.wikipedia.org/wiki/Complex_normal_distribution) if `\(\mathbf{C}_v = 0\)` --- ## Background: Real-valued Representation `\(\mathbf{y}^v = \bfX\ga^v + \bseta ^v\)`, `\(\, \, \bseta ^v \sim CN_T(\mathbf{0}, \mathbf{\Gamma}_v, \mathbf{C}_v)\)` <br> <img src="./img/real_rep.png" width="40%" style="display: block; margin: auto;" /> <br> or equivalently, `\(\mathbf{y}_r^v = \bfX^r\ga_r^v + \bseta_r ^v\)`, where `\(\bseta_r ^v \sim N_{2T}\left( \mathbf{0}, \mathbf{\Sigma}_v \right)\)` <br> <img src="./img/real_sigma.png" width="30%" style="display: block; margin: auto;" /> -- <br> <div class="figure" style="text-align: center"> <img src="./img/boldconv.png" alt="Source: Lindquist (2008)" width="70%" /> <p class="caption">Source: Lindquist (2008)</p> </div> --- ## Brain Activation as Variable Selection - `\(\mathbf{y}^v = \bfX\ga^v + \bseta ^v\)` - `\(\gamma_j^v \neq 0\)` iff voxel `\(v\)` at task `\(j\)` is activated (Xia, Liang, and Wang (2009); Zhang, Guindani, and Vannucci (2015)). - **Complex normal spike-and-slab** prior `$$\gamma^v_j \sim (1 - \psi_j^v) \underbrace{CN(0, \omega_0, \lambda_0)}_{spike} + \psi_j^v \underbrace{CN(0, \omega_1, \lambda_1)}_{slab},$$` `\(\omega_0 < \omega_1 \in \mathcal{C}, \quad \lambda_0 < \lambda_1 \in \mathcal{C}\)`. <img src="index_files/figure-html/unnamed-chunk-5-1.png" width="45%" style="display: block; margin: auto;" /> --- ## Brain Activation as Variable Selection - **Complex normal spike-and-slab** prior `$$\gamma^v_j \sim (1 - \psi_j^v) \underbrace{CN(0, \omega_0, \lambda_0)}_{spike} + \psi_j^v \underbrace{CN(0, \omega_1, \lambda_1)}_{slab},$$` `\(\omega_0 < \omega_1 \in \mathcal{C}, \quad \lambda_0 < \lambda_1 \in \mathcal{C}\)`. 
---

## CV-EMVS (Yu, Prado, Ombao, and Rowe, 2018)

`$$\begin{align*} \mathbf{y}^v &= \bfX \ga^v + \bseta^v, \quad \bseta^v \sim CN_T(\mathbf{0}, 2\sigma_v^2\I, \mathbf{0}), \quad v = 1:V,\\ \gamma^v_j \mid \psi^v_j&\stackrel{\tiny indep}{\sim}(1-\psi^v_j)CN_1(0, \sigma_v^2\omega_0, \sigma_v^2\lambda_0) + \psi^v_jCN_1(0, \sigma_v^2\omega_1, \sigma_v^2\lambda_1), \\ \omega_0 & << \omega_1, \quad \lambda_0 << \lambda_1, \quad j = 1:p,\\ \psi^v_j \mid \theta_j &\stackrel{\tiny IID}{\sim} Bernoulli(\theta_j), \quad \theta_j \stackrel{\tiny IID}{\sim} Beta(a_{\theta}, b_{\theta}), \quad \sigma_v^2 \sim IG(a_{\sigma}, b_{\sigma}) \end{align*}$$`

--

- We determine whether the real and imaginary parts of `\(\gamma_j^v\)` are zero **jointly**: `\(\gamma_j^v \neq 0\)` if `\(Pr(\psi_{j}^v = 1\mid \ga^*, \theta^*, \sigma^*, \mathbf{y}) > \delta\)`.
  + E-step: derive .midi[ `$$Pr(\psi_{j}^v = 1\mid \ga^{(l)}, \theta^{(l)}, \sigma^{(l)}, \mathbf{y})$$` ]
  + M-step: obtain the maximum a posteriori estimates .midi[ `$$\left(\ga^{(l+1)}, \theta^{(l+1)}, \sigma^{(l+1)}\right) = \arg \max \mathrm{E}_{\bfpsi|\cdot} \left[\log \pi \left( \ga, \bfpsi, \theta, \sigma \mid \y\right) \mid \ga^{(l)}, \theta^{(l)}, \sigma^{(l)}, \y \right]$$` ]

---

## CV-EMVS (Yu, Prado, Ombao, et al., 2018)

`$$\begin{align*} \mathbf{y}^v &= \bfX \ga^v + \bseta^v, \quad \bseta^v \sim CN_T(\mathbf{0}, 2\sigma_v^2\I, \mathbf{0}), \quad v = 1:V,\\ \gamma^v_j \mid \psi^v_j&\stackrel{\tiny indep}{\sim}(1-\psi^v_j)CN_1(0, \sigma_v^2\omega_0, \sigma_v^2\lambda_0) + \psi^v_jCN_1(0, \sigma_v^2\omega_1, \sigma_v^2\lambda_1), \\ \omega_0 & << \omega_1, \quad \lambda_0 << \lambda_1, \quad j = 1:p,\\ \psi^v_j \mid \theta_j &\stackrel{\tiny IID}{\sim} Bernoulli(\theta_j), \quad \theta_j \stackrel{\tiny IID}{\sim} Beta(a_{\theta}, b_{\theta}), \quad \sigma_v^2 \sim IG(a_{\sigma}, b_{\sigma}) \end{align*}$$`

- Under a circular normal prior on `\(\gamma_j^v\)`, i.e., `\(\lambda_0 = \lambda_1 = 0\)`, .midi[ `$$\begin{align*} \hat{\ga}^v_{Re}&=(\bfX'\bfX + \color{black} 2 \color{black} \D_v)^{-1}(\X'\y^v)_{Re},\\ \hat{\ga}^v_{Im}&=(\bfX'\bfX + \color{black} 2 \color{black} \D_v)^{-1}(\X'\y^v)_{Im}. \end{align*}$$` ]

---

## Activation and Strength Maps: Low SNR

- The CV model outperforms the MO models in terms of activation detection and strength estimation.

- CV: *CV-EMVS*

- MO: *MO-EMVS*

- ALA: *Adaptive Lasso*

<br>

<img src="./img/sim1strall2.png" width="90%" style="display: block; margin: auto;" />

---

## Human CV-fMRI (Karaman, Bruce, and Rowe, 2015)

<img src="./img/CplxActFWEpt05ExpR.png" width="33%" /><img src="./img/MagActFWEpt05ExpR.png" width="33%" /><img src="./img/DtngActFWEpt05ExpR.png" width="33%" />

- Unilateral finger-tapping data of dimension 96 x 96 x 510.

- KBR-CV: false positives outside the brain area.

- KBR-MO: low detection power.

- DeTeCT-ING: nonlinear model using the MR signal equation.

---

## CV-EMVS with AR Noise

`$$\begin{eqnarray*} y_t^v & = & \gamma_1^{v} + \gamma_2^{v} t/T + \gamma_3^{v} x_t + \eta_t^v, \\ \eta^v_t & = & \varphi_v \eta^v_{t-1} + \zeta_t^v, \quad \zeta_t^v \,\, \iid \,\, CN_1(0, 2\sigma_v^2, 0), \\ \varphi_v & \sim & Uniform(-1, 1). \end{eqnarray*}$$`
.center[
<img src="./img/arcoef_cv_sigv_phiv.png" width="40%" /><img src="./img/CV_ar1_sigv_phiv_test.png" width="40%" />
]

---

## General Bayesian spatio-temporal model

- CV-EMVS-AR greatly improves detection but does not explicitly model the spatial dependence across voxels.

- The execution of biological tasks involves populations of neurons spanning several voxels, rather than a single voxel.

- Propose Bayesian variable selection tools coupled with a spatial kernel convolution (KC) structure as well as temporal autoregressive processes for CV-fMRI data (Yu, Prado, Ombao, and Rowe, 2022).

---

## Kernel Convolution: Definition

- Given a kernel `\(k(z; \phi), z \in \mathcal{S}\)` and a white noise process `\(w(u), u \in \mathcal{S}\)`, the kernel convolution is `$$S(z) = \int_{\mathcal{S}} k(z - u; \phi)w(u)du.$$`

- In practice, for sites `\(u_1, ..., u_D\)`, the process is defined as `$$S(z) = \sum_{d=1}^Dk(z-u_d; \phi)w(u_d)$$`

- The spatial process `\(S(z)\)` governs the spatial dependence of voxels in the image, and affects the probability of voxels being activated.

---

## Kernel Convolution: Properties

<img src="./img/img_centroid.png" width="65%" style="display: block; margin: auto;" />

- **Dimension reduction**: A small number `\(D\)` of parameters `\(w(u_1), ..., w(u_D)\)` governs the entire process `\(S(z)\)`, which may have a very large number of measurements at locations `\(z_1, ..., z_V\)` `\((V >> D)\)`.

- `\(\bfS = (S(z_1), ..., S(z_V))'\)` is the vector of voxel-level spatial effects.

- `\(\mathbf{w} = (w(u_1), ..., w(u_D))'\)` is the `\(D\)`-dimensional latent spatial effect at the selected `\(D\)` sites.

---

## Kernel convolution (KC) vs. Gaussian process (GP)

.pull-left[
.center[
.large[
.red[
**KC**
]
]
]

1. Select `\(D\)` sites to be representative of the image.

2. Compute the _**voxel**-level spatial effects_ `$$\bfS_{(j)} = (S_{j}^1, ..., S_{j}^V)$$` via kernel convolution using the latent spatial process formed by the selected `\(D\)` sites.
]

.pull-right[
.center[
.large[
.red[
**GP**
]
]
]

1. Parcellate the image into `\(G\)` clusters of voxels.

2. Compute the _**region**-level spatial effects_ `$$\bfS_{(j)} = (S_{j}^1, ..., S_{j}^G)$$` based on a Gaussian process and the "location" of the `\(G\)` regions. (Bezener et al. 2018)
]

--

.alert[
`\(D \neq G\)` but we use `\(D = G\)` for comparison.
]

---

## Advantages of KC over GP

<img src="./img/img_kc.png" width="33%" /><img src="./img/img_gp.png" width="33%" /><img src="./img/nonconvex_centroid.png" width="33%" />

.pull-left[
.center[
.large[
.red[
**KC**
]
]
]

- Circumvents the need to assign each voxel to a group.
]

.pull-right[
.center[
.large[
.red[
**GP**
]
]
]

- Requires defining and computing the locations of the regions.

- Detection performance is sensitive to the shape of the regions.
]

---

## Complex-valued Bayesian Spatiotemporal Model

- **Likelihood:** With the indicators `\(\psi^v_{j}\)` such that `\(\gamma^v_{j} \neq 0\)` if `\(\psi^v_{j} = 1\)` and `\(\gamma_{j}^v = 0\)` if `\(\psi_{j}^v = 0\)`, `$$\begin{align*} \mathbf{y}^v &= \bfX(\bfpsi^v)\ga^v(\bfpsi^v) + \bseta^v\\ \bseta^v &\sim CN_T(\mathbf{0}, 2\sigma_v^2 \Lam_v, \mathbf{0}), \end{align*}$$` where `\(\Lam_v\)` is the AR(1) correlation matrix.

- An empirical Bayes estimator `\(\hat{\rho}_v\)` (and hence `\(\hat{\Lam}_v\)`) of the AR coefficient is used for computational efficiency (an illustrative sketch follows on the next slide).

--

- **Complex-valued g-prior on `\(\ga^v\)`:** `$$\begin{align*} \ga^v(\bfpsi^v) \mid \bfpsi^v, \sigma_v^2 & \,\, \ind \,\, CN_{p}\left(\hat{\ga}^v(\bfpsi^v), 2T\sigma_v^2(\X'(\bfpsi^v) \hat{\Lam}_v^{-1} \X(\bfpsi^v))^{-1}, \mathbf{0}\right)\\ \hat{\ga}^v(\bfpsi^v) & = (\X'(\bfpsi^v)\hat{\Lam}_v^{-1}\X(\bfpsi^v))^{-1}\X'(\bfpsi^v)\y^v \end{align*}$$`

- Integrate `\(\ga^v\)` out for faster computation.
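---

## AR(1) Correlation Structure: An Illustrative Sketch

A minimal R sketch (not part of the original analysis): the lag-one moment estimate below merely stands in for the empirical Bayes estimator `\(\hat{\rho}_v\)`, and all values are hypothetical. It shows how an estimated AR coefficient determines the correlation matrix `\(\hat{\Lam}_v\)` with entries `\(\hat{\rho}_v^{|i - j|}\)`.

```r
set.seed(1)
T_len    <- 200
rho_true <- 0.5

# hypothetical residual series from one voxel (real part shown; the imaginary
# part is handled analogously)
resid <- as.numeric(arima.sim(list(ar = rho_true), n = T_len))

# lag-1 autocorrelation as a simple plug-in estimate of the AR coefficient
rho_hat <- acf(resid, lag.max = 1, plot = FALSE)$acf[2]

# AR(1) correlation matrix with (i, j) entry rho_hat^|i - j|
Lambda_hat <- rho_hat^abs(outer(1:T_len, 1:T_len, "-"))
Lambda_hat[1:4, 1:4]
```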
---

## Spatial Priors

.pull-left[
.center[
.large[
.red[
**KC**
]
]
]
.midi[
`$$\pi\left(\bfpsi_{(j)} \bigm| \bfS_{(j)}\right) = \prod_{v = 1}^{V} \pi\left(\psi^v_{j} \bigm| S_{j}^v\right)$$`

`$$\psi^v_j \bigm| S_j^v \,\,\ind\,\, Bernoulli\left(\frac{1}{1 + e^{-\left( \color{red}{a_j^d + S_j^v} \right)}}\right)$$`

`$$\color{red}{S_j^v = \sum_{d=1}^{D}k\left(z_v-s_d; \phi^d\right)w_{j}^d}$$`

`$$w_j^d \mid \tau_j^2 \,\,\ind\,\, N\left(0, \tau_j^2\right)$$`

`$$\tau_j^2 \,\,\iid\,\, IG\left(a_{\tau}, b_{\tau}\right)$$`

`$$\phi^d \,\,\iid\,\, Ga\left(a_{\phi}, b_{\phi}\right)$$`
]
]

--

.pull-right[
.center[
.large[
.red[
**GP**
]
]
]
.midi[
`$$\pi\left(\bfpsi_{(j)} \bigm| \bfS_{(j)}\right) = \prod_{g = 1}^{G} \prod_{v \in \mathcal{R}_g} \pi\left(\psi^v_{j} \bigm| S_{j}^g\right)$$`

`$$\psi^v_j \bigm| S_j^g \,\,\ind\,\, Bernoulli\left(\frac{1}{1 + e^{-\left( \color{red}{a_j^g + S_j^g} \right)}}\right)$$`

`$$\color{red}{\bfS_{(j)} \bigm| \delta_j^2, r_j \,\,\ind\,\, N\left(0, \delta_j^2 \Gamma_j\right)}$$`

`$$\Gamma_j(i, k) = \exp \left( -\frac{ \| s_i - s_k \| }{r_j} \right)$$`

`$$\pi\left(\delta_j^2\right) \propto \delta_j^{-2}$$`

`$$r_j \sim Ga(a_j, b_j)$$`
]
]

---

## Choice of Kernels

- **Gaussian** kernel `\(k(z_v - u_d; \phi) = \exp\left( -\frac{\|z_v - u_d\|^2}{2\phi}\right)\)`

- **Bezier** kernel `$$k(z_v - u_d; \nu, \phi) = \begin{cases} \left(1 - \frac{\|z_v - u_d \|^2}{\phi^2}\right)^\nu, & \quad \|z_v - u_d \| < \phi\\ 0, & \quad \text{ otherwise } \end{cases},$$` where `\(\nu\)` is the smoothness parameter and `\(\phi\)` is the range parameter.

--

- The Bezier kernel has compact support, which helps capture neighboring spatial dependence:
  + it avoids unrealistically relating any two distant voxels, and
  + it lets the model learn the neighborhood structure of a given voxel by learning `\(\phi\)`.

---

## Bezier Kernel: `\(\nu = 2\)`; `\(\phi = 2, 4, 6\)`

<img src="./img/bezier_phi.png" width="85%" style="display: block; margin: auto;" />

- The figure shows what the Bezier kernel looks like for different values of the range parameter `\(\phi\)`.

---

## Bezier Kernel: `\(\nu = 0.5, 2, 5\)`; `\(\phi = 3\)`

<img src="./img/bezier_nu.png" width="85%" style="display: block; margin: auto;" />

---

## Simulated data with AR coefficient 0.5

- Estimate `\(Pr\left(\psi_j^v = 1 \Bigm| \y\right)\)` by the number of 1s of `\(\psi_j^v\)` in the posterior sample divided by the number of MCMC iterations.

.center[
.red[
Temporal vs. Non-temporal
<br>
]

|Model | Sensitivity | Specificity | Precision | Accuracy | F1 | MCC|
|-------|--------------|-------------|-----------|----------|----|----|
|CV-KC-AR | 0.92 | **0.99** | **0.99** | **0.99** | **0.95** | **0.95** |
|CV-GP-AR | 0.82 | 0.99 | 0.97 | 0.97 | 0.89 | 0.88 |
|CV-EMVS-AR | 0.96 | 0.99 | 0.93 | 0.98 | 0.94 | 0.93 |
|CV-KC | 1 | 0.79 | 0.47 | 0.82 | 0.63 | 0.61 |
]

---

## Simulated data with AR coefficient 0.5

- Estimate `\(Pr\left(\psi_j^v = 1 \Bigm| \y\right)\)` by the number of 1s of `\(\psi_j^v\)` in the posterior sample divided by the number of MCMC iterations.

.center[
.red[
Temporal vs.
Non-temporal
<br>
]

<img src="./img/act_ar.png" width="85%" style="display: block; margin: auto;" />
]

---

## Simulated data with AR coefficient 0

.center[
.red[
Spatial vs. Non-spatial
<br>
]

|Model | Sensitivity | Specificity | Precision | Accuracy | F1 | MCC|
|-------|--------------|-------------|-----------|----------|----|----|
|CV-KC | **0.79** | **0.99** | **0.99** | **0.97** | **0.87** | **0.86** |
|CV-GP | 0.58 | 0.99 | 0.99 | 0.93 | 0.73 | 0.72 |
|CV-EMVS | 0.66 | 0.99 | 0.92 | 0.94 | 0.77 | 0.75 |

<br>
]

---

.pull-left[
.center[
.large[
.red[
**KC**
]
]
]

<img src="./img/spatialeffect_cvkc_dot.png" width="100%" style="display: block; margin: auto;" />

- KC produces a finer and more reasonable latent spatial effect map.

`$$\psi^v \bigm| S^v \,\,\ind\,\, Bernoulli\left(\frac{1}{1 + e^{- \color{red}{S^v}}}\right)$$`

`$$\color{red}{S^v = \sum_{d=1}^{D}k\left(z_v-s_d; \phi\right)w^d}$$`
]

.pull-right[
.center[
.large[
.red[
**GP**
]
]
]

<img src="./img/spatialeffect_cvgp.png" width="100%" style="display: block; margin: auto;" />

- GP forces voxels in the same region to share the same effect.

`$$\psi^v_j \bigm| S_j^g \,\,\ind\,\, Bernoulli\left(\frac{1}{1 + e^{- \color{red}{S_j^g}}}\right)$$`
]

---

## KC is Less Sensitive to Dimension Reduction

<img src="./img/cvgp_G.png" width="60%" style="display: block; margin: auto;" /><img src="./img/cvkc_G_dot.png" width="60%" style="display: block; margin: auto;" />

<br>

.center[
It matters because the degree of dimension reduction drives the computing cost, as the next slide shows.
]

---

## Computing Time

- Average computing time over 10 runs of the MCMC algorithms.

- Time unit: seconds per 1000 MCMC iterations.

- Columns: number of kernel sites (KC) / regions (GP), `\(D = G\)`.

|Model | `\(16\)` | `\(25\)` | `\(100\)` |
|-------|-------|-------|-----------|
|CV-KC | 0.51 | **0.59** | 1.48 |
|CV-GP | 0.30 | 0.41 | **3.36** |

- In this example, the KC model needs only about 15% to 20% of the GP model's computing time per 1000 MCMC iterations to reach similar activation-detection performance.

---

## CV vs. MO

- The CV model outperforms the MO model, especially when signals are noisy.

- The MO models improve more when an explicit spatial prior is included.

<img src="./img/ROC_CV_MO_LL_new.png" width="85%" style="display: block; margin: auto;" />

---

## MO-GP Has Region Probability Inflation

If a spatial region contains truly activated voxels, the non-activated voxels in that region will also have a relatively high probability of activation `\(\implies\)` more false positives.

.pull-left[
.center[
.large[
.red[
**CV**
]
]
]

<img src="./img/postprob_cv.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
.center[
.large[
.red[
**MO**
]
]
]

<img src="./img/postprob_mo.png" width="100%" style="display: block; margin: auto;" />
]
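---

## Region Probability Inflation: A Toy Illustration

A back-of-the-envelope illustration of the point above (the intercept and spatial-effect values are hypothetical, not estimates from the paper): under a region-level effect shared by all voxels in the region, non-activated voxels inherit the inflated effect, whereas a voxel-level (KC-style) effect does not force this.

```r
a        <- -3                                 # hypothetical baseline intercept
S_region <- 2.5                                # shared region-level effect, pulled up by the active voxels
S_voxel  <- c(active = 2.5, nonactive = -0.5)  # hypothetical voxel-level effects

plogis(a + S_region)   # every voxel in the region: about 0.38
plogis(a + S_voxel)    # voxel-level: active about 0.38, non-active about 0.03
```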
---

## Spatial Models Encourage Activation in Clusters

<img src="./img/cvkc_activation05_dot.png" width="33%" /><img src="./img/cvgp_activation05.png" width="33%" /><img src="./img/em_activation05.png" width="33%" />

---

## Zoom In

.center[
<img src="./img/CVKC_probmap_dot.png" width="40%" /><img src="./img/CVGP_probmap.png" width="46%" />
]

---

<img src="./img/CVKC_spatial_dot.png" width="65%" style="display: block; margin: auto;" />

<img src="./img/CVGP_spatial_line.png" width="65%" style="display: block; margin: auto;" />

---

## Multi-resolution

- `\(D\)` coarse-resolution sites (white dots) and `\(H\)` fine-resolution sites (green dots)

`$$S^v = \sum_{d=1}^{D}k_c\left(z_v-s_d; \phi_c \right)w^d + \sum_{h=1}^{H}k_f\left(z_v-s_h; \phi_f \right)b^h$$`

<img src="./img/multires_all_dot.png" width="100%" style="display: block; margin: auto;" />

---

## Take-home Message

- Complex-valued modeling improves activation detection.

- CV-EMVS and CV-KC-AR are computationally efficient.

- Spatial models encourage activation in clusters.

- The MO models need a sophisticated spatial structure to reach performance comparable to the CV models.

- The KC models are flexible and outperform the baseline GP models.
  + Any valid kernel can be used.
  + They avoid dealing with the shape of regions.
  + How a voxel is influenced by its neighboring voxels is learned by the model.
  + They are less affected by dimension reduction, leading to fast computation.

---

## References

[1] M. Karaman, I. P. Bruce, and D. B. Rowe. "Incorporating relaxivities to more accurately reconstruct MR images". In: _Magnetic Resonance Imaging_ 33 (2015), pp. 374-384.

[2] D. B. Rowe and B. R. Logan. "A complex way to compute fMRI activation". In: _NeuroImage_ 23 (2004), pp. 1078-1092.

[3] J. Xia, F. Liang, and Y. Wang. "FMRI analysis through Bayesian variable selection with a spatial prior". In: _Proceedings of the 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro_. Boston, MA, USA, 2009, pp. 714-717.

[4] C. Yu, R. Prado, H. Ombao, et al. "A Bayesian variable selection approach yields improved detection of brain activation from complex-valued fMRI". In: _Journal of the American Statistical Association: Applications and Case Studies_ 113 (2018), pp. 1395-1410.

[5] C. Yu, R. Prado, H. Ombao, et al. "Bayesian spatiotemporal modeling on complex-valued fMRI signals via kernel convolutions". In: _Biometrics_ (2022), pp. 1-13.

[6] L. Zhang, M. Guindani, and M. Vannucci. "Bayesian models for fMRI data analysis". In: _WIREs Computational Statistics_ 7 (2015), pp. 21-41.

[7] L. Zhang, M. Guindani, F. Versace, et al. "A spatio-temporal non-parametric Bayesian model of multi-subject fMRI data". In: _Annals of Applied Statistics_ (2016).
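---

## Backup: Kernel Convolution in a Few Lines of R

An illustrative, self-contained R sketch of the kernel-convolution spatial prior (the grid, the number of sites, and all parameter values are hypothetical, not those used in the paper): it evaluates the Bezier kernel between voxel locations and `\(D\)` sites, forms `\(S^v = \sum_{d} k(z_v - s_d; \phi) w^d\)`, and maps it to activation probabilities through the logistic link.

```r
set.seed(1)

# Bezier kernel with compact support: (1 - ||h||^2 / phi^2)^nu for ||h|| < phi
bezier_kernel <- function(dist, nu = 2, phi = 3) {
  ifelse(dist < phi, (1 - (dist / phi)^2)^nu, 0)
}

# a 20 x 20 voxel grid (V = 400) and D = 16 sites on a coarse 4 x 4 lattice
grid  <- expand.grid(x = 1:20, y = 1:20)
sites <- expand.grid(x = seq(2.5, 17.5, 5), y = seq(2.5, 17.5, 5))

# V x D matrix of kernel weights k(z_v - s_d; phi)
dist_mat <- as.matrix(dist(rbind(grid, sites)))[1:400, 401:416]
K <- bezier_kernel(dist_mat, nu = 2, phi = 6)

# latent site-level effects w^d and voxel-level spatial effects S^v
w <- rnorm(16, mean = 0, sd = 1)
S <- as.vector(K %*% w)

# activation probabilities via the logistic link (intercept a is hypothetical)
a <- -2
prob <- plogis(a + S)

# visualize the smooth voxel-level probability surface
image(matrix(prob, 20, 20), main = "Pr(activation) from kernel convolution")
```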