## Resources for stochastic differential equation mixed-effects models

[tl;dr here is a collection of resources for SDEMEMs]

Mixed-effects models (MEM) are hierarchical models suited for “population inference”, where instead of fitting data from a single experiment, we are interested in learning characteristics common to $M$ runs of the same experiment. As an example, we could have data from $M$ subjects, and we wish to fit them all together, not separately. This way we learn something at the “population” level, using a statistical model that explicitly takes into account variation from several streams of data. Mixed-effects models are particularly relevant for repeated measurements data.

In MEM, individual experiments (e.g. “subjects”) are modeled by introducing some set of parameters $\phi_i$ ($i=1,...,M$), which vary randomly between experiments according to a probability distribution, say $\phi_i\sim p(\phi_i|\eta)$, depending on some unknown parameter $\eta$. Here $\eta$ is common to all $\phi_i$, hence common to all $M$ subjects. Therefore $\phi_i$ is a “random effect”, and $\eta$ is a “population parameter”. Both $\phi_i$ and $\eta$ may be vectors. There may also be other unknown parameters that are not random effects; we call those $c$.

So we have mixed effects: some parameters vary randomly between subjects ($\phi_i$, $i=1,...,M$) and others are common to all subjects ($c$ and $\eta$).

Example: $y_{ij}=f(X_{ij}|\phi_i,\eta,c)+\varepsilon_{ij},\quad j=1,...,n_i;\quad i=1,...,M \qquad (1)$

where $y_{ij}$ is the $j$-th measurement recorded on subject $i$ ($j=1,...,n_i$), $X_{ij}$ may be the corresponding fixed covariates or even an unobserved stochastic process, and $f$ is some function. Finally, $\varepsilon_{ij}$ is residual variation (measurement error), for example Gaussian error $\varepsilon_{ij}\sim N(0,\sigma^2_\varepsilon)$.
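To make (1) concrete, here is a minimal simulation sketch in Python. Everything specific in it is an assumption chosen for illustration and not taken from any particular paper: the covariate is a time grid $X_{ij}=t_j$ shared by all subjects, $f$ is an exponential decay $\phi_i e^{-c t_j}$, the random effects are Gaussian with $\eta=(\text{mean},\text{sd})$, and the function name `simulate_mem` is hypothetical.

```python
import numpy as np


def simulate_mem(M=5, n=10, eta=(1.0, 0.3), c=0.5, sigma_eps=0.1, seed=0):
    """Simulate y_ij = phi_i * exp(-c * t_j) + eps_ij, an illustrative instance of (1).

    Assumptions (not from the post): Gaussian random effects
    phi_i ~ N(eta[0], eta[1]^2) and the same n time points for every subject.
    """
    rng = np.random.default_rng(seed)
    eta_mean, eta_sd = eta
    t = np.linspace(0.0, 5.0, n)                  # fixed covariate X_ij = t_j
    phi = rng.normal(eta_mean, eta_sd, size=M)    # one random effect per subject
    f = phi[:, None] * np.exp(-c * t)[None, :]    # noise-free response, shape (M, n)
    y = f + rng.normal(0.0, sigma_eps, size=(M, n))  # add measurement error
    return t, phi, y


t, phi, y = simulate_mem()
print(y.shape)  # one row of measurements per subject
```

Each row of `y` is one subject's data stream. Population inference means estimating $(\eta, c, \sigma_\varepsilon)$ from all rows jointly instead of fitting each row on its own.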

When the $X_{ij}$ result from discrete observations of a diffusion process $\{X_{i,t}\}_{t\geq t_0}$, that is, a continuous-time Markov process solving the stochastic differential equation (SDE) for subject $i$

$dX_{i,t}=\mu(X_{i,t},\phi_i,c)dt+\sigma(X_{i,t},\phi_i,c)dB_{i,t}, \qquad \phi_i\sim p(\phi_i|\eta),\quad i=1,...,M \qquad (2)$

then we have obtained a stochastic differential equation mixed-effects model (SDEMEM). Some papers consider SDEMEMs written as (2) alone (no measurement error), while others consider the system (1)-(2).
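As a rough illustration of the system (1)-(2), the sketch below simulates data from a toy SDEMEM with the Euler-Maruyama scheme. All modelling choices are assumptions made for the example: an Ornstein-Uhlenbeck-type drift $\mu=\phi_i(c_1-X)$, a constant diffusion coefficient $\sigma=c_2$, log-normal random effects with $\eta=(\text{location},\text{scale})$, and additive Gaussian measurement error with $f$ the identity. The name `simulate_sdemem` is hypothetical, and Euler-Maruyama is only an approximation of the true diffusion paths.

```python
import numpy as np


def simulate_sdemem(M=4, n_obs=20, T=10.0, substeps=50,
                    eta=(0.0, 0.25), c=(1.0, 0.2),
                    sigma_eps=0.05, x0=0.0, seed=1):
    """Simulate a toy SDEMEM: dX = phi_i * (c[0] - X) dt + c[1] dB, y = X + eps.

    Assumptions (not from the post): OU-type drift, constant diffusion,
    phi_i ~ LogNormal(eta[0], eta[1]), equally spaced observation times.
    """
    rng = np.random.default_rng(seed)
    eta_loc, eta_scale = eta
    level, diff = c
    phi = rng.lognormal(eta_loc, eta_scale, size=M)  # random effects, one per subject
    dt = T / (n_obs * substeps)                      # fine Euler-Maruyama step
    x = np.full(M, x0, dtype=float)                  # current state of all M paths
    X = np.empty((M, n_obs))                         # latent states at observation times
    for j in range(n_obs):
        for _ in range(substeps):
            dB = rng.normal(0.0, np.sqrt(dt), size=M)        # independent Brownian increments
            x = x + phi * (level - x) * dt + diff * dB       # Euler-Maruyama update of (2)
        X[:, j] = x
    y = X + rng.normal(0.0, sigma_eps, size=(M, n_obs))      # measurement equation (1)
    t_obs = np.linspace(T / n_obs, T, n_obs)
    return t_obs, phi, X, y


t_obs, phi, X, y = simulate_sdemem()
print(phi)      # subject-specific mean-reversion rates
print(y.shape)  # (M, n_obs)
```

Setting `sigma_eps=0.0` gives the variant where only (2) is considered, so the observations coincide with the latent states `X`.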

This is a powerful class of models, since we can simultaneously consider three sources of variability: (i) variation between subjects: by estimating the parameter $\eta$ underlying the distribution of the individual random effects $\phi_i$; (ii) intrinsic individual stochastic variation, encoded within the “diffusion coefficient” $\sigma(X_{i,t},\phi_i,c)$; (iii) measurement error variation $\sigma^2_\varepsilon$ (if (1) is considered). Therefore SDEMEMs enable us to learn characteristics common to all subjects $(\eta,c)$ (i.e. “population estimation”), while also taking into account individual systemic (intrinsic) variation and measurement error.

Now, inference for SDEMEMs is far from trivial, essentially because inference for SDEs based on discrete observations can be quite tricky. However, some literature is available, and the motivation for writing this post is to make researchers aware of a collection of resources for SDEMEMs that I have set up.