Part 1 Background

1.1 Latent processes

There are many names for classes of models that describe repeated measures with (possibly heterogeneous) random effects.

A latent process mixed model, as presented in the R package lcmm (Proust-Lima, Philipps, and Liquet 2017), is a generalized linear mixed model of the form \[\begin{equation} Y_{ij} = H\left( \Lambda_i(t_{ij}) + \varepsilon_{ij}~;~ \eta \right), \tag{1.1} \end{equation}\] where \(Y_{ij}\) is the response for subject \(i\) at occasion \(j\), observed at time point \(t_{ij}\) with independent, normally distributed measurement error \(\varepsilon_{ij}\). The function \(H(\cdot)\) represents a link function (parametrised by \(\eta\)) relating the response to the latent process, which is given by the linear mixed model \[\begin{equation} \Lambda_i(t) = X(t) \beta + Z(t) u_i, \tag{1.2} \end{equation}\] for covariate vectors \(Z(t) \subset X(t)\), fixed effects \(\beta\) and random effects \(u_i\).

The link function \(H\) may be linear; in the case of an identity link, model (1.1) reduces to a linear mixed model. Otherwise it may represent a distribution function or a quadratic I-spline basis. Psychology texts might refer to the same idea by the name latent curve analysis. Within the structural equation modelling framework, Equation (1.2) describes the structural model, while Equation (1.1) is called the measurement model.
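To make the roles of the latent process and the link concrete, the following is a minimal simulation sketch of Equations (1.1) and (1.2) for a single subject. It is not taken from lcmm: the function names, the parameter values, the independent random intercept and slope, and the rescaled logistic link are all assumptions chosen for illustration.

```python
import math
import random


def identity_link(x):
    """Identity link: the response equals the noisy latent process."""
    return x


def scaled_logistic_link(x, ceiling=30.0):
    """Hypothetical bounded link: a logistic distribution function rescaled to (0, ceiling)."""
    return ceiling / (1.0 + math.exp(-x))


def simulate_subject(times, beta, u_sd, eps_sd, link, rng):
    """Simulate Y_ij = H(Lambda(t_ij) + eps_ij) for one subject."""
    # Random intercept and slope, drawn independently (a simplifying assumption).
    u = [rng.gauss(0.0, sd) for sd in u_sd]
    responses = []
    for t in times:
        x = [1.0, t]  # X(t): intercept and time
        z = [1.0, t]  # Z(t): here the same covariates as X(t)
        fixed_part = sum(b * xk for b, xk in zip(beta, x))
        random_part = sum(uk * zk for uk, zk in zip(u, z))
        eps = rng.gauss(0.0, eps_sd)  # measurement error
        responses.append(link(fixed_part + random_part + eps))
    return responses


times = [0.0, 1.0, 2.0, 3.0]
beta = [0.5, -0.3]  # illustrative fixed effects

# The same seed gives the same latent values, so only the link differs.
y_identity = simulate_subject(times, beta, [1.0, 0.2], 0.5, identity_link, random.Random(1))
y_bounded = simulate_subject(times, beta, [1.0, 0.2], 0.5, scaled_logistic_link, random.Random(1))
```

With the identity link the simulated responses are exactly those of a linear mixed model with a random intercept and slope; the bounded link maps the same latent values onto a score restricted to the interval (0, 30).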

A mixture of such models is named a latent class mixed model, growth mixture model, group-based trajectory model, heterogeneous mixed model or mixed mixture model, depending on the correlation structure of the random effects.

The class or cluster of subject \(i\) is denoted by a latent discrete random variable \(c_i\), which follows a multinomial logistic model (possibly with covariates predicting class membership). Then, for each subject \(i\), the model (1.1) is conditional on the class, with fixed effects of covariates that may be either common across the classes or class-specific, and random effects that are class-specific. See the lcmm::hlme() vignette for examples.
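The two-stage structure described above (draw a class, then generate a trajectory conditional on it) can be sketched as follows. This is an illustrative simulation under stated assumptions, not lcmm code: the function names are hypothetical, the last class is used as the reference class with coefficients fixed at zero, the link is the identity, and each class has its own fixed effects and its own random-intercept standard deviation.

```python
import math
import random


def class_probabilities(coefs, covariates):
    """Multinomial logistic probabilities; the last class is the reference (score 0)."""
    scores = [sum(c * x for c, x in zip(row, covariates)) for row in coefs] + [0.0]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_mixture_subject(times, probs, class_betas, class_u_sd, eps_sd, rng):
    """Draw the latent class c_i, then a trajectory conditional on that class."""
    g = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    intercept, slope = class_betas[g]       # class-specific fixed effects
    u0 = rng.gauss(0.0, class_u_sd[g])      # class-specific random intercept
    ys = [intercept + slope * t + u0 + rng.gauss(0.0, eps_sd) for t in times]
    return g, ys


# Two classes, intercept-only membership model: log-odds of class 0 versus class 1 is log(3).
probs = class_probabilities([[math.log(3.0)]], [1.0])

rng = random.Random(42)
times = [0.0, 1.0, 2.0, 3.0]
g, ys = simulate_mixture_subject(
    times, probs, class_betas=[(0.0, 1.0), (5.0, -0.5)], class_u_sd=[0.5, 1.5], eps_sd=0.3, rng=rng
)
```

Adding covariates to the membership model only requires longer coefficient rows in `coefs` and a matching `covariates` vector.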

The package lcmm supports (via the hlme and lcmm functions) random effects that have either common or heterogeneous (but similarly structured) covariances, toggled via the nwg argument. Meanwhile, the R package flexmix allows the correlation structure to vary between the latent classes; the differences between the two packages are outlined in detail by Wardenaar (2020), who also compares them with the proprietary statistical software Mplus. These tools are frequentist, based on maximum likelihood estimation; later we will evaluate Bayesian alternatives that work via Markov chain Monte Carlo (MCMC) sampling.
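One reading of "heterogeneous but similarly structured" is that every class shares a single random-effects covariance matrix up to a class-specific scale factor. The sketch below assumes exactly that, with the last class as the unscaled reference; whether this matches the behaviour of nwg should be checked against the lcmm documentation. The function names and numbers are illustrative.

```python
import math


def scale_covariance(base_cov, omega):
    """Return omega^2 * base_cov, a rescaled copy of the shared covariance matrix."""
    return [[omega * omega * v for v in row] for row in base_cov]


def correlation(cov):
    """Correlation between the two random effects of a 2x2 covariance matrix."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])


# Shared covariance of a random intercept and slope (illustrative values).
base_cov = [[1.0, 0.3], [0.3, 0.5]]

# One scale factor per class; the last class is the reference with factor 1.
omegas = [0.5, 2.0, 1.0]
class_covs = [scale_covariance(base_cov, w) for w in omegas]
class_cors = [correlation(c) for c in class_covs]
```

Under this assumption the variances differ between classes while the correlation between random effects is the same in every class, which is the restriction that a fully class-specific covariance structure would relax.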

1.2 Semantic growth

Some authors draw distinctions between growth curve modelling, growth mixture modelling and group-based trajectory modelling (Nagin and Odgers 2010; Nagin 2014), but the differences between these approaches are largely semantic or contextual: essentially, the three methods describe the same thing. At worst, the techniques are special cases of one another; for example, growth curve modelling is a degenerate growth mixture model with one mixture component. The words ‘curve’ and ‘trajectory’ are redundant, and ‘group’ is synonymous with a mixture component; these semantic variations undoubtedly confuse more than they clarify.
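The degenerate case can be checked numerically. The sketch below is a deliberately simplified illustration: it uses scalar normal densities in place of full subject-level mixed-model likelihoods, and the function names and data are hypothetical. It shows that a mixture log-likelihood with a single component coincides with the ordinary single-model log-likelihood.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(ys, weights, means, sds):
    """Log-likelihood of independent observations under a finite normal mixture."""
    total = 0.0
    for y in ys:
        density = sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total


ys = [0.1, 1.2, -0.7]

# Ordinary (non-mixture) log-likelihood.
single = sum(math.log(normal_pdf(y, 0.5, 2.0)) for y in ys)

# The same model written as a mixture with one component of weight 1.
one_component = mixture_loglik(ys, [1.0], [0.5], [2.0])

# Two components with identical parameters also collapse to the single model.
duplicated = mixture_loglik(ys, [0.4, 0.6], [0.5, 0.5], [2.0, 2.0])
```

All three quantities agree up to floating-point error, which is the sense in which a growth curve model is a growth mixture model with one class.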

Growth models are generalized linear models with optional random effects (Rosen 1991). As an example of a multilevel mixed linear model (Steele 2007), growth curves can be fitted using standard software, such as the R package lme4 (Bates et al. 2020). There is nothing particularly essential about repeated measures being longitudinal, either: the variable representing time (or age) may be exchanged for any other continuous predictor.

Here we describe only a family of related methods, aiming to distinguish by explicit technical differences rather than arbitrary naming conventions. Every such model can be explained in terms of a hierarchical Bayesian model, aided by the expressive power of the Stan language (Guo et al. 2020).

To be continued…

References

Bates, Douglas, Martin Maechler, Ben Bolker, and Steven Walker. 2020. lme4: Linear Mixed-Effects Models Using Eigen and S4. https://github.com/lme4/lme4/.

Guo, Jiqiang, Jonah Gabry, Ben Goodrich, and Sebastian Weber. 2020. rstan: R Interface to Stan. https://CRAN.R-project.org/package=rstan.

Nagin, Daniel S. 2014. “Group-Based Trajectory Modeling: An Overview.” Annals of Nutrition and Metabolism 65 (2-3): 205–10. https://doi.org/10.1159/000360229.

Nagin, Daniel S., and Candice L. Odgers. 2010. “Group-Based Trajectory Modeling in Clinical Research.” Annual Review of Clinical Psychology 6 (1): 109–38. https://doi.org/10.1146/annurev.clinpsy.121208.131413.

Proust-Lima, Cécile, Viviane Philipps, and Benoit Liquet. 2017. “Estimation of Extended Mixed Models Using Latent Classes and Latent Processes: The R Package lcmm.” Journal of Statistical Software 78 (2): 1–56. https://doi.org/10.18637/jss.v078.i02.

Rosen, Dietrich Von. 1991. “The Growth Curve Model: A Review.” Communications in Statistics - Theory and Methods 20 (9): 2791–2822. https://doi.org/10.1080/03610929108830668.

Steele, Fiona. 2007. “Multilevel Models for Longitudinal Data.” Journal of the Royal Statistical Society: Series A (Statistics in Society). https://doi.org/10.1111/j.1467-985x.2007.00509.x.

Wardenaar, Klaas J. 2020. “Latent Class Growth Analysis and Growth Mixture Modeling Using R: A Tutorial for Two R-Packages and a Comparison with Mplus.” April. https://doi.org/10.31234/osf.io/m58wx.