Introduction

We present here some examples to highlight the differences between the gANOVA() function and the built-in spherical distribution of lmer(). Both functions estimate a spherical correlation structure. However the parametrization using gANOVA() increases the parameter space and widen the range of the models that can be estimated without having estimates at the boundary of the parameter space (e.g. variances estimated at 0).

We need the following packages:

Simulating data

Then, we construct the design of a simple experiment. It contains 12 participants, a within-participant factor with 4 levels and 3 replications for each participant within each level. The design is simply:

set.seed(42)

df <- expand.grid(pt = paste0("P",str_pad(1:12,width = 2,pad = "0")),
                  A = paste0("A",1:4),
                  r = 1:3,
                  stringsAsFactors = F)%>%
  select(-r)

Then we simulate 2 different response variables. There are y_diff and y_same and they differ only by the variance of the random intercepts. The response y_diff has a lower variance of the random intercepts compared to the one of the random interactions.

# add +(1|pt) random intercept
df<-
  df%>%
  nest(data=-pt)%>%
  mutate(ranef_pt = rnorm(n(), sd = .8))%>%
  unnest(data)

# add +(1|pt:A) random interaction
df<-
  df%>%
  nest(data=-c(pt,A))%>%
  mutate(ranef_ptA = rnorm(n(), sd = 3))%>%
  unnest(data)

# add an error term
df<-
  df%>%
  mutate(err = rnorm(n(),sd = 4))


# create 2 dependant variables
df <- df%>%
  transmute(pt = pt, A= A,
            y_diff = ranef_pt + ranef_ptA + err,
            y_same =  ranef_pt*2 +ranef_ptA + err)

Comparing estimation

Finally, we estimate the model using y_same as dependant variable:

lmer_same <- lmer(y_same ~ A+ (1|pt) + (1|pt:A), data=df, contrasts = list(A = contr.sum))
summary(lmer_same)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y_same ~ A + (1 | pt) + (1 | pt:A)
#>    Data: df
#> 
#> REML criterion at convergence: 836.8
#> 
#> Scaled residuals: 
#>      Min       1Q   Median       3Q      Max 
#> -2.61123 -0.61168 -0.06369  0.59976  2.64576 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  pt:A     (Intercept)  9.799   3.130   
#>  pt       (Intercept)  2.169   1.473   
#>  Residual             13.744   3.707   
#> Number of obs: 144, groups:  pt:A, 48; pt, 12
#> 
#> Fixed effects:
#>             Estimate Std. Error t value
#> (Intercept)  0.47727    0.69306   0.689
#> A1           0.36270    0.94803   0.383
#> A2           0.03153    0.94803   0.033
#> A3          -1.47134    0.94803  -1.552
#> 
#> Correlation of Fixed Effects:
#>    (Intr) A1     A2    
#> A1  0.000              
#> A2  0.000 -0.333       
#> A3  0.000 -0.333 -0.333

and then gANOVA:

gANOVA_same <- gANOVA(y_same ~ A + (1|pt|A), data=df, contrasts = list(A = contr.sum))
summary(gANOVA_same)
#> Linear mixed model fit by REML. t-tests use Satterthwaite's method [
#> lmerModgANOVA]
#> Formula: y_same ~ A + (1 | pt | A)
#>    Data: df
#> Control: structure(list(optimizer = "nloptwrap", restart_edge = TRUE,  
#>     boundary.tol = 1e-05, calc.derivs = TRUE, use.last.params = FALSE,  
#>     checkControl = list(check.nobs.vs.rankZ = "ignore", check.nobs.vs.nlev = "stop",  
#>         check.nlev.gtreq.5 = "ignore", check.nlev.gtr.1 = "stop",  
#>         check.nobs.vs.nRE = "stop", check.rankX = "message+drop.cols",  
#>         check.scaleX = "warning", check.formula.LHS = "stop"),  
#>     checkConv = list(check.conv.grad = list(action = "warning",  
#>         tol = 0.002, relTol = NULL), check.conv.singular = list( 
#>         action = "message", tol = 1e-04), check.conv.hess = list( 
#>         action = "warning", tol = 1e-06)), optCtrl = list()), class = c("lmerControl",  
#> "merControl"))
#> 
#> REML criterion at convergence: 836.8
#> 
#> Scaled residuals: 
#>      Min       1Q   Median       3Q      Max 
#> -2.61123 -0.61168 -0.06369  0.59976  2.64577 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  pt:A     (Intercept)  9.799   3.130   
#>  pt       (Intercept)  4.619   2.149   
#>  Residual             13.744   3.707   
#> Number of obs: 144, groups:  pt:A, 48; pt, 12
#> 
#> Fixed effects:
#>             Estimate Std. Error       df t value Pr(>|t|)
#> (Intercept)  0.47727    0.69306 10.99989   0.689    0.505
#> A1           0.36270    0.94803 33.00003   0.383    0.704
#> A2           0.03153    0.94803 33.00003   0.033    0.974
#> A3          -1.47134    0.94803 33.00003  -1.552    0.130
#> 
#> Correlation of Fixed Effects:
#>    (Intr) A1     A2    
#> A1  0.000              
#> A2  0.000 -0.333       
#> A3  0.000 -0.333 -0.333

Using y_same the 2 estimations are the same (likelihood, fixed parameters, etc). Here the reparametrizations of gANOVA() does not improve the results as the variance of the random intercept is big enough with respect to the variance of the random interaction.

However, using y_diff, the variability of the random intercepts is smaller. The estimation using the parametrization of lmer() produces a singular fit:

lmer_diff <- lmer(y_diff~ A + (1|pt)+(1|pt:A),data=df,contrasts = list(A = contr.sum))
#> boundary (singular) fit: see ?isSingular
summary(lmer_diff)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y_diff ~ A + (1 | pt) + (1 | pt:A)
#>    Data: df
#> 
#> REML criterion at convergence: 831
#> 
#> Scaled residuals: 
#>      Min       1Q   Median       3Q      Max 
#> -2.62948 -0.60740 -0.07108  0.61599  2.68565 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  pt:A     (Intercept)  9.581   3.095   
#>  pt       (Intercept)  0.000   0.000   
#>  Residual             13.744   3.707   
#> Number of obs: 144, groups:  pt:A, 48; pt, 12
#> 
#> Fixed effects:
#>             Estimate Std. Error t value
#> (Intercept) -0.12703    0.54318  -0.234
#> A1           0.36270    0.94081   0.386
#> A2           0.03153    0.94081   0.034
#> A3          -1.47134    0.94081  -1.564
#> 
#> Correlation of Fixed Effects:
#>    (Intr) A1     A2    
#> A1  0.000              
#> A2  0.000 -0.333       
#> A3  0.000 -0.333 -0.333
#> convergence code: 0
#> boundary (singular) fit: see ?isSingular

Estimating the model using gANOVA() reduces the constraint on the parameter space (of the variance of the intercept). The gANOVA parametrization produces a non-zero estimate of the variance of the random intercept. It follows that the likelihood is smaller:

mod_gANOVA <- gANOVA(y_diff~ A+ (1|pt|A),data=df,contrasts = list(A = contr.sum))
summary(mod_gANOVA)
#> Linear mixed model fit by REML. t-tests use Satterthwaite's method [
#> lmerModgANOVA]
#> Formula: y_diff ~ A + (1 | pt | A)
#>    Data: df
#> Control: structure(list(optimizer = "nloptwrap", restart_edge = TRUE,  
#>     boundary.tol = 1e-05, calc.derivs = TRUE, use.last.params = FALSE,  
#>     checkControl = list(check.nobs.vs.rankZ = "ignore", check.nobs.vs.nlev = "stop",  
#>         check.nlev.gtreq.5 = "ignore", check.nlev.gtr.1 = "stop",  
#>         check.nobs.vs.nRE = "stop", check.rankX = "message+drop.cols",  
#>         check.scaleX = "warning", check.formula.LHS = "stop"),  
#>     checkConv = list(check.conv.grad = list(action = "warning",  
#>         tol = 0.002, relTol = NULL), check.conv.singular = list( 
#>         action = "message", tol = 1e-04), check.conv.hess = list( 
#>         action = "warning", tol = 1e-06)), optCtrl = list()), class = c("lmerControl",  
#> "merControl"))
#> 
#> REML criterion at convergence: 830.9
#> 
#> Scaled residuals: 
#>      Min       1Q   Median       3Q      Max 
#> -2.61261 -0.61229 -0.07301  0.62167  2.68455 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  pt:A     (Intercept)  9.799   3.130   
#>  pt       (Intercept)  2.232   1.494   
#>  Residual             13.744   3.707   
#> Number of obs: 144, groups:  pt:A, 48; pt, 12
#> 
#> Fixed effects:
#>             Estimate Std. Error       df t value Pr(>|t|)
#> (Intercept) -0.12703    0.53048 10.99999  -0.239    0.815
#> A1           0.36270    0.94803 33.00006   0.383    0.704
#> A2           0.03153    0.94803 33.00006   0.033    0.974
#> A3          -1.47134    0.94803 33.00006  -1.552    0.130
#> 
#> Correlation of Fixed Effects:
#>    (Intr) A1     A2    
#> A1  0.000              
#> A2  0.000 -0.333       
#> A3  0.000 -0.333 -0.333