Simulating a multi-level dataset for linear mixed effects analysis
This tutorial takes a simulation-based approach to understanding mixed effects models. We start with the fundamental equations specifying the model structure. Then, we simulate data with two variance components and fit several mixed effects models to it.
Why do we do this? Even the most advanced users do not necessarily know the implications and effects of the assumptions and procedures used in modeling. A simulation-based approach allows you to interrogate how the model performs under different scenarios (i.e., data and experimental design conditions). This provides a way of understanding the models' behavior, including bias, variance, power, and false positive rates.
Model formulation
First level:
This describes how individual-participant data are generated from a set of fixed predictors and participant-specific model parameters ("true" regression intercept and slope values for each participant). The outcome data vector y_i for participant i (i = 1...m) is generated by a within-person design matrix Z_i times a participant-specific coefficient vector beta_i, plus a participant-specific error vector e_i:

y_i = Z_i * beta_i + e_i, for i = 1...m

where m is the number of participants.
- y_i and e_i are [n_i x 1] vectors corresponding to the n_i observations for participant i. n_i can vary across participants.
- Z_i is an [n_i x p] within-person design matrix of p model predictors (regressors). p has the same value for all participants. One column of Z_i is constant across all rows, and its coefficient reflects the model intercept.
- beta_i is a [p x 1] vector of participant-specific model parameters. Note that these are theoretical values, not estimates, as we are laying out the structure of the model.

Stacking all participants gives y = Z * beta + e. Note that Z is block diagonal here.

Second level:
This describes how participant-specific model parameters beta_i are generated from between-person (group-level) population-level model parameters b_g. beta_i is generated by a between-person design matrix A (see below) times the fixed population-level coefficient vector b_g, plus a participant-specific deviation u_i:

beta_i = A * b_g + u_i

- beta_i is a [p x 1] vector of participant-specific model parameters.
- b_g is a [p x 1] vector of fixed population-level model parameters. These are the fixed effects: population-level effects of predictors that we are trying to estimate.
- A is a placeholder identity matrix for group-level variables here, eye(p). More generally, A is a second-level design matrix of between-person predictors for the intercept and other model parameters.
- u_i is a participant-specific deviation from the population-level effects. If participants are selected randomly from a population, u_i varies randomly across the m levels and introduces a random effect.

Combined two-level model

Substituting the second level into the first:

y_i = Z_i * (A * b_g + u_i) + e_i = Z_i * A * b_g + Z_i * u_i + e_i
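To make the two-level structure concrete, here is a minimal NumPy sketch of the generative model. The use of Python rather than Matlab, and all parameter values (m, n, b_g, U_g, sigma), are illustrative choices, not part of the tutorial's code:

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, p = 20, 10, 2          # participants, observations each, parameters (assumed)
b_g = np.array([2.0, 3.0])   # population intercept and slope (made up)
U_g = np.diag([1.0, 0.5])    # group-level covariance of deviations u_i (made up)
sigma = 1.0                  # within-person error SD (made up)

x = np.linspace(0, 4, n)
Z_i = np.column_stack([np.ones(n), x])   # [n x p] within-person design

# Second level: beta_i = b_g + u_i, with u_i ~ N(0, U_g) (A = identity here)
beta = rng.multivariate_normal(b_g, U_g, size=m)   # [m x p], one row per subject

# First level: y_i = Z_i @ beta_i + e_i, with e_i ~ N(0, sigma^2 I)
y = beta @ Z_i.T + sigma * rng.standard_normal((m, n))  # [m x n]

print(y.shape)            # one row of observations per participant
print(beta.mean(axis=0))  # should be near b_g
```

The per-subject betas average out to roughly b_g, which is exactly the fixed effect the group-level model tries to recover.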
Alternative notations
Several other notation systems are in use, differing somewhat in their symbols and conventions. Here are the equations used in the Mumford video lectures.
A simple example
Let's generate some data with made-up slopes. We have a simple model with a linear predictor and an intercept for each person.
Zi = [1 1 1 1 1; 0 1 2 3 4]'
disp('Number of subjects')
disp('Number of observations within-person, and num predictors')
disp('Z is the Kronecker product of an [m x m] identity matrix and Zi');
disp('The Kronecker product expands Zi into a block diagonal matrix for every subject.');
Z = kron(eye(m), Zi)
1 0 0 0 0 0
1 1 0 0 0 0
1 2 0 0 0 0
1 3 0 0 0 0
1 4 0 0 0 0
0 0 1 0 0 0
0 0 1 1 0 0
0 0 1 2 0 0
0 0 1 3 0 0
0 0 1 4 0 0
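The same Kronecker-product expansion can be checked in a small NumPy sketch (here with m = 2 subjects for brevity; sizes are illustrative):

```python
import numpy as np

m = 2
Zi = np.column_stack([np.ones(5), np.arange(5)])  # same within-person Zi as above

# kron(eye(m), Zi) stacks Zi into a block-diagonal [m*n x m*p] design matrix
Z = np.kron(np.eye(m), Zi)

print(Z.shape)    # (10, 4): 2 subjects x 5 obs, 2 subjects x 2 params
# Off-diagonal blocks are zero: subject 1's rows load only on subject 1's betas
print(Z[:5, 2:])  # all zeros
```

Each subject's observations touch only that subject's columns, which is what makes y = Z*beta separable into per-subject fits.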
disp('Made-up true betas')
disp('Intercept and slope for each of m persons, concatenated')
y = Z*beta % Now each person's fit is a combination of their individual Zi and beta_i
% Plot the true response without error
ycell = mat2cell(y, n_within * ones(m, 1))';
x1 = repmat({Zi(:, 2)}, 1, m);
create_figure('example');
handles = line_plot_multisubject(x1, ycell);
____________________________________________________________________________________________________________________________________________
Input data:
X scaling: No centering or z-scoring
Y scaling: No centering or z-scoring
Transformations:
No data transformations before plot
Correlations:
r = 0.68 across all observations, based on untransformed input data
Stats on slopes after transformation, subject is random effect:
Mean b = 3.67, t( 2) = 3.05, p = 0.092735, num. missing: 0
Average within-person r = 1.00 +- 0.00 (std)
Between-person r (across subject means) = NaN, p = NaN
____________________________________________________________________________________________________________________________________________
set(handles.point_handles, 'MarkerSize', 8, 'Marker', 'o'); % we can use handles to reformat things
set(gca, 'XLim', [-0.5 4.5])
xlabel('X1'); ylabel('y');
The response (y) will also be measured with error, so let's add some errors.
Here, the errors are normally distributed with var = 1 and homoscedastic across subjects.
This is a common assumption in mixed effects models (e.g., lmer in R), but not all models make it.
e = randn(size(y, 1), 1);
y = y + e; % add measurement error to the response
% Plot the response with error
ycell = mat2cell(y, n_within * ones(m, 1))';
x1 = repmat({Zi(:, 2)}, 1, m);
create_figure('example');
handles = line_plot_multisubject(x1, ycell);
____________________________________________________________________________________________________________________________________________
Input data:
X scaling: No centering or z-scoring
Y scaling: No centering or z-scoring
Transformations:
No data transformations before plot
Correlations:
r = 0.66 across all observations, based on untransformed input data
Stats on slopes after transformation, subject is random effect:
Mean b = 3.53, t( 2) = 2.67, p = 0.115979, num. missing: 0
Average within-person r = 0.98 +- 0.03 (std)
Between-person r (across subject means) = NaN, p = NaN
____________________________________________________________________________________________________________________________________________
set(handles.point_handles, 'MarkerSize', 8, 'Marker', 'o'); % we can use handles to reformat things
set(gca, 'XLim', [-0.5 4.5])
xlabel('X1'); ylabel('y');
The variation in betas 1, 3, and 5 reflects individual differences in the intercept.
The variation in betas 2, 4, and 6 reflects individual differences in the slope of X1.
We set these to fixed values above, but in the next section, these individual differences will be randomly selected from a distribution. The "true" individual differences among subjects are thus random effects.
A more flexible simulation
Specify parameters and model structure
% n observations within-subject
% For now, assume same across all subjects
n_i = n_within * ones(m, 1);
% Number of model parameters (regressors), including intercept
% Within-person residual noise variance
% For now, assume same across all subjects
% s2 is 1 x m vector of residual variances
s2 = sig_2_within .* ones(1, m);
Fixed population-level effects
Specify a column vector of true population-level values for each effect, intercept followed by other parameters.
if any(size(b_g) - [p 1])
    error('Population effects b_g is the wrong size')
end
Group-level covariance matrix
Diagonals are variances for each of p parameters. By convention, the intercept is first.
Off-diagonals are covariances
e.g., for an intercept and slope (simple multilevel regression), U_g would be a 2 x 2 matrix with variances of the intercept and slope on the diagonal.
if any(size(U_g) - [p p])
    error('Covariance matrix U_g is the wrong size')
end
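As a concrete example of constructing such a covariance matrix, this NumPy sketch builds a 2 x 2 U_g from an assumed intercept SD, slope SD, and intercept-slope correlation (all values made up):

```python
import numpy as np

# Hypothetical values: intercept SD = 5, slope SD = 3, correlation = -0.2
sds = np.array([5.0, 3.0])
corr = np.array([[1.0, -0.2],
                 [-0.2, 1.0]])

# Covariance = D * R * D with D = diag(sds); elementwise, outer(sds, sds) * R
U_g = np.outer(sds, sds) * corr
print(U_g)
# [[25. -3.]
#  [-3.  9.]]
```

The diagonal entries are the variances (SD squared) of the intercept and slope; the off-diagonal entry is their covariance.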
Fixed design matrix
These can be the same for all subjects, or different. They'd be the same if we were using a fixed experimental design with the same numbers of observations per person, and different if (1) our predictors are measured variables or (2) we have different numbers of observations.
We'll assume they are the same across participants for now, but allow the code to be easily extended to different design matrices across participants.
Predictors can also be categorical or continuous, and correlated or uncorrelated. We'll generate a simple effects-coded [1 -1] predictor for now, with approximately balanced numbers of observations.
The intercept is first (a column of 1's) by convention.
Z_within = [ones(n_within, 1)]; % intercept
Z_tmp = sign(randn(n_within, p - 1)); % effects-coded [1 -1] values, approx. balanced
Z_within = [Z_within Z_tmp];
for i = 1:m % allows subjects to vary later if needed
    Z_i{i} = Z_within;
end
Now we have all the variables we need to simulate data:
n_i Number of observations for each person i = 1...m
b_g Population-level true fixed effects parameters
s2 residual variances for each person (1 x m)
U_g Group covariance matrix
Z_i Design matrix for each person
Questions to answer
Under most input parameter choices, the dataset will include a random subject intercept and random subject slope.
Which input parameter would you change to make the dataset consistent with a "random intercept only" model?
Which input parameter would you change to make the dataset consistent with a "random slope only" model?
Create simulated dataset
Generate subject-level true parameters
Assume a normal distribution with covariance U_g
beta_i here is an [p x m] matrix where column i is the [p x 1] true beta coefficient vector for subject i.
beta_i = mvnrnd(b_g', U_g, m)'; % mvnrnd expects the mean as a row vector
Generate data (y)
for i = 1:m
    y{i} = Z_i{i} * beta_i(:, i) + sqrt(s2(i)) .* randn(n_i(i), 1);
end
To fit the dataset, we'll need these variables:
y{i = 1...m} The DV/outcome
Z_i{i = 1...m} The design matrix for each person
CANlab functions like glmfit_multilevel, igls, and rigls can use these cell arrays directly.
For fitlm and fitlme, and lmer in R (etc.), we will want these in stacked, 'long format' tabular form for mixed effects models. This requires a Grouping (in this case a Subject ID variable) as well. We will also create a column for each predictor and leave out the intercept, as it's added automatically.
% Make a long-format table with simulated data
sim_data = table(cat(1, y{:}), 'VariableNames', {'Y'});

sid_tmp = {}; % Subject ID
var_tmp = {}; % predictor variables

for i = 1:m
    sid_tmp{i} = i .* Z_i{i}(:, 1); % use intercept in case n's vary
    for j = 1:p - 1
        var_tmp{j}{i} = Z_i{i}(:, j + 1);
    end
end

sim_data.sid = cat(1, sid_tmp{:});

for j = 1:p - 1
    varname = sprintf('x%02d', j);
    sim_data.(varname) = cat(1, var_tmp{j}{:});
end
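For readers working outside Matlab, an equivalent long-format table can be assembled with pandas. This sketch uses small made-up dimensions and random values; only the Y / sid / x01 layout matches the tutorial:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
m, n = 3, 4  # small example: 3 subjects, 4 trials each (illustrative)

# Per-subject outcome vectors and an effects-coded [-1, 1] predictor
y = {i: rng.standard_normal(n) for i in range(1, m + 1)}
x = {i: rng.choice([-1, 1], size=n) for i in range(1, m + 1)}

# Stack subjects vertically; repeat each subject ID once per observation
sim_data = pd.DataFrame({
    "Y":   np.concatenate([y[i] for i in y]),
    "sid": np.repeat(list(y), n),
    "x01": np.concatenate([x[i] for i in x]),
})
print(sim_data.shape)  # (12, 3)
```

The grouping column (sid) is what lets mixed-model software identify which rows belong to which second-level unit.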
sim_data
sim_data = 400×3 table

|    | Y       | sid | x01 |
|----|---------|-----|-----|
| 1  | -4.0320 | 1   | 1   |
| 2  | -3.8698 | 1   | 1   |
| 3  | -4.0851 | 1   | -1  |
| 4  | -3.9542 | 1   | 1   |
| 5  | -3.8040 | 1   | -1  |
| 6  | 0.1725  | 1   | -1  |
| 7  | -2.8657 | 1   | -1  |
| 8  | -2.5798 | 1   | -1  |
| 9  | -2.6348 | 1   | -1  |
| 10 | 0.5730  | 1   | -1  |
| ⋮  | ⋮       | ⋮   | ⋮   |
Plot the dataset
For plotting, we may need to extract a predictor. This extracts the first non-intercept predictor (if p > 1 and there is one).
We'll use the CANlab function line_plot_multisubject. It gives us some output and basic stats on the relationship between x1 and y as well.
if p > 1
    x1 = cellfun(@(x) x(:, 2), Z_i, 'UniformOutput', false);
    create_figure('multisubject data');
    disp('line_plot_multisubject output:')
    [han, ~, ~, slope_stats] = line_plot_multisubject(x1, y);
else
    disp('Intercept only: No x1-y relationship to plot')
end
line_plot_multisubject output:
Warnings:
____________________________________________________________________________________________________________________________________________
Y: input cells with low varability:
3
____________________________________________________________________________________________________________________________________________
____________________________________________________________________________________________________________________________________________
Input data:
X scaling: No centering or z-scoring
Y scaling: No centering or z-scoring
Transformations:
No data transformations before plot
Correlations:
r = 0.47 across all observations, based on untransformed input data
Stats on slopes after transformation, subject is random effect:
Mean b = 3.54, t( 19) = 4.27, p = 0.000413, num. missing: 0
Average within-person r = 0.58 +- 0.71 (std)
Between-person r (across subject means) = -0.00, p = 1.000000
____________________________________________________________________________________________________________________________________________

Fit the model
glmfit_multilevel
glmfit_multilevel is a CANlab function: a fast, simple option for running a two-level mixed effects model with participant as a random effect. It implements a random-intercept, random-slope model across second-level units (e.g., participants): it fits a regression for each second-level unit, and then (optionally, if the 'weighted' keyword is entered) uses precision-weighted least squares to model group effects. It thus treats participants as a random effect. This is appropriate when second-level units are participants and first-level units are observations (e.g., trials) within participants, which is the use case glmfit_multilevel was designed for.
Options:
glmfit_multilevel includes some options not available in many mixed effects packages, including bootstrapping and sign permutation for inference.
Inputs:
- y = cell array, one cell per subject. Each cell contains a column vector of outcome data for that subject, across lower-level observations (e.g., trials).
- x1 = cell array, one cell per subject. Each cell contains a [trials x predictors] matrix of predictor data for that subject.
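The two-stage logic glmfit_multilevel implements (per-subject regressions, then a group-level test on the coefficients) can be sketched in a few lines of NumPy/SciPy. This unweighted version illustrates the approach only; it is not the CANlab code, and all simulated values are made up:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
m, n = 20, 10  # subjects, observations per subject (assumed)
x = np.linspace(0, 4, n)
Z = np.column_stack([np.ones(n), x])  # shared within-person design

# Simulate: true slope 3 with random subject deviations, noise SD 1
betas_true = np.column_stack([rng.normal(2, 1, m), rng.normal(3, 0.7, m)])
Y = betas_true @ Z.T + rng.standard_normal((m, n))

# Stage 1: ordinary least squares fit for each subject separately
b_hat = np.array([np.linalg.lstsq(Z, Y[i], rcond=None)[0] for i in range(m)])

# Stage 2 (unweighted): one-sample t-test on the subject-level slopes
t, p = stats.ttest_1samp(b_hat[:, 1], 0.0)
print(b_hat[:, 1].mean(), t, p)
```

Because the stage-2 test operates on one coefficient per subject, its degrees of freedom are m - 1, matching the t(19) in the output below.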
disp('glmfit_multilevel output:')
stats = glmfit_multilevel(y, x1, [], 'verbose');
Analysis description: Second Level of Multilevel Model
GLS analysis
Observations: 20, Predictors: 1
Outcome names: 1st-level B1 1st-level B2
Weighting option: unweighted
Inference option: t-test
Other Options:
Plots: No
Robust: no
Save: No
Output file: glmfit_general_output.txt
Total time: 0.00 s
________________________________________
Second Level of Multilevel Model
Outcome variables:
1st-level B1 1st-level B2
Adj. mean 3.64 3.54
2nd-level B01
1st-level B1 1st-level B2
Coeff 3.64 3.54
STE 1.14 0.83
t 3.20 4.27
Z 2.83 3.53
p 0.00469 0.00041
________________________________________
fitlme
This is Matlab's object-oriented equivalent to lmer in R. It fits a variety of mixed effects models. These models, like many other statistics objects implementing linear models, allow you to specify model formulas using Wilkinson notation. There is also a matrix-input version, fitlmematrix. A list of methods and properties:

help LinearMixedModel
LinearMixedModel Fitted linear mixed effects model.
LME = FITLME(...) fits a linear mixed effects model to data. The fitted
model LME is a LinearMixedModel modeling a response variable as a
linear function of fixed effect predictors and random effect
predictors.
LinearMixedModel methods:
coefCI - Coefficient confidence intervals
coefTest - Linear hypothesis test on coefficients
predict - Compute predicted values given predictor values
random - Generate random response values given predictor values
partialDependence - Partial Dependence values.
plotPartialDependence - Plot Partial Dependence.
plotResiduals - Plot of residuals
designMatrix - Fixed and random effects design matrices
fixedEffects - Stats on fixed effects
randomEffects - Stats on random effects
covarianceParameters - Stats on covariance parameters
fitted - Fitted response
residuals - Various types of residuals
response - Response used to fit the model
compare - Compare fitted models
anova - Marginal tests for fixed effect terms
disp - Display a fitted model
LinearMixedModel properties:
FitMethod - Method used for fitting (either 'ML' or 'REML')
MSE - Mean squared error (estimate of residual variance)
Formula - Representation of the model used in this fit
LogLikelihood - Log of likelihood function at coefficient estimates
DFE - Degrees of freedom for residuals
SSE - Error sum of squares
SST - Total sum of squares
SSR - Regression sum of squares
CoefficientCovariance - Covariance matrix for coefficient estimates
CoefficientNames - Coefficient names
NumCoefficients - Number of coefficients
NumEstimatedCoefficients - Number of estimated coefficients
Coefficients - Coefficients and related statistics
Rsquared - R-squared and adjusted R-squared
ModelCriterion - AIC and other model criteria
VariableInfo - Information about variables used in the fit
ObservationInfo - Information about observations used in the fit
Variables - Table of variables used in fit
NumVariables - Number of variables used in fit
VariableNames - Names of variables used in fit
NumPredictors - Number of predictors
PredictorNames - Names of predictors
ResponseName - Name of response
NumObservations - Number of observations in the fit
ObservationNames - Names of observations in the fit
See also fitlme, fitlmematrix, LinearModel, GeneralizedLinearModel, NonLinearModel.
Documentation for LinearMixedModel
lme = fitlme(sim_data,'Y ~ x01 + (1 + x01|sid)')
lme =
Linear mixed-effects model fit by ML
Model information:
Number of observations 400
Fixed effects coefficients 2
Random effects coefficients 40
Covariance parameters 4
Formula:
Y ~ 1 + x01 + (1 + x01 | sid)
Model fit statistics:
AIC BIC LogLikelihood Deviance
1433.1 1457 -710.54 1421.1
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower Upper
{'(Intercept)'} 3.6373 1.1069 3.2859 398 0.0011066 1.4611 5.8135
{'x01' } 3.5392 0.80765 4.3821 398 1.5057e-05 1.9514 5.127
Random effects covariance parameters (95% CIs):
Group: sid (20 Levels)
Name1 Name2 Type Estimate Lower Upper
{'(Intercept)'} {'(Intercept)'} {'std' } 4.944 3.6236 6.7455
{'x01' } {'(Intercept)'} {'corr'} -0.21241 -0.57555 0.22071
{'x01' } {'x01' } {'std' } 3.6031 2.6389 4.9195
Group: Error
Name Estimate Lower Upper
{'Res Std'} 1.0772 1.0013 1.1588
anova(lme, 'DFMethod', 'Satterthwaite')
ans =
ANOVA marginal tests: DFMethod = 'Satterthwaite'
Term FStat DF1 DF2 pValue
{'(Intercept)'} 10.797 1 20 0.0036938
{'x01' } 19.203 1 20 0.00028801
pval = coefTest(lme,[1 0]);
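The same random-intercept, random-slope model can be fit in Python with statsmodels' MixedLM, the closest analogue to fitlme and lmer. This sketch simulates its own small dataset; all parameter values are assumed, and only the model formula mirrors the tutorial's `Y ~ x01 + (1 + x01|sid)`:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
m, n = 20, 20  # subjects, observations per subject (assumed)
sid = np.repeat(np.arange(m), n)
x01 = rng.choice([-1, 1], size=m * n)

# Random intercepts and slopes per subject around fixed effects (2, 3)
u0 = rng.normal(0, 2, m)
u1 = rng.normal(0, 1, m)
Y = (2 + u0[sid]) + (3 + u1[sid]) * x01 + rng.standard_normal(m * n)

df = pd.DataFrame({"Y": Y, "sid": sid, "x01": x01})

# Random intercept + random slope for x01, grouped by subject
model = smf.mixedlm("Y ~ x01", df, groups=df["sid"], re_formula="~x01")
fit = model.fit()
print(fit.params[["Intercept", "x01"]])  # fixed effects, near (2, 3)
```

Like fitlme with default settings vs. REML, note that MixedLM's default estimation method also affects variance-component estimates, so small numerical differences across packages are expected.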
igls / rigls
igls_stats = igls(y, x1, 'plot', 'all');
MST (diag of Ystar_mtx): 40.85556
MSE (diag of resid): 0.00000


igls.m output report:
---------------------------------------
Data: 20 observations x 20 subjects
Fit Type: igls
AR(p) model: No
Converged: Yes
Max iterations: 5, Completed iterations: 1
Epsilon for convergence: 0.010000
Elapsed time: 0.32 s
Statistics: Tests of inference on fixed population parameters
Parameter est. t( 19) p
Intcpt. 3.637 3.29 0.003885 **
Slope1 3.539 4.38 0.000320 ***
Statistics: Tests of significance on random effects (variances)
Parameter est. LRT p
Intcpt. 24.440 1047.88 0.000000 ***
Slope1 12.979 832.30 0.000000 ***
---------------------------------------
disp('Est. random-effect std. deviations: '); disp(sqrt(igls_stats.betastar)');
Est. random-effect std. deviations:
4.9436 3.6026
disp('Estimated fixed effects')
disp('True fixed effects')
disp('Estimated random effects (covariances)')
Estimated random effects (covariances)
disp('True random effects (covariances)')
True random effects (covariances)
Hands-on Activities
- Identify and compare residual variance terms, slope estimates, t-values across models
- Modify the code to generate one or more continuous predictors instead of categorical/experimental ones.
- Turn the data-generation script into a function, so that you can efficiently simulate power and false positive rates using repeated simulations -- two key properties of any statistical model.
A Matlab version is included in the function sim_generate_mixedfx_data1, which is called below. You can create your own version in the software package of your choice (e.g., Python, R).
regressortype = 'categorical';
[y, Z_i, x1, sim_data] = sim_generate_mixedfx_data1(m, n_i, p, b_g, s2, U_g, regressortype);
sim_data
sim_data = 400×3 table
|    | Y      | sid | x01 |
|----|--------|-----|-----|
| 1  | 7.1248 | 1   | 1   |
| 2  | 2.8940 | 1   | -1  |
| 3  | 4.7317 | 1   | -1  |
| 4  | 4.0255 | 1   | -1  |
| 5  | 3.7738 | 1   | -1  |
| 6  | 2.7529 | 1   | -1  |
| 7  | 3.6547 | 1   | -1  |
| 8  | 2.1745 | 1   | -1  |
| 9  | 4.8593 | 1   | 1   |
| 10 | 6.9221 | 1   | 1   |
| ⋮  | ⋮      | ⋮   | ⋮   |
Extended simulations
With a bit more extension, these simulations can be used to answer a number of questions.
- Simulate data in R and fit using lmer. Compare and contrast the results and create a "cheat sheet" mapping the output of one package onto the other
- Compare TPR and FPR across methods: Matlab fitlme, R lmer, and glmfit_multilevel
- Compare with RM ANOVA
- Simulate and compare true positive rates (TPR, power) and false positive rates (FPR) in these interesting cases:
- mismodeling: ignoring a random effect
- unbalanced data across levels of a predictor
- unequal N within subject
- unequal variances within-subject
- experimental design vs. normally-distributed predictors
- non-normally distributed errors
- more vs. fewer irrelevant predictors
- more levels of a predictor (mixed vs ANOVA)
- crossed random effects with more or fewer levels
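As a starting point for these extended simulations, here is a sketch of a Monte Carlo false-positive-rate check using the two-stage summary-statistics approach. The function name and all settings are illustrative, not part of the tutorial's code:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def simulate_fpr(n_sims=500, m=20, n=10, alpha=0.05):
    """Estimate the false positive rate of the group slope test
    when the true population slope is zero (null is true)."""
    x = np.linspace(0, 4, n)
    Z = np.column_stack([np.ones(n), x])
    hits = 0
    for _ in range(n_sims):
        # True population slope = 0; random subject slopes with SD 0.5
        betas = np.column_stack([rng.normal(2, 1, m), rng.normal(0, 0.5, m)])
        Y = betas @ Z.T + rng.standard_normal((m, n))
        b = np.array([np.linalg.lstsq(Z, Y[i], rcond=None)[0] for i in range(m)])
        _, p = stats.ttest_1samp(b[:, 1], 0.0)
        hits += p < alpha
    return hits / n_sims

print(simulate_fpr())  # should be near the nominal alpha = 0.05
```

Replacing the zero population slope with a nonzero one turns the same loop into a power (TPR) simulation, and swapping the data-generation step lets you probe the mismodeling scenarios listed above.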