Fundamentals of linear mixed models
Topics: Linear models review; Fixed effects vs. random effects
- Welcome!
- Housekeeping
- Outline for today
- Linear models review
- Adding a random effect to the model
- Generalities on mixed models
- Applied example
- Discussions
- Summary
- What’s next
Welcome!
- About us: Josefina and Claudio.
- About you.
- Frequent responses to the registration survey.
- Some knowledge of the existence of mixed effects models.
- Mostly from the life sciences, where data often show heteroscedasticity (tomorrow's topic) and dependent observations (this workshop)!
Housekeeping
- We will have a relatively low proportion of R code in this workshop. Instead, we will focus on understanding the components of mixed-effects models. Questions/concerns via e-mail are encouraged.
- Emphasis on modeling and figuring out what mixed-effects models actually do.
- The statistical notation we will use throughout this workshop is presented here.
- DON’T PANIC when you see all the math notation and matrices. We will go slowly together. It is also not expected that you walk out of this workshop as a math notation wizard!
Part 1. Fundamentals of linear mixed models
Review of linear models & basic concepts of linear mixed models
Part 2. Modeling data generated by designed experiments
Part 3. Generalized linear mixed models (aka non-normal response) applied to designed experiments
Outline for today
- Review of linear models: begin with a refresher on linear models (all fixed-effects), then introduce mixed-effects models by incorporating an additional assumption.
- The variance-covariance matrix: the information of the random effects is stored in the variance-covariance matrix!
- Fixed effects vs. random effects: where/how is the information from each type of effect captured?
- Application: practical implications of these concepts when fitting a model to your data.
Linear models review
The quadratic regression is a linear model, too
One of the most popular linear models is the intercept-and-slope model (sometimes also called linear regression), often used to describe the (linear) relationship between two quantitative variables. Quadratic regression is also a linear model; it is often used to describe the relationship between two quantitative variables when it is not exactly linear but shows some curvature.
Most of us learned a way of writing out the statistical model called “model equation form”. For a quantitative predictor \(x\), the model equation form is
\[y_{i} = \beta_0 + x_{i} \beta_1 + x_{i}^2 \beta_2 + \varepsilon_{i}, \\ \varepsilon_i \sim N(0, \sigma^2),\]where \(y_{i}\) is the observed value for the \(i\)th observation, \(\beta_0\) is the intercept (i.e., the expected value of \(y\) when \(x=0\)), \(\beta_1\) is the linear component (i.e., the expected change in \(y_i\) with a one-unit increase in \(x\)), \(\beta_2\) is the quadratic component (i.e., the expected change in \(y_i\) with a one-unit increase in \(x^2\)), \(x_i\) is the predictor for the \(i\)th observation, and \(\varepsilon_{i}\) is the difference between the observed value \(y_i\) and the expected value \(E(y_i) = \beta_0+x_i\beta_1+ x_{i}^2 \beta_2\) - that's why we often call it the "residual". Typically, \(\boldsymbol{\beta} \equiv \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}\) is estimated via maximum likelihood estimation. Also, when we assume a Normal distribution, maximum likelihood estimation yields the same point estimates as least squares estimation.
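As a quick illustration (simulated data, not the workshop data set; the variable names and coefficient values below are made up), this model can be fit in R with lm() by adding the squared term via I():
set.seed(42)
x <- runif(100, 0, 10)
y <- 2 + 1.5 * x - 0.1 * x^2 + rnorm(100, sd = 1)  # true beta0 = 2, beta1 = 1.5, beta2 = -0.1
m_quad <- lm(y ~ x + I(x^2))                       # I() keeps the squared term as its own predictor
summary(m_quad)$coefficients                       # estimates of beta0, beta1, beta2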
Let’s review the assumptions we make in this model (which is the default model in most software).
- Linearity
- Constant variance
- Independence
- Normality
Let’s review that simple model using distribution notation and matrix notation
There are other ways of writing out the statistical model above, besides the model equation form. Instead of focusing on the distribution of the residuals, we can focus on the distribution of the data \(y\):
\[y_{i} \sim N(\mu_i, \sigma^2), \\ \mu_i = \beta_0 + x_{i} \beta_1 + x_{i}^2 \beta_2.\]This type of notation is called “probability distribution form”. The probability distribution form makes it easier to switch to other distributions beyond the Normal (Part 3 of this workshop). We can further express this equation using vectors and matrices:
\[\mathbf{y} \sim N(\boldsymbol{\mu}, \Sigma), \\ \boldsymbol{\mu} = \boldsymbol{1} \beta_0 + \mathbf{x} \beta_1 + \mathbf{x}^{2} \beta_2 = \mathbf{X}\boldsymbol{\beta},\]where \(n\) is the total number of observations, \(p\) is the total number of parameters, \(\mathbf{y}\) is an \(n \times 1\) vector containing all observations, \(\boldsymbol{\mu}\) is an \(n \times 1\) vector containing the expected values of said observations, \(\mathbf{x}\) and \(\mathbf{x}^{2}\) are \(n \times 1\) vectors containing the predictor values and their squares, \(\boldsymbol{\beta}\) is a \(p \times 1\) vector containing the (fixed) parameters, \(\mathbf{X}\) is an \(n \times p\) matrix containing the predictors, and \(\Sigma\) is an \(n \times n\) matrix called the variance-covariance matrix. Note: This type of notation is also convenient for thinking about how to prepare the data in your spreadsheet. One row per observation (rows in \(\mathbf{X}\)), one variable per column (columns in \(\mathbf{X}\)).
These expressions can be expanded into:
\[\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \sim N \left( \begin{bmatrix}\mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}, \begin{bmatrix} Cov(y_1, y_1) & Cov(y_1, y_2) & \dots & Cov(y_1, y_n) \\ Cov(y_2, y_1) & Cov(y_2, y_2) & \dots & Cov(y_2, y_n)\\ \vdots & \vdots & \ddots & \vdots \\ Cov(y_n, y_1) & Cov(y_n, y_2) & \dots & Cov(y_n, y_n) \end{bmatrix} \right),\]which is the same as
\[\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \sim N \left( \begin{bmatrix}\mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}, \begin{bmatrix} Var(y_1) & Cov(y_1, y_2) & \dots & Cov(y_1, y_n) \\ Cov(y_2, y_1) & Var(y_2) & \dots & Cov(y_2, y_n)\\ \vdots & \vdots & \ddots & \vdots \\ Cov(y_n, y_1) & Cov(y_n, y_2) & \dots & Var(y_n) \end{bmatrix} \right).\]
Linear model with a qualitative predictor
Instead of a quantitative predictor, we could have a qualitative (or categorical) predictor. Let the qualitative predictor have two possible levels, A and B. We could use the same model as before,
\[y_{i} \sim N(\mu_i, \sigma^2), \\ \mu_i = \beta_0 + x_i \beta_1,\]and say that \(y_{i}\) is the observed value for the \(i\)th observation, \(\beta_0\) is the expected value for A, \(x_i = 0\) if the \(i\)th observation belongs to A, and \(x_i = 1\) if the \(i\)th observation belongs to B. That way, \(\beta_1\) is the difference between A and B.
If the categorical predictor has more than two levels,
\[y_{i} \sim N(\mu_i, \sigma^2), \\ \mu_i = \beta_0 + x_{1i} \beta_1 + x_{2i} \beta_2+ ... + x_{ji} \beta_j,\]\(y_{i}\) is still the observed value for the \(i\)th observation, \(\beta_0\) is the expected value for A (sometimes in designed experiments this level is a control),
\(x_{1i} = \begin{cases} 1, & \text{if Treatment is B} \\ 0, & \text{otherwise} \end{cases},\) \(x_{2i} = \begin{cases} 1, & \text{if Treatment is C} \\ 0, & \text{otherwise} \end{cases},\) \(\dots,\)
\[x_{ji} = \begin{cases} 1, & \text{if Treatment is J} \\ 0, & \text{otherwise} \end{cases}.\]That way, all \(\beta\)s are the differences between each treatment and the control.
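To see this 0/1 (dummy) coding in practice, we can inspect the design matrix R builds from a factor; the treatment labels below are hypothetical and only for illustration:
trt <- factor(c("A", "A", "B", "B", "C", "C"))  # hypothetical treatment factor; A is the reference level
model.matrix(~ trt)                             # a column of 1s (intercept) plus one 0/1 column per non-reference level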
What are variance-covariance matrices anyways?
Before we keep on talking about independent observations and residual variance, let’s review what that actually means.
What does variance mean?
Random variables are usually described through properties like the expected value and the variance. The expected value is the first moment of a distribution, and the variance is its second central moment. Regardless of the distribution of a random variable \(Y\), we can calculate its expected value \(E(Y)\) and variance \(Var(Y)\). The expected value measures the average outcome of \(Y\). The variance measures the dispersion of \(Y\), i.e., how far the possible outcomes are spread out from their average.
Discuss in the plot above:
- Expected value
- Variance
- Covariance?
On the covariance of two random variables \(y_1\) and \(y_2\)
The covariance between two random variables describes how the two variables behave relative to each other; essentially, it quantifies their joint variability. The variance of a random variable is the covariance of that variable with itself. Consider two variables \(y_1\) and \(y_2\), each with a variance of 1 and a covariance of 0.6. The data shown in Figure 4 arise from the multivariate normal distribution
\[\begin{bmatrix}y_1 \\ y_2 \end{bmatrix} \sim MVN \left( \begin{bmatrix} 10 \\ 8 \end{bmatrix} , \begin{bmatrix}1 & 0.6 \\ 0.6 & 1 \end{bmatrix} \right),\]where the means of \(y_1\) and \(y_2\) are 10 and 8, respectively, and their covariance structure is represented in the variance-covariance matrix. Remember,
\[\begin{bmatrix}y_1 \\ y_2 \end{bmatrix} \sim MVN \left( \begin{bmatrix} E(y_1) \\ E(y_2) \end{bmatrix} , \begin{bmatrix} Var(y_1) & Cov(y_1, y_2) \\ Cov(y_2,y_1) & Var(y_2) \end{bmatrix} \right).\]
Discuss in the plot above:
- Expected value
- Variance
- Covariance
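As a minimal sketch of what Figure 4 represents, we can simulate from the bivariate normal above with MASS::mvrnorm() and check that the sample means, variances, and covariance are close to the values we specified:
library(MASS)
Sigma <- matrix(c(1.0, 0.6,
                  0.6, 1.0), nrow = 2)            # variance-covariance matrix from above
yy <- mvrnorm(n = 5000, mu = c(10, 8), Sigma = Sigma)
colMeans(yy)                                      # approximately 10 and 8
cov(yy)                                           # approximately Sigma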
Adding a random effect to the model
The assumption behind independent observations
If we used the model written above (the default in most software), we would be assuming
\[\mathbf{y} \sim N(\boldsymbol{\mu}, \Sigma),\\ \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ \vdots \\ y_n \end{bmatrix} \sim N \left( \begin{bmatrix}\mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \\ \vdots \\ \mu_n \end{bmatrix}, \sigma^2 \begin{bmatrix} 1 & 0 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & 0 & \dots & 0 \\ 0 & 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \dots & 1 \end{bmatrix} \right),\]which is the same as
\[\begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ \vdots \\ y_n \end{bmatrix} \sim N \left( \begin{bmatrix}\mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \\ \vdots \\ \mu_n \end{bmatrix}, \begin{bmatrix} \sigma^2 & 0 & 0 & 0 & \dots & 0 \\ 0 & \sigma^2 & 0 & 0 & \dots & 0 \\ 0 & 0 & \sigma^2 & 0 & \dots & 0 \\ 0 & 0 & 0 & \sigma^2 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \dots & \sigma^2 \end{bmatrix} \right).\]Discuss the assumptions:
- Linearity
- Constant variance
- Independence
- Normality
Non-independent observations
The yield data come from 5 different fields, each approximately homogeneous within itself, so the assumption of independence we made before was a stretch. Observations from the same field have more in common than observations from different fields: they share soil and environment and, with that, a baseline fertility and yield potential. It is therefore not reasonable to assume that all observations are independent.
Basically, we expect the yield response to N fertilizer to be similar among fields, but the baseline (a.k.a. the intercept) to be field-specific. Then,
\[y_{ij} = \beta_{0j} + x_{ij} \beta_1 + x_{ij}^2 \beta_2 + \varepsilon_{ij}, \\ \varepsilon_{ij} \sim N(0, \sigma^2).\]Now, there are different ways to model that field-specific intercept.
How do we define \(\beta_{0j}\)?
This is a big forking path in statistical modeling. All-fixed models estimate the effects. Mixed-effects models indicate what is similar to what via random effects.
Fixed effects
So far, we could have defined an all-fixed model,
\[y_{ij} = \beta_{0j} + x_{ij} \beta_1 + x^2_{ij} \beta_2 + \varepsilon_{ij}, \\ \beta_{0j} = \beta_0 + u_j \\ \varepsilon_{ij} \sim N(0, \sigma^2),\]where \(u_j\) is the effect of the \(j\)th field on the intercept (i.e., on the baseline). In this case, \(u_j\) is a fixed effect, which means it may be estimated via least squares estimation or maximum likelihood estimation. Under both least squares and maximum likelihood (assuming normal distribution), we may estimate the parameters by computing
\[\hat{\boldsymbol{\beta}}_{ML} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y},\]which yields the minimum variance unbiased estimate of \(\boldsymbol{\beta}\).
We still have \(\boldsymbol{\Sigma} = \begin{bmatrix} \sigma_{\varepsilon}^2 & 0 & 0 & 0 & \dots & 0 \\ 0 & \sigma_{\varepsilon}^2 & 0 & 0 & \dots & 0 \\ 0 & 0 & \sigma_{\varepsilon}^2 & 0 & \dots & 0 \\ 0 & 0 & 0 & \sigma_{\varepsilon}^2 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \dots & \sigma_{\varepsilon}^2 \end{bmatrix}\).
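As a sanity check on hypothetical data (the field labels, coefficients, and sample size below are made up), the closed-form estimator above can be computed by hand and compared with lm():
set.seed(1)
x <- rep(1:5, times = 3)
field <- factor(rep(c("f1", "f2", "f3"), each = 5))            # hypothetical fields
y <- 3 + as.numeric(field) + 0.5 * x + rnorm(15, sd = 0.3)
X <- model.matrix(~ field + x)                                 # design matrix with field dummies
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y                   # (X'X)^-1 X'y
cbind(by_hand = as.vector(beta_hat), lm = coef(lm(y ~ field + x)))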
Random effects
We could also assume that the effects of the \(j\)th field (i.e., \(u_j\)) arise from a random distribution. The most common assumption (and the default in most statistical software) is that
\[u_j \sim N(0, \sigma^2_u).\]Now, we don’t estimate the effect, but the variance \(\sigma^2_u\).
Note that there are \(J\) levels of the random effects, meaning that a random effect is always categorical.
Also, now
\[\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{V}^{-1}\mathbf{y},\]where \(\mathbf{V} = Var(\mathbf{y})\) is the variance-covariance matrix of \(\mathbf{y}\), including the residual variance and the random-effects variance. Note that this formula yields the same point estimate for \(\boldsymbol{\beta}\), but with a different confidence interval.
Now, the variance-covariance matrix \(\boldsymbol{\Sigma}\) of the marginal distribution has changed (Figure 7). See next section for more details on how it’s changed.
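To make the forking path concrete, here is a sketch with simulated data (the column names yield, n_rate, and field, and all numbers, are hypothetical); the all-fixed model estimates one intercept per field, while the mixed model estimates the variance of the field intercepts:
library(lme4)
set.seed(7)
dat <- expand.grid(field = factor(1:5), n_rate = seq(0, 250, by = 50))  # 5 hypothetical fields
u <- rnorm(5, mean = 0, sd = 1.5)                                       # field-specific baselines, u_j ~ N(0, sigma_u^2)
dat$yield <- 8 + u[dat$field] + 0.02 * dat$n_rate - 0.00005 * dat$n_rate^2 + rnorm(nrow(dat), sd = 0.5)
m_allfixed <- lm(yield ~ field + n_rate + I(n_rate^2), data = dat)      # fields as fixed effects
m_mixed <- lmer(yield ~ n_rate + I(n_rate^2) + (1 | field), data = dat) # fields as a random effect
VarCorr(m_mixed)                                                        # estimated sigma_u^2 and sigma^2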
Generalities on mixed models
Mixed models combine fixed effects and random effects. Generally speaking, we can write out a mixed-effects model using the model equation form, as
\[\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon}, \\ \begin{bmatrix}\mathbf{u} \\ \boldsymbol{\varepsilon} \end{bmatrix} \sim N \left( \begin{bmatrix}\boldsymbol{0} \\ \boldsymbol{0} \end{bmatrix}, \begin{bmatrix}\mathbf{G} & \boldsymbol{0} \\ \boldsymbol{0} & \mathbf{R} \end{bmatrix} \right),\]where \(\mathbf{y}\) is the observed response, \(\mathbf{X}\) is the matrix with the explanatory variables, \(\mathbf{Z}\) is the design matrix for the random effects, \(\boldsymbol{\beta}\) is the vector containing the fixed effects, \(\mathbf{u}\) is the vector containing the random effects, \(\boldsymbol{\varepsilon}\) is the vector containing the residuals, \(\mathbf{G}\) is the variance-covariance matrix of the random effects, and \(\mathbf{R}\) is the variance-covariance matrix of the residuals. Note that \(\mathbf{X} \boldsymbol{\beta}\) is the fixed-effects part of the model, and \(\mathbf{Z}\mathbf{u}\) is the random-effects part of the model.
Using the probability distribution form, we can then say that \(E(\mathbf{y}) = \mathbf{X}\boldsymbol{\beta}\) and \(Var(\mathbf{y}) = \mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{R}\). Usually, we assume \(\mathbf{G} = \sigma^2_u \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{bmatrix}\) and \(\mathbf{R} = \sigma^2 \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{bmatrix}\).
Then,
\[\mathbf{y} \sim N(\boldsymbol{\mu}, \Sigma),\] \[\Sigma = \begin{bmatrix} \sigma^2 + \sigma^2_u & \sigma^2_u & 0 & 0 & 0 & 0 &\dots & 0\\ \sigma^2_u & \sigma^2 + \sigma^2_u & 0 & 0 & 0 & 0 & \dots & 0 \\ 0 & 0 & \sigma^2 + \sigma^2_u & \sigma^2_u & 0 & 0 & \dots & 0 \\ 0 & 0 & \sigma^2_u & \sigma^2 + \sigma^2_u & 0 & 0 & \dots & 0 \\ 0 & 0 & 0 & 0 & \sigma^2 + \sigma^2_u & \sigma^2_u & \dots & 0 \\ 0 & 0 & 0 & 0 & \sigma^2_u & \sigma^2 + \sigma^2_u & \dots & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & 0 & 0 & \dots & \sigma^2 + \sigma^2_u \end{bmatrix}.\]Take your time to digest the variance-covariance matrix above. What type of data do you think generated it?
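A small numeric sketch may help digest it. For, say, three groups with two observations each (variance values made up), we can build \(\mathbf{Z}\), \(\mathbf{G}\), and \(\mathbf{R}\) by hand and verify that \(\mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{R}\) has exactly the block structure shown above:
sigma2_u <- 2                                            # hypothetical random-effect variance
sigma2 <- 1                                              # hypothetical residual variance
Z <- kronecker(diag(3), matrix(1, nrow = 2, ncol = 1))   # 6 x 3: each column marks one group's observations
G <- sigma2_u * diag(3)                                  # Var(u)
R <- sigma2 * diag(6)                                    # Var(residuals)
Z %*% G %*% t(Z) + R                                     # sigma2 + sigma2_u on the diagonal, sigma2_u within each group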
Random effects
- By definition, random effects are regression coefficients that arise from a random distribution.
- Typically, a random effect \(u \sim N(0, \sigma^2_u)\).
- Note that this model for the parameter may result in shrinkage (see the sketch after this list).
- We estimate the variance \(\sigma^2_u\).
- Calculating degrees of freedom can get much more complex than in all-fixed effects models (e.g., with unbalanced data, spatio-temporally correlated data, or non-normal data).
- In the context of designed experiments, random effects are assumed to be independent of each other and independent of the residuals.
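One way to see that shrinkage (on hypothetical, simulated data) is to compare raw group deviations from the overall mean with the predicted random effects (BLUPs) from lmer(); the BLUPs are pulled toward zero:
library(lme4)
set.seed(11)
d <- data.frame(group = factor(rep(1:8, each = 3)))
d$y <- 10 + rnorm(8, sd = 2)[d$group] + rnorm(nrow(d), sd = 3)
raw_dev <- tapply(d$y, d$group, mean) - mean(d$y)                    # group deviations, treated as fixed
blup <- ranef(lmer(y ~ 1 + (1 | group), data = d))$group[, 1]        # predicted random effects (BLUPs)
cbind(raw = raw_dev, blup = blup)                                    # BLUPs are shrunk toward 0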
Estimation of parameters
“Estimation” is a term reserved mostly for fixed effects and variance components (random effects are technically “predicted”, not estimated). Restricted maximum likelihood estimation (REML) is the default in most mixed-model software because, for small datasets (a.k.a. most experimental data), maximum likelihood (ML) provides variance estimates that are biased downward.
- In REML, the likelihood is maximized after accounting for the model’s fixed effects.
- In ML, \(\ell_{ML}(\boldsymbol{\sigma}; \boldsymbol{\beta}, \mathbf{y}) = - (\frac{n}{2}) \log(2\pi)-(\frac{1}{2}) \log ( \vert \mathbf{V}(\boldsymbol\sigma) \vert ) - (\frac{1}{2}) (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^T[\mathbf{V}(\boldsymbol\sigma)]^{-1}(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})\)
- In REML, \(\ell_{REML}(\boldsymbol{\sigma};\mathbf{y}) = - (\frac{n-p}{2}) \log (2\pi) - (\frac{1}{2}) \log ( \vert \mathbf{V}(\boldsymbol\sigma) \vert ) - (\frac{1}{2})\log \left( \vert \mathbf{X}^T[\mathbf{V}(\boldsymbol\sigma)]^{-1}\mathbf{X} \vert \right) - (\frac{1}{2})\mathbf{r}^T[\mathbf{V}(\boldsymbol\sigma)]^{-1}\mathbf{r}\),
where \(p = rank(\mathbf{X})\), \(\mathbf{r} = \mathbf{y}-\mathbf{X}\hat{\boldsymbol{\beta}}_{ML}\).
In practice, the (RE)ML estimates are obtained iteratively:
- Start with initial values for \(\boldsymbol{\sigma}\), \(\tilde{\boldsymbol{\sigma}}\).
- Compute \(\mathbf{G}(\tilde{\boldsymbol{\sigma}})\) and \(\mathbf{R}(\tilde{\boldsymbol{\sigma}})\).
- Obtain \(\hat{\boldsymbol{\beta}}\) and \(\hat{\mathbf{u}}\).
- Update \(\tilde{\boldsymbol{\sigma}}\).
- Repeat until convergence.
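In lme4, for example, the REML vs. ML choice is the REML argument of lmer(); a sketch on simulated data (all numbers hypothetical) typically shows the ML variance components coming out smaller:
library(lme4)
set.seed(3)
d <- data.frame(block = factor(rep(1:4, each = 5)), trt = factor(rep(1:5, times = 4)))
d$y <- 10 + c(0, 0.5, 1, 1.5, 2)[d$trt] + rnorm(4, sd = 2)[d$block] + rnorm(20, sd = 0.8)
m_reml <- lmer(y ~ trt + (1 | block), data = d, REML = TRUE)   # REML (the default)
m_ml <- lmer(y ~ trt + (1 | block), data = d, REML = FALSE)    # ML
VarCorr(m_reml)                                                # REML variance components
VarCorr(m_ml)                                                  # ML components tend to be smaller in small samples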
Fixed effects versus random effects
What is behind a fixed effect versus a random effect:
- \[\hat{\boldsymbol{\beta}} \sim N \left( \boldsymbol{\beta}, (\mathbf{X}^T \mathbf{V}^{-1} \mathbf{X})^{-1} \right)\]
- \[u_j \sim N(0, \sigma^2_u)\]
- What process is being studied?
- How were the levels selected? (randomly, carefully selected)
- How many levels does the factor have, vs. how many did we observe?
- BLUEs versus BLUPs.
Read more in Gelman (2005, page 20), “Analysis of variance—why it is more important than ever” [link], and Gelman and Hill (2006), page 245.
Group discussion: what determines if an effect should be random or fixed?
Case: scientists decided to run their experiments in 3 different environments (SE Kansas, NW Kansas, Central Kansas).
They set up a randomized complete block design at each location.
However, they are not interested in studying the environments themselves.
How should they model the experiment?
Applied example
- Field experiment studying the effect of potassium on corn yield.
- One treatment factor: Potassium fertilizer (one-way treatment structure).
- Randomized Complete Block Design with 4 repetitions (design structure).
Building a statistical model
- What is the experiment blueprint? (aka design structure or structure in the data)
- What are the treatments?
We can easily come up with two models:
- Blocks fixed \(y_{ijk} = \mu + \tau_i + \rho_j + \varepsilon_{ijk}; \ \ \varepsilon_{ijk} \sim N(0, \sigma^2)\).
- Blocks random \(y_{ijk} = \mu + \tau_i + u_j + \varepsilon_{ijk}; \ \ u_j \sim N(0, \sigma^2_u); \ \ \varepsilon_{ijk} \sim N(0, \sigma^2)\), where \(u\) and \(\varepsilon\) are independent.
library(lme4)
library(tidyverse)
library(emmeans)
library(latex2exp)
df <- read.csv("cochrancox_kfert.csv")
df$rep <- as.factor(df$rep)               # block (rep) as a factor
df$K2O_lbac <- as.factor(df$K2O_lbac)     # K2O rate (lb/ac) treated as categorical
m_fixed <- lm(yield ~ K2O_lbac + rep, data = df)           # blocks fixed
m_random <- lmer(yield ~ K2O_lbac + (1|rep), data = df)    # blocks random
(mg_means_fixed <- emmeans(m_fixed, ~K2O_lbac, contr = list(c(1, 0, 0, 0, -1))))
## $emmeans
## K2O_lbac emmean SE df lower.CL upper.CL
## 36 7.92 0.121 8 7.64 8.19
## 54 8.12 0.121 8 7.84 8.40
## 72 7.81 0.121 8 7.53 8.09
## 108 7.58 0.121 8 7.30 7.86
## 144 7.52 0.121 8 7.24 7.79
##
## Results are averaged over the levels of: rep
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## c(1, 0, 0, 0, -1) 0.4 0.171 8 2.344 0.0471
##
## Results are averaged over the levels of: rep
(mg_means_random <- emmeans(m_random, ~K2O_lbac, contr = list(c(1, 0, 0, 0, -1))))
## $emmeans
## K2O_lbac emmean SE df lower.CL upper.CL
## 36 7.92 0.162 5.57 7.51 8.32
## 54 8.12 0.162 5.57 7.72 8.52
## 72 7.81 0.162 5.57 7.41 8.21
## 108 7.58 0.162 5.57 7.18 7.98
## 144 7.52 0.162 5.57 7.11 7.92
##
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## c(1, 0, 0, 0, -1) 0.4 0.171 8 2.344 0.0471
##
## Degrees-of-freedom method: kenward-roger
Discussions
Now we know what we mean when we say “factor A was considered fixed and factor B was considered random”!
Group discussion: What to write in the Materials and Methods section of a paper.
- Field-specific consensus
- Enough information to be reproducible
Example 1: A study to find out if water capture increased as the result of selection for yield in SX hybrids in the US corn-belt (Reyes et al., 2015). [link]
Example 2: A study to find out if developmental telomere attrition is a measure of state in birds, and hence should predict state-dependent decisions such as the relative value assigned to immediate versus delayed food rewards (Bateson et al., 2014). [link]
Summary
| | Fixed effects | Random effects |
|---|---|---|
| Where | Expected value | Variance-covariance matrix |
| Inference | Constant for all groups in the population of study | Differ from group to group |
| Usually used to model | Carefully selected treatments or genotypes | The study design (aka structure in the data, or what is similar to what) |
| Estimation | $$\hat{\boldsymbol{\beta}} \sim N \left( \boldsymbol{\beta}, (\mathbf{X}^T \mathbf{V}^{-1} \mathbf{X})^{-1} \right) $$ | $$u_j \sim N(0, \sigma^2_u)$$ |
| Method of estimation | Maximum likelihood, least squares. BLUEs. | Restricted maximum likelihood (shrinkage). BLUPs. |
What’s next
- Linear mixed models applied to data generated by designed experiments.