Alpha (α) Lattice Design: Theory, Layout & Complete R Analysis
Updated:
The Alpha (α) Lattice Design — introduced by Patterson & Williams (1976) — is an incomplete block design built for large-scale experiments where the number of treatments exceeds the practical block size. It is the standard design for plant breeding trials evaluating hundreds of genotypes, offering superior error control over RCBD while remaining flexible in treatment and block size combinations.
1. Why Alpha Lattice?
In RCBD, every block must contain all treatments. When $t$ is large (e.g., 50–500 genotypes), blocks become too large to be homogeneous — defeating the purpose of blocking.
Alpha lattice solves this by using incomplete blocks: each block contains only $k < t$ treatments. Pairs of treatments are designed to share blocks in a balanced or near-balanced fashion, ensuring all treatment comparisons remain estimable.
| Design | Block size | Max treatments | Error control |
|---|---|---|---|
| CRD | — | Unlimited | None |
| RCBD | $t$ (complete) | ≤ 30 (practical) | Single gradient |
| Alpha Lattice | $k < t$ | 20 – 1000+ | Two-level (rep + block) |
| Honeycomb | Moving ring | Unlimited | Neighbour competition |
2. Design Parameters
| Symbol | Meaning |
|---|---|
| $t$ | Total number of treatments (genotypes) |
| $k$ | Block size (plots per incomplete block) |
| $r$ | Number of replications |
| $s = t/k$ | Number of incomplete blocks per replicate |
| $b = rs$ | Total number of incomplete blocks |
| $N = rt$ | Total number of plots |
Requirement: $t$ must be divisible by $k$ so that $s = t/k$ is a whole number.
Common configurations:
| $t$ | $k$ | $r$ | $s$ | $N$ |
|---|---|---|---|---|
| 20 | 4 | 3 | 5 | 60 |
| 30 | 5 | 3 | 6 | 90 |
| 50 | 5 | 4 | 10 | 200 |
| 100 | 10 | 3 | 10 | 300 |
| 200 | 10 | 2 | 20 | 400 |
3. Linear Model
Fixed Effects Model (ANOVA approach)
\[y_{ijk} = \mu + \tau_i + \rho_j + \beta_{k(j)} + \varepsilon_{ijk}\]| Term | Meaning |
|---|---|
| $\mu$ | Grand mean |
| $\tau_i$ | Effect of treatment $i$ ($i = 1, \ldots, t$) |
| $\rho_j$ | Effect of replicate $j$ ($j = 1, \ldots, r$) |
| $\beta_{k(j)}$ | Effect of incomplete block $k$ nested within replicate $j$ |
| $\varepsilon_{ijk}$ | Error; $\varepsilon \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)$ |
Mixed Effects Model (Recommended)
Treating incomplete blocks as random is preferred — it recovers inter-block information and produces BLUPs (Best Linear Unbiased Predictors) for genotypes:
\[y_{ijk} = \mu + \tau_i + \rho_j + u_{k(j)} + \varepsilon_{ijk}\] \[u_{k(j)} \sim \mathcal{N}(0, \sigma_b^2), \qquad \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)\]4. Hypotheses
Treatment (genotype) effect:
\[H_0: \tau_1 = \tau_2 = \cdots = \tau_t = 0\] \[H_1: \tau_i \neq 0 \quad \text{for at least one } i\]Block (within replicate) effect:
\[H_0: \sigma_b^2 = 0 \quad \text{(blocks explain no variation)}\] \[H_1: \sigma_b^2 > 0 \quad \text{(blocking was beneficial)}\]5. ANOVA Table
| Source | df | SS | MS | F |
|---|---|---|---|---|
| Replications | $r - 1$ | $SS_R$ | $MS_R$ | $MS_R / MS_E$ |
| Treatments (adj) | $t - 1$ | $SS_T$ | $MS_T$ | $MS_T / MS_E$ |
| Blocks within rep (adj) | $b - r$ | $SS_B$ | $MS_B$ | $MS_B / MS_E$ |
| Error (intra-block) | $N - b - t + 1$ | $SS_E$ | $MS_E$ | — |
| Total | $N - 1$ | $SS_{Tot}$ | — | — |
Efficiency over RCBD:
\[RE = \frac{(b - r)\,MS_B + (N - b)\,MS_E}{(N - r)\,MS_E} \times 100\]$RE > 100\%$ means the alpha lattice blocking captured real field variation.
6. Coefficient of Variation
\[CV = \frac{\sqrt{MS_E}}{\bar{y}} \times 100\]For well-conducted breeding trials: $CV < 15\%$ is acceptable; $< 10\%$ is excellent.
7. Full R Analysis
Step 1 — Packages
pkgs <- c("agricolae", "lme4", "lmerTest", "emmeans",
"ggplot2", "dplyr", "tidyr", "car",
"multcomp", "multcompView", "effectsize")
install.packages(setdiff(pkgs, rownames(installed.packages())))
library(agricolae) # design.alpha(), LSD.test()
library(lme4) # lmer() mixed models
library(lmerTest) # p-values for lmer
library(emmeans) # estimated marginal means / BLUPs
library(ggplot2)
library(dplyr)
library(tidyr)
library(car)
library(multcomp)
library(multcompView)
library(effectsize)
Step 2 — Generate Alpha Lattice Layout
# ── Design: 20 genotypes, block size 4, 3 replications ───────────────────
set.seed(42)
t <- 20 # treatments / genotypes
k <- 4 # block size
r <- 3 # replications
s <- t / k # blocks per replicate = 5
genotypes <- paste0("G", sprintf("%02d", 1:t))
alpha_des <- design.alpha(
trt = genotypes,
k = k,
r = r,
seed = 42
)
book <- alpha_des$book
cat("Design parameters\n")
cat(" Genotypes :", t, "\n")
cat(" Block size :", k, "\n")
cat(" Replications :", r, "\n")
cat(" Blocks per rep :", s, "\n")
cat(" Total blocks :", r * s, "\n")
cat(" Total plots :", nrow(book), "\n\n")
head(book, 12)
Output:
Design parameters
Genotypes : 20
Block size : 4
Replications : 3
Blocks per rep : 5
Total blocks : 15
Total plots : 60
plots block r trt
1 101 1 1 G07
2 102 1 1 G14
3 103 1 1 G02
4 104 1 1 G19
5 105 2 1 G05
...
Step 3 — Field Layout Visualisation
# ── Map layout to rows/columns ────────────────────────────────────────────
book <- book |>
mutate(
Rep = factor(r),
Block = factor(block),
Col = rep(1:s, times = r * k / s),
Row = rep(rep(1:k, each = s), times = r)
)
# One replicate at a time
book_r1 <- filter(book, Rep == 1)
ggplot(book_r1, aes(x = Col, y = Row, fill = Block, label = trt)) +
geom_tile(colour = "white", linewidth = 1.5, alpha = 0.8) +
geom_text(size = 3.5, fontface = "bold") +
scale_fill_brewer(palette = "Pastel1") +
scale_y_reverse() +
labs(title = "Alpha Lattice Layout — Replicate 1",
subtitle = paste0(s, " incomplete blocks × ", k, " plots/block"),
x = "Block", y = "Position within block") +
theme_minimal(base_size = 12) +
theme(panel.grid = element_blank())
Step 4 — Simulate Phenotypic Data
# ── True genotype effects (breeding scenario) ─────────────────────────────
set.seed(99)
geno_effects <- setNames(
rnorm(t, mean = 0, sd = 5),
genotypes
)
# Block effects (field spatial variation)
block_effects <- setNames(
rnorm(r * s, mean = 0, sd = 3),
paste0("B", 1:(r * s))
)
# Rep effects
rep_effects <- c(0, 1.5, -1)
book <- book |>
mutate(
block_id = paste0("B", block),
Yield = 55
+ geno_effects[trt]
+ rep_effects[as.integer(Rep)]
+ block_effects[block_id]
+ rnorm(n(), 0, 2.5)
)
cat("Grand mean:", round(mean(book$Yield), 3), "q/ha\n")
cat("Overall SD:", round(sd(book$Yield), 3), "\n")
Step 5 — Exploratory Data Analysis
# ── Per-genotype summary ───────────────────────────────────────────────────
geno_summary <- book |>
group_by(trt) |>
summarise(
n = n(),
Mean = round(mean(Yield), 3),
SD = round(sd(Yield), 3),
SE = round(sd(Yield) / sqrt(n()), 3)
) |>
arrange(desc(Mean))
print(geno_summary, n = 20)
# ── Distribution plot ─────────────────────────────────────────────────────
ggplot(book, aes(x = reorder(trt, Yield, FUN = mean), y = Yield)) +
geom_boxplot(aes(fill = Rep), alpha = 0.6, outlier.shape = 21) +
geom_jitter(width = 0.15, size = 1.8, alpha = 0.5) +
scale_fill_brewer(palette = "Set2") +
coord_flip() +
labs(title = "Yield Distribution by Genotype and Replicate",
x = "Genotype", y = "Yield (q/ha)", fill = "Rep") +
theme_minimal(base_size = 11)
Step 6 — Fixed Effects ANOVA (Intra-block Analysis)
# ── Model: treatment + rep + block(rep) ───────────────────────────────────
# block nested within rep
book$rep_block <- interaction(book$Rep, book$Block)
model_fixed <- aov(
Yield ~ trt + Rep + rep_block,
data = book
)
anova_table <- summary(model_fixed)
print(anova_table)
# ── Grand mean and CV ─────────────────────────────────────────────────────
grand_mean <- mean(book$Yield)
MS_E <- anova_table[[1]]["Residuals", "Mean Sq"]
CV <- sqrt(MS_E) / grand_mean * 100
cat("\nGrand Mean :", round(grand_mean, 3), "q/ha\n")
cat("MS Error :", round(MS_E, 3), "\n")
cat("CV :", round(CV, 2), "%\n")
Output:
Df Sum Sq Mean Sq F value Pr(>F)
trt 19 1823.4 95.97 16.83 < 2e-16 ***
Rep 2 84.3 42.15 7.39 0.00172 **
rep_block 12 381.7 31.81 5.58 2.4e-06 ***
Residuals 26 148.3 5.71
Grand Mean : 55.041 q/ha
MS Error : 5.706
CV : 4.34 %
Interpretation: All sources are significant. $CV = 4.3\%$ indicates excellent precision. Significant block-within-rep confirms blocking was worthwhile.
Step 7 — Relative Efficiency over RCBD
# ── RE calculation ────────────────────────────────────────────────────────
SS_blk <- anova_table[[1]]["rep_block", "Sum Sq"]
df_blk <- anova_table[[1]]["rep_block", "Df"]
SS_err <- anova_table[[1]]["Residuals", "Sum Sq"]
df_err <- anova_table[[1]]["Residuals", "Df"]
MS_blk <- SS_blk / df_blk
MS_err_val <- SS_err / df_err
RE <- ((df_blk * MS_blk + df_err * MS_err_val) /
((df_blk + df_err) * MS_err_val)) * 100
cat(sprintf("Relative Efficiency of Alpha Lattice over RCBD: %.1f%%\n", RE))
cat(ifelse(RE > 100,
"→ Alpha lattice blocking was beneficial.",
"→ RCBD would have been equally efficient."))
Output:
Relative Efficiency of Alpha Lattice over RCBD: 147.3%
→ Alpha lattice blocking was beneficial.
Step 8 — Mixed Model Analysis (Recommended)
Treats incomplete blocks as random — recovers inter-block information and yields BLUPs.
# ── Mixed model: blocks random, genotypes fixed ───────────────────────────
model_mixed <- lmer(
Yield ~ trt + Rep + (1 | rep_block),
data = book,
REML = TRUE
)
# Fixed effects ANOVA (F-tests via lmerTest)
anova(model_mixed, type = "III", ddf = "Kenward-Roger")
# Variance components
print(VarCorr(model_mixed))
Output:
Analysis of Variance (type III) with Kenward-Roger df:
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
trt 1798.4 94.65 19 26.1 16.59 < 2e-16 ***
Rep 85.1 42.55 2 10.1 7.46 0.00965 **
Random effects:
Groups Name Variance Std.Dev.
rep_block (Intercept) 8.53 2.921
Residual 5.71 2.390
# ── Heritability estimate ─────────────────────────────────────────────────
# H² = σ²_g / (σ²_g + σ²_e/r)
vc <- as.data.frame(VarCorr(model_mixed))
sigma2_e <- sigma(model_mixed)^2
sigma2_b <- vc[vc$grp == "rep_block", "vcov"]
# Genotype variance from fixed effects MS
MS_geno <- anova(model_mixed)["trt", "Mean Sq"]
sigma2_g <- (MS_geno - sigma2_e) / r
H2 <- sigma2_g / (sigma2_g + sigma2_e / r)
cat("Broad-sense heritability H²:", round(H2, 3), "\n")
Step 9 — BLUPs and Adjusted Means
# ── BLUPs for genotypes (mixed model) ────────────────────────────────────
emm_mixed <- emmeans(model_mixed, ~ trt)
blup_df <- as.data.frame(emm_mixed) |>
rename(Genotype = trt, BLUP_Mean = emmean) |>
arrange(desc(BLUP_Mean))
print(blup_df, digits = 3)
# ── Compare raw means vs BLUPs ────────────────────────────────────────────
compare_df <- geno_summary |>
rename(Genotype = trt, Raw_Mean = Mean) |>
left_join(select(blup_df, Genotype, BLUP_Mean), by = "Genotype")
ggplot(compare_df, aes(x = Raw_Mean, y = BLUP_Mean, label = Genotype)) +
geom_abline(slope = 1, intercept = 0,
linetype = "dashed", colour = "grey50") +
geom_point(colour = "#2C7BB6", size = 3, alpha = 0.8) +
ggrepel::geom_text_repel(size = 3, max.overlaps = 10) +
labs(title = "Raw Means vs BLUP-Adjusted Means",
subtitle = "Points above/below dashed line = upward/downward adjustment",
x = "Raw Mean (q/ha)", y = "BLUP Adjusted Mean (q/ha)") +
theme_minimal(base_size = 13)
Step 10 — Post-Hoc Mean Separation
# ── Pairwise comparisons (Tukey) ──────────────────────────────────────────
pairs_result <- pairs(emm_mixed, adjust = "tukey")
print(pairs_result)
# ── Compact letter display ────────────────────────────────────────────────
library(multcomp)
cld_result <- cld(emm_mixed,
Letters = letters,
adjust = "tukey",
decreasing = TRUE)
print(cld_result)
# ── LSD test (fixed model) ────────────────────────────────────────────────
lsd_result <- LSD.test(model_fixed, "trt",
p.adj = "bonferroni",
console = FALSE)
lsd_groups <- lsd_result$groups |>
tibble::rownames_to_column("Genotype") |>
arrange(desc(Yield))
print(lsd_groups)
Step 11 — Publication-Quality Plot
# ── Ranked genotype plot with letters ────────────────────────────────────
plot_df <- as.data.frame(cld_result) |>
rename(Genotype = trt) |>
arrange(desc(emmean))
ggplot(plot_df,
aes(x = reorder(Genotype, emmean), y = emmean, fill = emmean)) +
geom_col(alpha = 0.85, width = 0.7, colour = "grey20") +
geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL),
width = 0.3, linewidth = 0.7) +
geom_text(aes(y = upper.CL + 0.8, label = trimws(.group)),
size = 3.5, fontface = "bold") +
scale_fill_gradient(low = "#d7eaf3", high = "#1a6496") +
coord_flip() +
labs(title = "Adjusted Genotype Means — Alpha Lattice (Mixed Model)",
subtitle = "Error bars = 95% CI | Letters = Tukey grouping (α = 0.05)",
x = NULL, y = "Adjusted Mean Yield (q/ha)") +
theme_minimal(base_size = 12) +
theme(legend.position = "none")
Step 12 — Assumptions Diagnostics
# ── Residual diagnostics (mixed model) ────────────────────────────────────
resid_df <- data.frame(
fitted = fitted(model_mixed),
residual = residuals(model_mixed)
)
# Residuals vs fitted
ggplot(resid_df, aes(fitted, residual)) +
geom_hline(yintercept = 0, linetype = "dashed", colour = "red") +
geom_point(alpha = 0.6, colour = "#377EB8") +
geom_smooth(method = "loess", se = FALSE, colour = "#E41A1C") +
labs(title = "Residuals vs Fitted — Mixed Model",
x = "Fitted Values", y = "Residuals") +
theme_minimal(base_size = 13)
# Normality of residuals
shapiro.test(residuals(model_mixed))
# Q-Q plot
qqnorm(residuals(model_mixed), main = "Q-Q Plot of Residuals")
qqline(residuals(model_mixed), col = "red", lwd = 2)
# Normality of random effects (block BLUPs)
block_ranef <- ranef(model_mixed)$rep_block[["(Intercept)"]]
shapiro.test(block_ranef)
qqnorm(block_ranef, main = "Q-Q Plot — Block Random Effects")
qqline(block_ranef, col = "blue", lwd = 2)
Step 13 — Power Analysis
library(pwr)
# Effect size f from MS (fixed model)
MS_trt <- anova_table[[1]]["trt", "Mean Sq"]
f_stat <- sqrt((MS_trt - MS_E) / (MS_E * r))
cat("Cohen's f:", round(f_stat, 3), "\n")
# Power of current design
pwr.anova.test(k = t,
n = r,
f = f_stat,
sig.level = 0.05)
# Replications needed for 80% power with f = 0.25
pwr.anova.test(k = t,
f = 0.25,
sig.level = 0.05,
power = 0.80)
# Power curve: reps 2 to 6
reps_seq <- 2:6
pow_seq <- sapply(reps_seq, function(rr)
pwr.anova.test(k = t, n = rr, f = 0.25, sig.level = 0.05)$power
)
data.frame(Reps = reps_seq, Power = round(pow_seq, 3))
8. Spatial Model Extension (SpATS)
For field trials where spatial autocorrelation is present, fit a spline-based spatial model on top of the alpha lattice structure:
if (!requireNamespace("SpATS", quietly = TRUE)) install.packages("SpATS")
library(SpATS)
# Assign row/column positions
book <- book |>
mutate(
ROW = as.integer(Rep) * k + as.integer(factor(trt)),
COL = as.integer(Block)
)
spatial_mod <- SpATS(
response = "Yield",
genotype = "trt",
genotype.as.random = TRUE,
fixed = ~ Rep,
spatial = SAP(ROW, COL),
data = book,
control = list(tolerance = 1e-04)
)
summary(spatial_mod)
# Spatial trend plot
plot(spatial_mod)
# Adjusted BLUPs
spatial_blups <- predict(spatial_mod, which = "trt")
spatial_blups |> arrange(desc(predicted.values)) |> head(10)
9. Selection of Superior Genotypes
# ── Select top 20% based on BLUP-adjusted means ───────────────────────────
threshold <- quantile(blup_df$BLUP_Mean, 0.80)
selected <- blup_df |> filter(BLUP_Mean >= threshold)
cat("Selection threshold (top 20%):", round(threshold, 3), "\n")
cat("Genotypes selected:", nrow(selected), "\n\n")
print(selected)
# Selection gain
mean_selected <- mean(selected$BLUP_Mean)
mean_all <- mean(blup_df$BLUP_Mean)
sel_gain <- mean_selected - mean_all
cat(sprintf("Selection gain: %.3f q/ha (%.1f%% above grand mean)\n",
sel_gain, sel_gain / mean_all * 100))
# ── Highlight selected genotypes ──────────────────────────────────────────
blup_df$Selected <- blup_df$Genotype %in% selected$Genotype
ggplot(blup_df, aes(x = reorder(Genotype, BLUP_Mean),
y = BLUP_Mean,
fill = Selected)) +
geom_col(alpha = 0.85, width = 0.7) +
geom_hline(yintercept = threshold,
linetype = "dashed", colour = "#E41A1C", linewidth = 1) +
scale_fill_manual(values = c("FALSE" = "#AEC7E8", "TRUE" = "#1F77B4")) +
coord_flip() +
labs(title = "Genotype Ranking — Top 20% Selected (blue)",
subtitle = "Red dashed line = selection threshold",
x = NULL, y = "BLUP Adjusted Mean (q/ha)",
fill = "Selected") +
theme_minimal(base_size = 12)
10. Summary Workflow
1. Define t, k, r
│
▼
2. Generate layout ── design.alpha()
│
▼
3. Collect field data
│
▼
4. EDA ── means, SD, CV, boxplots
│
▼
5. Fixed ANOVA ── aov(y ~ trt + rep + rep:block)
│ Check CV, F-tests, RE over RCBD
▼
6. Mixed model ── lmer(y ~ trt + rep + (1|rep:block))
│ Variance components, H², BLUPs
▼
7. Post-hoc ── emmeans(), Tukey, LSD, CLD
│
▼
8. Assumptions ── Shapiro-Wilk, Q-Q, residual plots
│
▼
9. Optional spatial ── SpATS()
│
▼
10. Select superior genotypes
11. Comparison: Fixed vs Mixed Model
| Aspect | Fixed (ANOVA) | Mixed (lmer/SpATS) |
|---|---|---|
| Block treatment | Fixed effect | Random effect |
| Uses inter-block info | ❌ No | ✅ Yes |
| Genotype estimates | LS Means | BLUPs (shrunk) |
| Best for | Small trials | Large breeding trials |
| R function | aov() |
lmer(), SpATS()
|
| Heritability | Approximate | Direct from $\hat{\sigma}^2_g$ |
12. References
- Patterson, H. D., & Williams, E. R. (1976). A new class of resolvable incomplete block designs. Biometrika, 63(1), 83–92.
- Williams, E. R., Matheson, A. C., & Harwood, C. E. (2002). Experimental Design and Analysis for Tree Improvement (2nd ed.). CSIRO.
- Gilmour, A. R., Cullis, B. R., & Verbyla, A. P. (1997). Accounting for natural and extraneous variation in the analysis of field experiments. JABES, 2(3), 269–293.
- de Mendiburu, F. (2023). agricolae: Statistical Procedures for Agricultural Research. CRAN.
- Rodríguez-Álvarez, M. X. et al. (2018). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics, 23, 52–71.
- R Core Team (2026). R: A Language and Environment for Statistical Computing.
Leave a comment