West by Midwest: Package Maintenance, Transition and Handover

May 14, 2026




Julia Piaskowski

Director of Statistical Programs, University of Idaho


Russ Lenth

Professor Emeritus of Statistics, University of Iowa

Julia Piaskowski has recently taken over maintenance of ‘emmeans’ from Russ Lenth.



How is that going?

What challenges have we encountered?

Emmeans Background

  • For estimating marginal means from a linear modeling object
  • Legacy package: first released November 5, 2017
  • Replacing ‘lsmeans’ (first released August 15, 2012)
  • Methods based on Searle, Speed, and Milliken (1980), “Population marginal means in the linear model: An alternative to least squares means,” The American Statistician 34(4), 216–221. doi:10.1080/00031305.1980.10483031
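
To make the package’s purpose concrete, here is a minimal sketch of the core workflow, using a dataset shipped with R (warpbreaks is chosen purely for illustration; it is not from the talk):

```r
library(emmeans)

# Fit a linear model with a factor of interest and another factor
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# Estimated marginal means for tension, averaged over the levels of wool
emmeans(warp.lm, "tension")
```

Each row of the result is a linear function of the model coefficients, which is what makes the approach general across model types.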

Russ’ Perspective

West by Midwest: Where We Are

West by Midwest: Where We’ve Lived

Russ’s Computing Background

  • Skeletran, Fortran IV
  • Algorithms for noncentral \(F\), beta, and \(t\)
  • Power and sample size
    • Pascal
    • Xlisp-stat
    • Java
  • Other R packages: rsm, lsmeans, estimability, vigindex

Origins of ‘lsmeans’

  • Preliminary thoughts about how to obtain LSmeans focused on interpretation of model coefficients
    • What to do with covariates?
    • Doesn’t help that SAS Type III estimable functions do this wrong (IMHO)
  • Salary equity analysis for administration
    • Focus on what model predicts
    • Duh! Those are the linear functions of coefficients that I need

Differing Application Perspectives

  • LSMs use equal weighting
  • Makes sense to experimenters
  • Others who deal with observational data, not so much
  • Alternative weighting: Proportional, Outer, Flat
  • Causal effects: G computation (counterfactuals)
  • Other packages or platforms: Stata, effects, marginaleffects, margins
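
The weighting options above correspond to the weights argument of emmeans(); a brief sketch with a built-in dataset (chosen for illustration):

```r
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# Equal weighting: the classical LSmeans convention
emmeans(warp.lm, "tension", weights = "equal")

# Weight by observed frequencies, as often preferred for observational data
emmeans(warp.lm, "tension", weights = "proportional")
```

With balanced data the two agree; with unbalanced data the choice of weights changes the estimates, which is exactly the point of contention between the two perspectives.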

emmeans (2017–)

  • Restructuring of ‘lsmeans’ with improved architecture, class structure, and customizability
  • Expanded support for generalized linear models, link functions, and response transformations
  • Version 2.0.3 has 45 source files (17,500 lines)
    • 9,700 lines of R code
    • 7,800 lines of comments or embedded documentation
  • 243 functions (79 public), 16 vignettes
  • 121 reverse depends/imports/suggests/enhances

Getting into R package development

  • My first R package was rsm
  • I had existing code that worked, and used the package.skeleton() function (still available) to bundle it up
    • But this required writing separate Rd files, etc.
  • Now the way to go is to use RStudio (or Positron) – free download – and create a new project
  • Use roxygen2 package to incorporate documentation, imports, etc.
  • Wickham, H. and Bryan, J. (2023) R Packages (2nd ed.), O’Reilly Media, Inc.
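
As a sketch of the roxygen2 approach: documentation lives in comment tags directly above the function, and devtools::document() generates the Rd file and NAMESPACE entries. The function se_mean below is invented for illustration, not part of any package mentioned in the talk:

```r
#' Standard error of the mean
#'
#' @param x A numeric vector.
#' @return The estimated standard error of the mean of \code{x}.
#' @export
#' @examples
#' se_mean(rnorm(20))
se_mean <- function(x) {
  sd(x) / sqrt(length(x))
}
```

Keeping the documentation next to the code is what makes this workflow easier to maintain than hand-written Rd files.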

Observations on package development

  • Writing code is fun
  • Writing documentation & vignettes is less fun, and time-consuming. But really important
  • Your stuff is there for others to see – and they will look

My pet peeves with some other people’s packages

  • Cramped,runtogethercode. Use whitespace
  • Over-abstraction. Use meaningful (but short) variable names
  • Re-using the same variable or object name in a string of examples
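
To illustrate the whitespace peeve, a contrived before-and-after (both versions behave identically):

```r
# Cramped, run-together code:
f<-function(x,y){if(x>0){x*y}else{-x*y}}

# The same function with whitespace and indentation:
f <- function(x, y) {
  if (x > 0) {
    x * y
  } else {
    -x * y
  }
}
```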

Bug reports

  • Same peeves, also unnecessary elaboration

Transition experience

  • Julia …
    • encouraged me to add pkgdown site (July ’24)
    • set up GitHub action to update site automatically
    • started monitoring the Issues page
    • began answering some issues
    • overhauled the graphics theme
    • took over as official maintainer (August ’25, version 2.0.0)

Transitioning = Stress of change + Benefits

  • ‘pkgdown’ as mentioned before
  • Different work habits & availability
  • Repo management – working in branches
  • Coding style (e.g., I like “=” rather than “<-”)
    • but similar styles with whitespace and indentation
  • Joint responsibility for decisions (e.g., no “stars”)

Julia’s Perspective

Programming Background

  • Originally trained in SAS
  • Started using R in 2010 when I was unable to access all the predictions from PROC PLS (SAS v9.3)
  • R in those days was hard (no tidyverse, ggplotting, or RStudio), but it had enough functionality and flexibility to make it worth the effort.
  • Learned C, python, bash shell scripting. Programming is satisfying!
  • I pay attention to developments in the R ecosystem and read source code frequently.

Programming Philosophy

  • I have an interest in clean code: functions with a clear (single) purpose, good error catching and informative messages. Don’t go overboard on functionality (a function is not a multi-tool)
  • Style: code needs to breathe! We aren’t bit packing code and sending it to the moon, so use whitespace. Indent code (to indicate function nesting) because it’s easier to read and understand
  • Use existing libraries for package enhancement (e.g., ‘pbkrtest’, https://CRAN.R-project.org/package=pbkrtest, for Kenward-Roger degrees of freedom), but try to limit the dependencies that are required for your package to run.
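
One common way to keep such a dependency optional is to list it under Suggests in DESCRIPTION and guard its use at run time. The wrapper kr_df below is hypothetical (it is not an emmeans function), shown only to illustrate the pattern:

```r
# Hypothetical wrapper: use pbkrtest only if the user has it installed.
# 'pbkrtest' would be listed under Suggests, not Imports, in DESCRIPTION.
kr_df <- function(object, L) {
  if (!requireNamespace("pbkrtest", quietly = TRUE)) {
    stop("Please install 'pbkrtest' to use Kenward-Roger degrees of freedom.")
  }
  pbkrtest::get_Lb_ddf(object, L)  # KR denominator df for the contrast L
}
```

This way users who never fit mixed models are not forced to install the extra package.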

emmeans::cld()

Daily CRAN Downloads

More Usage/Popularity Stats

  • Downloads: ~6,000 per day
  • Reverse dependencies: 207
  • Code Mentions on GitHub: 21,600
  • Citations: ?????
  • GitHub Stars: 414
  • Contributors: 17
  • Issues filed: 532



emmeans Function Dependency Graph

There is Not Much Guidance on Library Handoff

How Did It Start? (Julia)

  1. I filed an issue in 2024 (requesting a ‘pkgdown’ website) and helped implement it; I realized Russ was 77 years old, and inquired about package succession.
  2. I did not contribute much over the next year, beyond answering issues and looking further into the package’s functionality.
  3. I became more involved starting last August with some encouragement from Russ (“Could you address this…?”)

How Did I Find My Way?

  • Step zero: read about package maintenance tools and guides (e.g., devtools, lintr, styler, the CRAN policies, R Packages (2nd ed.), and rOpenSci recommendations). This can be intimidating.

  • Step one: answer issues to understand how the user base interacts with the package, what problems exist, and how to respond to different types of requests.

  • Step two: make easy package fixes or improvements (e.g., replace aes_() with aes()) and learn how to test them. Learn how to use testthat and write a unit test.

  • Step three: do more complex fixes: I added functionality for specifying quantile-based credible intervals for Bayesian models in response to a feature request.

  • Step four: collaborate with Russ on more complex things (we overhauled the plot aesthetics)
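
A minimal testthat unit test of the kind described in step two might look like the following (the model and expectations are illustrative, not taken from the actual emmeans test suite):

```r
library(testthat)
library(emmeans)

test_that("emmeans returns one row per factor level", {
  m  <- lm(breaks ~ wool * tension, data = warpbreaks)
  em <- as.data.frame(emmeans(m, "tension"))

  expect_equal(nrow(em), 3)            # tension has three levels
  expect_true(all(is.finite(em$SE)))   # standard errors are computable
})
```

Tests like this make it much safer to touch code you did not originally write.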

Challenges: Filed Issues

Some requests:

  • Can you enable users to adjust the number of significant digits when reporting p-values?
  • Can you make a cheat sheet?
  • Can you add ANOVA for Bayesian models?
  • Can you enable credible intervals for Bayesian models?
  • Can you fix this bug where the back transformation is wrong?
  • Can emmeans support {this package}?
  • Can you alter your internal function structure so it returns answers identical to my manual calculations?
  • I keep getting this ggplot error; thought you might want to know!
  • How should I report my results from emmeans?
How do I respond?

Challenges: Breaking Dependencies

Internal or external

  • Some issues are complex: the problem may be straightforward to describe, but what is causing it is not easy to resolve (example).
  • Fixing one issue may introduce new problems in the package (example)
  • Many packages depend on emmeans, so every change has to be weighed with that in mind. What if I break something?
  • We have a script to check for breaking dependencies. But be careful: if a change breaks something downstream, that becomes our problem.
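
For reference, one widely used approach (not necessarily the script mentioned above) is the r-lib ‘revdepcheck’ package, which runs R CMD check on every reverse dependency against both the released and the development version of your package:

```r
# Sketch of a reverse-dependency check; requires the 'revdepcheck' package
# from r-lib (not on CRAN), run from the package's source directory.
if (requireNamespace("revdepcheck", quietly = TRUE)) {
  revdepcheck::revdep_check(num_workers = 4)  # check all reverse dependencies
  revdepcheck::revdep_summary()               # which packages newly fail?
}
```

With over a hundred reverse dependencies, this kind of automated sweep is the only practical way to see what a change breaks before CRAN does.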

Collaboration

Russ is #14, I am #2

Collaboration

  • We are collaborating on two levels: package maintenance and everyday git issues
  • Provide transition tools: guides to package structure, how to prepare for CRAN
  • Clarify expectations: who is addressing what issue and how?
  • Need clear two-way communication (we email each other frequently)
  • Make changes as pull requests (PRs) and request code review
  • Pave the way to easy “wins”
  • Patience, kindness, and encouragement go a long way

Can I Fill Russ’ Shoes?

  • I am learning the codebase, but logically, the original author will know it best.
  • Most of the time, I will not be as timely in my response to filed issues.
  • I have to be timely responding to CRAN requests.
  • I do want to make a few changes: expand the number of unit tests, expand options for ordinal models, add continuous integration tools for standard checks (R CMD CHECK) and unit tests as a GitHub action.
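
A sketch of what such a GitHub Actions workflow could look like, using the standard r-lib/actions steps (the file name and details are assumptions to be adapted to the package):

```yaml
# .github/workflows/R-CMD-check.yaml (hypothetical file name)
on: [push, pull_request]
name: R-CMD-check
jobs:
  R-CMD-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
      - uses: r-lib/actions/check-r-package@v2   # runs R CMD check + tests
```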

Final Thoughts: How to Do Library Handoff

  • Most of package maintenance is answering issues
  • Some programming skill is needed, but more to read code than write it
  • Considerable patience with git and GitHub is required alongside careful coordination with co-maintainers
  • LLMs can help with understanding a codebase, but an agent (semi-autonomous) is an unwise choice at this time, until there is more confidence in the codebase
  • Newer developers need nurturing and encouragement (or at least no discouragement) and some easy “wins” early on

Thoughts on Open Source Software

How to Get Involved

The open source software universe needs you!

  • Some authors struggle to keep up with demand (see the 200+ issues currently open in glmmTMB)
  • Popular packages are abandoned regularly when the original author no longer has time. Any package ever on CRAN is available in a public archive.
  • There are many opportunities; rOpenSci runs a ‘Help Wanted’ page with requests for help in solving issues or finding new maintainers

Thank You

Why SAS Type III Anova is wrong

(when covariates are involved)

library(emmeans)
fiber.add = lm(strength ~ diameter + machine, data = fiber)  # additive model, for comparison
fiber.int = lm(strength ~ diameter * machine, data = fiber)

emmeans(fiber.add, "machine") |> contrast("consec") |> test(joint = TRUE)
 df1 df2 F.ratio p.value
   2  11   2.611  0.1181
emmeans(fiber.int, "machine") |> contrast("consec") |> test(joint = TRUE)
 df1 df2 F.ratio p.value
   2   9   2.814  0.1124
# What SAS type III ANOVA does
emmeans(fiber.int, "machine", at = list(diameter = 0)) |> 
    contrast("consec") |> test(joint = TRUE)
 df1 df2 F.ratio p.value
   2   9   0.475  0.6367

SAS compares intercepts, not LSmeans

Confirming SAS’s output

With model equivalent to fiber.int

Source                     DF    Type III SS    Mean Square   F Value   Pr > F
diameter                    1    171.1192314    171.1192314     61.00   <.0001
machine                     2      2.6641625      1.3320812      0.47   0.6367
diameter*machine            2      2.7371774      1.3685887      0.49   0.6293

(machine result matches joint test of intercepts)

With diameter centered:

Source                     DF    Type III SS    Mean Square   F Value   Pr > F
ctr_diam                    1    171.1192314    171.1192314     61.00   <.0001
machine                     2     15.7906794      7.8953397      2.81   0.1124
ctr_diam*machine            2      2.7371774      1.3685887      0.49   0.6293

(machine result matches joint test of LSmeans)


Lesson: In SAS, if you have covariates that interact with anything, center (or standardize) them first