| Type: | Package |
| Title: | Spline-Based Nonlinear Modeling for Multilevel and Longitudinal Data |
| Version: | 0.1.1 |
| Description: | Provides tools for fitting, predicting, and visualizing nonlinear relationships in single-level, multilevel, and longitudinal regression models. Nonlinear functional forms are represented using natural cubic splines from 'splines' and smooth terms from 'mgcv'. The package offers a unified interface for specifying nonlinear effects, interactions with time variables, random-intercept clustering structures, and additional linear covariates. Utilities are included to generate prediction grids and produce effect plots, facilitating interpretation and visualization of nonlinear relationships in applied regression workflows. The implementation builds on established methods for spline-based regression and mixed-effects modeling (Hastie and Tibshirani, 1990 <doi:10.1201/9780203738535>; Bates et al., 2015 <doi:10.18637/jss.v067.i01>; Wood, 2017 <doi:10.1201/9781315370279>). Applications include hierarchical and longitudinal data structures common in education, health, and social science research. |
| Depends: | R (≥ 4.2.0) |
| Imports: | stats, lme4, mgcv, dplyr, ggplot2, rlang |
| Suggests: | lmerTest, knitr, rmarkdown, reformulas, testthat (≥ 3.0.0) |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Language: | en-US |
| RoxygenNote: | 7.3.3 |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Packaged: | 2026-03-10 13:44:08 UTC; Subir |
| Author: | Subir Hait [aut, cre] |
| Maintainer: | Subir Hait <haitsubi@msu.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-03-16 16:40:33 UTC |
Fit a nonlinear (spline or GAM) single-level or multilevel model
Description
Fits a nonlinear regression model for an outcome y with a focal
predictor x, modeled either by a natural cubic spline
(splines::ns()) or a GAM smooth (mgcv::s()).
Optionally includes a time variable time. For spline fits
(method = "ns"), multilevel random-intercept structure can be added
via one or more grouping variables in cluster.
Models fitted:
Single-level spline:
stats::lm()Single-level GAM:
mgcv::gam()Multilevel spline:
lme4::lmer()(Gaussian) orlme4::glmer()(non-Gaussian)
Usage
nl_fit(
data,
y,
x,
time = NULL,
cluster = NULL,
controls = NULL,
method = c("ns", "gam"),
df = 4,
k = 5,
family = stats::gaussian(),
...
)
Arguments
data |
A data frame (often long format for longitudinal data). |
y |
Outcome variable name (string). |
x |
Focal nonlinear predictor name (string). Must be numeric. |
time |
Optional time variable name (string). If provided, a
spline-by-time interaction is included for |
cluster |
Optional character vector of grouping variable name(s) for
random intercepts, e.g., |
controls |
Optional character vector of additional covariate names to include linearly. |
method |
Either |
df |
Degrees of freedom for |
k |
Basis dimension for |
family |
A model family object, e.g., |
... |
Additional arguments passed to the underlying fitting function
( |
Value
An object of class "nl_fit" (a named list) containing:
modelThe fitted model object.
methodThe fitting method used (
"ns"or"gam").yName of the outcome variable.
xName of the focal predictor.
timeName of the time variable, or
NULL.clusterCharacter vector of clustering variables, or
NULL.controlsCharacter vector of control variable names, or
NULL.dfDegrees of freedom used for
splines::ns().kBasis dimension used for
mgcv::s()/mgcv::ti().familyThe family object used.
formulaThe model formula passed to the fitter.
callThe matched call.
x_infoA list with quantiles and range of
xfor building prediction grids:q01,q99,min,max,n.levels_infoA named list of factor levels for
timeand any factor control variables.control_defaultsA named list of typical values for control variables: the mean for numeric variables and the first level for factors.
See Also
Examples
# --- Toy example (automatically tested by CRAN) ---
# Single-level natural spline on small simulated data
set.seed(1)
mydata <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4),
wave = factor(rep(1:4, times = 30)),
sex = factor(sample(c("F", "M"), 120, replace = TRUE)),
baseline_score = rnorm(120)
)
fit_sl <- nl_fit(data = mydata, y = "outcome", x = "age", df = 4)
# Single-level GAM
fit_gam <- nl_fit(
data = mydata,
y = "outcome",
x = "age",
method = "gam",
k = 5
)
# Multilevel spline with random intercepts
fit_ml <- nl_fit(
data = mydata,
y = "outcome",
x = "age",
cluster = "id",
df = 4
)
# Spline with time interaction and controls
fit_t <- nl_fit(
data = mydata,
y = "outcome",
x = "age",
time = "wave",
controls = c("sex", "baseline_score"),
df = 4
)
Intraclass correlation coefficients for a multilevel nl_fit model
Description
Extracts variance components from a multilevel nl_fit object and
computes intraclass correlation coefficients (ICCs) for each grouping
level plus the residual.
The ICC for grouping factor g is defined as:
ICC_g = \frac{\sigma^2_g}{\sum_j \sigma^2_j + \sigma^2_\epsilon}
Usage
nl_icc(object, include_residual = TRUE)
Arguments
object |
An |
include_residual |
Logical; if |
Value
A named numeric vector of ICCs, one per grouping factor (named
ICC_<groupname>) plus Residual for the residual variance
(when include_residual = TRUE). When a residual variance is available,
all values sum to 1.
See Also
Examples
# --- Toy example (automatically tested by CRAN) ---
# Small multilevel data: 10 clusters, 5 obs each (50 rows total)
set.seed(1)
toy <- data.frame(
outcome = rnorm(50),
age = runif(50, 18, 65),
id = rep(1:10, each = 5)
)
fit <- nl_fit(
data = toy,
y = "outcome",
x = "age",
cluster = "id",
df = 2
)
nl_icc(fit)
# Two-level clustering: students nested in schools
set.seed(1)
mydata <- data.frame(
math_score = rnorm(240),
SES = rnorm(240),
id = rep(1:60, each = 4),
schid = rep(1:12, each = 20)
)
fit_ml <- nl_fit(
data = mydata,
y = "math_score",
x = "SES",
cluster = c("id", "schid"),
df = 4
)
nl_icc(fit_ml)
# Without residual variance in output
nl_icc(fit_ml, include_residual = FALSE)
Plot predictions from nl_predict()
Description
Visualizes predicted nonlinear effects returned by
nl_predict. If a time variable is present (or detected),
draws separate colored curves for each time level with optional confidence
ribbons.
Usage
nl_plot(
pred_df,
x,
time = NULL,
show_ci = TRUE,
ci_level = 0.95,
y_lab = "Predicted outcome",
x_lab = NULL,
title = NULL,
legend_title = NULL
)
Arguments
pred_df |
A data frame returned by |
x |
Character string naming the focal predictor column in
|
time |
Optional character string naming the time variable column in
|
show_ci |
Logical; if |
ci_level |
Numeric in (0, 1); confidence level used when computing CI
from |
y_lab |
Character label for the y-axis. Default
|
x_lab |
Character label for the x-axis. Defaults to the value of
|
title |
Optional plot title string. |
legend_title |
Optional legend title string. Defaults to the value of
|
Value
A ggplot object.
See Also
Examples
# --- Toy example (automatically tested by CRAN) ---
# Single-level spline: fit, predict, then plot
set.seed(1)
mydata <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4)
)
fit <- nl_fit(data = mydata, y = "outcome", x = "age", df = 4)
pred <- nl_predict(fit)
nl_plot(pred, x = "age")
# Custom axis labels and title
nl_plot(
pred,
x = "age",
y_lab = "Predicted Math Score",
x_lab = "Age (years)",
title = "Nonlinear Effect of Age"
)
# Without confidence ribbon
nl_plot(pred, x = "age", show_ci = FALSE)
# With time variable: separate curves per wave
set.seed(1)
mydata2 <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4),
wave = factor(rep(1:4, times = 30))
)
fit_t <- nl_fit(
data = mydata2,
y = "outcome",
x = "age",
time = "wave",
df = 4
)
pred_t <- nl_predict(fit_t)
nl_plot(pred_t, x = "age", time = "wave", legend_title = "Wave")
Generate predictions from an nl_fit model
Description
Creates a prediction data frame over a grid of the focal predictor x
(and optionally over time), holding control variables at typical
values (means for numeric; reference levels for factors). For mixed models,
predictions default to population-level curves (random effects excluded).
Usage
nl_predict(
object,
x_seq = NULL,
time_levels = NULL,
controls_fixed = NULL,
se = TRUE,
level = 0.95,
re_form = NA,
...
)
Arguments
object |
An |
x_seq |
Optional numeric vector of x values to predict over. If
|
time_levels |
Optional vector of time levels to predict for. If
|
controls_fixed |
Optional named list giving specific values for control
variables. If |
se |
Logical; if |
level |
Confidence level for the interval. Default |
re_form |
For mixed models ( |
... |
Reserved for future use. |
Value
A data frame (or tibble) with columns: the focal predictor x,
time (if applicable), any control variables (at fixed values),
fit, se.fit, lwr, and upr.
See Also
Examples
# --- Toy example (automatically tested by CRAN) ---
# Single-level natural spline: fit then predict
set.seed(1)
mydata <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4)
)
fit <- nl_fit(data = mydata, y = "outcome", x = "age", df = 4)
pred <- nl_predict(fit)
head(pred)
# Custom x grid
pred_custom <- nl_predict(fit, x_seq = seq(20, 60, by = 5))
# Without standard errors
pred_nose <- nl_predict(fit, se = FALSE)
# With time variable (spline x time interaction)
set.seed(1)
mydata2 <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4),
wave = factor(rep(1:4, times = 30))
)
fit_t <- nl_fit(
data = mydata2,
y = "outcome",
x = "age",
time = "wave",
df = 4
)
pred_t <- nl_predict(fit_t)
# Multilevel model: population-level predictions
fit_ml <- nl_fit(
data = mydata,
y = "outcome",
x = "age",
cluster = "id",
df = 4
)
pred_ml <- nl_predict(fit_ml)
Tidy coefficient table for an nl_fit model
Description
Produces a tidy coefficient table for a model fitted by
nl_fit. For linear mixed models (lmerMod), optional
p-values and denominator degrees of freedom are obtained via
lmerTest (Satterthwaite method). For GLMMs (glmerMod),
z-tests are reported from summary().
Usage
nl_summary(
object,
digits = 3,
pvals = TRUE,
df_method = c("satterthwaite", "none")
)
Arguments
object |
An |
digits |
Integer; number of decimal places for rounding. If
|
pvals |
Logical; if |
df_method |
For |
Value
A data frame (or tibble) with columns: Term, Estimate,
Std.Error, df (if available), statistic, and
p.value (if available).
See Also
Examples
# --- Toy example (automatically tested by CRAN) ---
# Single-level natural spline on small simulated data
set.seed(1)
mydata <- data.frame(
outcome = rnorm(120),
age = runif(120, 18, 65),
id = rep(1:30, each = 4)
)
fit <- nl_fit(data = mydata, y = "outcome", x = "age", df = 4)
nl_summary(fit)
# With p-values suppressed
nl_summary(fit, pvals = FALSE)
# No rounding
nl_summary(fit, digits = NULL)
# Multilevel model (lmerMod) with Satterthwaite p-values via lmerTest
set.seed(1)
mydata2 <- data.frame(
outcome = rnorm(240),
age = runif(240, 18, 65),
id = rep(1:60, each = 4)
)
fit_ml <- nl_fit(
data = mydata2,
y = "outcome",
x = "age",
cluster = "id",
df = 4
)
nl_summary(fit_ml)
Print method for nl_fit objects
Description
Compact console display for objects returned by nl_fit.
Shows key metadata (method, outcome, predictor, time, clustering, family,
and controls) and the fitted model formula.
Usage
## S3 method for class 'nl_fit'
print(x, ...)
Arguments
x |
An object of class |
... |
Further arguments (currently ignored). |
Value
x invisibly.
Print method for summary_nl_fit objects
Description
Prints a summary_nl_fit object returned by
summary.nl_fit.
Usage
## S3 method for class 'summary_nl_fit'
print(x, ...)
Arguments
x |
A |
... |
Further arguments passed to |
Value
x invisibly.
Summary method for nl_fit objects
Description
Produces a tidy coefficient table via nl_summary and wraps
it in a summary_nl_fit object for pretty printing.
Usage
## S3 method for class 'nl_fit'
summary(
object,
digits = 3,
pvals = TRUE,
df_method = c("satterthwaite", "none"),
...
)
Arguments
object |
An |
digits |
Number of decimal places for rounding. Default |
pvals |
Logical; if |
df_method |
For |
... |
Further arguments passed to |
Value
An object of class summary_nl_fit containing call,
formula, method, and table.