SAS-style PROC FORMAT for R: create and apply value formats, range-based formatting, reverse formatting (invalue), and consistent handling of missing values (NA, NULL, NaN).
Repository: github.com/crow16384/ksformat — source code, issue tracker, and development.
From GitHub (after cloning or from your repo URL):
# install.packages("remotes")
remotes::install_github("crow16384/ksformat")

From local source:
install.packages(".", repos = NULL, type = "source")
# or
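devtools::install()

- Multilabel formats: one value can match several labels (fput_all)
- Expression labels: .x1, .x2, … evaluated at apply-time
- Case-insensitive matching: ignore_case = TRUE in fnew
- Date, time, and datetime formats: SAS format names and custom strftime patterns
- Parse format definitions from text (fparse), export (fexport), import CNTLOUT CSV (fimport)

library(ksformat)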
fnew(
"M" = "Male",
"F" = "Female",
.missing = "Unknown",
name = "sex"
)
fput(c("M", "F", NA, "X"), "sex")
# [1] "Male" "Female" "Unknown" "X"

fparse(text = '
VALUE age (numeric)
[0, 18) = "Child"
[18, 65) = "Adult"
[65, HIGH] = "Senior"
.missing = "Age Unknown"
;
')
fputn(c(5, 25, 70, NA), "age")
# [1] "Child" "Adult" "Senior" "Age Unknown"

finput("Male" = 1, "Female" = 2, name = "sex_inv")
finputn(c("Male", "Female", "Unknown"), "sex_inv")
# [1]  1  2 NA

fprint() # list all registered formats
fmt <- format_get("sex")
fclear("sex") # remove one format
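fclear() # clear all

# The next example assumes the "sex" and "age" formats are registered;
# re-create them first if you ran fclear() above.
df <- data.frame(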
sex = c("M", "F", "M", NA),
age = c(15, 25, 70, 35)
)
fput_df(df, sex = format_get("sex"), age = format_get("age"), suffix = "_label")

With multilabel = TRUE, a single value can match multiple labels.
Use fput_all() to collect all matches:
fnew(
"0,17,TRUE,TRUE" = "Pediatric",
"18,Inf,TRUE,TRUE" = "Adult",
"3,5,TRUE,TRUE" = "Serious",
name = "ae_age", type = "numeric", multilabel = TRUE
)
fput_all(c(10, 25, 4), "ae_age")
# [[1]] "Pediatric"
# [[2]] "Adult"
# [[3]] "Pediatric" "Serious"

SAS date format names are auto-resolved; no pre-creation is needed:
fputn(Sys.Date(), "DATE9.")
# [1] "25MAR2026"
fputn(Sys.Date(), "MMDDYY10.")
# [1] "03/25/2026"
# Custom strftime pattern
fnew_date("%d.%m.%Y", name = "ru_date", type = "date")
fput(Sys.Date(), "ru_date")
# [1] "25.03.2026"

Time (seconds since midnight) and datetime are also supported:
fputn(3600, "TIME8.")
# [1] "1:00:00"
fputn(Sys.time(), "DATETIME20.")

Labels containing .x1, .x2, etc. are evaluated as R expressions at
apply-time. Pass extra arguments through fput(x, fmt, ...):
stat_fmt <- fnew(
"n" = "sprintf('%s', .x1)",
"pct" = "sprintf('%.1f%%', .x1 * 100)",
name = "stat", type = "character"
)
fput(c("n", "pct"), stat_fmt, c(42, 0.053))
# [1] "42" "5.3%"

Use e() to mark a label for evaluation even without .xN placeholders:
fnew("ts" = e("format(Sys.time(), '%Y-%m-%d')"), name = "demo")
fput("ts", "demo")

fnew("M" = "Male", "F" = "Female", name = "sex_nc",
type = "character", ignore_case = TRUE)
fput(c("m", "F", "M", "f"), "sex_nc")
# [1] "Male" "Female" "Male" "Female"

Priority order:
1. Missing input (NA, NULL, NaN): the .missing label if defined, otherwise NA
2. Unmatched values: the .other label if defined, otherwise the original value

Options: keep_na = TRUE, na_if, include_empty = TRUE.
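
The priority order above can be illustrated with a small base R sketch. It is not ksformat's implementation; `apply_labels_sketch()` and its `missing` and `other` arguments are hypothetical names that stand in for the `.missing` and `.other` labels, and only exact-match lookup is covered.

```r
# Illustrative sketch only; NOT the package's code.
# `apply_labels_sketch()` and its arguments are hypothetical names.
apply_labels_sketch <- function(x, labels, missing = NULL, other = NULL) {
  key <- as.character(x)
  out <- unname(labels[key])          # exact-match lookup; NA where nothing matches
  is_na <- is.na(x)                   # TRUE for NA and NaN input
  unmatched <- !is_na & is.na(out)
  # Unmatched values: use the `other` label if given, else keep the original value.
  out[unmatched] <- if (is.null(other)) key[unmatched] else other
  # Missing input: use the `missing` label if given, else return NA.
  out[is_na] <- if (is.null(missing)) NA_character_ else missing
  out
}

labels <- c(M = "Male", F = "Female")
with_missing <- apply_labels_sketch(c("M", "F", NA, "X"), labels, missing = "Unknown")
with_other <- apply_labels_sketch(c("M", NA, "X"), labels, other = "Other")
print(with_missing)
print(with_other)
```

In the first call the unmatched "X" passes through unchanged because no `other` label is supplied; in the second call the NA stays NA because no `missing` label is supplied.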

Call ksformat_cheatsheet() to open the cheat sheet in your browser (HTML),
or ksformat_cheatsheet("pdf") for the PDF.

| Area | Functions |
|---|---|
| Creation | fnew(), finput(), fnew_bid(), fnew_date(), fparse(), e() |
| Application | fput(), fputn(), fputc(), fput_all(), fput_df() |
| Reverse | finputn(), finputc() |
| Library | format_get(), fprint(), fclear(), fexport(), fimport() |
| Utilities | is_missing(), range_spec() |
| Documentation | ksformat_cheatsheet() (opens the cheat sheet) |

install.packages(c("roxygen2", "testthat", "devtools"))
devtools::document()
devtools::test()
devtools::check()

When bumping the package version, update DESCRIPTION and then run
Rscript scripts/sync-version.R to refresh version references in
cran-comments.md and any other synced files.
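
After a version bump it can be useful to confirm that a synced file really mentions the new version. The sketch below is a hypothetical check in base R, not the contents of scripts/sync-version.R; `version_is_synced()` is an assumed helper name, and the demonstration runs on temporary stand-in files rather than the real repository.

```r
# Hypothetical helper, not part of the package or its scripts:
# TRUE if any line of `synced_path` contains the Version field of DESCRIPTION.
version_is_synced <- function(description_path, synced_path) {
  version <- read.dcf(description_path, fields = "Version")[1, 1]
  any(grepl(version, readLines(synced_path), fixed = TRUE))
}

# Self-contained demonstration with temporary stand-in files.
tmp <- tempfile("pkg")
dir.create(tmp)
desc <- file.path(tmp, "DESCRIPTION")
comments <- file.path(tmp, "cran-comments.md")
writeLines(c("Package: demo", "Version: 1.2.3"), desc)
writeLines("## Submission of version 1.2.3", comments)

synced <- version_is_synced(desc, comments)
print(synced)
```

In a real checkout the two paths would be "DESCRIPTION" and "cran-comments.md" at the package root.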
GPL-3. See https://www.gnu.org/licenses/gpl-3.0.html.