How to convert R formula to text?
R Problem Overview
I'm having trouble treating a formula as text. I want to concatenate the formula onto the title of a graph, but when I try to work with the formula as if it were text, it fails:
model <- lm(celkem ~ rok + mesic)
formula(model)
# celkem ~ rok + mesic
This is fine. Now I want to build a string like "my text celkem ~ rok + mesic", and this is where the problem comes:
paste("my text", formula(model))
# [1] "my text ~" "my text celkem" "my text rok + mesic"
paste("my text", as.character(formula(model)))
# [1] "my text ~" "my text celkem" "my text rok + mesic"
paste("my text", toString(formula(model)))
# [1] "my text ~, celkem, rok + mesic"
Now I see there is a sprint function in package gtools, but I think this is such a basic thing that it deserves a solution within the default environment!!
R Solutions
Solution 1 - R
A short solution, taken from the function as.character.formula in the package formula.tools:
frm <- celkem ~ rok + mesic
Reduce(paste, deparse(frm))
# [1] "celkem ~ rok + mesic"
library(formula.tools)
as.character(frm)
# [1] "celkem ~ rok + mesic"
Reduce is useful for long formulas, which deparse splits into several strings:
frm <- formula(paste("y ~ ", paste0("x", 1:12, collapse = " + ")))
deparse(frm)
# [1] "y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + "
# [2] " x12"
Reduce(paste, deparse(frm))
# [1] "y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12"
This happens because deparse breaks lines at width.cutoff = 60L by default (see ?deparse).
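Joining the pieces with paste keeps the indentation that deparse adds to continuation lines, so the combined string can contain runs of spaces. A minimal sketch of one way to normalize that in base R (the name one_line is illustrative and not part of the original answer):

```r
# Build a formula long enough for deparse() to split at the default cutoff
frm <- formula(paste("y ~ ", paste0("x", 1:12, collapse = " + ")))

# Join the pieces, then squeeze any run of whitespace into a single space
one_line <- gsub("\\s+", " ", paste(deparse(frm), collapse = " "))
```

This gives a single string regardless of how many pieces deparse produced.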
Solution 2 - R
Try format:
paste("my text", format(frm))
## [1] "my text celkem ~ rok + mesic"
Solution 3 - R
Simplest solution covering everything:
f <- formula(model)
paste(deparse(f, width.cutoff = 500), collapse="")
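Applied to the original goal of putting the formula in a plot title, this might look like the sketch below. The data frame d is made up for illustration, since the question does not show the data, and the plot call only runs in an interactive session.

```r
# Hypothetical data; the question's data are not shown
set.seed(1)
d <- data.frame(
  celkem = rnorm(24),
  rok    = rep(2001:2002, each = 12),
  mesic  = rep(1:12, times = 2)
)
model <- lm(celkem ~ rok + mesic, data = d)

# Collapse the deparsed formula into one string and prepend the text
f <- formula(model)
title_text <- paste("my text", paste(deparse(f, width.cutoff = 500), collapse = ""))

# Use the string as the plot title
if (interactive()) {
  plot(fitted(model), resid(model), main = title_text)
}
```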
Solution 4 - R
Or, as an alternative to Julius's version (note: your code was not self-contained):
celkem = 1
rok = 1
mesic = 1
model <- lm(celkem ~ rok + mesic)
paste("my model ", deparse(formula(model)))
Solution 5 - R
The easiest way is this:
f = formula(model)
paste(f[2],f[3],sep='~')
done!
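The indexing works because R stores a two-sided formula as a call with three elements: the ~ operator, the left-hand side and the right-hand side. A small sketch of the same idea using as.character, consistent with the three-element output shown in the question. It assumes a two-sided formula, since a one-sided formula such as ~ x has only two elements:

```r
f <- celkem ~ rok + mesic

length(f)                  # 3 elements for a two-sided formula
parts <- as.character(f)   # operator, left-hand side, right-hand side

# Reassemble as "lhs ~ rhs"
label <- paste(parts[2], parts[1], parts[3])
```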
Solution 6 - R
R 4.0.0 (released 2020-04-24) introduced deparse1, which never splits the result into multiple strings:
f <- y ~ a + b + c + d + e + f + g + h + i + j + k + l + m + n + o +
  p + q + r + s + t + u + v + w + x + y + z
deparse(f)
# [1] "y ~ a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + " " p + q + r + s + t + u + v + w + x + y + z"
deparse1(f)
# [1] "y ~ a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
However, it still has a width.cutoff argument (default and maximum: 500), after which line breaks are introduced. The resulting lines are joined with collapse (default: " ") rather than \n, which leaves extra whitespace even with collapse = "". Use gsub to remove it if needed (see Ross D's answer):
> f <- rlang::parse_expr( paste0("y~", paste0(rep(letters, 20), collapse="+")))
> deparse1(f, collapse = "")
[1] "y ~ a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z + a + b + c + d 
+ e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u + v + w + x + y + z"
To use it in R < 4.0.0, use backports (recommended) or copy its implementation:
# Part of the R package, https://www.R-project.org
#
# Copyright (C) 1995-2019 The R Core Team
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# A copy of the GNU General Public License is available at
# https://www.R-project.org/Licenses/
deparse1 <- function (expr, collapse = " ", width.cutoff = 500L, ...)
  paste(deparse(expr, width.cutoff, ...), collapse = collapse)
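If the same script has to run on both older and newer R versions without extra packages, one option is to define the function only when it is missing. This is a sketch of that guard, reusing the implementation above:

```r
# Define deparse1 only if the running R does not already provide it
if (!exists("deparse1", mode = "function")) {
  deparse1 <- function(expr, collapse = " ", width.cutoff = 500L, ...) {
    paste(deparse(expr, width.cutoff, ...), collapse = collapse)
  }
}

f <- celkem ~ rok + mesic
label <- paste("my text", deparse1(f))
```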
Solution 7 - R
Here is a solution that uses print.formula. It may seem like a trick, but it does the job in one line, avoids deparse, and needs no extra package. I simply capture the printed output of the formula with capture.output:
paste("my text",capture.output(print(formula(celkem ~ rok + mesic))))
[1] "my text celkem ~ rok + mesic"
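One caveat, assuming print.formula behaves as documented: when a formula was created somewhere other than the global environment (for example inside a function), printing it also emits an <environment: ...> line, which capture.output would pick up. A sketch of how showEnv = FALSE avoids this; make_formula is a made-up helper for illustration:

```r
# Hypothetical helper: the formula's environment is the function call frame
make_formula <- function() y ~ x1 + x2
ff <- make_formula()

raw   <- capture.output(print(ff))                   # formula plus environment line
clean <- capture.output(print(ff, showEnv = FALSE))  # formula only

label <- paste("my text", paste(clean, collapse = " "))
```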
For a long formula:
ff <- formula(paste("y ~ ", paste0("x", 1:12, collapse = " + ")))
paste("my text",paste(capture.output(print(ff)), collapse= ' '))
"my text y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12"
Solution 8 - R
Another deparse-based solution is rlang::expr_text() (and rlang::quo_text()):
f <- Y ~ 1 + a + b + c + d + e + f + g + h + i +j + k + l + m + n + o + p + q + r + s + t + u
rlang::quo_text(f)
#> [1] "Y ~ 1 + a + b + c + d + e + f + g + h + i + j + k + l + m + n + \n o + p + q + r + s + t + u"
They do have a width argument to avoid line breaks, but that is limited to 500 characters too. At least it's a single function that is most likely loaded already...
Solution 9 - R
To remove the whitespace as well, wrap the format approach in gsub:
gsub(" ", "", paste(format(frm), collapse = ""))
Solution 10 - R
Was optimizing some functions today. A few approaches that have not been mentioned so far.
f <- Y ~ 1 + a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q + r + s + t + u
bench::mark(
  expression = as.character(as.expression(f)),
  deparse = deparse(f, width.cutoff = 500L),
  deparse1 = deparse1(f),
  tools = formula.tools:::as.character.formula(f),
  stringi = stringi::stri_c(f),
  I = as.character(I(f)),
  as = as(f, "character"),
  txt = gettext(f),
  txtf = gettextf(f),
  sub = sub("", "", f),
  chr = as.character(f),
  str = substring(f, 1L),
  paste = paste0(f),
)[c(1, 3, 5, 7)]
#> # A tibble: 13 x 3
#>    expression   median mem_alloc
#>    <bch:expr> <bch:tm> <bch:byt>
#>  1 expression   15.4us        0B
#>  2 deparse        31us        0B
#>  3 deparse1       34us        0B
#>  4 tools        58.7us    1.74MB
#>  5 stringi        67us    3.09KB
#>  6 I            64.1us        0B
#>  7 as          100.5us  521.61KB
#>  8 txt          83.4us        0B
#>  9 txtf         85.8us    3.12KB
#> 10 sub          64.6us        0B
#> 11 chr            60us        0B
#> 12 str          62.8us        0B
#> 13 paste        63.5us        0B