Global variables in packages in R


R Problem Overview


I'm developing a package in R. I have a bunch of functions, and some of them need global variables. How do I manage global variables in packages?

I've read something about environments, but I do not understand how they would work, or whether this is even the right way to go about it.

R Solutions


Solution 1 - R

You can use package-local variables through an environment. These variables are available to multiple functions in the package, but they are not (easily) accessible to the user and will not interfere with the user's workspace. A quick and simple example is:

pkg.env <- new.env()

pkg.env$cur.val <- 0
pkg.env$times.changed <- 0

inc <- function(by=1) {
	pkg.env$times.changed <- pkg.env$times.changed + 1
	pkg.env$cur.val <- pkg.env$cur.val + by
	pkg.env$cur.val
}

dec <- function(by=1) {
	pkg.env$times.changed <- pkg.env$times.changed + 1
	pkg.env$cur.val <- pkg.env$cur.val - by
	pkg.env$cur.val
}

cur <- function(){
	cat('the current value is', pkg.env$cur.val, 'and it has been changed', 
		pkg.env$times.changed, 'times\n')
}

inc()
inc()
inc(5)
dec()
dec(2)
inc()
cur()
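
To see why the environment matters here, it may help to contrast it with a plain variable. The following is a minimal sketch; the names counter, state, bump_plain and bump_env are made up for illustration and are not part of the answer above. An assignment with <- inside a function only creates a local copy, whereas an environment is modified in place, so every function that refers to it sees the change.

```r
# Plain variable: the assignment inside the function creates a local copy,
# so the variable defined outside the function is left untouched.
counter <- 0
bump_plain <- function() {
  counter <- counter + 1
  counter
}
bump_plain()
counter          # still 0

# Environment: it is not copied on modification, so all functions that
# refer to `state` share one set of values.
state <- new.env(parent = emptyenv())
state$counter <- 0
bump_env <- function() {
  state$counter <- state$counter + 1
  state$counter
}
bump_env()
bump_env()
state$counter    # now 2
```

In a package, the environment would be created at the top level of a file under R/ and simply not exported.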

Solution 2 - R

You could set an option, e.g.:

options("mypkg-myval"=3)
1+getOption("mypkg-myval")
[1] 4
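
In a package, a common place to set such a default is the .onLoad() hook. The sketch below is one possible way to do it, under the assumption that a value set by the user beforehand should win over the package default; the package name "mypkg" and the default of 3 are taken from the example above purely for illustration.

```r
# Hypothetical R/zzz.R of a package called "mypkg".
.onLoad <- function(libname, pkgname) {
  defaults <- list("mypkg-myval" = 3)
  # Only set options the user has not already defined.
  unset <- !(names(defaults) %in% names(options()))
  if (any(unset)) options(defaults[unset])
  invisible(NULL)
}

# R calls .onLoad() automatically when the package is loaded;
# it is called by hand here only to demonstrate the effect.
.onLoad(NULL, "mypkg")
getOption("mypkg-myval")
```

Note that options are global to the R session, so any other code can read or change them. That is convenient for user-facing settings but less suitable for internal state.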

Solution 3 - R

In general, global variables are evil. The underlying reason is that you want to minimize the interconnections in your package. These interconnections often give functions side effects, i.e. the outcome depends not only on the input arguments but also on the value of some global variable. Especially as the number of functions grows, this can be hard to get right and hell to debug.

For global variables in R see this SO post.

Edit in response to your comment: An alternative could be to just pass around the needed information to the functions that need it. You could create a new object which contains this info:

token_information = list(token1 = "087091287129387",
                         token2 = "UA2329723")

and require all functions that need this information to have it as an argument:

do_stuff = function(arg1, arg2, token) {
  # ... use token$token1 and token$token2 here ...
}
do_stuff(arg1, arg2, token = token_information)

In this way it is clear from the code that token information is needed in the function, and you can debug the function on its own. Furthermore, the function has no side effects, as its behavior is fully determined by its input arguments. A typical user script would look something like:

token_info = create_token(token1, token2)
do_stuff(arg1, arg2, token_info)

I hope this makes things more clear.
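
For completeness, here is a runnable sketch of that pattern. The bodies of create_token and do_stuff are hypothetical placeholders (the answer does not define them), and the arguments "a" and "b" are arbitrary; the point is only that the token travels as an explicit argument.

```r
# Hypothetical constructor: validates the inputs and bundles them in a list.
create_token <- function(token1, token2) {
  stopifnot(is.character(token1), is.character(token2))
  list(token1 = token1, token2 = token2)
}

# Hypothetical worker: everything it needs arrives through its arguments.
do_stuff <- function(arg1, arg2, token) {
  paste0(arg1, "-", arg2, " (authorised with ", token$token2, ")")
}

token_info <- create_token("087091287129387", "UA2329723")
result <- do_stuff("a", "b", token = token_info)
result
```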

Solution 4 - R

The question is unclear:

  • Just one R process or several?

  • Just on one host, or across several machines?

  • Is there common file access among them or not?

In increasing order of complexity, I'd use a file, a SQLite backend via the RSQLite package, or (my favourite :) the rredis package to write to and read from a Redis instance.
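
The simplest of these, a shared file, can be sketched with base R alone. This is an illustration under stated assumptions rather than a robust implementation: the file name mypkg_state.rds and the helpers write_state and read_state are made up, the file lives in the session's temporary directory (several processes would need a path they can all reach), and there is no locking, so concurrent writers could overwrite each other.

```r
# Hypothetical location of the shared state file.
state_file <- file.path(tempdir(), "mypkg_state.rds")

# Serialise the whole state object to disk.
write_state <- function(state, path = state_file) {
  saveRDS(state, path)
  invisible(path)
}

# Read the state back, falling back to a default if no file exists yet.
read_state <- function(path = state_file, default = list()) {
  if (!file.exists(path)) return(default)
  readRDS(path)
}

write_state(list(cur.val = 5))
read_state()$cur.val
```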

Solution 5 - R

You could also create a list of tokens and add it to R/sysdata.rda with usethis::use_data(..., internal = TRUE). The data in this file is internal, but accessible by all functions. The only problem would arise if you only want some functions to access the tokens, which would be better served by:

  1. the environment solution already proposed above; or
  2. creating a hidden helper function that holds the tokens and returns them. Then just call this hidden function inside the functions that use the tokens, and (assuming it is a list) you can inject them into the calling function's environment with list2env(..., envir = environment()).
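
The second option could look roughly like the sketch below. The helper name .get_tokens, the function use_tokens and the token values are hypothetical and chosen only for illustration.

```r
# Hidden helper (not exported from the package) that holds the tokens.
.get_tokens <- function() {
  list(token1 = "token-one", token2 = "token-two")
}

use_tokens <- function() {
  # Copy the list elements into this function's own environment,
  # so they can be used like ordinary local variables.
  list2env(.get_tokens(), envir = environment())
  paste(token1, token2)
}

use_tokens()
```

One trade-off to be aware of: because token1 and token2 are created at run time, static checks such as R CMD check may report them as having no visible binding. Accessing them as .get_tokens()$token1 avoids this.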

Solution 6 - R

If you don't mind adding a dependency to your package, you can use an R6 object from the package of the same name, as suggested in the comments to @greg-snow's answer.

R6 objects are actual environments that can have public and private methods. They are very lightweight and can be a good, more rigorous way to share a package's global variables without polluting the global environment.

Compared to @greg-snow's solution, it allows for a stricter control of your variables (you can add methods that check for types for example). The drawback can be the dependency and, of course, learning the R6 syntax.

library(R6)
MyPkgOptions = R6::R6Class(
  "mypkg_options",
  public = list(
    get_option = function(x) private$.options[[x]]
  ),
  active = list(
    var1 = function(x){
      if(missing(x)) private$.options[['var1']]
      else stop("This is an environment parameter that cannot be changed")
    },
    var2 = function(x){
      if(missing(x)) private$.options[['var2']]
      else stop("This is an environment parameter that cannot be changed")
    }
  ),
  private = list(
    .options = list(
      var1 = 1,
      var2 = 2
    )
  )
)
# Create an instance
mypkg_options = MyPkgOptions$new()
# Fetch values from active fields
mypkg_options$var1
#> [1] 1
mypkg_options$var2
#> [1] 2
# Alternative way
mypkg_options$get_option("var1")
#> [1] 1
mypkg_options$get_option("var3")
#> NULL
# Variables are locked unless you add a method to change them
mypkg_options$var1 = 3
#> Error in (function (x) : This is an environment parameter that cannot be changed

Created on 2020-05-27 by the reprex package (v0.3.0)
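
The type-checking idea mentioned above could be sketched as follows. This is a hypothetical variation, not part of the original answer: the class name typed_options and the method set_option are made up, and the rule that values must be single numbers is an arbitrary choice for the example. It assumes the R6 package is installed.

```r
library(R6)

TypedOptions <- R6Class(
  "typed_options",
  public = list(
    # Read a stored value; returns NULL for unknown names.
    get_option = function(name) private$.options[[name]],
    # Validate the value before storing it.
    set_option = function(name, value) {
      if (!is.numeric(value) || length(value) != 1) {
        stop("`value` must be a single number")
      }
      private$.options[[name]] <- value
      invisible(self)
    }
  ),
  private = list(
    .options = list(var1 = 1)
  )
)

opts <- TypedOptions$new()
opts$set_option("var1", 10)
opts$get_option("var1")
```

Calling opts$set_option("var1", "a") would stop with an error instead of silently storing a value of the wrong type.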

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      Original Author       Original Content on Stackoverflow
Question          bskard                View Question on Stackoverflow
Solution 1 - R    Greg Snow             View Answer on Stackoverflow
Solution 2 - R    James                 View Answer on Stackoverflow
Solution 3 - R    Paul Hiemstra         View Answer on Stackoverflow
Solution 4 - R    Dirk Eddelbuettel     View Answer on Stackoverflow
Solution 5 - R    David Atlas           View Answer on Stackoverflow
Solution 6 - R    Duccio A              View Answer on Stackoverflow