GeoEco.R.RWorkerProcess.ExecuteRAndEvaluateExpressions

classmethod RWorkerProcess.ExecuteRAndEvaluateExpressions(expressions, returnResult=False, timeout=60.0, rPackages=None, rInstallDir=None, rLibDir=None, rRepository='https://cloud.r-project.org', updateRPackages=False, port=None, startupTimeout=15.0, defaultTZ=None, variableNames=None, variableValues=None)

Start R, evaluate one or more R expressions, stop R, and optionally return the result of the last expression.

The R statistics program version 3.3 or later must be installed. R can be downloaded from https://cran.r-project.org/.

The Rscript program from R will be started as a child worker process with no visible user interface, and MGET will communicate with it using HTTP over TCP/IP. Rscript will listen on the loopback interface (IPv4 address 127.0.0.1) and will therefore be accessible only to processes running on the local machine. To prevent anything other than the intended parent process from interacting with the Rscript worker process, the worker requires all callers to provide a randomly generated 512-bit token known only to the parent process. After the final expression is executed, Rscript will be shut down.

For more information about how this works, please see the documentation for the RWorkerProcess class in MGET’s documentation.
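The token check described above can be illustrated with a short Python sketch. This is not MGET's implementation: the function names are hypothetical, and how the token is actually transmitted and verified by the R worker is not described here. The sketch only shows one way to generate a 512-bit random token and compare it safely.

```python
import hmac
import secrets

LOOPBACK_HOST = "127.0.0.1"  # only reachable from the local machine
TOKEN_BYTES = 64             # 64 bytes * 8 = 512 bits


def new_token():
    """Return a random 512-bit token as a 128-character hex string (hypothetical helper)."""
    return secrets.token_hex(TOKEN_BYTES)


def is_authorized(presented, expected):
    """Compare two tokens in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

In a scheme like this, only the parent process holds the token, so a request from any other local process fails the comparison.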

Parameters:
  • expressions (list of str) – List of R expressions to evaluate. Each expression can be anything that may be evaluated by the R eval function. Empty strings or strings composed only of whitespace characters are not allowed. Minimum length꞉ 1.

  • returnResult (bool, optional) – If True, the value of the last expression will be returned. If False, the default, a Python None will be returned, regardless of what the last expression evaluated to.
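A minimal sketch of a call from Python, assuming GeoEco is installed and R can be located. The helper names check_expressions and evaluate_in_r are hypothetical. The pre-check only mirrors the constraints stated for the expressions parameter; MGET performs its own validation.

```python
def check_expressions(expressions):
    """Client-side pre-check mirroring the documented constraints (hypothetical helper)."""
    if len(expressions) < 1:
        raise ValueError("at least one R expression is required")
    for expr in expressions:
        if not expr.strip():
            raise ValueError("empty or whitespace-only expressions are not allowed")
    return list(expressions)


def evaluate_in_r(expressions, timeout=60.0):
    """Start R, evaluate the expressions, stop R, and return the last value."""
    # Imported lazily so this sketch can be loaded without GeoEco installed.
    from GeoEco.R import RWorkerProcess

    return RWorkerProcess.ExecuteRAndEvaluateExpressions(
        check_expressions(expressions),
        returnResult=True,  # without this, None is returned
        timeout=timeout,
    )
```

For example, evaluate_in_r(["x <- c(1, 2, 3)", "mean(x)"]) would be expected to return the value of the last expression, mean(x), converted to a Python object.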

  • timeout (float, optional) –

    After R has started up and installed any necessary packages, this is the maximum amount of time, in seconds, that it is permitted to run while evaluating the expressions before it must return a result. If this time elapses without the R worker process beginning to send its response, an error will be raised.

    The default timeout was selected to allow all but the most time consuming expressions to complete. You should increase it for very long running jobs. If you’re unsure how long it will take, you may allow an infinite amount of time by providing None from Python or deleting all text from this text box in the ArcGIS user interface.

    Warning

    If you allow an infinite amount of time and your R expression never completes, your program will be blocked forever. Use caution.

    Must be greater than 0.0.

  • rPackages (list of str, optional) –

    List of R packages to ensure are installed. For each package that is provided, MGET will check whether it is installed. If it is not, MGET will install it. If it is, MGET will do nothing. To update already-installed packages, use the updateRPackages parameter.

    MGET does not automatically “load” the packages given here. If you need to load them, make sure the expressions include a call to library(), or another suitable function.
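Because installed packages are not attached automatically, one option is to prepend the attach calls to your expressions yourself. The sketch below assumes R's library() function is used for this; the helper name with_library_calls and the package name in the usage note are illustrative only.

```python
def with_library_calls(packages, expressions):
    """Return the expressions preceded by one library() call per package (hypothetical helper)."""
    loads = [f"library({name})" for name in packages]
    return loads + list(expressions)
```

The same package list could then be passed as rPackages so that MGET installs anything missing, for example rPackages=["mgcv"] together with expressions=with_library_calls(["mgcv"], your_expressions).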

  • rInstallDir (str, optional) –

    On Windows: the path to the directory where R is installed, if you do not want R’s installation directory to be discovered automatically. You can determine the installation directory from within R by executing the function R.home(). If this parameter is not provided, the installation directory will be located automatically. Three methods will be tried, in this order:

    1. If the R_HOME environment variable has been set, it will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of R_HOME or a FileNotFoundError exception will be raised.

    2. Otherwise (R_HOME has not been set), the Registry will be checked, starting with the HKEY_CURRENT_USER\Software\R-core key and falling back to HKEY_LOCAL_MACHINE\Software\R-core only if the former does not exist. For whichever exists, the value of R64\InstallPath will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of that directory or a FileNotFoundError exception will be raised.

    3. Otherwise (neither of those registry keys exists), the PATH environment variable will be checked for the program Rscript.exe. If it does not exist, a FileNotFoundError exception will be raised.

    On other operating systems: this parameter is ignored, and R’s executables are expected to be available via the PATH environment variable.

    Minimum length꞉ 1. Must exist.
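The three-step Windows lookup described above can be sketched as follows. This is an illustration of the documented order, not MGET's code: all function names are hypothetical, and the behavior when the R-core key exists but has no R64\InstallPath value is an assumption (here the error simply propagates). The registry reader and PATH lookup are injectable so the logic can be exercised on any platform.

```python
import os
import shutil
from pathlib import Path


def read_install_path_from_registry():
    """Return the InstallPath value under the R64 subkey of the first R-core key found."""
    import winreg  # Windows-only module, so it is imported lazily

    for hive in (winreg.HKEY_CURRENT_USER, winreg.HKEY_LOCAL_MACHINE):
        try:
            root = winreg.OpenKey(hive, r"Software\R-core")
        except FileNotFoundError:
            continue  # this key does not exist; fall back to the next hive
        with root, winreg.OpenKey(root, "R64") as key:
            value, _ = winreg.QueryValueEx(key, "InstallPath")
            return value
    return None


def rscript_under(install_dir):
    """Return bin\\x64\\Rscript.exe under install_dir, or raise FileNotFoundError."""
    candidate = Path(install_dir) / "bin" / "x64" / "Rscript.exe"
    if not candidate.is_file():
        raise FileNotFoundError(f"{candidate} does not exist")
    return candidate


def find_rscript_windows(environ=None,
                         read_registry=read_install_path_from_registry,
                         which=shutil.which):
    """Locate Rscript.exe using R_HOME, then the Registry, then the PATH."""
    environ = os.environ if environ is None else environ
    r_home = environ.get("R_HOME")
    if r_home:                      # 1. R_HOME environment variable
        return rscript_under(r_home)
    install_path = read_registry()  # 2. Registry
    if install_path:
        return rscript_under(install_path)
    found = which("Rscript.exe")    # 3. PATH
    if found is None:
        raise FileNotFoundError("Rscript.exe was not found on the PATH")
    return Path(found)
```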

  • rLibDir (str, optional) –

    Path to the R library directory where R packages should be stored. When a package is needed, it will be loaded from this directory if it exists there, and downloaded there if it does not. If not provided, R’s default will be used. See the R documentation for details.

    You should provide a custom directory if you want MGET to maintain its own set of R packages, rather than those you use when running R yourself. For example, when running MGET, you may want to use only packages that have been released to CRAN, while when running R yourself, you may want to use newer or experimental versions that you obtained elsewhere.

    Minimum length꞉ 1.

  • rRepository (str, optional) – R repository to use when downloading packages. If not provided, https://cloud.r-project.org will be used. Minimum length꞉ 1.

  • updateRPackages (bool, optional) –

    If True, the R update.packages() function will be called automatically when R starts up, to update all R packages to their latest versions. If False, the default, this will not be done, and once a package has been installed, it will remain at that version until it is updated via some other mechanism.

    Use this option to ensure your R package library is automatically kept up to date. It is set to False by default to prevent MGET from updating your already-installed packages without your explicit permission. However, even if this option is set to False, MGET will still automatically install any packages that it needs that are missing.

  • port (int, optional) – TCP port to use for communicating with R via the R plumber package. If not specified, an unused port will be selected automatically. Minimum value꞉ 1.

  • startupTimeout (float, optional) –

    Maximum amount of time, in seconds, that R is allowed to take to initialize itself and begin servicing requests. This time is usually only a second or two, but can be longer if the machine is busy. Because of this, the default is set to 15 seconds. If the timeout elapses without the R process indicating that it is ready, an error will be raised. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.

    If packages must be installed or updated, as usually occurs the first time you use MGET to interact with R, the delay is automatically extended to allow package installation to complete.

    Warning

    If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.

    Must be greater than 0.0.

  • defaultTZ (str, optional) –

    Name of the time zone to use when (1) setting R variables from time zone-naive datetime instances and (2) returning datetime instances from R. The time zone names are those from the IANA Time Zone Database. At the time of this writing, many of the names were conveniently listed on Wikipedia.

    Setting R variables using naive datetime instances

    When a datetime instance is sent to R, it is converted to an R POSIXct object, which represents time as the number of seconds since the UNIX epoch, which is defined as 1970-01-01 00:00:00 UTC. Because of this, MGET needs to know which time zone the datetime instance is in so that it can be converted to UTC for R.

    If a datetime instance has a time zone defined (meaning that its tzinfo attribute is not None), then MGET will apply that time zone when computing UTC times to send to R. But if it does not have a time zone defined, it is known as a “naive” datetime. In this case, this default time zone parameter (defaultTZ) determines the time zone to use, as follows:

    If defaultTZ is None (the default), MGET will assume that naive datetime instances are in the local time zone, consistent with how many of the Python datetime methods treat naive instances. MGET will then look up the local time zone using the Python tzlocal package and apply it when computing UTC times to send to R.

    If defaultTZ is a string, a ZoneInfo will be instantiated from it and used instead. For example, if you want all naive datetime instances to be treated as UTC, provide 'UTC' for defaultTZ.

    Getting datetime instances back from R

    For consistency with the behavior described above, if defaultTZ is None (the default), MGET will look up the local time zone using the Python tzlocal package and convert all datetime instances to that time zone before returning them. The returned instances will have that time zone defined (they will not be naive).

    If defaultTZ is a string, a ZoneInfo will be instantiated and used instead.

    Minimum length꞉ 1.
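The time zone rules above can be sketched in plain Python. This is an approximation under stated assumptions: the function names are hypothetical, and where MGET is documented to look up the local zone with the tzlocal package, this sketch substitutes the standard library's datetime.astimezone(), which also treats naive instances as local time.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_epoch_seconds(value, default_tz=None):
    """Convert a datetime to seconds since 1970-01-01 00:00:00 UTC, as POSIXct stores it."""
    if value.tzinfo is None:
        if default_tz is None:
            value = value.astimezone()  # naive instance presumed to be local time
        else:
            value = value.replace(tzinfo=ZoneInfo(default_tz))
    return value.timestamp()


def from_epoch_seconds(seconds, default_tz=None):
    """Convert epoch seconds to an aware datetime in default_tz, or in local time if None."""
    utc = datetime.fromtimestamp(seconds, tz=timezone.utc)
    if default_tz is None:
        return utc.astimezone()
    return utc.astimezone(ZoneInfo(default_tz))
```

An instance that already has tzinfo set is unaffected by default_tz on the way in, which matches the behavior described for aware datetime instances.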

  • variableNames (list of str, optional) –

    A list of names of variables to define in the R interpreter before the R expressions are evaluated.

    This list must have the same number of entries as the Variable Values parameter. This list specifies the names of the variables that will be defined and that list specifies their values.

    These two parameters are useful when you need to pass input data that will be used in your R expressions. You can initialize variables to values you specify, and then refer to the variables in the R expressions. For example, you might define a variable named inputCSVFile and then include the following R expressions to read the table and print a summary:

    x = read.csv(inputCSVFile)
    print(summary(x))
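From Python, the two parallel lists can be built from a single mapping so that they cannot fall out of step. This is a sketch only: the helper name build_variable_arguments and the file path are hypothetical.

```python
def build_variable_arguments(variables):
    """Split a dict of R variable names -> values into the two parallel lists."""
    names = list(variables.keys())
    values = [variables[name] for name in names]
    return {"variableNames": names, "variableValues": values}


# Hypothetical input file; replace with a real path.
kwargs = build_variable_arguments({"inputCSVFile": "/data/example.csv"})
expressions = ["x = read.csv(inputCSVFile)", "print(summary(x))"]

# With GeoEco installed, the call would then look like:
#   RWorkerProcess.ExecuteRAndEvaluateExpressions(expressions, **kwargs)
```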
    

  • variableValues (list of object, optional) –

    A list of values of variables to define in the R interpreter before the R expressions are evaluated.

    This list must have the same number of entries as the Variable Names parameter. That list specifies the names of the variables that will be defined and this list specifies their values.

    The values you provide are automatically converted to the most appropriate R data types. Please see the MGET documentation for the RWorkerProcess class for details. However, because this function is intended to be invoked as an ArcGIS geoprocessing tool, it handles strings differently than described in that documentation.

    This is necessary because the ArcGIS geoprocessing framework passes all parameters to Python tools (such as this one) as strings, making it impossible to determine the original data type of each parameter simply from its value. For example, given the string “123”, it is impossible to determine whether it was supposed to represent the string “123”, the integer 123, or the floating point number 123.0.

    To address this limitation, this function attempts to parse strings into booleans, integers, floating point numbers, and datetimes, in that order. If a parsing attempt succeeds, the parsed value is used. If all parsing attempts fail, it is converted to a string as normal. If a string is empty (it has a length of zero), it is converted to NA in R. (If a string contains one or more whitespace characters, it is not considered empty.)

    For example:

    • “True” is converted to an R logical

    • “5” is converted to an R integer

    • “1.05” is converted to an R double

    • “2007-12-31 12:34:56” is converted to an R POSIXct

    • “1.05 days” is converted to an R character

    • “” is converted to R NA

    This tool parses booleans as “true” or “false” (case insensitive). It attempts to parse dates using a large number of formats, starting with what appear to be the appropriate formats for the operating system’s current locale. If no time zone is included in the string itself, the time zone is specified by the defaultTZ parameter.

    This special parsing logic only applies to atomic string values. It does NOT apply to collections of strings, such as lists or dictionaries of strings. (It is only possible to provide such collections when calling this function from Python; it cannot be done from ArcGIS geoprocessing.)

    Must have the same length as variableNames.
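The parsing order for atomic strings can be sketched as follows. This is an approximation, not MGET's parser: the function name is hypothetical, only two date formats are tried rather than the large locale-aware set described above, time zone handling is omitted, and Python's None stands in for R's NA.

```python
from datetime import datetime

# Hypothetical subset of date formats, for illustration only.
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_arcgis_string(text):
    """Try boolean, integer, float, and datetime parsing, in that order."""
    if text == "":
        return None  # empty string stands in for R's NA
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass  # not this type; try the next one
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    return text  # every attempt failed, so keep the string
```

Note that Python's int() and float() accept some inputs, such as "1_000" or "inf", that a stricter parser might leave as strings, so the sketch should not be read as defining MGET's exact acceptance rules.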

Returns:

Result returned from the R interpreter. If the evaluated R code contained multiple expressions, the value of the last expression is returned. The type of the returned value depends on the expression that is evaluated. If returnResult is False, None is returned.

Return type:

object