GeoEco.R.RWorkerProcess.ExecuteRAndEvaluateExpressions
- classmethod RWorkerProcess.ExecuteRAndEvaluateExpressions(expressions, returnResult=False, timeout=60.0, rPackages=None, rInstallDir=None, rLibDir=None, rRepository='https://cloud.r-project.org', updateRPackages=False, winBinaryOnly=True, port=None, startupTimeout=15.0, defaultTZ=None, variableNames=None, variableValues=None)
Start R, evaluate one or more R expressions, stop R, and optionally return the result of the last expression.
The R statistics program version 3.3 or later must be installed. R can be downloaded from https://cran.r-project.org/.
The Rscript program from R will be started as a child worker process with no visible user interface, and MGET will communicate with it using HTTP over TCP/IP. Rscript will listen on the loopback interface (IPv4 address 127.0.0.1) and will therefore be accessible only to processes running on the local machine. To prevent anything other than the intended parent process from interacting with the Rscript worker process, it requires all callers to provide a randomly generated 512-bit token known only to the parent process. After the final expression is executed, Rscript will be shut down.
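The token check described above can be illustrated with a minimal Python sketch. This is not MGET's implementation: the function names make_token and is_authorized are hypothetical, and the sketch only shows how a 512-bit random token might be generated by a parent process and compared in constant time.

```python
import hmac
import secrets


def make_token() -> str:
    """Generate a random 512-bit token (hypothetical helper, not MGET's)."""
    # 64 random bytes = 512 bits, hex-encoded to 128 characters.
    return secrets.token_hex(64)


def is_authorized(presented: str, expected: str) -> bool:
    """Return True only if the presented token matches the expected one."""
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

In a design like this, the parent would pass the token to the worker at startup and include it with every request, so a request from any other local process would be rejected.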
For more information about how this works, please see the RWorkerProcess class in MGET’s documentation.
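As an illustration of a typical call, the following sketch passes a CSV path to R and returns the number of rows. It assumes that RWorkerProcess can be imported from GeoEco.R, as the heading of this page suggests, and that R and GeoEco are installed. The helper name count_csv_rows is hypothetical, and the example has not been run against a real installation.

```python
def count_csv_rows(csv_path, timeout=120.0):
    """Return the row count of a CSV file as computed by R.

    Hypothetical helper; requires GeoEco and R to be installed.
    """
    # Imported lazily so this module can be loaded without GeoEco present.
    # The import path is an assumption based on the name GeoEco.R.RWorkerProcess.
    from GeoEco.R import RWorkerProcess

    return RWorkerProcess.ExecuteRAndEvaluateExpressions(
        expressions=[
            'x <- read.csv(inputCSVFile)',  # uses the variable defined below
            'nrow(x)',                      # value of the last expression
        ],
        returnResult=True,                  # return the last expression's value
        timeout=timeout,
        variableNames=['inputCSVFile'],
        variableValues=[csv_path],
    )
```

Because returnResult defaults to False, it must be set to True whenever the value of the last expression is needed.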
- Parameters:
expressions (list of str) – List of R expressions to evaluate. Each expression can be anything that may be evaluated by the R eval function. Empty strings or strings composed only of whitespace characters are not allowed. Minimum length: 1.
returnResult (bool, optional) – If True, the value of the last expression will be returned. If False, the default, a Python None will be returned, regardless of what the last expression evaluated to.
timeout (float, optional) – After R has started up and installed any necessary packages, this is the maximum amount of time, in seconds, that it is permitted to run while evaluating the expressions before it must return a result. If this time elapses without the R worker process beginning to send its response, an error will be raised.
The default timeout was selected to allow all but the most time-consuming expressions to complete. You should increase it for very long-running jobs. If you’re unsure how long it will take, you may allow an infinite amount of time by providing None from Python or deleting all text from this text box in the ArcGIS user interface.
Warning
If you allow an infinite amount of time and your R expression never completes, your program will be blocked forever. Use caution.
Must be greater than 0.0.
rPackages (list of str, optional) – List of R packages to ensure are installed. For each package that is provided, MGET will check whether it is installed. If it is not, MGET will install it. If it is, MGET will do nothing. To update already-installed packages, use the updateRPackages parameter.
MGET does not automatically “load” the packages given here. If you need to load them, make sure the expressions include a call to library(), or another suitable function.
rInstallDir (str, optional) – On Windows: the path to the directory where R is installed, if you do not want R’s installation directory to be discovered automatically. You can determine the installation directory from within R by executing the function R.home(). If this parameter is not provided, the installation directory will be located automatically. Three methods will be tried, in this order:
  1. If the R_HOME environment variable has been set, it will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of R_HOME or a FileNotFoundError exception will be raised.
  2. Otherwise (R_HOME has not been set), the Registry will be checked, starting with the HKEY_CURRENT_USER\Software\R-core key and falling back to HKEY_LOCAL_MACHINE\Software\R-core only if the former does not exist. For whichever exists, the value of R64\InstallPath will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of that directory or a FileNotFoundError exception will be raised.
  3. Otherwise (neither of those registry keys exists), the PATH environment variable will be checked for the program Rscript.exe. If it does not exist, a FileNotFoundError exception will be raised.
On other operating systems: this parameter is ignored, and R’s executables are expected to be available via the PATH environment variable.
Minimum length: 1. Must exist.
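The Windows lookup order described for rInstallDir can be sketched in Python as follows. This is an illustration of the documented order, not MGET's code. The function name find_rscript is hypothetical, and the registry lookup is abstracted behind a hypothetical read_registry callable (which would return the R64\InstallPath value, or None when neither key exists) so that the sketch runs on any platform.

```python
import os
import shutil


def find_rscript(env=None, read_registry=None, which=shutil.which):
    """Locate Rscript.exe using the three documented steps (illustrative)."""
    env = os.environ if env is None else env

    def rscript_under(home):
        # Rscript.exe is expected in the bin\x64 subdirectory.
        path = os.path.join(home, 'bin', 'x64', 'Rscript.exe')
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        return path

    # 1. R_HOME takes priority if it has been set.
    r_home = env.get('R_HOME')
    if r_home:
        return rscript_under(r_home)

    # 2. Otherwise use the install path from the registry, if available.
    install_path = read_registry() if read_registry is not None else None
    if install_path:
        return rscript_under(install_path)

    # 3. Otherwise search the PATH environment variable.
    found = which('Rscript.exe')
    if found is None:
        raise FileNotFoundError('Rscript.exe was not found on PATH')
    return found
```

Note that a set R_HOME that does not contain Rscript.exe raises an error rather than falling through to the next method, which matches the behavior described above.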
rLibDir (str, optional) – Path to the R library directory where R packages should be stored. When a package is needed, it will be loaded from this directory if it exists there, and downloaded there if it does not exist. If not provided, R’s default will be used. See the R documentation for details.
You should provide a custom directory if you want MGET to maintain its own set of R packages, rather than those you use when running R yourself. For example, when running MGET, you may want to use only packages that have been released to CRAN, while when running R yourself, you may want to use newer or experimental versions that you obtained elsewhere.
Minimum length: 1.
rRepository (str, optional) – R repository to use when downloading packages. If not provided, https://cloud.r-project.org will be used. Minimum length: 1.
updateRPackages (bool, optional) – If True, the R update.packages() function will be called automatically when R starts up, to update all R packages to their latest versions. If False, the default, this will not be done, and once a package has been installed, it will remain at that version until it is updated via some other mechanism.
Use this option to ensure your R package library is automatically kept up to date. It is set to False by default to prevent MGET from updating your already-installed packages without your explicit permission. However, even if this option is set to False, MGET will still automatically install any packages that it needs that are missing.
winBinaryOnly (bool, optional) – If True, the default, then when running on Windows options(pkgType = "win.binary") will be invoked when R starts up, before any packages are installed or updated. This will ensure that only binary packages available from the package repository will be installed, and block the installation of packages that are only available as source code.
When a package is only available as source, R must compile it locally, usually using RTools and related R utilities. We found that sometimes these locally-compiled packages end up being incompatible with other packages, possibly due to conflicts over common libraries that they both need. Package repositories help avoid this problem by compiling all packages against the same common libraries. We found that by restricting our Windows versions of R to only binary packages, the conflicts were avoided.
If you want to go ahead and allow MGET to install source-only packages, set this parameter to False. This can sometimes provide access to the very latest versions of packages that have not yet been compiled by the package repository.
This parameter is ignored on platforms other than Windows (e.g. Linux). On those platforms, R’s default settings are used, which usually allow source-only packages to be installed.
port (int, optional) – TCP port to use for communicating with R via the R plumber package. If not specified, an unused port will be selected automatically. Minimum value: 1.
startupTimeout (float, optional) – Maximum amount of time, in seconds, that R is allowed to take to initialize itself and begin servicing requests. This time is usually only a second or two, but can be longer if the machine is busy. Because of this, the default is set to 15 seconds. If the timeout elapses without the R process indicating that it is ready, an error will be raised. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.
If packages must be installed or updated, as usually occurs the first time you use MGET to interact with R, the delay is automatically extended to allow package installation to complete.
Warning
If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.
Must be greater than 0.0.
defaultTZ (str, optional) – Name of the time zone to use when 1) setting R variables from time-zone naive datetime instances, and 2) returning datetime instances from R. The time zone names are those from the IANA Time Zone Database. At the time of this writing, many of the names were conveniently listed in Wikipedia.
Setting R variables using naive datetime instances
When a datetime instance is sent to R, it is converted to an R POSIXct object, which represents time as the number of seconds since the UNIX epoch, which is defined as 1970-01-01 00:00:00 UTC. Because of this, MGET needs to know which time zone the datetime instance is in so that it can be converted to UTC for R.
If a datetime instance has a time zone defined (meaning that its tzinfo attribute is not None), then MGET will apply that time zone when computing UTC times to send to R. But if it does not have a time zone defined, it is known as a “naive” datetime. In this case, this default time zone parameter (defaultTZ) determines the time zone to use, as follows:
  - If defaultTZ is None (the default), MGET will assume that naive datetime instances are in the local time zone, consistent with how many of the Python datetime methods treat naive instances. MGET will then look up the local time zone using the Python tzlocal package and apply it when computing UTC times to send to R.
  - If defaultTZ is a string, a ZoneInfo will be instantiated from it and used instead. For example, if you want all naive datetime instances to be treated as UTC, provide 'UTC' for defaultTZ.
Getting datetime instances back from R
For consistency with the behavior described above, if defaultTZ is None (the default), MGET will look up the local time zone using the Python tzlocal package and convert all datetime instances to that time zone before returning them. The returned instances will have that time zone defined (they will not be naive).
If defaultTZ is a string, a ZoneInfo will be instantiated and used instead.
Minimum length: 1.
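The conversion of datetime instances to seconds since the UNIX epoch can be sketched as follows. This is a simplified illustration rather than MGET's implementation: the function name to_epoch_seconds is hypothetical, and it uses the standard library's datetime.astimezone() to find the local time zone instead of the tzlocal package mentioned above.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def to_epoch_seconds(dt, default_tz=None):
    """Return seconds since 1970-01-01 00:00:00 UTC for a datetime."""
    if dt.tzinfo is None:
        if default_tz is None:
            # Naive, no default given: interpret as local time.
            # (Assumption: stdlib lookup stands in for tzlocal here.)
            dt = dt.astimezone()
        else:
            # Naive, default given: attach the named IANA time zone.
            dt = dt.replace(tzinfo=ZoneInfo(default_tz))
    # Aware instances already carry their own zone, which is applied as-is.
    return dt.timestamp()
```

For example, to_epoch_seconds(datetime(2007, 12, 31, 12, 34, 56), default_tz='UTC') would treat the naive value as UTC, while omitting default_tz would treat it as local time.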
variableNames (list of str, optional) – A list of names of variables to define in the R interpreter before the R expressions are evaluated.
This list must have the same number of entries as the Variable Values parameter. This list specifies the names of the variables that will be defined and that list specifies their values.
These two parameters are useful when you need to pass input data that will be used in your R expressions. You can initialize variables to values you specify, and then refer to the variables in the R expressions. For example, you might define a variable named inputCSVFile and then include the following R expressions to read the table and print a summary:
    x = read.csv(inputCSVFile)
    print(summary(x))
variableValues (list of object, optional) – A list of values of variables to define in the R interpreter before the R expressions are evaluated.
This list must have the same number of entries as the Variable Names parameter. That list specifies the names of the variables that will be defined and this list specifies their values.
The values you provide are automatically converted to the most appropriate R data types. Please see the MGET documentation for the RWorkerProcess class for details. However, because this function is intended to be invoked as an ArcGIS geoprocessing tool, it handles strings differently than described in that documentation.
This is necessary because the ArcGIS geoprocessing framework passes all parameters to Python tools (such as this one) as strings, making it impossible to determine the original data type of each parameter simply from its value. For example, given the string “123”, it is impossible to determine if it was supposed to represent the string “123”, the integer 123, or the floating point number 123.0.
To address this limitation, this function attempts to parse strings into booleans, integers, floating point numbers, and datetimes, in that order. If a parsing attempt succeeds, the parsed value is used. If all parsing attempts fail, the value is converted to a string as normal. If a string is empty (it has a length of zero), it is converted to NA in R. (If a string contains one or more whitespace characters, it is not considered empty.)
For example:
  - “True” is converted to an R logical
  - “5” is converted to an R integer
  - “1.05” is converted to an R double
  - “2007-12-31 12:34:56” is converted to an R POSIXct
  - “1.05 days” is converted to an R character
  - “” is converted to R NA
This tool parses booleans as “true” or “false” (case insensitive). It attempts to parse dates using a large number of formats, starting with what appear to be the appropriate formats for the operating system’s current locale. If no time zone is included in the string itself, the time zone is specified by the defaultTZ parameter.
This special parsing logic only applies to atomic string values. It does NOT apply to collections of strings, such as lists or dictionaries of strings. (It is only possible to provide such collections when calling this function from Python; it cannot be done from ArcGIS geoprocessing.)
Must have the same length as variableNames.
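The string parsing order described for variableValues can be sketched in Python as follows. This is a simplified illustration, not MGET's parser: the function name parse_arcgis_string is hypothetical, None stands in for R's NA, and a single assumed date format replaces the many locale-dependent formats that the real tool tries. Python's int and float are also more permissive than a production parser would likely be (for example, they accept surrounding whitespace).

```python
from datetime import datetime


def parse_arcgis_string(text, date_format='%Y-%m-%d %H:%M:%S'):
    """Parse a string as bool, int, float, or datetime, in that order."""
    if text == '':
        return None  # stand-in for R NA; whitespace-only strings are not empty

    # Booleans: "true" or "false", case insensitive.
    lowered = text.lower()
    if lowered in ('true', 'false'):
        return lowered == 'true'

    # Integers first, then floating point numbers.
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass

    # Datetimes (one assumed format only), otherwise keep the string.
    try:
        return datetime.strptime(text, date_format)
    except ValueError:
        return text
```

Trying int before float is what keeps “5” an integer rather than a double, since every integer string would also parse successfully as a float.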
- Returns:
Result returned from the R interpreter. If the evaluated R code contained multiple expressions, the value of the last expression is returned. The type of the returned value depends on the expressions that are evaluated. If returnResult is False, None is returned.
- Return type: