GeoEco.R.RWorkerProcess
- class GeoEco.R.RWorkerProcess(rInstallDir=None, rLibDir=None, rRepository='https://cloud.r-project.org', rPackages=None, winBinaryOnly=True, updateRPackages=False, port=None, timeout=5.0, startupTimeout=15.0, defaultTZ=None)
Bases: MutableMapping
Starts and manages an R child process and provides methods for interacting with it.
Similar to the rpy2 package, RWorkerProcess starts the R interpreter and provides mechanisms for Python code to get and set R variables and evaluate R expressions. RWorkerProcess is not as fully-featured as rpy2 and has several important differences in how it is implemented:
- RWorkerProcess hosts the R interpreter in a child process (using the Rscript program), while rpy2 hosts it within the same process as the Python interpreter. RWorkerProcess is therefore less likely to encounter “DLL Hell” conflicts, in which Python and R try to load different versions of the same shared library, which can cause the process to crash. However, RWorkerProcess is slower than rpy2, because interactions with R have to occur via interprocess communication. RWorkerProcess implements this with the R plumber package, which allows R functions to be exposed as HTTP endpoints. This mechanism is also less secure than that used by rpy2; see Security below.
- RWorkerProcess does not allow as full a range of data types to be exchanged between Python and R as rpy2. With RWorkerProcess, the communication between Python and R uses JSON for exchanging basic types and Apache feather for exchanging data frames. These choices simplified implementation but placed some limitations on what can be exchanged. Most notably, Python numpy arrays cannot be translated to R matrices (although support for this could be added in the future). By contrast, rpy2 calls R’s C API directly and has implemented translation code for more data types, including numpy arrays to R matrices.
- RWorkerProcess does not need to be compiled against a specific version of R, and can therefore work with any version of R that you have installed, while rpy2 must be recompiled for the R version you have, whenever you change it.
- RWorkerProcess supports Microsoft Windows, while rpy2 historically has lacked a Windows maintainer. While it can be possible to get rpy2 working on Windows, there are usually no binary distributions (Python wheels) for Windows on the Python Package Index.
For Conda users, which generally includes users of ArcGIS, there is a release of rpy2 on conda-forge, but it can be out of date by a year or more and may not be compatible with recent R versions. To work around this, Windows users can try to build rpy2 from source, but installing the correct compiler and required libraries can be challenging and time consuming.
If rpy2 works for you, we recommend you continue to use it. But if not, or some of the issues mentioned above affect you,
RWorkerProcess could provide an effective alternative.
Using RWorkerProcess
RWorkerProcess represents the child R process. When you instantiate RWorkerProcess, nothing happens at first. The child process is started automatically when you start using the RWorkerProcess instance to interact with R. We recommend you use the with statement to automatically control the child process’s lifetime:

    from GeoEco.R import RWorkerProcess

    with RWorkerProcess() as r:
        ...
        x = r.Eval('1+1')    # Worker process started here, at the first use of the RWorkerProcess instance
        ...
    print(x)                 # Worker process stopped before this line is executed, after the block above exits

This will start the child process when it is first needed and automatically stop it when the with block is exited, even if an exception is raised.
If desired, you can call Start() to start it manually or Stop() to stop it. We recommend you use a try/finally block to do it:

    r = RWorkerProcess()
    r.Start()           # Worker process started here
    try:
        ...
    finally:
        r.Stop()        # Worker process stopped here

Regardless of which style you use, if the R child process is still running when the Python process exits, the operating system will stop the child process, even if Python dies without exiting properly.
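The lifetime rules described above can be tried out without R installed. The following is a minimal sketch built around a hypothetical stand-in class, FakeWorker, which mimics only the documented behavior: nothing starts at instantiation, the worker starts at first use, and it stops when the with block exits, even if an exception is raised. It is not the RWorkerProcess implementation.

```python
class FakeWorker:
    """Hypothetical stand-in that mimics the documented lifetime rules."""

    def __init__(self):
        self.running = False          # Instantiation alone starts nothing
        self.events = []

    def Start(self):
        if not self.running:
            self.running = True
            self.events.append('started')

    def Stop(self):
        if self.running:
            self.running = False
            self.events.append('stopped')

    def Eval(self, expr):
        self.Start()                  # Lazy start at the first use
        return f'evaluated {expr}'

    def __enter__(self):
        return self                   # Entering the block starts nothing

    def __exit__(self, exc_type, exc_value, traceback):
        self.Stop()                   # Always stop, even after an exception
        return False                  # Do not suppress the exception


def demo():
    """Run a with block that fails, and return the worker for inspection."""
    w = FakeWorker()
    try:
        with w:
            assert not w.running      # Not started yet
            w.Eval('1+1')
            assert w.running          # Started by the first Eval
            raise ValueError('simulated failure')
    except ValueError:
        pass
    return w
```

After demo() returns, the stand-in worker has been started once and stopped once, which is the same guarantee the with statement gives for a real RWorkerProcess.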
Warning
RWorkerProcess must install the R plumber package the first time it interacts with R, unless the package is already installed. Plumber depends on a number of R packages. Installing plumber and its dependencies may take several minutes on Windows. On Linux, where R package installations typically require compiling from C source code, it can take 20 minutes or more. After this has been done for the first time, it will not be necessary to do again, unless you uninstall plumber.
Evaluating R expressions from Python
Eval() accepts a string representing an R expression, passes it to the R interpreter for evaluation, and returns the result, translating R types into suitable Python types. You can supply multiple expressions in a single call, separated by newline characters or semicolons. The value of the last expression will be returned:

    >>> from GeoEco.R import RWorkerProcess
    >>> r = RWorkerProcess()
    >>> r.Eval('x <- 6; y <- 7; x * y')
    42

A variety of R types can be translated into Python types. The rules of translation are governed by the serialization formats used to marshal data between Python and R. For most types, JSON is used as the serialization format, with the requests package handling it on the Python side and plumber on the R side. In general, R vectors, lists, and data frames are supported, as follows:
R vectors of length 1, sometimes known as atomic values, with the type logical, integer, double, or character are returned as Python bool, int, float, and str, respectively:

    >>> r.Eval('TRUE')
    True
    >>> r.Eval('123')
    123
    >>> r.Eval('pi')
    3.141592653589793
    >>> r.Eval('"Hello, world"')
    'Hello, world'

Those atomic types are also returned even if you use R’s c() function to create a length 1 vector. (It does not matter how you construct it; if the vector has length 1, the atomic types are returned.)

    >>> r.Eval('c(TRUE)')
    True
    >>> r.Eval('c(123)')
    123
    >>> r.Eval('c(pi)')
    3.141592653589793
    >>> r.Eval('c("Hello, world")')
    'Hello, world'

R vectors of length 2 or more are returned as a Python list:

    >>> r.Eval('c(1,2,3)')
    [1, 2, 3]

R unnamed lists are also returned as a list. In this case, a list of length 1 is not returned as an atomic type, but as a list with one item:

    >>> r.Eval('list(1)')
    [1]
    >>> r.Eval('list(1,2,3)')
    [1, 2, 3]
    >>> r.Eval('list(c(1, 2, 3))')
    [[1, 2, 3]]
    >>> r.Eval('list(c(1,2,3), c("A", "B", "C"))')
    [[1, 2, 3], ['A', 'B', 'C']]

R vectors and lists of length 0 are returned as an empty list:

    >>> r.Eval('logical(0)')
    []
    >>> r.Eval('integer(0)')
    []
    >>> r.Eval('numeric(0)')
    []
    >>> r.Eval('character(0)')
    []
    >>> r.Eval('list()')
    []

R named lists are returned as a Python dict:

    >>> r.Eval('list(a=1, b=2, c=3)')
    {'a': 1, 'b': 2, 'c': 3}
    >>> r.Eval('list(a=c(1,2,3), b=4, c=c("A", "B", "C"))')
    {'a': [1, 2, 3], 'b': 4, 'c': ['A', 'B', 'C']}

R vectors of POSIXt (i.e. POSIXct or POSIXlt) are returned as Python datetime instances:

    >>> r.Eval('Sys.time()')
    datetime.datetime(2025, 2, 5, 15, 13, 47, 641000, tzinfo=zoneinfo.ZoneInfo(key='America/New_York'))

Time values obtained from R will have millisecond precision, even if R itself has higher precision. The millisecond limitation results from the format used by the R plumber package to represent times in JSON.
The defaultTZ parameter of the RWorkerProcess constructor determines the time zone that all POSIXt objects will be converted to when they are returned to Python. By default, it is the time zone of the Python process, as returned by get_localzone() from the tzlocal package. To specify a different timezone, provide it to the RWorkerProcess constructor:

    >>> r = RWorkerProcess(defaultTZ='America/Los_Angeles')
    >>> r.Eval('Sys.time()')
    datetime.datetime(2025, 2, 5, 12, 13, 47, 641000, tzinfo=zoneinfo.ZoneInfo(key='America/Los_Angeles'))

See the documentation for defaultTZ for more information.
R NA is returned as a Python None:

    >>> r.Eval('NA') is None
    True
    >>> r.Eval('c(1, 2, NA, 3)')
    [1, 2, None, 3]

R data frames are returned as Python pandas DataFrames:

    >>> df = r.Eval('iris')
    >>> df.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 150 entries, 0 to 149
    Data columns (total 5 columns):
     #   Column        Non-Null Count  Dtype
    ---  ------        --------------  -----
     0   Sepal.Length  150 non-null    float64
     1   Sepal.Width   150 non-null    float64
     2   Petal.Length  150 non-null    float64
     3   Petal.Width   150 non-null    float64
     4   Species       150 non-null    category
    dtypes: category(1), float64(4)
    memory usage: 5.1 KB
    >>> df.head()
       Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
    0           5.1          3.5           1.4          0.2  setosa
    1           4.9          3.0           1.4          0.2  setosa
    2           4.7          3.2           1.3          0.2  setosa
    3           4.6          3.1           1.5          0.2  setosa
    4           5.0          3.6           1.4          0.2  setosa

Arbitrary R objects not covered above are usually converted to an R list with R’s
unclass()and then returned as Pythondicts:>>> model = r.Eval('lm(dist ~ speed, data = cars)') >>> from pprint import pprint >>> pprint(model, width=150, compact=True) {'assign': [0, 1], 'call': {}, 'coefficients': [-17.579094890510934, 3.932408759124087], 'df.residual': 48, 'effects': [-303.9144945539781, 145.55225504575705, -8.115439504379111, 9.884560495620892, 0.194114676507422, -9.49633114260605, -5.186776961719519, 2.8132230382804804, 10.81322303828048, -9.87722278083299, 1.1227772191670096, -16.56766859994646, -10.56766859994646, -6.56766859994646, -2.5676685999464604, -8.25811441905993, -0.2581144190599315, -0.2581144190599315, 11.74188558094007, -11.948560238173402, -1.948560238173402, 22.0514397618266, 42.05143976182659, -21.63900605728687, -15.639006057286872, 12.360993942713128, -13.329451876400343, -5.329451876400342, -17.019897695513812, -9.019897695513812, 0.9801023044861885, -10.710343514627283, 3.2896564853727175, 23.289656485372713, 31.289656485372713, -20.400789333740754, -10.400789333740754, 11.599210666259246, -28.091235152854225, -12.091235152854225, -8.091235152854225, -4.091235152854224, 3.908764847145776, -1.4721267910811655, -17.162572610194637, -4.853018429308115, 17.146981570691885, 18.146981570691885, 45.146981570691885, 6.456535751578421], 'fitted.values': [-1.8494598540146354, -1.8494598540145883, 9.94776642335767, 9.947766423357667, 13.880175182481754, 17.81258394160584, 21.74499270072993, 21.74499270072993, 21.74499270072993, 25.677401459854018, 25.677401459854018, 29.609810218978105, 29.6098102189781, 29.609810218978105, 29.609810218978105, 33.54221897810219, 33.54221897810219, 33.54221897810219, 33.54221897810219, 37.47462773722628, 37.47462773722628, 37.47462773722627, 37.474627737226285, 41.40703649635036, 41.407036496350365, 41.407036496350365, 45.33944525547445, 45.33944525547445, 49.27185401459854, 49.27185401459854, 49.27185401459854, 53.204262773722625, 53.204262773722625, 53.20426277372263, 53.20426277372263, 
57.13667153284671, 57.13667153284671, 57.13667153284671, 61.0690802919708, 61.0690802919708, 61.0690802919708, 61.0690802919708, 61.0690802919708, 68.93389781021898, 72.86630656934307, 76.79871532846715, 76.79871532846715, 76.79871532846715, 76.79871532846715, 80.73112408759124], 'model': [{'dist': 2, 'speed': 4}, {'dist': 10, 'speed': 4}, {'dist': 4, 'speed': 7}, {'dist': 22, 'speed': 7}, {'dist': 16, 'speed': 8}, {'dist': 10, 'speed': 9}, {'dist': 18, 'speed': 10}, {'dist': 26, 'speed': 10}, {'dist': 34, 'speed': 10}, {'dist': 17, 'speed': 11}, {'dist': 28, 'speed': 11}, {'dist': 14, 'speed': 12}, {'dist': 20, 'speed': 12}, {'dist': 24, 'speed': 12}, {'dist': 28, 'speed': 12}, {'dist': 26, 'speed': 13}, {'dist': 34, 'speed': 13}, {'dist': 34, 'speed': 13}, {'dist': 46, 'speed': 13}, {'dist': 26, 'speed': 14}, {'dist': 36, 'speed': 14}, {'dist': 60, 'speed': 14}, {'dist': 80, 'speed': 14}, {'dist': 20, 'speed': 15}, {'dist': 26, 'speed': 15}, {'dist': 54, 'speed': 15}, {'dist': 32, 'speed': 16}, {'dist': 40, 'speed': 16}, {'dist': 32, 'speed': 17}, {'dist': 40, 'speed': 17}, {'dist': 50, 'speed': 17}, {'dist': 42, 'speed': 18}, {'dist': 56, 'speed': 18}, {'dist': 76, 'speed': 18}, {'dist': 84, 'speed': 18}, {'dist': 36, 'speed': 19}, {'dist': 46, 'speed': 19}, {'dist': 68, 'speed': 19}, {'dist': 32, 'speed': 20}, {'dist': 48, 'speed': 20}, {'dist': 52, 'speed': 20}, {'dist': 56, 'speed': 20}, {'dist': 64, 'speed': 20}, {'dist': 66, 'speed': 22}, {'dist': 54, 'speed': 23}, {'dist': 70, 'speed': 24}, {'dist': 92, 'speed': 24}, {'dist': 93, 'speed': 24}, {'dist': 120, 'speed': 24}, {'dist': 85, 'speed': 25}], 'qr': {'pivot': [1, 2], 'qr': [[-7.0710678118654755, -108.8944443027283], [0.1414213562373095, 37.0135110466435], [0.1414213562373095, 0.18878369792756214], [0.1414213562373095, 0.18878369792756214], [0.1414213562373095, 0.16176653657964718], [0.1414213562373095, 0.13474937523173222], [0.1414213562373095, 0.10773221388381726], [0.1414213562373095, 
0.10773221388381726], [0.1414213562373095, 0.10773221388381726], [0.1414213562373095, 0.0807150525359023], [0.1414213562373095, 0.0807150525359023], [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.027353592855757516], [0.1414213562373095, -0.027353592855757516], [0.1414213562373095, -0.027353592855757516], [0.1414213562373095, -0.05437075420367247], [0.1414213562373095, -0.05437075420367247], [0.1414213562373095, -0.08138791555158742], [0.1414213562373095, -0.08138791555158742], [0.1414213562373095, -0.08138791555158742], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.13542223824741734], [0.1414213562373095, -0.13542223824741734], [0.1414213562373095, -0.13542223824741734], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.2164737222911622], [0.1414213562373095, -0.24349088363907717], [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.2975252063349071]], 'qraux': [1.1414213562373094, 1.269835181971307], 'rank': 2, 'tol': 1e-07}, 'rank': 2, 'residuals': [3.8494598540146354, 
11.849459854014588, -5.94776642335767, 12.052233576642333, 2.119824817518246, -7.812583941605841, -3.744992700729929, 4.255007299270071, 12.255007299270071, -8.677401459854016, 2.3225985401459837, -15.609810218978105, -9.609810218978101, -5.609810218978103, -1.609810218978103, -7.54221897810219, 0.4577810218978093, 0.4577810218978093, 12.45778102189781, -11.474627737226276, -1.474627737226278, 22.525372262773725, 42.525372262773715, -21.40703649635036, -15.407036496350365, 12.592963503649635, -13.339445255474452, -5.339445255474452, -17.27185401459854, -9.271854014598537, 0.7281459854014627, -11.204262773722625, 2.795737226277375, 22.79573722627737, 30.79573722627737, -21.136671532846712, -11.136671532846712, 10.863328467153288, -29.0690802919708, -13.0690802919708, -9.0690802919708, -5.0690802919708, 2.9309197080292, -2.933897810218975, -18.866306569343063, -6.798715328467158, 15.201284671532843, 16.201284671532843, 43.20128467153284, 4.268875912408762], 'terms': {}, 'xlevels': {}}
When an R expression evaluates to NULL in R, a None is returned. Note that this includes the R expression c():

    >>> r.Eval('NULL') is None
    True
    >>> r.Eval('c()') is None
    True

However, the usual R rules about how NULL is handled by R still apply. For example, R removes NULL elements from R vectors. This can yield results that may be unexpected by Python developers:

    >>> r.Eval('c(1, 2)')
    [1, 2]
    >>> r.Eval('c(1, NULL)')
    1
    >>> r.Eval('c(1, NULL, NULL)')
    1
    >>> r.Eval('c(1, NULL, NULL, 2)')
    [1, 2]
    >>> r.Eval('c(NULL, NULL, NULL, NULL)') is None
    True

But R does not remove NULL from R lists, and it will be translated to None:

    >>> r.Eval('list(NULL)')
    [None]
    >>> r.Eval('list(NULL, NULL, NULL)')
    [None, None, None]
    >>> r.Eval('list(a=NULL, b=NULL, c=NULL)')
    {'a': None, 'b': None, 'c': None}

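The interaction between NULL removal and the length-based translation rules can be hard to keep in mind. The following pure-Python sketch models the documented outcomes for plain vectors and unnamed lists. The names NULL, c, r_list, and to_python are hypothetical and exist only for this illustration; this is a model of the results shown above, not the code RWorkerProcess runs.

```python
NULL = object()   # Hypothetical sentinel standing in for R's NULL


def c(*args):
    """Model of R's c() for atomic arguments: NULL arguments are dropped."""
    kept = [a for a in args if a is not NULL]
    return kept if kept else NULL     # Nothing left over means NULL


def to_python(r_vector):
    """Apply the documented rules to a modeled R vector (a list or NULL)."""
    if r_vector is NULL:
        return None                   # NULL is returned as None
    if len(r_vector) == 1:
        return r_vector[0]            # Length 1 becomes an atomic value
    return list(r_vector)             # Length 0, or 2 and up, becomes a list


def r_list(*args):
    """Model of an unnamed R list(): NULL elements are kept as None."""
    return [None if a is NULL else a for a in args]
```

For example, to_python(c(1, NULL)) gives 1 rather than [1], because the NULL is dropped first and the remaining length 1 vector is then collapsed to an atomic value, while r_list(NULL) gives [None].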
Getting and setting R variables from Python
You can get and set variables in the R interpreter through the dictionary interface of the RWorkerProcess instance:

    >>> r['my_variable'] = 42          # Set my_variable to 42 in the R interpreter
    >>> print(r['my_variable'])        # Get back the value of my_variable and print it
    42
    >>> print(list(r.keys()))          # Print a list of the variables defined in the R interpreter
    ['my_variable']
    >>> del r['my_variable']           # Delete my_variable from the R interpreter

Python types will be automatically translated to and from R types as described above.
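Because RWorkerProcess derives from MutableMapping, the other standard mapping methods such as get(), pop(), and update() should also be available. The sketch below shows how MutableMapping supplies those methods once five core methods are defined. InMemoryWorker is a hypothetical stand-in that keeps variables in a Python dict rather than in R, so it runs without R installed; how RWorkerProcess implements the core methods internally is not shown here.

```python
from collections.abc import MutableMapping


class InMemoryWorker(MutableMapping):
    """Hypothetical stand-in: stores variables in a dict instead of in R."""

    def __init__(self):
        self._variables = {}

    def __getitem__(self, name):
        return self._variables[name]

    def __setitem__(self, name, value):
        self._variables[name] = value

    def __delitem__(self, name):
        del self._variables[name]

    def __iter__(self):
        return iter(self._variables)

    def __len__(self):
        return len(self._variables)


r = InMemoryWorker()
r['my_variable'] = 42
r.update({'a': 1, 'b': 2})    # Mixin method, built on __setitem__
removed = r.pop('a')          # Mixin method, built on __getitem__ and __delitem__
```

With a real RWorkerProcess, each of these operations presumably results in one or more calls into R, so bulk operations may be slower than with an ordinary dict.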
Unexpected behaviors
Because of differences between R and Python and the imperfections of JSON and feather as data marshaling formats, there are some unexpected behaviors, including:
In an R double vector, any value that happens to be an integer is returned to Python as an int:

    >>> r.Eval('typeof(1.0)')
    'double'
    >>> type(r.Eval('1.0'))
    <class 'int'>
    >>> r.Eval('typeof(c(1,2,3.3))')
    'double'
    >>> [type(x) for x in r.Eval('c(1,2,3.3)')]
    [<class 'int'>, <class 'int'>, <class 'float'>]

If you set an R variable to a Python list that has a length of 1 and then get it back from R, it will no longer be a list:

    >>> r['x'] = [1]
    >>> r['x']
    1

This is because in R, atomic values are actually stored as length 1 vectors, while Python distinguishes between the two. When returning a length 1 vector to Python, we can’t determine if it would be best represented as an atomic value (e.g. int) or as a list with a single value in it. We judged that an atomic value would be appropriate more of the time, and lacking any way to determine otherwise, we designed RWorkerProcess to always translate length 1 vectors into atomic values.
R complex is not supported (because JSON does not support complex numbers) and is returned as Python str:

    >>> r.Eval('c(1+2i, 3-5i, 6)')
    ['1+2i', '3-5i', '6+0i']

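If these behaviors matter to your code, you can normalize values after they come back. The helpers below are a minimal sketch of one way to do that; as_floats, as_list, and parse_r_complex are hypothetical names that are not part of GeoEco, and they operate only on the plain Python values shown in the examples above.

```python
import math


def as_floats(value):
    """Coerce ints that came back from an R double vector to float.

    None (R's NA) becomes NaN; lists are converted element by element.
    """
    if isinstance(value, list):
        return [as_floats(v) for v in value]
    if value is None:
        return math.nan
    return float(value)


def as_list(value):
    """Undo the length 1 collapse: wrap anything that is not a list."""
    return value if isinstance(value, list) else [value]


def parse_r_complex(value):
    """Parse strings such as '1+2i' into Python complex numbers."""
    if isinstance(value, list):
        return [parse_r_complex(v) for v in value]
    if value is None:
        return None
    if value.endswith('i'):
        value = value[:-1] + 'j'      # Python writes the imaginary unit as j
    return complex(value)
```

Note that as_list() also wraps None, so a NULL result becomes [None]; check for None first if you need to tell the two apart.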
Character encoding
Data are exchanged with R in UTF-8:

    >>> r.Eval('"Café, résumé, naïve, jalapeño"')
    'Café, résumé, naïve, jalapeño'
    >>> r.Eval('"Python 🐍 is awesome! 你好! Привет!"')
    'Python 🐍 is awesome! 你好! Привет!'

Logging and error handling
Messages written by R to R’s stdout pipe, e.g. with the R cat() function, are logged to the Python GeoEco.R logger as INFO messages. Messages written by R to its stderr pipe, e.g. with the R message() function, are logged to the GeoEco.R logger as WARNING messages.

    >>> from GeoEco.Logging import Logger
    >>> Logger.Initialize()
    >>> from GeoEco.R import RWorkerProcess
    >>> r = RWorkerProcess()
    >>> x = r.Eval('print(pi)')
    2025-02-05 16:19:09.213 INFO [1] 3.141593
    >>> r.Eval('cat("Hello, world!\n")')
    2025-02-05 16:19:56.232 INFO Hello, world!
    >>> r.Eval('message("Something might be wrong")')
    2025-02-05 16:20:19.721 WARNING Something might be wrong

If an error is signaled in R and not caught before the signal propagates back up to the plumber API, it is sent back to Python and RuntimeError will be raised:

    >>> r.Eval('stop("There is a problem!")')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/jason/Development/MGET/src/GeoEco/R/_RWorkerProcess.py", line 994, in Eval
        return(self._ProcessResponse(resp, parseReturnValue=True))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/jason/Development/MGET/src/GeoEco/R/_RWorkerProcess.py", line 745, in _ProcessResponse
        raise RuntimeError(f'From R: {respJSON["message"]}')
    RuntimeError: From R: Error in eval(parsedExpr, envir = clientEnv, enclos = baseenv()): There is a problem!

You can get a detailed view of the exchange of data between Python and R by turning on DEBUG logging for the GeoEco.R logger, either programmatically as shown below or by configuring GeoEco’s logging configuration file (see GeoEco.Logging.Logger.Initialize()).

    >>> from GeoEco.Logging import Logger
    >>> Logger.Initialize()
    >>> import logging
    >>> logging.getLogger('GeoEco.R').setLevel(logging.DEBUG)
    >>> from GeoEco.R import RWorkerProcess
    >>> r = RWorkerProcess()
    >>> r['x'] = [1,2,3,4,5]
    2025-02-05 16:55:14.946 DEBUG R: SET: x <- length 5 integer:
    2025-02-05 16:55:14.946 DEBUG R: [1] 1 2 3 4 5
    >>> r.Eval('x*2')
    2025-02-05 16:55:31.475 DEBUG R: EVAL: x*2
    2025-02-05 16:55:31.475 DEBUG R: RESULT: length 5 numeric:
    2025-02-05 16:55:31.475 DEBUG R: [1] 2 4 6 8 10
    [2, 4, 6, 8, 10]
    >>> df = r.Eval('iris')
    2025-02-05 17:00:02.296 DEBUG R: EVAL: iris
    2025-02-05 17:00:02.296 DEBUG R: RESULT: data.frame with 150 rows, 5 columns

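Since R output arrives through the standard Python logging system, you can also capture it programmatically with an ordinary logging handler. The sketch below attaches a hypothetical ListHandler to the GeoEco.R logger named above. To keep the example runnable without R, the log records are emitted by hand and merely stand in for what an R session would produce.

```python
import logging


class ListHandler(logging.Handler):
    """Collects (level name, message) pairs in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


logger = logging.getLogger('GeoEco.R')   # Logger name taken from the text above
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)

# Simulated records, standing in for messages relayed from R
logger.info('[1] 3.141593')
logger.warning('Something might be wrong')
logger.debug('R: EVAL: x*2')             # Dropped: below the INFO threshold
```

Raising the logger's level to logging.WARNING would keep only the messages that R wrote to stderr.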
Security
As noted, communication between Python and R occurs over HTTP over TCP/IP. This raises the possibility of a malicious party exploiting the communication channel. To mitigate this, R listens on the loopback interface (IPv4 address 127.0.0.1), which is only accessible to processes running on the local machine, and uses a randomly-selected TCP port. If a local process does discover the port (e.g. via scanning all local ports) and tries to invoke the REST APIs exposed by R, to succeed it must guess a 512-bit randomly-generated token, which is extremely improbable. However, a malicious local process could still mount a denial of service attack on the R interface by flooding it with bogus requests. Because R is single-threaded, such an attack might starve Python of the opportunity to place its own calls. It would also maximize utilization of one processor.
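The three mitigations named above (loopback-only binding, a randomly selected port, and a 512-bit token) can each be expressed with the Python standard library. The sketch below is an illustration of the ideas only. The function names are hypothetical, and the text does not say how RWorkerProcess actually selects its port, generates its token, or compares tokens.

```python
import secrets
import socket


def pick_loopback_port():
    """Ask the operating system for an unused TCP port on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(('127.0.0.1', 0))      # Port 0 means "any free port"
        return s.getsockname()[1]


def make_token(bits=512):
    """Generate a random token with the given number of bits, hex encoded."""
    return secrets.token_hex(bits // 8)


def token_matches(expected, presented):
    """Compare tokens in constant time to avoid leaking timing information."""
    return secrets.compare_digest(expected, presented)
```

A 512-bit token has 2**512 possible values, which is why guessing it is described as extremely improbable. Note that a port found this way could in principle be taken by another process before the server binds to it.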
Constructor
Requires: Python pandas module, Python pyarrow module, Python requests module, Python tzlocal module.
- Parameters:
rInstallDir (str, optional) –
On Windows: the path to the directory where R is installed, if you do not want R’s installation directory to be discovered automatically. You can determine the installation directory from within R by executing the function R.home(). If this parameter is not provided, the installation directory will be located automatically. Three methods will be tried, in this order:
1) If the R_HOME environment variable has been set, it will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of R_HOME or a FileNotFoundError exception will be raised.
2) Otherwise (R_HOME has not been set), the Registry will be checked, starting with the HKEY_CURRENT_USER\Software\R-core key and falling back to HKEY_LOCAL_MACHINE\Software\R-core only if the former does not exist. For whichever exists, the value of R64\InstallPath will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of that directory or a FileNotFoundError exception will be raised.
3) Otherwise (neither of those registry keys exists), the PATH environment variable will be checked for the program Rscript.exe. If it does not exist, a FileNotFoundError exception will be raised.
On other operating systems: this parameter is ignored, and R’s executables are expected to be available via the PATH environment variable.
Minimum length: 1. Must exist.
rLibDir (str, optional) –
Path to the R library directory where R packages should be stored. When a package is needed, it will be loaded from this directory if it exists there, and downloaded there if it does not exist. If not provided, R’s default will be used. See the R documentation for details.
You should provide a custom directory if you want MGET to maintain its own set of R packages, rather than those you use when running R yourself. For example, when running MGET, you may want to use only packages that have been released to CRAN, while when running R yourself, you may want to use newer or experimental versions that you obtained elsewhere.
Minimum length: 1.
rRepository (str, optional) – R repository to use when downloading packages. If not provided, https://cloud.r-project.org will be used. Minimum length: 1.
rPackages (list of str, optional) –
List of R packages to ensure are installed. For each package that is provided, MGET will check whether it is installed. If it is not, MGET will install it. If it is, MGET will do nothing. To update already-installed packages, use the updateRPackages parameter.
MGET does not automatically “load” the packages given here. If you need to load them, make sure the expressions you evaluate include a call to library(), or another suitable function.
updateRPackages (bool, optional) –
If True, the R update.packages() function will be called automatically when R starts up, to update all R packages to their latest versions. If False, the default, this will not be done, and once a package has been installed, it will remain at that version until it is updated via some other mechanism.
Use this option to ensure your R package library is automatically kept up to date. It is set to False by default to prevent MGET from updating your already-installed packages without your explicit permission. However, even if this option is set to False, MGET will still automatically install any packages that it needs that are missing.
winBinaryOnly (bool, optional) –
If True, the default, then when running on Windows options(pkgType = "win.binary") will be invoked when R starts up, before any packages are installed or updated. This will ensure that only binary packages available from the package repository will be installed, and block the installation of packages that are only available as source code.
When a package is only available as source, R must compile it locally, usually using RTools and related R utilities. We found that sometimes these locally-compiled packages end up being incompatible with other packages, possibly due to conflicts over common libraries that they both need. Package repositories help avoid this problem by compiling all packages against the same common libraries. We found that by restricting our Windows versions of R to only binary packages, the conflicts were avoided.
If you want to go ahead and allow MGET to install source-only packages, set this parameter to False. This can sometimes provide access to the very latest versions of packages that have not yet been compiled by the package repository.
This parameter is ignored on platforms other than Windows (e.g. Linux). On those platforms, R’s default settings are used, which usually allow source-only packages to be installed.
port (int, optional) – TCP port to use for communicating with R via the R plumber package. If not specified, an unused port will be selected automatically. Minimum value: 1.
timeout (float, optional) –
Maximum amount of time, in seconds, that a call into R is allowed to take to start responding when getting, setting, or deleting variable values. If this time elapses without the R worker process beginning to send its response, an error will be raised. In general, a very short value such as 5 seconds is appropriate here. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.
Warning
If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.
Must be greater than 0.0.
startupTimeout (float, optional) –
Maximum amount of time, in seconds, that R is allowed to take to initialize itself and begin servicing requests. This time is usually only a second or two, but can be longer if the machine is busy. Because of this, the default is set to 15 seconds. If the timeout elapses without the R process indicating that it is ready, an error will be raised. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.
If packages must be installed or updated, as usually occurs the first time you use MGET to interact with R, the delay is automatically extended to allow package installation to complete.
Warning
If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.
Must be greater than 0.0.
defaultTZ (str, optional) –
Name of the time zone to use when 1) setting R variables from time-zone naive datetime instances, and 2) returning datetime instances from R. The time zone names are those from the IANA Time Zone Database. At the time of this writing, many of the names were conveniently listed in Wikipedia.
Setting R variables using naive datetime instances
When a datetime instance is sent to R, it is converted to an R POSIXct object, which represents time as the number of seconds since the UNIX epoch, which is defined as 1970-01-01 00:00:00 UTC. Because of this, MGET needs to know which time zone the datetime instance is in so that it can be converted to UTC for R.
If a datetime instance has a time zone defined (meaning that its tzinfo attribute is not None), then MGET will apply that time zone when computing UTC times to send to R. But if it does not have a time zone defined, it is known as a “naive” datetime. In this case, this default time zone parameter (defaultTZ) determines the time zone to use, as follows:
- If defaultTZ is None (the default), MGET will assume that naive datetime instances are in the local time zone, consistent with how many of the Python datetime methods treat naive instances. MGET will then look up the local time zone using the Python tzlocal package and apply it when computing UTC times to send to R.
- If defaultTZ is a string, a ZoneInfo will be instantiated from it and used instead. For example, if you want all naive datetime instances to be treated as UTC, provide 'UTC' for defaultTZ.
Getting datetime instances back from R
For consistency with the behavior described above, if defaultTZ is None (the default), MGET will look up the local time zone using the Python tzlocal package and convert all datetime instances to that time zone before returning them. The returned instances will have that time zone defined (they will not be naive).
If defaultTZ is a string, a ZoneInfo will be instantiated and used instead.
Minimum length: 1.
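The time zone rules for defaultTZ can be sketched with the standard library alone. In the example below, to_utc and from_utc are hypothetical helper names, and the local time zone is resolved with datetime.astimezone() rather than with the tzlocal package that MGET is described as using. Treat it as a model of the documented behavior under those assumptions, not as MGET's code.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(dt, default_tz=None):
    """Convert a datetime to UTC, resolving naive values as described above."""
    if dt.tzinfo is None:
        if default_tz is None:
            # astimezone() on a naive datetime assumes local time. This
            # sketch uses it in place of the tzlocal package.
            dt = dt.astimezone()
        else:
            dt = dt.replace(tzinfo=ZoneInfo(default_tz))
    # Aware datetimes keep their own zone; default_tz is not consulted
    return dt.astimezone(timezone.utc)


def from_utc(dt_utc, default_tz=None):
    """Convert a UTC datetime to the zone used for values returned to Python."""
    if default_tz is None:
        return dt_utc.astimezone()    # Local zone, as a fixed-offset tzinfo
    return dt_utc.astimezone(ZoneInfo(default_tz))


naive = datetime(2025, 2, 5, 12, 13, 47)
sent = to_utc(naive, 'America/Los_Angeles')       # Naive value read as Pacific time
returned = from_utc(sent, 'America/New_York')     # Same instant, Eastern time
```

Here the naive value 12:13:47 is interpreted as Pacific time, becomes 20:13:47 UTC, and comes back as 15:13:47 Eastern time, matching the three-hour difference between the two Sys.time() examples earlier in this page.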
- Returns:
RWorkerProcess instance.
- Return type:
RWorkerProcess
Methods
Eval() – Evaluate an R expression and return the result.
Start R, evaluate one or more R expressions, stop R, and optionally return the result of the last expression.
Start() – Start the R worker process.
Stop() – Stop the R worker process.
pop() – Remove the specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() – Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.
update() – If E present and has a .keys() method, does: for k in E.keys(): D[k] = E[k]. If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.