GeoEco.R.RWorkerProcess

class GeoEco.R.RWorkerProcess(rInstallDir=None, rLibDir=None, rRepository='https://cloud.r-project.org', rPackages=None, updateRPackages=False, port=None, timeout=5.0, startupTimeout=15.0, defaultTZ=None)

Bases: MutableMapping

Starts and manages an R child process and provides methods for interacting with it.

Similar to the rpy2 package, RWorkerProcess starts the R interpreter and provides mechanisms for Python code to get and set R variables and evaluate R expressions. RWorkerProcess is not as fully-featured as rpy2 and has several important differences in how it is implemented:

  1. RWorkerProcess hosts the R interpreter in a child process (using the Rscript program), while rpy2 hosts it within the same process as the Python interpreter. RWorkerProcess is therefore less likely to encounter “DLL Hell” conflicts, in which Python and R try to load different versions of the same shared library, which can cause the process to crash. However, RWorkerProcess is slower than rpy2, because interactions with R have to occur via interprocess communication. RWorkerProcess implements this with the R plumber package, which allows R functions to be exposed as HTTP endpoints. This mechanism is also less secure than that used by rpy2; see Security below.

  2. RWorkerProcess does not allow as full a range of data types to be exchanged between Python and R as rpy2. With RWorkerProcess, the communication between Python and R uses JSON for exchanging basic types and Apache feather for exchanging data frames. These choices simplified implementation but placed some limitations on what can be exchanged. Most notably, Python numpy arrays cannot be translated to R matrices (although support for this could be added in the future). By contrast, rpy2 calls R’s C API directly and has implemented translation code for more data types, including numpy arrays to R matrices.

  3. RWorkerProcess does not need to be compiled against a specific version of R, and can therefore work with any version of R that you have installed, while rpy2 must be recompiled for the R version you have, whenever you change it.

  4. RWorkerProcess supports Microsoft Windows, while rpy2 historically has lacked a Windows maintainer. While it can be possible to get rpy2 working on Windows, there are usually no binary distributions (Python wheels) for Windows on the Python Package Index. For Conda users, who generally include users of ArcGIS, there is a release of rpy2 on conda-forge, but it can be out of date by a year or more and may not be compatible with recent R versions. To work around this, Windows users can try to build rpy2 from source, but installing the correct compiler and required libraries can be challenging and time-consuming.

If rpy2 works for you, we recommend you continue to use it. But if not, or some of the issues mentioned above affect you, RWorkerProcess could provide an effective alternative.

Using RWorkerProcess

RWorkerProcess represents the child R process. When you instantiate RWorkerProcess, nothing happens at first. The child process is started automatically when you start using the RWorkerProcess instance to interact with R. We recommend you use the with statement to automatically control the child process’s lifetime:

from GeoEco.R import RWorkerProcess
with RWorkerProcess() as r:
    ...
    x = r.Eval('1+1')       # Worker process started here, at the first use of the RWorkerProcess instance
    ...
print(x)                    # Worker process stopped before this line is executed, after the block above exits

This will start the child process when it is first needed and automatically stop it when the with block is exited, even if an exception is raised.

If desired, you can call Start() to start the child process manually and Stop() to stop it. If you do, we recommend a try/finally block so that the process is always stopped:

r = RWorkerProcess()
r.Start()                   # Worker process started here
try:
    ...
finally:
    r.Stop()                # Worker process stopped here

Regardless of which style you use, if the R child process is still running when the Python process exits, the operating system will stop the child process, even if Python dies without exiting properly.

Warning

RWorkerProcess must install the R plumber package the first time it interacts with R, unless the package is already installed. Plumber depends on a number of R packages. Installing plumber and its dependencies may take several minutes on Windows. On Linux, where R packages typically must be compiled from C source code, it can take 20 minutes or more. After this has been done once, it will not need to be done again unless you uninstall plumber.

Evaluating R expressions from Python

Eval() accepts a string representing an R expression, passes it to the R interpreter for evaluation, and returns the result, translating R types into suitable Python types. You can supply multiple expressions in a single call, separated by newline characters or semicolons. The value of the last expression will be returned:

>>> from GeoEco.R import RWorkerProcess
>>> r = RWorkerProcess()
>>> r.Eval('x <- 6; y <- 7; x * y')
42

A variety of R types can be translated into Python types. The rules of translation are governed by the serialization formats used to marshal data between Python and R. For most types, JSON is used as the serialization format, with the requests package handling it on the Python side and plumber on the R side. In general, R vectors, lists, and data frames are supported, as follows:

  • R vectors of length 1, sometimes known as atomic values, with the type logical, integer, double, or character are returned as Python bool, int, float, and str, respectively:

    >>> r.Eval('TRUE')
    True
    >>> r.Eval('123')
    123
    >>> r.Eval('pi')
    3.141592653589793
    >>> r.Eval('"Hello, world"')
    'Hello, world'
    

    Those atomic types are also returned even if you use R’s c() function to create a length 1 vector. (It does not matter how you construct it; if the vector has length 1, the atomic types are returned.)

    >>> r.Eval('c(TRUE)')
    True
    >>> r.Eval('c(123)')
    123
    >>> r.Eval('c(pi)')
    3.141592653589793
    >>> r.Eval('c("Hello, world")')
    'Hello, world'
    
  • R vectors of length 2 or more are returned as a Python list:

    >>> r.Eval('c(1,2,3)')
    [1, 2, 3]
    
  • R unnamed lists are also returned as a list. In this case, a list of length 1 is not returned as an atomic type, but as a list with one item:

    >>> r.Eval('list(1)')
    [1]
    >>> r.Eval('list(1,2,3)')
    [1, 2, 3]
    >>> r.Eval('list(c(1, 2, 3))')
    [[1, 2, 3]]
    >>> r.Eval('list(c(1,2,3), c("A", "B", "C"))')
    [[1, 2, 3], ['A', 'B', 'C']]
    
  • R vectors and lists of length 0 are returned as an empty list:

    >>> r.Eval('logical(0)')
    []
    >>> r.Eval('integer(0)')
    []
    >>> r.Eval('numeric(0)')
    []
    >>> r.Eval('character(0)')
    []
    >>> r.Eval('list()')
    []
    
  • R named lists are returned as a Python dict:

    >>> r.Eval('list(a=1, b=2, c=3)')
    {'a': 1, 'b': 2, 'c': 3}
    >>> r.Eval('list(a=c(1,2,3), b=4, c=c("A", "B", "C"))')
    {'a': [1, 2, 3], 'b': 4, 'c': ['A', 'B', 'C']}
    
  • R vectors of POSIXt (i.e. POSIXct or POSIXlt) are returned as Python datetime instances:

    >>> r.Eval('Sys.time()')
    datetime.datetime(2025, 2, 5, 15, 13, 47, 641000, tzinfo=zoneinfo.ZoneInfo(key='America/New_York'))
    

    Time values obtained from R will have millisecond precision, even if R itself has higher precision. The millisecond limitation results from the format used by the R plumber package to represent times in JSON.

    The defaultTZ parameter of the RWorkerProcess constructor determines the time zone that all POSIXt objects will be converted to when they are returned to Python. By default, it is the time zone of the Python process, as returned by get_localzone() from the tzlocal package. To specify a different timezone, provide it to the RWorkerProcess constructor:

    >>> r = RWorkerProcess(defaultTZ='America/Los_Angeles')
    >>> r.Eval('Sys.time()')
    datetime.datetime(2025, 2, 5, 12, 13, 47, 641000, tzinfo=zoneinfo.ZoneInfo(key='America/Los_Angeles'))
    

    See the documentation for defaultTZ for more information.

  • R NA is returned as a Python None:

    >>> r.Eval('NA') is None
    True
    >>> r.Eval('c(1, 2, NA, 3)')
    [1, 2, None, 3]
    
  • R data frames are returned as Python pandas DataFrames:

    >>> df = r.Eval('iris')
    >>> df.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 150 entries, 0 to 149
    Data columns (total 5 columns):
     #   Column        Non-Null Count  Dtype
    ---  ------        --------------  -----
     0   Sepal.Length  150 non-null    float64
     1   Sepal.Width   150 non-null    float64
     2   Petal.Length  150 non-null    float64
     3   Petal.Width   150 non-null    float64
     4   Species       150 non-null    category
    dtypes: category(1), float64(4)
    memory usage: 5.1 KB
    >>> df.head()
       Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
    0           5.1          3.5           1.4          0.2  setosa
    1           4.9          3.0           1.4          0.2  setosa
    2           4.7          3.2           1.3          0.2  setosa
    3           4.6          3.1           1.5          0.2  setosa
    4           5.0          3.6           1.4          0.2  setosa
    
  • Arbitrary R objects not covered above are usually converted to an R list with R’s unclass() and then returned as Python dicts:

    >>> model = r.Eval('lm(dist ~ speed, data = cars)')
    >>> from pprint import pprint
    >>> pprint(model, width=150, compact=True)
    {'assign': [0, 1],
     'call': {},
     'coefficients': [-17.579094890510934, 3.932408759124087],
     'df.residual': 48,
     'effects': [-303.9144945539781, 145.55225504575705, -8.115439504379111, 9.884560495620892, 0.194114676507422, -9.49633114260605, -5.186776961719519,
                 2.8132230382804804, 10.81322303828048, -9.87722278083299, 1.1227772191670096, -16.56766859994646, -10.56766859994646, -6.56766859994646,
                 -2.5676685999464604, -8.25811441905993, -0.2581144190599315, -0.2581144190599315, 11.74188558094007, -11.948560238173402,
                 -1.948560238173402, 22.0514397618266, 42.05143976182659, -21.63900605728687, -15.639006057286872, 12.360993942713128,
                 -13.329451876400343, -5.329451876400342, -17.019897695513812, -9.019897695513812, 0.9801023044861885, -10.710343514627283,
                 3.2896564853727175, 23.289656485372713, 31.289656485372713, -20.400789333740754, -10.400789333740754, 11.599210666259246,
                 -28.091235152854225, -12.091235152854225, -8.091235152854225, -4.091235152854224, 3.908764847145776, -1.4721267910811655,
                 -17.162572610194637, -4.853018429308115, 17.146981570691885, 18.146981570691885, 45.146981570691885, 6.456535751578421],
     'fitted.values': [-1.8494598540146354, -1.8494598540145883, 9.94776642335767, 9.947766423357667, 13.880175182481754, 17.81258394160584,
                       21.74499270072993, 21.74499270072993, 21.74499270072993, 25.677401459854018, 25.677401459854018, 29.609810218978105,
                       29.6098102189781, 29.609810218978105, 29.609810218978105, 33.54221897810219, 33.54221897810219, 33.54221897810219,
                       33.54221897810219, 37.47462773722628, 37.47462773722628, 37.47462773722627, 37.474627737226285, 41.40703649635036,
                       41.407036496350365, 41.407036496350365, 45.33944525547445, 45.33944525547445, 49.27185401459854, 49.27185401459854,
                       49.27185401459854, 53.204262773722625, 53.204262773722625, 53.20426277372263, 53.20426277372263, 57.13667153284671,
                       57.13667153284671, 57.13667153284671, 61.0690802919708, 61.0690802919708, 61.0690802919708, 61.0690802919708, 61.0690802919708,
                       68.93389781021898, 72.86630656934307, 76.79871532846715, 76.79871532846715, 76.79871532846715, 76.79871532846715,
                       80.73112408759124],
     'model': [{'dist': 2, 'speed': 4}, {'dist': 10, 'speed': 4}, {'dist': 4, 'speed': 7}, {'dist': 22, 'speed': 7}, {'dist': 16, 'speed': 8},
               {'dist': 10, 'speed': 9}, {'dist': 18, 'speed': 10}, {'dist': 26, 'speed': 10}, {'dist': 34, 'speed': 10}, {'dist': 17, 'speed': 11},
               {'dist': 28, 'speed': 11}, {'dist': 14, 'speed': 12}, {'dist': 20, 'speed': 12}, {'dist': 24, 'speed': 12}, {'dist': 28, 'speed': 12},
               {'dist': 26, 'speed': 13}, {'dist': 34, 'speed': 13}, {'dist': 34, 'speed': 13}, {'dist': 46, 'speed': 13}, {'dist': 26, 'speed': 14},
               {'dist': 36, 'speed': 14}, {'dist': 60, 'speed': 14}, {'dist': 80, 'speed': 14}, {'dist': 20, 'speed': 15}, {'dist': 26, 'speed': 15},
               {'dist': 54, 'speed': 15}, {'dist': 32, 'speed': 16}, {'dist': 40, 'speed': 16}, {'dist': 32, 'speed': 17}, {'dist': 40, 'speed': 17},
               {'dist': 50, 'speed': 17}, {'dist': 42, 'speed': 18}, {'dist': 56, 'speed': 18}, {'dist': 76, 'speed': 18}, {'dist': 84, 'speed': 18},
               {'dist': 36, 'speed': 19}, {'dist': 46, 'speed': 19}, {'dist': 68, 'speed': 19}, {'dist': 32, 'speed': 20}, {'dist': 48, 'speed': 20},
               {'dist': 52, 'speed': 20}, {'dist': 56, 'speed': 20}, {'dist': 64, 'speed': 20}, {'dist': 66, 'speed': 22}, {'dist': 54, 'speed': 23},
               {'dist': 70, 'speed': 24}, {'dist': 92, 'speed': 24}, {'dist': 93, 'speed': 24}, {'dist': 120, 'speed': 24}, {'dist': 85, 'speed': 25}],
     'qr': {'pivot': [1, 2],
            'qr': [[-7.0710678118654755, -108.8944443027283], [0.1414213562373095, 37.0135110466435], [0.1414213562373095, 0.18878369792756214],
                   [0.1414213562373095, 0.18878369792756214], [0.1414213562373095, 0.16176653657964718], [0.1414213562373095, 0.13474937523173222],
                   [0.1414213562373095, 0.10773221388381726], [0.1414213562373095, 0.10773221388381726], [0.1414213562373095, 0.10773221388381726],
                   [0.1414213562373095, 0.0807150525359023], [0.1414213562373095, 0.0807150525359023], [0.1414213562373095, 0.05369789118798735],
                   [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.05369789118798735], [0.1414213562373095, 0.05369789118798735],
                   [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, 0.026680729840072397],
                   [0.1414213562373095, 0.026680729840072397], [0.1414213562373095, -0.00033643150784255907],
                   [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.00033643150784255907],
                   [0.1414213562373095, -0.00033643150784255907], [0.1414213562373095, -0.027353592855757516],
                   [0.1414213562373095, -0.027353592855757516], [0.1414213562373095, -0.027353592855757516], [0.1414213562373095, -0.05437075420367247],
                   [0.1414213562373095, -0.05437075420367247], [0.1414213562373095, -0.08138791555158742], [0.1414213562373095, -0.08138791555158742],
                   [0.1414213562373095, -0.08138791555158742], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.10840507689950238],
                   [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.10840507689950238], [0.1414213562373095, -0.13542223824741734],
                   [0.1414213562373095, -0.13542223824741734], [0.1414213562373095, -0.13542223824741734], [0.1414213562373095, -0.1624393995953323],
                   [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.1624393995953323],
                   [0.1414213562373095, -0.1624393995953323], [0.1414213562373095, -0.2164737222911622], [0.1414213562373095, -0.24349088363907717],
                   [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.27050804498699216],
                   [0.1414213562373095, -0.27050804498699216], [0.1414213562373095, -0.2975252063349071]],
            'qraux': [1.1414213562373094, 1.269835181971307],
            'rank': 2,
            'tol': 1e-07},
     'rank': 2,
     'residuals': [3.8494598540146354, 11.849459854014588, -5.94776642335767, 12.052233576642333, 2.119824817518246, -7.812583941605841,
                   -3.744992700729929, 4.255007299270071, 12.255007299270071, -8.677401459854016, 2.3225985401459837, -15.609810218978105,
                   -9.609810218978101, -5.609810218978103, -1.609810218978103, -7.54221897810219, 0.4577810218978093, 0.4577810218978093,
                   12.45778102189781, -11.474627737226276, -1.474627737226278, 22.525372262773725, 42.525372262773715, -21.40703649635036,
                   -15.407036496350365, 12.592963503649635, -13.339445255474452, -5.339445255474452, -17.27185401459854, -9.271854014598537,
                   0.7281459854014627, -11.204262773722625, 2.795737226277375, 22.79573722627737, 30.79573722627737, -21.136671532846712,
                   -11.136671532846712, 10.863328467153288, -29.0690802919708, -13.0690802919708, -9.0690802919708, -5.0690802919708, 2.9309197080292,
                   -2.933897810218975, -18.866306569343063, -6.798715328467158, 15.201284671532843, 16.201284671532843, 43.20128467153284,
                   4.268875912408762],
     'terms': {},
     'xlevels': {}}
    
  • When an R expression evaluates to NULL in R, a None is returned. Note that this includes the R expression c():

    >>> r.Eval('NULL') is None
    True
    >>> r.Eval('c()') is None
    True
    

    However, R’s usual rules for handling NULL still apply. For example, R removes NULL elements from vectors. This can yield results that Python developers may not expect:

    >>> r.Eval('c(1, 2)')
    [1, 2]
    >>> r.Eval('c(1, NULL)')
    1
    >>> r.Eval('c(1, NULL, NULL)')
    1
    >>> r.Eval('c(1, NULL, NULL, 2)')
    [1, 2]
    >>> r.Eval('c(NULL, NULL, NULL, NULL)') is None
    True
    

    But R does not remove NULL from R lists, and it will be translated to None:

    >>> r.Eval('list(NULL)')
    [None]
    >>> r.Eval('list(NULL, NULL, NULL)')
    [None, None, None]
    >>> r.Eval('list(a=NULL, b=NULL, c=NULL)')
    {'a': None, 'b': None, 'c': None}
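
Because the length of an R vector decides whether Eval() returns a scalar, a list, or None, calling code that always expects a list may want to normalize the result. The helper below is a minimal sketch and is not part of GeoEco; the name as_list is hypothetical, and it operates on plain Python values so it can be tried without R:

```python
def as_list(value):
    """Normalize a value returned by Eval() to a Python list.

    Hypothetical helper, not part of GeoEco. It follows the rules described
    above: NULL arrives as None, a length 1 vector arrives as a scalar, and
    longer vectors and unnamed lists arrive as a list.
    """
    if value is None:
        return []            # R NULL, e.g. the result of c()
    if isinstance(value, list):
        return value         # Already a list; return it unchanged
    return [value]           # Length 1 vector that was collapsed to a scalar


print(as_list(None))         # []
print(as_list(1))            # [1]
print(as_list([1, 2]))       # [1, 2]
```

A lone NA also arrives as None, so this helper cannot distinguish a length 1 NA vector from NULL. A dict from a named list is wrapped in a one-item list rather than unpacked.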
    

Getting and setting R variables from Python

You can get and set variables in the R interpreter through the dictionary interface of the RWorkerProcess instance:

>>> r['my_variable'] = 42     # Set my_variable to 42 in the R interpreter
>>> print(r['my_variable'])   # Get back the value of my_variable and print it
42
>>> print(list(r.keys()))     # Print a list of the variables defined in the R interpreter
['my_variable']
>>> del r['my_variable']      # Delete my_variable from the R interpreter

Python types will be automatically translated to and from R types as described above.
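
Since variables are set and deleted with ordinary item assignment and del, scratch variables can be cleaned up with a small context manager. The following is a sketch under that assumption; temporary_variables is a hypothetical name, not a GeoEco function, and a plain dict stands in for the RWorkerProcess instance so the example runs without R:

```python
from contextlib import contextmanager


@contextmanager
def temporary_variables(r, **values):
    """Set variables on a mapping-like object and delete them on exit.

    Hypothetical helper, not part of GeoEco. `r` is anything that supports
    item assignment and deletion, such as an RWorkerProcess instance.
    Assumes the with block does not delete the variables itself.
    """
    assigned = []
    try:
        for name, value in values.items():
            r[name] = value
            assigned.append(name)     # Remember only what was actually set
        yield r
    finally:
        for name in assigned:
            del r[name]


# A plain dict stands in for RWorkerProcess here.
fake_r = {}
with temporary_variables(fake_r, x=6, y=7) as r:
    print(sorted(r.keys()))           # ['x', 'y']
print(fake_r)                         # {}
```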

Unexpected behaviors

Because of differences between R and Python and the imperfections of JSON and feather as data marshaling formats, there are some unexpected behaviors, including:

  • In an R double vector, any value that happens to be an integer is returned to Python as an int:

    >>> r.Eval('typeof(1.0)')
    'double'
    >>> type(r.Eval('1.0'))
    <class 'int'>
    >>> r.Eval('typeof(c(1,2,3.3))')
    'double'
    >>> [type(x) for x in r.Eval('c(1,2,3.3)')]
    [<class 'int'>, <class 'int'>, <class 'float'>]
    
  • If you set an R variable to a Python list that has a length of 1 and then get it back from R, it will no longer be a list:

    >>> r['x'] = [1]
    >>> r['x']
    1
    

    This is because in R, atomic values are actually stored as length 1 vectors, while Python distinguishes between the two. When returning a length 1 vector to Python, we can’t determine if it would be best represented as an atomic value (e.g. int) or as a list with a single value in it. We judged that an atomic value would be appropriate more of the time, and lacking any way to determine otherwise, we designed RWorkerProcess to always translate length 1 vectors into atomic values.

  • R complex is not supported (because JSON does not support complex numbers) and is returned as Python str:

    >>> r.Eval('c(1+2i, 3-5i, 6)')
    ['1+2i', '3-5i', '6+0i']
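
If your code needs floats or complex numbers regardless of how the values were returned, you can coerce them after the fact. The helpers below are a minimal sketch with hypothetical names (as_float, parse_r_complex) and are not part of GeoEco. They assume complex values arrive as strings ending in "i", as in the example above:

```python
def as_float(value):
    """Coerce a numeric value from R to float, mapping None (R NA) to NaN."""
    return float('nan') if value is None else float(value)


def parse_r_complex(text):
    """Parse a string such as '1+2i' into a Python complex number."""
    text = text.strip()
    if text.endswith('i'):
        text = text[:-1] + 'j'    # Python writes the imaginary unit as j
    return complex(text)


print([as_float(x) for x in [1, 2, 3.3]])
# [1.0, 2.0, 3.3]
print([parse_r_complex(s) for s in ['1+2i', '3-5i', '6+0i']])
# [(1+2j), (3-5j), (6+0j)]
```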
    

Character encoding

Data are exchanged with R in UTF-8:

>>> r.Eval('"Café, résumé, naïve, jalapeño"')
'Café, résumé, naïve, jalapeño'
>>> r.Eval('"Python 🐍 is awesome! 你好! Привет!"')
'Python 🐍 is awesome! 你好! Привет!'
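
To see why UTF-8 preserves such strings, the sketch below round-trips non-ASCII text through JSON encoded as UTF-8 bytes using only the Python standard library. It illustrates the encoding only; it is not GeoEco's request code, and the 'value' key is made up for the example:

```python
import json

text = 'Café 🐍 你好 Привет'

# Serialize to JSON and encode as UTF-8 bytes, as a client would do
# before sending a request body.
payload = json.dumps({'value': text}, ensure_ascii=False).encode('utf-8')

# Decode the bytes and parse the JSON, as the receiver would do.
received = json.loads(payload.decode('utf-8'))['value']

print(received == text)   # True
```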

Logging and error handling

Messages written by R to its stdout pipe, e.g. with the R cat() function, are logged to the Python GeoEco.R logger as INFO messages. Messages written by R to its stderr pipe, e.g. with the R message() function, are logged to the GeoEco.R logger as WARNING messages.

>>> from GeoEco.Logging import Logger
>>> Logger.Initialize()
>>> from GeoEco.R import RWorkerProcess
>>> r = RWorkerProcess()
>>> x = r.Eval('print(pi)')
2025-02-05 16:19:09.213 INFO [1] 3.141593
>>> r.Eval('cat("Hello, world!\n")')
2025-02-05 16:19:56.232 INFO Hello, world!
>>> r.Eval('message("Something might be wrong")')
2025-02-05 16:20:19.721 WARNING Something might be wrong

If an error is signaled in R and not caught before the signal propagates back up to the plumber API, it is sent back to Python and RuntimeError will be raised:

>>> r.Eval('stop("There is a problem!")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jason/Development/MGET/src/GeoEco/R/_RWorkerProcess.py", line 994, in Eval
    return(self._ProcessResponse(resp, parseReturnValue=True))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jason/Development/MGET/src/GeoEco/R/_RWorkerProcess.py", line 745, in _ProcessResponse
    raise RuntimeError(f'From R: {respJSON["message"]}')
RuntimeError: From R: Error in eval(parsedExpr, envir = clientEnv, enclos = baseenv()): There is a problem!
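
Because R errors surface as RuntimeError, you can handle them with an ordinary try/except block. The sketch below shows one possible pattern; eval_or_default is a hypothetical helper, not part of GeoEco, and FakeR is a stand-in class so the example runs without R:

```python
import logging

logger = logging.getLogger(__name__)


def eval_or_default(r, expression, default=None):
    """Evaluate an R expression, returning `default` if R signals an error.

    Hypothetical helper, not part of GeoEco. It relies only on the behavior
    described above: Eval() raises RuntimeError when R reports an error.
    """
    try:
        return r.Eval(expression)
    except RuntimeError as error:
        logger.warning('R expression %r failed: %s', expression, error)
        return default


class FakeR:
    """Stand-in for RWorkerProcess so the sketch runs without R."""

    def Eval(self, expression):
        if 'stop(' in expression:
            raise RuntimeError('From R: There is a problem!')
        return 42


print(eval_or_default(FakeR(), 'x * y'))                         # 42
print(eval_or_default(FakeR(), 'stop("There is a problem!")'))   # None
```

Returning a default hides the failure from the caller, so this pattern suits optional computations better than required ones.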

You can get a detailed view of the exchange of data between Python and R by turning on DEBUG logging for the GeoEco.R logger, either programmatically as shown below or by configuring GeoEco’s logging configuration file (see GeoEco.Logging.Logger.Initialize()).

>>> from GeoEco.Logging import Logger
>>> Logger.Initialize()
>>> import logging
>>> logging.getLogger('GeoEco.R').setLevel(logging.DEBUG)
>>> from GeoEco.R import RWorkerProcess
>>> r = RWorkerProcess()
>>> r['x'] = [1,2,3,4,5]
2025-02-05 16:55:14.946 DEBUG R: SET: x <- length 5 integer:
2025-02-05 16:55:14.946 DEBUG R:   [1] 1 2 3 4 5
>>> r.Eval('x*2')
2025-02-05 16:55:31.475 DEBUG R: EVAL: x*2
2025-02-05 16:55:31.475 DEBUG R: RESULT: length 5 numeric:
2025-02-05 16:55:31.475 DEBUG R:           [1]  2  4  6  8 10
[2, 4, 6, 8, 10]
>>> df = r.Eval('iris')
2025-02-05 17:00:02.296 DEBUG R: EVAL: iris
2025-02-05 17:00:02.296 DEBUG R: RESULT: data.frame with 150 rows, 5 columns
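
DEBUG output can be verbose, so you may prefer to enable it only around the calls you are investigating. The context manager below is a sketch that uses only the standard logging module; logger_level is a hypothetical name, and the only thing taken from the text above is the logger name 'GeoEco.R':

```python
import logging
from contextlib import contextmanager


@contextmanager
def logger_level(name, level):
    """Temporarily set the level of a named logger, restoring it on exit."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(previous)


with logger_level('GeoEco.R', logging.DEBUG) as log:
    print(log.level == logging.DEBUG)     # True
    # Calls such as r.Eval(...) placed here would be logged at DEBUG level.
print(logging.getLogger('GeoEco.R').level == logging.DEBUG)
```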

Security

As noted, communication between Python and R occurs over HTTP over TCP/IP. This raises the possibility of a malicious party exploiting the communication channel. To mitigate this, R listens on the loopback interface (IPv4 address 127.0.0.1), which is only accessible to processes running on the local machine, and uses a randomly-selected TCP port. If a local process does discover the port (e.g. via scanning all local ports) and tries to invoke the REST APIs exposed by R, to succeed it must guess a 512-bit randomly-generated token, which is extremely improbable. However, a malicious local process could still mount a denial of service attack on the R interface by flooding it with bogus requests. Because R is single-threaded, such an attack might starve Python of the opportunity to place its own calls. It would also maximize utilization of one processor.
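
The ingredients of this mitigation can be sketched with the Python standard library: asking the operating system for an unused loopback port, generating a 512-bit random token, and comparing tokens in constant time. This is an illustration of the general technique only. It is not GeoEco's implementation, and all function names below are hypothetical:

```python
import hmac
import secrets
import socket


def pick_loopback_port():
    """Ask the operating system for an unused TCP port on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(('127.0.0.1', 0))       # Port 0 means "any free port"
        return sock.getsockname()[1]


def new_token():
    """Return a random 512-bit token as 128 hexadecimal characters."""
    return secrets.token_hex(64)          # 64 bytes = 512 bits


def token_matches(expected, supplied):
    """Compare two tokens in constant time."""
    return hmac.compare_digest(expected, supplied)


port = pick_loopback_port()
token = new_token()
print(1 <= port <= 65535)                 # True
print(len(token))                         # 128
print(token_matches(token, token))        # True
print(token_matches(token, 'guess'))      # False
```

In this sketch the port is released before anything listens on it, so another process could claim it in the meantime; a real server would need to handle that case.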

Constructor

Requires: Python pandas module, Python pyarrow module, Python requests module, Python tzlocal module.

Parameters:
  • rInstallDir (str, optional) –

    On Windows: the path to the directory where R is installed, if you do not want R’s installation directory to be discovered automatically. You can determine the installation directory from within R by executing the function R.home(). If this parameter is not provided, the installation directory will be located automatically. Three methods will be tried, in this order:

    1. If the R_HOME environment variable has been set, it will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of R_HOME or a FileNotFoundError exception will be raised.

    2. Otherwise (R_HOME has not been set), the Registry will be checked, starting with the HKEY_CURRENT_USER\Software\R-core key and falling back to HKEY_LOCAL_MACHINE\Software\R-core only if the former does not exist. For whichever exists, the value of R64\InstallPath will be used. The program Rscript.exe must exist in the bin\x64 subdirectory of that directory or a FileNotFoundError exception will be raised.

    3. Otherwise (neither of those registry keys exists), the PATH environment variable will be checked for the program Rscript.exe. If it does not exist, a FileNotFoundError exception will be raised.

    On other operating systems: this parameter is ignored, and R’s executables are expected to be available via the PATH environment variable.

    Minimum length꞉ 1. Must exist.

  • rLibDir (str, optional) –

    Path to the R library directory where R packages should be stored. When a package is needed, it will be loaded from this directory if it exists there, and downloaded there if it does not. If not provided, R’s default will be used. See the R documentation for details.

    You should provide a custom directory if you want MGET to maintain its own set of R packages, rather than those you use when running R yourself. For example, when running MGET, you may want to use only packages that have been released to CRAN, while when running R yourself, you may want to use newer or experimental versions that you obtained elsewhere.

    Minimum length꞉ 1.

  • rRepository (str, optional) – R repository to use when downloading packages. If not provided, https://cloud.r-project.org will be used. Minimum length꞉ 1.

  • rPackages (list of str, optional) –

    List of R packages to ensure are installed. For each package that is provided, MGET will check whether it is installed. If it is not, MGET will install it. If it is, MGET will do nothing. To update already-installed packages, use the updateRPackages parameter.

    MGET does not automatically “load” the packages given here. If you need to load them, make sure your R expressions include a call to library(), or another suitable function.

  • updateRPackages (bool, optional) –

    If True, the R update.packages() function will be called automatically when R starts up, to update all R packages to their latest versions. If False, the default, this will not be done, and once a package has been installed, it will remain at that version until it is updated via some other mechanism.

    Use this option to ensure your R package library is automatically kept up to date. It is set to False by default to prevent MGET from updating your already-installed packages without your explicit permission. However, even if this option is set to False, MGET will still automatically install any packages that it needs that are missing.

  • port (int, optional) – TCP port to use for communicating with R via the R plumber package. If not specified, an unused port will be selected automatically. Minimum value꞉ 1.

  • timeout (float, optional) –

    Maximum amount of time, in seconds, that a call into R is allowed to take to start responding when getting, setting, or deleting variable values. If this time elapses without the R worker process beginning to send its response, an error will be raised. In general, a very short value such as 5 seconds is appropriate here. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.

    Warning

    If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.

    Must be greater than 0.0.

  • startupTimeout (float, optional) –

    Maximum amount of time, in seconds, that R is allowed to take to initialize itself and begin servicing requests. This time is usually only a second or two, but can be longer if the machine is busy. Because of this, the default is set to 15 seconds. If the timeout elapses without the R process indicating that it is ready, an error will be raised. To allow an infinite amount of time, provide None from Python or delete all text from this text box in the ArcGIS user interface.

    If packages must be installed or updated, as usually occurs the first time you use MGET to interact with R, the delay is automatically extended to allow package installation to complete.

    Warning

    If you allow an infinite amount of time and R never responds, your program will be blocked forever. Use caution.

    Must be greater than 0.0.

  • defaultTZ (str, optional) –

    Name of the time zone to use when 1) setting R variables from time-zone naive datetime instances, and 2) returning datetime instances from R. The time zone names are those from the IANA Time Zone Database. At the time of this writing, many of the names were conveniently listed in Wikipedia.

    Setting R variables using naive datetime instances

    When a datetime instance is sent to R, it is converted to an R POSIXct object, which represents time as the number of seconds since the UNIX epoch, which is defined as 1970-01-01 00:00:00 UTC. Because of this, MGET needs to know which time zone the datetime instance is in so that it can be converted to UTC for R.

    If a datetime instance has a time zone defined (meaning that its tzinfo attribute is not None), then MGET will apply that time zone when computing UTC times to send to R. But if it does not have a time zone defined, it is known as a “naive” datetime. In this case, this default time zone parameter (defaultTZ) determines the time zone to use, as follows:

    If defaultTZ is None (the default), MGET will assume that naive datetime instances are in the local time zone, consistent with how many of the Python datetime methods treat naive instances. MGET will then look up the local time zone using the Python tzlocal package and apply it when computing UTC times to send to R.

    If defaultTZ is a string, a ZoneInfo will be instantiated from it and used instead. For example, if you want all naive datetime instances to be treated as UTC, provide 'UTC' for defaultTZ.

    Getting datetime instances back from R

    For consistency with the behavior described above, if defaultTZ is None (the default), MGET will look up the local time zone using the Python tzlocal package and convert all datetime instances to that time zone before returning them. The returned instances will have that time zone defined (they will not be naive).

    If defaultTZ is a string, a ZoneInfo will be instantiated and used instead.

    Minimum length꞉ 1.
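
The time zone handling described for defaultTZ can be illustrated with the standard datetime and zoneinfo modules. The sketch below is not GeoEco's code; to_utc and from_utc are hypothetical names, and it assumes Python 3.9 or later with IANA time zone data available on the system:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(value, default_tz):
    """Convert a datetime to UTC, treating naive values as `default_tz`."""
    if value.tzinfo is None:
        value = value.replace(tzinfo=ZoneInfo(default_tz))
    return value.astimezone(timezone.utc)


def from_utc(value, default_tz):
    """Convert an aware datetime to the `default_tz` time zone."""
    return value.astimezone(ZoneInfo(default_tz))


naive = datetime(2025, 2, 5, 15, 13, 47)
utc = to_utc(naive, 'America/New_York')
print(utc.isoformat())
# 2025-02-05T20:13:47+00:00
print(from_utc(utc, 'America/Los_Angeles').isoformat())
# 2025-02-05T12:13:47-08:00
```

A datetime that already has a time zone is converted directly, and the default is consulted only for naive values, which mirrors the rule described above.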

Returns:

RWorkerProcess instance.

Return type:

RWorkerProcess

Methods

Eval

Evaluate an R expression and return the result.

ExecuteRAndEvaluateExpressions

Start R, evaluate one or more R expressions, stop R, and optionally return the result of the last expression.

Start

Start the R worker process.

Stop

Stop the R worker process.

clear

get

items

keys

pop

Remove the specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.

popitem

Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault

update

Update D from mapping/iterable E and F. If E is present and has a .keys() method, does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.

values