GeoEco.Datasets.Grid

class GeoEco.Datasets.Grid(parentCollection=None, queryableAttributes=None, queryableAttributeValues=None, lazyPropertyValues=None)

Bases: Dataset

Base class for classes representing gridded Datasets with 2, 3, or 4 dimensions.

Grid provides a generic wrapper around gridded and array data, allowing GeoEco components to access them through a common interface that returns and accepts numpy.ndarray. Grid is a base class that should not be instantiated directly; instead, users should instantiate one of the many derived classes representing the type of grid they’re interested in.

Grid was developed in the 2000s for GeoEco’s internal use and predates more recent projects such as Xarray that may provide similar functionality with additional features or a more polished interface. When GeoEco was ported to Python 3, we decided to expose Grid and other classes in GeoEco.Datasets in case they were useful, but we encourage Python users needing to access multidimensional arrays to consider Xarray and similar projects that have greater adoption and a robust, well-funded developer community that supports their maintenance.

Dimensions

Grid is specifically designed to represent the most common types of gridded datasets used in marine spatial ecology. As such, Grid supports up to four dimensions: x (easting), y (northing), z (depth or altitude), and t (time). x and y may be angular coordinates (e.g. degrees) or linear coordinates (e.g. meters) depending on whether the Grid uses a geographic or projected coordinate system. You can retrieve the coordinate system with GetSpatialReference().

Dimensions returns the dimensions as a string. Currently four combinations of dimensions are supported:

  • yx - static 2D datasets, such as bathymetry rasters.

  • tyx - dynamic 3D datasets, such as a time series of sea surface temperature images.

  • zyx - static 3D datasets, where each z slice represents the values at a certain depth or altitude, such as a stack of rasters representing a climatology of ocean temperature across a range of depths.

  • tzyx - dynamic 4D datasets, such as a time series of ocean temperature data available across a range of depths, as output by a physical ocean model like HYCOM.

Dimensions and other attributes and methods of Grid always return dimensions in the orders shown above. If the underlying dataset stores them in a different order, Grid automatically reorders them into those orders.
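
The reordering described above amounts to an axis permutation. The following is a minimal sketch of that idea, not GeoEco's actual implementation; axis_permutation and CANONICAL_ORDERS are hypothetical names introduced only for illustration. Given the order in which a dataset stores its dimensions, it finds the matching supported order and the tuple of axes that would rearrange the data into it.

```python
# Hypothetical helper, not part of GeoEco: illustrates dimension reordering.
CANONICAL_ORDERS = ('yx', 'zyx', 'tyx', 'tzyx')

def axis_permutation(stored_dims):
    """Return (canonical_dims, axes) for a stored dimension string.

    axes[i] is the position in stored_dims of the i-th canonical dimension,
    which is the form expected by numpy.transpose(array, axes).
    """
    for canonical in CANONICAL_ORDERS:
        # Two orders describe the same grid if they use the same letters.
        if sorted(canonical) == sorted(stored_dims):
            axes = tuple(stored_dims.index(d) for d in canonical)
            return canonical, axes
    raise ValueError('Unsupported combination of dimensions: %r' % stored_dims)

canonical, axes = axis_permutation('xyt')
print(canonical, axes)
```

The resulting tuple could be passed to numpy.transpose to rearrange an array stored as xyt into tyx.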

The spatial reference of the Grid determines the units of x and y. TIncrementUnit gives the unit of t. Currently, Grid does not make the unit of z available.

Coordinates

Grid supports both datasets with constant coordinate increments and datasets with varying coordinate increments. CoordIncrements gives the constant increment for each dimension, or None if the increment is variable. For example, a global bathymetry grid with a geographic coordinate system and 0.1 degree resolution would have these properties:

>>> bathyGrid.GetSpatialReference('proj4')
'+proj=longlat +datum=WGS84 +no_defs'
>>> sr = bathyGrid.GetSpatialReference('obj')
>>> bool(sr.IsGeographic())
True
>>> sr.GetAngularUnitsName()
'Degree'
>>> bathyGrid.Dimensions
'yx'
>>> bathyGrid.CoordIncrements
(0.1, 0.1)

Variably-incrementing coordinates can optionally also be dependent on other coordinates, which means that the coordinate values vary based on the values of the depended-upon coordinates. CoordDependencies gives the dependencies for each coordinate as a string, or None if the coordinate does not depend on other coordinates.

Datasets with coordinate dependencies are rare. The most complex example may be ROMS ocean models that have a fixed number of depth layers (e.g. 40) but the depth of each layer depends both on time and on geographic location. The time dependency accounts for the temporally-varying height of the sea surface. Close to shore, where the maximum depth is small, the spacing between each depth layer is also small, boosting resolution in these dynamic nearshore locations. Far from shore, where the maximum depth is large and it is not as important to have high resolution, the depth layers are spaced farther apart.

As an example of a dataset with complex coordinate dependencies, the circa-2010 ROMS-CoSiNE model produced by the University of Maine Ocean Modeling Group had these properties:

>>> romsGrid.GetSpatialReference('proj4')
'+proj=merc +lon_0=180 +k=1 +x_0=0 +y_0=0 +R=6371000 +units=m +no_defs'
>>> sr = romsGrid.GetSpatialReference('obj')
>>> bool(sr.IsGeographic())
False
>>> sr.GetLinearUnitsName()    # These names are determined by PROJ and GDAL
'metre'
>>> romsGrid.Dimensions
'tzyx'
>>> romsGrid.CoordIncrements
(73.0, None, 13899.36583056984, 13899.36583056984)
>>> romsGrid.CoordDependencies
(None, 'tyx', None, None)
>>> romsGrid.TIncrementUnit
'hour'

This 4D grid used a Mercator coordinate system with a cell size of approximately 13.9 km and a temporal resolution of 73 hours. The x, y, and t coordinates used a constant increment but the z coordinate was variably-incrementing and depended on the values of the other three coordinates.
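
To make the 'tyx' dependency of z more concrete, here is a deliberately simplified sketch. It assumes the water column is divided into equally thick layers, which real ROMS configurations do not do (they use stretching functions), and layer_center_depths is a hypothetical function, not part of GeoEco or ROMS. It shows why the depth of a given layer changes with location (through the bottom depth) and with time (through the sea surface height).

```python
# Simplified illustration only: equal-thickness layers, not the ROMS formula.
def layer_center_depths(bottom_depth, surface_height, layer_count):
    """Return the center depth of each layer, in meters below mean sea level.

    bottom_depth   -- depth of the sea floor below mean sea level (varies with y, x)
    surface_height -- height of the sea surface above mean sea level (varies with t, y, x)
    layer_count    -- fixed number of layers, the same everywhere
    """
    thickness = (bottom_depth + surface_height) / layer_count
    # The surface sits at depth -surface_height; layers stack downward from it.
    return [-surface_height + (k + 0.5) * thickness for k in range(layer_count)]

# Same number of layers, very different spacing near shore and offshore.
print(layer_center_depths(40.0, 0.0, 4))     # shallow, nearshore cell
print(layer_center_depths(4000.0, 0.0, 4))   # deep, offshore cell
print(layer_center_depths(38.0, 2.0, 4))     # same kind of cell at a time of raised sea surface
```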

Semi-regular t coordinates

For many grids that have a constantly-incrementing t coordinate, each time slice is always exactly as long as the t coordinate’s increment. For example, grids with increments of 6 hours, 1 day, 2 months, or 1 year exhibit this property. We refer to these grids as having regular t coordinates.

It is not necessary for each slice of a regular t coordinate to span the same amount of absolute time: one 2-month slice may span more or fewer days than another one, depending on which months are included in the slice. For a t coordinate to be considered regular, it is only necessary that the length of every slice be the same multiple of the TIncrementUnit. (E.g. every 2-month slice is still 2 months long, regardless of which months it includes.)

For some datasets with constantly-incrementing t coordinates, this characteristic does not hold because the data producers want the first time slice of every year to start on January 1, but the year cannot always be divided into equal-length slices. For example, NASA GSFC Ocean Color L3 includes products that are 8-day averages. Because the year spans 365 days in some years and 366 in others and neither is evenly divisible by 8, it is not possible for every slice of a given year to span exactly 8 days. Instead, the last slice of each year always spans however many days are left. For example, for NASA’s 8-day products, the 46th slice of each year spans days 361-365 on non-leap years, and days 361-366 on leap years.

We refer to these datasets as having semi-regular t coordinates. For semi-regular grids, TSemiRegularity will be 'annual' and TCountPerSemiRegularPeriod will be the number of time slices per year, e.g. 46 for NASA’s 8-day products. For regular grids, both TSemiRegularity and TCountPerSemiRegularPeriod will be None.
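
The slicing rule for an annually semi-regular t coordinate can be sketched with the standard library alone. This is an illustration of the rule described above, not GeoEco code; semi_regular_slices is a hypothetical name, and the 8-day default simply mirrors the NASA example.

```python
from datetime import datetime, timedelta

# Hypothetical helper, not part of GeoEco: illustrates annual semi-regularity.
def semi_regular_slices(year, days_per_slice=8):
    """Return a list of (start, end) datetimes for each time slice of a year.

    Every slice spans days_per_slice days except the last, which is
    truncated so that the next slice can start on January 1.
    """
    start_of_next_year = datetime(year + 1, 1, 1)
    slices = []
    start = datetime(year, 1, 1)
    while start < start_of_next_year:
        end = min(start + timedelta(days=days_per_slice), start_of_next_year)
        slices.append((start, end))
        start = end
    return slices

for year in (2023, 2024):
    slices = semi_regular_slices(year)
    last_start, last_end = slices[-1]
    print(year, len(slices), (last_end - last_start).days)
```

For both a non-leap year and a leap year the loop yields 46 slices, which corresponds to the value of TCountPerSemiRegularPeriod given above; only the length of the final slice differs.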

Getting the length of each dimension

Shape returns a tuple giving the length of each dimension, in the same order as Dimensions. Continuing the global bathymetry example above:

>>> bathyGrid.Dimensions
'yx'
>>> bathyGrid.Shape
(1800, 3600)

Getting coordinate values

To allow retrieval of coordinate values, Grid exposes three properties, MinCoords, MaxCoords, and CenterCoords. Each is an immutable sequence-like object that supports []-style indexing. Each accepts a 1-character dimension name and an int or range and returns a single coordinate or numpy.ndarray of coordinates, respectively. Continuing the global bathymetry example above:

>>> bathyGrid.MinCoords['x', 0]      # int index returns a float
-180.0
>>> bathyGrid.MinCoords['x', -1]
179.90000000000003
>>> bathyGrid.MinCoords['x', :3]     # range index returns a numpy.ndarray
array([-180. , -179.9, -179.8])
>>> bathyGrid.MinCoords['x', -3:]
array([179.7, 179.8, 179.9])

If only the dimension name is provided, all of the coordinates are returned:

>>> bathyGrid.MinCoords['x']
array([-180. , -179.9, -179.8, ...,  179.7,  179.8,  179.9])

MinCoords and MaxCoords return the coordinates of the edges of each cell; e.g. for the x coordinate, these are the left and right edges, respectively. CenterCoords returns the coordinates of the centers:

>>> bathyGrid.MinCoords['x']
array([-180. , -179.9, -179.8, ...,  179.7,  179.8,  179.9])
>>> bathyGrid.MaxCoords['x']
array([-179.9, -179.8, -179.7, ...,  179.8,  179.9,  180. ])
>>> bathyGrid.CenterCoords['x']
array([-179.95, -179.85, -179.75, ...,  179.75,  179.85,  179.95])

The coordinates of the right side of one cell are the same as the left side of the adjacent cell:

>>> all(bathyGrid.MaxCoords['x', :-1] == bathyGrid.MinCoords['x', 1:])
True
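
For a dimension with a constant increment, the relationship between the three sets of coordinates can be sketched as follows. This is a pure-Python illustration, not how Grid computes its coordinates; cell_coords is a hypothetical function, and the origin, increment, and count are taken from the bathymetry example.

```python
# Hypothetical helper, not part of GeoEco: constant-increment coordinates.
def cell_coords(origin, increment, count):
    """Return (mins, centers, maxs) lists for count cells starting at origin."""
    mins = [origin + i * increment for i in range(count)]
    centers = [origin + (i + 0.5) * increment for i in range(count)]
    maxs = [origin + (i + 1) * increment for i in range(count)]
    return mins, centers, maxs

mins, centers, maxs = cell_coords(-180.0, 0.1, 3600)

# The maximum edge of each cell coincides with the minimum edge of the next.
print(maxs[:-1] == mins[1:])
```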

Requires: Python numpy module.

Parameters:
  • parentCollection (DatasetCollection, optional) – Parent DatasetCollection that this object is part of (if any).

  • queryableAttributes (tuple of QueryableAttribute, optional) – Queryable attributes defined for this object.

  • queryableAttributeValues (dict mapping str to object, optional) – Values of the queryable attributes, expressed as a dictionary mapping the case-insensitive names of queryable attributes to their values.

  • lazyPropertyValues (dict mapping str to object, optional) – Lazy properties to set when this object is constructed, expressed as a dictionary mapping the names of lazy properties to their values.

Returns:

Grid instance.

Return type:

Grid

Properties

property CenterCoords

(object) Coordinates of the grid cell centers, indexed using the 1-character dimension of interest and optionally a range to retrieve a numpy.ndarray of coordinates (e.g. CenterCoords['x', 0:4]) or an integer to retrieve a float for a single coordinate (e.g. CenterCoords['x', 10]). Coordinates for the t dimension are returned as datetime instances. Read only.

property CoordDependencies

(tuple of str) Same length as Dimensions. Dimensions that each dimension depends on for determining its coordinates. None for dimensions whose coordinates do not depend on other dimensions. Read only.

property CoordIncrements

(tuple of float) Same length as Dimensions. Coordinate increment for each dimension. None for dimensions that do not have a constant coordinate increment. Read only.

property Data

(object) This grid’s data, indexable using slices (e.g. grid.Data[:, 5:10, -10:]) or integers (e.g. grid.Data[0,1,-2]) or both in combination. Strides and negative indexes are supported in the traditional manner. If the grid is writable, Data can be assigned to write values to the grid, e.g. grid.Data[0,1] = 5 or grid.Data[:,:] = numpy.zeros(grid.Shape). Returns and accepts numpy.ndarray, float, and int. Read only.

property DataIsScaled

(bool) If True, the underlying raw data are stored as the UnscaledDataType to save storage space and then transformed by a scaling equation on the fly when they are returned by Data. The raw data can be accessed with UnscaledData. If False, the raw data are returned as is, with no transformation needed, and UnscaledDataType and DataType are the same, and UnscaledData returns the same values as Data. Read only.

property DataType

(str) Numeric data type of the grid, after the scaling function (if any) has been applied to the raw data. numpy.ndarrays returned by Data have this type. Read only. Allowed values: 'int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32', 'float32', 'float64'. Case sensitive.

property Dimensions

(str) Dimensions of this grid. Read only. Allowed values: 'yx', 'zyx', 'tyx', 'tzyx'. Case sensitive.

property DisplayName

(str) Informal name of this object, suitable to be displayed to the user. Read only. Minimum length: 1.

property MaxCoords

(object) Maximum coordinate value for each cell (i.e., the coordinates of the cells’ right edges), indexed using the 1-character dimension of interest and optionally a range to retrieve a numpy.ndarray of coordinates (e.g. MaxCoords['x', 0:4]) or an integer to retrieve a float for a single coordinate (e.g. MaxCoords['x', 10]). Coordinates for the t dimension are returned as datetime instances. Read only.

property MinCoords

(object) Minimum coordinate value for each cell (i.e., the coordinates of the cells’ left edges), indexed using the 1-character dimension of interest and optionally a range to retrieve a numpy.ndarray of coordinates (e.g. MinCoords['x', 0:4]) or an integer to retrieve a float for a single coordinate (e.g. MinCoords['x', 10]). Coordinates for the t dimension are returned as datetime instances. Read only.

property NoDataValue

(object or None) int or float value that indicates that cells of Data should be interpreted as having no data (these are also known as missing, NA, or NULL cells), or None if all cells must have data. Read only.

property ParentCollection

(DatasetCollection or None) Parent DatasetCollection that this object is part of (if any). Read only.

property Shape

(tuple of int) Same length as Dimensions. Length (number of grid cells) of each dimension. Read only.

property TCountPerSemiRegularPeriod

(int or None) Number of time slices per semi-regular period (i.e. per year). None if the grid’s dimensions do not contain a t coordinate or the t coordinate is not semi-regular. Read only.

property TIncrementUnit

(str or None) Unit of the t coordinate. None if the grid’s dimensions do not contain a t coordinate. Read only. Allowed values: 'year', 'month', 'day', 'hour', 'minute', 'second'. Case sensitive.

property TSemiRegularity

(str or None) Type of semi-regularity used for the t coordinate. None if the grid’s dimensions do not contain a t coordinate or the t coordinate is not semi-regular. Read only. Allowed values: 'annual'. Case sensitive.

property UnscaledData

(object) This grid’s underlying raw data, before it has been transformed by a scaling equation. UnscaledData is indexable using slices (e.g. grid.UnscaledData[:, 5:10, -10:]) or integers (e.g. grid.UnscaledData[0,1,-2]) or both in combination. Strides and negative indexes are supported in the traditional manner. If the grid is writable, UnscaledData can be assigned to write values to the grid, e.g. grid.UnscaledData[0,1] = 5 or grid.UnscaledData[:,:] = numpy.zeros(grid.Shape). Returns and accepts numpy.ndarray, float, and int. Read only.

property UnscaledDataType

(str) Numeric data type of the grid’s raw data, before it has been transformed by a scaling equation. numpy.ndarrays returned by UnscaledData have this type. If no transformation is needed (DataIsScaled is False), then UnscaledDataType and DataType are the same, and UnscaledData returns the same values as Data. Read only. Allowed values: 'int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32', 'float32', 'float64'. Case sensitive.

property UnscaledNoDataValue

(object or None) int or float value that indicates that cells of UnscaledData should be interpreted as having no data (these are also known as missing, NA, or NULL cells), or None if all cells must have data. Read only.
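
The relationship between UnscaledData and Data when DataIsScaled is True can be illustrated with a small sketch. The documentation above does not specify the scaling equation, so this example assumes a linear scale-and-offset transform of the kind commonly used to pack floating-point values into integers; unscale and all of its parameters are hypothetical names, not part of GeoEco.

```python
# Hypothetical illustration; assumes a linear scaling equation, which the
# Grid documentation does not specify.
def unscale(raw_values, scale_factor, add_offset, unscaled_no_data, no_data):
    """Convert packed raw values to scaled values.

    Cells equal to unscaled_no_data are mapped to no_data rather than
    being passed through the scaling equation.
    """
    return [
        no_data if v == unscaled_no_data else v * scale_factor + add_offset
        for v in raw_values
    ]

raw = [0, 100, -32768]            # e.g. values stored as int16
scaled = unscale(raw, scale_factor=0.5, add_offset=10.0,
                 unscaled_no_data=-32768, no_data=-9999.0)
print(scaled)
```

In this sketch the raw list plays the role of UnscaledData, the result plays the role of Data, and the two sentinel values play the roles of UnscaledNoDataValue and NoDataValue.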

Methods

Close

Closes any open files or connections associated with this object and releases any other resources allocated to access it.

ConvertSpatialReference

Converts a spatial reference from one format to another, such as an OGC WKT string to a Proj4 string.

DeleteLazyPropertyValue

Deletes the lazy property with the specified name.

GetAllQueryableAttributes

Returns a list of all queryable attributes.

GetIndicesForCoords

Given a tuple or list of coordinates, returns a list of int indices into Data for the cell that contains the coordinates.

GetLazyPropertyValue

Returns the value of the lazy property with the specified name.

GetQueryableAttribute

Returns the queryable attribute with the specified name.

GetQueryableAttributeValue

Returns the value of the queryable attribute with the specified name.

GetQueryableAttributesWithDataType

Returns a list of queryable attributes having the specified data type.

GetSpatialReference

Returns the spatial reference of this dataset.

HasLazyPropertyValue

Returns True if the specified lazy property has a value.

SetLazyPropertyValue

Sets the lazy property with the specified name to the specified value.

SetSpatialReference

Sets the spatial reference of this dataset.

TestCapability

Tests whether a capability is supported by this class or an instance of it.

numpy_equal_nan