GeoEco.Datasets.Collections.DatasetCollectionTree

class GeoEco.Datasets.Collections.DatasetCollectionTree(pathParsingExpressions=None, pathCreationExpressions=None, canSortByDate=True, parentCollection=None, queryableAttributes=None, queryableAttributeValues=None, lazyPropertyValues=None, cacheDirectory=None)

Bases: DatasetCollection

Base class representing DatasetCollections that are organized as hierarchical trees.

DatasetCollectionTree is a base class that should not be instantiated directly; instead, users should instantiate one of the many derived classes representing the type of dataset collection they’re interested in.

Parameters:
  • pathParsingExpressions (list of str, optional) –

    List of regular expressions used for finding datasets in the tree and parsing queryable attribute values from their paths. One expression per path level. Use Python Regular Expression Syntax.

    Queryable attributes are represented by “named groups” in the regular expressions. For example, if your collection is an ArcGIS geodatabase that contains feature classes and tables that you want to query by name, you could provide [r'(?P<TableName>.+)'] for this parameter. This defines a single path level (because the list has one element) containing a single queryable attribute (because there is one named group). The attribute is named TableName and must be at least one character long (because .+ means “one or more characters”). Then, for queryableAttributes, provide (QueryableAttribute('TableName', 'Table name', UnicodeStringTypeMetadata()),). Finally, when calling QueryDatasets(), use an expression like "TableName = 'Foo'". Minimum length: 1.

  • pathCreationExpressions (list of str, optional) – List of printf-style formatters used when importing datasets into this tree. Used to create destination path names from queryable attribute values. One formatter per path level. Minimum length: 1.

  • canSortByDate (bool, optional) –

    This parameter is primarily of interest to developers of derived classes. If True and queryableAttributes includes a QueryableAttribute with a dataType of DateTimeTypeMetadata, the GetOldestDataset() and GetNewestDataset() methods will assume they can call the derived class’s _QueryRecursive() method with a queryType of 'oldest' and 'newest', respectively, and the derived class will implement suitable optimizations to efficiently retrieve the oldest and newest datasets in this collection.

    If False, GetOldestDataset() and GetNewestDataset() will retrieve and examine relevant metadata from every dataset in this collection to determine which dataset is the oldest or the newest, respectively.

  • parentCollection (DatasetCollection, optional) – Parent DatasetCollection that this object is part of (if any).

  • queryableAttributes (tuple of QueryableAttribute, optional) – Queryable attributes defined for this object.

  • queryableAttributeValues (dict mapping str to object, optional) – Values of the queryable attributes, expressed as a dictionary mapping the case-insensitive names of queryable attributes to their values.

  • lazyPropertyValues (dict mapping str to object, optional) – Lazy properties to set when this object is constructed, expressed as a dictionary mapping the names of lazy properties to their values.

  • cacheDirectory (str, optional) – Directory for caching local copies of remote datasets. Minimum length: 1.
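The canSortByDate=False behavior described above amounts to a full scan over every dataset. The following is a minimal sketch of that idea using only the Python standard library, not GeoEco itself. The `datasets` list and the helpers `get_oldest` and `get_newest` are hypothetical stand-ins introduced for illustration; the real methods operate on Dataset objects and may be implemented differently.

```python
from datetime import datetime

# Hypothetical stand-ins for datasets: (name, value of a DateTime attribute).
datasets = [
    ('b', datetime(2019, 6, 1)),
    ('a', datetime(2018, 1, 1)),
    ('c', datetime(2021, 3, 15)),
]

def get_oldest(items):
    """Examine every item and return the one with the earliest date."""
    return min(items, key=lambda item: item[1])

def get_newest(items):
    """Examine every item and return the one with the latest date."""
    return max(items, key=lambda item: item[1])

oldest = get_oldest(datasets)
newest = get_newest(datasets)
```

A scan like this costs time proportional to the number of datasets, which is why a derived class that can find the oldest or newest dataset directly (for example, from a sorted directory listing) would set canSortByDate to True.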

Returns:

DatasetCollectionTree instance.

Return type:

DatasetCollectionTree

Properties

property CacheDirectory

(str or None) Directory for caching local copies of remote datasets. Minimum length: 1. If a cache directory is not provided, then after a remote dataset is downloaded it will be kept either only in memory or in a temporary directory on disk, depending on the type of data it is. The temporary directory will be automatically deleted when Close() is called.

If a cache directory is provided, remote datasets will be stored in it when they are downloaded. Before a download is attempted, the cache directory will be checked first for the relevant dataset, and if it is found, the download will be skipped, speeding up execution.

The datasets are organized in the cache directory in an undocumented format that is specific to the collection. Once a dataset is stored in the cache directory, it is never changed or deleted. If the original remote datasets are changed, these changes will not be detected and the cache will not be updated. If the disk fills up, cached datasets will not be automatically deleted to mitigate the problem.

If you determine that the cached datasets are obsolete or the disk is too full, delete the entire cache directory. You may also be able to delete a portion of it, if you can reverse engineer how datasets are stored within it, but the organizational structure is not documented.
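The check-the-cache-before-downloading behavior can be sketched as follows. This is an illustration under stated assumptions, not GeoEco's implementation: the function `fetch_with_cache`, the `fake_download` callback, the file name, and the flat one-file-per-dataset layout are all hypothetical, since the real cache layout is undocumented and specific to each collection.

```python
import os
import tempfile

def fetch_with_cache(name, cache_directory, download):
    """Return the local path of `name`, downloading only on a cache miss."""
    cached_path = os.path.join(cache_directory, name)
    if os.path.exists(cached_path):
        # Cache hit: the file is reused as-is and never refreshed.
        return cached_path
    os.makedirs(cache_directory, exist_ok=True)
    download(cached_path)
    return cached_path

download_calls = []

def fake_download(destination):
    """Hypothetical downloader that records each call and writes a small file."""
    download_calls.append(destination)
    with open(destination, 'w') as f:
        f.write('data')

with tempfile.TemporaryDirectory() as cache_dir:
    first = fetch_with_cache('sst_2020.nc', cache_dir, fake_download)
    second = fetch_with_cache('sst_2020.nc', cache_dir, fake_download)
    download_count = len(download_calls)
```

In this sketch the second request is served from disk, so the downloader runs only once. It also shows the trade-off noted above: because a hit is decided by existence alone, changes to the remote dataset are never detected.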

property DisplayName

(str) Informal name of this object, suitable to be displayed to the user. Read only. Minimum length: 1.

property ParentCollection

(DatasetCollection or None) Parent DatasetCollection that this object is part of (if any). Read only.

property PathCreationExpressions

(list of str or None) List of printf-style formatters used when importing datasets into this tree. Used to create destination path names from queryable attribute values. One formatter per path level. Read only. Minimum length: 1.
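As a minimal sketch of how printf-style formatters can turn queryable attribute values into path components, the following uses Python's built-in `%` string formatting with a mapping. The attribute names `Year` and `TableName`, the two-level layout, and the use of `%(name)s` mapping keys are assumptions made for illustration; consult a concrete derived class for the formatter syntax it actually expects.

```python
# Hypothetical queryable attribute values for one dataset being imported.
attribute_values = {'Year': 2020, 'TableName': 'Foo'}

# One printf-style formatter per path level (names are assumptions).
path_creation_expressions = ['%(Year)04i', '%(TableName)s']

# Format each level from the attribute values; unused keys are ignored.
components = [expr % attribute_values for expr in path_creation_expressions]
destination = '/'.join(components)
```

Each formatter draws only the attributes it names from the dictionary, so the same set of values can be reused across every path level.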

property PathParsingExpressions

(list of str or None) List of regular expressions used for finding datasets in the tree and parsing queryable attribute values from their paths. One expression per path level. Use Python Regular Expression Syntax.

Queryable attributes are represented by “named groups” in the regular expressions. For example, if your collection is an ArcGIS geodatabase that contains feature classes and tables that you want to query by name, you could provide [r'(?P<TableName>.+)'] for this parameter. This defines a single path level (because the list has one element) containing a single queryable attribute (because there is one named group). The attribute is named TableName and must be at least one character long (because .+ means “one or more characters”). Then, for queryableAttributes, provide (QueryableAttribute('TableName', 'Table name', UnicodeStringTypeMetadata()),). Finally, when calling QueryDatasets(), use an expression like "TableName = 'Foo'". Read only. Minimum length: 1.
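As a minimal sketch of how one expression per path level yields queryable attribute values, the following uses only Python's standard `re` module, not GeoEco itself. The two-level expressions, the `Year` attribute, the helper `parse_path`, and the choice of `re.fullmatch` are assumptions introduced for illustration; how DatasetCollectionTree applies the expressions internally may differ.

```python
import re

# Hypothetical two-level tree: a four-digit year level containing named tables.
path_parsing_expressions = [r'(?P<Year>\d{4})', r'(?P<TableName>.+)']

def parse_path(components, expressions):
    """Match each path component against the expression for its level.

    Returns a dict of named-group values, or None if any level fails to match.
    """
    if len(components) != len(expressions):
        return None
    values = {}
    for component, expression in zip(components, expressions):
        match = re.fullmatch(expression, component)
        if match is None:
            return None
        # Each named group becomes one queryable attribute value.
        values.update(match.groupdict())
    return values

matched = parse_path(['2020', 'Foo'], path_parsing_expressions)
rejected = parse_path(['20xx', 'Foo'], path_parsing_expressions)
```

Here `matched` holds one value per named group, while `rejected` is None because the first component is not four digits. Values parsed this way are strings; converting them to the data type declared in queryableAttributes is a separate step not shown here.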

Methods

Close

Closes any open files or connections associated with this object and releases any other resources allocated to access it.

DeleteLazyPropertyValue

Deletes the lazy property with the specified name.

GetAllQueryableAttributes

Returns a list of all queryable attributes.

GetLazyPropertyValue

Returns the value of the lazy property with the specified name.

GetNewestDataset

Queries the collection and returns the newest Dataset that matches the search expression.

GetOldestDataset

Queries the collection and returns the oldest Dataset that matches the search expression.

GetQueryableAttribute

Returns the queryable attribute with the specified name.

GetQueryableAttributeValue

Returns the value of the queryable attribute with the specified name.

GetQueryableAttributesWithDataType

Returns a list of queryable attributes having the specified data type.

HasLazyPropertyValue

Returns True if the specified lazy property has a value.

ImportDatasets

Copies each Dataset in a list into this DatasetCollection.

QueryDatasets

Queries the collection and returns a list of Datasets that match a search expression.

SetLazyPropertyValue

Sets the lazy property with the specified name to the specified value.

TestCapability

Tests whether a capability is supported by this class or an instance of it.