Network data model #24

tomalrussell · 2021-07-29T17:35:07Z

As in #21, define a spatial network graph G = {V,E} as a set of nodes V and edges E where each edge connects two nodes.

See the notes below for objects and attributes.
Some opinions on file formats and how to store metadata, as conventions to follow for ease of use:
- nodes and edges of a single network can be stored in one GeoPackage with layers named nodes and edges (a railway network could be in rail.gpkg).
- if nodes have associated polygon geometries, store them in a node_areas layer in the same GeoPackage, with node_id to link them (the GeoPackage modelling guidelines have related guidance)
- sources can be stored in a CSV or Excel sheet (taking care to have one table per sheet, consistent column naming)
- asset_type definitions can be captured in two tables: one for asset_types (name, title) and one table for asset_type_attributes, linked to asset type by asset_type column.
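
To illustrate the two-table idea, here is a minimal sketch that reads asset_types and asset_type_attributes and groups attributes under their asset type. It assumes the tables are kept as CSV (the issue does not fix a format for them), and the rows, the extra unit/dtype columns and the helper names `read_table` and `attributes_by_asset_type` are all made up for the example.

```python
import csv
import io

# Hypothetical example tables; only the asset_type link column and the
# name/title columns come from the conventions above.
ASSET_TYPES_CSV = """asset_type,title
rail_station,Railway station
rail_track,Railway track
"""

ASSET_TYPE_ATTRIBUTES_CSV = """asset_type,name,title,unit,dtype
rail_station,platforms,Number of platforms,count,integer
rail_track,gauge_mm,Track gauge,mm,integer
rail_track,electrified,Electrified,,boolean
"""


def read_table(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def attributes_by_asset_type(asset_types, attributes):
    """Group attribute names under the asset type they are linked to."""
    grouped = {row["asset_type"]: [] for row in asset_types}
    for attr in attributes:
        key = attr["asset_type"]
        if key not in grouped:
            raise ValueError(
                f"attribute {attr['name']!r} refers to unknown asset_type {key!r}"
            )
        grouped[key].append(attr["name"])
    return grouped


asset_types = read_table(ASSET_TYPES_CSV)
attributes = read_table(ASSET_TYPE_ATTRIBUTES_CSV)
grouped = attributes_by_asset_type(asset_types, attributes)
print(grouped)
```

Raising on an unknown asset_type is one possible choice; a project could equally collect such rows and report them together.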

A node:

has a Point geometry
has a node_id String/UUID which should be unique within an analysis project
has an asset_type String which groups it with other assets sharing custom attributes
may have a source String which refers to a third-party dataset
may have a source_id which links it to an entry in a third-party dataset
may have an associated area Polygon that represents a site, building, or service area
may have conventional attributes to support analysis:
- unit_cost: rehabilitation cost (units must be defined, and may be made consistent across a project, e.g. million USD in 2020)
- cost_per_km2: rehabilitation cost per unit area, if partial rehabilitation is possible
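
As a rough sketch of the node description above, the record could be written as a Python dataclass, with a helper that checks node_id uniqueness within a project. The class name `Node`, the helper `duplicate_node_ids`, the sample values, and the use of a plain (x, y) tuple for the Point geometry are assumptions for illustration, not part of the proposal.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Required fields
    node_id: str
    geometry: Tuple[float, float]  # Point as (x, y); assumption: no geometry library
    asset_type: str
    # Optional provenance fields
    source: Optional[str] = None
    source_id: Optional[str] = None
    # Optional conventional attributes (units must be defined per project)
    unit_cost: Optional[float] = None
    cost_per_km2: Optional[float] = None


def duplicate_node_ids(nodes: List[Node]) -> List[str]:
    """Return the sorted node_id values that occur more than once."""
    seen = set()
    duplicates = set()
    for node in nodes:
        if node.node_id in seen:
            duplicates.add(node.node_id)
        seen.add(node.node_id)
    return sorted(duplicates)


# Hypothetical sample data
nodes = [
    Node("rail_1", (36.82, -1.29), "rail_station", unit_cost=2.5),
    Node("rail_2", (39.66, -4.05), "rail_station"),
    Node("rail_1", (34.75, -0.10), "rail_station"),  # repeated id
]
print(duplicate_node_ids(nodes))
```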

An edge:

has a LineString geometry
has an edge_id
has from_id and to_id referring to its start and end nodes
may have a length_km length in kilometres (which might otherwise be calculated from the geometry)
may be directed (True/False), if flows on the graph should only be allowed to traverse the edge in one direction
may have conventional attributes to support analysis:
- unit_cost: rehabilitation cost if any damage is assumed to require full replacement
- cost_per_km: rehabilitation cost per length damaged if partial repairs are possible
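
A minimal sketch of how the edge fields might be used: filling in length_km from the geometry when it is missing, checking that from_id and to_id refer to known nodes, and building an adjacency mapping that respects directed. It assumes coordinates are (longitude, latitude) in degrees and uses the haversine formula with a mean Earth radius, which is only one way to get a length; the function names and sample edges are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance between two (lon, lat) points in degrees."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def line_length_km(coords):
    """Sum segment lengths along a LineString given as a list of points."""
    return sum(haversine_km(p, q) for p, q in zip(coords, coords[1:]))


def build_adjacency(node_ids, edges):
    """Map each node to the set of nodes reachable over one edge."""
    adjacency = {node_id: set() for node_id in node_ids}
    for edge in edges:
        for end in ("from_id", "to_id"):
            if edge[end] not in adjacency:
                raise ValueError(
                    f"edge {edge['edge_id']!r}: unknown {end} {edge[end]!r}"
                )
        adjacency[edge["from_id"]].add(edge["to_id"])
        # Assumption: an edge without a directed flag is traversable both ways
        if not edge.get("directed", False):
            adjacency[edge["to_id"]].add(edge["from_id"])
    return adjacency


# Hypothetical sample edges
edges = [
    {"edge_id": "e1", "from_id": "a", "to_id": "b",
     "geometry": [(0.0, 0.0), (0.0, 1.0)]},
    {"edge_id": "e2", "from_id": "b", "to_id": "c",
     "geometry": [(0.0, 1.0), (1.0, 1.0)], "directed": True},
]
for edge in edges:
    # Keep a supplied length_km; otherwise derive it from the geometry
    edge.setdefault("length_km", line_length_km(edge["geometry"]))

adjacency = build_adjacency(["a", "b", "c"], edges)
print(adjacency)
```

For projected coordinate systems the length would instead come from the planar geometry, so the haversine step here is specific to the lon/lat assumption.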

An asset type:

has an asset_type short name or id, for use in network data to refer back to it
has a title, a human-readable name that can be more descriptive
may have a description with more details or notes
has a list of attributes
- name: used as attribute key or column name, conventionally in lower_snake_case
- title: human-readable name that can be more descriptive
- description: human-readable description (spell out abbreviations)
- unit: units of measurement
- dtype: data type, one of boolean, integer, string, float, categorical
- categories: list of options if dtype is categorical
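
One way the attribute definitions could be put to work is validating a node or edge record against them. The sketch below is an assumption about usage rather than part of the proposal: `validate_record` and the sample definitions are hypothetical, attributes missing from a record are treated as optional, and integers are accepted where a float is expected.

```python
# Type checks for the non-categorical dtypes listed above.
# bool is excluded from the numeric checks because it subclasses int in Python.
DTYPE_CHECKS = {
    "boolean": lambda v: isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def validate_record(record, attribute_defs):
    """Return a list of error messages for values that break their dtype."""
    errors = []
    for attr in attribute_defs:
        name = attr["name"]
        if name not in record:
            continue  # assumption: custom attributes are optional
        value = record[name]
        if attr["dtype"] == "categorical":
            ok = value in attr.get("categories", [])
        else:
            ok = DTYPE_CHECKS[attr["dtype"]](value)
        if not ok:
            errors.append(f"{name}: {value!r} is not a valid {attr['dtype']}")
    return errors


# Hypothetical attribute definitions for a rail_track asset type
attribute_defs = [
    {"name": "gauge_mm", "dtype": "integer", "unit": "mm"},
    {"name": "electrified", "dtype": "boolean"},
    {"name": "condition", "dtype": "categorical",
     "categories": ["good", "fair", "poor"]},
]

record = {"gauge_mm": 1435, "electrified": "yes", "condition": "fair"}
errors = validate_record(record, attribute_defs)
print(errors)
```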

A data source:

For design ideas and similar approaches (particularly for data source/provenance schema):


thomas-fred added the feature New feature or functionality label May 25, 2022

thomas-fred changed the title ~~Network metadata~~ Network data model May 25, 2022
