Replies: 3 comments
-
Proposal: Stream class - the constructor takes a device name (which could be obtained from a Device object), and the Stream sets its own name from its class name, e.g.

    from abc import ABC, abstractmethod

    from aeon.io import reader  # assumed import path for the `reader` module used below


    class Stream(ABC):
        @abstractmethod
        def __init__(self, device_name):
            pass


    class SubjectWeight(Stream):
        def __init__(self, device_name):
            self.device_name = device_name
            cols = ["weight", "confidence", "subject_id", "int_id"]
            # Instances have no __name__ attribute; take the stream name from the class.
            self.reader = reader.Csv(f"{device_name}_{type(self).__name__}*", cols)


    device_name = "Nest"
    weight_stream = SubjectWeight(device_name)  # Stream itself is abstract


    class Device:
        def __init__(self, name, *args):
            ...


    device = Device(device_name, SubjectWeight, ...)
-
Questions remain on whether to break |
-
To follow up on this, I have started prototyping a refactor of the device and streams API and have refreshed my thoughts on why the architecture was designed the way it currently is. There is a lot to be said, but essentially it boils down to the decision to use the
It seems, then, that to reorganize the architecture we have two options:
Re. pros and cons, 1) has the advantage that it requires no major changes to the existing code organization. The concepts will remain slightly entangled, since a
For popular examples where this composite design is leveraged successfully, we need look no further than JSON or XML. In both of these data and schema standards, values can be either primitive values or objects, and nesting is achieved naturally via this composition polymorphism. Forcing a separation might seem "cleaner", but in reality it is less flexible, since we would be setting in stone hard and fast rules about what kinds of things can be composed and how.
P.S.: For reference, I am including here the Wikipedia article on the Composite pattern, which is the design pattern applied in the current implementation (hence the function name
-
Following earlier discussions in DA meetings and the current PR at #310 proposing to change the low-level interface API, I wanted to introduce here a more formal glossary of terms as they are currently understood in the low-level data interface, so that we can carry out a more systematic and thorough assessment of the interface design guidelines which were used to create the current version of the API.
I feel it would be wise to carefully consider these before proceeding with any further changes. From both my own experience and documented best practices in the field of software engineering, APIs are best thought of as contracts, and should not be changed lightly once they have been put to regular use. The current API design was created over the course of more than one year of carefully considering a large number of possible scenarios we might encounter when interfacing with Project Aeon data, not just for the ongoing foraging or social experiments, but for experiments going even beyond the scope of the foraging group.
With that in mind, I will break down this discussion into two parts, the Glossary and Proposal. The latter is intended as a space for ongoing discussion and as a working document for discussion at DA meetings, where we can iterate ideas for how to clarify the existing API design before we can decide on action items for the API refactoring proper.
Glossary
Below I provide working definitions for terms which are not defined in the Aeon Glossary but are nevertheless critical to understanding the current design decisions. Interpretation of the terms below is intentionally biased towards standard practice in software engineering, which may differ from regular language use. On occasion we repeat definitions from the Aeon Glossary, where we considered the existing definition to require further clarification for the discussion at hand.
File (from Wikipedia)
Stream (from Wikipedia)
Chunk File
A file storing an Acquisition Chunk, i.e. a file storing all data from a specific stream over a specific one-hour acquisition period.
Note
From the above definitions, it follows that a "stream" is not a "file", and specifically a "stream" is not a "chunk file". The collection of all "chunk files" associated with a named stream is a serializable representation of "data elements" in a stream, but is itself not a stream.
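To make the relation between a stream and its chunk files more concrete, here is a minimal sketch of how a requested time range maps onto one-hour acquisition chunks. The names `chunk_start` and `chunk_range` are hypothetical and are not taken from the Aeon API; only the one-hour chunk duration comes from the definition above, and the alignment of chunks to whole hours is an assumption.

```python
from datetime import datetime, timedelta

# One-hour acquisition period, per the Chunk File definition above.
CHUNK_DURATION = timedelta(hours=1)


def chunk_start(time: datetime) -> datetime:
    """Return the start of the chunk containing `time`.

    Assumption: chunks are aligned to whole hours.
    """
    return time.replace(minute=0, second=0, microsecond=0)


def chunk_range(start: datetime, end: datetime) -> list[datetime]:
    """List the start times of all chunks overlapping the half-open range [start, end)."""
    chunks = []
    current = chunk_start(start)
    while current < end:
        chunks.append(current)
        current += CHUNK_DURATION
    return chunks
```

Under these assumptions, a query from 09:30 to 11:15 touches three chunk files of the same stream, which illustrates why the stream is the whole sequence of data elements and not any single file.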
Reader
An object providing access to the data stored inside specific chunk files.
Device
From Wikipedia definition for Peripheral device
A uniquely identified component in the experimental environment, usually a hardware data collection device. The term originally followed the definition of "peripheral device" above, but we have since extended it to also represent "software devices", i.e. purely virtual devices, logical modules, or any other logically independent component in an experiment.
Device Stream
A uniquely identified sequence of data elements made available over time by a specific device. A device stream is uniquely identified by a combination of the name of the device and the name of the stream, where the latter must be unique within the containing device.
Note
From the above definition, a "device stream" represents both online streams being acquired live during an experiment, and offline streams made available by the Aeon IO API. This duality is intentional and allows setting in place specific expectations for symmetry and parity between data contract, acquisition system and low-level data interface.
Important
The name of a device stream is required. A sequence of data which is not uniquely identified by a device name and stream name is not a "device stream" under the above definition.
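The identification rule can be illustrated with a short sketch. `StreamKey` and `register` are hypothetical names introduced only for this example and do not exist in the Aeon IO API; the sketch simply enforces that a device stream has both names and that the stream name is unique within its device.

```python
from typing import NamedTuple


class StreamKey(NamedTuple):
    """Hypothetical identifier for a device stream: device name plus stream name."""

    device: str
    stream: str


def register(registry: dict, device: str, stream: str, source: object) -> StreamKey:
    """Add a stream under its (device, stream) key.

    Rejects missing names and stream names already used within the same device.
    """
    if not device or not stream:
        raise ValueError("a device stream requires both a device name and a stream name")
    key = StreamKey(device, stream)
    if key in registry:
        raise ValueError(f"stream {stream!r} already exists in device {device!r}")
    registry[key] = source
    return key
```

Note that the same stream name may appear under two different devices, since uniqueness is only required within the containing device.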
Schema (from Wikipedia)
Note
In the Aeon IO API the only representations for "device" and "device stream" are schema objects. Therefore, the terms "device" and "device stream" in the Aeon IO API should be interchangeably understood as "device schema" and "device stream schema", respectively. Nevertheless we define these terms below explicitly in the context of the Aeon IO API since it is understood this is where the root of the confusion lies.
Device Schema
A dictionary describing the set of streams made available by a device. Each device must have a unique name in a given experiment. Device schemas are currently represented by the Device class.
Important
In the current implementation, we allow the creation of "anonymous" device schemas, or "composite streams", which are essentially temporary dictionary objects containing collections of device stream schemas. They are used primarily as a composition tool, to allow aggregating together multiple device stream schemas hierarchically before passing them on to the main device schema object.
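As a rough illustration of this kind of hierarchical composition, the sketch below merges several stream dictionaries before they are handed to a named device schema. The helper `composite`, the leaf binders, and the file patterns are all hypothetical stand-ins written for this example; they are not the actual implementation, and plain strings take the place of Reader objects.

```python
def composite(*binders):
    """Combine several binder functions into one (hypothetical helper).

    Each binder takes a device name pattern and returns a dictionary mapping
    stream names to reader-like objects. The combined binder merges the results.
    """

    def bind(pattern):
        streams = {}
        for binder in binders:
            streams.update(binder(pattern))
        return streams

    return bind


# Two leaf binders; the patterns are made up and strings stand in for readers.
def weight(pattern):
    return {
        "WeightRaw": f"{pattern}_WeightRaw_*",
        "WeightFiltered": f"{pattern}_WeightFiltered_*",
    }


def position(pattern):
    return {"Position": f"{pattern}_Position_*"}


# An anonymous composite: it has no device name of its own and can be nested.
nest_streams = composite(weight, composite(position))
schema = nest_streams("Nest")
```

The point of the sketch is that a composite and a leaf share the same calling convention, so either can be passed wherever a collection of device stream schemas is expected.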
Device Stream Schema
An object comprising:
Reader object.
In combination with the aeon.io.api.load function, a device stream schema can be used to make the stream data elements available over arbitrary time ranges (see Device Stream and Stream above).
Important
Currently device stream schemas are not represented by an explicit class object, but rather by dictionary objects where a Reader is paired with a unique key. The pattern used to find the chunk files in the stream is currently stored inside the Reader object, which we believe may be a possible source of confusion. With this understood, it should hopefully become clear that "binder functions" are really just functions that create device stream schemas, or simply device streams.
Data Contract (a.k.a. Experiment Schema)
The collection of all device schemas and device stream schemas for a specific experiment, e.g. foraging or social experiments. Currently represented as a DotMap dictionary of device schema objects.
Proposal
Below we outline a refactoring proposal to clarify the terminology from the glossary above and materialize it directly in the API.
DeviceStream class, with a name attribute, instead of loose dictionaries
Device and DeviceStream classes into the schema module to clarify their intended usage as data schema objects
DeviceStream objects as input to the load function in addition to, or instead of, Reader objects
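One possible shape for these proposal items is sketched below, purely as a basis for discussion. The field names, the `add` method, and `resolve_reader` are assumptions made for this example; in particular, `resolve_reader` only illustrates how a loader could accept either kind of input and is not the signature of the existing load function.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class DeviceStream:
    """Hypothetical explicit device stream schema: a named stream with its reader."""

    name: str  # stream name, unique within the containing device
    pattern: str  # pattern used to find the chunk files of this stream
    reader: Any  # stand-in for a Reader object


@dataclass
class Device:
    """Hypothetical device schema: a named collection of device streams."""

    name: str
    streams: dict = field(default_factory=dict)

    def add(self, stream: DeviceStream) -> None:
        """Register a stream, enforcing unique stream names within the device."""
        if stream.name in self.streams:
            raise ValueError(
                f"duplicate stream name {stream.name!r} in device {self.name!r}"
            )
        self.streams[stream.name] = stream


def resolve_reader(source: Any) -> Any:
    """Return the reader to use, accepting either a DeviceStream or a bare reader."""
    return source.reader if isinstance(source, DeviceStream) else source
```

Keeping the pattern on the stream object, separate from the reader, is one way to address the possible source of confusion noted in the glossary, though whether that separation is desirable is part of what remains to be discussed.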