Merge branch 'datajoint_pipeline'
jkbhagatio committed Mar 12, 2022
2 parents 55dfb8b + 2954345 commit 41ce4eb
Showing 79 changed files with 7,471 additions and 7,653 deletions.
18 changes: 18 additions & 0 deletions .dockerignore
@@ -0,0 +1,18 @@
__pycache__/
./makefile
.git*
.idea/
.profile
.travis.y*ml
.vscode/*
*.code-workspace
*.DS_Store
*.egg
*.egg-info/
**/.DS_Store
**/*.ipynb_checkpoints
dj_local_conf.json
log*.txt
scratch*.py
scratch/
tox.ini
12 changes: 11 additions & 1 deletion .flake8
@@ -1 +1,11 @@
#
[flake8]
max-line-length = 88
# Ignore the following errors and warnings:
# - whitespace after '(' (E201)
# - whitespace before ')' (E202)
# - whitespace before ':' (E203)
# - assignment of a lambda expression (E731)
# - line break before a binary operator (W503)
extend-ignore = E201, E202, E203, E731, W503
# Exclude config directories for git, github, intellij, vscode.
exclude = .git, .github, .idea, .vscode
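
For illustration, a hypothetical snippet (not from the repository) in which each line would trigger one of the ignored codes under flake8's defaults, but passes cleanly with this config:

```python
coords = ( 1, 2 )        # E201/E202: whitespace just inside the parentheses
head = coords[0 :1]      # E203: whitespace before ':'
square = lambda x: x**2  # E731: assignment of a lambda expression
total = (square(3)
         + coords[1])    # W503: line break before a binary operator
```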
1 change: 0 additions & 1 deletion .github/workflows/actions.yml

This file was deleted.

83 changes: 83 additions & 0 deletions .github/workflows/docker-aeon-mecha.yml
@@ -0,0 +1,83 @@
name: Aeon Mecha Container Environment

on:
  push:
    branches: [datajoint_pipeline]

jobs:
  build_and_push:
    runs-on: ubuntu-latest
    steps:
      - name: Make build space
        run: |
          sudo du -hd2 /tmp
          sudo rm -rf /usr/local/lib/android
          sudo rm -rf /usr/share/dotnet
          sudo apt-get clean -y
          sudo apt-get autoremove -y
          docker system df
          docker images -a
          docker system prune -fa --volumes
      - name: Checkout Repository
        uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: Get previous tag
        id: previoustag
        uses: WyriHaximus/github-action-get-previous-tag@v1
        with:
          fallback: v0.0.0a

      - name: Assign environment variables
        run: |
          echo "repository_lower=$(echo ${{ github.repository }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV
          echo "image_build_date=$(date -u +'%Y-%m-%dT%H:%M:%SZ')" >> $GITHUB_ENV
      - name: Setup QEMU
        uses: docker/setup-qemu-action@v1
        with:
          platforms: linux/amd64,linux/arm64

      - name: Setup Docker buildx
        id: buildx
        uses: docker/setup-buildx-action@v1
        with:
          install: true
          driver: docker-container
          driver-opts: |
            image=moby/buildkit:buildx-stable-1
          buildkitd-flags: --debug
          config-inline: |
            [worker.oci]
              max-parallelism = 2
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          no-cache: true
          build-args: |
            IMAGE_CREATED=${{ env.image_build_date }}
            IMAGE_VERSION=${{ steps.previoustag.outputs.tag }}
          context: .
          file: docker/image/Dockerfile
          platforms: linux/arm64,linux/amd64
          push: true
          tags: |
            ghcr.io/${{ env.repository_lower }}:latest
            ghcr.io/${{ env.repository_lower }}:${{ steps.previoustag.outputs.tag }}
      - name: Image digest
        run: |
          echo ${{ steps.docker_build.outputs.digest }}
          docker system df
          docker images -a
5 changes: 4 additions & 1 deletion .gitignore
@@ -6,7 +6,9 @@ __pycache__/
*.egg

# dotenv
.env
!docker/template.env
**/*.env


# datajoint
dj_local_conf.json
@@ -15,6 +17,7 @@ dj_local_conf.json
.vscode/*
*.code-workspace
.idea/
**/*.ipynb_checkpoints

# misc
**/.DS_Store
35 changes: 35 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,35 @@
default_language_version:
  python: python3.9

default_stages: [commit, push]
files: "^(docker|aeon\/dj_pipeline)\/.*$"
repos:
  - repo: meta
    hooks:
      - id: identity

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-yaml
      - id: detect-private-key
      - id: end-of-file-fixer
        exclude: LICENSE
      - id: no-commit-to-branch

  - repo: https://github.com/psf/black
    rev: 22.1.0
    hooks:
      - id: black
        args:
          - "--config"
          - "./pyproject.toml"

  - repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
      - id: isort
        name: isort (python)
        args:
          - "--settings-file"
          - "./pyproject.toml"
7 changes: 7 additions & 0 deletions .profile
@@ -0,0 +1,7 @@
# Set env modules
module add miniconda

# Save Bonsai and deps to path
export PATH=$PATH:/ceph/aeon/aeon/code/bonsai/Bonsai.Player/bin/Debug/net5.0
export DOTNET_ROOT=/ceph/aeon/aeon/code/dotnet
export PATH=$PATH:/ceph/aeon/aeon/code/dotnet
11 changes: 10 additions & 1 deletion aeon/__init__.py
@@ -1 +1,10 @@
#
from importlib_metadata import PackageNotFoundError, version

try:
    # Change here if project is renamed and does not equal the package name
    dist_name = "aeon"
    __version__ = version(dist_name)
except PackageNotFoundError:
    __version__ = "unknown"
finally:
    del version, PackageNotFoundError
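
As a quick usage sketch (assuming the package has been installed, e.g. with `pip install -e .`):

```python
import aeon

# Prints the version resolved from package metadata, or "unknown" when
# aeon is imported from a source tree without being installed.
print(aeon.__version__)
```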
24 changes: 12 additions & 12 deletions aeon/dj_pipeline/README.md
@@ -19,14 +19,14 @@ computation routines.

## Core tables

1. `Experiment` - the `experiment.Experiment` table stores meta information about the experiments
1. `Experiment` - the `acquisition.Experiment` table stores meta information about the experiments
done in Project Aeon, with secondary information such as the lab/room in which the experiment is carried out,
which animals participate, the directory storing the raw data, etc.

2. `TimeBin` - the raw data are acquired by Bonsai and stored as
a collection of files every one hour - we call this one-hour a timebin.
The `experiment.TimeBin` table records all timebins and their associated raw data files for
any particular experiment (in the above `experiment.Experiment` table)
2. `Chunk` - the raw data are acquired by Bonsai and stored as
a collection of files every hour - we call each one-hour period a time chunk.
The `acquisition.Chunk` table records all time chunks and their associated raw data files for
any particular experiment (in the above `acquisition.Experiment` table)

3. `ExperimentCamera` - the cameras and associated specifications used for this experiment -
e.g. camera serial number, frame rate, location, time of installation and removal, etc.
@@ -40,11 +40,11 @@ from a particular `ExperimentFoodPatch`
6. `Session` - a session is defined, for a given animal, as the period from when
the animal enters the arena until it exits (typically 4 to 5 hours long)

7. `SessionEpoch` - data for each session are stored in smaller time-chunck called epochs.
Currently, an epoch is defined to be 10-minute long. Storing data in smaller epochs allow for
7. `TimeSlice` - data for each session are stored in smaller time bins called time slices.
Currently, a time slice is defined to be 10 minutes long. Storing data in smaller time slices allows for
more efficient searches, queries and fetches from the database.

8. `SubjectPosition` - position data (x, y, speed, area) of the subject in the epochs for
8. `SubjectPosition` - position data (x, y, speed, area) of the subject in the time slices for
any particular session.

9. `SessionSummary` - a table for computing and storing some summary statistics on a
@@ -69,7 +69,7 @@ animals, cameras, food patches setup, etc.
+ This information is either entered by hand, or parsed and inserted from configuration
yaml files.
+ For experiment 0.1, this info can be inserted by running
the [exp01_insert_meta script](./ingest/exp01_insert_meta.py) (just need to do this once)
the [create_experiment_01 script](./ingest/create_experiment_01.py) (this only needs to be done once)

Tables in DataJoint are written with a `make()` function -
an instruction to generate and insert new records into the table itself, based on data from upstream tables.
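
As a minimal sketch of this pattern - with hypothetical table and loader names, not the pipeline's actual definitions:

```python
import datajoint as dj

schema = dj.schema("aeon_example")  # hypothetical schema name


@schema
class TimeSlice(dj.Manual):
    definition = """
    slice_start: datetime  # start of a 10-minute time slice
    """


@schema
class SubjectPosition(dj.Imported):
    definition = """
    -> TimeSlice
    ---
    position_x: longblob  # x coordinate per frame
    position_y: longblob  # y coordinate per frame
    """

    def make(self, key):
        # `load_positions` is a hypothetical loader for the raw tracking files
        x, y = load_positions(key["slice_start"])
        self.insert1(dict(key, position_x=x, position_y=y))
```

Calling `SubjectPosition.populate()` then runs `make()` once for each `TimeSlice` key that does not yet have a corresponding entry.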
@@ -81,8 +81,8 @@ Essentially, turning on the auto-processing routine amounts to running the
following 3 commands (in different processing threads)


python aeon/dj_pipeline/ingest/process.py high
aeon_ingest high

python aeon/dj_pipeline/ingest/process.py middle
aeon_ingest mid

python aeon/dj_pipeline/ingest/process.py low
aeon_ingest low
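
Each tier plausibly boils down to a loop that keeps calling `populate()` on its assigned tables; a hedged sketch, with the module imports and tier assignment assumed rather than taken from the actual ingestion code:

```python
import time

from aeon.dj_pipeline import acquisition, tracking  # assumed module names

# Hypothetical tier assignment; the real mapping lives in the ingestion code.
HIGH_PRIORITY_TABLES = [acquisition.Chunk, tracking.SubjectPosition]


def run_ingest(tables, sleep_seconds=60):
    """Repeatedly populate the given tables; reserving jobs lets several
    worker processes (high/mid/low) run against the database in parallel."""
    while True:
        for table in tables:
            table.populate(reserve_jobs=True, suppress_errors=True)
        time.sleep(sleep_seconds)


run_ingest(HIGH_PRIORITY_TABLES)
```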
29 changes: 23 additions & 6 deletions aeon/dj_pipeline/__init__.py
@@ -1,15 +1,32 @@
import datajoint as dj
import os

_default_database_prefix = 'aeon_'
import datajoint as dj
import hashlib
import uuid

dj.config['display.width'] = 30
_default_database_prefix = os.getenv("DJ_DB_PREFIX") or "aeon_"
_default_repository_config = {"ceph_aeon": "/ceph/aeon"}

# safe-guard in case `custom` is not provided
if 'custom' not in dj.config:
    dj.config['custom'] = {}
if "custom" not in dj.config:
    dj.config["custom"] = {}

db_prefix = dj.config["custom"].get("database.prefix", _default_database_prefix)

db_prefix = dj.config['custom'].get('database.prefix', _default_database_prefix)
repository_config = dj.config['custom'].get('repository_config',
                                            _default_repository_config)


def get_schema_name(name):
    return db_prefix + name


def dict_to_uuid(key):
    """
    Given a dictionary `key`, returns a hash of its contents as a UUID.
    """
    hashed = hashlib.md5()
    for k, v in sorted(key.items()):
        hashed.update(str(k).encode())
        hashed.update(str(v).encode())
    return uuid.UUID(hex=hashed.hexdigest())
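
A usage sketch for the two helpers above (the key values are made up for illustration):

```python
key = {"experiment_name": "exp0", "subject": "subject0"}  # hypothetical key

print(get_schema_name("acquisition"))  # -> "aeon_acquisition" with the default prefix
print(dict_to_uuid(key))               # deterministic UUID derived from the key's contents

# The hash is order-insensitive because items are sorted before hashing.
assert dict_to_uuid(key) == dict_to_uuid({"subject": "subject0", "experiment_name": "exp0"})
```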