Documents the input data format required by the log processing pipeline: NPZ log archives produced by DataLogger, source ID semantics, microcontroller manifest system, archive internal message layout, and communication protocol. Use when the user asks about log archive format, source IDs, DataLogger output, or why processing fails due to missing or malformed archives.
npx claudepluginhub sun-lab-nbb/ataraxis --plugin communication
This skill uses the workspace's default tool permissions.
Documents the input data format required by the microcontroller data extraction pipeline, including how log archives are produced, their internal structure, manifest system, and source ID semantics.
Does not cover: topics documented in the /log-processing, /log-processing-results, /extraction-configuration, /microcontroller-setup, and /mcp-environment-setup skills.
Log archives follow the naming pattern {source_id}_log.npz:
101_log.npz # Source ID 101 (from MicroControllerInterface controller_id=101)
102_log.npz # Source ID 102 (from MicroControllerInterface controller_id=102)
51_log.npz # Source ID 51 (from VideoSystem system_id=51, if sharing the logger)
The suffix _log.npz is defined as LOG_ARCHIVE_SUFFIX in log_processing.py. The source ID portion
is the integer form of the originating system's controller ID.
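The naming pattern above can be checked with a few lines of standard-library Python. This is a minimal sketch, not the pipeline's own code: `parse_source_id` is a hypothetical helper, and the suffix constant is redefined locally rather than imported from log_processing.py.

```python
from pathlib import Path
from typing import Optional

# Mirrors the LOG_ARCHIVE_SUFFIX value described above; redefined here so the
# sketch does not depend on the real log_processing.py module.
LOG_ARCHIVE_SUFFIX = "_log.npz"


def parse_source_id(archive_path: Path) -> Optional[int]:
    """Returns the source ID encoded in an archive filename, or None if the name does not match."""
    name = archive_path.name
    if not name.endswith(LOG_ARCHIVE_SUFFIX):
        return None
    stem = name[: -len(LOG_ARCHIVE_SUFFIX)]
    if not (stem.isascii() and stem.isdigit()):
        return None
    source_id = int(stem)
    # Source IDs are np.uint8 values in the 1-255 range.
    return source_id if 1 <= source_id <= 255 else None
```

Raw .npy files and names with out-of-range IDs return None, so the helper can double as a filter when listing a log directory.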
Archives are created in two phases:
Runtime logging — A DataLogger instance receives LogPackage messages via its multiprocessing
input queue. Each MicroControllerInterface sends all communication messages to the shared logger
using its controller_id as the source_id. The logger persists each message as an individual
.npy file named {source_id:03d}_{acquisition_time:020d}.npy inside its output directory.
Post-runtime assembly — assemble_log_archives() from ataraxis-data-structures consolidates
all .npy files in a log directory into one .npz archive per unique source ID. It groups files by
source ID, sorts by timestamp, and writes uncompressed .npz archives. The original .npy files
are removed by default.
You MUST ensure archives have been assembled before running the processing pipeline. The pipeline
expects .npz files, not raw .npy files. If no .npz archives are found during discovery, instruct
the user to run assemble_log_archives_tool (see /microcontroller-setup) on the log directories first.
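The grouping and ordering step of post-runtime assembly can be illustrated on filenames alone. The sketch below is an assumption-laden illustration, not the assemble_log_archives() implementation: `group_raw_logs` is a hypothetical helper that only shows how raw files map onto per-source archives.

```python
from collections import defaultdict


def group_raw_logs(filenames):
    """Groups raw .npy log filenames by source ID and sorts each group by timestamp."""
    groups = defaultdict(list)
    for name in filenames:
        if not name.endswith(".npy"):
            continue  # Ignores manifests, assembled archives, and other files.
        source_part, _, time_part = name[: -len(".npy")].partition("_")
        if not (source_part.isdigit() and time_part.isdigit()):
            continue  # Not a {source_id}_{acquisition_time}.npy file.
        groups[int(source_part)].append((int(time_part), name))
    # One entry per source ID, with member files ordered by timestamp.
    return {sid: [n for _, n in sorted(entries)] for sid, entries in sorted(groups.items())}
```

Each key in the result corresponds to one {source_id}_log.npz archive that assembly would produce. If the result is non-empty and no .npz files exist yet, assembly has not been run.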
Each DataLogger output directory should contain a microcontroller_manifest.yaml file that identifies
which archives were produced by ataraxis-communication-interface controllers. The manifest structure:
controllers:
  - id: 101
    name: teensy_main
    modules:
      - module_type: 1
        module_id: 1
        name: encoder
      - module_type: 2
        module_id: 1
        name: lick_sensor
  - id: 102
    name: teensy_aux
    modules:
      - module_type: 3
        module_id: 1
        name: valve
How manifests are produced:
Automatically: MicroControllerInterface.__init__() writes a manifest entry to the DataLogger output
directory using the controller_id, name, and module list. Each MicroControllerInterface sharing a
DataLogger appends its entry to the same manifest file.
Manually: use write_microcontroller_manifest_tool (see /microcontroller-setup) to retroactively
tag legacy log directories that predate the manifest system.
Why manifests matter: The discover_microcontroller_data_tool uses manifest-based routing to identify
controller-produced log archives. Directories without a microcontroller_manifest.yaml will not be
discovered. Manifests also associate controller IDs with human-readable names and enumerate the hardware
modules managed by each controller.
Key difference from axvs manifests: AXCI manifests include a modules list per controller, providing
full hardware module metadata (type, id, name). AXVS camera manifests only have source ID and camera name.
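To show how the per-controller modules list can be used, the sketch below builds a lookup from an already-parsed manifest. It assumes the YAML has been loaded into a plain dict by whatever YAML loader is available; `MANIFEST` reproduces the example above and `module_names` is a hypothetical helper, not part of any ataraxis library.

```python
# The example manifest from above, as it would look after YAML parsing.
MANIFEST = {
    "controllers": [
        {"id": 101, "name": "teensy_main", "modules": [
            {"module_type": 1, "module_id": 1, "name": "encoder"},
            {"module_type": 2, "module_id": 1, "name": "lick_sensor"},
        ]},
        {"id": 102, "name": "teensy_aux", "modules": [
            {"module_type": 3, "module_id": 1, "name": "valve"},
        ]},
    ]
}


def module_names(manifest):
    """Maps (controller_id, module_type, module_id) to 'controller_name/module_name'."""
    lookup = {}
    for controller in manifest.get("controllers", []):
        for module in controller.get("modules", []):
            key = (controller["id"], module["module_type"], module["module_id"])
            lookup[key] = f"{controller['name']}/{module['name']}"
    return lookup
```

A lookup like this makes it easy to label extracted module data with human-readable names instead of numeric type and ID codes.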
A source ID is a np.uint8 value (1-255) that identifies the hardware system that produced log data.
In ataraxis-communication-interface, each MicroControllerInterface instance has a controller_id that
becomes the source_id in all log entries sent to the DataLogger.
MicroControllerInterface(controller_id=np.uint8(101), data_logger=logger)
→ LogPackage(source_id=np.uint8(101), ...)
→ Raw .npy files: 101_00000000000000000000.npy, 101_00000000000001234567.npy, ...
→ Assembled archive: 101_log.npz
→ Processed output: controller_101_module_1_1.feather, controller_101_kernel.feather
Source IDs have two different uniqueness scopes:
| Scope | Constraint |
|---|---|
| Within a single DataLogger | Source IDs MUST be unique. Multiple controllers sharing one DataLogger must have different controller_ids. |
| Across DataLogger instances | Source IDs MAY repeat. Two loggers in the same recording can each have a source with the same ID without conflict, because each logger writes to its own output directory. |
This means source IDs are unique per log directory, not globally across a recording session.
By convention, runtime MicroControllerInterface instances use IDs in the range 101-150. This
convention is not enforced. Any np.uint8 value (1-255) is valid as long as source IDs
are unique across all sources within each DataLogger instance, including sources from other
libraries (e.g., ataraxis-video-system). The 101-150 range avoids collisions with other libraries'
advised ranges.
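The uniqueness and range rules for one DataLogger can be expressed as a small check. This is a sketch under the constraints stated above; `check_source_ids` is a hypothetical helper, and it deliberately does not flag IDs outside 101-150 because that range is only a convention.

```python
def check_source_ids(source_ids):
    """Returns a list of problems found in the source IDs registered with one DataLogger."""
    problems = []
    seen = set()
    for sid in source_ids:
        if not 1 <= sid <= 255:
            problems.append(f"{sid}: outside the np.uint8 source ID range 1-255")
        elif sid in seen:
            problems.append(f"{sid}: duplicate within the same DataLogger")
        seen.add(sid)
    return problems
```

Run the check once per DataLogger instance, passing every source that shares the logger (controllers and cameras alike). IDs repeated across different loggers are not a problem and should not be pooled into one call.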
A recording session with one DataLogger produces:
recording_root/
└── session_data_log/ # DataLogger output (instance_name="session")
├── microcontroller_manifest.yaml # Controller manifest (auto-written by MCI.__init__)
├── 101_00000000000000000000.npy # Raw logs (before assembly)
├── 101_00000000000001234567.npy
├── 102_00000000000000000000.npy
├── 101_log.npz # Assembled archive (after assembly)
└── 102_log.npz
The DataLogger creates its output directory as {instance_name}_data_log/ inside the provided
output_directory. All MicroControllerInterface instances sharing this logger write to the same
directory, distinguished by their source IDs.
A recording session can use multiple DataLogger instances. Each creates its own output directory:
recording_root/
├── behavior_data_log/ # DataLogger instance_name="behavior"
│ ├── microcontroller_manifest.yaml # Manifest for behavior controllers
│ ├── 101_log.npz # Controller 101
│ └── 102_log.npz # Controller 102
└── imaging_data_log/ # DataLogger instance_name="imaging"
├── camera_manifest.yaml # Camera manifest (from axvs)
├── 51_log.npz # Camera system_id=51
└── 52_log.npz # Camera system_id=52
Each log directory is an independent processing unit. The discovery tool groups archives by their parent directory, and each directory is prepared and processed independently.
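Grouping by parent directory can be sketched as follows. This is an illustration of the idea only, not the discovery tool's code; `group_by_log_directory` is a hypothetical helper that works on path strings so it can be tried without touching the filesystem.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_log_directory(archive_paths):
    """Groups {source_id}_log.npz paths by their parent directory."""
    groups = defaultdict(list)
    for path in map(PurePosixPath, archive_paths):
        if path.name.endswith("_log.npz"):
            groups[str(path.parent)].append(path.name)
    # Each key is one independent processing unit.
    return {directory: sorted(names) for directory, names in sorted(groups.items())}
```

Applied to the layout above, the result has one entry for behavior_data_log and one for imaging_data_log, which matches the rule that source IDs only need to be unique within a directory.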
When microcontrollers and cameras share a DataLogger, the log directory contains both types of manifests
and archives. The AXCI processing pipeline only processes archives referenced in the
microcontroller_manifest.yaml, and the AXVS pipeline only processes archives referenced in
camera_manifest.yaml.
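The manifest-based selection for a shared directory can be sketched like this. It assumes the microcontroller manifest has already been parsed into a dict; `archives_for_axci` is a hypothetical helper and the real pipeline may route archives differently.

```python
ARCHIVE_SUFFIX = "_log.npz"


def archives_for_axci(archive_names, manifest):
    """Returns the archives whose source ID is listed in the microcontroller manifest."""
    controller_ids = {int(c["id"]) for c in manifest.get("controllers", [])}
    selected = []
    for name in sorted(archive_names):
        if not name.endswith(ARCHIVE_SUFFIX):
            continue
        stem = name[: -len(ARCHIVE_SUFFIX)]
        if stem.isascii() and stem.isdigit() and int(stem) in controller_ids:
            selected.append(name)
    return selected


# Hypothetical shared directory: two controllers and one camera on the same logger.
shared_manifest = {"controllers": [{"id": 101, "name": "teensy_main"}, {"id": 102, "name": "teensy_aux"}]}
selected = archives_for_axci(["51_log.npz", "101_log.npz", "102_log.npz"], shared_manifest)
```

Here the camera archive 51_log.npz is left for the AXVS pipeline because its ID does not appear in the microcontroller manifest.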
Each entry in an .npz archive stores a serialized message as a byte array:
[source_id: 1 byte][elapsed_us: 8 bytes (uint64)][protocol: 1 byte][payload: N bytes]
Archive keys follow the pattern {source_id:03d}_{elapsed_us:020d}, preserving the 3-digit zero-padded
source ID and 20-digit zero-padded timestamp from the original .npy filenames.
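The header layout and key pattern above can be exercised with the standard `struct` module. The sketch assumes little-endian byte order for the uint64 timestamp, which the text above does not state; verify it against real archives before relying on it. `split_message` and `archive_key` are hypothetical helpers.

```python
import struct
from typing import NamedTuple

# ASSUMPTION: little-endian layout. "<BQ" = 1-byte source_id + 8-byte uint64 elapsed_us.
HEADER = struct.Struct("<BQ")


class LogMessage(NamedTuple):
    source_id: int
    elapsed_us: int
    body: bytes  # Everything after the 9-byte header.


def split_message(raw: bytes) -> LogMessage:
    """Splits one serialized archive entry into its header fields and remaining bytes."""
    if len(raw) < HEADER.size:
        raise ValueError(f"message too short: {len(raw)} bytes")
    source_id, elapsed_us = HEADER.unpack_from(raw)
    return LogMessage(source_id, elapsed_us, bytes(raw[HEADER.size:]))


def archive_key(source_id: int, elapsed_us: int) -> str:
    """Formats the archive key for an entry, matching the original .npy filename stem."""
    return f"{source_id:03d}_{elapsed_us:020d}"
```

For entries with elapsed_us > 0 the first byte of `body` is the protocol code. For the onset entry, `body` is the 8-byte UTC timestamp.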
| Protocol Code | Name | Description |
|---|---|---|
| 6 | MODULE_DATA | Data message from a hardware module |
| 7 | KERNEL_DATA | Data message from the kernel |
| 8 | MODULE_STATE | State/status message from a hardware module |
| 9 | KERNEL_STATE | State/status message from the kernel |
| Type | Identifier | Payload | Purpose |
|---|---|---|---|
| Onset | elapsed_us == 0 | 8 bytes: int64 UTC epoch microseconds | Absolute time reference |
| Data | elapsed_us > 0 | Protocol byte + command + event + typed data | Module/kernel data message |
| State | elapsed_us > 0 | Protocol byte + command + event | Module/kernel state message |
Onset message: The first message in every archive has elapsed_us=0. Its payload contains the UTC
epoch timestamp (microseconds since epoch) that serves as the absolute time reference. All other
timestamps in the archive are relative to this onset.
Data/State messages: Each communication event produces a message with elapsed_us set to the
microseconds elapsed since onset. The processing pipeline extracts messages matching the extraction
config's event codes, computes absolute timestamps, and writes them to feather files.
After the leading protocol byte, the remaining bytes follow protocol-specific layouts.
MODULE_DATA (protocol 6):
[module_type: 1 byte][module_id: 1 byte][command: 1 byte][event: 1 byte][prototype_code: 1 byte][data: N bytes]
MODULE_STATE (protocol 8):
[module_type: 1 byte][module_id: 1 byte][command: 1 byte][event: 1 byte]
KERNEL_DATA (protocol 7):
[command: 1 byte][event: 1 byte][prototype_code: 1 byte][data: N bytes]
KERNEL_STATE (protocol 9):
[command: 1 byte][event: 1 byte]
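The four layouts above differ only in which single-byte fields follow the protocol byte, so they can be decoded from one table. This is a minimal sketch: `decode_body` is a hypothetical helper, and it leaves the typed data as raw bytes because the prototype_code-to-type mapping is not documented here.

```python
MODULE_DATA, KERNEL_DATA, MODULE_STATE, KERNEL_STATE = 6, 7, 8, 9

# Single-byte fields that follow the protocol byte, in order, for each protocol.
FIELDS = {
    MODULE_DATA: ("module_type", "module_id", "command", "event", "prototype_code"),
    MODULE_STATE: ("module_type", "module_id", "command", "event"),
    KERNEL_DATA: ("command", "event", "prototype_code"),
    KERNEL_STATE: ("command", "event"),
}


def decode_body(body: bytes) -> dict:
    """Decodes the bytes that follow the message header, starting at the protocol byte."""
    if not body:
        raise ValueError("empty message body")
    protocol = body[0]
    names = FIELDS.get(protocol)
    if names is None:
        raise ValueError(f"unsupported protocol code {protocol}")
    values = body[1 : 1 + len(names)]
    if len(values) < len(names):
        raise ValueError("message body is truncated")
    decoded = {"protocol": protocol, **dict(zip(names, values))}
    if "prototype_code" in names:
        # Data messages carry a typed payload after the prototype code; kept raw here.
        decoded["data"] = bytes(body[1 + len(names):])
    return decoded
```

State messages produce no "data" key, which gives callers a simple way to distinguish them from data messages.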
The processing pipeline converts relative timestamps to absolute UTC microseconds:
absolute_timestamp_us = onset_us + elapsed_us
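The conversion can be sketched with the standard library. The byte order of the int64 onset payload is assumed to be little-endian, which is not stated above; `decode_onset` and `absolute_time` are hypothetical helpers. Integer microsecond arithmetic is used throughout to avoid floating-point rounding.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_onset(payload: bytes) -> int:
    """Reads the onset payload as int64 UTC epoch microseconds."""
    # ASSUMPTION: little-endian byte order.
    (onset_us,) = struct.unpack("<q", payload)
    return onset_us


def absolute_time(onset_us: int, elapsed_us: int) -> datetime:
    """Applies absolute_timestamp_us = onset_us + elapsed_us and returns a UTC datetime."""
    return EPOCH + timedelta(microseconds=onset_us + elapsed_us)
```

Keeping the sum as an integer until the final conversion preserves full microsecond resolution even for long recordings.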
Before running the log processing pipeline, verify these conditions:
Microcontroller manifest present — Log directories should contain a
microcontroller_manifest.yaml file for discover_microcontroller_data_tool to locate them. If
missing, use write_microcontroller_manifest_tool to create one.
Archives assembled — Log directories contain .npz files, not just raw .npy files. If only
.npy files are present, assemble_log_archives_tool must be run first.
Archive naming valid — Files match the {source_id}_log.npz pattern.
Onset message present — Each archive must contain exactly one onset message (elapsed_us=0) with a valid UTC epoch payload. Archives missing the onset message cannot be processed.
Extraction config valid — A validated ExtractionConfig YAML file must exist with event codes
matching the firmware's data/state message events. See /extraction-configuration.
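The filename-level preconditions above can be pre-checked from a directory listing. This is a rough sketch, not a replacement for the MCP tools: `check_log_directory` is a hypothetical helper, and it does not verify the onset message or the extraction config because both require reading file contents.

```python
def check_log_directory(filenames):
    """Returns a list of precondition failures for one log directory listing."""
    names = set(filenames)
    problems = []
    if "microcontroller_manifest.yaml" not in names:
        problems.append("missing microcontroller_manifest.yaml")
    archives = sorted(n for n in names if n.endswith(".npz"))
    has_raw = any(n.endswith(".npy") for n in names)
    if not archives:
        hint = " (raw .npy files present; assemble archives first)" if has_raw else ""
        problems.append("no .npz archives found" + hint)
    for name in archives:
        stem = name[: -len("_log.npz")] if name.endswith("_log.npz") else ""
        if not (stem.isascii() and stem.isdigit()):
            problems.append(f"{name}: does not match {{source_id}}_log.npz")
    return problems
```

An empty result means the directory passes the checks that can be made from names alone. The remaining conditions still need to be confirmed with the discovery and processing tools.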
| Skill | Relationship |
|---|---|
| /microcontroller-setup | Upstream: MCP tools that assemble and discover archives |
| /microcontroller-interface | Upstream: MicroControllerInterface instances that produce log data |
| /extraction-configuration | Context: extraction config determines which messages are extracted |
| /log-processing | Downstream: consumes archives in the format documented here |
| /log-processing-results | Downstream: documents the output format produced from these archives |
| /pipeline | Context: reference skill for the end-to-end pipeline phases |
| /mcp-environment-setup | Prerequisite: MCP server connectivity for discovery and processing |
Log Input Format:
- [ ] Microcontroller manifest (microcontroller_manifest.yaml) present in log directories
- [ ] Log directories contain assembled .npz archives (not raw .npy files)
- [ ] Archive filenames match {source_id}_log.npz pattern
- [ ] Source IDs are unique within each log directory
- [ ] Each archive contains an onset message (elapsed_us=0 with UTC epoch payload)
- [ ] Extraction config has event codes matching firmware message events
- [ ] Directory structure matches expected DataLogger output layout