ARA-2 Client Library

Rust client library for the Kinara ARA-2 neural network accelerator. Provides session management, model loading, and inference on NXP i.MX platforms equipped with ARA-2 PCIe hardware.

Supported Platforms

Platform	SoC	Status
NXP FRDM i.MX 8M Plus	i.MX 8M Plus	Tested
NXP FRDM i.MX 95	i.MX 95	Tested

Requires an EdgeFirst Yocto image with ARA-2 SDK support.

Workspace

Crate	Description
`ara2`	Core client library — session, endpoint, model, and DVM metadata APIs
`ara2-sys`	FFI bindings to `libaraclient.so` via `libloading`

Integration with edgefirst-hal

The ara2 crate depends on edgefirst-hal for:

Tensor memory management — DMA-backed tensors for zero-copy NPU transfers
Image preprocessing — Hardware-accelerated format conversion and scaling
Post-processing — YOLO decoding, overlay rendering, segmentation masks

Python Bindings

Python bindings are available as a separate package via PyPI:

pip install edgefirst-ara2

See crates/ara2-py/README.md for the Python API reference.

Quick Start

use ara2::{Session, DEFAULT_SOCKET};
use edgefirst_hal::tensor::{TensorMemory, TensorTrait as _};

// Connect to the ARA-2 proxy service
let session = Session::create_via_unix_socket(DEFAULT_SOCKET)?;

// Enumerate NPU endpoints and check status
let endpoints = session.list_endpoints()?;
let endpoint = &endpoints[0];
println!("Endpoint state: {:?}", endpoint.check_status()?);

// Load a compiled model (.dvm) and allocate DMA tensors
let mut model = endpoint.load_model_from_file("model.dvm".as_ref())?;
model.allocate_tensors(Some(TensorMemory::Dma))?;

// Run inference
let timing = model.run()?;
println!("NPU inference: {:?}", timing.run_time);
# Ok::<(), ara2::Error>(())

Async Inference

The submit() / wait() API enables overlapping CPU work with NPU execution — the building block for pipeline parallelism:

use ara2::{Session, DEFAULT_SOCKET, DEFAULT_TIMEOUT_MS};

let session = Session::create_via_unix_socket(DEFAULT_SOCKET)?;
let endpoints = session.list_endpoints()?;
let mut model = endpoints[0].load_model_from_file("model.dvm".as_ref())?;
model.allocate_tensors(None)?;

// Submit — returns immediately while the NPU works
let request = model.submit()?;

// CPU is free to do other work (preprocess next frame, etc.)

// Block until the NPU finishes
let timing = request.wait(DEFAULT_TIMEOUT_MS)?;
println!("NPU inference: {:?}", timing.run_time);

// Monitor pipeline depth
assert_eq!(session.inflight_count()?, 0);
# Ok::<(), ara2::Error>(())

The Python API mirrors this exactly:

import edgefirst_ara2 as ara2

session = ara2.Session.create_via_unix_socket(ara2.DEFAULT_SOCKET)
endpoint = session.list_endpoints()[0]
model = endpoint.load_model("model.dvm")
model.allocate_tensors()

# Submit — returns immediately
request = model.submit()

# CPU work here... the GIL is NOT held during wait()
timing = request.wait()
print(f"NPU inference: {timing.run_time_us} µs")

See the async_infer example for a complete benchmark comparing synchronous vs. asynchronous inference, and async_pipeline for pipelined inference with a circular buffer of DMA-BUF tensor sets (2x+ throughput improvement).
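The ring-buffer idea behind async_pipeline can be illustrated without hardware. The sketch below is not the edgefirst_ara2 API: `fake_infer`, `run_pipeline`, and the single-worker thread pool are hypothetical stand-ins for the NPU and its submit/wait calls. It shows only the control flow: keep up to `depth` requests in flight, and wait on a slot's previous request before reusing that slot.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(frame, latency_s):
    """Hypothetical stand-in for NPU execution: sleep, then double the input."""
    time.sleep(latency_s)
    return frame * 2


def run_pipeline(frames, depth=2, latency_s=0.001):
    """Process frames with up to `depth` requests in flight, one per ring slot."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    results = []
    ring = [None] * depth  # ring[slot] holds the pending request for that slot
    with ThreadPoolExecutor(max_workers=1) as npu:  # one worker models one NPU
        for i, frame in enumerate(frames):
            slot = i % depth
            if ring[slot] is not None:
                # Slot is busy: wait for its request before reusing the buffer.
                results.append(ring[slot].result())
            # Submitting returns immediately, so the caller can prepare the
            # next frame while the worker is busy.
            ring[slot] = npu.submit(fake_infer, frame, latency_s)
        # Drain the requests that are still in flight, oldest first.
        for i in range(max(0, len(frames) - depth), len(frames)):
            results.append(ring[i % depth].result())
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4, 5], depth=2))
```

Results come back in submission order because each slot is drained before it is reused. In a real pipeline each slot would own its own set of input and output tensors, which is why the examples allocate a ring of DMA-BUF tensor sets rather than a single set.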

Runtime Requirements

The following must be present on the target system:

libaraclient.so.1 — Kinara client library (from the ARA-2 SDK)
ara2-proxy / dvproxy — System service providing NPU access, must be running (systemd unit name is platform-dependent: ara2.service on EdgeFirst Yocto images, dvproxy.service on other platforms)
ARA-2 hardware — PCIe accelerator card visible via lspci
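A short preflight script can check the first two requirements before an application starts. This is a sketch under stated assumptions: the function names are illustrative, the library check relies on the dynamic loader finding `libaraclient.so.1` on its default search path, and the service check assumes a systemd-based image. The lspci check is left as a manual step because this README does not state how the card is listed.

```python
import ctypes
import subprocess


def library_loads(name="libaraclient.so.1"):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def service_active(unit):
    """Return True if `systemctl is-active` reports the unit as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # systemctl is missing or did not answer in time
    return result.returncode == 0


def preflight():
    """Check the client library and either of the documented unit names."""
    units = ("ara2.service", "dvproxy.service")
    return {
        "libaraclient": library_loads(),
        "proxy_service": any(service_active(unit) for unit in units),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```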

Building

Native

cargo build --release

Cross-compile for aarch64 (NXP i.MX)

cargo zigbuild --release --target aarch64-unknown-linux-gnu

Performance

Benchmarked on NXP FRDM i.MX 95 + ARA-2 with YOLOv8m-seg (640×640). The Python API adds minimal overhead over native Rust thanks to DMA-BUF zero-copy tensor sharing: the GPU and NPU operate on the same physical buffers, with no CPU copies in the data path.

Stage	Rust	Python	Overhead
GPU preprocess (letterbox + RGBA→CHW)	2.85 ms	2.88 ms	+0.03 ms
NPU inference (wall clock)	34.53 ms	34.63 ms	+0.10 ms
NPU execution	26.04 ms	26.04 ms	—
DMA input upload	2.02 ms	2.05 ms	—
DMA output download	3.68 ms	3.68 ms	—
Decode (NMS + dequant)	4.05 ms	4.31 ms	+0.26 ms
Materialize (CPU coeff × proto → bitmaps)	5.67 ms	5.98 ms	+0.31 ms
Draw (GL mask overlay)	5.54 ms	5.71 ms	+0.17 ms
Total pipeline	52.64 ms	53.52 ms	+0.88 ms
Throughput	19.0 FPS	18.7 FPS	−0.3 FPS

Steady-state mean over 30 iterations after warmup. Python overhead is under 1 ms across the entire pipeline. GPU preprocessing and NPU inference differ by at most 0.10 ms between the two languages because both use the same DMA-BUF tensors.
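As a consistency check, the totals and throughput can be recomputed from the per-stage means. The sketch below uses only numbers from the table; the dictionary keys and function name are illustrative. It assumes the NPU execution and DMA rows are sub-components of the NPU inference wall-clock row, since the totals only match when those rows are left out of the sum.

```python
# Top-level stage means in milliseconds, copied from the table: (Rust, Python).
# NPU execution and the two DMA rows are assumed to be part of
# "npu_inference" and are therefore not listed separately.
STAGES_MS = {
    "gpu_preprocess": (2.85, 2.88),
    "npu_inference": (34.53, 34.63),
    "decode": (4.05, 4.31),
    "materialize": (5.67, 5.98),
    "draw": (5.54, 5.71),
}


def pipeline_summary(stages):
    """Sum the stage means and convert each total to frames per second."""
    rust_ms = sum(rust for rust, _ in stages.values())
    python_ms = sum(python for _, python in stages.values())
    return {
        "rust_ms": rust_ms,
        "python_ms": python_ms,
        "overhead_ms": python_ms - rust_ms,
        "rust_fps": round(1000.0 / rust_ms, 1),
        "python_fps": round(1000.0 / python_ms, 1),
    }


if __name__ == "__main__":
    for key, value in pipeline_summary(STAGES_MS).items():
        print(f"{key}: {value:.2f}")
```

The Python column sums to 53.51 ms rather than the 53.52 ms in the table; the 0.01 ms difference is consistent with each stage mean having been rounded independently.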

Examples

Example	Description
`yolov8.rs`	Rust — YOLOv8 detection + segmentation with letterbox preprocessing and 3-step mask pipeline
`yolov8.py`	Python — Same 3-step pipeline via `edgefirst-hal` and `edgefirst-ara2` Python packages
`async_infer.rs`	Rust — Async inference benchmark: sync vs. submit/wait vs. overlap
`async_infer.py`	Python — Same async benchmark via `edgefirst-ara2`
`async_pipeline.rs`	Rust — Pipelined inference with circular DMA-BUF buffer ring (2x+ speedup)
`async_pipeline.py`	Python — Same pipeline demo via `edgefirst-ara2`
`endpoints.py`	Python — Connect, list endpoints, check status
`test_dvm_metadata.rs`	Rust — Read and display DVM model metadata
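The yolov8 examples letterbox the input image before inference. The geometry behind that step can be sketched as follows. This is the conventional letterbox computation (uniform scale, centered padding), not code taken from edgefirst-hal; the function name and the returned dictionary layout are assumptions made for illustration.

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Compute the scaled size and padding that fit a source image into the
    destination while preserving its aspect ratio."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Use the smaller ratio so the scaled image fits in both dimensions.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = max(1, min(dst_w, round(src_w * scale)))
    new_h = max(1, min(dst_h, round(src_h * scale)))
    # Split the leftover space evenly; any odd pixel goes to the right/bottom.
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "pad": (left, top, pad_x - left, pad_y - top),  # left, top, right, bottom
    }


if __name__ == "__main__":
    print(letterbox_geometry(1920, 1080))
```

The same scale and padding values are needed again after inference, to map detected boxes and masks from model coordinates back to the original image.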

Running the Rust example

Cross-compile from your development machine and deploy to the target:

# Build
cargo zigbuild --release --example yolov8 --target aarch64-unknown-linux-gnu

# Deploy and run
scp target/aarch64-unknown-linux-gnu/release/examples/yolov8 <target>:/root/yolov8-ara2
ssh <target> "/root/yolov8-ara2 model.dvm image.jpg --benchmark 30 --save"

Running the Python example

Create a virtual environment on the target and install the packages from PyPI:

# On target
python3 -m venv ~/venv
~/venv/bin/pip install edgefirst-ara2 edgefirst-hal

Copy the script and run:

# From dev machine
scp examples/yolov8.py <target>:/root/

# On target
~/venv/bin/python3 /root/yolov8.py model.dvm image.jpg --benchmark 30 --save

Testing

Most tests require an NXP i.MX + ARA-2 system with the proxy running; the metadata tests run without hardware:

# All tests (on-target with hardware)
cargo test -p ara2

# Metadata tests only (no hardware needed)
cargo test -p ara2 dvm_metadata

# Model tests (needs a .dvm file)
ARA2_TEST_MODEL=/path/to/model.dvm cargo test -p ara2 model

Documentation

ARCHITECTURE.md — System architecture and ownership model
TESTING.md — Test guide, on-target setup, and debugging
CONTRIBUTING.md — Contribution guidelines
SECURITY.md — Security policy
CHANGELOG.md — Release history

License

Licensed under the Apache License 2.0. See LICENSE for details.
