This guide specifically addresses the Coral dev board with its quad-core ARM Cortex-A53. However, deployment on other ARM-based devices should be possible in a similar way.
Even though the Coral dev board also features an Edge TPU for accelerated inferencing, this guide currently addresses inferencing on the ARM CPU exclusively.
Note that almost all steps required to deploy on ARM are already implemented in the branch `coral`. Details are explained here merely for completeness and transparency.
The overall setup remains the same as shown in Deployment and Provisioning. The difference here is that while the model repository on Bindle and the OCI registry are still hosted on the same machine, the runtime is deployed on the (potentially remote) ARM device.
Given that the machine learning application and the corresponding tools are already installed on the development machine, the setup additionally requires wasmCloud and NATS on the ARM device. The fastest way to install both is to download them from the respective release repositories.
To be executed on the ARM device:

```bash
echo "Downloading NATS 2.8.1"
curl -fLO https://github.com/nats-io/nats-server/releases/download/v2.8.1/nats-server-v2.8.1-linux-arm64.tar.gz

echo "Downloading wasmCloud host 0.54.4"
curl -fLO https://github.com/wasmCloud/wasmcloud-otp/releases/download/v0.54.4/aarch64-linux.tar.gz

echo "Extracting..."
tar -xf aarch64-linux.tar.gz
tar -xf nats-server-v2.8.1-linux-arm64.tar.gz

# (optional)
sudo mv nats-server-v2.8.1-linux-arm64/nats-server /usr/local/bin/
```
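The configure_edge.sh script shown further below assumes that the wasmCloud host is located at ~/wasmcloudHost. A minimal sketch of one way to arrange this, assuming the release archive extracts into the current directory, is:

```bash
# Assumption: place the extracted wasmCloud host where configure_edge.sh expects it.
mkdir -p ~/wasmcloudHost
tar -xf aarch64-linux.tar.gz -C ~/wasmcloudHost
```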
The hardware target of the Coral dev board is known as `aarch64`. All actors are inherently portable, but capability providers have to be compiled for their specific target.
The two capability providers in this application are http-server and mlinference. http-server is already available for `aarch64`, but mlinference has to be built. The recommended procedure is to cross-compile the capability provider. The following steps walk through the cross-compilation.
Make sure `par_targets` in providers/mlinference/provider.mk comprises the target `aarch64-unknown-linux-gnu`, e.g.

```make
par_targets ?= \
    aarch64-unknown-linux-gnu
```
The cross-compilation configuration maps each target to a `wasmcloud/cross` build image and passes `XDG_CACHE_HOME` through to the build environment:

```toml
[target.armv7-unknown-linux-gnueabihf]
image = "wasmcloud/cross:armv7-unknown-linux-gnueabihf"

[target.aarch64-unknown-linux-gnu]
image = "wasmcloud/cross:aarch64-unknown-linux-gnu"

[target.x86_64-apple-darwin]
image = "wasmcloud/cross:x86_64-apple-darwin"

[target.aarch64-apple-darwin]
image = "wasmcloud/cross:aarch64-apple-darwin"

[target.x86_64-unknown-linux-gnu]
image = "wasmcloud/cross:x86_64-unknown-linux-gnu"

[build.env]
passthrough = [
    "XDG_CACHE_HOME",
]
```
Set the environment variable `XDG_CACHE_HOME` to a path the current user has write access to, for example:
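```bash
# /tmp is writable for the current user on most systems
export XDG_CACHE_HOME=/tmp
```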
Finally, in providers/mlinference, build mlinference with `make par-full`.
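Assuming the commands are run from the repository root, this amounts to:

```bash
cd providers/mlinference
make par-full
```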
The configuration is slightly more involved. The related scripts allow the machine learning application to be deployed selectively, either on the development machine or on the ARM device.
On the development machine, deploy/env contains the two new environment variables `HOST_DEVICE_IP` and `TARGET_DEVICE_IP`. They represent the address of the development machine (host) and of the ARM device (target device), respectively.
If neither parameter is set, the application is deployed on the development machine. If `TARGET_DEVICE_IP` is set to the address of the ARM device, the application is deployed remotely. In the latter case, `HOST_DEVICE_IP` should be set such that both addresses are in the same network.
Example values are:

```bash
export HOST_DEVICE_IP=192.168.178.24
export TARGET_DEVICE_IP=192.168.178.148
```
If `TARGET_DEVICE_IP` does not equal `127.0.0.1` and deploy/run_iot_device.sh is launched, a checklist is displayed comprising all preparation steps that should have been completed by now:

- `HOST_DEVICE_IP` is set in deploy/env
- `TARGET_DEVICE_IP` is set in deploy/env
- `source ./configure_edge.sh` has been run on the target device
- NATS is running (`nats-server --jetstream`) on the target device

The bulk of the configuration is done in iot/configure_edge.sh:
```bash
export RUST_LOG=debug
export WASMCLOUD_OCI_ALLOWED_INSECURE=192.168.178.24:5000
export WASMCLOUD_RPC_TIMEOUT_MS=16000
export BINDLE_URL=http://192.168.178.24:8080/v1/

cd ~/wasmcloudHost
```
Set the log level with `RUST_LOG` according to your needs.
`WASMCLOUD_OCI_ALLOWED_INSECURE` is used in a development context only. If it is omitted, the wasmCloud runtime prohibits unauthenticated access to OCI registries; for further details see Allowing unauthenticated OCI registry access. The value assigned to this environment variable represents the OCI registry where the artifacts of the application are stored. Since the OCI registry in this setup is hosted on the development machine, `192.168.178.24` in this example is the IP address of the development machine.
This guide targets inference on ARM CPUs. Depending on the respective model and data, inference may take a while. Since wasmCloud has a built-in timeout of two seconds, the value is increased proactively in order to avoid internal server errors that surface as HTTP 503-like responses upon inference requests. Set `WASMCLOUD_RPC_TIMEOUT_MS` to a value greater than 2000.
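For instance, configure_edge.sh above sets a generous timeout:

```bash
# 16 seconds, well above the built-in 2-second timeout
export WASMCLOUD_RPC_TIMEOUT_MS=16000
```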
`BINDLE_URL` represents the endpoint of the Bindle server where the models are stored.
The script assumes that the runtime is located at ~/wasmcloudHost; this is where the script changes to in its last line.
To restart the runtime, restart_edge.sh may be used.
The folder structure assumed in configure_edge.sh and restart_edge.sh may have to be adapted to your setup.
The application is then deployed from the development machine:

```bash
cd deploy
./run_iot_device.sh bindle-start
./run_iot_device.sh all
```
Example requests, addressed to the target device from the example above, may look like the following:

```bash
curl --silent -T ../images/cat.jpg 192.168.178.148:8078/mobilenetv27/matches | jq
curl --silent -T ../images/hotdog.jpg 192.168.178.148:8078/squeezenetv117/matches | jq
```