Architecture

The cyberinfrastructure consists of coordinated hardware and software services that enable AI at the edge. Below is a quick summary of the infrastructure pieces, starting at the highest level and zooming into each component to explain its role and its relationships to the others.

High-Level Infrastructure

Figure 1: High-level Node & Beehive Relationship

There are 2 main components of the cyberinfrastructure:

Nodes that exist at the edge
The cloud that hosts services and storage systems to facilitate running “science goals” @ the edge

Every edge node maintains connections to 2 core cloud components: one to a Beehive and one to a Beekeeper.

Beekeeper

The Beekeeper is an administrative server that allows system administrators to perform actions on the nodes, such as gathering health metrics and performing software updates. All nodes "phone home" to their Beekeeper and maintain this "life-line" connection.

Details & source code: https://github.com/waggle-sensor/beekeeper

Beehive

The Node-to-Beehive connection is the pipeline for the science. Instructions for the node are sent over this connection, and applications (plugins) running on the nodes use it to publish data into the Beehive storage systems.

The overall infrastructure supports multiple Beehives, where each node is associated with a single Beehive. The set of nodes associated with a Beehive creates a "project" where each "project" is separate, having its own data store, web services, etc.

Figure 2: Multiple Beehives

In the example above, there are 2 nodes associated with Beehive 1 and a single node associated with Beehive 2, with all nodes administered by a single Beekeeper.

Note: the example above shows a single Beekeeper, but a second Beekeeper could have been used for administrative isolation.
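The relationships in this example can be sketched as a small data model. This is only an illustration: the identifiers (node-a, beehive-1, beekeeper-1) and the Node class are hypothetical names, not part of the actual Beehive or Beekeeper software.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical record of which cloud components a node connects to."""
    node_id: str
    beehive: str    # each node is associated with exactly one Beehive
    beekeeper: str  # each node "phones home" to one Beekeeper


# Mirrors the example: 2 nodes on Beehive 1, 1 node on Beehive 2,
# all administered by a single Beekeeper. All names are made up.
NODES = [
    Node("node-a", "beehive-1", "beekeeper-1"),
    Node("node-b", "beehive-1", "beekeeper-1"),
    Node("node-c", "beehive-2", "beekeeper-1"),
]


def projects(nodes):
    """Group node IDs by Beehive; each group forms one separate "project"."""
    grouped = defaultdict(list)
    for node in nodes:
        grouped[node.beehive].append(node.node_id)
    return dict(grouped)
```

Grouping by the beehive field yields one project per Beehive, while the beekeeper field varies independently, which is what allows a second Beekeeper to be introduced for administrative isolation.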

Details & source code: https://github.com/waggle-sensor/waggle-beehive-v2

Beehive Infrastructure

Looking deeper into the Beehive infrastructure, it contains 2 main components:

software services such as the Edge Scheduler (ES), Lambda Triggers (LT), data APIs, and websites/portals
data storage systems such as the Data Repository (DR) and the Edge Code Repository (ECR)

Figure 3: Beehive High-level Architecture

The Beehive is the “command center” for interacting with the Waggle nodes at the edge. It hosts the websites and interfaces that allow scientists to create science goals to run plugins at the edge and to browse the data produced by those plugins.

Figure 4: Beehive Infrastructure Details

The software services and data storage systems are deployed within a Kubernetes environment to allow for easy administration and to support running on a multi-server architecture with redundancy and service replication.

While the services running within Beehive are many (both graphical and REST style API interfaces), the following is an outline of the most vital.

Data Repository (DR)

The Data Repository is the data store that houses all plugin data produced at the edge. It combines different storage technologies (e.g. InfluxDB) and techniques to store simple textual data (e.g. key-value pairs) as well as large binary objects (e.g. audio, images, video). The Data Repository additionally has an API for easy access to this data.

The data store is a time-series database of key-value pairs with each entry containing metadata about how and when the data originated @ the edge. Included in this metadata is the data collection timestamp, plugin version used to collect the data, the node the plugin was run on, and the specific compute unit within the node that the plugin was running on.

{
    "timestamp":"2022-06-10T22:37:47.369013647Z",
    "name":"iio.in_temp_input",
    "value":25050,
    "meta":{
        "host":"0000dca632ed6d06.ws-rpi",
        "job":"sage",
        "node":"000048b02d35a97c",
        "plugin":"plugin-iio:0.6.0",
        "sensor":"bme680",
        "task":"iio-rpi",
        "vsn":"W08C"
    }
}

In the above example, the value of 25050 was collected @ 2022-06-10T22:37:47.369013647Z from the bme680 sensor on node 000048b02d35a97c via the plugin-iio:0.6.0 plugin.
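As a minimal sketch of how a client might read such a record, the snippet below parses the example above using only the Python standard library. The function name parse_record and the shape of the returned dictionary are assumptions made for illustration; they are not part of the Data Repository API.

```python
import json
from datetime import datetime, timezone

# The example record from above, copied verbatim.
RECORD = """
{
    "timestamp":"2022-06-10T22:37:47.369013647Z",
    "name":"iio.in_temp_input",
    "value":25050,
    "meta":{
        "host":"0000dca632ed6d06.ws-rpi",
        "job":"sage",
        "node":"000048b02d35a97c",
        "plugin":"plugin-iio:0.6.0",
        "sensor":"bme680",
        "task":"iio-rpi",
        "vsn":"W08C"
    }
}
"""


def parse_record(raw: str) -> dict:
    """Parse one record and split out the commonly used fields (hypothetical helper)."""
    rec = json.loads(raw)
    # The timestamp has nanosecond precision, but datetime only keeps
    # microseconds, so the fractional part is truncated to 6 digits.
    whole, _, frac = rec["timestamp"].rstrip("Z").partition(".")
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    when = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(
        microsecond=micros, tzinfo=timezone.utc
    )
    # The plugin field combines the plugin name and its version.
    plugin_name, _, plugin_version = rec["meta"]["plugin"].partition(":")
    return {
        "time": when,
        "name": rec["name"],
        "value": rec["value"],
        "node": rec["meta"]["node"],
        "vsn": rec["meta"]["vsn"],
        "sensor": rec["meta"]["sensor"],
        "plugin_name": plugin_name,
        "plugin_version": plugin_version,
    }


parsed = parse_record(RECORD)
```

Note that the truncation drops the last three digits of the timestamp; a consumer that needs full nanosecond precision would have to keep the original string.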

Note: see the Access and use data site for more details and data access examples.

Details & source code: https://github.com/waggle-sensor/data-repository

Edge Scheduler (ES)

The Edge Scheduler is the suite of services running in Beehive that facilitate running plugins @ the edge. Included here are user interfaces and APIs for scientists to create and manage their science goals. The Edge Scheduler continuously analyzes node workloads against all the science goals to determine how the science goals are deployed to the Beehive nodes. When it determines that a node's science goals need to be updated, the Edge Scheduler interfaces with WES running on that node to update the node's local copy of the science goals. Essentially, the Edge Scheduler is the overseer of all the Beehive's nodes, deploying science goals to them to meet the scientists' plugin execution objectives.
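The comparison between the goals a node should have and the node's local copy can be sketched as follows. This is a simplified illustration under stated assumptions: the ScienceGoal class, its fields, and the node and goal names are hypothetical, and the real Edge Scheduler also considers node workloads, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScienceGoal:
    """Hypothetical, simplified science goal: plugins to run on a set of nodes."""
    name: str
    plugins: tuple
    nodes: tuple


def desired_goals(goals):
    """Map each node to the set of goal names that target it."""
    wanted = {}
    for goal in goals:
        for node in goal.nodes:
            wanted.setdefault(node, set()).add(goal.name)
    return wanted


def nodes_to_update(goals, local_copies):
    """Return the nodes whose local copy of the goals differs from the desired set.

    local_copies maps a node to the set of goal names it currently holds.
    """
    wanted = desired_goals(goals)
    all_nodes = set(wanted) | set(local_copies)
    return sorted(
        node
        for node in all_nodes
        if wanted.get(node, set()) != local_copies.get(node, set())
    )


# Made-up goals and node states, for illustration only.
goals = [
    ScienceGoal("cloud-cover", ("plugin-a:1.0.0",), ("W001", "W002")),
    ScienceGoal("bird-count", ("plugin-b:0.2.0",), ("W002",)),
]
local = {"W001": {"cloud-cover"}, "W002": {"cloud-cover"}, "W003": {"old-goal"}}
stale = nodes_to_update(goals, local)
```

In this example W002 is missing a goal and W003 still holds a goal that no longer targets it, so both would receive an update, while W001 is left untouched.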

Details & source code: https://github.com/waggle-sensor/edge-scheduler

Edge Code Repository (ECR)

The Edge Code Repository is the "app store" that hosts all the tested and benchmarked edge plugins that can be deployed to the nodes. This is the interface that allows users to discover existing plugins (for potential inclusion in their science goals) and to submit their own. At its core, the ECR provides a verified and versioned repository of plugin Docker images that are pulled by the nodes when a plugin is to be downloaded as a run-time component of a science goal.
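To illustrate what "versioned" means in practice, the sketch below works with plugin references of the form name:version, such as the plugin-iio:0.6.0 reference that appears in the Data Repository example. The helper names and the idea of selecting the newest version are assumptions for illustration; they do not describe how the ECR or the nodes actually resolve images.

```python
def parse_ref(ref):
    """Split a "name:major.minor.patch" reference into (name, version tuple)."""
    name, _, version = ref.partition(":")
    return name, tuple(int(part) for part in version.split("."))


def latest(refs, name):
    """Return the reference with the highest version for the given plugin name."""
    candidates = [ref for ref in refs if parse_ref(ref)[0] == name]
    if not candidates:
        return None
    # Compare versions numerically, so that 0.10.1 sorts above 0.6.0.
    return max(candidates, key=lambda ref: parse_ref(ref)[1])


# Made-up catalogue contents, for illustration only.
catalogue = ["plugin-iio:0.6.0", "plugin-iio:0.10.1", "plugin-other:1.0.0"]
newest = latest(catalogue, "plugin-iio")
```

Comparing versions as integer tuples rather than strings avoids the common mistake of ranking 0.6.0 above 0.10.1.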

Details & source code: https://github.com/waggle-sensor/edge-code-repository

Lambda Triggers (LT)

The Lambda Triggers service provides a framework for running reactive code within the Beehive. There are two kinds of reaction triggers considered here: From-Edge and To-Edge.

From-Edge triggers, or messages that originate from an edge node, can be used to trigger lambda functions -- for example, if high wind velocity is detected, a function could be triggered to determine how to reconfigure sensors or launch a computation or send an alert.

To-Edge triggers are messages intended to change a node's behavior. For example, an HPC calculation or cloud-based data analysis could trigger an Edge Scheduler API call to request that a science goal be run on a particular set of edge nodes.
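A From-Edge trigger such as the high-wind example can be sketched as a small handler registry. Everything here is hypothetical: the on_message decorator, the measurement name env.wind.speed, the threshold, and the message layout (borrowed from the Data Repository record format) are assumptions for illustration, not the Lambda Triggers API.

```python
_handlers = {}  # measurement name -> list of handler functions

WIND_ALERT_THRESHOLD = 25.0  # hypothetical threshold, in m/s


def on_message(name):
    """Register a function to run for every message with the given name."""
    def register(func):
        _handlers.setdefault(name, []).append(func)
        return func
    return register


@on_message("env.wind.speed")  # hypothetical measurement name
def high_wind_alert(message):
    """React to a From-Edge message; return an alert string or None."""
    if message["value"] > WIND_ALERT_THRESHOLD:
        vsn = message["meta"]["vsn"]
        return f"alert: high wind ({message['value']} m/s) on node {vsn}"
    return None


def dispatch(message):
    """Run all handlers registered for this message and collect their results."""
    results = []
    for handler in _handlers.get(message["name"], []):
        outcome = handler(message)
        if outcome is not None:
            results.append(outcome)
    return results
```

A To-Edge trigger would follow the same pattern in the opposite direction, with the handler's result being a request to the Edge Scheduler rather than an alert.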

Details & source code: https://github.com/waggle-sensor/lambda-triggers

Nodes

Nodes are the edge computing component of the cyberinfrastructure. All nodes consist of 3 items:

Persistent storage for housing downloaded plugins and caching published data before it is transferred to the node's Beehive
CPU and GPU compute modules where plugins are executed and perform accelerated inference
Sensors such as environment sensors, cameras and LiDAR systems

Figure 5: Node Overview

Edge nodes enable fast computation @ the edge, leveraging the large non-volatile storage to cache high-frequency data (including images, audio and video) in the event the node is "offline" from its Beehive. Through expansion ports, sensors can be added and removed to customize each node for its particular deployment environment.
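The caching behavior described above is a store-and-forward pattern, which can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the StoreAndForward class is hypothetical, and an in-memory queue stands in for the node's persistent storage.

```python
from collections import deque


class StoreAndForward:
    """Queue records locally and forward them in order once the link is up."""

    def __init__(self, send):
        # send is a callable that returns True when a record was delivered.
        self._send = send
        # In-memory stand-in for the node's non-volatile storage.
        self._pending = deque()

    def publish(self, record):
        """Always store first, then try to drain the backlog."""
        self._pending.append(record)
        self.flush()

    def flush(self):
        """Send pending records oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def backlog(self):
        return len(self._pending)


# Simulated uplink that can be switched on and off.
received = []
link = {"up": False}


def send(record):
    if link["up"]:
        received.append(record)
        return True
    return False


buffer = StoreAndForward(send)
buffer.publish("sample-1")  # link is down: record stays cached
buffer.publish("sample-2")
backlog_while_offline = buffer.backlog
link["up"] = True
buffer.publish("sample-3")  # link is back: backlog drains in order
```

A record is only removed from the queue after a successful send, so an outage in the middle of a flush does not lose data.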

Overall, even though the nodes may use different CPU architectures and different sensor configurations, they all leverage the same Waggle Edge Stack (WES) to run plugins.

Wild Sage Node (Wild Waggle Node)

The Wild Sage Node (or Wild Waggle Node) is a custom-built node in a weatherproof enclosure intended for remote outdoor installation. The node features software and hardware resilience via a custom operating system and custom circuit board. Internal to the node are a power supply and a PoE network switch, which support the addition of sensors through standard Ethernet (PoE), USB and other embedded protocols via the node expansion ports.

Figure 6: Wild Sage/Waggle Node Overview

The technical capabilities of these nodes consist of:

NVIDIA Xavier NX ARM64 Node Controller w/ 8GB of shared CPU/GPU RAM
1 TB of NVMe storage
4x PoE expansion ports
1x USB2 expansion port
optional Stevenson Shield housing a RPi 4 w/ environmental sensors & microphone
optional 2nd NVIDIA Xavier NX ARM64 Edge Processor

Node installation manual: https://sagecontinuum.org/docs/installation-manuals/wsn-manual

Details & source code: https://github.com/waggle-sensor/wild-waggle-node

Blade Nodes

A Blade Node is a standard, commercially available server intended for use in a climate-controlled machine room, or an extended-temperature-range, telecom-grade blade for harsher environments. The AMD64-based operating system supports these types of nodes, enabling the services needed to support WES.

Figure 7: Blade Node Overview

The above diagram shows the basic technical configuration of a Blade Node:

Multi-core AMD64
32GB of RAM
Dedicated NVIDIA T4 GPU
1 TB of SSD storage

Note: it is possible to add the same optional Stevenson Shield housing that is available to the Wild Sage Nodes

Details & source code: https://github.com/waggle-sensor/waggle-blade

Running plugins @ the Edge

Included in the Waggle operating systems are the core components necessary to enable running plugins @ the edge. At the heart of this is k3s, which creates a protected & isolated run-time environment. This environment, combined with the tools and services provided by WES, enables plugin access to the node's CPU, GPU, sensors and cameras.

Waggle Edge Stack (WES)

The Waggle Edge Stack is the set of core services running within the edge node's k3s run-time environment that supports all the features that plugins need to run on the Waggle nodes. The WES services coordinate with the core Beehive services to download & run scheduled plugins (including load balancing) and to facilitate uploading plugin-published data to the Beehive Data Repository. Through abstraction technologies and WES-provided tools, plugins have access to sensor and camera data.

Figure 8: Waggle Edge Stack Overview

The above diagram demonstrates 2 plugins running on a Waggle node. Plugin 1 ("neon-kafka") is an example plugin that is running alongside Plugin 2 ("data-smooth"). In this example, "neon-kafka" (via the WES tools) is reading metrics from the node's sensors and then publishing that data within the WES run-time environment (internal to the node). At the same time, the "data-smooth" plugin is subscribing to this data stream, performing some sort of inference and then publishing the inference results (via WES tools) to Beehive.
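The publish/subscribe relationship between the two plugins can be sketched with a small in-process message bus. This is a stand-in under stated assumptions: the LocalBus class, the topic names, and the use of a moving average in place of "some sort of inference" are all hypothetical, and real plugins would use the WES tools rather than this class.

```python
from collections import defaultdict, deque


class LocalBus:
    """Tiny in-process stand-in for the node-internal WES data stream."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, value):
        for callback in self._subscribers[topic]:
            callback(value)


def make_smoother(bus, source, target, window=3):
    """Play the "data-smooth" role: subscribe, smooth, and republish."""
    recent = deque(maxlen=window)

    def on_value(value):
        recent.append(value)
        # A moving average stands in for the plugin's inference step.
        bus.publish(target, sum(recent) / len(recent))

    bus.subscribe(source, on_value)


bus = LocalBus()
smoothed = []
make_smoother(bus, "raw.temperature", "smooth.temperature", window=2)
# Collecting the results stands in for publishing them to Beehive.
bus.subscribe("smooth.temperature", smoothed.append)

# Play the "neon-kafka" role: publish raw sensor readings on the node.
for reading in (10.0, 20.0, 40.0):
    bus.publish("raw.temperature", reading)
```

The two plugins never call each other directly; they only share topic names, which is what lets them be developed and scheduled independently.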

Note: see the Edge apps guide on how to create a Waggle plugin.

Details & source code: https://github.com/waggle-sensor/waggle-edge-stack

What is a plugin?

Plugins are the user-developed modules that the cyberinfrastructure is designed around. At its simplest, a "plugin" is code that runs @ the edge to perform some task. That task may be as simple as collecting sample camera images or as complex as an inference that combines sensor data with results published by other plugins. A plugin's code interfaces with the edge node's sensor(s) and then publishes the resulting data via the tools provided by WES. All developed plugins are hosted by the Beehive Edge Code Repository.

See how to create plugins for details.

Science Goals

A "science goal" is a rule-set for how and when plugins are run on edge nodes. These science goals are created by scientists to accomplish a science objective through the execution of plugins in a specific manner. Goals are written in a human-readable language and managed within the Beehive Edge Scheduler. It is then the cyberinfrastructure's responsibility to deploy the science goals to the edge nodes and execute the goal's plugins. The tutorial walks through running a science goal.
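The idea of a rule-set can be sketched as a list of conditions, each deciding whether a plugin should run given the node's current state. This is purely illustrative: the Rule class, the plugin names, and the state keys are hypothetical, and the actual science goal language used by the Edge Scheduler is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: run the plugin whenever the condition holds."""
    plugin: str
    when: Callable[[Mapping[str, float]], bool]


def plugins_to_run(rules, state):
    """Return the plugins whose conditions are satisfied by the current state."""
    return [rule.plugin for rule in rules if rule.when(state)]


# A made-up goal: sample images every 5 minutes, and run a second
# plugin only when the measured temperature is unusually high.
goal = [
    Rule("image-sampler", lambda s: s.get("minute", 0) % 5 == 0),
    Rule("smoke-detector", lambda s: s.get("temperature_c", 0.0) > 35.0),
]

on_schedule = plugins_to_run(goal, {"minute": 10, "temperature_c": 20.0})
on_heat = plugins_to_run(goal, {"minute": 7, "temperature_c": 40.0})
```

Expressing "how and when" as data rather than code is what allows a goal to be stored centrally and copied to each node for local evaluation.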

LoRaWAN

The Waggle Edge Stack includes the ChirpStack software stack and other services to facilitate communication between Nodes and LoRaWAN devices. This lets Nodes connect to wireless sensors so that plugins can access the data those sensors produce.

Figure 9: Abstracted WES LoRaWAN Architecture

The main components in our LoRaWAN implementation are the ChirpStack software stack, the lorawan listener plugin, and the LoRaWAN gateway.

ChirpStack is a network server that manages LoRaWAN devices.
The lorawan listener plugin publishes values sent by LoRaWAN devices to the Beehive.
The LoRaWAN gateway is a hardware device that receives wireless data from LoRaWAN sensors and forwards it to the node for processing.
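The listener's role can be sketched as a function that turns one uplink event into a record shaped like the Data Repository example. Several things here are assumptions rather than facts from this document: the event layout (a base64-encoded data field and a deviceInfo.devEui identifier), the payload encoding (a big-endian signed 16-bit integer in hundredths of a degree), and the names uplink_to_record and lorawan.temperature. Real payload formats are device-specific.

```python
import base64
import json
import struct


def uplink_to_record(raw_event: str) -> dict:
    """Convert one uplink event (JSON text) into a publishable record."""
    event = json.loads(raw_event)
    # Assumed layout: the raw device payload arrives base64-encoded.
    payload = base64.b64decode(event["data"])
    # Assumed encoding: temperature in hundredths of a degree Celsius,
    # stored as a big-endian signed 16-bit integer.
    (centi_degrees,) = struct.unpack(">h", payload[:2])
    return {
        "name": "lorawan.temperature",  # hypothetical measurement name
        "value": centi_degrees / 100,
        "meta": {"devEui": event["deviceInfo"]["devEui"]},
    }


# Build a made-up event to exercise the function.
example_event = json.dumps(
    {
        "deviceInfo": {"devEui": "0102030405060708"},
        "data": base64.b64encode(struct.pack(">h", 2505)).decode("ascii"),
    }
)
record = uplink_to_record(example_event)
```

Keeping the device identifier in the record's metadata lets data consumers tell apart readings from different wireless sensors attached to the same node.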

To help you get started with LoRaWAN, refer to the LoRaWAN Reference Guide.
