Armchair Architects: The edge computing model
So, what is “edge computing”? It’s an on-premises compute environment, which can vary in size, that helps customers and solutions do things like aggregate, filter, or run inference on data coming from devices connected to that on-premises environment. In many cases there’s a spectrum of devices, from constrained edge devices to data-class edge devices, all with different levels of compute power and complexity. There are different types of solutions and places where you can host AI models, filtering models, and database components, all for the purpose of processing data on-premises before it is sent to the cloud.
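To make the “aggregate and filter” idea concrete, here is a minimal sketch in Python of an edge workload that drops implausible sensor samples and reduces the rest to one summary record before anything is sent upstream. The function name, the range limits, and the sample values are all illustrative assumptions, not part of any Azure SDK.

```python
from statistics import mean


def summarize_readings(readings, low, high):
    """Drop out-of-range samples, then reduce the rest to one summary record.

    `low` and `high` are hypothetical plausibility limits for the sensor.
    Returns None when no sample survives the filter.
    """
    valid = [r for r in readings if low <= r <= high]
    if not valid:
        return None
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
    }


# Four raw samples, one of them a sensor glitch (-999.0).
summary = summarize_readings([21.0, 21.5, -999.0, 22.0], low=-40.0, high=125.0)
```

Only `summary` would need to travel to the cloud, which is the point: the edge turns many raw samples into one small record.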
Another key reason it is called “edge” (after all, you could do all of the above with many types of traditional on-premises technologies) is that it is part of the cloud. You take the cloud’s operating model, control plane, business model, technology model, security models, and so on, and extend them from a centralized environment like Azure to a place where you want the cloud capabilities to run on-premises. It could be in a factory, inside an airplane, or even inside an automobile.
The edge model provides flexibility by pushing processing to the edge so that data and calls do not need to travel back to cloud infrastructure, which wastes time and reduces performance. However, Microsoft does not assume that customers want to run a full data center environment (i.e., a rack). Instead, we provide a software framework that can optionally come with an appliance, like Azure Stack Edge or Azure Stack Hub.
The latest incarnation of our edge management framework is called Azure Arc. Azure Arc extends the control plane of Azure to capabilities that are outside of Azure, like VMware VMs, Azure Stack VMs, or Kubernetes (K8s) environments. Down the line, we’re also planning to support smaller Kubernetes distributions, such as K3s, so that you can effectively manage edge environments ranging from server-class hardware to smaller, more constrained devices (i.e., devices with compute and storage capabilities comparable to a Raspberry Pi or even less).
There are three primary use cases that make edge computing important. The first is safety. If a physical device that humans interact with or often work near has operating safety requirements (like an oven), telemetry indicating alerts or alarms should be routed to edge workloads first rather than waiting on a trip to the cloud. Local alerting and decision-making are important, especially if an alert indicates that an oven is in a dangerous operational cycle that could damage the equipment and endanger operators. Anything that has to do with safety, especially where human operators and personnel are involved, must be handled locally.
The key considerations for edge workload requirements are primarily latency and secondarily compute capability at the edge. In the above example, latency is the critical factor in deciding where alerts must be routed and acted upon. A round trip to the cloud, however good the connectivity, is the long path and will often be a supplemental one. When a scenario or use case requires pressing local evaluation of telemetry, edge workloads are the primary target for that work. These edge workloads, managed as an extension of the cloud, offer increased processing capability within the edge environment for rapid disposition and action.
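The oven example can be sketched as follows: the edge workload acts on an unsafe reading locally, and only afterwards queues the record for the cloud. This is a hedged illustration in Python. The temperature limit, the `Reading` type, the `shut_down` callback, and the list used as a cloud outbox are all hypothetical stand-ins, not a real device API.

```python
from dataclasses import dataclass

OVEN_MAX_SAFE_C = 260.0  # hypothetical safety limit, chosen for illustration


@dataclass
class Reading:
    device_id: str
    temperature_c: float


def handle_reading(reading, shut_down, cloud_outbox):
    """Act on a safety condition locally, then queue the record for the cloud.

    `shut_down` is an assumed callback that stops the device on the local
    network; `cloud_outbox` is any list-like buffer uploaded later.
    """
    unsafe = reading.temperature_c > OVEN_MAX_SAFE_C
    if unsafe:
        shut_down(reading.device_id)  # local action, no cloud round trip
    cloud_outbox.append(
        {
            "device": reading.device_id,
            "temp_c": reading.temperature_c,
            "alert": unsafe,
        }
    )
    return unsafe


# Usage with simple stand-ins for the actuator and the upload buffer.
stopped, outbox = [], []
was_unsafe = handle_reading(Reading("oven-1", 300.0), stopped.append, outbox)
```

The ordering is the design choice that matters here: the safety action never waits on connectivity, while the cloud still receives the full record for later analysis.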
Of course, there are additional constraints and requirements that can challenge edge workloads, such as the difficulty of running edge compute and storage environments on constrained devices. Another significant challenge is making sure we don’t assume that every device can make direct cloud calls. Many brownfield devices are not only unable to make direct connections and calls via the internet, but are also prevented (with good reason) from doing so. For example, many customers have to support brownfield devices in an ISA-95 and Purdue network configuration, where there are various levels of tightly controlled network connectivity. We need to think about the feasibility of running in those constrained environments and in secure network infrastructure.
For instance, in secured manufacturing environments, it can be extremely difficult to gather telemetry from devices at Level 0, given that often you can’t reach the plant network or the internet from there. Thus, we must design clever ways of deploying nested edge capabilities to bubble telemetry up to an internet-connected level in a secure and constrained environment. Nesting allows networks at deeper levels to talk to edge devices, which in turn talk with the levels above them, until reaching Level 5 (in ISA-95 parlance). Azure’s Nested Edge solution provides the necessary forwarding and queuing architectures so that commands and data can flow in either direction.
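The forwarding and queuing idea behind nesting can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `EdgeNode` class is hypothetical, it models the upstream (telemetry) direction only, and it is not how Azure’s Nested Edge solution is implemented. Each node can talk only to its parent, holds messages while the link is down, and passes them one hop up when the link is available.

```python
from collections import deque


class EdgeNode:
    """One layer in a nested edge hierarchy (hypothetical model).

    A node forwards only to its parent. The node with no parent stands in
    for the internet-connected layer, and `delivered` stands in for the
    cloud upload.
    """

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.queue = deque()
        self.delivered = []

    def receive(self, message):
        self.queue.append(message)

    def flush(self, parent_reachable=True):
        """Forward queued messages one hop up; keep them if the link is down."""
        if not parent_reachable:
            return 0
        sent = 0
        while self.queue:
            message = self.queue.popleft()
            if self.parent is None:
                self.delivered.append(message)
            else:
                self.parent.receive(message)
            sent += 1
        return sent


# Three nested layers: only `top` can reach the cloud.
top = EdgeNode("top")
middle = EdgeNode("middle", parent=top)
bottom = EdgeNode("bottom", parent=middle)

bottom.receive({"sensor": "press-7", "value": 42})
held = bottom.flush(parent_reachable=False)  # link down: message stays queued
for node in (bottom, middle, top):           # link restored: bubble it up
    node.flush()
```

Because every hop queues independently, a lower layer never needs direct visibility of the internet, which matches the tightly controlled connectivity described above.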
Architects must understand what types of constraints need accommodation in edge environments. Given the use case, the requirements, and the scope of what a future-state solution needs to service, architects must identify the potential edge infrastructure, design the workloads and their interactions, determine management requirements and control plane schemes, and make sure that there are sufficient resources for the work to be done. For example, as an architect you should know whether a software edge workload or module that you’re pushing down to the edge can run on a constrained device, or whether it needs a more powerful, consistent piece of infrastructure at the facility.
Thus, the architect determines whether the software and overall system have the requisite reliability, security, and consistency to support the overall solution. Does the solution need high availability or have safety concerns? If so, then you’ll probably need a larger system. If not, then you can probably get away with something less costly.
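One way to reason about whether a workload can run on a constrained device is a simple resource-fit check. The Python sketch below is an assumption-laden illustration: the `Resources` type, the 20% headroom default, and every number in the example are made up, and a real sizing exercise would also weigh availability and safety requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_cores: float
    memory_mb: int
    storage_mb: int


def fits(workload: Resources, device: Resources, headroom: float = 0.2) -> bool:
    """Return True if the workload fits while leaving `headroom` of each
    device resource free (the headroom fraction is an illustrative choice)."""
    usable = 1.0 - headroom
    return (
        workload.cpu_cores <= device.cpu_cores * usable
        and workload.memory_mb <= device.memory_mb * usable
        and workload.storage_mb <= device.storage_mb * usable
    )


# Hypothetical figures for an inference module and two candidate hosts.
inference_module = Resources(cpu_cores=2.0, memory_mb=4096, storage_mb=10_000)
small_board = Resources(cpu_cores=4.0, memory_mb=2048, storage_mb=32_000)
edge_server = Resources(cpu_cores=16.0, memory_mb=65_536, storage_mb=1_000_000)

runs_on_board = fits(inference_module, small_board)
runs_on_server = fits(inference_module, edge_server)
```

In this made-up case the module is memory-bound, so it does not fit the small board and would need the larger system at the facility.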
- If you’d like to learn more and see specific use cases, please watch the video Armchair Architects: Architecting on the Edge.
- If you’d like to learn more about Azure’s Well-Architected initiative, go to our public page for an overview.