Generative AI at the Edge

Michael Uyttersprot walks through the different Edge GenAI demonstrations at embedded world 2025

Generative AI at the Edge, or 'Edge Gen AI', is the deployment and execution of generative artificial intelligence (AI) models on edge computing devices rather than relying solely on centralised cloud systems. This convergence brings AI capabilities closer to where data is generated: on devices such as sensors, microcontrollers (MCUs), gateways, and edge servers.

Here is a breakdown of what Gen AI at the edge entails:

 

Edge Gen AI Advantages

Localised Processing

Gen AI at the edge runs AI processing directly on embedded devices, minimising the need for constant communication with the cloud. Devices can therefore generate output and respond in real time.

Real-time Interactions

A primary advantage of Gen AI at the edge is the ability to achieve real-time, low-latency interactions and decision-making. This is crucial for applications like autonomous vehicles, robotics, and real-time diagnostics.

Enhanced Privacy and Security 

By processing data locally, Gen AI at the edge enhances data privacy and security by reducing the need to transmit sensitive information over networks. This is particularly important in sectors like healthcare and finance.

Reduced Bandwidth Consumption

Gen AI at the edge helps reduce both bandwidth usage and latency by processing data on-site, thereby minimising the need for extensive data transfer to and from the cloud. This is especially beneficial in environments with limited connectivity.

Improved Reliability

Gen AI at the edge provides higher reliability by enabling continuous functionality even in offline environments, thereby ensuring operational stability in critical applications such as industrial automation.

Scalability and Efficiency

While edge devices have resource constraints, advancements in model optimisation techniques, such as pruning, quantisation, and knowledge distillation, are making it feasible to deploy generative models efficiently at the edge. Strategies such as model partitioning and federated learning also help balance the computational load across edge devices.
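As a rough illustration of one of these techniques, the sketch below applies symmetric 8-bit quantisation to a handful of weights in plain Python. The function names and sample values are illustrative assumptions, not part of any Avnet Silica or TRIA toolchain; a real deployment would typically use the quantisation tooling of its chosen framework.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantisation: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the quantised integers."""
    return [v * scale for v in q]


# Illustrative weights (assumed values, not taken from a real model).
weights = [0.42, -1.27, 0.05, 0.9, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantisation step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now needs one byte instead of the four bytes of a 32-bit float, at the cost of a small, bounded rounding error.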

Where is ‘the Edge’?

The edge computing infrastructure has several layers, and the distinction matters for Gen AI workloads because each layer can host models of a different size. "On-premise" is not a distinct layer in the same sense, but it aligns with the general concept of edge computing as localised rather than cloud-based. Here’s a breakdown of the differences:

Edge Locations

Far Edge

This layer represents the devices closest to where data is initially generated.

  • Examples include sensors, cameras, and IoT devices.
  • Processing at this stage focuses on real-time data analysis and immediate decision-making using compressed, resource-efficient Gen AI models. These models often comprise a small fraction of the parameters of larger Large Language Models (LLMs).
  • An example is a sensor in an industrial setting that monitors machinery and makes localised adjustments based on immediate data.

Near Edge

This layer consists of devices or edge servers that are more capable than far-edge devices and handle further processing.

  • Near-edge devices process data transmitted from far-edge devices and perform more complex computations and aggregations.
  • They can run larger LLMs compared to those deployed at the far edge, enabling deeper analysis.
  • In a smart city, a gateway that aggregates traffic data from various sensors and uses Gen AI to predict traffic patterns would be an example of near-edge processing.

On-Premise

Although not strictly a layer within the edge computing infrastructure, the concept is closely related to edge computing in general, as opposed to cloud computing.

  • "On-premise" generally refers to infrastructure and processing that is located locally, within an organisation’s physical premises or close to the users and devices generating or consuming data.
  • Edge computing, by its nature, brings data processing closer to the source, on-site or near the point of data generation, which can be considered "on-premise" in many contexts. This contrasts with sending data to a remote cloud data center for processing.
  • For example, Avnet Silica’s ‘Phone Box’ chatbot processes and generates speech locally, so it operates on-premise in the sense that it does not rely on cloud infrastructure for real-time interactions.

In summary, the far edge involves initial, localised processing on very constrained devices, the near edge entails more complex processing on more capable local servers or devices, and on-premise generally refers to the localised nature of edge computing itself, contrasting with cloud-based processing. This on-premises processing, whether at the far or near edge, enables benefits such as low latency, enhanced privacy, and reduced bandwidth usage. 
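To make the difference between the layers more concrete, the sketch below estimates the memory needed for a model's weights (parameters × bits per weight ÷ 8) and picks the first layer whose budget can hold them. The memory budgets and function names are hypothetical placeholders chosen for illustration only; real devices vary widely, and the estimate ignores activations and other runtime memory.

```python
def model_size_bytes(num_params, bits_per_weight):
    """Approximate storage for the weights alone (no activations or caches)."""
    return num_params * bits_per_weight // 8


# Hypothetical memory budgets per layer, ordered from most to least constrained.
TIER_BUDGETS = [
    ("far edge", 256 * 1024**2),   # assumed: 256 MiB on a constrained device
    ("near edge", 16 * 1024**3),   # assumed: 16 GiB on an edge server or gateway
]


def smallest_fitting_tier(num_params, bits_per_weight, budgets=TIER_BUDGETS):
    """Return the first (most constrained) tier whose budget fits the weights."""
    size = model_size_bytes(num_params, bits_per_weight)
    for name, budget in budgets:
        if size <= budget:
            return name
    return None  # too large for either assumed edge tier


print(smallest_fitting_tier(100_000_000, 8))      # 100M parameters at 8 bits
print(smallest_fitting_tier(7_000_000_000, 4))    # 7B parameters at 4 bits
print(smallest_fitting_tier(70_000_000_000, 16))  # 70B parameters at 16 bits
```

Under these assumed budgets, a small quantised model fits the far edge, a mid-sized quantised LLM needs the near edge, and a large unquantised model fits neither.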

Gen AI at the Edge in Action

Real-time Anomaly Detection and Predictive Maintenance in Industrial IoT (IIoT)

In manufacturing and industrial IoT, generative AI models deployed on edge devices enable real-time anomaly detection and predictive maintenance. These models can analyse data from sensors on industrial equipment locally, anticipating equipment failures and optimising operations without relying on cloud connectivity. For example, an AI chatbot integrated with industrial equipment at the edge can provide automated troubleshooting based on real-time data.
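The local-processing idea can be sketched with a much simpler stand-in than a generative model: a rolling z-score check that flags sensor readings far outside the recent range, entirely on the device. The class name, window size, threshold, and readings below are illustrative assumptions, and this statistical check is not the method used in any Avnet Silica product.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags readings that deviate strongly from a sliding window of recent values."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.readings = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def update(self, value):
        """Return True if `value` looks anomalous relative to the current window."""
        is_anomaly = False
        if len(self.readings) >= self.min_samples:
            mu = mean(self.readings)
            sigma = pstdev(self.readings)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                is_anomaly = True
        # Only normal readings update the baseline, so outliers do not skew it.
        if not is_anomaly:
            self.readings.append(value)
        return is_anomaly


# Assumed temperature readings from a machine sensor, ending with a spike.
detector = RollingAnomalyDetector(window=10, threshold=3.0)
normal = [20.0, 20.1, 19.9, 20.2, 19.8, 20.0, 20.1]
flags = [detector.update(v) for v in normal]
spike_flag = detector.update(35.0)
print(flags, spike_flag)
```

Because every step runs on the device, the check keeps working without a network connection; a generative model could then be used to explain or troubleshoot the flagged event.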

Personalised Voice Responses and Soundscapes in Smart Home Devices

Microcontrollers (MCUs) in smart home devices can run lightweight LLMs at the edge to generate personalised voice responses or soundscapes based on user preferences and input data. Instead of sending user input to the cloud for processing, the device can dynamically create ambient background noise to match the user's mood or generate personalised workout routines locally, thereby enhancing privacy and reducing cloud reliance.

Real-time Diagnostic Insights from Medical Imaging

In healthcare, Gen AI models running on edge devices can analyse medical images in real time. This allows healthcare providers to obtain immediate, personalised diagnostic insights directly at the point of care, without the latency and privacy concerns associated with constant cloud connectivity. For instance, generative models can enhance the resolution of medical scans at the edge, helping radiologists to make more accurate diagnoses.

Autonomous Robots with Real-time Object Tracking and Navigation

Gen AI at the edge enables autonomous mobile robots (AMRs) to perform real-time object tracking and navigation directly on the device. This is crucial for applications such as autonomous delivery robots operating in areas with poor or intermittent connectivity, as they can process sensor data and make decisions locally without relying on the cloud. These robots can also run an on-board LLM assistant for various tasks.

Smart Bus Stops Providing Real-time Information

In the transportation sector, edge Gen AI chatbots are being used at smart bus stops to provide passengers with dynamic, real-time bus information through voice commands. These chatbots integrate with timetable databases and operate locally, allowing travellers to access schedules, route details, and estimated arrival times even during internet outages, improving accessibility and inclusivity.
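A minimal sketch of the local timetable lookup such a chatbot might perform is shown below. The timetable data, route numbers, and function names are hypothetical, and the step that passes the resulting text to a locally running language model or speech engine is deliberately left out.

```python
from datetime import time

# Hypothetical timetable held on the device: (route, departure time).
TIMETABLE = [
    ("12", time(8, 5)),
    ("12", time(8, 35)),
    ("7", time(8, 20)),
    ("12", time(9, 5)),
]


def next_departures(timetable, route, now, limit=2):
    """Return up to `limit` upcoming departure times for a route, earliest first."""
    upcoming = sorted(t for r, t in timetable if r == route and t >= now)
    return upcoming[:limit]


def build_answer_context(route, departures):
    """Turn the lookup result into text a local chatbot could speak or expand on."""
    if not departures:
        return f"No further departures for route {route} today."
    times = ", ".join(t.strftime("%H:%M") for t in departures)
    return f"Next departures for route {route}: {times}."


departures = next_departures(TIMETABLE, "12", time(8, 10))
print(build_answer_context("12", departures))
```

Because the timetable is stored on the device, the lookup still works during an internet outage; only updates to the timetable itself would need connectivity.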

NEED SUPPORT? CONTACT OUR AI EXPERTS

Featured case study

Generative AI at the Edge Chatbot

The Edge GenAI Chatbot (demonstrated at electronica 2024 and embedded world 2025) is a locally operated chatbot that runs directly on embedded devices, delivering fast, low-latency responses while ensuring enhanced privacy. Its modular software design allows flexibility across diverse hardware configurations, making it adaptable to specific application requirements. Additionally, it is compatible with a broad selection of TRIA System-on-Modules (SOMs) to optimise performance. Comprehensive software support is provided, offering robust resources and tools for seamless integration and customisation.

Featured Generative AI at the Edge articles

Revolutionising chatbot interactions

This article explores the increasing demand for localised AI solutions, particularly voice chatbots, and shows how Avnet Silica’s modular hardware and software architecture provides scalable, real-time chatbot interactions across industries. It highlights the technical advancements in Edge AI, the applications of embedded chatbot systems, and how Avnet Silica’s chatbot (running on TRIA System on Modules (SOMs) and other technologies) and its software support ecosystem enable the seamless deployment of next-generation AI-powered voice assistants.

Featured Generative AI at the Edge applications

GenAI at the Edge in Hospitality

For hospitality businesses, the shift to edge computing represents a significant opportunity to enhance both customer experience and operational efficiency.

GenAI at the Edge in Industrial Automation

The industrial sector is fundamentally transforming as automation and artificial intelligence (AI) redefine how machines interact with operators and environments.

GenAI at the Edge in Transportation

Chatbots are emerging as a transformative force in transportation, particularly in applications such as smart bus stops, where they assist visually impaired passengers through real-time auditory cues, route guidance, and accessibility information.

The Future is here

Avnet Silica offers scalable embedded solutions designed explicitly for localised AI, enabling real-time interactions and processing at the edge. Companies can therefore move away from relying solely on cloud-based AI models, achieving low-latency operations while ensuring data privacy and security.

TRIA's System on Modules (SOMs), for example, serve as a versatile and modular hardware platform that facilitates scalable AI deployment. These modules enable businesses to tailor AI solutions for diverse industry applications while maintaining high performance and energy efficiency. The TRIA SOMs offer flexibility, optimised performance for AI workloads with low power consumption, and scalability to expand AI capabilities. This modular approach simplifies the development of custom AI-driven systems, thereby accelerating the time-to-market.

Avnet Silica's Edge GenAI ‘Phone Box’ chatbot is a key innovation that showcases real-time Gen AI processing at the edge. This AI-powered voice chatbot demonstrates how embedded AI can drive efficiency, enhance customer interactions, and provide automation across various industries without relying on the cloud.

Avnet Silica provides a comprehensive ecosystem of support services, extending beyond hardware and software solutions. Their TRIA software team is dedicated to helping businesses customise and optimise their AI integrations, tailoring solutions to specific operational needs, including language model adaptations and integration with existing systems.

Avnet Silica’s Field Application Engineer (FAE) team provides crucial support and guidance throughout the development and deployment process. FAEs offer technical expertise, assisting developers in configuring, testing, and optimising their AI solutions for peak performance, from proof-of-concept to full-scale deployment.

In essence, Avnet Silica provides a modular, scalable, and fully supported ecosystem of hardware and software solutions, exemplified by their TRIA SOMs and the ‘Phone Box’ chatbot, along with expert technical assistance, to enable companies to confidently adopt and leverage the benefits of Gen AI at the edge for various applications.
