
On-device artificial intelligence

Fictional scenario: A day in Alice’s life

Alice woke up to the sound of her AI alarm. It rang a bit earlier than usual because it had checked her calendar and knew she had an important meeting that morning. The alarm understood that Alice needed more time to prepare in such cases. Unfortunately, she did not have a restful sleep, even with her AI bed, which monitored her sleep, adjusted its temperature and firmness, and repositioned her based on her vitals. Last night, the readings were atypical and the AI bed kept adjusting itself to no avail, affecting Alice’s sleep quality. Without her knowledge, the AI bed emailed the information to the vendor and recommended changing the mattress. Alice, who received a copy of the email, was not pleased: she had not been aware of all the data being recorded and had never consented to her vital signs and sleeping habits being shared.


Her AI assistant had already turned on the TV in the living room to show the day’s news. The news was tailored to her past TV usage: the weather, new advancements in technology and the schedule of her favourite TV shows. As usual, her AI assistant started its audio recording; all of Alice’s voice requests would serve to improve the manufacturer’s software and benefit its customer base. Alice never agreed to this, but the feature came enabled by default. Having been told that this was a no-configuration device, she never thought to look at the options.


In her garage, her AI-powered car was waiting for her, its door opening as soon as she got close. Her AI car informed her that a road accident had happened and automatically turned on the GPS to show her an alternative route. Only 5 minutes lost; no big deal. Nevertheless, the car detected her level of stress and confirmed it with her morning’s agenda. Additional readings from the AI car’s sensors detected she was tired but did not detect alcohol consumption. All in all, the AI car decided she could drive today. Soothing music came out of the speakers to make her feel comfortable, but her detour proved longer than expected as traffic slowed everyone down. Fuming and worried about being late, she turned off the music only to have it come back on again. The AI car was adamant that she needed to relax!


After a long day’s work, Alice made it back home.

Taking into account Alice's stress level and the ingredients available, the AI fridge recommended a dessert consisting of a milkshake made from banana, whole milk and, of course, chocolate ice cream. In the past, it made her feel better and was a relaxing treat after a hard day's work. Never mind her side job as a health influencer and her commitment to a healthy lifestyle. After all, her viewers would never know!


Alice would need to remember to erase this meal in her tracking app, which recorded her daily intake automatically.


While she was enjoying her dessert, she received an email from the meal tracking app provider. It informed her that they had suffered a data breach and that all her personal data had been leaked onto the Internet. She looked at her milkshake again, then at her phone. She just couldn’t believe her bad luck; her online critics would have a field day tomorrow, berating her mercilessly for cheating and misrepresenting herself.


A disastrous day, courtesy of her AI-enabled devices!

On-device artificial intelligence

By Andy Goldstein

On-device AI refers to a model architecture in which AI is implemented and executed directly on end devices, such as smartphones, wearables (e.g. smartwatches), or home appliances. The AI performs its inference[i] and continuous training on the end-device, close to where the data is generated, as opposed to running on servers or in the cloud. This minimises latency[ii] and enables real-time decision-making, which can be critical for some applications. In addition, since data is processed locally, only relevant data needs to be sent to the cloud, conserving bandwidth and reducing data transmission costs, which can be particularly beneficial in environments with limited or costly internet connectivity.

On-device AI can also be implemented using federated learning, an approach in which multiple sources of data (end-devices) collaborate to train a shared AI model while keeping the data decentralised. Instead of sending raw data to a central server or to other end-devices, each end-device processes its own data locally and shares only the resulting AI model updates. This makes it possible to build AI models that draw on data from different sources when sharing that data is not possible or desirable.
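The idea can be sketched in a few lines of plain Python. The following is a minimal, illustrative version of federated averaging for a one-parameter linear model; the function names, learning rate and client data are all invented for the example, and a real deployment would use a framework with secure aggregation.

```python
# Minimal sketch of federated averaging for the model y = w * x.
# Each device trains on its own data; only the weight leaves the device.

def local_update(w, data, lr=0.01, epochs=5):
    """Gradient descent on one device's local data."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # gradient of squared error
            w -= lr * grad
    return w

def federated_round(w_global, client_datasets):
    """Each device trains locally; the server averages the resulting weights."""
    local_weights = [local_update(w_global, d) for d in client_datasets]
    return sum(local_weights) / len(local_weights)

# Three devices, each holding data consistent with y = 3x (never shared).
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
    [(5.0, 15.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(round(w, 2))  # converges towards 3.0
```

Note that the server only ever sees model weights, never the `(x, y)` pairs held on each device, which is the property the paragraph above describes.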

The concept of on-device AI has gradually evolved since the 2000s with the advent of end-devices with more processing power. The introduction of smartphones put more computing power in the hands of individuals and, over time, this evolved into increasingly sophisticated Internet of Things (IoT) devices such as wearables and home devices (such as cameras or doorbells).

Specialised processors have been developed to perform on-device AI tasks efficiently, often with better performance than CPUs, such as digital signal processors (DSPs)[iii], neural processing units (NPUs)[iv] and application-specific integrated circuits (ASICs)[v]. DSPs, for example, require little power, making them particularly suitable for smartphones and wearables.

In situations where the AI model is not provided by an external source, the device is not part of a federated network, and the task itself does not require internet access, on-device AI may not require an internet, server or cloud connection.

As mentioned previously, on-device AI systems may also continuously train themselves with data (including personal data) collected by and located on the end-devices. The main disadvantage of training on the end-device is that it has fewer resources (storage in particular) than a central server, limiting the capacity to train the model.

Additionally, training requires substantial amounts of local data. Running intensive on-device AI can also significantly drain the battery of mobile devices; advances in low-power AI chips and energy-efficient algorithms are ongoing research areas.

Autonomous vehicles are increasingly leveraging on-device AI to enhance their functionality and safety. These vehicles can process vast amounts of sensor data in real-time, enabling them to detect and respond to dynamic environments, such as pedestrians, traffic signals and road conditions.

Another example is smart wearable devices (smartwatches, fitness trackers, health monitors, etc.), which leverage on-device AI to process data locally, allowing for real-time analysis of various health metrics such as heart rate, activity levels and sleep patterns.
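To make the wearable example concrete, here is a toy sketch of the kind of analysis such a device might run locally: flagging heart-rate readings that deviate sharply from a short rolling baseline, without any sample leaving the device. The window size, threshold and readings are invented for illustration and are not based on any real product.

```python
# Toy on-device anomaly check: flag readings far above a rolling average.
from collections import deque

def detect_spikes(readings, window=5, threshold=25):
    """Return indices of readings well above the recent rolling average."""
    recent = deque(maxlen=window)   # only the last `window` samples are kept
    spikes = []
    for i, bpm in enumerate(readings):
        if len(recent) == window and bpm - sum(recent) / window > threshold:
            spikes.append(i)
        recent.append(bpm)
    return spikes

# A resting heart-rate trace with one sudden spike (beats per minute).
heart_rate = [62, 64, 63, 61, 65, 64, 118, 63, 62, 64]
print(detect_spikes(heart_rate))  # [6]
```

Because the computation is a fixed-size window over a local buffer, it runs in constant memory, which is exactly the kind of footprint a wearable can afford.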

Other use cases for on-device AI are military applications such as drones or autonomous robots. The autonomy to make decisions in isolation is critical in scenarios where connectivity with a human operator or a central command system may be compromised by frequency jamming or other forms of electronic warfare.

Development status

On-device AI is constantly evolving. ARM-based machines[vi], known for their power efficiency (more performance per watt, less heat), are resurging as strong candidates for on-device AI. Specialised processors built for AI tasks, such as mobile Systems on Chips (SoCs) that include dedicated AI accelerators, are also advancing.

AI on-device is already widely used in smartphones, wearables, and smart home devices for tasks like voice assistants, face recognition and health monitoring. With advancements in edge computing, AI on-device is rapidly growing, particularly in industries like automotive and healthcare. While challenges like energy efficiency and privacy remain, the technology is quickly moving toward widespread adoption.

Neuromorphic chips are specialised types of computer processors engineered to emulate the neural structures and processes of the human brain, aimed at enhancing computing efficiency and adaptability. One primary objective is to achieve significant energy efficiency by utilising event-driven processing, which allows these chips to operate asynchronously and only activate when necessary, reducing power consumption compared to traditional computing architectures. This adaptability is crucial for developing intelligent systems that can operate in dynamic environments.
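The energy argument behind event-driven processing can be illustrated with a deliberately simplified model (this is an analogy in plain Python, not neuromorphic hardware): a clocked design does work on every tick, while an event-driven one only activates when the input changes. The operation counts below are invented bookkeeping for the comparison.

```python
# Compare work done by a clocked design vs an event-driven one
# on a mostly-static sensor stream.

def clocked_ops(samples):
    """A clocked design processes every sample, changed or not."""
    return len(samples)

def event_driven_ops(samples):
    """An event-driven design activates only when the input changes."""
    ops, last = 0, None
    for s in samples:
        if s != last:
            ops += 1
            last = s
    return ops

# Long runs of identical readings, as a door or motion sensor produces.
stream = [0] * 50 + [1] * 10 + [0] * 40
print(clocked_ops(stream), event_driven_ops(stream))  # 100 3
```

With only three changes in a hundred samples, the event-driven count is a small fraction of the clocked one, which is the intuition behind the power savings claimed for asynchronous, event-driven chips.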

Smaller and more efficient storage solutions, in terms of capacity, power consumption, speed, and latency, enable devices to store and process more data on-device. This is crucial for the continuous training of AI and for storing larger, more powerful AI models.

On the software side, on-device AIs benefit from more efficient algorithms and novel computer science techniques, requiring less processing power and storage without significant loss of accuracy. For example, TinyML models are designed specifically for on-device AI, benefiting from model optimisation techniques such as Neural Architecture Search (NAS), which automates the process of designing efficient neural networks.
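One widely used optimisation of this kind is post-training quantization: storing weights as small integers instead of 32-bit floats. The sketch below shows a generic symmetric 8-bit scheme in plain Python; it is illustrative only and does not reproduce any specific toolkit's implementation, and the example weights are made up.

```python
# Illustrative symmetric 8-bit post-training quantization of model weights.

def quantize(weights, num_bits=8):
    """Map float weights to signed integers sharing one scale factor."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float weights for inference."""
    return [q * scale for q in q_weights]

weights = [0.42, -1.27, 0.08, 0.91, -0.33]
q, scale = quantize(weights)
restored = dequantize(q, scale)

# Each value now fits in 1 byte instead of 4 (float32): a 4x size cut,
# at the cost of a rounding error of at most one quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(max_err <= scale)
```

The same trade-off (smaller, faster models versus a bounded loss of precision) underlies most of the compression techniques used to fit models onto end-devices.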

Even with these limitations, on-device AI is already capable of performing multiple tasks, such as processing sensor data and performing advanced image processing (e.g. object detection and recognition, facial recognition), and using these as input for the AI model running on the device. With an ever-increasing amount and variety of sensors available (e.g. vision, speech, LIDAR), on-device AIs can build a more comprehensive understanding of their context and process data more effectively. For example, when a visually impaired individual wants to hail a taxi on the street, AI-enabled camera glasses can now detect a passing free taxi and verbally prompt the user to flag it down.

Potential impact on individuals

Not all on-device AI systems will process personal data, but for many applications - such as voice assistants, health monitoring, and personalised services - personal data becomes highly relevant, requiring careful consideration of privacy and data protection measures.

If the devices that process personal data are at the user’s end (e.g. a personal mobile device), there is no need to transmit the information outside the device holding the information. In other words, the personal data on the individual’s device does not need to be sent to a cloud service or the internet for processing by the AI. This significantly alters data protection risks from several angles.

First, personal data of the individual might not need to be transmitted outside the device where it is processed. This suggests a greater alignment with confidentiality, data minimisation and storage limitation principles. Ideally, there should only be one copy of the personal data residing on the device itself.

Second, since personal data processing does not occur outside the device, there is a higher likelihood that purpose limitation is better applied. This allows individuals to agree or disagree with sending their personal data outside the device. In this context, user information and awareness is critical to ensure that personal data is only sent outside the device for specific purposes.

It should be noted, however, that 'on-device AI' data processing (as illustrated in the fictional scenario) does not necessarily mean that purpose limitation is met - for example, a profile of the user can still be created and potentially used for various other purposes, including the transfer of that profile to data brokers.

Moreover, it is important to emphasise that personal data is still processed on the device, such as for training purposes, which could result in excessive processing of personal data if the AI indiscriminately processes all available data on the device. The monitoring of data handling might be facilitated because the data remains in one location and does not leave the device, aiding in detecting whether confidentiality, data minimisation and storage limitation principles are properly applied.

Given that the input personal data (used for training and ongoing training) is closely associated with each individual, the AI's output quality promises to be more relevant to the individual, improving personalisation. However, AIs trained only on the end-devices may struggle to learn robust patterns and generalise effectively to new and unseen scenarios. Furthermore, training only on local data increases the risk of AI bias since it is not possible to access each and every end-device’s training data and thus tackle the potential bias of the AI overall. These risks can be mitigated through methods like federated learning models.

Security is a very important factor for on-device AI systems, as data security becomes the individual’s responsibility on devices that provide limited security capabilities.

Ultimately, it is important to remember that the output of these systems will always have an impact on the individual, which in some cases may be relatively small (such as a smartwatch that monitors the user’s sleep and makes suggestions for improvement) or large (such as an autonomous vehicle that decides when to brake and when to turn).


Suggestions for further reading

  • Moon, J., Lee, H. S., Chu, J., Park, D., Hong, S., Seo, H., ... & Ham, M. (2024, April). A New Frontier of AI: On-Device AI Training and Personalization. In 2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) (pp. 323-333). IEEE Computer Society. https://dl.acm.org/doi/10.1145/3639477.3639716
  • Sauptik Dhar, Junyao Guo, Jiayi (Jason) Liu, Samarth Tripathi, Unmesh Kurup, and Mohak Shah. 2021. A Survey of On-Device Machine Learning: An Algorithms and Learning Theory Perspective. ACM Trans. Internet Things 2, 3, Article 15 (August 2021), 49 pages. https://arxiv.org/pdf/1911.00623
  • Siu, J. C. Y., Chen, J., Huang, Y., Xing, Z., & Chen, C. (2023). Towards Real Smart Apps: Investigating Human-AI Interactions in Smartphone On-Device AI Apps. arXiv preprint arXiv:2307.00756. https://arxiv.org/pdf/2307.00756
  • Xu, J., Li, Z., Chen, W., Wang, Q., Gao, X., Cai, Q., & Ling, Z. (2024). On-device language models: A comprehensive review. arXiv preprint arXiv:2409.00088. https://arxiv.org/abs/2409.00088

[i] Inference (in AI) - Refers to the process of using a trained model to make predictions or decisions based on new, unseen data.

[ii] Latency - The time delay between a request for data and the beginning of the data transfer. Usually measured in milliseconds (ms).

[iii] Digital Signal Processors (DSPs) - Specialised microprocessors designed to perform the complex mathematical computations involved in digital signal processing, which includes tasks such as filtering, modulation and demodulation of signals, as well as other operations such as encoding, decoding and compression.

[iv] Neural Processing Units (NPUs) - Specialised hardware accelerators designed to efficiently handle the computational requirements of artificial neural networks and other machine learning algorithms. NPUs are purpose-built to deliver high performance and energy efficiency for AI workloads.

[v] Application-Specific Integrated Circuit (ASIC) - ASICs are custom-designed integrated circuits that are tailored to a specific purpose or application. They are not general-purpose circuits, such as standard microprocessors.

[vi] Advanced RISC Machines (ARM) - A family of computer processors, first developed in the 1980s, based on reduced instruction set computing (RISC), a design that utilises a small, highly optimised set of instructions.