Computation-Offloading Algorithms for ML Inference Jobs

Seminar talk titled “Computation-Offloading Algorithms for ML Inference Jobs”

Title of the Talk: “Computation-Offloading Algorithms for ML Inference Jobs”
Speaker: Prof. Jaya Prakash Champati
Host Faculty: Dr. Bheemarjuna Reddy Tamma
Date & Time: Monday, 07th Feb 2022, 12:00

Abstract:

Since the inception of edge computing, significant attention has been given to the computation offloading problem, wherein an Edge Device (ED) needs to decide which jobs to offload to an Edge Server (ES). The objectives considered for optimizing the offloading decision are (1) minimizing the total execution delay of the jobs, i.e., the makespan, and (2) maximizing the energy savings at the ED. Recently, an increasing number of applications use Machine Learning (ML) inference at the EDs, and there is a major thrust toward deploying deep neural networks (DNNs) with reduced computation and storage requirements on the EDs, as this offers, among other advantages, reduced latency. However, the DNNs deployed at the EDs have lower inference accuracy than the larger models that can be hosted at the ES. Accuracy is unique to ML inference jobs and adds a new dimension to the computation offloading problem, giving rise to novel trade-offs among makespan, energy savings at the ED, and inference accuracy.

In this talk, I will present our recent work on the trade-off between inference accuracy and makespan. We formulate a general assignment problem with the objective of maximizing the inference accuracy at the ED subject to a time constraint T on the makespan. Noting that the problem is NP-hard, we propose an approximation algorithm, Accuracy Maximization using LP-Relaxation and Rounding (AMR2), and prove that it results in a makespan of at most 2T and achieves a total accuracy that falls short of the optimal total accuracy by only a small constant. Further, for the case where the inference jobs are identical, we propose Accuracy Maximization using Dynamic Programming (AMDP), an optimal pseudo-polynomial-time algorithm. As a proof of concept, we implemented AMR2 on a Raspberry Pi equipped with MobileNets and connected to a server equipped with ResNet, and we present results on the total accuracy and makespan performance of AMR2 for an image classification application.
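To make the identical-jobs setting more concrete, the following is a minimal illustrative sketch of a pseudo-polynomial dynamic program. It is not the speaker's AMDP algorithm, whose details are not given in this abstract. The sketch assumes, hypothetically, that the ED can run each job with one of several models, each with an integer per-job time and an accuracy, that every offloaded job costs the ES a fixed positive integer time (communication plus inference), that the ED and ES work in parallel, and that the makespan is the larger of the two total times. The function name and all parameters are invented for illustration.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def max_accuracy_identical_jobs(
    n_jobs: int,
    ed_models: List[Tuple[int, float]],  # hypothetical (time per job, accuracy) per ED model
    es_time: int,                        # assumed positive integer time per offloaded job
    es_accuracy: float,                  # assumed accuracy of the ES model
    T: int,                              # time constraint on the makespan
) -> float:
    """Return the maximum total accuracy such that both the ED and the ES
    finish within T, or -inf if no feasible assignment exists."""
    # best[k][t]: maximum total accuracy for k jobs run on the ED
    # using at most t time units (-inf if k jobs cannot fit in t).
    best = [[NEG_INF] * (T + 1) for _ in range(n_jobs + 1)]
    for t in range(T + 1):
        best[0][t] = 0.0
    for k in range(1, n_jobs + 1):
        for t in range(T + 1):
            for time, acc in ed_models:
                if time <= t and best[k - 1][t - time] > NEG_INF:
                    best[k][t] = max(best[k][t], best[k - 1][t - time] + acc)

    # The ES can serve at most floor(T / es_time) jobs within the time constraint.
    max_offload = min(n_jobs, T // es_time)
    answer = NEG_INF
    for offloaded in range(max_offload + 1):
        local = best[n_jobs - offloaded][T]
        if local > NEG_INF:
            answer = max(answer, local + offloaded * es_accuracy)
    return answer


if __name__ == "__main__":
    # Made-up numbers: a fast, less accurate ED model and a slow, more accurate one.
    print(max_accuracy_identical_jobs(4, [(1, 0.6), (3, 0.8)], 4, 0.95, 8))
```

The table has (n_jobs + 1) x (T + 1) entries, so the running time grows with the numeric value of T rather than with its bit length, which is what makes an approach of this kind pseudo-polynomial.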

Speaker Profile:

Jaya Prakash Champati is an Assistant Professor at IMDEA Networks Institute, where he leads the Edge Networks group. His general research interest is in the scheduling of communication and computation for emerging applications in edge computing systems, the Internet of Things (IoT), and Cyber-Physical Systems (CPS). Prior to joining IMDEA, he was a post-doctoral researcher in the Division of Information Science and Engineering, EECS, KTH Royal Institute of Technology, Sweden. He obtained his PhD in Electrical and Computer Engineering from the University of Toronto, Canada, in 2017, and his Master of Technology degree from the Indian Institute of Technology (IIT) Bombay, India, in 2010. Before starting his PhD, he worked at Broadcom Communications, where he contributed to LTE MAC layer development. He received the best paper award at the IEEE National Conference on Communications, India, in 2011.
