Designing System Software in the Era of Heterogeneity
Title of the Talk: Designing System Software in the Era of Heterogeneity
Speaker: Dr. Adithya Kumar
Host Faculty: Dr. Praveen T
Date: Jan 02, 2026
Time: 09:30 am
Abstract: As modern computer applications enter the realm of pervasive AI, their colossal computational needs are increasingly met by datacenter platforms that expose a diverse mix of CPUs, GPUs, accelerators, and smart peripherals such as SmartNICs and SSDs. While this plurality has unlocked numerous opportunities and enabled tremendous progress for AI applications, it has necessitated revisiting fundamental questions of what, when, and how software systems should leverage device-specific capabilities to navigate challenges not only in performance, efficiency, and reliability but also in the long-term viability of these large-scale computing environments.
In this talk, I will outline a systems design approach for heterogeneous platforms that first characterizes the resource capabilities, then designs components to explore the trade-offs, and finally demonstrates system solutions that exploit the heterogeneity. As a concrete instantiation of these ideas, I will detail SplitRPC, an RPC framework that tackles the overheads of serving ML inference on GPUs (the “RPC tax”) by leveraging the P2P data path and the compute capabilities of a SmartNIC in distributed settings. Finally, I will present a case for future system designers to build heterogeneity-first system components that implicitly discover and adapt to hardware and workload diversity in order to improve performance, efficiency, and reliability while reducing long-term system and energy costs.
Bio: Dr. Adithya Kumar is a software engineer at Meta’s Fundamental Artificial Intelligence Research (FAIR) Labs. He received his Ph.D. in Computer Science and Engineering from The Pennsylvania State University in 2022, and his B.Tech. in Computer Science and Engineering from NIT Trichy. His research interests span the design of distributed systems, with a focus on performance, efficiency, and reliability.