Invited Talk by Dr. Vivek Kumar on High Performance Parallel Programming on Modern Processors
Processor design has turned toward multicore and heterogeneous cores to achieve performance and energy efficiency. High-level languages are attractive to developers because their abstractions offer productivity and portability over hardware complexity. To achieve performance, several modern implementations of high-level languages use work-stealing scheduling to load-balance dynamically created tasks. In this approach, programmers explicitly expose potentially parallel tasks and rely on the underlying language runtime to schedule them, exploiting idle resources and unburdening those that are overloaded.
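To make the model concrete, here is a minimal sketch (not the speaker's system) using Java's ForkJoinPool, a widely used work-stealing scheduler: the programmer exposes potentially parallel tasks via fork(), and idle worker threads steal queued tasks from busy ones.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide-and-conquer array sum. The recursive halves are exposed as
// potentially parallel tasks; the work-stealing runtime decides which
// worker actually executes each one.
class SumTask extends RecursiveTask<Long> {
    private final long[] a;
    private final int lo, hi;

    SumTask(long[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo <= 1024) {            // small range: sum sequentially
            long s = 0;
            for (int i = lo; i < hi; i++) s += a[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(a, lo, mid);
        left.fork();                       // expose a potentially parallel task
        long right = new SumTask(a, mid, hi).compute(); // do the other half here
        return left.join() + right;        // an idle worker may have stolen 'left'
    }

    public static void main(String[] args) {
        long[] a = new long[100_000];
        for (int i = 0; i < a.length; i++) a[i] = i;
        long total = new ForkJoinPool().invoke(new SumTask(a, 0, a.length));
        System.out.println(total);         // prints 4999950000
    }
}
```

The fork/join calls mark where parallelism may exist; whether a steal actually happens is decided at runtime by the scheduler, which is exactly the division of labor the abstract describes.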
Despite its popularity, work-stealing comes with substantial overheads. These overheads arise as a necessary side effect of the implementation and hamper parallel performance. In this talk, I will present a detailed analysis of the key sources of overhead associated with work-stealing runtimes, namely sequential and dynamic overheads. I will then describe an approach that reuses runtime mechanisms already available within managed runtimes to substantially curtail these overheads and deliver significant performance improvements.
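One simple, well-known way to see (and partially mask) sequential overhead is a granularity cutoff: below an assumed threshold, fall back to plain recursion so small tasks pay no task-creation or stealing cost. This hedged sketch is only an illustration of the overhead problem, not the runtime-assisted approach presented in the talk; the CUTOFF value is a hypothetical tuning parameter.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Fibonacci with a sequential cutoff. Without the cutoff, every tiny
// subproblem allocates a task object and touches the work-stealing deque,
// which is the classic source of sequential overhead.
class FibCutoff extends RecursiveTask<Long> {
    static final int CUTOFF = 13;          // assumed threshold; tune per machine
    private final int n;

    FibCutoff(int n) { this.n = n; }

    // Plain recursion: no task objects, no deque operations.
    static long seqFib(int n) { return n <= 1 ? n : seqFib(n - 1) + seqFib(n - 2); }

    @Override
    protected Long compute() {
        if (n < CUTOFF) return seqFib(n);  // small work: skip the runtime entirely
        FibCutoff left = new FibCutoff(n - 1);
        left.fork();
        long right = new FibCutoff(n - 2).compute();
        return left.join() + right;
    }

    public static void main(String[] args) {
        System.out.println(new ForkJoinPool().invoke(new FibCutoff(25))); // prints 75025
    }
}
```

A manual cutoff shifts the tuning burden onto the programmer, which is why approaches that curtail these overheads inside the managed runtime itself, as the talk proposes, are attractive.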
Vivek Kumar obtained his Ph.D. from the Australian National University in January 2014. His research interests are parallel programming models and runtime systems. His Ph.D. research focused on improving the performance and productivity of parallel programming on multicore architectures. His dissertation appeared in the Doctoral Showcase Program at the Supercomputing conference in 2013, and one of his research contributions was selected for the ACM SIGPLAN Research Highlights. He is currently a Research Scientist at Rice University, where his research focuses on integrating task parallelism into HPC libraries. Prior to joining the Ph.D. program in March 2010, he worked for nearly 6 years in the area of HPC in industry, at IBM India Software Labs and C-DAC R&D.