Research Overview

Vikram's research interests span large-scale machine learning and graph applications as well as the system architecture and programming of emerging processors and memory technologies. Currently, he is exploring ways to mitigate the memory-capacity and memory-bandwidth limits of GPUs and near-memory/storage accelerators.

At a 1000-foot level, his current work falls into the following thrust areas:

Thrust #1: Hardware and System Stack Design for Large-Memory Applications

In this thrust, we explore the changes needed across the computing stack (architecture, compilers, and OS) to enable big-memory applications such as GPU-enhanced data analytics pipelines, recommendation systems, graph analytics, and deep learning. These applications exhibit massive parallelism and require a large pool of memory for efficient execution at scale. To this end, we are exploring ways to integrate non-volatile memories such as 3D XPoint and Flash into the current computing stack. Some published papers in this area are FlatFlash, DeepStore, and EMOGI.
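To give a flavor of what extending the GPU's reachable memory pool looks like, the sketch below lets a kernel read host-pinned memory directly over the interconnect instead of first staging it in device memory. This is only an illustrative CUDA sketch of the zero-copy idea, not the published FlatFlash/EMOGI implementations; the kernel, sizes, and names are invented for the example, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative sketch: sum an array that lives in *host* memory.
// The GPU dereferences the mapped pointer directly over PCIe/NVLink,
// so the working set is not limited by device-memory capacity.
__global__ void sum_mapped(const int *data, size_t n, unsigned long long *out) {
    unsigned long long local = 0;
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += (size_t)gridDim.x * blockDim.x)
        local += data[i];
    atomicAdd(out, local);
}

int main() {
    cudaSetDeviceFlags(cudaDeviceMapHost);  // allow GPU-mapped host allocations

    const size_t n = 1 << 24;               // ~64 MB of ints, stand-in for a big dataset
    int *h_data;                             // host-pinned, GPU-visible buffer
    cudaHostAlloc(&h_data, n * sizeof(int), cudaHostAllocMapped);
    for (size_t i = 0; i < n; ++i) h_data[i] = 1;

    int *d_view;                             // device-side alias of the host buffer
    cudaHostGetDevicePointer(&d_view, h_data, 0);

    unsigned long long *d_out;
    cudaMalloc(&d_out, sizeof(*d_out));
    cudaMemset(d_out, 0, sizeof(*d_out));

    sum_mapped<<<256, 256>>>(d_view, n, d_out);

    unsigned long long out;
    cudaMemcpy(&out, d_out, sizeof(out), cudaMemcpyDeviceToHost);
    printf("sum = %llu (expected %zu)\n", out, n);

    cudaFree(d_out);
    cudaFreeHost(h_data);
    return 0;
}
```

The same access pattern generalizes from pinned host DRAM to other tiers (e.g., NVM or Flash exposed through the system stack), which is where the architectural and OS questions in this thrust come in.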

Thrust #2: Software and Algorithm Design

Here, we explore the software architecture changes needed to run big-memory applications efficiently. Topics we explore include efficient caching mechanisms for emerging deep-learning-based queries, data layout schemes, compression algorithms, graph analytics algorithms, and automatic model parallelism for faster training of large NLP models. Some published papers in this area are DeepStore and GraphChallenge SparseDNN.
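To make the data-layout point concrete, below is a small CUDA sketch of sparse matrix-vector multiply over a CSR layout, the representation that underlies both graph adjacency lists and pruned (sparse) DNN weight matrices. It is a toy illustration rather than the GraphChallenge SparseDNN code; the 3x3 matrix is made up for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per row of a CSR matrix: y = A * x.
__global__ void spmv_csr(int rows, const int *row_ptr, const int *col_idx,
                         const float *vals, const float *x, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= rows) return;
    float acc = 0.0f;
    for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
        acc += vals[j] * x[col_idx[j]];
    y[row] = acc;
}

int main() {
    // Tiny 3x3 example: [[1,0,2],[0,3,0],[4,0,5]] in CSR form.
    int   h_row_ptr[] = {0, 2, 3, 5};
    int   h_col_idx[] = {0, 2, 1, 0, 2};
    float h_vals[]    = {1, 2, 3, 4, 5};
    float h_x[]       = {1, 1, 1};

    int *row_ptr, *col_idx; float *vals, *x, *y;
    cudaMalloc(&row_ptr, sizeof(h_row_ptr));
    cudaMalloc(&col_idx, sizeof(h_col_idx));
    cudaMalloc(&vals, sizeof(h_vals));
    cudaMalloc(&x, sizeof(h_x));
    cudaMalloc(&y, 3 * sizeof(float));
    cudaMemcpy(row_ptr, h_row_ptr, sizeof(h_row_ptr), cudaMemcpyHostToDevice);
    cudaMemcpy(col_idx, h_col_idx, sizeof(h_col_idx), cudaMemcpyHostToDevice);
    cudaMemcpy(vals, h_vals, sizeof(h_vals), cudaMemcpyHostToDevice);
    cudaMemcpy(x, h_x, sizeof(h_x), cudaMemcpyHostToDevice);

    spmv_csr<<<1, 32>>>(3, row_ptr, col_idx, vals, x, y);

    float h_y[3];
    cudaMemcpy(h_y, y, sizeof(h_y), cudaMemcpyDeviceToHost);
    printf("y = [%g, %g, %g]\n", h_y[0], h_y[1], h_y[2]);  // expect [3, 3, 9]
    return 0;
}
```

Much of the algorithmic work in this thrust amounts to choosing layouts like this one (and compression on top of them) so that irregular accesses map well onto the memory hierarchy.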

Thrust #3: Profiling Tools

Building efficient hardware and software stacks requires an in-depth understanding of application behavior, and profiling tools provide exactly that. To this end, our research includes building tools for profiling large-memory applications.
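As a small illustration of the kind of measurement such tools automate, the CUDA sketch below times a bandwidth-bound kernel with CUDA events and reports its effective memory bandwidth. It is a hand-rolled example for exposition, not one of the profiling tools from this thrust.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A trivially bandwidth-bound kernel: stream n floats in, n floats out.
__global__ void copy_kernel(const float *in, float *out, size_t n) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += (size_t)gridDim.x * blockDim.x)
        out[i] = in[i];
}

int main() {
    const size_t n = 1 << 26;  // 256 MB per direction
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    copy_kernel<<<1024, 256>>>(in, out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double gb = 2.0 * n * sizeof(float) / 1e9;  // bytes read + written
    printf("kernel: %.3f ms, effective bandwidth: %.1f GB/s\n", ms, gb / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```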