Serafini, Guan Working to Improve Training for Graph Neural Networks
Amazon Research and NSF funding will support research to improve speed and performance of machine learning models
Assistant Professors Marco Serafini and Hui Guan have received an Amazon Research Award and a grant from the National Science Foundation to support their work on improving the training process for graph neural networks (GNNs). GNNs, machine learning models that learn and predict patterns in the relationships among data points in rich data structures known as graphs, are used in a range of applications, including drug discovery, fraud detection, and recommendation systems.
GNNs are typically implemented using a combination of central processing units (CPUs) and graphics processing units (GPUs): the CPU manages the overall structure of the graph, while the GPUs handle the computationally intensive operations. Because the data used by GNNs is highly structured and complex, transferring it between the CPU and the GPUs can be very slow and limit the performance of the model, especially for large graphs.
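To illustrate the bottleneck, here is a minimal sketch of a conventional mini-batch GNN training loop built on the Deep Graph Library (DGL), which the researchers mention below. This is a generic example, not the researchers' code: the graph size, feature width, fanouts, and batch size are illustrative placeholders, and it assumes a recent DGL release with the dgl.dataloading.DataLoader API.

    import torch
    import dgl

    # Illustrative placeholders: a small random graph with node features held
    # in CPU memory. Real workloads may have billions of edges.
    graph = dgl.rand_graph(num_nodes=100_000, num_edges=1_000_000)
    graph.ndata["feat"] = torch.randn(graph.num_nodes(), 128)
    train_ids = torch.arange(graph.num_nodes())

    # Sample two-hop neighborhoods on the CPU, ten neighbors per hop.
    sampler = dgl.dataloading.NeighborSampler([10, 10])
    loader = dgl.dataloading.DataLoader(
        graph, train_ids, sampler, batch_size=1024, shuffle=True)

    device = torch.device("cuda")
    for input_nodes, output_nodes, blocks in loader:
        # This per-iteration copy of the sampled subgraph and its features
        # from CPU to GPU is the transfer that limits performance.
        blocks = [block.to(device) for block in blocks]
        features = graph.ndata["feat"][input_nodes].to(device)
        # ... the forward and backward passes would run on the GPU here ...

Every iteration ships a freshly sampled subgraph and its features across the CPU-GPU link, which is exactly the movement of data to the computation that Guan describes above.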
“With bigger sets of data, we see performance issues and very slow training times,” says Guan, co-principal investigator on the grants. “For large graphs especially, we need to follow the data processing principle of moving the computation to the data, rather than the data to the computation.”
To solve this problem, Serafini and Guan will focus their work on rebuilding the GNN training pipeline for multi-GPU settings, eliminating the transfer bottleneck. As part of this work, they will develop fast online algorithms that schedule work across GPUs at each training iteration with high data-access locality and low inter-GPU communication cost.
“Getting multiple GPUs to work together efficiently on the same graph has been a challenge for researchers,” explains Serafini, who serves as the principal investigator. “We make this possible with an approach where each GPU performs sampling and training on a specific partition of the graph, exchanging intermediate results with each other.”
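The sketch below illustrates the general pattern Serafini describes, not the researchers' actual system: each process drives one GPU that owns a graph partition and swaps intermediate results directly with its peers using PyTorch's all-to-all collective. The function name, chunk sizes, and four-GPU setup are all hypothetical, and the chunks are assumed equal-sized for simplicity.

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def train_partition(rank, world_size):
        # Hypothetical setup: each process owns one GPU and one graph
        # partition (the partitioning itself is elided here).
        os.environ["MASTER_ADDR"] = "127.0.0.1"
        os.environ["MASTER_PORT"] = "29500"
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        device = torch.device(f"cuda:{rank}")

        # Placeholder intermediate results: chunk j is destined for GPU j.
        send = [torch.randn(1024, 256, device=device) for _ in range(world_size)]
        recv = [torch.empty(1024, 256, device=device) for _ in range(world_size)]

        # GPUs exchange intermediate results directly with one another,
        # instead of routing graph data through the CPU each iteration.
        dist.all_to_all(recv, send)

        # ... training would continue here using the received chunks ...
        dist.destroy_process_group()

    if __name__ == "__main__":
        mp.spawn(train_partition, args=(4,), nprocs=4)

Keeping sampling and training local to each partition means only the compact intermediate results cross GPU boundaries, rather than raw graph structure and features.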
In preliminary experiments on a four-GPU system using this approach, Serafini and Guan trained GNN models between 6 and 32 times faster. Because machine learning can be extremely energy intensive, the researchers note that techniques that reduce training times in this way could also provide important reductions in energy use.
Once their research is complete, they plan to integrate it into the Deep Graph Library for use by other researchers and industry.
Serafini currently works in the UMass DREAM Lab (Data systems Research for Exploration, Analytics, and Modeling), where his research lies at the intersection of systems for machine learning, database management systems, and distributed systems. Prior to joining the CICS faculty, he was a senior scientist at the Qatar Computing Research Institute and a postdoctoral research fellow at Yahoo! Research in Barcelona. He received a doctorate in computer science from the Technical University of Darmstadt, Germany, where his thesis was nominated for best thesis awards by the German, Swiss, and Austrian computer science societies and the German chapter of the ACM.
Guan’s research lies at the intersection of machine learning and programming systems, with a current focus on both algorithm and system optimizations for deep multitask learning and graph machine learning. She joined the computer science faculty at UMass Amherst in 2020, the same year she received her doctorate in electrical engineering from North Carolina State University. She is a member of the Programming Languages and Systems at Massachusetts (PLASMA) lab at UMass.