Speaker: Ashraf Aboulnaga
University of Waterloo
Title: Self-Tuning Hadoop
Hadoop has become one of the most popular platforms for large-scale analytics, and the MapReduce model that it is based on has enabled tremendous advances in data analysis. However, tuning and administering Hadoop clusters is still not a well-studied area. In this talk, I will present two projects that aim to make Hadoop more self-tuning. Most of the talk will focus on ReStore, a system for reusing MapReduce results. ReStore focuses on queries that are expressed in high-level languages such as Pig Latin, Hive, or Jaql. The compilers of these languages translate queries into workflows of MapReduce jobs. Each job in these workflows reads its input from HDFS and produces output that is stored in HDFS. The current practice is to delete all intermediate results from HDFS at the end of executing the workflow. ReStore keeps these intermediate results and reuses them for future queries, resulting in significant performance improvement. In the talk, I will also present PStorM, a system for tuning Hadoop configuration parameters based on the history of previously observed MapReduce jobs. The two projects presented in the talk share the goal of automatically improving the performance of Hadoop based on workload characteristics.
Dr. Ashraf Aboulnaga is an Associate Professor in the Cheriton School of Computer Science at the University of Waterloo. His research interests are in the area of database management, with a current focus on platforms for Big Data, cloud computing and virtualization, data integration on the web, and self-managing database systems. Dr. Aboulnaga obtained M.S. and Ph.D. degrees from the University of Wisconsin-Madison, and B.S. and M.S. degrees from Alexandria University, Egypt. He was a Research Staff Member at the IBM Almaden Research Centre in San Jose, California from 2002 to 2004, and a Visiting Research Scientist at Google Waterloo during his sabbatical, from December 2009 to June 2010. Dr. Aboulnaga is an IBM Centre for Advanced Studies Faculty Fellow and a recipient of a Google Research Award, the Ontario Early Researcher Award, and a Best Paper Award at the VLDB 2011 conference. His research results have been integrated into commercial products such as IBM DB2, and he serves on the advisory board of ClevrU. Dr. Aboulnaga is currently an Associate Editor for the Proceedings of the VLDB Endowment. He holds 4 US patents (1 pending) and is a senior member of the IEEE and ACM.