Speaker: Professor Odej Kao
Title: Scheduling and Resource Management for Big Data
Data Intensive Scalable Computing is a much-investigated topic in current research. The talk will present a platform for big data analytics that laid the foundation for Apache Flink, including the operator scheduling and the data structures for shared state. We propose a programming model based on second-order functions that describe what we call parallelization contracts (PACTs). PACTs are a generalization of the map/reduce programming model, extending it with additional higher-order functions and with output contracts that give guarantees about the behavior of a function. We will focus on the features of the execution engine and highlight its scheduling strategies, the detection of communication bottlenecks and network topology, and the lightweight fault tolerance methods. Finally, the talk will give an overview of some interesting applications from the Berlin Big Data Center.
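As a rough illustration of the second-order-function idea in the abstract, the sketch below models three contracts as plain Python functions that take a user function and decide how records are grouped before it is called. This is not the Stratosphere or Flink API: the names pact_map, pact_reduce, and pact_match are hypothetical, everything runs sequentially in memory, and output contracts are not modeled.

```python
from collections import defaultdict


def pact_map(udf, records):
    """Map contract: the user function sees each record independently."""
    return [out for rec in records for out in udf(rec)]


def pact_reduce(udf, records):
    """Reduce contract: the user function sees all values sharing one key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return [out for key, values in groups.items() for out in udf(key, values)]


def pact_match(udf, left, right):
    """Match contract (hypothetical sketch): the user function sees every
    pair of records from the two inputs that carry the same key."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [
        out
        for key, left_value in left
        for right_value in index.get(key, [])
        for out in udf(key, left_value, right_value)
    ]


# Word count expressed with the map and reduce contracts.
lines = ["big data", "big engine"]
words = pact_map(lambda line: [(w, 1) for w in line.split()], lines)
counts = pact_reduce(lambda key, values: [(key, sum(values))], words)

# A two-input contract, which plain map/reduce does not offer directly.
lengths = [("big", 3), ("data", 4)]
joined = pact_match(lambda key, c, n: [(key, c, n)], counts, lengths)
```

Because each contract states only which records must be processed together, an engine is free to choose how to partition and ship the data; that freedom is what the scheduling and optimization work described in the talk builds on.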
Dr. Odej Kao is a Full Professor at the Technische Universität Berlin and head of the research group on Complex and Distributed IT Systems. He is also the director of tubIT, the IT service center of the TU Berlin, and a scientific advisor at the Fraunhofer Institute for Open Communication Systems FOKUS. In addition, he is the current dean of the Faculty for Computer Science and Electrical Engineering at the TU Berlin and a member of the national IT board in Germany.
Dr. Kao is a graduate of the Technische Universität Clausthal, where he earned a Master's degree in Computer Science and Electrical Engineering in 1995. Thereafter, he spent two years working on his PhD thesis on high performance image processing. As a postdoc, Dr. Kao published many papers on high performance multimedia retrieval and was awarded an advanced PhD (habilitation) in 2002.
In April 2002 Dr. Kao joined the University of Paderborn, Germany, as Associate Professor for Distributed and Operating Systems. One year later he became a managing director of the Paderborn Center for Parallel Computing (PC2), where he conducted research and many industry-relevant projects on high-performance computing, resource management, and Grid/Cloud computing. In 2006, he moved to the TU Berlin and focused his research on Cloud Computing, resource management and QoS, and Big Data. He is one of the creators of the big data engine Stratosphere, which is now the Apache top-level project Flink. He has published over 300 peer-reviewed papers in scientific conferences and journals. Dr. Kao is a member of many international program committees and of the editorial boards of journals such as Parallel Computing and IEEE Transactions on Cloud Computing.