Data science, including big data and data analytics, has attracted enormous attention for its spectacular contributions in a wide range of scholarly disciplines and commercial endeavors. Our research in this area leverages large volumes of data generated in diverse applications as well as cloud platforms. Data science is interdisciplinary by nature and combines research in three foundational disciplines: Systems (Parallel and Distributed), which provide the computational infrastructure on which data analytics run; Data Management and Analytics, which efficiently process and organize data; and Applications of Statistics and Machine Learning, which convert data into knowledge and actions.
Our research in this area encompasses the performance evaluation of large software platforms running large-scale statistical and machine learning algorithms, system reliability studies of large cloud infrastructures and file systems, and the design of novel data management techniques to process massive amounts of structured and unstructured data. Moreover, we apply statistical and machine learning techniques not only to improve the performance of software systems but also to design and build solutions for innovative applications.