Join us at the Data Science Applied Research and Education Seminar (ARES) with:
Dr. Benjamin Bolker
Director, School of Computational Science and Engineering
Professor, Department of Mathematics & Statistics and Department of Biology
Acting Associate Chair (Graduate), Mathematics
McMaster University
Free Event | Registration Required
Talk Title: No free lunch in inference
Abstract:
Statistical methods can target exploration, prediction, or inference. While big-data applications have emphasized prediction, inference remains important; in particular, inference is closely related to assessing the uncertainty of coefficients and predictions. Data-driven methods for model selection and tuning minimize prediction error by trading bias for variance, but they are rarely (never?) able to narrow confidence intervals or increase certainty. If used naively, popular methods of data-driven model selection and tuning lead to overconfidence. Post-selection inference, a non-naive method of accounting for the effects of data-driven model tuning, relies on strong assumptions. Researchers should recognize how hard it is to quantify uncertainty reliably when they use data-driven model tuning, and in many cases should abstain from tuning altogether.
Speaker Profile:
Dr. Benjamin Bolker completed an undergraduate degree in mathematics and physics at Yale University and a Ph.D. in Zoology at Cambridge University, working on the dynamics of measles epidemics. He did a postdoc at Princeton University in ecology and evolutionary biology on spatial dynamics of plant and host-parasite communities, then began a faculty position in the Department of Zoology (later Biology) at the University of Florida in 1999. He moved to McMaster University in 2010, where he has a joint appointment in Mathematics & Statistics and Biology and directs the School of Computational Science and Engineering. His research ranges broadly across ecology, evolution, and epidemiology, applying mathematical, statistical, and computational tools. He is especially interested in problems that involve parasites and disease, spatial population dynamics, estimation and inference of model parameters from observational data, or all three. In addition to many research papers, he is the author of two books ("Ecological Models and Data in R" and "A Very Short Introduction to Infectious Disease", the latter with Marta Wayne) and the author or maintainer of several widely used R packages.