Statistical Inference and Ensemble Machine Learning for Dependent Data

Authors :
Davies, Molly Margaret
van der Laan, Mark J
Publication Year :
2015

Abstract

The focus of this dissertation is on extending targeted learning to settings with complex unknown dependence structure, with an emphasis on applications in environmental science and environmental health. The bulk of the work in targeted learning and semiparametric inference in general has been with respect to data generated by independent units. Truly independent, randomized experiments in the environmental sciences and environmental health are rare, and data indexed by time and/or space is quite common. These scientific disciplines need flexible algorithms for model selection and model combining that can accommodate things like physical process models and Bayesian hierarchical approaches. They also need inference that honestly and realistically handles limited knowledge about dependence in the data. The goal of the research program reflected in this dissertation is to formalize results and build tools to address these needs. Chapter 1 provides a brief introduction to the context and spirit of the work contained in this dissertation. Chapter 2 focuses on Super Learner for spatial prediction. Spatial prediction is an important problem in many scientific disciplines, and plays an especially important role in the environmental sciences. We review the optimality properties of Super Learner in general and discuss the assumptions required in order for them to hold when using Super Learner for spatial prediction. We present results of a simulation study confirming that Super Learner works well in practice under a variety of sample sizes, sampling designs, and data-generating functions. We also apply Super Learner to a real-world benchmark dataset for spatial prediction methods. Appendix A contains a theorem extending an existing oracle inequality to the case of fixed design regression. Chapter 3 describes a new approach to standard error estimation called Sieve Plateau (SP) variance estimation, an approach that allows us to learn from sequences of influence function based variance

Details

Database :
OAIster
Notes :
application/pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1367542726
Document Type :
Electronic Resource