62 results for "F. Jay Breidt"
Search Results
2. Modeling ammonia volatilization from urea application to agricultural soils in the DayCent model
- Author
-
Stephen J. Del Grosso, Yao Zhang, Ram Gurung, F. Jay Breidt, William J. Parton, Keith Paustian, Stephen A. Williams, and Stephen M. Ogle
- Subjects
Volatilisation ,Analytical chemistry ,Soil Science ,chemistry.chemical_element ,Environmental pollution ,04 agricultural and veterinary sciences ,010501 environmental sciences ,Ammonia volatilization from urea ,engineering.material ,Type (model theory) ,01 natural sciences ,Nitrogen ,DayCent ,Ammonia ,chemistry.chemical_compound ,chemistry ,040103 agronomy & agriculture ,engineering ,0401 agriculture, forestry, and fisheries ,Fertilizer ,Agronomy and Crop Science ,0105 earth and related environmental sciences ,Mathematics - Abstract
Nitrogen (N) loss through ammonia ($\mathrm{NH}_3$) volatilization in agricultural soils is a significant source of atmospheric $\mathrm{NH}_3$, contributes to low N use efficiency in crops, risk to human health, and environmental pollution, and is an indirect source of nitrous oxide ($\mathrm{N_2O}$) emissions. Our objective was to develop an ammonia volatilization method within the DayCent ecosystem model that incorporates the key 4R N management practices (right type, right rate, right placement, and right timing) that influence $\mathrm{NH}_3$ volatilization associated with application of urea-based nitrogen fertilizers to agricultural soils. The $\mathrm{NH}_3$ volatilization method was developed with Bayesian calibration, using sampling importance resampling methods and Bayes factors to select the level of model complexity that best represents $\mathrm{NH}_3$ volatilization given the observed data. The final model included urea hydrolysis and the influence of urease inhibitors; short-term soil pH changes following fertilization; fertilizer incorporation into the soil (mechanically and through irrigation/precipitation); and specification of the fertilizer placement method (i.e., broadcast vs. banding and surface vs. incorporated). DayCent predicts $\mathrm{NH}_3$ volatilization with a root-mean-squared error of 158 (95% interval from 133 to 192) g $\mathrm{NH}_3$-N ha$^{-1}$ day$^{-1}$, a bias of 7 (95% interval from $-106$ to 102) g $\mathrm{NH}_3$-N ha$^{-1}$ day$^{-1}$, and a Bayesian $R^2$ of 0.39 (95% interval from 0.17 to 0.62). Furthermore, the model incorporates key management options influencing $\mathrm{NH}_3$ volatilization, related to placement method and fertilizer type with and without urease inhibitors, that can be used to evaluate management and policy options for reducing $\mathrm{NH}_3$ losses from urea fertilization.
- Published
- 2021
- Full Text
- View/download PDF
3. Large scale maximum average power multiple inference on time‐course count data with application to RNA‐seq analysis
- Author
-
Graham Peers, Wen Zhou, Meng Cao, and F. Jay Breidt
- Subjects
Statistics and Probability ,False discovery rate ,Biometry ,Scale (ratio) ,Gaussian ,Normal Distribution ,Negative binomial distribution ,Inference ,computer.software_genre ,01 natural sciences ,General Biochemistry, Genetics and Molecular Biology ,010104 statistics & probability ,03 medical and health sciences ,symbols.namesake ,Humans ,Computer Simulation ,RNA-Seq ,0101 mathematics ,030304 developmental biology ,0303 health sciences ,General Immunology and Microbiology ,Gene Expression Profiling ,Applied Mathematics ,General Medicine ,Binomial Distribution ,Identification (information) ,symbols ,Identifiability ,Data mining ,General Agricultural and Biological Sciences ,computer ,Algorithms ,Count data - Abstract
Experiments that longitudinally collect RNA sequencing (RNA-seq) data can provide transformative insights in biology research by revealing the dynamic patterns of genes. Such experiments create a great demand for new analytic approaches to identify differentially expressed (DE) genes based on large-scale time-course count data. Existing methods, however, are suboptimal with respect to power and may lack theoretical justification. Furthermore, most existing tests are designed to distinguish among conditions based on overall differential patterns across time, though in practice, a variety of composite hypotheses are of more scientific interest. Finally, some current methods may fail to control the false discovery rate. In this paper, we propose a new model and testing procedure to address the above issues simultaneously. Specifically, conditional on a latent Gaussian mixture with evolving means, we model the data by negative binomial distributions. Motivated by Storey (2007) and Hwang and Liu (2010), we introduce a general testing framework based on the proposed model and show that the proposed test enjoys the optimality property of maximum average power. The test allows not only identification of traditional DE genes but also testing of a variety of composite hypotheses of biological interest. We establish the identifiability of the proposed model, implement the proposed method via efficient algorithms, and demonstrate its good performance via simulation studies. The procedure reveals interesting biological insights, when applied to data from an experiment that examines the effect of varying light environments on the fundamental physiology of the marine diatom Phaeodactylum tricornutum.
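As a concrete illustration of the hierarchical structure described above, the sketch below simulates time-course counts from a negative binomial distribution whose log-means follow a latent Gaussian trajectory (flat for null genes, trending for differentially expressed genes). All names and parameter values are hypothetical, and the naive slope comparison at the end stands in for, but is not, the maximum average power test proposed in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_genes, n_times = 200, 8
times = np.arange(n_times)

# Latent Gaussian means: each gene gets a smooth log-mean trajectory.
# Differentially expressed (DE) genes get a time trend; null genes are flat.
is_de = rng.random(n_genes) < 0.2
base = rng.normal(3.0, 0.5, size=n_genes)
slope = np.where(is_de, rng.normal(0.3, 0.1, size=n_genes), 0.0)
log_mu = base[:, None] + slope[:, None] * times + rng.normal(0, 0.1, (n_genes, n_times))

# Conditional on the latent means, observed counts are negative binomial.
mu = np.exp(log_mu)
dispersion = 0.3                 # overdispersion; NB "size" parameter is 1/dispersion
size = 1.0 / dispersion
p = size / (size + mu)           # scipy's nbinom parameterization
counts = stats.nbinom.rvs(size, p, random_state=rng)

# A naive per-gene summary of "any time trend": slope of log(count + 1) vs. time.
slopes = np.array([np.polyfit(times, np.log1p(c), 1)[0] for c in counts])
print("mean |slope|, DE vs. null:", slopes[is_de].mean(), slopes[~is_de].mean())
```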
- Published
- 2019
- Full Text
- View/download PDF
4. Estimation of fish consumption rates based on a creel angler survey of an urban river in New Jersey, USA
- Author
-
F. Jay Breidt, Gemma Kirkwood, Suzanne Baird, and Betsy Ruffle
- Subjects
Fishery ,Consumption (economics) ,Estimation ,021110 strategic, defence & security studies ,Health, Toxicology and Mutagenesis ,Ecological Modeling ,0211 other engineering and technologies ,Fish ,Environmental science ,02 engineering and technology ,Fish consumption ,Pollution - Abstract
A one-year angler intercept survey was conducted on the lower 17 miles of the Passaic River, an urban industrialized river that flows through Newark, New Jersey. The purpose of the survey was to co...
- Published
- 2019
- Full Text
- View/download PDF
5. Attention-based convolutional capsules for evapotranspiration estimation at scale
- Author
-
Samuel Armstrong, Paahuni Khandelwal, Dhruv Padalia, Gabriel Senay, Darin Schulte, Allan Andales, F. Jay Breidt, Shrideep Pallickara, and Sangmi Lee Pallickara
- Subjects
Environmental Engineering ,Ecological Modeling ,Software - Published
- 2022
- Full Text
- View/download PDF
6. Modeling nitrous oxide mitigation potential of enhanced efficiency nitrogen fertilizers from agricultural systems
- Author
-
Stephen J. Del Grosso, Rodney T. Venterea, Melannie D. Hartman, Stephen A. Williams, William J. Parton, F. Jay Breidt, Stephen M. Ogle, Yao Zhang, and Ram Gurung
- Subjects
Environmental Engineering ,Nitrogen ,Nitrous Oxide ,chemistry.chemical_element ,DayCent ,chemistry.chemical_compound ,Soil ,Ecosystem model ,Environmental Chemistry ,Fertilizers ,Waste Management and Disposal ,Ecosystem ,business.industry ,Environmental engineering ,Sampling (statistics) ,Agriculture ,Bayes Theorem ,Nitrous oxide ,Pollution ,chemistry ,Greenhouse gas ,Soil water ,Environmental science ,business - Abstract
Agricultural soils are responsible for a large proportion of global emissions of nitrous oxide (N2O), a potent greenhouse gas and ozone-depleting substance. Enhanced-efficiency nitrogen (N) fertilizers (EENFs) can reduce N2O emission from N-fertilized soils, but their effect varies considerably due to a combination of factors, including climatic conditions, edaphic characteristics, and management practices. In this study, we further developed the DayCent ecosystem model to simulate two EENFs, controlled-release N fertilizers (CRNFs) and nitrification inhibitors (NIs), and evaluated their N2O mitigation potentials. We implemented a Bayesian calibration method using the sampling importance resampling (SIR) algorithm to derive a joint posterior distribution of model parameters, informed by N2O flux measurements from corn production systems at a network of experimental sites within the GRACEnet program. The joint posterior distribution can be applied to predict N2O reduction factors when EENFs are adopted in place of conventional urea-based N fertilizer. The resulting median reduction factors were −11.9% for CRNFs (ranging from −51.7% to 0.58%) and −26.7% for NIs (ranging from −61.8% to 3.1%), comparable to the measured reduction factors in the dataset. By incorporating EENFs, the DayCent ecosystem model is able to simulate a broader suite of options to identify best management practices for reducing N2O emissions.
- Published
- 2020
7. Pilot surveys to improve monitoring of marine recreational fisheries in Hawaiʻi
- Author
-
David A. Van Voorhees, Tom K. Ogawa, F. Jay Breidt, Thomas R. Sminkey, Jean D. Opsomer, John R. Foster, Hongguang Ma, and Virginia Lesser
- Subjects
0106 biological sciences ,Shore ,geography.geographical_feature_category ,010604 marine biology & hydrobiology ,Fishing ,Mail survey ,Aquatic Science ,010603 evolutionary biology ,01 natural sciences ,Fishery ,Telephone survey ,Geography ,Recreational fishing ,Sampling design ,Recreation - Abstract
Marine recreational fishing from shore and from private boats in Hawaiʻi is monitored via the Hawaiʻi Marine Recreational Fishing Survey (HMRFS), using an access point intercept survey to collect catch rate information, and the Coastal Household Telephone Survey (CHTS) to collect fishing effort data. In response to a recent HMRFS review, roving surveys of shoreline fishing effort and catch rate, an aerial fishing effort survey, and a mail survey of fishing effort were tested simultaneously on one of the main Hawaiian Islands (Oʻahu) and compared with the current HMRFS approach for producing shoreline fishing estimates. The pilot roving surveys were stratified by region (rural vs urban), shift (three 4-h periods during the day), and day type (weekday vs weekend). A pilot access point survey of private boat fishing was also conducted on Oʻahu, using an alternate sampling design created by NOAA Fisheries’ Marine Recreational Information Program (MRIP). Three overlapping 6-h time blocks and site clusters with unequal inclusion probabilities were used to cover daytime fishing. Group catch was recorded for an entire vessel rather than individual catch, which is the current standard for MRIP intercept surveys. Although catch estimates from the pilot private boat survey were comparable to the current HMRFS catch estimates, the catch estimates from the pilot roving survey were lower than the HMRFS estimates. HMRFS uses effort data from the CHTS, which includes both day and night fishing in all areas, to estimate total catch, whereas effort data from the roving shoreline survey covered only daytime fishing from publicly accessible areas. We therefore suggest that a roving survey conducted during the day should have complementary surveys to include night fishing and fishing in remote and private/restricted areas. Results from these pilot studies will be used to improve the current surveys of marine recreational fishing activities in Hawaiʻi.
- Published
- 2018
- Full Text
- View/download PDF
8. Minimum Mean Squared Error Estimation of the Radius of Gyration in Small-Angle X-Ray Scattering Experiments
- Author
-
F. Jay Breidt, Cody Alsaker, and Mark J. van der Woerd
- Subjects
Statistics and Probability ,Physics ,Minimum mean square error ,Small-angle X-ray scattering ,Scattering ,Astrophysics::High Energy Astrophysical Phenomena ,fungi ,05 social sciences ,Generalized least squares ,01 natural sciences ,Computational physics ,Condensed Matter::Soft Condensed Matter ,010104 statistics & probability ,Autoregressive model ,0502 economics and business ,Radius of gyration ,0101 mathematics ,Statistics, Probability and Uncertainty ,Time series ,050205 econometrics - Abstract
Small-angle X-ray scattering (SAXS) is a technique that yields low-resolution structural information of biological macromolecules by exposing a large ensemble of molecules in solution to a powerful...
- Published
- 2018
- Full Text
- View/download PDF
9. Model-Assisted Survey Regression Estimation with the Lasso
- Author
-
F. Jay Breidt, Gretchen G. Moisen, Thomas C. M. Lee, and Kelly S. McConville
- Subjects
Statistics and Probability ,Elastic net regularization ,Statistics::Theory ,010504 meteorology & atmospheric sciences ,Calibration (statistics) ,Population ,01 natural sciences ,Statistics::Machine Learning ,010104 statistics & probability ,Lasso (statistics) ,Consistency (statistics) ,Statistics ,Econometrics ,Statistics::Methodology ,0101 mathematics ,education ,0105 earth and related environmental sciences ,Mathematics ,education.field_of_study ,Applied Mathematics ,Model selection ,Estimator ,Regression ,Statistics::Computation ,Statistics, Probability and Uncertainty ,Social Sciences (miscellaneous) - Abstract
In the U.S. Forest Service’s Forest Inventory and Analysis (FIA) program, as in other natural resource surveys, many auxiliary variables are available for use in model-assisted inference about finite population parameters. Some of this auxiliary information may be extraneous, and therefore model selection is appropriate to improve the efficiency of the survey regression estimators of finite population totals. A model-assisted survey regression estimator using the lasso is presented and extended to the adaptive lasso. For a sequence of finite populations and probability sampling designs, asymptotic properties of the lasso survey regression estimator are derived, including design consistency and central limit theory for the estimator and design consistency of a variance estimator. To estimate multiple finite population quantities with the method, lasso survey regression weights are developed, using both a model calibration approach and a ridge regression approximation. The gains in efficiency of the lasso estimator over the full regression estimator are demonstrated through a simulation study estimating tree canopy cover for a region in Utah.
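The lasso-assisted estimator described above follows the general model-assisted (difference) form: predict the study variable from auxiliary data everywhere in the population, then correct with Horvitz–Thompson-weighted sample residuals. The sketch below illustrates that form with a hypothetical population and simple random sampling; it uses scikit-learn's ordinary lasso rather than the adaptive lasso or the calibration weights developed in the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# Hypothetical finite population with auxiliary variables x known everywhere.
N, p = 10_000, 20
X_pop = rng.normal(size=(N, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]                               # only a few x's matter
y_pop = 5 + X_pop @ beta + rng.normal(scale=2.0, size=N)  # y observed only for the sample

# Simple random sample without replacement; inclusion probability pi = n/N.
n = 500
sample = rng.choice(N, size=n, replace=False)
pi = n / N

# Fit the working lasso model on the sample.
model = Lasso(alpha=0.1).fit(X_pop[sample], y_pop[sample])

# Model-assisted (difference) estimator of the population total:
# sum of predictions over the population, plus HT-weighted sample residuals.
t_hat = model.predict(X_pop).sum() + ((y_pop[sample] - model.predict(X_pop[sample])) / pi).sum()

print("estimated total:", round(t_hat), " true total:", round(y_pop.sum()))
```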
- Published
- 2017
- Full Text
- View/download PDF
10. Sparse Functional Dynamical Models—A Big Data Approach
- Author
-
Ela Sienkiewicz, F. Jay Breidt, Dong Song, and Haonan Wang
- Subjects
Statistics and Probability ,Mathematical optimization ,Quantitative Biology::Neurons and Cognition ,Dynamical systems theory ,Volterra series ,Zero (complex analysis) ,Feature selection ,01 natural sciences ,Point process ,010104 statistics & probability ,03 medical and health sciences ,Identification (information) ,0302 clinical medicine ,Neural ensemble ,Discrete Mathematics and Combinatorics ,0101 mathematics ,Statistics, Probability and Uncertainty ,Algorithm ,030217 neurology & neurosurgery ,Mathematics ,Analytic function - Abstract
Nonlinear dynamical systems are encountered in many areas of social science, natural science, and engineering, and are of particular interest for complex biological processes like the spiking activity of neural ensembles in the brain. To describe such spiking activity, we adapt the Volterra series expansion of an analytic function to account for the point-process nature of multiple inputs and a single output (MISO) in a neural ensemble. Our model describes the transformed spiking probability for the output as the sum of kernel-weighted integrals of the inputs. The kernel functions need to be identified and estimated, and both local sparsity (kernel functions may be zero on part of their support) and global sparsity (some kernel functions may be identically zero) are of interest. The kernel functions are approximated by B-splines and a penalized likelihood-based approach is proposed for estimation. Even for moderately complex brain functionality, the identification and estimation of this sparse functional dynamical model poses major computational challenges, which we address with big data techniques that can be implemented on a single, multi-core server. The performance of the proposed method is demonstrated using neural recordings from the hippocampus of a rat during open field tasks.
- Published
- 2017
- Full Text
- View/download PDF
11. Hierarchical Bayesian small area estimation for circular data
- Author
-
Daniel Hernandez-Stumpfhauser, Jean D. Opsomer, and F. Jay Breidt
- Subjects
Statistics and Probability ,Estimation ,Simplex ,05 social sciences ,Bayesian probability ,Time distribution ,Regression analysis ,01 natural sciences ,Hybrid Monte Carlo ,010104 statistics & probability ,Small area estimation ,Distribution function ,0502 economics and business ,Statistics ,0101 mathematics ,Statistics, Probability and Uncertainty ,050205 econometrics ,Mathematics - Abstract
We consider small area estimation for the departure times of recreational anglers along the Atlantic and Gulf coasts of the United States. A Bayesian area-level Fay–Herriot model is considered to obtain estimates of the departure time distribution functions. The departure distribution functions are modelled as circular distributions plus area-specific errors. The circular distributions are modelled as projected normal, and a regression model is specified to borrow information across domains. Estimation is conducted through the use of a Hamiltonian Monte Carlo sampler and a projective approach onto the probability simplex.
- Published
- 2016
- Full Text
- View/download PDF
12. Nonparametric Variance Estimation Under Fine Stratification: An Alternative to Collapsed Strata
- Author
-
I. Sánchez-Borrego, Jean D. Opsomer, and F. Jay Breidt
- Subjects
Statistics and Probability ,education.field_of_study ,Mean squared error ,05 social sciences ,Population ,Nonparametric statistics ,Estimator ,Variance (accounting) ,01 natural sciences ,Nonparametric regression ,010104 statistics & probability ,Mathematics::Algebraic Geometry ,0502 economics and business ,Statistics ,Econometrics ,Kernel regression ,0101 mathematics ,Statistics, Probability and Uncertainty ,education ,Mathematics::Symplectic Geometry ,050205 econometrics ,Stratum ,Mathematics - Abstract
Fine stratification is commonly used to control the distribution of a sample from a finite population and to improve the precision of resulting estimators. One-per-stratum designs represent the finest possible stratification and occur in practice, but designs with very low numbers of elements per stratum (say, two or three) are also common. The classical variance estimator in this context is the collapsed stratum estimator, which relies on creating larger “pseudo-strata” and computing the sum of the squared differences between estimated stratum totals across the pseudo-strata. We propose here a nonparametric alternative that replaces the pseudo-strata by kernel-weighted stratum neighborhoods and uses deviations from a fitted mean function to estimate the variance. We establish the asymptotic behavior of the kernel-based estimator and show its superior practical performance relative to the collapsed stratum variance estimator in a simulation study. An application to data from the U.S. Consumer Expe...
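The sketch below gives a loose, illustrative version of the idea, assuming one estimated total per stratum with strata ordered so that neighbors are similar: a kernel-weighted smooth replaces the pseudo-strata, and squared deviations from the fit are summed, alongside a simple paired collapsed-stratum estimator for comparison. The data and scaling are hypothetical and do not reproduce the paper's exact estimator.

```python
import numpy as np

rng = np.random.default_rng(2)

# One-per-stratum design: one estimated total t_h per stratum h, with strata
# ordered so that neighboring strata are similar (e.g., sorted on a size
# measure).  Hypothetical data with a smooth trend across strata.
H = 100
h = np.arange(H, dtype=float)
t_hat = 50 + 0.5 * h + rng.normal(scale=3.0, size=H)   # estimated stratum totals

def kernel_smooth(x, y, bandwidth):
    """Nadaraya-Watson (Gaussian kernel) fit of y on x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ y

# Kernel-neighborhood idea: fit a smooth mean across strata and sum squared
# deviations of the estimated totals around the fit.
fitted = kernel_smooth(h, t_hat, bandwidth=3.0)
var_kernel = np.sum((t_hat - fitted) ** 2)

# Classical collapsed-stratum idea: pair consecutive strata and sum squared
# within-pair differences of the estimated totals.
pairs = t_hat[: H - H % 2].reshape(-1, 2)
var_collapsed = np.sum((pairs[:, 0] - pairs[:, 1]) ** 2)

print(var_kernel, var_collapsed)
```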
- Published
- 2016
- Full Text
- View/download PDF
13. Predictive analytics using statistical, learning, and ensemble methods to support real-time exploration of discrete event simulations
- Author
-
Sangmi Lee Pallickara, F. Jay Breidt, Walid Budgaga, Neil Harvey, Matthew Malensek, and Shrideep Pallickara
- Subjects
Computer Networks and Communications ,Process (engineering) ,Computer science ,business.industry ,Event (computing) ,Interface (computing) ,Cloud computing ,02 engineering and technology ,Predictive analytics ,computer.software_genre ,Ensemble learning ,Latin hypercube sampling ,Hardware and Architecture ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,User interface ,Discrete event simulation ,business ,computer ,Software - Abstract
Discrete event simulations (DES) provide a powerful means for modeling complex systems and analyzing their behavior. DES capture all possible interactions between the entities they manage, which makes them highly expressive but also compute-intensive. These computational requirements often impose limitations on the breadth and/or depth of research that can be conducted with a discrete event simulation. This work describes our approach for leveraging the vast quantity of computing and storage resources available in both private organizations and public clouds to enable real-time exploration of discrete event simulations. Rather than directly targeting simulation execution speeds, we autonomously generate and execute novel scenario variants to explore a representative subset of the simulation parameter space. The corresponding outputs from this process are analyzed and used by our framework to produce models that accurately forecast simulation outcomes in real time, providing interactive feedback and facilitating exploratory research. Our framework distributes the workloads associated with generating and executing scenario variants across a range of commodity hardware, including public and private cloud resources. Once the models have been created, we evaluate their performance and improve prediction accuracy by employing dimensionality reduction techniques and ensemble methods. To make these models highly accessible, we provide a user-friendly interface that allows modelers and epidemiologists to modify simulation parameters and see projected outcomes in real time. Our approach enables fast, accurate forecasts of discrete event simulations. The framework copes with high dimensionality and voluminous datasets. We facilitate simulation execution with cycle scavenging and cloud resources. We create and evaluate several predictive models, including ensemble methods. Our framework is made accessible to end users through an interactive web interface.
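A minimal sketch of the surrogate-modeling workflow the abstract describes, with a toy function standing in for the discrete event simulation: sample the parameter space with a Latin hypercube design, run the simulator at the design points, and fit an ensemble model that forecasts outcomes in real time. The simulator, parameter names, and ranges are all hypothetical.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

# Stand-in for an expensive discrete event simulation: maps a parameter vector
# (e.g., contact rate, infectious period, vaccination fraction) to a scalar
# outcome such as final epidemic size.  Hypothetical.
def run_simulation(params):
    contact, duration, vacc = params
    return 1000 * contact * duration * (1 - vacc) + rng.normal(scale=10)

# 1. Explore the parameter space with a Latin hypercube design.
sampler = qmc.LatinHypercube(d=3, seed=3)
unit = sampler.random(n=200)
lo, hi = np.array([0.1, 1.0, 0.0]), np.array([2.0, 10.0, 0.9])
designs = qmc.scale(unit, lo, hi)

# 2. Run the simulator at each design point (distributed across machines in
#    the real system; sequential here).
outcomes = np.array([run_simulation(p) for p in designs])

# 3. Fit an ensemble surrogate that forecasts simulation outcomes.
surrogate = RandomForestRegressor(n_estimators=200, random_state=0).fit(designs, outcomes)

# 4. Interactive use: predict a new scenario without rerunning the simulation.
print(surrogate.predict([[1.2, 5.0, 0.4]]))
```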
- Published
- 2016
- Full Text
- View/download PDF
14. Bayesian calibration of the DayCent ecosystem model to simulate soil organic carbon dynamics and reduce model uncertainty
- Author
-
Ram Gurung, William J. Parton, F. Jay Breidt, Stephen A. Williams, and Stephen M. Ogle
- Subjects
Bayesian probability ,Soil Science ,Sampling (statistics) ,04 agricultural and veterinary sciences ,Soil carbon ,010501 environmental sciences ,Carbon sequestration ,Bayesian inference ,01 natural sciences ,DayCent ,Ecosystem model ,Greenhouse gas ,040103 agronomy & agriculture ,Econometrics ,0401 agriculture, forestry, and fisheries ,Environmental science ,0105 earth and related environmental sciences - Abstract
Benefits of carbon sequestration in agricultural soils are well recognized, and process-based models have been developed to better understand sequestration potential. However, most studies ignore the uncertainty arising during model prediction—a critical requirement for scientific understanding, policy implementation and carbon emission trading. Furthermore, the dependencies created in process-based models due to many parameters and a relatively small set of empirical data hinder parameterization. We have implemented a Bayesian approach using the sampling importance resampling (SIR) method to calibrate the DayCent ecosystem model for estimating soil organic carbon (SOC) stocks, and to quantify uncertainty in model predictions. A SOC dataset compiled from 19 long-term field experiments, representing 117 combinations of management treatments, with 491 measurements of SOC, was split into independent datasets for model calibration and evaluation. The most important DayCent model parameters were identified through a global sensitivity analysis (GSA) for parameterization and SIR was used to calibrate the model and produce posterior distributions for the most sensitive parameters. On average, the Bayesian calibration reduced the model uncertainty by a factor of 6.6 relative to the uncertainty associated with the prior. The Bayesian model analysis framework will allow for ongoing updates to the model as new datasets and model structural improvements are made in future research, and overall provide a stronger basis for models to support policy and management decisions associated with GHG mitigation through C sequestration in agricultural soils.
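A minimal sketch of sampling importance resampling (SIR) as described above, with a toy decay function standing in for DayCent and hypothetical priors: draw parameters from the prior, weight each draw by its likelihood given the observations, and resample proportionally to approximate the posterior.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Toy stand-in for a process model: predicts observations (e.g., SOC stocks)
# from a parameter vector theta.  Hypothetical, not DayCent.
def process_model(theta, x):
    decay, k = theta
    return k * np.exp(-decay * x)

x_obs = np.linspace(0, 30, 25)                      # e.g., years since treatment
y_obs = process_model([0.05, 60.0], x_obs) + rng.normal(scale=2.0, size=x_obs.size)

# Sampling importance resampling (SIR):
# 1. Draw a large sample from the prior.
n_prior = 20_000
prior = np.column_stack([
    rng.uniform(0.001, 0.2, n_prior),               # decay-rate prior
    rng.uniform(20.0, 100.0, n_prior),              # initial-stock prior
])

# 2. Weight each draw by its likelihood given the observations.
sigma = 2.0
log_w = np.array([
    stats.norm.logpdf(y_obs, process_model(theta, x_obs), sigma).sum()
    for theta in prior
])
w = np.exp(log_w - log_w.max())
w /= w.sum()

# 3. Resample proportional to the weights to approximate the posterior.
idx = rng.choice(n_prior, size=5_000, replace=True, p=w)
posterior = prior[idx]
print("posterior means:", posterior.mean(axis=0))
```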
- Published
- 2020
- Full Text
- View/download PDF
15. Nonparametric regression estimation under complex sampling designs
- Author
-
F. Jay Breidt, Ji-Yeon Kim, and Jean D. Opsomer
- Subjects
Polynomial regression ,Statistics::Theory ,Efficient estimator ,Statistics ,Statistics::Methodology ,Sampling (statistics) ,Estimator ,Regression analysis ,Semiparametric regression ,Invariant estimator ,Nonparametric regression ,Mathematics - Abstract
The efficient use of auxiliary information to improve the precision of estimation of population quantities of interest is a central problem in survey sampling. We consider nonparametric regression estimation using much weaker assumptions on the superpopulation model in more general survey situations. Complex designs such as multistage and multiphase sampling are often employed in many large-scale surveys. Nonparametric model-assisted estimators, based on local polynomial regression, for two-stage and two-phase sampling designs are proposed. The local polynomial regression estimator is a nonparametric version of the generalized regression (GREG) estimator and shares most of the desirable properties of the generalized regression estimator. The estimator of the finite population total for two-stage element sampling with complete cluster auxiliary information is a linear combination of cluster total estimators, with sample-dependent weights that are calibrated to known control totals. The nonparametric estimator for two-phase sampling with a regression model for between-phase inference is also expressed as a weighted linear sum of the study variable of interest over a second-phase sample, in which the weights are not calibrated directly to known control totals, but are calibrated to the Horvitz-Thompson estimators of known control totals over a first-phase sample. Asymptotic design unbiasedness and design consistency of the estimators are established, and consistent variance estimators are proposed. Simulation experiments indicate that the local polynomial regression estimators are more efficient than parametric regression estimators under model misspecification, while being nearly as good when the parametric mean function is correctly specified.
- Published
- 2018
- Full Text
- View/download PDF
16. Successive Difference Replication Variance Estimation in Two-Phase Sampling
- Author
-
Michael White, Yao Li, Jean D. Opsomer, and F. Jay Breidt
- Subjects
Statistics and Probability ,Two phase sampling ,Applied Mathematics ,05 social sciences ,01 natural sciences ,Balanced repeated replication ,010104 statistics & probability ,0502 economics and business ,Variance estimation ,Statistics ,Replication (statistics) ,0101 mathematics ,Statistics, Probability and Uncertainty ,Social Sciences (miscellaneous) ,050205 econometrics ,Mathematics - Published
- 2016
- Full Text
- View/download PDF
17. Laplace Variational Approximation for Semiparametric Regression in the Presence of Heteroscedastic Errors
- Author
-
Mark J. van der Woerd, F. Jay Breidt, and Bruce D. Bugbee
- Subjects
0301 basic medicine ,Statistics and Probability ,Heteroscedasticity ,Mathematical optimization ,Laplace transform ,Markov chain Monte Carlo ,Bayesian inference ,01 natural sciences ,Article ,010104 statistics & probability ,03 medical and health sciences ,symbols.namesake ,030104 developmental biology ,Metropolis–Hastings algorithm ,Laplace's method ,symbols ,Variational message passing ,Statistics::Methodology ,Discrete Mathematics and Combinatorics ,Applied mathematics ,Semiparametric regression ,0101 mathematics ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
Variational approximations provide fast, deterministic alternatives to Markov Chain Monte Carlo for Bayesian inference on the parameters of complex, hierarchical models. Variational approximations are often limited in practicality in the absence of conjugate posterior distributions. Recent work has focused on the application of variational methods to models with only partial conjugacy, such as in semiparametric regression with heteroskedastic errors. Here, both the mean and log variance functions are modeled as smooth functions of covariates. For this problem, we derive a mean field variational approximation with an embedded Laplace approximation to account for the non-conjugate structure. Empirical results with simulated and real data show that our approximate method has significant computational advantages over traditional Markov Chain Monte Carlo; in this case, a delayed rejection adaptive Metropolis algorithm. The variational approximation is much faster and eliminates the need for tuning parameter selection, achieves good fits for both the mean and log variance functions, and reasonably reflects the posterior uncertainty. We apply the methods to log-intensity data from a small angle X-ray scattering experiment, in which properly accounting for the smooth heteroskedasticity leads to significant improvements in posterior inference for key physical characteristics of an organic molecule.
- Published
- 2016
- Full Text
- View/download PDF
18. Model-Assisted Survey Estimation with Imperfectly Matched Auxiliary Data
- Author
-
Jean D. Opsomer, F. Jay Breidt, and Chien-Min Huang
- Subjects
Mixed model ,education.field_of_study ,Matching (graph theory) ,Computer science ,Statistics ,Population ,Estimator ,Survey data collection ,Sample (statistics) ,education ,Regression ,Nonparametric regression - Abstract
Model-assisted survey regression estimators combine auxiliary information available at a population level with complex survey data to estimate finite population parameters. Many prediction methods, including linear and mixed models, nonparametric regression, and machine learning techniques, can be incorporated into such model-assisted estimators. These methods assume that observations obtained for the sample can be matched without error to the auxiliary data. We investigate properties of estimators that rely on matching algorithms that do not in general yield perfect matches. We focus on difference estimators, which are exactly unbiased under perfect matching but not under imperfect matching. The methods are investigated analytically and via simulation, using a study of recreational angling in South Carolina to build a simulation population. In this study, the survey data come from a stratified, two-stage sample and the auxiliary data from logbooks filed by boat captains. Extensions to regression estimators under imperfect matching are discussed.
- Published
- 2017
- Full Text
- View/download PDF
19. Model-Assisted Survey Estimation with Modern Prediction Techniques
- Author
-
Jean D. Opsomer and F. Jay Breidt
- Subjects
Statistics and Probability ,Asymptotic analysis ,010504 meteorology & atmospheric sciences ,neural network ,General Mathematics ,Population ,Machine learning ,computer.software_genre ,01 natural sciences ,Generalized linear mixed model ,010104 statistics & probability ,nearest neighbors ,0101 mathematics ,education ,0105 earth and related environmental sciences ,Mathematics ,education.field_of_study ,Artificial neural network ,business.industry ,Recipe ,Linear model ,Estimator ,regression trees ,Nonparametric regression ,survey asymptotics ,nonparametric regression ,Artificial intelligence ,Statistics, Probability and Uncertainty ,business ,computer - Abstract
This paper reviews the design-based, model-assisted approach to using data from a complex survey together with auxiliary information to estimate finite population parameters. A general recipe for deriving model-assisted estimators is presented and design-based asymptotic analysis for such estimators is reviewed. The recipe allows for a very broad class of prediction methods, with examples from the literature including linear models, linear mixed models, nonparametric regression and machine learning techniques.
- Published
- 2017
- Full Text
- View/download PDF
20. A constrained least-squares approach to combine bottom-up and top-down CO2 flux estimates
- Author
-
Daniel Cooley, Andrew Schuh, Stephen M. Ogle, Thomas Lauvaux, and F. Jay Breidt
- Subjects
Statistics and Probability ,Constraint (information theory) ,Inventory valuation ,Covariance matrix ,Computation ,Statistics ,Inverse ,Flux ,Statistical model ,Variance (accounting) ,Statistics, Probability and Uncertainty ,General Environmental Science ,Mathematics - Abstract
Terrestrial CO2 flux estimates are obtained from two fundamentally different methods generally termed bottom-up and top-down approaches. Inventory methods are one type of bottom-up approach which uses various sources of information such as crop production surveys and forest monitoring data to estimate the annual CO2 flux at locations covering a study region. Top-down approaches are various types of atmospheric inversion methods which use CO2 concentration measurements from monitoring towers and atmospheric transport models to estimate CO2 flux over a study region. Both methods can also quantify the uncertainty associated with their estimates. Historically, these two approaches have produced estimates that differ considerably. The goal of this work is to construct a statistical model which sensibly combines estimates from the two approaches to produce a new estimate of CO2 flux for our study region. The two approaches have complementary strengths and weaknesses, and our results show that certain aspects of the uncertainty associated with each of the approaches are greatly reduced by combining the methods. Our model is purposefully simple and designed to take the two approaches’ estimates and measures of uncertainty at ‘face value’. Specifically, we use a constrained least-squares approach to appropriately weigh the estimates by the inverse of their variance, and the constraint imposes agreement between the two sources. Our application involves nearly 18,000 flux estimates for the upper midwest United States. The constrained dependencies result in a non-sparse covariance matrix, but computation requires only minutes due to the structure of the model.
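For a single flux, the constrained least-squares combination with the constraint that both sources estimate the same quantity reduces to inverse-variance weighting, as in the sketch below (hypothetical numbers); the paper applies the same principle jointly to roughly 18,000 fluxes with a full covariance matrix.

```python
import numpy as np

# Two independent estimates of the same CO2 flux (hypothetical numbers),
# e.g., a bottom-up inventory value and a top-down inversion value,
# each with its own variance.
flux_inventory, var_inventory = 120.0, 15.0 ** 2
flux_inversion, var_inversion = 95.0, 25.0 ** 2

# Constrained least squares with the constraint that both sources agree on
# the true flux reduces to inverse-variance weighting:
#   flux_hat = (x1/v1 + x2/v2) / (1/v1 + 1/v2),  var_hat = 1 / (1/v1 + 1/v2)
w1, w2 = 1 / var_inventory, 1 / var_inversion
flux_hat = (w1 * flux_inventory + w2 * flux_inversion) / (w1 + w2)
var_hat = 1 / (w1 + w2)

print(flux_hat, np.sqrt(var_hat))   # combined estimate and its standard error
```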
- Published
- 2012
- Full Text
- View/download PDF
21. Autocovariance structures for radial averages in small-angle X-ray scattering experiments
- Author
-
Mark J. van der Woerd, F. Jay Breidt, and Andreea L. Erciulescu
- Subjects
Statistics and Probability ,Small-angle X-ray scattering ,Plane (geometry) ,Scattering ,Applied Mathematics ,Autocorrelation ,Detector ,Computational physics ,Convolution ,Autocovariance ,Kernel (image processing) ,Statistics ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
Small-angle X-ray scattering (SAXS) is a technique for obtaining low-resolution structural information about biological macromolecules, by exposing a dilute solution to a high-intensity X-ray beam and capturing the resulting scattering pattern on a two-dimensional detector. The two-dimensional pattern is reduced to a one-dimensional curve through radial averaging; that is, by averaging across annuli on the detector plane. Subsequent analysis of structure relies on these one-dimensional data. This paper reviews the technique of SAXS and investigates autocorrelation structure in the detector plane and in the radial averages. Across a range of experimental conditions and molecular types, spatial autocorrelation in the detector plane is present and is well-described by a stationary kernel convolution model. The corresponding autocorrelation structure for the radial averages is non-stationary. Implications of the autocorrelation structure for inference about macromolecular structure are discussed.
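A minimal sketch of the radial-averaging step described above, using a synthetic detector image: pixels are binned into one-pixel-wide annuli around the beam center and averaged to give a one-dimensional intensity curve. The image model and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical 2-D SAXS detector image (512 x 512 pixels) with intensity that
# decays with distance from the beam center, plus measurement noise.
n = 512
y, x = np.indices((n, n))
cx = cy = (n - 1) / 2.0
r = np.hypot(x - cx, y - cy)
image = 1e4 * np.exp(-r / 60.0) + rng.normal(scale=5.0, size=(n, n))

# Radial averaging: bin pixels into annuli of width one pixel and average the
# intensities within each annulus to obtain a one-dimensional curve I(r).
r_bin = r.astype(int)
n_bins = r_bin.max() + 1
sums = np.bincount(r_bin.ravel(), weights=image.ravel(), minlength=n_bins)
counts = np.bincount(r_bin.ravel(), minlength=n_bins)
radial_average = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

print(radial_average[:5])   # averaged intensity near the beam center
```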
- Published
- 2012
- Full Text
- View/download PDF
22. Designing a national soil carbon monitoring network to support climate change policy: a case example for US agricultural lands
- Author
-
Shannon Spencer, F. Jay Breidt, Keith Paustian, Stephen M. Ogle, and J. Jeffery Goebel
- Subjects
Land use ,business.industry ,Geography, Planning and Development ,Climate change ,Soil carbon ,Environmental Science (miscellaneous) ,Agricultural soil science ,Soil functions ,Agriculture ,Agricultural land ,Environmental protection ,Greenhouse gas ,Environmental science ,business ,Environmental planning - Abstract
Soils contain the largest terrestrial pool of carbon, and have large annual transfers of carbon with biomass pools and the atmosphere. Agricultural land use and management, and changes in climate have significant impacts on soil carbon, and if managed with conservation practices agricultural soils could be enhanced while sequestering carbon and mitigating greenhouse gas emissions. To better inform national climate change policy decisions for agricultural lands, robust and accurate estimates of soil organic carbon (SOC) stock changes are needed at regional to national scales. The design of a national soil monitoring network for carbon on agricultural lands is discussed including determination of sample size, allocation, and site-scale plot design. A quantitative case study is presented using modeled estimates of SOC stock change variability and a set of soil sample measurements to evaluate a potential network design for U.S. agricultural lands. Stratification by climate, soil, and land use with sites alloc...
- Published
- 2011
- Full Text
- View/download PDF
23. A class of stochastic volatility models for environmental applications
- Author
-
Richard A. Davis, Wenying Huang, Ke Wang, and F. Jay Breidt
- Subjects
Statistics and Probability ,Mathematical optimization ,Heteroscedasticity ,Stochastic volatility ,Covariance function ,Estimation theory ,Applied Mathematics ,Context (language use) ,Covariance ,symbols.namesake ,symbols ,Statistics, Probability and Uncertainty ,Gaussian process ,Importance sampling ,Mathematics - Abstract
Many environmental data sets have a continuous domain, in time and/or space, and complex features that may be poorly modelled with a stationary (in space and time) Gaussian process (GP). We adapt stochastic volatility modelling to this context, resulting in a stochastic heteroscedastic process (SHP), which is unconditionally stationary and non-Gaussian. Conditional on a latent GP, the SHP is a heteroscedastic GP with non-stationary (in space and time) covariance structure. The realizations from SHP are versatile and can represent spatial inhomogeneities. The unconditional correlation functions of SHP form a rich isotropic class that can allow for a smoothed nugget effect. We apply an importance sampling strategy to implement pseudo maximum likelihood parameter estimation for the SHP. To predict the process at unobserved locations, we develop a plug-in best predictor. We extend the single-realization SHP model to handle replicates across time of SHP realizations in space. Empirical results with simulated data show that SHP is nearly as efficient as a stationary GP in out-of-sample prediction when the true process is a stationary GP, and outperforms a stationary GP substantially when the true process is SHP. The SHP methodology is applied to enhanced vegetation index data and US NO3 deposition data for illustration.
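A minimal simulation of a stochastic heteroscedastic process of the kind described, under the assumption that it can be written as a Gaussian process whose local scale is the exponential of a second, latent Gaussian process; the covariance choices are hypothetical and the sketch does not implement the paper's importance-sampling estimation.

```python
import numpy as np

rng = np.random.default_rng(6)

# Locations on a 1-D transect and an exponential covariance function.
s = np.linspace(0, 10, 200)

def exp_cov(s, sill, range_):
    d = np.abs(s[:, None] - s[None, :])
    return sill * np.exp(-d / range_)

# Latent Gaussian process alpha(s) driving the local volatility.
L_alpha = np.linalg.cholesky(exp_cov(s, sill=1.0, range_=2.0) + 1e-10 * np.eye(s.size))
alpha = L_alpha @ rng.normal(size=s.size)

# Standard Gaussian process z(s) for the underlying signal.
L_z = np.linalg.cholesky(exp_cov(s, sill=1.0, range_=1.0) + 1e-10 * np.eye(s.size))
z = L_z @ rng.normal(size=s.size)

# Stochastic heteroscedastic process: conditional on alpha, a GP whose local
# scale is exp(alpha/2); unconditionally stationary but non-Gaussian.
shp = np.exp(alpha / 2.0) * z
print(shp[:5])
```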
- Published
- 2011
- Full Text
- View/download PDF
24. Improved variance estimation for balanced samples drawn via the cube method
- Author
-
Guillaume Chauvet and F. Jay Breidt
- Subjects
Statistics and Probability ,Analysis of covariance ,Applied Mathematics ,Monte Carlo method ,Estimator ,Horvitz–Thompson estimator ,Sampling design ,Statistics ,Applied mathematics ,Probability distribution ,Martingale difference sequence ,Statistics, Probability and Uncertainty ,Martingale (probability theory) ,Mathematics - Abstract
The cube method proposed by Deville and Tillé (2004) enables the selection of balanced samples: that is, samples such that the Horvitz–Thompson estimators of auxiliary variables match the known totals of those variables. As an exact balanced sampling design often does not exist, the cube method generally proceeds in two steps: a “flight phase” in which exact balance is maintained, and a “landing phase” in which the final sample is selected while respecting the balance conditions as closely as possible. Deville and Tillé (2005) derive a variance approximation for balanced sampling that takes account of the flight phase only, whereas the landing phase can prove to add non-negligible variance. This paper uses a martingale difference representation of the cube method to construct an efficient simulation-based method for calculating approximate second-order inclusion probabilities. The approximation enables nearly unbiased variance estimation, where the bias is primarily due to the limited number of simulations. In a Monte Carlo study, the proposed method has significantly less bias than the standard variance estimator, leading to improved confidence interval coverage.
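The Monte Carlo idea for second-order inclusion probabilities can be sketched as below: repeatedly draw samples from the design and record how often each pair of units is selected together. The `draw_balanced_sample` function is a hypothetical placeholder (here simple random sampling without replacement); in the paper the draws come from the cube method's flight and landing phases, and the approximated probabilities feed a variance estimator of Sen-Yates-Grundy type.

```python
import numpy as np

rng = np.random.default_rng(7)

N = 40               # small population so pairwise frequencies stabilize quickly
n = 10

def draw_balanced_sample():
    """Hypothetical stand-in for a cube-method draw; here simple SRSWOR."""
    return rng.choice(N, size=n, replace=False)

# Monte Carlo approximation of second-order inclusion probabilities pi_kl:
# the proportion of simulated samples in which units k and l appear together.
n_sims = 20_000
joint = np.zeros((N, N))
for _ in range(n_sims):
    s = draw_balanced_sample()
    joint[np.ix_(s, s)] += 1
pi_kl = joint / n_sims                      # diagonal holds first-order pi_k

# The approximate pi_kl can then be plugged into a Sen-Yates-Grundy-type
# variance estimator for the balanced design.
print(pi_kl[0, 0], pi_kl[0, 1])             # ~ n/N and ~ n(n-1)/(N(N-1)) here
```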
- Published
- 2011
- Full Text
- View/download PDF
25. Understanding the drivers of sensitive behavior using Poisson regression from quantitative randomized response technique data
- Author
-
Abu Conteh, Michael C. Gavin, F. Jay Breidt, Jennifer Solomon, and Meng Cao
- Subjects
0106 biological sciences ,Statistical methods ,Computer science ,Monte Carlo method ,lcsh:Medicine ,computer.software_genre ,01 natural sciences ,Geographical locations ,010104 statistics & probability ,Natural Resources ,Surveys and Questionnaires ,Poisson Distribution ,lcsh:Science ,Geographic Areas ,Conservation Science ,Likelihood Functions ,Multidisciplinary ,Covariance ,Geography ,Ecology ,Statistical Models ,Randomized Response Technique ,Regression ,Physical sciences ,Community Ecology ,symbols ,Regression Analysis ,Research Article ,Urban Areas ,Count data ,Generalized linear model ,Maximum likelihood ,Statistics (mathematics) ,Machine learning ,010603 evolutionary biology ,Sierra Leone ,symbols.namesake ,Randomized response ,Humans ,Poisson regression ,0101 mathematics ,Social Behavior ,Fisher information ,Probability ,Motivation ,business.industry ,Ecology and Environmental Sciences ,lcsh:R ,Biology and Life Sciences ,Random Variables ,Statistical model ,Probability Theory ,Research and analysis methods ,Africa ,Earth Sciences ,Mathematical and statistical techniques ,lcsh:Q ,Artificial intelligence ,People and places ,business ,computer ,Mathematics - Abstract
Understanding sensitive behaviors (those that are socially unacceptable or non-compliant with rules or regulations) is essential for creating effective interventions. Sensitive behaviors are challenging to study because participants are unlikely to disclose them, for fear of retribution or due to social undesirability. Methods for studying sensitive behavior include randomized response techniques, which provide anonymity to interviewees who answer sensitive questions. A variation on this approach, the quantitative randomized response technique (QRRT), allows researchers to estimate the frequency or quantity of sensitive behaviors. However, to date no studies have used QRRT to identify potential drivers of non-compliant behavior because regression methodology has not been developed for the nonnegative count data produced by QRRT. We develop a Poisson regression methodology for QRRT data, based on maximum likelihood estimation computed via the expectation-maximization (EM) algorithm. The methodology can be implemented with relatively minor modification of existing software for generalized linear models. We derive the Fisher information matrix in this setting and use it to obtain the asymptotic variance-covariance matrix of the regression parameter estimates. Simulation results demonstrate the quality of the asymptotic approximations. The method is illustrated with a case study examining potential drivers of non-compliance with hunting regulations in Sierra Leone. The new methodology allows assessment of the importance of potential drivers of different quantities of non-compliant behavior, using a likelihood-based, information-theoretic approach. Free, open-source software is provided to support QRRT regression.
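A sketch of the QRRT likelihood under one common design assumption: with known probability the respondent reports the true Poisson count (with a log-linear mean), and otherwise reports a draw from a known randomizing distribution, so each observed response contributes a two-component mixture term. The sketch maximizes this likelihood directly with BFGS rather than the EM algorithm used in the paper; all data and parameter values are simulated.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(8)

# Quantitative randomized response: with probability p_truth the respondent
# reports the true count (Poisson with a covariate-dependent mean); otherwise
# a value from a known randomizing distribution (here Poisson(lam_random)).
p_truth, lam_random = 0.7, 2.0
n = 1000
x = rng.normal(size=n)
beta_true = np.array([0.5, 0.8])
mu = np.exp(beta_true[0] + beta_true[1] * x)
truthful = rng.random(n) < p_truth
z = np.where(truthful, rng.poisson(mu), rng.poisson(lam_random, size=n))

# Observed-data log likelihood: a two-component mixture at each response.
def neg_loglik(beta):
    mu_i = np.exp(beta[0] + beta[1] * x)
    lik = (p_truth * stats.poisson.pmf(z, mu_i)
           + (1 - p_truth) * stats.poisson.pmf(z, lam_random))
    return -np.sum(np.log(lik))

fit = optimize.minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
print(fit.x)   # should be close to beta_true
```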
- Published
- 2018
- Full Text
- View/download PDF
26. Scale and uncertainty in modeled soil organic carbon stock changes for US croplands using a process-based model
- Author
-
S. Williams, Mark Easter, K. Killian, Stephen M. Ogle, F. Jay Breidt, and Keith Paustian
- Subjects
Hydrology ,Global and Planetary Change ,Ecology ,Soil organic matter ,Soil carbon ,Atmospheric sciences ,Confidence interval ,Ecosystem model ,Soil water ,Spatial ecology ,Environmental Chemistry ,Environmental science ,Uncertainty analysis ,Stock (geology) ,General Environmental Science - Abstract
Process-based model analyses are often used to estimate changes in soil organic carbon (SOC), particularly at regional to continental scales. However, uncertainties are rarely evaluated, and so it is difficult to determine how much confidence can be placed in the results. Our objective was to quantify uncertainties across multiple scales in a process-based model analysis, and provide 95% confidence intervals for the estimates. Specifically, we used the Century ecosystem model to estimate changes in SOC stocks for US croplands during the 1990s, addressing uncertainties in model inputs, structure and scaling of results from point locations to regions and the entire country. Overall, SOC stocks increased in US croplands by 14.6 Tg C yr⁻¹ from 1990 to 1995 and 17.5 Tg C yr⁻¹ during 1995 to 2000, and uncertainties were ±22% and ±16% for the two time periods, respectively. Uncertainties were inversely related to spatial scale, with median uncertainties at the regional scale estimated at ±118% and ±114% during the early and latter part of the 1990s, and even higher at the site scale with estimates at ±739% and ±674% for the time periods, respectively. This relationship appeared to be driven by the amount of the SOC stock change; changes in stocks that exceeded 200 Gg C yr⁻¹ represented a threshold where uncertainties were always lower than ±100%. Consequently, the amount of uncertainty in estimates derived from process-based models will partly depend on the level of SOC accumulation or loss. In general, the majority of uncertainty was associated with model structure in this application, and so attaining higher levels of precision in the estimates will largely depend on improving the model algorithms and parameterization, as well as increasing the number of measurement sites used to evaluate the structural uncertainty.
- Published
- 2010
- Full Text
- View/download PDF
27. Spatial Lasso With Applications to GIS Model Selection
- Author
-
David M. Theobald, Hsin-Cheng Huang, F. Jay Breidt, and Nan-Jung Hsu
- Subjects
Statistics and Probability ,Geographic information system ,Computer science ,business.industry ,Model selection ,Linear model ,Feature selection ,Generalized least squares ,computer.software_genre ,Cross-validation ,Lasso (statistics) ,Discrete Mathematics and Combinatorics ,Data mining ,Statistics, Probability and Uncertainty ,business ,computer ,Spatial analysis - Abstract
Geographic information systems (GIS) organize spatial data in multiple two-dimensional arrays called layers. In many applications, a response of interest is observed on a set of sites in the landscape, and it is of interest to build a regression model from the GIS layers to predict the response at unsampled sites. Model selection in this context then consists not only of selecting appropriate layers, but also of choosing appropriate neighborhoods within those layers. We formalize this problem as a linear model and propose the use of Lasso to simultaneously select variables, choose neighborhoods, and estimate parameters. Spatially dependent errors are accounted for using generalized least squares and spatial smoothness in selected coefficients is incorporated through use of a priori spatial covariance structure. This leads to a modification of the Lasso procedure, called spatial Lasso. The spatial Lasso can be implemented by a fast algorithm and it performs well in numerical examples, including an applicat...
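The selection problem can be sketched as follows: summarize each GIS layer over several neighborhood sizes, stack the summaries as candidate predictors, and let the lasso pick layers and neighborhoods simultaneously. The layers, response, and neighborhood radii below are hypothetical, and the sketch uses the ordinary lasso rather than the paper's generalized-least-squares spatial lasso with an a priori spatial covariance structure.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(9)

# Hypothetical GIS raster layers (e.g., elevation, land-cover index) on a grid.
n = 60
layers = {"elev": rng.normal(size=(n, n)), "cover": rng.normal(size=(n, n))}

def neighborhood_mean(layer, radius):
    """Mean over a square neighborhood of the given radius around each cell."""
    padded = np.pad(layer, radius, mode="edge")
    out = np.zeros_like(layer)
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            out += padded[radius + di:radius + di + n, radius + dj:radius + dj + n]
    return out / (2 * radius + 1) ** 2

# Candidate predictors: each layer summarized over several neighborhood sizes.
features, names = [], []
for name, layer in layers.items():
    for radius in (0, 1, 3, 5):
        features.append(neighborhood_mean(layer, radius).ravel())
        names.append(f"{name}_r{radius}")
X = np.column_stack(features)

# Hypothetical response observed at sampled sites, driven by one layer/scale.
y = 2.0 * neighborhood_mean(layers["elev"], 3).ravel() + rng.normal(scale=0.5, size=n * n)
sites = rng.choice(n * n, size=300, replace=False)

# The lasso simultaneously selects layers and neighborhood sizes.
fit = Lasso(alpha=0.05).fit(X[sites], y[sites])
print(dict(zip(names, np.round(fit.coef_, 2))))
```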
- Published
- 2010
- Full Text
- View/download PDF
28. Predicting Enhanced Vegetation Index (EVI) curves for ecosystem modeling applications
- Author
-
Amandine Dutin, F. Jay Breidt, Stephen M. Ogle, and Ram Gurung
- Subjects
Ecosystem model ,Linear regression ,Simulation modeling ,Soil Science ,Primary production ,Environmental science ,Geology ,Regression analysis ,Enhanced vegetation index ,Vegetation ,Computers in Earth Sciences ,Time series ,Remote sensing - Abstract
Vegetation indices derived from remote sensing data provide information about the variability in stature, growth and vigor of the vegetation across a region, and have been used to model plant processes. For example, the Enhanced Vegetation Index (EVI) provides a measure of greenness of the vegetation that can be used to predict net primary production. However, ecosystem models relying on remote sensing data for EVI or other vegetation indices are limited by the time series of the satellite data record. Our objective was to develop a statistical model to predict EVI in order to extend the time series for modeling applications. To explain the functional behavior of the seasonal EVI curves, a two-stage multiple regression fitting procedure within a semi-parametric mixed effect (SPME) model framework was used. First, a linear mixed effect (LME) model was fitted to the EVI with climate indexes, crop and irrigation information as predictor variables. Second, penalized B-splines were used to model the smooth residuals remaining after the smooth fit to the EVI curve, in order to describe the uncertainty of the EVI curve. Individual models were fit within individual Major Land Resource Areas (MLRAs). Predicted seasonal EVI, derived from our regression equations, showed strong agreement with the observed EVI and was able to capture the site-by-site and year-by-year variation in the EVI curve. Out-of-sample prediction produced excellent results for a majority of the sites, except for sites without clear seasonal patterns, which may have resulted from cloud contamination and/or snow cover. Therefore, given the appropriate climate, crop, and irrigation information, the proposed approach can be used to predict seasonal EVI curves for extending the time series into the past and future.
- Published
- 2009
- Full Text
- View/download PDF
29. Estimating Distribution Functions from Survey Data Using Nonparametric Regression
- Author
-
Alicia A. Johnson, Jean D. Opsomer, and F. Jay Breidt
- Subjects
Statistics and Probability ,Extremum estimator ,Statistics ,Econometrics ,Nonparametric statistics ,Asymptotic distribution ,Local regression ,Kernel regression ,Estimator ,M-estimator ,Nonparametric regression ,Mathematics - Abstract
Auxiliary information is often used to improve the precision of estimators of the finite population cumulative distribution function through the use of superpopulation models. A variety of approaches are available to construct such estimators, including design-based, model-based and model-assisted methods. The superpopulation modeling framework can be either parametric or nonparametric, and the estimators can be constructed as either linear or nonlinear functions of the observations. In this article, we argue that model-assisted estimators based on a nonparametric model are a good overall choice for distribution function estimators, because they have good efficiency properties and are robust against model misspecification. When such estimators are constructed as linear functions of the data, they are also easily incorporated into the existing survey estimation paradigm through the use of survey weights. Theoretical properties of nonparametric distribution function estimators based on local linear regression are derived, and their practical behavior is evaluated in a simulation study.
- Published
- 2008
- Full Text
- View/download PDF
30. Sparse Functional Dynamical Models — A Big Data Approach
- Author
-
Ela Sienkiewicz, Dong Song, F. Jay Breidt, and Haonan Wang
- Subjects
Quantitative Biology::Neurons and Cognition - Abstract
Nonlinear dynamical systems are encountered in many areas of social science, natural science and engineering, and are of particular interest for complex biological processes like the spiking activity of neural ensembles in the brain. To describe such spiking activity, we adapt the Volterra series expansion of an analytic function to account for the point-process nature of multiple inputs and a single output (MISO) in a neural ensemble. Our model describes the transformed spiking probability for the output as the sum of kernel-weighted integrals of the inputs. The kernel functions need to be identified and estimated, and both local sparsity (kernel functions may be zero on part of their support) and global sparsity (some kernel functions may be identically zero) are of interest. The kernel functions are approximated by B-splines and a penalized likelihood-based approach is proposed for estimation. Even for moderately complex brain functionality, the identification and estimation of this sparse functional dynamical model poses major computational challenges, which we address with big data techniques that can be implemented on a single, multi-core server. The performance of the proposed method is demonstrated using neural recordings from the hippocampus of a rat during open field tasks.
- Published
- 2016
- Full Text
- View/download PDF
31. A diagnostic test for autocorrelation in increment-averaged data with application to soil sampling
- Author
-
William Coar, Nan-Jung Hsu, and F. Jay Breidt
- Subjects
Statistics and Probability ,Statistics::Theory ,Heteroscedasticity ,Autocorrelation technique ,Autocorrelation ,Autocovariance ,Statistics ,Linear regression ,Statistics::Methodology ,Statistics, Probability and Uncertainty ,Spatial analysis ,General Environmental Science ,Mathematics ,Parametric statistics ,Cholesky decomposition - Abstract
Motivated by the problem of detecting spatial autocorrelation in increment-averaged data from soil core samples, we use the Cholesky decomposition of the inverse of an autocovariance matrix to derive a parametric linear regression model for autocovariances. In the absence of autocorrelation, the off-diagonal terms in the lower triangular matrix from the Cholesky decomposition should be identically zero, and so the regression coefficients should be identically zero. The standard F-test of this hypothesis and two bootstrapped versions of the test are evaluated as autocorrelation diagnostics via simulation. Size is assessed for a variety of heteroskedastic null hypotheses. Power is evaluated against autocorrelated alternatives, including increment-averaged Ornstein-Uhlenbeck and Matérn processes. The bootstrapped tests maintain approximately the correct size and have good power against moderately autocorrelated alternatives. The methods are applied to data from a study of carbon sequestration in agricultural soils.
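A simplified illustration of the diagnostic's logic, assuming replicated cores (rows) measured over the same depth increments (columns): regress each increment on all preceding increments and F-test whether the lag coefficients, which correspond to off-diagonal Cholesky terms, are zero. This is a loose sketch with simulated data, not the paper's increment-averaged construction or its bootstrapped tests.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)

# Hypothetical soil-core data: rows are cores (replicates), columns are
# successive depth increments, generated with mild autocorrelation down
# the profile.
n_cores, n_incr = 40, 6
rho = 0.5
y = np.zeros((n_cores, n_incr))
y[:, 0] = rng.normal(size=n_cores)
for t in range(1, n_incr):
    y[:, t] = rho * y[:, t - 1] + rng.normal(scale=np.sqrt(1 - rho**2), size=n_cores)

# Simplified Cholesky-style diagnostic: regress each increment on all
# preceding increments; under no autocorrelation the lag coefficients
# (off-diagonal Cholesky terms) are zero.  F-test per increment.
f_stats, dfs = [], []
for t in range(1, n_incr):
    X = np.column_stack([np.ones(n_cores), y[:, :t]])          # intercept + lags
    _coef, rss, *_ = np.linalg.lstsq(X, y[:, t], rcond=None)
    rss = float(rss[0])
    rss0 = np.sum((y[:, t] - y[:, t].mean()) ** 2)              # intercept-only fit
    q, df2 = t, n_cores - t - 1
    f_stats.append(((rss0 - rss) / q) / (rss / df2))
    dfs.append((q, df2))

p_values = [1 - stats.f.cdf(f, q, df2) for f, (q, df2) in zip(f_stats, dfs)]
print(np.round(p_values, 3))   # small p-values indicate autocorrelation
```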
- Published
- 2007
- Full Text
- View/download PDF
32. Semiparametric Mixed Models for Increment-Averaged Data With Application to Carbon Sequestration in Agricultural Soils
- Author
-
Stephen M. Ogle, Nan-Jung Hsu, and F. Jay Breidt
- Subjects
Statistics and Probability ,Mixed model ,Restricted maximum likelihood ,Parametric model ,Statistics ,Statistics, Probability and Uncertainty ,Additive model ,Random effects model ,Smoothing ,Nonparametric regression ,Semiparametric model ,Mathematics - Abstract
Adoption of conservation tillage practice in agriculture offers the potential to mitigate greenhouse gas emissions. Studies comparing conservation tillage methods to traditional tillage pair fields under the two management systems and obtain soil core samples from each treatment. Cores are divided into multiple increments, and matching increments from one or more cores are aggregated and analyzed for carbon stock. These data represent not the actual value at a specific depth, but rather the total or average over a depth increment. A semiparametric mixed model is developed for such increment-averaged data. The model uses parametric fixed effects to represent covariate effects, random effects to capture correlation within studies, and an integrated smooth function to describe effects of depth. The depth function is specified as an additive model, estimated with penalized splines using standard mixed model software. Smoothing parameters are automatically selected using restricted maximum likelihood. The meth...
- Published
- 2007
- Full Text
- View/download PDF
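One concrete piece of the approach in entry 32 is that each observation is an average over a depth increment, so the spline basis must be integrated over the increment before fitting. The sketch below uses illustrative assumptions throughout: a linear truncated-polynomial basis, numerical averaging, and a fixed ridge penalty standing in for the paper's REML-estimated smoothing parameter. A real analysis would pool many cores and add study-level random effects.

```python
import numpy as np

rng = np.random.default_rng(2)

def spline_basis(d, knots):
    """Linear truncated-polynomial spline basis evaluated at depths d (cm)."""
    return np.column_stack([np.ones_like(d), d] + [np.clip(d - k, 0.0, None) for k in knots])

knots = np.linspace(5, 95, 10)
increments = [(0, 10), (10, 20), (20, 40), (40, 70), (70, 100)]   # sampled depth increments

# design matrix: average the basis over each increment (fine Riemann sum)
X = np.array([spline_basis(np.linspace(lo, hi, 200), knots).mean(axis=0)
              for lo, hi in increments])

# simulated increment-averaged carbon stocks from a smooth, declining depth profile
truth = np.array([(50 * np.exp(-0.03 * np.linspace(lo, hi, 200))).mean() for lo, hi in increments])
y = truth + rng.normal(0, 1.0, size=len(increments))

lam = 1.0                                       # smoothing parameter (REML-selected in the paper)
P = np.diag([0.0, 0.0] + [1.0] * len(knots))    # penalize only the truncated-polynomial terms
beta = np.linalg.solve(X.T @ X + lam * P, X.T @ y)
print(np.round(spline_basis(np.array([5.0, 25.0, 55.0]), knots) @ beta, 2))  # point-depth estimates
```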
33. An empirically based approach for estimating uncertainty associated with modelling carbon sequestration in soils
- Author
-
Stephen M. Ogle, Mark Easter, Keith Paustian, F. Jay Breidt, and S. Williams
- Subjects
Propagation of uncertainty ,Ecological Modeling ,Greenhouse gas ,Simulation modeling ,Econometrics ,Environmental science ,Variance (accounting) ,Soil carbon ,Agricultural productivity ,Carbon sequestration ,Uncertainty analysis - Abstract
Simulation modelling is used to estimate C sequestration associated with agricultural management for purposes of greenhouse gas mitigation. Models are not completely accurate or precise estimators of C pools, however, due to insufficient knowledge and imperfect conceptualizations about ecosystem processes, leading to uncertainty in the results. It can be difficult to quantify the uncertainty using traditional error propagation techniques, such as Monte Carlo analyses, because of the structural complexity of simulation models. Empirically based methods provide an alternative to the error propagation techniques, and our objective was to apply this alternative approach. Specifically, we developed a linear mixed-effects model to quantify both bias and variance in modeled soil C stocks that were estimated using the Century ecosystem simulation model. The statistical analysis was based on measurements from 47 agricultural experiments. A significant relationship was found between model results and measurements, although there were biases and imprecision in the modeled estimates. Century under-estimated soil C stocks for several management practices, including organic amendments, no-till adoption, and inclusion of hay or pasture in rotation with annual crops. Century also over-estimated the impact of N fertilization on soil C stocks. For lands set aside from agricultural production, Century under-estimated soil C stocks on low-carbon soils and over-estimated the stocks on high-carbon soils. Using an empirically based approach allows simulation model results to be adjusted for biases and the variance associated with modeled estimates to be quantified, according to the measured “reality” of management impacts from a network of experimental sites.
- Published
- 2007
- Full Text
- View/download PDF
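A hedged sketch of the empirical adjustment idea in entry 33: measured soil C stocks are regressed on model predictions with a random intercept for each experiment, so the fixed effects quantify bias (intercept away from 0, slope away from 1) and the variance components quantify imprecision. All data and variable names below are simulated and illustrative, not the study's.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_exp, n_obs = 20, 6
site = np.repeat(np.arange(n_exp), n_obs)
modeled = rng.uniform(20, 80, size=n_exp * n_obs)                 # model predictions (Mg C/ha)
site_effect = rng.normal(0, 3, size=n_exp)[site]                  # between-experiment variation
measured = 5 + 0.85 * modeled + site_effect + rng.normal(0, 4, size=n_exp * n_obs)

df = pd.DataFrame({"measured": measured, "modeled": modeled, "experiment": site})
fit = smf.mixedlm("measured ~ modeled", df, groups=df["experiment"]).fit()
print(fit.summary())    # intercept != 0 or slope != 1 indicates bias in the modeled stocks
```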
34. Deriving Comprehensive County‐Level Crop Yield and Area Data for U.S. Cropland
- Author
-
Erandathie Lokupitiya, R. S. Lokupitiya, F. Jay Breidt, Keith Paustian, and S. Williams
- Subjects
Crop ,Agronomy ,Agriculture ,business.industry ,Yield (finance) ,Crop yield ,Environmental science ,Census ,Policy design ,County level ,Missing data ,business ,Agronomy and Crop Science - Abstract
Ground-based data on crop production in the USA are provided through surveys conducted by the National Agricultural Statistics Service (NASS) and the Census of Agriculture (AgCensus). Statistics from these surveys are widely used in economic analyses, policy design, and for other purposes. However, missing data in the surveys present limitations for research that requires comprehensive data for spatial analyses. We created comprehensive county-level databases for nine major crops of the USA for a 16-yr period by filling the gaps in existing data reported by NASS and AgCensus. We used a combination of regression analyses, based on data reported by NASS and the AgCensus, and linear mixed-effects models incorporating county-level environmental, management, and economic variables pertaining to different agroecozones. Predicted yield and crop area were very close to the data reported by NASS, within 10% relative error. The linear mixed-effects model approach gave the best results, filling 84% of the total gaps in yields and 83% of the gaps in crop areas across all crops. Regression analyses with AgCensus data filled 16% of the gaps in yields and crop areas of the major crops reported by NASS.
- Published
- 2007
- Full Text
- View/download PDF
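A hedged sketch of the gap-filling workflow in entry 34: a model is fit to counties for which yields are reported and then used to predict yields for counties with missing values. A plain linear regression and made-up covariate names (precip, fert_rate) stand in for the paper's regression and linear mixed-effects models.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(10)
n = 500
df = pd.DataFrame({
    "precip": rng.uniform(300, 1200, n),      # illustrative county-level covariates
    "fert_rate": rng.uniform(50, 200, n),
})
df["yield"] = 2.0 + 0.004 * df["precip"] + 0.01 * df["fert_rate"] + rng.normal(0, 0.5, n)
df.loc[rng.choice(n, 80, replace=False), "yield"] = np.nan     # counties missing from the survey

obs = df.dropna()
model = LinearRegression().fit(obs[["precip", "fert_rate"]], obs["yield"])

missing = df["yield"].isna()
df.loc[missing, "yield"] = model.predict(df.loc[missing, ["precip", "fert_rate"]])
print(int(df["yield"].isna().sum()), "gaps remaining")
```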
35. Controlling the American Community Survey to intercensal population estimates
- Author
-
F. Jay Breidt and David A. Swanson
- Subjects
education.field_of_study ,Geography ,Population statistics ,Statistics ,Population ,General Social Sciences ,Sampling (statistics) ,Estimator ,Variance (accounting) ,Census ,Projection (set theory) ,education ,American Community Survey - Abstract
The US Census Bureau has proposed the use of demographic projections as population controls for the American Community Survey at a fine level of geographic and demographic stratification. These projections are known to be imperfect. Bias and variance of post-stratification estimators with imperfect population controls at various levels of aggregation are considered. The bias and variance are computed with respect to the “model” (including data generation and demographic projection) or with respect to the “design” (including coverage, sampling, response and demographic projection). Bias and variance depend in a complex way on the interactions of demographic projection errors with undercoverage error and nonresponse. Numerical examples illustrate that in the presence of imperfect demographic projections, control at higher levels of aggregation may be better in terms of bias than control at a fine level of post-stratification.
- Published
- 2006
- Full Text
- View/download PDF
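A small numerical illustration of the point in entry 35 (all figures invented): the post-stratification estimator weights cell sample means by population controls, so errors in the projected controls feed directly into bias of the estimated total.

```python
import numpy as np

rng = np.random.default_rng(4)
true_N = np.array([5000, 3000, 2000])             # true cell populations
cell_means = np.array([40.0, 55.0, 70.0])         # true cell means of the study variable

# survey: simple random samples within cells
ybar = np.array([rng.normal(m, 10, size=100).mean() for m in cell_means])

projected_N = true_N * np.array([1.04, 0.97, 1.02])   # imperfect demographic projections

print("true total:      ", float((true_N * cell_means).sum()))
print("estimated total: ", float((projected_N * ybar).sum()))   # bias driven by projection error
```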
36. Bias and variance in model results associated with spatial scaling of measurements for parameterization in regional assessments
- Author
-
Keith Paustian, Stephen M. Ogle, and F. Jay Breidt
- Subjects
Hydrology ,Global and Planetary Change ,Ecology ,Feature scaling ,Confidence interval ,Ecosystem model ,Sample size determination ,Statistics ,Spatial ecology ,Environmental Chemistry ,Land use, land-use change and forestry ,Scaling ,Uncertainty analysis ,General Environmental Science ,Mathematics - Abstract
Models are central to global change analyses, but they are often parameterized using data that represent only a portion of heterogeneity in a region. This creates uncertainty in the results and constrains the reliability of model inferences. Our objective was to evaluate the uncertainty associated with differential scaling of parameterization data to model soil organic carbon stock changes as a function of US agricultural land use and management. Specifically, we compared analyses in which model parameters were derived from field experimental data that were scaled to the entire US vs. the same data scaled to climate regions within the country. We evaluated the effect of differential scaling on both bias and variance in model results. Model results had less variance when data were scaled to the entire country because of a larger sample size for deriving individual parameter values, although there was a relatively large bias associated with this parameterization, estimated at 2.7 Tg C yr−1. Even with the large bias, resulting confidence intervals from the two parameterizations had considerable overlap for the estimated national rate of SOC change (i.e. 77% overlap in those intervals). Consequently, the results were relatively similar when focusing on the uncertainty rather than solely on the mean estimate. In contrast, large biases created less overlap in confidence intervals for the change rates within individual climate regions, compared with the national estimates. For example, the overlap in resulting intervals from the two parameterizations was only 32% for the warm temperate moist region, with a corresponding bias of 3.1 Tg C yr−1. These findings demonstrate that there is a greater risk of making erroneous inferences because of large biases if models are parameterized with broader-scale information, such as an entire country, and then used to address impacts at a finer spatial scale, such as subregions within a country. In addition, the study demonstrates a trade-off between variance and bias in model results that depends on the scaling of data for model parameterization.
- Published
- 2006
- Full Text
- View/download PDF
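A minimal helper illustrating the interval-overlap comparison used in entry 36, under the assumption that overlap is measured as the length of the intersection relative to the length of the union of the two confidence intervals (the paper's exact metric may differ); the example intervals are invented.

```python
def interval_overlap(a, b):
    """Fractional overlap of intervals a = (lo, hi) and b = (lo, hi)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union

# e.g. national-scale vs. regional-scale parameterization intervals (Tg C yr-1, illustrative)
print(interval_overlap((2.0, 12.0), (4.5, 13.5)))
```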
37. Agricultural management impacts on soil organic carbon storage under moist and dry climatic conditions of temperate and tropical regions
- Author
-
Keith Paustian, Stephen M. Ogle, and F. Jay Breidt
- Subjects
Conventional tillage ,Agroforestry ,Soil organic matter ,Tropics ,Soil carbon ,Carbon sequestration ,Tillage ,Agronomy ,Soil water ,Temperate climate ,Environmental Chemistry ,Environmental science ,Earth-Surface Processes ,Water Science and Technology - Abstract
We conducted a meta-analysis to quantify the impact of changing agricultural land use and management on soil organic carbon (SOC) storage under moist and dry climatic conditions of temperate and tropical regions. We derived estimates of management impacts for a carbon accounting approach developed by the Intergovernmental Panel on Climate Change, addressing the impact of long-term cultivation, setting aside land from crop production, changing tillage management, and modifying C input to the soil by varying cropping practices. We found 126 articles that met our criteria and analyzed the data in linear mixed-effects models. In general, management impacts were sensitive to climate in the following order, from largest to smallest changes in SOC: tropical moist > tropical dry > temperate moist > temperate dry. For example, long-term cultivation caused the greatest loss of SOC in tropical moist climates, with cultivated soils having 0.58 ± 0.12, or 58%, of the amount found under native vegetation, followed by tropical dry climates with 0.69 ± 0.13, temperate moist with 0.71 ± 0.04, and temperate dry with 0.82 ± 0.04. Similarly, converting from conventional tillage to no-till increased SOC storage over 20 years by a factor of 1.23 ± 0.05 in tropical moist climates, which is a 23% increase in SOC, while the corresponding change in tropical dry climates was 1.17 ± 0.05, temperate moist was 1.16 ± 0.02, and temperate dry was 1.10 ± 0.03. These results demonstrate that agricultural management impacts on SOC storage will vary depending on climatic conditions that influence the plant and soil processes driving soil organic matter dynamics.
- Published
- 2005
- Full Text
- View/download PDF
38. Simulation Estimation of Quantiles From a Distribution With Known Mean
- Author
-
F. Jay Breidt
- Subjects
Statistics and Probability ,Mean squared error ,Order statistic ,Estimator ,Markov chain Monte Carlo ,Control variates ,Asymptotic theory (statistics) ,Quantile regression ,symbols.namesake ,Statistics ,symbols ,Discrete Mathematics and Combinatorics ,Statistics, Probability and Uncertainty ,Mathematics ,Quantile - Abstract
It is common in practice to estimate the quantiles of a complicated distribution by using the order statistics of a simulated sample. If the distribution of interest has a known population mean, then it is often possible to improve the mean squared error of the standard quantile estimator substantially through the simple device of mean-correction: subtract off the sample mean and add on the known population mean. Asymptotic results for the mean-corrected quantile estimator are derived and compared to the standard sample quantile. Simulation results for a variety of distributions and processes illustrate the asymptotic theory. Application to Markov chain Monte Carlo and to simulation-based uncertainty analysis is described.
- Published
- 2004
- Full Text
- View/download PDF
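The mean-correction device in entry 38 is simple enough to show directly; the sketch below applies it to a simulated gamma sample whose population mean is known, and compares the corrected and uncorrected sample quantiles with the exact quantile.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
draws = rng.gamma(shape=2.0, scale=3.0, size=500)    # simulated sample; known mean = 2 * 3 = 6
known_mean = 6.0

corrected = draws - draws.mean() + known_mean        # mean-correction device
q = 0.95
print("standard sample quantile: ", np.quantile(draws, q))
print("mean-corrected estimate:  ", np.quantile(corrected, q))
print("true 0.95 quantile:       ", stats.gamma.ppf(q, a=2.0, scale=3.0))
```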
39. The potential to mitigate global warming with no-tillage management is only realized when practised in the long term
- Author
-
Richard T. Conant, Keith Paustian, Johan Six, F. Jay Breidt, Arvin R. Mosier, and Stephen M. Ogle
- Subjects
Global and Planetary Change ,Ecology ,Meteorology ,business.industry ,Global warming ,Land management ,Carbon sequestration ,Tillage ,Environmental protection ,Agriculture ,Greenhouse gas ,Temperate climate ,Environmental Chemistry ,Environmental science ,business ,Global-warming potential ,General Environmental Science - Abstract
No-tillage (NT) management has been promoted as a practice capable of offsetting greenhouse gas (GHG) emissions because of its ability to sequester carbon in soils. However, true mitigation is only possible if the overall impact of NT adoption reduces the net global warming potential (GWP) determined by fluxes of the three major biogenic GHGs (i.e. CO2, N2O, and CH4). We compiled all available data of soil-derived GHG emission comparisons between conventional tilled (CT) and NT systems for humid and dry temperate climates. Newly converted NT systems increase GWP relative to CT practices, in both humid and dry climate regimes, and longer-term adoption (>10 years) only significantly reduces GWP in humid climates. Mean cumulative GWP over a 20-year period is also reduced under continuous NT in dry areas, but with a high degree of uncertainty. Emissions of N2O drive much of the trend in net GWP, suggesting improved nitrogen management is essential to realize the full benefit from carbon storage in the soil for purposes of global warming mitigation. Our results indicate a strong time dependency in the GHG mitigation potential of NT agriculture, demonstrating that GHG mitigation by adoption of NT is much more variable and complex than previously considered, and policy plans to reduce global warming through this land management practice need further scrutiny to ensure success.
- Published
- 2004
- Full Text
- View/download PDF
40. Uncertainty in estimating land use and management impacts on soil organic carbon storage for US agricultural lands between 1982 and 1997
- Author
-
Marlen D. Eve, Stephen M. Ogle, F. Jay Breidt, and Keith Paustian
- Subjects
Global and Planetary Change ,Ecology ,Land use ,Agroforestry ,Climate change ,Soil carbon ,Carbon sequestration ,Tillage ,Agricultural land ,Environmental Chemistry ,Environmental science ,Land use, land-use change and forestry ,Water resource management ,Uncertainty analysis ,General Environmental Science - Abstract
Uncertainty was quantified for an inventory estimating change in soil organic carbon (SOC) storage resulting from modifications in land use and management across US agricultural lands between 1982 and 1997. This inventory was conducted using a modified version of a carbon (C) accounting method developed by the Intergovernmental Panel on Climate Change (IPCC). Probability density functions (PDFs) were derived for each input to the IPCC model, including reference SOC stocks, land use/management activity data, and management factors. Change in C storage was estimated using a Monte Carlo approach with 50,000 iterations, by randomly selecting values from the PDFs after accounting for dependencies in the model inputs. Over the inventory period, mineral soils had a net gain of 10.8 Tg C yr−1, with a 95% confidence interval ranging from 6.5 to 15.3 Tg C yr−1. Most of this gain was due to setting aside lands in the Conservation Reserve Program. In contrast, managed organic soils lost 9.4 Tg C yr−1, with a 95% confidence interval ranging from 6.4 to 13.3 Tg C yr−1. Combining these gains and losses in SOC, US agricultural soils accrued 1.3 Tg C yr−1 due to land use and management change, with a 95% confidence interval ranging from a loss of 4.4 Tg C yr−1 to a gain of 6.9 Tg C yr−1. Most of the uncertainty was attributed to management factors for tillage, land use change between cultivated and uncultivated conditions, and C loss rates from managed organic soils. Based on the uncertainty, we are not able to conclude with 95% confidence that change in US agricultural land use and management between 1982 and 1997 created a net C sink for atmospheric CO2.
- Published
- 2003
- Full Text
- View/download PDF
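A hedged sketch of the Monte Carlo uncertainty analysis in entry 40: each model input is drawn from its probability density function, the carbon accounting calculation is repeated on every iteration, and percentiles of the results give the 95% interval. The toy accounting formula and input distributions below are purely illustrative, not the IPCC method.

```python
import numpy as np

rng = np.random.default_rng(6)
n_iter = 50_000

ref_stock = rng.normal(60.0, 5.0, n_iter)           # reference SOC stock (Mg C/ha)
tillage_factor = rng.normal(1.10, 0.03, n_iter)     # management factor for a tillage change
area = rng.normal(1.5e6, 1e5, n_iter)               # affected area (ha)

# toy accounting: stock change over a 20-year period, converted to Tg C per year
delta_c = ref_stock * (tillage_factor - 1.0) * area / 20.0 / 1e6
lo, med, hi = np.percentile(delta_c, [2.5, 50, 97.5])
print(f"net change: {med:.2f} Tg C/yr (95% CI {lo:.2f} to {hi:.2f})")
```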
41. Spline Estimators of the Density Function of a Variable Measured with Error
- Author
-
Cong Chen, Wayne A. Fuller, and F. Jay Breidt
- Subjects
Statistics and Probability ,Normal distribution ,Efficient estimator ,Bias of an estimator ,Estimation theory ,Modeling and Simulation ,Kernel density estimation ,Statistics ,Statistics::Methodology ,Estimator ,Probability density function ,Minimax estimator ,Mathematics - Abstract
The estimation of the distribution function of a random variable X measured with error is studied. It is assumed that the measurement error has a normal distribution with known parameters. Let the i-th observation on X be denoted by Yi = Xi + ei, where ei is the measurement error. Let {Yi} (i = 1, 2, …, n) be a sample of independent observations. It is assumed that {Xi} and {ei} are mutually independent and each is identically distributed. The proposed estimator is a spline function that transforms X into a standard normal variable. The parameters of the spline function are obtained by maximum likelihood estimation. The number of parameters is determined by the data with a simple criterion, such as AIC. Computationally, a weighted quantile regression estimator is used as the starting value for the nonlinear optimization procedure of the MLE. In a simulation study, both the quantile regression estimator and the maximum likelihood estimator dominate an optimal kernel estimator and a mixture estimator.
- Published
- 2003
- Full Text
- View/download PDF
42. Bayesian analysis of fractionally integrated ARMA with additive noise
- Author
-
F. Jay Breidt and Nan-Jung Hsu
- Subjects
Mathematical optimization ,Strategy and Management ,Bayesian probability ,Estimator ,Sampling (statistics) ,Management Science and Operations Research ,Computer Science Applications ,Approximation error ,Frequentist inference ,Modeling and Simulation ,Autoregressive–moving-average model ,Statistics, Probability and Uncertainty ,Algorithm ,Importance sampling ,Autoregressive fractionally integrated moving average ,Mathematics - Abstract
A new sampling-based Bayesian approach for fractionally integrated autoregressive moving average (ARFIMA) processes is presented. A particular type of ARMA process is used as an approximation for the ARFIMA in a Metropolis–Hastings algorithm, and then importance sampling is used to adjust for the approximation error. This algorithm is relatively time-efficient because of fast convergence in the sampling procedures and fewer computations than competitors. Its frequentist properties are investigated through a simulation study. The performance of the posterior means is quite comparable to that of the maximum likelihood estimators for small samples, but the algorithm can be extended easily to a variety of related processes, including ARFIMA plus short-memory noise. The methodology is illustrated using the Nile River data.
- Published
- 2003
- Full Text
- View/download PDF
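A generic, hedged sketch of the correction step in entry 42: draws obtained under an approximate likelihood are reweighted by the ratio of the exact to the approximate likelihood, so posterior summaries account for the approximation error. Simple normal likelihoods for a mean stand in for the paper's ARFIMA and approximating ARMA likelihoods.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
y = rng.normal(0.3, 1.0, size=200)
n = len(y)

# draws from the posterior of the mean under a deliberately biased approximate likelihood
theta = rng.normal(y.mean() - 0.1, 1.0 / np.sqrt(n), size=5000)

# importance weights: exact likelihood over approximate likelihood, evaluated at each draw
log_exact = np.array([stats.norm.logpdf(y, t, 1.0).sum() for t in theta])
log_approx = np.array([stats.norm.logpdf(y, t + 0.1, 1.0).sum() for t in theta])
log_w = log_exact - log_approx
w = np.exp(log_w - log_w.max())
w /= w.sum()

print("uncorrected posterior mean:", theta.mean())           # off by about the built-in 0.1 bias
print("corrected posterior mean:  ", float(np.sum(w * theta)))
```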
43. Television Advertising and Beef Demand: Bayesian Inference in a Random Effects Tobit Model
- Author
-
John R. Schroeter, F. Jay Breidt, and Jeremy T. Benson
- Subjects
Economics and Econometrics ,Global and Planetary Change ,Ecology ,Econometrics ,Economics ,Animal Science and Zoology ,Tobit model ,Television advertising ,Bayesian inference ,Random effects model ,Agronomy and Crop Science ,Humanities - Abstract
A number of recent empirical studies have generated skepticism about the effectiveness of generic advertising for beef. One of these studies, Jensen and Schroeter (1992), examines data collected from a panel of households in a carefully designed experimental test of television advertising. The present paper undertakes a reexamination of the Jensen and Schroeter data with two significant improvements in method. First, the analysis disaggregates beef purchases into three product types (ground beef, steaks, and roasts) and assesses advertising's impact on the demand for each type separately. Second, the present analysis uses an improved econometric method: Bayesian inference in a random effects Tobit model. Inference is based on simulations of a posterior distribution using Gibbs sampling and data augmentation. As far as advertising's effects are concerned, the results of this analysis reaffirm the Jensen and Schroeter finding: the experimental television advertising campaign was not effective in increasing household purchases of beef.
- Published
- 2002
- Full Text
- View/download PDF
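A hedged, stripped-down sketch of the data-augmentation Gibbs sampler behind entry 43, for a plain Tobit model without the random household effects: censored observations are augmented with latent draws from a truncated normal, and the regression coefficients are then drawn from their conditional normal posterior. A flat prior and known error variance are assumed here for brevity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n = 300
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma = np.array([-0.5, 1.0]), 1.0
y_star = x @ beta_true + rng.normal(0, sigma, n)        # latent purchase propensity
y = np.clip(y_star, 0, None)                            # observed purchases, censored at zero
cens = y_star <= 0

beta, draws = np.zeros(2), []
for it in range(2000):
    # 1) data augmentation: latent values for the censored observations
    mu = x @ beta
    z = y.copy()
    z[cens] = stats.truncnorm.rvs(-np.inf, (0.0 - mu[cens]) / sigma,
                                  loc=mu[cens], scale=sigma,
                                  size=int(cens.sum()), random_state=rng)
    # 2) conditional draw of the regression coefficients (flat prior, known sigma)
    V = np.linalg.inv(x.T @ x) * sigma**2
    m = np.linalg.solve(x.T @ x, x.T @ z)
    beta = rng.multivariate_normal(m, V)
    draws.append(beta)

print(np.mean(draws[500:], axis=0))                     # posterior means; compare with beta_true
```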
44. A class of nearly long-memory time series models
- Author
-
F. Jay Breidt and Nan-Jung Hsu
- Subjects
Univariate ,Markov chain Monte Carlo ,White noise ,Markov model ,Random walk ,Bayesian inference ,Moving-average model ,symbols.namesake ,Autoregressive model ,symbols ,Econometrics ,Applied mathematics ,Business and International Management ,Mathematics - Abstract
We consider an autoregressive regime-switching model for the dynamic mean structure of a univariate time series. The model allows for a variety of stationary and nonstationary alternatives, and includes the possibility of approximate long-memory behavior. The proposed model includes as special cases white noise, first-order autoregression, and random walk models as well as regime-switching models and the random level-shift model proposed by Chen and Tiao, Journal of Business and Economic Statistics, 8 (1990) p. 83. We describe properties of the model, focusing on its resemblance to long-memory under a certain asymptotic parameterization. We develop a reversible-jump Markov chain Monte Carlo method for Bayesian inference on unknown model parameters and apply the methodology to the Nile River data.
- Published
- 2002
- Full Text
- View/download PDF
45. Using Distributed Analytics to Enable Real-Time Exploration of Discrete Event Simulations
- Author
-
F. Jay Breidt, Sangmi Lee Pallickara, Walid Budgaga, Neil Harvey, Matthew Malensek, and Shrideep Pallickara
- Subjects
Range (mathematics) ,Event (computing) ,Analytics ,business.industry ,Computer science ,Distributed computing ,Real-time computing ,Complex system ,Complex event processing ,Cloud computing ,Discrete event simulation ,business ,Simulation language - Abstract
Discrete event simulations (DES) provide a powerful means for modeling complex systems and analyzing their behavior. DES capture all possible interactions between the entities they manage, which makes them highly expressive but also compute-intensive. These computational requirements often impose limitations on the breadth and/or depth of research that can be conducted with a discrete event simulation. This work describes our approach for leveraging the vast quantity of computing and storage resources available in both private organizations and public clouds to enable real-time exploration of a discrete event simulation. Rather than considering the execution speed of a single simulation run, we autonomously generate novel scenario variants to explore an entire subset of the simulation parameter space. These workloads are orchestrated in a distributed fashion across a wide range of commodity hardware. The resulting outputs are analyzed to produce models that accurately forecast simulation outcomes in real time, providing interactive feedback and bolstering research possibilities.
- Published
- 2014
- Full Text
- View/download PDF
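A hedged, single-machine sketch of the workflow in entry 45: scenario variants covering a slice of the simulation parameter space are run in parallel, and a surrogate model is fit to the outputs so that outcomes for unseen scenarios can be forecast interactively. The toy run_simulation function stands in for an actual discrete event simulation, and the random-forest surrogate is one plausible choice rather than the paper's specific analytics stack.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def run_simulation(params):
    """Stand-in for one discrete event simulation run at a given parameter setting."""
    rate, duration = params
    rng = np.random.default_rng(abs(hash(params)) % 2**32)
    return rate * duration + rng.normal(0, 0.5)

if __name__ == "__main__":
    # scenario variants covering a subset of the parameter space
    grid = list(product(np.linspace(0.1, 2.0, 20), np.linspace(1, 10, 20)))
    with ProcessPoolExecutor() as pool:
        outputs = list(pool.map(run_simulation, grid))

    # surrogate model fit to the simulation outputs for interactive forecasting
    surrogate = RandomForestRegressor(n_estimators=200).fit(np.array(grid), outputs)
    print(surrogate.predict([[1.0, 5.0]]))    # forecast for an unseen scenario
```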
46. LONG-RANGE DEPENDENT COMMON FACTOR MODELS: A BAYESIAN APPROACH
- Author
-
Bonnie K. Ray, Nan-Jung Hsu, and F. Jay Breidt
- Subjects
Statistics and Probability ,Multivariate statistics ,symbols.namesake ,Multivariate analysis ,Stochastic volatility ,Stochastic modelling ,Stochastic process ,Bayesian probability ,Econometrics ,symbols ,Markov chain Monte Carlo ,Likelihood function ,Mathematics - Abstract
We propose a simulation-based Bayesian approach to analyze multivariate time series with possible common long-range dependent factors. A state-space approach is used to represent the likelihood function in a tractable manner. The approach taken here allows for extension to fit a non-Gaussian multivariate stochastic volatility (MVSV) model with common long-range dependent components. The method is illustrated for a set of stock returns for companies having similar annual sales.
- Published
- 2001
- Full Text
- View/download PDF
47. A semiparametric estimator of the distribution function of a variable measured with error
- Author
-
F. Jay Breidt, Cong Chen, and Wayne A. Fuller
- Subjects
Statistics and Probability ,Minimum-variance unbiased estimator ,Efficient estimator ,Bias of an estimator ,Mean squared error ,Stein's unbiased risk estimate ,Statistics ,Estimator ,Applied mathematics ,Minimax estimator ,Invariant estimator ,Mathematics - Abstract
The estimation of the distribution function of a random variable X measured with error is studied. Let the i-th observation on X be denoted by Yi = Xi + ei, where ei is the measurement error. Let {Yi} (i = 1, 2, …, n) be a sample of independent observations. It is assumed that {Xi} and {ei} are mutually independent and each is identically distributed. As is standard in the literature for this problem, the distribution of e is assumed known in the development of the methodology. In practice, the measurement error distribution is estimated from replicate observations. The proposed semiparametric estimator is derived by estimating the quantiles of X on a set of n transformed V-values and smoothing the estimated quantiles using a spline function. The number of parameters of the spline function is determined by the data with a simple criterion, such as AIC. In a simulation study, the semiparametric estimator dominates an optimal kernel estimator and a normal mixture estimator for a wide class of densities. The proposed est...
- Published
- 2000
- Full Text
- View/download PDF
48. Statistical approaches for analyzing randomized response technique data
- Author
-
F. Jay Breidt, Michael C. Gavin, and Sara G. Lewis
- Subjects
Geography ,Randomized Response Technique ,Statistics ,Mann–Whitney U test ,Statistical analysis ,Logistic regression ,Ecology, Evolution, Behavior and Systematics ,Nature and Landscape Conservation - Published
- 2015
- Full Text
- View/download PDF
49. IMPROVED BOOTSTRAP PREDICTION INTERVALS FOR AUTOREGRESSIONS
- Author
-
F. Jay Breidt, Richard A. Davis, and William T. M. Dunsmuir
- Subjects
Statistics and Probability ,Statistics::Theory ,Mean squared error ,Calibration (statistics) ,Applied Mathematics ,Statistics ,Nonparametric statistics ,Coverage probability ,Statistics::Methodology ,Prediction interval ,Least absolute deviations ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
We consider bootstrap construction and calibration of prediction intervals for non-Gaussian autoregressions. In particular, we address the question of prediction conditioned on the last p observations of the process, for which we offer an exact simulation technique and an approximate bootstrap approach. In simulations for a variety of first-order autoregressions, we compare various nonparametric prediction intervals and find that calibration gives reasonably narrow prediction intervals with the lowest coverage probability mean squared error among the methods used.
- Published
- 1995
- Full Text
- View/download PDF
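A hedged sketch of the basic residual-bootstrap, one-step-ahead prediction interval for an AR(1), the uncalibrated starting point for the intervals studied in entry 49 (the calibration step and the exact conditional simulation technique are omitted). statsmodels' AutoReg provides the point fit, and the data are simulated with non-Gaussian (Laplace) noise.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(9)
y = np.zeros(300)
for t in range(1, 300):                               # AR(1) with heavy-tailed noise
    y[t] = 0.6 * y[t - 1] + rng.laplace(0.0, 1.0)

fit = AutoReg(y, lags=1).fit()
c, phi = fit.params                                   # intercept and AR coefficient
resid = fit.resid - fit.resid.mean()                  # centered residuals

B = 2000                                              # bootstrap replicates
future = c + phi * y[-1] + rng.choice(resid, size=B, replace=True)   # conditional on last obs
print("90% bootstrap prediction interval:", np.percentile(future, [5, 95]))
```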
50. Nonparametric Regression Using Kernel and Spline Methods
- Author
-
Jean D. Opsomer and F. Jay Breidt
- Published
- 2011
- Full Text
- View/download PDF