15,578 results on '"stratified sampling"'
Search Results
2. A novel building sampling approach leveraging data mining and stratified sampling theory for energy optimization
- Author
- Fang, Zhijian, Lei, Lei, and Zheng, Run
- Published
- 2025
- Full Text
- View/download PDF
3. Stable adsorption configuration searching in hetero-catalysis based on similar distribution and active learning
- Author
- Yang, Jiaqiang, Zhang, Xiaofei, Zhang, Xiaofeng, Niu, Bingbo, Wu, Feifeng, Luo, Ning, He, Jilin, Wang, Chengduo, Shan, Bin, and Li, Qingkui
- Published
- 2025
- Full Text
- View/download PDF
4. Compromise optimum allocation in neutrosophic multi-character survey under stratified random sampling using neutrosophic fuzzy programming
- Author
- Ullah, Atta, Shabbir, Javid, Alomair, Abdullah Mohammed, and Alarfaj, Fawaz Khaled
- Published
- 2024
- Full Text
- View/download PDF
5. Improved estimation of population variance in stratified successive sampling using calibrated weights under non-response
- Author
- Pandey, M.K., Singh, G.N., Zaman, Tolga, Mutairi, Aned Al, and Mustafa, Manahil SidAhmed
- Published
- 2024
- Full Text
- View/download PDF
6. Designing Efficient Stratified Mean-Per-Unit Sampling Applications in Accounting and Auditing.
- Author
- Hall, Thomas W., Hoogduin, Lucas A., Pierce, Bethane Jo, and Tsay, Jeffrey J.
- Subjects
- CONFIDENCE intervals, SAMPLE size (Statistics), ERROR rates, TRUST, SAMPLING (Process), AUDITING
- Abstract
Despite technological advances in accounting systems and audit techniques, sampling remains a commonly used audit tool. For critical estimation applications involving low error rate populations, stratified mean-per-unit sampling (SMPU) has the unique advantage of producing trustworthy confidence intervals. However, SMPU is less efficient than other classical sampling techniques because it requires a larger sample size to achieve comparable precision. To address this weakness, we investigated how SMPU efficiency can be improved via three key design choices: (a) stratum boundary selection method, (b) number of sampling strata, and (c) minimum stratum sample size. Our tests disclosed that SMPU efficiency varies significantly with stratum boundary selection method. An iterative search-based method yielded the best efficiency, followed by the Dalenius–Hodges and Equal-Value-Per-Stratum methods. We also found that variations in Dalenius–Hodges implementation procedures yielded meaningful differences in efficiency. Regardless of boundary selection method, increasing the number of sampling strata beyond levels recommended in the professional literature yielded significant improvements in SMPU efficiency. Although a minor factor, smaller values of minimum stratum sample size were found to yield better SMPU efficiency. Based on these findings, suggestions for improving SMPU efficiency are provided. We also present the first known equations for planning the number of sampling strata given various application-specific parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
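The abstract of result 6 names the Dalenius–Hodges method for choosing stratum boundaries without describing it. As a rough illustration only, the sketch below implements the textbook cumulative square-root-of-frequency rule: bin the values, accumulate the square roots of the bin counts, and cut where the running total crosses equal fractions of the final total. The function name `dalenius_hodges_boundaries`, the `n_bins` parameter, and the choice to place each boundary at the upper edge of the crossing bin are assumptions made for this example, not details taken from the paper (which reports that such implementation choices affect efficiency).

```python
import math


def dalenius_hodges_boundaries(values, n_strata, n_bins=50):
    """Return n_strata - 1 boundaries via the cumulative sqrt(frequency) rule.

    Hypothetical helper for illustration; not the cited authors' procedure.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / n_bins

    # Histogram of the stratification variable over equal-width bins.
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1

    # Running total of sqrt(bin frequency).
    cumulative = []
    total = 0.0
    for c in counts:
        total += math.sqrt(c)
        cumulative.append(total)

    # Cut where the running total first reaches k/n_strata of the final total.
    boundaries = []
    for k in range(1, n_strata):
        target = total * k / n_strata
        idx = next(i for i, s in enumerate(cumulative) if s >= target)
        boundaries.append(lo + (idx + 1) * width)  # upper edge of that bin
    return boundaries


# Example: a right-skewed population, as is typical of audit book values.
population = [x ** 2 for x in range(1, 201)]
cuts = dalenius_hodges_boundaries(population, n_strata=4, n_bins=40)
```

Because the population is skewed toward small values, the cuts cluster at the low end, giving narrow strata where most units lie and wide strata for the few large ones.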
7. Run to the Hills: A Stratified Sampling Approach to Site Clustering on High Grounds in Ancient Samnium (Molise, Italy)
- Author
- Waagen, Jitte and Stek, Tesse D.
- Abstract
The Tappino Area Archaeological Project, initiated in 2013, investigates ancient settlement dynamics in Molise, Italy. Ancient literary sources suggest a unique, non-urban settlement system for this mountainous region. Preliminary surveys for the Hellenistic period indicate a concentration of rural settlements on hill plateaus, suggesting a societal organization based on rural clusters. However, we must rigorously assess whether this pattern reflects genuine settlement organization or a research bias, a frequent challenge in landscape archaeology. This paper applies standard survey techniques with a proportionate stratified sampling approach, incorporating less-studied landscape areas. Additionally, we examine how factors like archaeological visibility, land use, and lithology may influence our understanding of settlement patterns and demography, using a statistically adequate framework to strengthen the analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
8. Not (Officially) in My Backyard: Characterizing Informal Accessory Dwelling Units and Informing Housing Policy With Remote Sensing.
- Author
- Jo, Nathanael, Vallebueno, Andrea, Ouyang, Derek, and Ho, Daniel E.
- Subjects
- *HOUSING, *ACCESSORY apartments, *COMPUTER vision, *HOUSING policy, *REMOTE sensing
- Abstract
Problem, research strategy, and findings: One promising policy approach to addressing housing needs is liberalizing accessory dwelling unit (ADU) development. Yet, understanding the impact of such policy efforts is fundamentally constrained by the inability to quantify and characterize unpermitted ADUs, which may expose homeowners and tenants to legal, financial, and safety risks and confound policy evaluations. We addressed this gap by leveraging computer vision and human annotations to estimate the population of detached ADU constructions in San José (CA). Our contributions are threefold: 1) We estimated the proportion of unpermitted ADU constructions from 2016 to 2020; 2) we describe the demographic, housing market, and parcel characteristics associated with these informal ADUs; and 3) we provide a data set of labeled small buildings, excluding unpermitted detections, for further research. We found that informal ADU construction was substantial—approximately three to four informal units for every formal unit—and more likely in more diverse, dense, and overcrowded neighborhoods. Though our study was limited to analyzing detached ADUs during one time period, we set the stage for further investigations of informal housing across different typologies and over time. Takeaway for practice: Our approach demonstrates the promise of computer vision and human annotations to enable more robust, comprehensive, and reliable understanding of actual—not just permitted—housing units. We urge planners and other policymakers to consider the growth patterns of unpermitted ADUs to more optimally and equitably address housing needs. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
9. A stratified modified probability proportional to size sampling technique.
- Author
- Goyal, Anupama, Arora, Sangeeta, and Goyal, Anju
- Subjects
- *ZIPF'S law, *SAMPLING (Process), *PROBABILITY theory, *SAMPLE size (Statistics)
- Abstract
In Probability Proportional to Size (PPS) sampling, size variable associated with the sampling unit leads to the probability selection of units proportional to its size. In the literature, modification to PPS has been proposed by dividing the population in two groups on arranging the units in decreasing order of their frequency. We introduce Stratified Modified Probability Proportional to Size with Replacement (Stratified MPPSWR) sampling and Stratified Modified Rao-Hartley-Cochran (Stratified MRHC) design in case of without replacement sampling where the data are distributed in homogeneous strata such that it follows Zipf's law within each stratum and the units are selected using MPPSWR and MRHC design respectively. In proposed schemes, each stratum is divided into two groups and first group is selected into the sample with probability 1 and units from second group are selected using PPSWR and RHC sampling respectively as in MPPSWR and MRHC sampling design. The relative efficiency of the proposed schemes is carried out and an illustration is given using the real data set. The proposed sampling schemes are seen to be more efficient than existing techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Probabilistic weighted star discrepancy bound based on the general equal measure partition.
- Author
- Xu, Xiaoda and Xian, Jun
- Subjects
- *POINT set theory, *PLAINS
- Abstract
In this paper, we consider the estimation of the weighted star discrepancy. A better weighted probabilistic star discrepancy bound than the use of plain Monte Carlo point sets is provided in terms of convergence order, i.e. the convergence order of the weighted probabilistic bound is improved from $O(N^{-\frac{1}{2}})$ to $O(N^{-\frac{1}{2}-\frac{1}{2d}} \cdot \ln^{1/2} N)$. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Short-Term Stratified Temporal Noise Monitoring Strategies for Estimation of Day Equivalent Levels in Metropolitan City Delhi, India.
- Author
- Kumar, Saurabh, Garg, Naveen, and Gautam, Chitra
- Abstract
The present work analyses the accuracy of short-term noise monitoring strategies for estimating day equivalent sound levels (Lday or LAeq,6 am–10 pm) in Delhi city, India. The ambient noise monitoring data collected from 26 real-time noise monitoring stations, installed by Delhi Pollution Control Committee at distinct locations in Delhi city was utilized for the study. It was observed that average of randomly selected 1-h equivalent noise levels monitored in three sampling intervals in a day—06 am to 12 pm; 12 pm to 4 pm; and 4 pm to 10 pm, estimated Lday with an accuracy of ± 3.5 dBA with 95% probability. Additionally, a stratified temporal sampling strategy for monitoring environment noise levels to estimate the day equivalent level for different zones was proposed with three stratified temporal intervals—9 am to 11 am; 2 pm to 3 pm; and 5 pm to 8 pm for commercial and industrial zones, and 9 am to 11 am; 12 pm to 2 pm; and 5 pm to 8 pm for residential and silence zones. It was observed that the day equivalent noise level can be estimated with an accuracy of ± 2.0 dBA for all the four zones with 95% probability if the noise levels are monitored for 1 h in each of three stratified temporal intervals. The present study may be helpful in conducting noise monitoring and mapping for a larger part of metropolitan cities of India, as continuous long-term (16-h) noise monitoring can be a quite cumbersome, time-consuming, and expensive process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. PARAMETER ESTIMATION IN CASE OF INCOMPLETE FRAMES USING DOUBLE SAMPLING.
- Author
- Yadav, Shilpa and Nagar, Pankaj
- Subjects
- COST functions, PARAMETER estimation, SAMPLE size (Statistics), SAMPLING methods
- Abstract
Sampling frames play an important role in survey research as they lay the foundation on the basis of which representative sample drawn from a target population. However, in practice, sampling frames are often incomplete, meaning that they do not fully capture the complete population of interest. In such cases, as a consequence of the incompleteness in sampling frame, the sample drawn does not provide a good representation of the population and hence, the results infer to vague and misleading conclusions. The present study is an attempt to somehow uplift the efficacy of the estimator of population mean by taking into consideration, the additional information gathered from the units currently excluded in the existing sampling frame along with mean square error of the proposed estimator. Optimum sample sizes are also obtained using suitable cost function under Neyman scheme. To compare the proposed estimator with the traditional ones, a simulation study has been done. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. A stratified estimation for sensitive variable using correlated scrambling variables.
- Author
- Lee, Gi-Sung, Hong, Ki-Hak, and Son, Chang-Kyoon
- Subjects
- *SAMPLING methods
- Abstract
In this article, when the population is composed of several strata, we deal with the problem of stratified estimation for sensitive variables by applying stratified sampling to Murtaza et al.'s model using correlated scrambling variables. When the size of each stratum is accurately known, the sensitive variable is estimated by stratification, and the proportional and optimal allocations are examined as a method of allocating samples to each stratum. Also, in the case of not knowing the size of each stratum, a sensitive variable is estimated by using two-phase sampling, and the method of allocating samples to each stratum is also examined. Also, the efficiency between the proposed stratified model of Murtaza et al.'s and the existing model of Murtaza et al.'s is compared. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
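Results 12 and 13 refer to proportional, optimal, and Neyman allocation of the sample across strata without stating the formulas. For orientation, the standard textbook forms for direct (unscrambled) responses are given below. The notation is an assumption introduced here and does not come from either abstract: $H$ strata, stratum sizes $N_h$ with $N = \sum_h N_h$, within-stratum standard deviations $S_h$, and total sample size $n$. The allocations derived in the cited papers (under a cost function, or for scrambled sensitive responses) will generally differ in detail.

```latex
n_h^{\text{prop}} = n \, \frac{N_h}{N},
\qquad
n_h^{\text{Neyman}} = n \, \frac{N_h S_h}{\sum_{k=1}^{H} N_k S_k},
\qquad h = 1, \dots, H .
```

Proportional allocation needs only the stratum sizes, while Neyman allocation shifts sample toward strata that are both large and highly variable; the two coincide when all $S_h$ are equal.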
14. Efficient population mean estimation via stratified sampling with dual auxiliary information: A real estate perspective
- Author
- G.R.V. Triveni and Faizan Danish
- Subjects
- Two-fold Auxiliary Information, Study Variable, Mean Estimation, Real Data, Stratified Sampling, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
Auxiliary information is an essential component in the field of survey sampling since it enables precise estimation of population parameters like mean, variance, distribution function, and so on, which in turn guarantees the best possible outcomes. In order to estimate the population mean of a study variable, this study makes use of auxiliary information in a two-fold approach. Through a stratified random sampling scheme, we introduce a novel class of estimators that utilize auxiliary information and their corresponding ranks. By conducting a thorough evaluation based on metrics such as mean square error and percentage relative efficiency, these proposed estimators have been shown to be effective in the estimation process. Empirical validation is conducted using a real dataset sourced from the domain of real estate. Exploring the relationship between Assessed Value (X) and Sale Amount (Y) during a five-year period extending from 2017 to 2021 is the primary emphasis of the empirical validation process, which is carried out with the assistance of a real dataset of real estate data. Furthermore, in order to demonstrate that our suggested estimator is superior to conventional unbiased estimators, as well as traditional regression estimators and other estimators that have been considered in the literature, a full simulation analysis is carried out. Our proposed estimator appears to be the most effective choice after being subjected to a comparison study against a variety of preexisting approaches. The findings of this study not only make a significant contribution to the development of the methodology of survey sampling but also offer vital insights for predictive modeling within the real estate sector.
- Published
- 2024
- Full Text
- View/download PDF
15. Enhancing Landsat image based aboveground biomass estimation of black locust with scale bias-corrected LiDAR AGB map and stratified sampling
- Author
- Shuhong Qin, Hong Wang, Xiuneng Li, Jay Gao, Jiaxin Jin, Yongtao Li, Jinbo Lu, Pengyu Meng, Jing Sun, Zhenglin Song, Petar Donev, and Zhangfeng Ma
- Subjects
- Aboveground Biomass (AGB), LiDAR, stratified sampling, upscaling, model uncertainty, Mathematical geography. Cartography, GA1-1776, Geodesy, QB275-343
- Abstract
There is a growing interest in leveraging LiDAR-generated forest Aboveground Biomass (LG-AGB) data as a reference to retrieve AGB from satellite observations. However, the biases arising from the upscaling process and the impact of the sampling strategy on model accuracy still need to be resolved. In this study, we first corrected the bias arising from upscaling the LG-AGB map to match the spatial resolution of Landsat observations. Subsequently, the stratified random sampling method was used to select training samples from the corrected LG-AGB map (cLG-AGB) for the Random Forest (RF) regression model. The RF model features were extracted from the Landsat observations and auxiliary data. The impact of strata numbers on model accuracy was explored during the sampling process. Finally, independent validation was conducted using in situ measurements. The results indicated that: (1) about 68% of the biases can be corrected in the up-scale transformation; (2) compared to no stratification, a three-strata model achieved a 6.5% improvement in AGB estimation accuracy while requiring a 37.8% reduction in sample size; (3) the black locust forest had a low saturation point at 60.52 ± 4.46 Mg/ha AGB and 72.4% AGB values were underestimated and the remaining were overestimated. In summary, our study provides a framework to harmonize near-surface LiDAR and satellite data for AGB estimation in plantation forest ecosystems with small patch sizes and fragmented distribution.
- Published
- 2024
- Full Text
- View/download PDF
16. Strata Design for Variance Reduction in Stochastic Simulation.
- Author
- Park, Jaeshin, Byon, Eunshin, Ko, Young Myoung, and Shashaani, Sara
- Subjects
- *MONTE Carlo method, *DECISION trees, *WIND turbines, *BUDGET, *SCALABILITY
- Abstract
Stratified sampling is one of the powerful variance reduction methods for analyzing system performance, such as reliability, with stochastic simulation. It divides the input space into disjoint subsets, called strata, to draw samples from each stratum. Partitioning the input space properly and allocating greater computational effort to crucial strata can help accurately estimate system performance with a limited computational budget. How to create strata, however, has yet to be thoroughly examined. Strata design faces the curse of dimensionality and data scarcity as the input dimension increases. We analytically derive the optimal stratification structure that minimizes the estimation variance for univariate problems. Further, reconciling the optimal stratification into decision trees, we devise a robust algorithm for multi-dimensional problems. Numerical experiments and a wind turbine case study demonstrate the superiority of the proposed method in terms of variance reduction, leading to computational efficiency and scalability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
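The abstract of result 16 describes stratified sampling as partitioning the input space into disjoint strata and sampling within each one. A minimal sketch of that idea, under assumptions chosen purely for illustration, is shown below: a single input uniform on [0, 1), equal-width strata, and the same number of draws per stratum. The function names and the test integrand are hypothetical, and this is not the optimal or decision-tree-based strata design that the paper proposes.

```python
import random
import statistics


def plain_mc(f, n, rng):
    """Plain Monte Carlo estimate of E[f(U)], U ~ Uniform[0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n


def stratified_mc(f, n_strata, n_per_stratum, rng):
    """Stratified estimate of E[f(U)] using equal-width strata on [0, 1)."""
    estimate = 0.0
    for k in range(n_strata):
        left = k / n_strata
        # Draw uniformly inside stratum k = [k/n_strata, (k+1)/n_strata).
        draws = [f(left + rng.random() / n_strata) for _ in range(n_per_stratum)]
        # Each stratum has probability 1/n_strata, which is its weight.
        estimate += (sum(draws) / n_per_stratum) / n_strata
    return estimate


def square(x):
    return x * x  # E[U^2] = 1/3


rng = random.Random(0)
# Same budget of 100 evaluations per estimate, repeated 200 times each.
plain_runs = [plain_mc(square, 100, rng) for _ in range(200)]
strat_runs = [stratified_mc(square, 10, 10, rng) for _ in range(200)]
plain_var = statistics.pvariance(plain_runs)
strat_var = statistics.pvariance(strat_runs)
```

With the same evaluation budget, the spread of the stratified estimates is far smaller than that of the plain estimates, because only the within-stratum variation of the integrand contributes to the estimator's variance.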
17. Efficient population mean estimation via stratified sampling with dual auxiliary information: A real estate perspective.
- Author
- Triveni, G.R.V. and Danish, Faizan
- Subjects
- DISTRIBUTION (Probability theory), REAL estate business, PARAMETERS (Statistics), REAL property, EMPIRICAL research, STATISTICAL sampling
- Abstract
Auxiliary information is an essential component in the field of survey sampling since it enables precise estimation of population parameters like mean, variance, distribution function, and so on, which in turn guarantees the best possible outcomes. In order to estimate the population mean of a study variable, this study makes use of auxiliary information in a two-fold approach. Through a stratified random sampling scheme, we introduce a novel class of estimators that utilize auxiliary information and their corresponding ranks. By conducting a thorough evaluation based on metrics such as mean square error and percentage relative efficiency, these proposed estimators have been shown to be effective in the estimation process. Empirical validation is conducted using a real dataset sourced from the domain of real estate. Exploring the relationship between Assessed Value (X) and Sale Amount (Y) during a five-year period extending from 2017 to 2021 is the primary emphasis of the empirical validation process, which is carried out with the assistance of a real dataset of real estate data. Furthermore, in order to demonstrate that our suggested estimator is superior to conventional unbiased estimators, as well as traditional regression estimators and other estimators that have been considered in the literature, a full simulation analysis is carried out. Our proposed estimator appears to be the most effective choice after being subjected to a comparison study against a variety of preexisting approaches. The findings of this study not only make a significant contribution to the development of the methodology of survey sampling but also offer vital insights for predictive modeling within the real estate sector. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Enhancing Landsat image based aboveground biomass estimation of black locust with scale bias-corrected LiDAR AGB map and stratified sampling.
- Author
- Qin, Shuhong, Wang, Hong, Li, Xiuneng, Gao, Jay, Jin, Jiaxin, Li, Yongtao, Lu, Jinbo, Meng, Pengyu, Sun, Jing, Song, Zhenglin, Donev, Petar, and Ma, Zhangfeng
- Subjects
- BLACK locust, FEATURE extraction, FOREST biomass, LANDSAT satellites, TREE farms
- Abstract
There is a growing interest in leveraging LiDAR-generated forest Aboveground Biomass (LG-AGB) data as a reference to retrieve AGB from satellite observations. However, the biases arising from the upscaling process and the impact of the sampling strategy on model accuracy still need to be resolved. In this study, we first corrected the bias arising from upscaling the LG-AGB map to match the spatial resolution of Landsat observations. Subsequently, the stratified random sampling method was used to select training samples from the corrected LG-AGB map (cLG-AGB) for the Random Forest (RF) regression model. The RF model features were extracted from the Landsat observations and auxiliary data. The impact of strata numbers on model accuracy was explored during the sampling process. Finally, independent validation was conducted using in situ measurements. The results indicated that: (1) about 68% of the biases can be corrected in the up-scale transformation; (2) compared to no stratification, a three-strata model achieved a 6.5% improvement in AGB estimation accuracy while requiring a 37.8% reduction in sample size; (3) the black locust forest had a low saturation point at 60.52 ± 4.46 Mg/ha AGB and 72.4% AGB values were underestimated and the remaining were overestimated. In summary, our study provides a framework to harmonize near-surface LiDAR and satellite data for AGB estimation in plantation forest ecosystems with small patch sizes and fragmented distribution. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. PNNGS, a multi-convolutional parallel neural network for genomic selection.
- Author
- Zhengchao Xie, Lin Weng, Jingjing He, Xianzhong Feng, Xiaogang Xu, Yinxing Ma, Panpan Bai, and Qihui Kong
- Subjects
- ARTIFICIAL neural networks, SELECTION (Plant breeding), PRINCIPAL components analysis, RANDOM forest algorithms, DEEP learning
- Abstract
Genomic selection (GS) can accomplish breeding faster than phenotypic selection. Improving prediction accuracy is the key to promoting GS. To improve the GS prediction accuracy and stability, we introduce parallel convolution to deep learning for GS and call it a parallel neural network for genomic selection (PNNGS). In PNNGS, information passes through convolutions of different kernel sizes in parallel. The convolutions in each branch are connected with residuals. Four different Lp loss functions train PNNGS. Through experiments, the optimal number of parallel paths for rice, sunflower, wheat, and maize is found to be 4, 6, 4, and 3, respectively. Phenotype prediction is performed on 24 cases through ridge-regression best linear unbiased prediction (RRBLUP), random forests (RF), support vector regression (SVR), deep neural network genomic prediction (DNNGP), and PNNGS. Serial DNNGP and parallel PNNGS outperform the other three algorithms. On average, PNNGS prediction accuracy is 0.031 larger than DNNGP prediction accuracy, indicating that parallelism can improve the GS model. Plants are divided into clusters through principal component analysis (PCA) and K-means clustering algorithms. The sample sizes of different clusters vary greatly, indicating that this is unbalanced data. Through stratified sampling, the prediction stability and accuracy of PNNGS are improved. When the training samples are reduced in small clusters, the prediction accuracy of PNNGS decreases significantly. Increasing the sample size of small clusters is critical to improving the prediction accuracy of GS. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Assessing COVID-19 Prevalence in Austria with Infection Surveys and Case Count Data as Auxiliary Information.
- Author
- Guerrier, Stéphane, Kuzmics, Christoph, and Victoria-Feser, Maria-Pia
- Subjects
- *COVID-19 pandemic, *MAXIMUM likelihood statistics, *COMMUNICABLE diseases, *MEASUREMENT errors, *MOMENTS method (Statistics)
- Abstract
Countries officially record the number of COVID-19 cases based on medical tests of a subset of the population. These case count data obviously suffer from participation bias, and for prevalence estimation, these data are typically discarded in favor of infection surveys, or possibly also completed with auxiliary information. One exception is the series of infection surveys recorded by the Statistics Austria Federal Institute to study the prevalence of COVID-19 in Austria in April, May, and November 2020. In these infection surveys, participants were additionally asked if they were simultaneously recorded as COVID-19 positive in the case count data. In this article, we analyze the benefits of properly combining the outcomes from the infection survey with the case count data, to analyze the prevalence of COVID-19 in Austria in 2020, from which the case ascertainment rate can be deduced. The results show that our approach leads to a significant efficiency gain. Indeed, considerably smaller infection survey samples suffice to obtain the same level of estimation accuracy. Our estimation method can also handle measurement errors due to the sensitivity and specificity of medical testing devices and to the nonrandom sample weighting scheme of the infection survey. The proposed estimators and associated confidence intervals are implemented in the companion open source R package pempi available on the Comprehensive R Archive Network (CRAN). Supplementary materials for this article are available online including a standardized description of the materials available for reproducing the work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Estimating the prevalence of osteoporosis using ranked-based methodologies and Manitoba's population-based BMD registry.
- Author
- Omidvar, Sedigheh, Jafari Jozani, Mohammad, Nematollahi, Nader, and Leslie, William D.
- Subjects
- *METABOLIC bone disorders, *BONE density, *OSTEOPOROSIS in women, *EXPECTATION-maximization algorithms, *OSTEOPOROSIS
- Abstract
Osteoporosis is a metabolic bone disorder that is characterized by reduced bone mineral density (BMD) and deterioration of bone microarchitecture. Osteoporosis is highly prevalent among women over 50, leading to skeletal fragility and risk of fracture. Early diagnosis and treatment of those at high risk for fracture is very important in order to avoid morbidity, mortality and economic burden from preventable fractures. The province of Manitoba established a BMD testing program in 1997. The Manitoba BMD registry is now the largest population-based BMD registry in the world, and has detailed information on fracture outcomes and other covariates for over 160,000 BMD assessments. In this paper, we develop a number of methodologies based on ranked-set type sampling designs to estimate the prevalence of osteoporosis among women of age 50 and older in the province of Manitoba. We use a parametric approach based on finite mixture models, as well as the usual approaches using simple random and stratified sampling designs. Results are obtained under perfect and imperfect ranking scenarios while the sampling and ranking costs are incorporated into the study. We observe that rank-based methodologies can be used as cost-efficient methods to monitor the prevalence of osteoporosis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Confidence bands for survival curves from outcome‐dependent stratified samples.
- Author
- Saegusa, Takumi and Nandori, Peter
- Subjects
- *GAUSSIAN processes, *SURVIVAL analysis (Biometry), *NEPHROBLASTOMA, *QUANTILES, *CONFIDENCE
- Abstract
We consider the construction of confidence bands for survival curves under the outcome‐dependent stratified sampling. A main challenge of this design is that data are a biased dependent sample due to stratification and sampling without replacement. Most literature on regression approximates this design by Bernoulli sampling but variance is generally overestimated. Even with this approximation, the limiting distribution of the inverse probability weighted Kaplan–Meier estimator involves a general Gaussian process, and hence quantiles of its supremum is not analytically available. In this paper, we provide a rigorous asymptotic theory for the weighted Kaplan–Meier estimator accounting for dependence in the sample. We propose the novel hybrid method to both simulate and bootstrap parts of the limiting process to compute confidence bands with asymptotically correct coverage probability. Simulation study indicates that the proposed bands are appropriate for practical use. A Wilms tumor example is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Quality Enhancement of MIROS Wave Radar Data at Ieodo Ocean Research Station Using ANN
- Author
- Donghyun Park, Kideok Do, Miyoung Yun, and Jin-Yong Jeong
- Subjects
- ieodo ocean research station, wave and current radar, quality control, artificial neural network, stratified sampling, Ocean engineering, TC1501-1800
- Abstract
Remote sensing wave observation data are crucial when analyzing ocean waves, the main external force of coastal disasters. Nevertheless, it has limitations in accuracy when used in low-wind environments. Therefore, this study collected the raw data from MIROS Wave and Current Radar (MWR) and wave radar at the Ieodo Ocean Research Station (IORS) and applied the optimal filter by combining filters provided by MIROS software. The data were validated by a comparison with South Jeju ocean buoy data. The results showed it maintained accuracy for significant wave height, but errors were observed in significant wave periods and extreme waves. Hence, this study used an artificial neural network (ANN) to improve these errors. The ANN was generalized by separating the data into training and test datasets through stratified sampling, and the optimal model structure was derived by adjusting the hyperparameters. The application of ANN effectively improved the accuracy in significant wave periods and high wave conditions. Consequently, this study reproduced past wave data by enhancing the reliability of the MWR, contributing to understanding wave generation and propagation in storm conditions, and improving the accuracy of wave prediction. On the other hand, errors persisted under high wave conditions because of wave shadow effects, necessitating more data collection and future research.
- Published
- 2024
- Full Text
- View/download PDF
24. Several problems in discrepancy theory : lower bounds and stratified sampling
- Author
- Kirk, Nathan, Barnes, David, Lin, Ying-Fen, and Pausinger, Florian
- Subjects
- Discrepancy, stratified sampling, jittered sampling, hypercube, geometry, sequences, sampling, star discrepancy
- Abstract
The aim of this PhD thesis is to study several problems in the theory of uniform distribution. Specifically in the subfield of discrepancy theory, which is often referred to in the literature by the theory of irregularities of distribution. We study several problems involving various measures of irregularity of distribution and the subsequent discrepancy values associated with these measures. While discrepancy theory can be discussed in many settings, we contain our study to the classical setting of point sets contained inside the d−dimensional unit hypercube. As a first contribution, the 1986 results and proofs of Petko Proinov on lower bounds of a particular discrepancy measure named the diaphony are contained in Chapter 1. These methods are written in a self-contained and complete manner for the first time in English. In addition, we discuss the progress since 1986 and provide updated state-of-the-art constants associated with the diaphony. Chapters 2 to 4 change our focus and are used to extend the recent study of stratified sampling by Markus Kiderlen and Florian Pausinger. On the way, we solve several open problems relating to partitions of the d−dimensional unit cube. In Chapter 2, we derive several closed formulae which give the expected discrepancy values for a specific formulation of stratified sampling in the d−dimensional unit cube called jittered sampling. In particular, we study the expected L2−discrepancy and the expected Hickernell L2−discrepancy of arbitrary jittered sampling in the hypercube. Chapter 3 is dedicated to the investigation of the expected discrepancy of the stratified point sets obtained from a more general family of partitions. These N−set partitions are constructed via placing N-1 hyperplanes along the main diagonal of the cube in such a way as to create equal volume strata. 
We provide a recommendation for the construction of this partition of the hypercube after the difficulty of construction in dimension greater than 2 was pointed out by Kiderlen and Pausinger and thereafter, utilise this construction to compute numerical values of the expected discrepancy of the resulting stratified point sets. Finally, Chapter 4 explores the validity and dependability of using a novel method involving the theory of majorisations to compare the expected discrepancy of stratified point sets obtained from a given pair of partitions. The primary advantage of this method is the fact that one would not be required to explicitly compute the expected discrepancy values to investigate which stratified point set has more regular distribution. We discuss the successes and failures of this method while stating several open problems for this new direction of research.
- Published
- 2023
25. Inferring Demand in Drinking Water Distribution Systems through Stratified Sampling of Billing Data for Smart Meter Installation.
- Author
-
Almeida Silva, Maria, Amado, Conceição, and Loureiro, Dália
- Subjects
- *
SMART meters , *WATER distribution , *MUNICIPAL water supply , *WATER utilities , *SUSTAINABILITY - Abstract
The importance of urban water supply systems and public services is globally recognized. Nonrevenue water directly affects a water utility's economic, financial, and environmental sustainability. In Portugal, the mean of the nonrevenue water for the distribution systems corresponded to 28.8% in 2019. Smart metering technology is crucial for consumption monitoring and enhancing apparent and real loss network management (e.g., water meters' global error evaluation, detection of illegal uses, and real loss estimation through the minimum night flow analysis). However, this technology is expensive in acquisition, installation, operation, and maintenance. This study aims to support water utilities in inferring the total consumption using a representative sample of customers with smart meters instead of smart metering data from all customers. A stratified sampling was considered using only the customers' billing time series for the strata definition. A predominantly domestic zone was used, and eight strata were obtained with a clustering analysis [temporal correlation (CORT) dissimilarity and Ward method]. Stratified sampling was applied to minimize the variance of the total water consumption estimator. A representative sample of 259 dimensions (53%) was chosen to infer, with small errors, essential consumption statistics for water utilities: total consumption (with an error of 0.12%), total consumption time series, water consumption patterns, minimum night consumption, and volume distribution by the flow rate. The successful outcomes obtained were crucial in supporting the proposed methodology. This study has provided evidence that installing smart meters for all consumers in a distribution network area is not necessary to acquire accurate and meaningful consumption information crucial for effective network management and water loss control. 
Moreover, using only billing data to perform the sample selection of consumers is useful for water utilities, because they may face difficulties obtaining extra consumer information. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
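The abstract above infers total consumption from a stratified sample of customers. As a minimal sketch of the textbook stratified estimator of a population total (not the authors' implementation), the Python below assumes simple random sampling without replacement inside each stratum; the function name `stratified_total` and the `(N_h, sample)` input layout are hypothetical choices made for illustration.

```python
from statistics import mean, variance


def stratified_total(strata):
    """Estimate a population total from a stratified sample.

    strata: list of (N_h, sample) pairs, where N_h is the stratum size and
    sample is a list of observed values (at least two per stratum, so that
    the sample variance is defined).
    Returns (estimated total, estimated variance of that estimate).
    """
    total = 0.0
    var = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        # Expand the stratum sample mean to the stratum total.
        total += N_h * mean(sample)
        # Variance term with the finite population correction (1 - n_h / N_h).
        var += N_h ** 2 * (1 - n_h / N_h) * variance(sample) / n_h
    return total, var
```

For example, `stratified_total([(10, [1.0, 3.0]), (20, [5.0, 5.0, 5.0])])` expands the two stratum means (2 and 5) by the stratum sizes and adds them.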
26. Tri-Phase Implementation of an Innovative Fuzzy Logic Approach for Decision-Making.
- Author
-
Tarray, Tanveer Ahmad, Khaki, Zahid Gulzar, Ganie, Zahoor Ahmad, Sultan, Adil, Danish, Faizan, and Albalawi, Olayan
- Subjects
- *
FUZZY logic , *RANDOMIZED response , *COST control , *DATA quality , *STATISTICAL sampling - Abstract
This paper proposes a novel approach to decision-making based on a three-phase application of a new fuzzy logic model that embraces the principles of symmetry by balancing competing objectives in data collection and analysis. Our study, which employs a three-stage stratified random sample strategy with a randomized response technique, addresses the critical challenges of cost management and volatility reduction. Using the alpha-cut method, our model creates an effective allocation strategy that finds a balance between cost constraints and variance reduction objectives. We use numerical examples from real-world scenarios to demonstrate our approach's durability and practicality. Our revolutionary technique maintains data quality and cost-effectiveness while offering a game-changing answer to sensitive information acquisition concerns. By combining randomized response techniques and fuzzy logic, this study establishes a new standard for decision-making models that prioritizes both data-gathering precision and privacy preservation, encapsulating the essential principle of symmetry in balancing competing aims. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Experimental study of population density using an optimized random forest model.
- Author
-
Li, Lingling, Liu, Jinsong, Li, Zhi, Wen, Peizhang, Li, Yancheng, and Liu, Yi
- Abstract
Random forest model is the mainstream research method used to accurately describe the distribution law and impact mechanism of regional population. We took Shijiazhuang as the research area, with comprehensive zoning based on endowments as the modeling unit, conducted stratified sampling on a hectare grid cell, and systematically carried out incremental selection experiments of population density impact factors, optimizing the population density random forest model throughout the process (zonal modeling, stratified sampling, factor selection, weighted output). The results are as follows: (1) Zonal modeling addresses the issue of confusion in population distribution laws caused by a single model. Sampling on a grid cell not only ensures the quality of training data by avoiding the modifiable areal unit problem (MAUP) but also attempts to mitigate the adverse effects of the ecological fallacy. Stratified sampling ensures the stability of population density label values (target variable) in the training sample. (2) Zonal selection experiments on population density impact factors help identify suitable combinations of factors, leading to a significant improvement in the goodness of fit (R²) of the zonal models. (3) Weighted combination output of the population density prediction dataset substantially enhances the model's robustness. (4) The population density dataset exhibits multi-scale superposition characteristics. On a large scale, the population density in plains is higher than that in mountainous areas, while on a small scale, urban areas have higher density compared to rural areas. The optimization scheme for the population density random forest model that we propose offers a unified technical framework for uncovering local population distribution law and the impact mechanisms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. A Comparison between Regression and Ratio Estimators using Auxiliary Information: A Case Study of Ladoke Akintola University of Technology, Nigeria.
- Author
-
Oladimeji, Lukman Abiola, Oke, Samuel Abayomi, Akinade, Oludayo Olugbenga, Ismail, Ibrahim Olalekan, and Ibidoja, Olayemi Joshua
- Subjects
STATISTICAL sampling ,PROBABILITY theory ,ARITHMETIC mean ,REGRESSION analysis ,ANALYSIS of variance - Abstract
This study investigated the use of separate and combined stratified random sampling to estimate population mean scores for two important courses in Statistics using their pre-requisites, by comparing two estimators (ratio and regression). For Probability Distribution, the separate ratio estimator emerged as the optimal choice, providing a mean score estimate of 31.66 with a variance of 6612.18. This indicates that using the pre-requisites course score as auxiliary information improved the estimation accuracy compared to the combined ratio estimator. In contrast, for Statistical Inference, the combined ratio estimator proved to be more effective, yielding a mean score estimate of 33.99 with a variance of 84.54. The separate regression estimator was also evaluated, demonstrating its suitability for Probability Distribution with a mean score estimate of 32.18 and a variance of 7.22. However, for Statistical Inference, the combined regression estimator offered a more precise estimate (32.99) with a lower variance (17.59). The findings highlight the effectiveness of auxiliary information when selecting estimation methods in stratified sampling. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
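The distinction between separate and combined ratio estimators in the abstract above can be illustrated with the standard textbook formulas. The following is a sketch only, not the study's code: the dictionary keys `W` (stratum weight N_h/N), `y` (sampled study variable), `x` (sampled auxiliary variable) and `X_bar` (known stratum mean of the auxiliary variable) are hypothetical names chosen for this example.

```python
from statistics import mean


def separate_ratio_mean(strata):
    """Separate ratio estimator: apply a ratio adjustment within each stratum,
    then combine the adjusted stratum means with the stratum weights."""
    return sum(
        s["W"] * mean(s["y"]) / mean(s["x"]) * s["X_bar"] for s in strata
    )


def combined_ratio_mean(strata):
    """Combined ratio estimator: form the stratified means of y and x first,
    then apply a single ratio adjustment using the overall mean of x."""
    y_st = sum(s["W"] * mean(s["y"]) for s in strata)
    x_st = sum(s["W"] * mean(s["x"]) for s in strata)
    X_bar = sum(s["W"] * s["X_bar"] for s in strata)
    return y_st / x_st * X_bar
```

The separate form uses one ratio per stratum, so it tends to suit cases where the ratio of y to x differs between strata and the stratum samples are not too small; the combined form uses one pooled ratio.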
29. THE EFFICIENT CLASSES OF ESTIMATORS FOR THE PRODUCT OF TWO POPULATION MEANS IN THE EXISTENCE OF NON-RESPONSE UNDER THE STRATIFIED POPULATION-A SIMULATION STUDY.
- Author
-
MISHRA, MANISH, KHARE, B. B., and SINGH, SACHIN
- Subjects
- *
STATISTICAL sampling , *MEAN square algorithms , *SIMULATION methods & models - Abstract
This paper focuses on estimating the product of two population means. Within this paper, we have introduced three distinct classes of estimators for product of two population means. These estimators take into account the known population mean of an auxiliary variable under the framework of stratified random sampling and the presence of non-response in the study variable. Basically, for case (I) we assume the non-response on the study variable and utilize the auxiliary information corresponding to the responding units of the study variable and in case (II), we utilize the complete dataset from the auxiliary variable while also accounting for non-response in the study variable. In case (III) we combined both the information of the auxiliary variable and assumed the non-response on the study variable. Expressions for bias and mean square error have been derived, extending up to the first-order derivative. We have also pinpointed some specific members of the proposed estimator. We have conducted a simulation study to evaluate the valuable insights into the performance of the suggested classes of estimators with the conventional estimator. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. Partitions for stratified sampling.
- Author
-
Clément, François, Kirk, Nathan, and Pausinger, Florian
- Subjects
- *
POINT set theory , *MATHEMATICAL optimization , *HYPERPLANES , *CUBES , *INTEGERS - Abstract
Classical jittered sampling partitions [0, 1]^d into m^d cubes for a positive integer m and randomly places a point inside each of them, providing a point set of size N = m^d with small discrepancy. The aim of this note is to provide a construction of partitions that works for arbitrary N and improves straight-forward constructions. We show how to construct equivolume partitions of the d-dimensional unit cube with hyperplanes that are orthogonal to the main diagonal of the cube. We investigate the discrepancy of such point sets and optimise the expected discrepancy numerically by relaxing the equivolume constraint using different black-box optimisation techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
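Classical jittered sampling, as described in the abstract above, can be sketched in a few lines of Python. This illustrates only the classical axis-aligned grid construction, not the hyperplane partitions the note proposes; the function name `jittered_sample` and its parameters are assumptions made for this example.

```python
import itertools
import random


def jittered_sample(m, d, rng=None):
    """Return m**d points in the unit cube, one uniform point per grid cell.

    The cube is split into m equal intervals along each of the d axes, and a
    point is drawn uniformly at random inside every resulting sub-cube.
    """
    rng = rng or random.Random()
    points = []
    # Each cell is identified by its integer index along every axis.
    for cell in itertools.product(range(m), repeat=d):
        # Shift a uniform draw into the cell and rescale to the unit cube.
        points.append(tuple((k + rng.random()) / m for k in cell))
    return points
```

Calling `jittered_sample(3, 2)` yields nine points in the unit square, one in each cell of a 3 by 3 grid, which is why this scheme only covers point counts of the form N = m^d.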
31. Randomized Quasi-Monte Carlo Methods on Triangles: Extensible Lattices and Sequences.
- Author
-
Dong, Gracia Yunruo, Hintz, Erik, Hofert, Marius, and Lemieux, Christiane
- Abstract
Two constructions were recently proposed for constructing low-discrepancy point sets on triangles. One is based on a finite lattice, the other is a triangular van der Corput sequence. We give a continuation and improvement of these methods. We first provide an extensible lattice construction for points in the triangle that can be randomized using a simple shift. We then examine the one-dimensional projections of the deterministic triangular van der Corput sequence and quantify their sub-optimality compared to the lattice construction. Rather than using scrambling to address this issue, we show how to use the triangular van der Corput sequence to construct a stratified sampling scheme. We show how stratified sampling can be used as a more efficient implementation of nested scrambling, and that nested scrambling is a way to implement an extensible stratified sampling estimator. We also provide a test suite of functions and a numerical study for comparing the different constructions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Calibration estimation of population mean in stratified sampling using standard deviation.
- Author
-
Babatunde, Oluwagbenga T., Oladugba, Abimibola V., Ude, Ifeoma O., and Adubi, Ayodeji S.
- Subjects
STATISTICAL sampling ,STANDARD deviations ,CALIBRATION - Abstract
In this paper, a new improved calibration estimator for estimating the population mean in the stratified random sampling using standard deviation of the auxiliary variable is proposed. A simulation study was conducted to assess the performance of the estimators in symmetric and skewed populations using absolute relative bias, mean square error and percentage relative efficiency. The results showed that the proposed estimator is more efficient compared to the existing estimators considered in this work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Simple Stratified Sampling for Simulating Multi-dimensional Markov Chains
- Author
-
El Haddad, Rami, Lécot, Christian, L’Ecuyer, Pierre, Hinrichs, Aicke, editor, Kritzer, Peter, editor, and Pillichshammer, Friedrich, editor
- Published
- 2024
- Full Text
- View/download PDF
34. Sequential Estimation Using Hierarchically Stratified Domains with Latin Hypercube Sampling
- Author
-
Krumscheid, Sebastian, Pettersson, Per, Hinrichs, Aicke, editor, Kritzer, Peter, editor, and Pillichshammer, Friedrich, editor
- Published
- 2024
- Full Text
- View/download PDF
35. Surrogate Model Approaches for the Evaluation of Structural Safety of Suspended Ceilings
- Author
-
Mellios, Nikolaos, Jörg, Agnetha, Spyridis, Panagiotis, Strauss, Alfred, di Prisco, Marco, Series Editor, Chen, Sheng-Hong, Series Editor, Vayas, Ioannis, Series Editor, Kumar Shukla, Sanjay, Series Editor, Sharma, Anuj, Series Editor, Kumar, Nagesh, Series Editor, Wang, Chien Ming, Series Editor, Cui, Zhen-Dong, Series Editor, Matos, José C., editor, Lourenço, Paulo B., editor, Oliveira, Daniel V., editor, Branco, Jorge, editor, Proske, Dirk, editor, Silva, Rui A., editor, and Sousa, Hélder S., editor
- Published
- 2024
- Full Text
- View/download PDF
36. Influence of Parental Involvement and Academic Motivation on Mathematical Achievement: The Role of Students’ Mathematics Interest
- Author
-
Bright Asare, Natalie B. Welcome, and Yarhands Dissou Arthur
- Subjects
academic motivation ,mathematical achievement ,mathematics interest ,parent involvement ,stratified sampling ,Education (General) ,L7-991 - Abstract
The study examines the influence of parental involvement and academic motivation on students' mathematics performance, mediated by students' interest in mathematics. The current study adopts a descriptive-correlational research design. The study population comprises all first-year and second-year senior high students in the Central Region of Ghana. A sample of 290 students was randomly selected from four senior high schools in the Central Region of Ghana. The researcher used stratified sampling techniques to categorize the students into the various courses offered in the schools and employed simple random sampling techniques to select respondents from each stratum for the study. A structured questionnaire was used as a research instrument to collect data from the target population. Analysis of Moment Structures (Amos) version 23 and IBM SPSS version 23 were used as analysis tools for data analysis. The analysis results show that parental involvement, academic motivation, and students' interest in mathematics have a significant positive effect on mathematics achievement. Furthermore, students' interest in mathematics partially mediates the link between parental involvement and mathematics achievement. Finally, students' interest in mathematics partially mediates the connection between mathematics motivation and mathematics achievement. The study recommends that parents must be fully involved in their children's education, especially in their mathematics learning, by providing students with the necessary support to improve their mathematics learning and performance.
- Published
- 2024
- Full Text
- View/download PDF
37. Correspondence: Can Menstrual Cycle Length Predict Cardiovascular Risk in Healthy Indian Females?
- Author
-
Avnesh Kumar Singh and Shikha Singh
- Subjects
non probability sampling ,probability sampling ,stratified sampling ,Medicine - Abstract
Dear Editor, We read with great interest the article entitled “Can Menstrual Cycle Length Predict Cardiovascular Risk in Healthy Indian Females? A cross-sectional Study” by Shilpi Vashishta et al., published in your esteemed journal (Journal of Clinical and Diagnostic Research) 2024;18(6):CC22-CC25. We would like to share a few of our thoughts regarding this study, mainly about the sampling and statistical techniques used. Although it was worthwhile research, the sampling technique was not properly chosen. It is impossible to conduct quality research without adequate sampling. There are two main varieties: probability sampling and non probability sampling. Convenient non random sampling was incorrectly referred to as simple random sampling in the article. The sampling technique should be stratified instead of relying on convenient sampling. Stratified sampling is used to separate the population into smaller groups that may differ significantly from one another. Ensuring that each subgroup is fairly represented in the sample allows for more precise conclusions
- Published
- 2025
- Full Text
- View/download PDF
38. Advancing Survey Sampling Efficiency under Stratified Random Sampling and Post-Stratification: Leveraging Symmetry for Enhanced Estimation Accuracy in the Prediction of Exam Scores.
- Author
-
Triveni, Gullinkala Ramya Venkata, Danish, Faizan, and Albalawi, Olayan
- Subjects
- *
STATISTICAL sampling , *DISTRIBUTION (Probability theory) , *SYMMETRY , *CUMULATIVE distribution function , *EDUCATIONAL evaluation , *SAMPLING (Process) - Abstract
This pioneering investigation introduces two innovative estimators crafted to evaluate the finite population distribution function of a study variable, employing auxiliary variables within the framework of stratified random sampling and post-stratification while emphasizing symmetry in the sampling process. The derivation of mathematical expressions for bias and the mean square error up to the first degree of approximation fortifies the credibility of the proposed estimators. Drawing from three distinct datasets, including real-world data capturing student behaviors and exam performances from 500 students, this research highlights the superior efficiency of the proposed estimators compared to existing methods across both sampling schemes. Employing the proposed estimator, we effectively forecast students' exam scores based on their study hours, backed by empirical evidence showcasing its precision in terms of mean square error and percentage relative efficiency. This study not only introduces inventive solutions to enduring challenges in survey sampling but also provides practical insights into enhancing predictive accuracy in educational assessments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Estimation of canopy height based on multi-source remote sensing data using forest structure aided sample selection.
- Author
-
Zhao, Yinpeng, Du, Shouhang, Li, Kangning, Jiang, Jinbao, Guo, Qiyu, and Xiao, Wanshan
- Subjects
- *
AIRBORNE lasers , *SPACE-based radar , *REMOTE sensing , *CARBON sequestration in forests , *FOREST canopies , *FOREST ecology , *FOREST mapping - Abstract
Forest canopy height data are crucial for estimating forest carbon storage and assessing forest ecology. By utilizing satellite imagery, canopy height data obtained from airborne or spaceborne LiDAR have been expanded from footprint and plot levels to spatially continuous elevation mapping of forests. However, current research suggests that estimating forest canopy height without forest type data presents a challenge in how to effectively integrate multi-source LiDAR data and ensure the samples adequately represent various forest types for higher estimation accuracy. Therefore, this study proposes a forest canopy height estimation method that considers forest structure and integrates multi-source LiDAR data to overcome the challenge. First, a stratified sampling method based on forest structure (SSMFS) was proposed to select training samples and enhance their representativeness. Second, we combined GEDI and ATL08 data to create a multi-source spaceborne LiDAR dataset, enhancing geographic coverage and increasing canopy height samples. Third, the spaceborne LiDAR-based canopy height estimation model incorporates previously unconsidered canopy openness features and uses SSMFS to select training samples. Finally, we improved spaceborne LiDAR canopy height accuracy by creating a residual correction model that adjusts for differences between airborne scanner (ALS) and spaceborne LiDAR estimates. This study, conducted in Zhangwu County, achieved an accuracy of R² = 0.71, MAE = 1.20 m, and RMSE = 1.71 m. These results show a 51.06% increase in R², a 26.38% decrease in MAE, and a 24.00% decrease in RMSE compared to recent research. In summary, this study profoundly amplifies predictive accuracy, providing a clear advantage in the delineation of regional forest canopy maps. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Exploring Factors Influencing Market Engagement and Marketing Channel Selection among Smallholder Macadamia Farmers in Embu West Sub County, Kenya.
- Author
-
Nthiga, Kelvin Murimi, Ndirangu, Samuel Njiri, and Isaboke, Hezron Nyarindo
- Subjects
MARKETING channels ,FARMERS ,MACADAMIA ,AGRICULTURE ,AGRICULTURAL extension work ,FARM size ,SMALL farms - Abstract
This study delves into market participation and marketing channel preferences among smallholder Macadamia farmers in Embu West Sub County, Kenya. Employing a stratified multistage sampling procedure and multinomial logit regression analysis, the research reveals that despite offering lower prices, brokers remain the favored marketing channel. Several factors, including age, farming experience, consideration of macadamia quality, information flow, farm size, distance to market, payment period, and education level, significantly influence farmers' channel choice. The findings underscore the importance of policy interventions emphasizing non-price incentives, such as enhanced extension services and dissemination of macadamia marketing information, to encourage participation in more lucrative markets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Estimation of heterogeneous population variance using memory-type estimators based on EWMA statistic in the presence of measurement error for time-scaled surveys.
- Author
-
Qureshi, Muhammad Nouman, Tariq, Muhammad Umair, Alamri, Osama Abdulaziz, and Hanif, Muhammad
- Abstract
In this article, we have suggested memory-type ratio, exponential ratio, product and exponential product estimators based on the exponentially weighted moving average statistic for the estimation of heterogeneous population variance using a stratified sampling design in the presence of measurement error for time-scaled surveys. Mathematical expressions of the approximate mean square error are derived using Taylor and exponential expansions for the proposed memory-type estimators. We have also discussed the situations in which the memory-type estimators would perform more efficiently than the conventional estimators. The results of an extensive simulation study revealed that the proposed memory-type estimators based on the exponentially weighted moving average statistic would perform better than the conventional estimators in the presence of measurement error for time-scaled surveys under certain conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Improved calibration estimation of population mean in stratified sampling using two auxiliary variables.
- Author
-
Oladugba, Abimibola V. and Babatunde, Oluwagbenga T.
- Subjects
MATHEMATICAL variables ,CALIBRATION ,STATISTICAL correlation ,STANDARD deviations ,INFORMATION retrieval - Abstract
In this paper, a new improved calibration estimator for the population mean in a stratified sampling was proposed using two auxiliary variables. A simulation study was carried out to evaluate the performance and efficiency of the proposed estimator with respect to three estimators considered in the literature for estimating the population mean in a stratified sampling using two auxiliary variables. The results showed that the new estimator proved to be more efficient than the three existing estimators considered. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Assessing Antithetic Sampling for Approximating Shapley, Banzhaf, and Owen Values
- Author
-
Jochen Staudacher and Tim Pollmann
- Subjects
cooperative game theory ,permutation sampling ,antithetic sampling ,stratified sampling ,Shapley value ,Banzhaf value ,Mathematics ,QA1-939 - Abstract
Computing Shapley values for large cooperative games is an NP-hard problem. For practical applications, stochastic approximation via permutation sampling is widely used. In the context of machine learning applications of the Shapley value, the concept of antithetic sampling has become popular. The idea is to employ the reverse permutation of a sample in order to reduce variance and accelerate convergence of the algorithm. We study this approach for the Shapley and Banzhaf values, as well as for the Owen value which is a solution concept for games with precoalitions. We combine antithetic samples with established stratified sampling algorithms. Finally, we evaluate the performance of these algorithms on four different types of cooperative games.
- Published
- 2023
- Full Text
- View/download PDF
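The antithetic idea in the abstract above, pairing each sampled permutation with its reverse, can be sketched as follows. This is an illustrative Python estimator for the Shapley value only, under the assumption that the game is supplied as a function on coalitions; it is not the authors' algorithm, and the names `shapley_antithetic`, `value` and `n_pairs` are hypothetical.

```python
import random


def shapley_antithetic(players, value, n_pairs, rng=None):
    """Approximate Shapley values by antithetic permutation sampling.

    players: list of hashable player labels.
    value:   function mapping a frozenset of players to the coalition's worth.
    n_pairs: number of random permutations; each is also used in reverse.
    """
    rng = rng or random.Random()
    totals = {p: 0.0 for p in players}
    for _ in range(n_pairs):
        perm = list(players)
        rng.shuffle(perm)
        # Evaluate the sampled permutation and its reverse (the antithetic pair).
        for order in (perm, perm[::-1]):
            coalition = set()
            prev = value(frozenset(coalition))
            for p in order:
                coalition.add(p)
                cur = value(frozenset(coalition))
                # Marginal contribution of p when joining its predecessors.
                totals[p] += cur - prev
                prev = cur
    # Average over all 2 * n_pairs permutations that were evaluated.
    return {p: t / (2 * n_pairs) for p, t in totals.items()}
```

A player who joins early in one ordering joins late in the reversed ordering, so the two marginal contributions are often negatively correlated, which is the intended source of variance reduction.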
44. Determinants of online professor reviews: an elaboration likelihood model perspective
- Author
-
Li, Yaojie, Wang, Xuan, and Van Slyke, Craig
- Published
- 2023
- Full Text
- View/download PDF
45. A new randomized response technique with application to election polling.
- Author
-
Singh, Garib Nath, Bhattacharyya, Diya, and Bandyopadhyay, Arnab
- Subjects
- *
RANDOMIZED response , *ELECTION forecasting , *RANDOMIZATION (Statistics) , *EMPIRICAL research - Abstract
Surveys involving sensitive characteristics entail applying innovative tools such as Randomized Response Techniques (RRTs) to elicit responses from the sampled units. This manuscript proposes one such new two-stage RRT, involving two simple and common randomization devices, namely, a coin and a die. Empirical study has been conducted to show its efficacy over contemporary estimator in terms of both Percentage Relative Efficiency and Privacy Protection. Application of the proposed RRT to election polling has been discussed efficiently. The appropriateness of such application has been illustrated via detailed simulation study. The results have been tabulated and recommendations made for the use of the proposed estimation strategy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. ANALYSIS OF URBANIZATION AND HOUSING DEPRIVATION IN THE CITY OF YENAGOA, BAYELSA STATE, NIGERIA.
- Author
-
GUNN, E. O., DEINNE, C. E., and BRAVE, J.
- Subjects
HOUSING ,HOUSING development ,LOW-income housing ,SQUATTER settlements ,PLANNED communities ,SANITATION - Abstract
Africa is currently the fastest urbanizing continent in the world. The contemporary urban development in Africa is faced with multi-scale and diverse challenges one of which is housing deprivation. A cross-sectional survey research design involving a stratified sampling technique was utilized in this study. The indicators of housing deprivation were measured based on the headcount approach, the percentage of persons in the population living in dwellings without basic requirements such as the structural aspect of housing, the characteristics of roofs, walls, floor, access to quality water, electricity, sanitation, and habitability. An equal weight of 1 was assigned to each dimension in a nested fashion. The result reveals that (204) 52.7% of the respondents are males, while (183) 47.3% are females. Most of the respondents (292)75.5% are aware of housing deprivation, while (281) 72.6% of respondents stated that there is a relationship between urbanization and housing deprivation in the city of Yenagoa. The result of the Pearson Product Moment Correlation reveals a coefficient of (r = 0.660), which means a strong positive relationship between urbanization and housing deprivation at a 0.01 significance level. This result implies that urbanization because of pull factors is one of the principal determinants of housing deprivation in Yenagoa. Lack of decent housing facilities contributed to the development of squatter settlements and with the ever-increasing rate of urbanization, it becomes imperative for the government to plan for the development of the housing sector and provide affordable low-cost housing estate, especially within Yenagoa city to redress the issue of poor housing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
47. Alzheimer's disease detection using residual neural network with LSTM hybrid deep learning models.
- Author
-
Vidhya, R., Banavath, Dhanalaxmi, Kayalvili, S., Naidu, Swarna Mahesh, Charles Prabu, V., Sugumar, D., Hemalatha, R., Vimal, S., and Vidhya, R.G.
- Subjects
- *
DEEP learning , *ALZHEIMER'S disease , *CLINICAL decision support systems , *FEATURE extraction , *MACHINE learning - Abstract
Early Alzheimer's disease detection is essential for facilitating prompt intervention and enhancing the quality of care provided to patients. This research presents a novel strategy for the diagnosis of Alzheimer's disease that makes use of sophisticated sampling methods in conjunction with a hybrid model of deep learning. We use stratified sampling, ADASYN (Adaptive Synthetic Sampling), and Cluster-Centroids approaches to ensure a balanced representation of Alzheimer's and non-Alzheimer's cases during model training in order to meet the issues posed by imbalanced data distributions in clinical datasets. A strong hybrid architecture is constructed by combining a Residual Neural Network (ResNet) with Long Short-Term Memory (LSTM) units. This architecture makes the most of both the feature extraction capabilities of ResNet and the capacity of LSTM to capture temporal dependencies. The findings demonstrate that the model is superior to traditional approaches to machine learning and single-model architectures in terms of accuracy, sensitivity, and specificity. The hybrid deep learning model demonstrates exceptional capabilities in identifying early indicators of Alzheimer's disease with a high degree of accuracy, which paves the way for early diagnosis and treatment. In addition, an interpretability study is carried out in order to provide light on the decision-making process underlying the model. This helps to contribute to a better understanding of the characteristics and biomarkers that play a role in the identification of Alzheimer's disease. In general, the strategy that was provided provides a promising foundation for accurate and reliable Alzheimer's disease identification. 
It does this by harnessing the capabilities of hybrid deep learning models and sophisticated sampling approaches to improve clinical decision support and, as a result, eventually improve patient outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Another solution for some optimum allocation problem.
- Author
-
Wójciak, Wojciech
- Subjects
STATISTICAL sampling ,COST allocation ,RECURSIVE functions ,COST functions ,ANALYSIS of variance - Abstract
We derive optimality conditions for the optimum sample allocation problem in stratified sampling, formulated as the determination of the fixed strata sample sizes that minimize the total cost of the survey, under the assumed level of variance of the stratified π estimator of the population total (or mean) and one-sided upper bounds imposed on sample sizes in strata. In this context, we presume that the variance function is of some generic form that, in particular, covers the case of the simple random sampling without replacement design in strata. The optimality conditions mentioned above will be derived from the Karush-Kuhn-Tucker conditions. Based on the established optimality conditions, we provide a formal proof of the optimality of the existing procedure, termed here as LRNA, which solves the allocation problem considered. We formulate the LRNA in such a way that it also provides the solution to the classical optimum allocation problem (i.e. minimization of the estimator's variance under a fixed total cost) under one-sided lower bounds imposed on sample sizes in strata. In this context, the LRNA can be considered as a counterpart to the popular recursive Neyman allocation procedure that is used to solve the classical problem of an optimum sample allocation with added one-sided upper bounds. Ready-to-use R-implementation of the LRNA is available through our stratallo package, which is published on the Comprehensive R Archive Network (CRAN) package repository. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
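Illustrative note on entry 48: the abstract contrasts the LRNA with the recursive Neyman allocation procedure for upper-bounded strata sample sizes. The following is a minimal Python sketch of that recursive idea only, under stated assumptions: simple random sampling without replacement in each stratum, equal per-unit costs, strictly positive stratum standard deviations, and a total sample size not exceeding the sum of the upper bounds. The function name and argument names are hypothetical; this is neither the LRNA itself nor the API of the stratallo package.

```python
def recursive_neyman(n, N, S, M):
    """Sketch of Neyman allocation with one-sided upper bounds.

    n: total sample size (assumed <= sum(M))
    N: stratum population sizes
    S: stratum standard deviations (assumed > 0)
    M: upper bounds on the stratum sample sizes
    Returns a list of (possibly fractional) stratum sample sizes.
    """
    a = [Nh * Sh for Nh, Sh in zip(N, S)]  # Neyman weights N_h * S_h
    alloc = [0.0] * len(a)
    active = set(range(len(a)))            # strata not yet fixed at a bound
    remaining = float(n)                   # sample size still to distribute

    while active:
        total = sum(a[h] for h in active)
        # Plain Neyman allocation restricted to the active strata.
        proposal = {h: remaining * a[h] / total for h in active}
        over = [h for h in active if proposal[h] > M[h]]
        if not over:
            for h in active:
                alloc[h] = proposal[h]
            break
        # Fix violating strata at their bounds, then re-allocate the rest.
        for h in over:
            alloc[h] = float(M[h])
            remaining -= M[h]
            active.remove(h)
    return alloc


if __name__ == "__main__":
    print(recursive_neyman(90, [100, 200, 700], [10, 5, 1], [20, 100, 100]))
```

The returned sizes are real-valued; rounding to integers is a separate step that this sketch does not address.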
49. Assessing Antithetic Sampling for Approximating Shapley, Banzhaf, and Owen Values.
- Author
-
Staudacher, Jochen and Pollmann, Tim
- Subjects
COOPERATIVE game theory ,STATISTICAL sampling ,PERMUTATIONS ,ALGORITHMS ,VARIANCES - Abstract
Computing Shapley values for large cooperative games is an NP-hard problem. For practical applications, stochastic approximation via permutation sampling is widely used. In the context of machine learning applications of the Shapley value, the concept of antithetic sampling has become popular. The idea is to employ the reverse permutation of a sample in order to reduce variance and accelerate convergence of the algorithm. We study this approach for the Shapley and Banzhaf values, as well as for the Owen value which is a solution concept for games with precoalitions. We combine antithetic samples with established stratified sampling algorithms. Finally, we evaluate the performance of these algorithms on four different types of cooperative games. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
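Illustrative note on entry 49: the abstract describes antithetic sampling as pairing each sampled permutation with its reverse. A minimal Python sketch of plain permutation sampling with that pairing might look as follows. The function name, the `value` callback (a characteristic function taking a frozenset of players), and the seed parameter are assumptions made for illustration; the sketch does not include the stratified variants or the Banzhaf and Owen value estimators studied by the authors.

```python
import random


def shapley_antithetic(players, value, num_pairs, seed=0):
    """Estimate Shapley values from num_pairs antithetic permutation pairs.

    players: list of hashable player labels
    value:   function mapping a frozenset of players to a number
    """
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}

    for _ in range(num_pairs):
        perm = list(players)
        rng.shuffle(perm)
        # Use the sampled permutation and its reverse as an antithetic pair.
        for order in (perm, perm[::-1]):
            coalition = set()
            prev = value(frozenset(coalition))
            for p in order:
                coalition.add(p)
                cur = value(frozenset(coalition))
                totals[p] += cur - prev  # marginal contribution of p
                prev = cur

    # Average over all 2 * num_pairs permutations.
    return {p: s / (2 * num_pairs) for p, s in totals.items()}


if __name__ == "__main__":
    weights = {"a": 1, "b": 2, "c": 3}
    additive = lambda s: sum(weights[p] for p in s)
    print(shapley_antithetic(list(weights), additive, num_pairs=10))
```

For an additive game such as the one in the usage example, every marginal contribution equals the player's own weight, so the estimate coincides with the exact Shapley value regardless of the permutations drawn; this makes it a convenient sanity check.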
50. Lung Nodule Classification Based on SE-ResNet152 and Stratified Sampling
- Author
-
Li, Jiancheng, Gan, Junying, Cao, Lu, Xu, Xuexia, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Yongtian, Wang, editor, and Lifang, Wu, editor
- Published
- 2023
- Full Text
- View/download PDF