621 results for "Aggregated data"
Search Results
2. Linear Mixed Modeling of Federated Data When Only the Mean, Covariance, and Sample Size Are Available.
- Author
Limpoco, Marie Analiz April, Faes, Christel, and Hens, Niel
- Subjects
FEDERATED learning, LINEAR statistical models, PATIENTS' rights, PEDIATRIC clinics, DATA privacy
- Abstract
In medical research, individual-level patient data provide invaluable information, but the patients' right to confidentiality remains of utmost priority. This poses a huge challenge when estimating statistical models such as the linear mixed model, an extension of linear regression that can account for potential heterogeneity whenever data come from different providers. Federated learning tackles this hurdle by estimating parameters without retrieving individual-level data; instead, it requires iterative communication of parameter-estimate updates between the data providers and the analysts. In this article, we propose an alternative framework to federated learning for fitting linear mixed models. Specifically, our approach only requires each data provider to share the mean, covariance, and sample size of its covariates, and only once. Using the principle of statistical sufficiency within the likelihood framework as theoretical support, the proposed strategy achieves estimates identical to those derived from the actual individual-level data. We demonstrate the approach on real data comprising 15,068 patient records from 70 clinics at the Children's Hospital of Pennsylvania: assuming each clinic shares its summary statistics once, we model the COVID-19 polymerase chain reaction test cycle threshold as a function of patient information. Simplicity, communication efficiency, generalisability, and a wider scope of implementation in any statistical software distinguish our approach from existing strategies in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
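The sufficiency argument is easy to make concrete for the fixed-effects part, which is an ordinary linear regression: the pooled normal equations can be rebuilt exactly from each provider's sample size n, mean vector m, and covariance matrix S of the joint vector Z = (covariates, outcome), since sum_i z_i z_i' = (n - 1)S + n m m'. A minimal numpy sketch with synthetic "clinics" (the article extends the same idea to full linear mixed models):

    import numpy as np

    rng = np.random.default_rng(0)

    # Each "clinic" shares only (n, mean, covariance) of Z = (x1, x2, y);
    # no individual-level records leave the site.
    full_X, full_y, sites = [], [], []
    for n in (120, 80, 200):
        X = rng.normal(size=(n, 2))
        y = 0.3 + X @ np.array([1.5, -0.7]) + rng.normal(scale=0.5, size=n)
        Z = np.column_stack([X, y])
        sites.append((n, Z.mean(axis=0), np.cov(Z, rowvar=False)))
        full_X.append(X)
        full_y.append(y)

    # Pooled cross-products recovered from the summaries alone.
    N = sum(n for n, _, _ in sites)
    sum_z = sum(n * m for n, m, _ in sites)
    sum_zz = sum((n - 1) * S + n * np.outer(m, m) for n, m, S in sites)

    # Normal equations for the design A = [1, x1, x2].
    AtA = np.zeros((3, 3))
    AtA[0, 0] = N
    AtA[0, 1:] = AtA[1:, 0] = sum_z[:2]
    AtA[1:, 1:] = sum_zz[:2, :2]
    Aty = np.concatenate(([sum_z[2]], sum_zz[:2, 2]))
    beta_summary = np.linalg.solve(AtA, Aty)

    # Identical to the fit on the pooled individual-level data.
    A = np.column_stack([np.ones(N), np.vstack(full_X)])
    beta_pooled = np.linalg.lstsq(A, np.concatenate(full_y), rcond=None)[0]
    assert np.allclose(beta_summary, beta_pooled)

The assert passes because (n, m, S) are jointly sufficient for the least-squares fit; the paper's contribution is showing how far this carries over to the mixed-model likelihood.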
3. COMPARATIVE INDICATOR ANALYSIS BASED ON AGGREGATED DATA IN THE CONTEXT OF NEEDS AND INTERPRETATIVE POSSIBILITIES IN A TEMPORAL APPROACH.
- Author
BERNAT, Piotr
- Subjects
SOCIAL impact, COMPARATIVE method, COMPARATIVE studies, DATA analysis, POSSIBILITY
- Abstract
Purpose: The aim of the work was to demonstrate the interpretive possibilities offered by the presentation of prepared information, the product of analytical work, which influences the final assessment and thus the perception of the studied phenomena or object, and to indicate potential interpretive distortions, i.e. potentially erroneous recommendations. Design/methodology/approach: A temporal approach to comparative indicator analysis that can expose interpretive distortions requires aggregated data sets, which are necessary for sound inference and so shape the perception of the examined object or issue. Findings: Whether interdependencies exist depends on both the time period considered and the reliability of the data; the structured considerations carried out in the successive stages of the comparative analytical work therefore reveal existing similarities, differences, and problems. Social implications: Comparative indicator analysis is a tool for collecting information about an object or phenomenon in its broader context (society, the economy, the state of infrastructure), so the studied object can be compared against that background, enabling conclusions and recommendations. Originality/value: Limiting interpretive distortions makes it possible to predict directions of development from the background, i.e. references to the environment and identified trends, and, by making future states more probable, to propose final assessments that translate into recommendations or procedures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. 6G assisted federated learning for continuous monitoring in wireless sensor network using game theory.
- Author
Phani Praveen, S., Ali, Mohammed Hasan, Jarwar, Muhammad Aslam, Prakash, Chander, Reddy, Chavva Ravi Kishore, Malliga, L., and Chandru Vignesh, C.
- Subjects
FEDERATED learning, WIRELESS sensor nodes, ENERGY management, GAME theory, POWER resources, WIRELESS sensor networks, DATA transmission systems
- Abstract
In game-theoretic applications, 6G-assisted federated learning for continuous-monitoring applications in wireless sensor networks (WSNs) is a significant concern: with more applications comes greater demand for advanced resource allocation and energy management. A WSN is a self-configured, infrastructure-less wireless network that monitors physical or other surrounding conditions. In this study, the proposed system concentrates on applying game theory to 6G-assisted federated learning for continuous monitoring in WSNs. Dual-sink techniques, with static and dynamically moving nodes, are applied to tentative node selection based on aggregated data transmission. Using static and trusted nodes, aggregated data transmission achieves high-level transmission by combining individual-level data, i.e., aggregating the output data. The technique is implemented on a WSN platform coordinated with a future 6G network using the NS4 programmable-data-plane simulation tool. The proposed system simplifies the development of a behavioral model and bridges the gap between simulation and deployment. Finally, the combination of game theory with 6G-assisted federated learning for continuous monitoring in WSNs solves several problems and identifies future directions. The outcome analysis shows that the designed wireless sensor network yields a network lifetime of more than 20 h and low energy consumption (less than 0.2 kJ) for efficient communication in a future 6G cellular network. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Assessing the impact of socio‐demographics and farming activity on ward‐level mortality patterns using farm and population decennial censuses.
- Author
Trearty, Kelly, Bunting, Brendan, and Mallett, John
- Subjects
AGRICULTURAL statistics, PSYCHOLOGY of agricultural laborers, MORTALITY, RISK assessment, STATISTICAL models, CLUSTER analysis (Statistics), ACADEMIC medical centers, SOCIOECONOMIC factors, PROBABILITY theory, CENSUS, DESCRIPTIVE statistics, PATH analysis (Statistics), CHI-squared test, WORK-related injuries, SOCIAL context, MATHEMATICAL models, SOCIODEMOGRAPHIC factors, AGRICULTURAL laborers, COMPARATIVE studies, DATA analysis software, CONFIDENCE intervals, THEORY, INDUSTRIAL hygiene, NEIGHBORHOOD characteristics, MORTALITY risk factors
- Abstract
Introduction and Objective: Farmers face a unique set of dangers that increases their risk of mortality compared with other occupations. This study hypothesised that Northern Ireland's (NI) agriculturally saturated wards have a higher risk of mortality than non-agriculturally based wards. Design: Population Census and Farm Census information was downloaded from the Northern Ireland Neighbourhood Information Service (NINIS) online depository to compile three mortality-based data sets (2001, 2011, and pooled). The study analysed all 582 ward areas of NI, covering the entire population of the country in 2001 and 2011. Findings: Path analysis was used to examine direct and indirect paths linked with mortality within the two census years (2001; 2011), alongside testing pathways for invariance between census years (pooled data set). Ward-level results provided evidence for exogenous variables operating on mortality through three or four endogenous variables via (i) direct effects (age), (ii) summed indirect effects (age; males; living alone; farming profit; and deprivation), and (iii) total effects (age; males; living alone; and deprivation). Multi-group results cross-validated these cause-and-effect relationships. Discussion and Conclusion: The study demonstrated that the influence of farming intensity scores, farming profits, and socio-demographics on mortality risk in a ward depends on the specific social-environmental characteristics of that area. In line with earlier area-level research, the results support the aggregated interpretation that higher levels of farming activity within a ward increase the risk of mortality in those wards of NI. This study should enable future tailoring of new strategies and the upgrading of current policies to bring about significant mortality-risk change at the local level. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. FinTechs and the Market for Financial Analysis.
- Author
Grennan, Jillian and Michaely, Roni
- Subjects
FINANCIAL technology, CORPORATE finance, ARTIFICIAL intelligence, INTERNET users, AGGREGATED data, INFORMATION resources
- Abstract
Hundreds of equity market intelligence financial technology firms (FinTechs) have formed in the last decade. We assemble novel data to describe their capabilities, users, and consequences. Our data suggest that these FinTechs i) aggregate many data sources, including nontraditional ones (e.g., Twitter, blogs), and synthesize such data using artificial intelligence to make investment recommendations, and ii) change Internet users’ information discovery by serving as substitutes for traditional information providers. We evaluate some nontraditional data and find evidence suggesting that such data contain valuable information or “crowd wisdom” that links to informational efficiency. Overall, our findings are consistent with this innovation benefiting investors and markets. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
7. DADOS ABERTOS DO IBGE: RECUPERAÇÃO NA API DE DADOS AGREGADOS.
- Author
Nascimento Silva, Patrícia and Pereira da Silva, Gabriel Vieira
- Subjects
TRANSPARENCY in government, GOVERNMENT policy, ACQUISITION of data, NEW product development, BUSINESS models, HABIT
- Abstract
Objective: The main objective of publishing government open data is its reuse by society. With the increase in the volume of available data, machine processability became one of the essential requirements for promoting interoperability and thus the effective reuse of open government data. This data article describes the data-retrieval process in the IBGE aggregated-data API, which makes available aggregated data from surveys and censuses carried out in Brazil. Method: The research is descriptive and exploratory, with a qualitative and applied approach, since it involves the practical problem of retrieving data from the IBGE aggregated-data API. A documentary search was carried out to identify the available endpoints and the parameters of the API, which allow "filters" to be implemented to retrieve relevant data and information from a universe of many aggregated data sets. Reuse potential: The automated availability of the IBGE survey data sets allows them to be monitored, retrieved, and used in the steering and maintenance of public policies, contributing to the identification of patterns, behaviours, and habits of the population by various segments of civil society, including public and private companies, and promoting new digital business models that use data and information from different contexts to create new products, services, and occupations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
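For readers who want to try the API, a minimal sketch using the requests library. The base URL is IBGE's public aggregated-data service (v3); the endpoint pattern follows the service's public documentation at the time of writing and should be checked before use:

    import requests

    BASE = "https://servicodados.ibge.gov.br/api/v3/agregados"

    # Root endpoint: the catalog of surveys and their aggregates (tables).
    catalog = requests.get(BASE, timeout=30).json()
    survey = catalog[0]
    print(survey["nome"], "->", survey["agregados"][0]["nome"])

    # Metadata of one aggregate: its variables, classifications, periodicity.
    agg_id = survey["agregados"][0]["id"]
    meta = requests.get(f"{BASE}/{agg_id}/metadados", timeout=30).json()
    print(meta["nome"], "->", [v["nome"] for v in meta["variaveis"]])

Observation values are then retrieved, per the documentation, from endpoints of the form {BASE}/{aggregate}/periodos/{period}/variaveis/{variable}?localidades=N1[all]; the IDs and parameters should be taken from the metadata call above.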
8. Projected incidence trends of need for long-term care in German men and women from 2011 to 2021.
- Author
Voß, Sabrina, Knippschild, Stephanie, Haß, Luisa, Tönnies, Thaddäus, and Brinks, Ralph
- Subjects
LONG-term health care, EPIDEMIOLOGY, AGGREGATED data, PARTIAL differential equations
- Abstract
Background: The German Federal Statistical Office routinely collects and reports aggregated numbers of people in need of long-term care (NLTC), stratified by age and sex, and the age- and sex-specific prevalence of NLTC from 2011 to 2021 is reported as well. One existing estimate of the incidence rate of NLTC is based on this age- and sex-specific prevalence, but it did not explore possible trends in incidence under different assumptions about the mortality rate ratio (MRR), which are important for an adequate projection of the future number of people with NLTC. Objective: We aim to explore possible trends in the age-specific incidence of NLTC in German men and women from 2011 to 2021 under different scenarios for excess mortality (in terms of the MRR). Methods: The incidence of NLTC was calculated from an illness-death model and a related partial differential equation, based on data from the Federal Statistical Office. The annual percent change (APC) of the incidence rate was estimated in eight scenarios. Results: There are consistent indications of trends in incidence for men and women aged 50-79 years, with APCs in the incidence rate of more than +9% per year (up to nearly 19%). For ages 80+ the APC is between +0.4% and +12.5%. In all scenarios, women had higher age-specific APCs than men. Conclusion: We performed the first analysis of the APC in the age- and sex-specific incidence rate of NLTC in Germany and revealed an increasing trend in incidence. With these findings, the future prevalence of NLTC can be estimated, and it may exceed current prognoses. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
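The partial differential equation mentioned above is, in the form popularized by Brinks and colleagues for the illness-death model, a relation between the age- and time-specific prevalence p(t, a), the incidence rate i, and the mortality rates m1 and m0 of people with and without the condition; given prevalence data and a scenario for the mortality rate ratio R, it can be solved for the incidence. A sketch of the standard form (not necessarily the authors' exact notation):

    \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial a}\right) p
        = (1 - p)\,\bigl(i - p\,(m_1 - m_0)\bigr),
    \qquad m_1 = R\, m_0,

so that the incidence follows as i = (\partial_t + \partial_a)p / (1 - p) + p\,(m_1 - m_0).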
9. COMPARATIVE INDICATOR ANALYSIS BASED ON AGGREGATED DATA IN THE CONTEXT OF NEEDS AND INTERPRETATIVE POSSIBILITIES IN PROCEDURAL TERMS.
- Author
BERNAT, Piotr
- Subjects
COMPARATIVE studies, SOCIAL impact, COMPARATIVE method, POSSIBILITY
- Abstract
Purpose: The aim of the work was to demonstrate the interpretive possibilities offered by the presentation of prepared information, the product of analytical work, which influences the final assessment and thus the perception of the examined phenomena or object, and also to indicate directions and, finally, ways of proceeding, i.e. potential or recommended actions. Design/methodology/approach: A procedural approach to comparative indicator analysis that can demonstrate the needs and possibilities of interpretation requires aggregated data sets, which are necessary for the proper conduct of inference and so shape the perception of the examined object or issue. Findings: Whether interdependencies exist depends on both the procedure and the reliability of the data; the procedurally structured considerations, carried out in subsequent stages of the comparative analytical work, make it possible to demonstrate existing similarities, connections, and problems. Social implications: Comparative indicator analysis is a tool for collecting information about an object or phenomenon in its broader context (society, the economy, the state of infrastructure), so the examined object can be compared against that background, enabling conclusions and recommendations. Originality/value: Limiting interpretive differences makes it possible to predict directions of development from the background, i.e. references to the environment and identified trends, and, by making future states more probable, to propose final assessments that translate into recommendations or procedures. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Predicting childhood lead exposure at an aggregated level using machine learning.
- Author
Lobo, GP, Kalyan, B, and Gadgil, AJ
- Subjects
Aggregated data, Environmental exposure, Lead, Lead poisoning, Machine learning, Child, Preschool child, Housing, Humans, New York City, Pediatric, Public Health and Health Services, Epidemiology, Toxicology
- Abstract
Childhood lead exposure affects over 500,000 children under 6 years old in the US; however, only 14 states recommend regular universal blood screening. Several studies have reported on the use of predictive models to estimate lead exposure of individual children, albeit with limited success: lead exposure can vary greatly among individuals, individual data is not easily accessible, and models trained in one location do not always perform well in another. We report on a novel approach that uses machine learning to accurately predict elevated Blood Lead Levels (BLLs) in large groups of children, using aggregated data. To that end, we used publicly available zip code and city/town BLL data from the states of New York (n = 1642, excluding New York City) and Massachusetts (n = 352), respectively. Five machine learning models were used to predict childhood lead exposure by using socioeconomic, housing, and water quality predictive features. The best-performing model was a Random Forest, with a 10-fold cross validation ROC AUC score of 0.91 and 0.85 for the Massachusetts and New York datasets, respectively. The model was then tested with New York City data and the results compared to measured BLLs at a borough level. The model yielded predictions in excellent agreement with measured data: at a city level it predicted elevated BLL rates of 1.72% for the children in New York City, which is close to the measured value of 1.73%. Predictive models, such as the one presented here, have the potential to help identify geographical hotspots with significantly large occurrence of elevated lead blood levels in children so that limited resources may be deployed to those who are most at risk.
- Published
- 2021
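A minimal sketch of the modelling step described in the abstract, with synthetic stand-ins for the area-level features and labels (the real predictors were socioeconomic, housing, and water-quality variables; n = 1642 mirrors the New York zip-code dataset):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    # Synthetic stand-in: one row per zip code, a binary "elevated BLL rate"
    # label driven by a couple of the features plus noise.
    X = rng.normal(size=(1642, 12))
    y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(size=1642) > 1.0).astype(int)

    model = RandomForestClassifier(n_estimators=500, random_state=0)
    auc = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
    print(f"10-fold CV ROC AUC: {auc.mean():.2f} +/- {auc.std():.2f}")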
11. A global gridded municipal water withdrawal estimation method using aggregated data and artificial neural network
- Author
Jiabao Yan and Shaofeng Jia
- Subjects
aggregated data, artificial neural network-based indirect model, fine-resolution, global, gridded, municipal water withdrawal, Environmental technology. Sanitary engineering (TD1-1066)
- Abstract
Municipal water withdrawal (MWW) information is of great significance for water-supply planning, including the planning, optimization, and management of water-supply pipeline networks. Currently, most MWW data are reported as spatial aggregates over large survey regions, or are missing altogether, which cannot meet the growing demand for spatially detailed data in many applications. In this paper, six different models are constructed and evaluated for estimating global MWW from aggregated MWW data and gridded raster covariates. Among them, the artificial neural network-based indirect model (NNM) shows the best accuracy, with higher R2 and lower NMAE and NRMSE at different spatial scales. The estimates from the NNM are consistent with census and survey data and outperform the existing global gridded MWW dataset. Finally, the NNM is applied to map global gridded MWW for the year 2015 at 0.1 × 0.1° resolution. The proposed method can be applied to the wider aggregated-output learning problem, and the high-resolution global gridded MWW data can be used in hydrological models and water-resources management. HIGHLIGHTS: Different models are constructed and evaluated for estimating gridded municipal water withdrawal. Global fine-resolution MWW data are generated using aggregated data and an artificial neural network model. The gridded indirect artificial neural network model, working through per capita MWW, achieved better performance than the other models. Uncertainty analysis indicates the robustness of the gridded indirect artificial neural network model at the regional scale. The artificial neural network-based method can be applied to a broader aggregated-output learning problem.
- Published
- 2023
- Full Text
- View/download PDF
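The "indirect" idea, predict per-capita withdrawal from gridded covariates, multiply by gridded population, then rescale so that every survey region reproduces its reported aggregate, can be sketched as follows. Everything here is synthetic and the network is a stand-in; the paper's covariates and architecture differ:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(2)
    n_cells, n_regions = 5000, 50
    X = rng.normal(size=(n_cells, 6))                        # gridded covariates
    pop = rng.lognormal(mean=8.0, sigma=1.0, size=n_cells)   # gridded population
    region = rng.integers(0, n_regions, size=n_cells)        # survey region id

    # Unobserved truth: per-capita withdrawal driven by covariates; only the
    # regional totals ("aggregated MWW") are ever reported.
    percap = np.exp(0.3 * X[:, 0] - 0.2 * X[:, 1])
    reported = np.bincount(region, weights=pop * percap, minlength=n_regions)

    # Naive training target: every cell inherits its region's per-capita value.
    pop_region = np.bincount(region, weights=pop, minlength=n_regions)
    target = reported[region] / pop_region[region]

    net = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
    net.fit(X, np.log(target))

    # Gridded estimate, rescaled so each region reproduces its reported total.
    est = pop * np.exp(net.predict(X))
    est *= (reported / np.bincount(region, weights=est, minlength=n_regions))[region]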
12. Inter-regional correlation estimators for functional magnetic resonance imaging
- Author
Sophie Achard, Jean-François Coeurjolly, Pierre Lafaye de Micheaux, Hanâ Lbath, and Jonas Richiardi
- Subjects
Functional connectivity, Correlation, Aggregated data, Familial correlations, Serial correlations, Neurosciences. Biological psychiatry. Neuropsychiatry (RC321-571)
- Abstract
Functional magnetic resonance imaging (fMRI) functional connectivity between brain regions is often computed using parcellations defined by functional or structural atlases. Typically, some kind of voxel averaging is performed to obtain a single temporal correlation estimate per region pair. However, several estimators can be defined for this task, with various assumptions and degrees of robustness to local noise, global noise, and region size. In this paper, we systematically present and study the properties of 9 different functional connectivity estimators taking into account the spatial structure of fMRI data, based on a simple fMRI data spatial model. These include 3 existing estimators and 6 novel estimators. We demonstrate the empirical properties of the estimators using synthetic, animal, and human data, in terms of graph structure, repeatability and reproducibility, discriminability, dependence on region size, as well as local and global noise robustness. We prove analytically the link between regional intra-correlation and inter-region correlation, and show that the choice of estimator has a strong influence on inter-correlation values. Some estimators, including the commonly used correlation of averages (ca), are positively biased, and have more dependence on region size and intra-correlation than robust alternatives, resulting in spatially dependent bias. We define the new local correlation of averages estimator with better theoretical guarantees, lower bias, and significantly lower dependence on region size (Spearman correlation 0.40 vs 0.55, paired t-test T=27.2, p=1.1e-47), at negligible cost to discriminative power compared with the ca estimator. The difference in connectivity pattern between the estimators is not distributed uniformly throughout the brain, but rather shows a clear ventral-dorsal gradient, suggesting that region size and intra-correlation play an important role in shaping functional networks defined using the ca estimator, leading to non-trivial differences in their connectivity structure. We provide an open-source R package and an equivalent Python implementation to facilitate the use of the new estimators, together with preprocessed rat time-series.
- Published
- 2023
- Full Text
- View/download PDF
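The region-size dependence discussed above is easy to reproduce: let two regions share latent signals with true correlation rho = 0.4, give every voxel independent local noise, and watch the correlation-of-averages (ca) estimate climb with region size as averaging suppresses the local noise (an illustration of the effect, not the paper's estimators):

    import numpy as np

    rng = np.random.default_rng(3)
    T, rho = 200, 0.4
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    common = rng.normal(size=(T, 2)) @ L.T   # latent regional signals, corr rho

    def ca_estimate(n1, n2, noise=1.5):
        """Correlation of the two region-average time series (ca estimator)."""
        R1 = common[:, [0]] + rng.normal(scale=noise, size=(T, n1))
        R2 = common[:, [1]] + rng.normal(scale=noise, size=(T, n2))
        return np.corrcoef(R1.mean(axis=1), R2.mean(axis=1))[0, 1]

    for n in (5, 20, 80, 320):   # same coupling, growing region sizes
        print(n, round(ca_estimate(n, n), 2))

Printed estimates increase toward the true rho as the regions grow, so region pairs with identical underlying coupling but different sizes receive systematically different ca values.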
13. Information Aggregation Bias and Samuelson's Dictum.
- Author
CHOI, YONGOK, RONDINA, GIACOMO, and WALKER, TODD B.
- Subjects
AGGREGATED data, PREJUDICES, ECONOMICS, STOCK prices, DIVIDENDS, MARKET volatility, ASSETS (Accounting), PRICING
- Abstract
Under the assumption of incomplete information, idiosyncratic shocks may not dissipate in the aggregate. An econometrician who incorrectly imposes complete information and applies the law of large numbers may be susceptible to information aggregation bias. Tests of aggregate economic theory will be misspecified even though tests of the same theory at the microlevel deliver the correct inference. A testable implication of information aggregation bias is "Samuelson's Dictum," the idea that stock prices can simultaneously display "microefficiency" and "macroinefficiency," credited to Paul Samuelson. Using firm-level data from the Center for Research in Security Prices, we present empirical evidence consistent with Samuelson's dictum. Specifically, we conduct two standard tests of the linear present value model of stock prices: a regression of future dividend changes on the dividend-price ratio, and a test for excess volatility. We show that the dividend-price ratio forecasts future growth in dividends much more accurately at the firm level, as predicted by the present value model, and that excess volatility can be rejected for most firms. When the same firms are aggregated into equal-weighted or cap-weighted portfolios, the estimated coefficients typically deviate from the present value model and "excess" volatility is observed; this is especially true for aggregates (e.g., the S&P 500) that are used in most asset pricing studies. To investigate the source of our empirical findings, we propose a theory of aggregation bias based on incomplete information and segmented markets. Traders specializing in individual stocks conflate idiosyncratic and aggregate shocks to dividends. To an econometrician using aggregate data, these assumptions generate a rejection of the present value model even though individual traders are efficiently using their available information. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
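The two tests referred to above are standard in the present-value literature. In log-linear form, the forecasting regression and the volatility bound can be stated generically (a textbook Campbell-Shiller/Shiller formulation, not necessarily the authors' exact specification):

    \Delta d_{t+1} = \alpha + \beta\,(d_t - p_t) + \varepsilon_{t+1},
    \qquad
    \operatorname{Var}(p_t) \le \operatorname{Var}(p_t^{*}),

where d_t - p_t is the log dividend-price ratio and p_t^{*} is the perfect-foresight (ex post rational) price. Microefficiency corresponds to firm-level regressions behaving as the model predicts; macroinefficiency corresponds to portfolio-level violations of the variance bound.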
14. Changing presidential approval: Detecting and understanding change points in interval censored polling data.
- Author
Jiahao Tian and Porter, Michael D.
- Subjects
PUBLIC opinion, EXPECTATION-maximization algorithms, CENSORSHIP, POLITICAL candidates, GOVERNMENT policy
- Abstract
Understanding how a society views certain policies, politicians, and events can help shape public policy, legislation, and even a political candidate's campaign. This paper focuses on using aggregated, or interval-censored, polling data to estimate the times at which public opinion shifts on the US president's job approval. The approval rate is modelled as a Poisson segmented (joinpoint) regression, with the EM algorithm used to estimate the model parameters. Inference on the change points is carried out using BIC-based model averaging, which captures the uncertainty in both the number and the location of change points. The model is applied to President Trump's job approval rating during 2020; three primary change points are discovered and related to significant events and statements. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
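The BIC-based model averaging step admits a compact generic statement: each candidate joinpoint model m gets a weight proportional to exp(-BIC_m / 2), and quantities such as the probability of a change point at time t are averaged over models (a sketch of standard BIC averaging, not necessarily the authors' exact estimator):

    w_m = \frac{\exp(-\mathrm{BIC}_m / 2)}{\sum_{m'} \exp(-\mathrm{BIC}_{m'} / 2)},
    \qquad
    \Pr(\text{change point at } t \mid \text{data}) \approx \sum_{m} w_m \,\mathbf{1}\{ t \in \tau_m \},

where \tau_m is the set of change points of model m. This propagates uncertainty in both the number and the location of the change points.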
15. A parameter estimation method for multivariate binned Hawkes processes.
- Author
Shlomovich, Leigh, Cohen, Edward A. K., and Adams, Niall
- Abstract
It is often assumed that events cannot occur simultaneously when modelling data with point processes. This raises a problem as real-world data often contains synchronous observations due to aggregation or rounding, resulting from limitations on recording capabilities and the expense of storing high volumes of precise data. In order to gain a better understanding of the relationships between processes, we consider modelling the aggregated event data using multivariate Hawkes processes, which offer a description of mutually-exciting behaviour and have found wide applications in areas including seismology and finance. Here we generalise existing methodology on parameter estimation of univariate aggregated Hawkes processes to the multivariate case using a Monte Carlo expectation–maximization (MC-EM) algorithm and through a simulation study illustrate that alternative approaches to this problem can be severely biased, with the multivariate MC-EM method outperforming them in terms of MSE in all considered cases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
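For reference, a multivariate Hawkes process with M mutually exciting streams and, for instance, exponential kernels has conditional intensity

    \lambda_m(t) = \mu_m + \sum_{n=1}^{M} \;\sum_{t_{n,i} < t} \alpha_{mn}\, e^{-\beta_{mn}\,(t - t_{n,i})},
    \qquad m = 1, \dots, M,

for events t_{n,i} of stream n. Aggregation means that only counts per bin, not the event times t_{n,i}, are observed; the MC-EM approach treats the exact times as latent variables and imputes them by simulation.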
16. Data report on three datasets: Mortality patterns between agricultural and non-agricultural ward areas
- Author
Kelly Trearty, Brendan Bunting, and John Mallett
- Subjects
death trends, aggregated data, area-level data, farm census, population census, Northern Ireland, Genetics (QH426-470)
- Abstract
The health of the farming community in Northern Ireland (NI) requires further research, as previous mortality studies have reported contradictory results regarding farmers' health outcomes compared with other occupations and the general population. This study collated the NINIS area-level farm census with population census information across 582 non-overlapping wards of NI to compile three mortality datasets (2001, 2011, and a pooled dataset) (NISRA 2019). These datasets allow future researchers to investigate the influence of demographic, farming, and economic predictors on all-cause mortality at the ward level. The 2001 and 2011 mortality datasets were compiled for cross-sectional analyses and subsequently pooled for longitudinal analyses. Findings from these datasets will provide evidence of the influence of Farming Intensity scores on death risk within the wards for future researchers to utilise. This data report will aid the understanding of the additive contribution of socio-ecological variables to the risk of death at the ward level within NI. It is of interest to the One Health research community as it standardises environment-human-animal data to pave the way towards a new One Health research paradigm. For example, future researchers can use this nationally representative data to investigate whether agriculturally saturated wards have a higher mortality risk than non-agriculturally based wards of NI.
- Published
- 2023
- Full Text
- View/download PDF
17. Identifying gender disparities in research performance: the importance of comparing apples with apples.
- Author
Nygaard, Lynn P., Aksnes, Dag W., and Piro, Fredrik Niclas
- Subjects
GENDER inequality, BIBLIOMETRICS, PUBLICATIONS, AUTHORSHIP, AGGREGATED data, HIGHER education, ADULTS
- Abstract
Many studies on research productivity and performance suggest that men consistently outperform women. However, women and men are spread unevenly throughout the academy both horizontally (e.g., by scientific field) and vertically (e.g., by academic position), suggesting that aggregate numbers (comparing all men with all women) may reflect the different publication practices in different corners of the academy rather than gender per se. We use Norwegian bibliometric data to examine how the "what" (which publication practices are measured) and the "who" (how the population sample is disaggregated) matter in assessing apparent gender differences among academics in Norway. We investigate four clusters of indicators related to publication volume, publication type, authorship, and impact or quality (12 indicators in total) and explore how disaggregating the population by scientific field, institutional affiliation, academic position, and age changes the gender gaps that appear at the aggregate level. For most (but not all) indicators, we find that gender differences disappear or are strongly reduced after disaggregation. This suggests a composition effect, whereby apparent gender differences in productivity can to a considerable degree be ascribed to the composition of the group examined and the different publication practices common to specific groups. We argue that aggregate figures can exaggerate some gender disparities while obscuring others. Our study illustrates the situated nature of research productivity and the importance of comparing men and women within similar academic positions or scientific fields—of comparing apples with apples—when using bibliometric indicators to identify gender disparities in research productivity. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
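The composition effect is easy to see with toy numbers: suppose men and women are equally productive within every field, but are spread unevenly across two fields with different publication norms (all figures hypothetical):

    # (men, women, papers per person) per field: hypothetical numbers in which
    # men and women are equally productive within every field.
    fields = {
        "medicine": (200, 400, 2.0),
        "physics":  (400, 100, 5.0),
    }
    for group, i in (("men", 0), ("women", 1)):
        heads = sum(v[i] for v in fields.values())
        papers = sum(v[i] * v[2] for v in fields.values())
        print(group, round(papers / heads, 2), "papers per person (aggregate)")
    # men 4.0 vs women 2.6: a purely compositional "productivity gap".

The aggregate comparison shows 4.0 versus 2.6 papers per person although no within-field gap exists, which is exactly why the authors insist on comparing apples with apples.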
18. A comparison of methods for enriching network meta‐analyses in the absence of individual patient data.
- Author
Proctor, Tanja, Zimmermann, Samuel, Seide, Svenja, and Kieser, Meinhard
- Subjects
RANDOMIZED controlled trials, NETWORK performance, DRUG development
- Abstract
During drug development, a biomarker is sometimes identified that separates a patient population into those with more and those with less benefit from the evaluated treatments. Consequently, later studies might be targeted, while earlier ones were performed in mixed patient populations. This poses a challenge in evidence synthesis, especially if only aggregated data are available. Starting from this scenario, we investigate three commonly used network meta-analytic estimation methods: the naive estimation approach, the stand-alone analysis, and network meta-regression. Additionally, we adapt and modify two methods used in evidence synthesis to combine randomized controlled trials with observational studies: the enrichment-through-weighting approach and informative-prior estimation. We evaluate all five methods in a simulation study with 32 scenarios using bias, root-mean-squared error, coverage, precision, and power, and we revisit a clinical data set to exemplify and discuss the application. In the simulation study, none of the methods was clearly favorable over all investigated scenarios. However, the stand-alone analysis and the naive estimation performed comparably to or worse than the other methods in all evaluated performance measures and simulation scenarios and are therefore not recommended. While substantial between-trial heterogeneity is challenging for all estimation approaches, the performance of the network meta-regression, the enrichment-through-weighting approach, and the informative-prior approach depended on the simulation scenario and the performance measure of interest. Furthermore, as these estimation methods make slightly different assumptions, some of which require additional information for estimation, we recommend sensitivity analyses wherever possible. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
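Of the methods adapted from the trials-plus-observational-studies setting, the informative-prior idea can be sketched with a power prior, in which the mixed-population evidence enters the targeted analysis down-weighted by a factor alpha (one common formalization; the variants investigated in the paper may differ in detail):

    \pi(\theta \mid D_{\text{targeted}}, D_{\text{mixed}})
        \;\propto\;
        L(\theta;\, D_{\text{targeted}})\; L(\theta;\, D_{\text{mixed}})^{\alpha}\; \pi_0(\theta),
    \qquad 0 \le \alpha \le 1,

with alpha = 0 recovering the stand-alone analysis and alpha = 1 the naive pooling of both bodies of evidence.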
19. Parameter Estimation of Binned Hawkes Processes.
- Author
Shlomovich, Leigh, Cohen, Edward A. K., Adams, Niall, and Patel, Lekha
- Subjects
PARAMETER estimation, DATA binning, POINT processes, EXPECTATION-maximization algorithms, COMPUTER networks, STOCK exchanges
- Abstract
A key difficulty that arises from real event data is imprecision in the recording of event time-stamps. In many cases, retaining event times with high precision is expensive due to the sheer volume of activity. Combined with practical limits on the accuracy of measurements, binned data are common. In order to use point processes to model such event data, tools for handling parameter estimation are essential. Here we consider parameter estimation of the Hawkes process, a type of self-exciting point process that has found application in the modeling of financial stock markets, earthquakes, and social media cascades. We develop a novel optimization approach to parameter estimation of binned Hawkes processes using a modified Expectation-Maximization algorithm, referred to as Binned Hawkes Expectation Maximization (BH-EM). Through a detailed simulation study, we demonstrate that existing methods are capable of producing severely biased and highly variable parameter estimates and that our novel BH-EM method significantly outperforms them in all studied circumstances. We further illustrate the performance on network flow (NetFlow) data between devices in a real large-scale computer network, to characterize triggering behavior. These results highlight the importance of correct handling of binned data. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
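To make "binned" concrete, the sketch below simulates a univariate exponential-kernel Hawkes process by Ogata thinning and then discards the time-stamps in favour of per-bin counts, which is the kind of data a binned estimator such as BH-EM is fitted to (the estimator itself is not reproduced here):

    import numpy as np

    def simulate_hawkes(mu, alpha, beta, horizon, rng):
        """Univariate Hawkes process with exponential kernel, via Ogata thinning."""
        t, events = 0.0, []
        while True:
            # The intensity decays between events, so its current value bounds
            # it until the next event: a valid thinning envelope.
            lam_bar = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
            t += rng.exponential(1.0 / lam_bar)
            if t > horizon:
                return np.array(events)
            lam_t = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
            if rng.uniform() <= lam_t / lam_bar:
                events.append(t)

    rng = np.random.default_rng(4)
    times = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=500.0, rng=rng)

    # What a binned estimator actually sees: counts per interval, no time-stamps.
    counts, _ = np.histogram(times, bins=np.arange(0.0, 501.0, 1.0))
    print(len(times), "events ->", counts[:10])

With the exact times the log-likelihood is tractable; once only `counts` is kept, the event times become latent, which is the gap the (MC-)EM machinery fills.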
20. Interval Valued Bipolar Fuzzy Prioritized Weighted Dombi Averaging Operator Based On Multi-Criteria Decision Making Problems.
- Author
ÖZLÜ, Şerif
- Subjects
MULTIPLE criteria decision making, FUZZY sets, AGGREGATED data, MULTIPLICATION, COMPARATIVE studies
- Abstract
In this paper, we develop a new operator, the Interval-Valued Bipolar Fuzzy Prioritized Weighted Dombi Averaging (IVBFPWDA) operator, by combining interval-valued bipolar fuzzy sets (IVBFSs) and Dombi operators. The construction is significant because it presents a prioritized, flexible approach: the soft structure provides a ranking within itself, and the rankings obtained serve to identify the ideal and non-ideal alternatives for many values of k. We then introduce some properties of the operator, such as addition, scalar multiplication, scalar power, and multiplication. Moreover, we present a score function over IVBFSs [8] for ranking aggregated values. The proposed operator is applied to an investment example under a five-step multi-criteria decision-making (MCDM) method: the components of the decision matrix are first turned into aggregated values using IVBFPWDA, score values are calculated, and the resulting values are ranked to determine the most desirable alternative. The results show that the operator is realistic, objective, and internally consistent. A comparative analysis using Hamacher operators over interval-valued bipolar fuzzy sets (IVBFSs) [8] shows agreement between the results, while the proposed operator has several advantages over existing IVBFS operators: it is flexible and adjustable, so the preferences, needs, and requirements of decision makers can be satisfied. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. UTM: A trajectory privacy evaluating model for online health monitoring
- Author
Zhigang Yang, Ruyan Wang, Dapeng Wu, and Daizhong Luo
- Subjects
Online health monitoring, Trajectory privacy, User trajectory model, Aggregated data, Uniqueness, Information technology (T58.5-58.64)
- Abstract
A huge amount of sensitive personal data is being collected by various online health monitoring applications. Although the data are anonymous, personal trajectories (e.g., the chronological access records of small cells) can become the anchor of linkage attacks to re-identify users. Focusing on trajectory privacy in online health monitoring, we propose the User Trajectory Model (UTM), a generic model for predicting trajectory re-identification risk that reveals the underlying relationship between trajectory uniqueness and aggregated data (e.g., the number of individuals covered by each small cell), and uses the parameters of the aggregated data to derive the statistical characteristics of uniqueness (i.e., its expectation and variance) mathematically. Exhaustive simulations validate the effectiveness of the UTM in privacy risk evaluation, confirm our theoretical deductions, and present counter-intuitive insights.
- Published
- 2021
- Full Text
- View/download PDF
22. An Analysis of Trends and Connections in Google, Twitter, and Wikipedia
- Author
Conti, Gianluca, Sansonetti, Giuseppe, and Micarelli, Alessandro (in: Stephanidis, Constantine and Antona, Margherita, eds.)
- Published
- 2020
- Full Text
- View/download PDF
23. EU Net-Zero Policy Achievement Assessment in Selected Members through Automated Forecasting Algorithms.
- Author
Tudor, Cristiana and Sova, Robert
- Subjects
EMISSIONS (Air pollution), CARBON offsetting, GEOLOGICAL statistics, AIR pollutants, FORECASTING, SUSTAINABLE development, AIR quality, GREENHOUSE gas mitigation
- Abstract
The European Union (EU) has positioned itself as a frontrunner in the worldwide battle against climate change and has set increasingly ambitious pollution mitigation targets for its members. The burden is heavier for the more vulnerable economies in Central and Eastern Europe (CEE), who must juggle meeting strict greenhouse gas emission (GHG) reduction goals, significant fossil-fuel reliance, and pressure to respond to current pandemic concerns that require an increasing share of limited public resources, while facing severe repercussions for non-compliance. Thus, the main goals of this research are: (i) to generate reliable aggregate GHG projections for CEE countries; (ii) to assess whether these economies are on track to meet their binding pollution reduction targets; (iii) to pin-point countries where more in-depth analysis using spatial inventories of GHGs at a finer resolution is further needed to uncover specific areas that should be targeted by additional measures; and (iv) to perform geo-spatial analysis for the most at-risk country, Poland. Seven statistical and machine-learning models are fitted through automated forecasting algorithms to predict the aggregate GHGs in nine CEE countries for the 2019–2050 horizon. Estimations show that CEE countries (except Romania and Bulgaria) will not meet the set pollution reduction targets for 2030 and will unanimously miss the 2050 carbon neutrality target without resorting to carbon credits or offsets. Austria and Slovenia are the least likely to meet the 2030 emissions reduction targets, whereas Poland (in absolute terms) and Slovenia (in relative terms) are the farthest from meeting the EU's 2050 net-zero policy targets. The findings thus stress the need for additional measures that go beyond the status quo, particularly in Poland, Austria, and Slovenia. Geospatial analysis for Poland uncovers that Krakow is the city where pollution is the most concentrated with several air pollutants surpassing EU standards. Short-term projections of PM2.5 levels indicate that the air quality in Krakow will remain below EU and WHO standards, highlighting the urgency of policy interventions. Further geospatial data analysis can provide valuable insights into other geo-locations that require the most additional efforts, thereby, assisting in the achievement of EU climate goals with targeted measures and minimum socio-economic costs. The study concludes that statistical and geo-spatial data, and consequently research based on these data, complement and enhance each other. An integrated framework would consequently support sustainable development through bettering policy and decision-making processes. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
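For a flavour of the automated univariate forecasting involved, one of the simpler model families, damped-trend exponential smoothing as implemented in statsmodels, projects an aggregate emissions series forward like this (toy data; the study fits and compares seven statistical and machine-learning models):

    import numpy as np
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # Toy aggregate GHG series for 1990-2018, gently declining with noise.
    rng = np.random.default_rng(5)
    years = np.arange(1990, 2019)
    ghg = 100.0 * 0.99 ** (years - 1990) + rng.normal(scale=1.0, size=years.size)

    fit = ExponentialSmoothing(ghg, trend="add", damped_trend=True).fit()
    projection = fit.forecast(2050 - 2018)   # project out to 2050
    print(projection[-5:])                   # projected levels for 2046-2050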
24. MultiCenter Interrupted Time Series Analysis: Incorporating Within and Between-Center Heterogeneity
- Author
Ewusie JE, Thabane L, Beyene J, Straus SE, and Hamid JS
- Subjects
aggregated data, weighted segmented regression, pooled analysis, interrupted time series, multisite studies, Infectious and parasitic diseases (RC109-216)
- Abstract
Background: Segmented regression (SR) is the most common statistical method used in the analysis of interrupted time series (ITS) data. However, this modeling strategy can produce spurious results when applied to aggregated data. For multicenter ITS studies, data at a given time point are often aggregated across different participants and settings; thus, conventional segmented regression analysis may not be optimal. Our objective is to provide a robust method for the analysis of ITS data that accounts for two sources of heterogeneity: between participants and across sites. Methods: We present a methodological framework within the segmented regression modeling strategy in which weights account for between-participant variation and differences across multiple sites. We empirically compared the proposed weighted segmented regression (wSR) with conventional SR and with a previously published pooled-analysis method using data from the Mobility of Vulnerable Elders in Ontario (MOVE-ON) project, a multisite ITS study. Results: Overall, the wSR produced the most precise estimates (narrowest 95% CIs), while conventional SR produced the least precise estimates. Our method also increased power. The pooled-analysis method and the wSR gave comparable results when ≤ 4 sites were included in the overall analysis and when there was moderate to high between-site heterogeneity as measured by the I2 statistic. Conclusion: Incorporating participant-level and site-level variability led to more precise and accurate estimates of the magnitude of an intervention effect and increased statistical power, underscoring the importance of accounting for the inherent variability in aggregated data. Extensive simulations are required to further assess the methods across a wide range of scenarios and outcome types.
- Published
- 2020
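The segmented-regression backbone to which the weights are added is the usual single-interruption ITS model (generic form; the participant- and site-level weights are the paper's contribution):

    Y_t = \beta_0 + \beta_1\, t + \beta_2\, X_t + \beta_3\,(t - T_0)\, X_t + \varepsilon_t,
    \qquad X_t = \mathbf{1}\{ t \ge T_0 \},

where T_0 is the interruption time, \beta_2 the immediate level change, and \beta_3 the post-interruption slope change; wSR fits this by weighted least squares, with weights reflecting, for example, the inverse variance of each site-time aggregate.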
25. Projected number of people in need for long-term care in Germany until 2050.
- Author
Haß L, Knippschild S, Tönnies T, Hoyer A, Palm R, Voß S, and Brinks R
- Subjects
Humans, Germany (epidemiology), Aged, Health Services Needs and Demand (trends; statistics & numerical data), Middle Aged, Female, Male, Aged 80 and over, Adult, Long-Term Care (statistics & numerical data; trends), Forecasting
- Abstract
Introduction: Current demographic trends predict continuously growing numbers of individuals reliant on care, which has to be accounted for in future planning of long-term care resources. Projections become especially necessary to enable healthcare systems to cope with this future burden and to implement suitable strategies to deal with the demand for long-term care. This study aimed to project the prevalence of long-term care and the number of care-dependent people in Germany until 2050. Methods: We used the illness-death model to project the future prevalence of long-term care in Germany until 2050 under eight different scenarios. The transition rates (incidence rate and mortality rates) describing the illness-death model, which have been studied recently, are required for this purpose. Absolute numbers of people in need of long-term care were calculated based on the 15th population projection of the Federal Statistical Office. Results: The number of people in need of long-term care will increase by at least 12%, to 5.6 million people, in the period from 2021 to 2050. Assuming an annual incidence increase of 2% from 2021 to 2050, the number of care-dependent individuals could rise to as much as 14 million (+180%). Conclusion: Our projections indicate a substantial rise in the number of care-dependent individuals. This is expected to lead to growing economic challenges as well as a stronger demand for healthcare and nursing personnel. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors RB, SK, and SV were editorial board members of Frontiers at the time of submission; this had no impact on the peer review process or the final decision. (Copyright © 2024 Haß, Knippschild, Tönnies, Hoyer, Palm, Voß and Brinks.)
- Published
- 2024
- Full Text
- View/download PDF
26. The Global Shortage of Essential Drugs during the COVID-19 Pandemic: Evidence Based on Aggregated Media and Social Media Reports.
- Author
Salahuddin, Mohammed, Manzar, Dilshad, Unissa, Aleem, Pandi-Perumal, Seithikurippu R., and BaHammam, Ahmed S.
- Subjects
COVID-19 pandemic, ESSENTIAL drugs, AGGREGATED data, CRITICAL care medicine, DRUG supply & demand
- Abstract
Background: A growing body of commentaries and media/social-media reports highlighted drug shortages during the COVID-19 pandemic. In this special report, the relation between drug shortages and response measures is discussed in the light of a preliminary data construct. Materials and Methods: Media reports and social media posts on public and national drug regulatory bodies' websites were searched between March 1, 2020, and August 11, 2020. The key search terms were shortage, nonavailability, essential medicine, essential drug, imported medicine, imported drug, COVID-19, current pandemic, and corona. A qualitative and quantitative summary of drug-shortage response pages and trends in drug-shortage reports are presented. Results: In the developed countries, the drug regulatory bodies released drug-shortage response pages; such pages were not made available in the developing countries. There were reports of drug shortages from both developing and developed countries. Reports of drug shortages appeared as early as March 2020, when the first lockdown was implemented, and continued until July 2020. The reported shortages ranged from simple essential medicines to drugs needed in critical care. Conclusions: The findings highlight the spread of drug shortages across developing and developed countries, the time trend of the reports (starting in the first week of the first lockdown and continuing throughout the study period), and the persistence of shortage reports even after drug-shortage response pages became available. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
27. Export, Logistics Performance, and Regional Economic Integration: Sectoral and Sub-Sectoral Evidence from Vietnam.
- Author
Duc Nha Le
- Subjects
LOGISTICS, INTERNATIONAL economic integration, SUSTAINABILITY, GENERALIZED method of moments, AGGREGATED data
- Abstract
As a coastal emerging country, Vietnam has pursued an export-led marine economy as its development model over the past decades since the Renovation of 1986. Given the rise of globalization, regional economic integration and logistics enhancement have been identified as key engines for economic sustainability by the Vietnamese government. Nevertheless, little sectoral and sub-sectoral evidence has been given for the platform shaped by policies relevant to export, logistics performance, and regional economic integration. The paper employs the trade gravity model to study the relationship between seafood export, logistics performance, and regional economic integration in the case of Vietnam. Sectoral and sub-sectoral trade gravity models are employed, with logistics performance from both the exporter and the importer side included in the estimations. Vietnam's membership in effective regional trade agreements proxies for regional economic integration. The zero-trade issue is addressed with Pooled Ordinary Least Squares (POLS), Poisson Pseudo-Maximum Likelihood (PPML), and Heckman sample-selection estimations, while endogeneity is tackled with difference and system Generalized Method of Moments (GMM) models. Findings vary by estimation method, data level, product group, and by which side is considered. In addition, theoretical contributions and some seafood-export-driving policy recommendations relevant to regional economic integration and logistics performance development are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
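The workhorse specification behind such an analysis is the gravity equation in multiplicative form; PPML keeps zero trade flows because the dependent variable enters in levels rather than logs. A generic version with logistics-performance indices (LPI) and an RTA-membership dummy added (illustrative regressors, not the paper's exact specification):

    X_{ij,t} = \exp\bigl(\beta_0 + \beta_1 \ln \mathrm{GDP}_{i,t} + \beta_2 \ln \mathrm{GDP}_{j,t}
        + \beta_3 \ln \mathrm{DIST}_{ij} + \beta_4\, \mathrm{LPI}_{i,t} + \beta_5\, \mathrm{LPI}_{j,t}
        + \beta_6\, \mathrm{RTA}_{ij,t} \bigr)\, \varepsilon_{ij,t},

where X_{ij,t} is the (possibly zero) seafood export flow from country i to country j in year t.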
28. Examining Prevalence and Diversity of Tick-Borne Pathogens in Questing Ixodes pacificus Ticks in California.
- Author
Salkeld, Daniel J., Lagana, Danielle M., Wachara, Julie, Porter, W. Tanner, and Nieto, Nathan C.
- Subjects
IXODES, LYME disease, ANAPLASMA phagocytophilum, BORRELIA burgdorferi, PATHOGENIC microorganisms, TICK-borne diseases, TICKS
- Abstract
Tick-borne diseases in California include Lyme disease (caused by Borrelia burgdorferi), infections with Borrelia miyamotoi, and human granulocytic anaplasmosis (caused by Anaplasma phagocytophilum). We surveyed multiple sites and habitats (woodland, grassland, and coastal chaparral) in California to describe spatial patterns of tick-borne pathogen prevalence in western black-legged ticks (Ixodes pacificus). We found that several species of Borrelia (B. burgdorferi, Borrelia americana, and Borrelia bissettiae) were observed in habitats, such as coastal chaparral, that do not harbor obvious reservoir host candidates. Estimates of tick-borne pathogen prevalence are strongly influenced by the scale of surveillance: aggregating data from individual sites to match jurisdictional boundaries (e.g., county or state) can lower the reported infection prevalence. Considering multiple pathogen species in the same habitat allows a more cohesive interpretation of local pathogen occurrence. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
29. American Economic Journal: Macroeconomics.
- Author
GILCHRIST, SIMON
- Subjects
MACROECONOMICS, GROWTH rate, GOVERNMENT policy, AGGREGATED data
- Abstract
The article discusses the mandate of the American Economic Journal: Macroeconomics, which is to publish studies of aggregate fluctuations and growth and of the role of policy. The journal invites papers in all fields that contribute to macroeconomics. The editors interpret this mandate as encompassing the broad range of research styles and methods through which progress in macroeconomics proceeds.
- Published
- 2021
- Full Text
- View/download PDF
30. A study of m-N-extremally Disconnected Spaces With Respect to τ, Maximum mX-N-open Sets.
- Author
Salih, Ahmed A. and Ali, Haider J.
- Subjects
ABSTRACT algebra, SET theory, AGGREGATED data, ANALYTIC sets, ANALYTIC spaces
- Abstract
The aim of this research is to introduce the notions of maximum mX-N-open sets and m-N-extremal disconnectedness with respect to τ, and to provide some definitions utilizing the idea of mX-N-open sets. Some properties of these sets are studied. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
31. Analysis of the time to death of patients with COVID-19 [新冠肺炎患者死亡时间的分析].
- Author
张军侠, 薛慧敏, 龚雅欣, 秦 琦, 宁昶华, 曹 蕾, and 曹永孝
- Abstract
Objective To investigate the time to death of patients with coronavirus disease 2019 (COVID-19). Methods The death time was calculated and analyzed using individual and aggregated data from the daily epidemic notifications and death cases published on the websites of the Health Commission of China and of the provinces. Results Among the 153 patients who died of COVID-19, the shortest time from onset to death was 4 days and the longest was 50 days, with a mean ± standard deviation of (16.7±9.2) days; the median was 14 days and the 95% confidence interval was 4.6-42.9. The shortest time from admission to death was 1 day and the longest was 50 days, with a mean ± standard deviation of (12.1±7.8) days; the median was 11 days and the 95% confidence interval was 2-32.8. The distribution of the time from diagnosis to death was skewed, ranging from 0 to 48 days, with a mean ± standard deviation of (11.1±8.9) days; the median was 9 days, the interquartile range was 10.5 days, and the 95% confidence interval was 0-35.4. It took 3 days from onset to admission and 1 day from admission to diagnosis. Aggregated data showed that the time from diagnosis to death of COVID-19 patients in China, China excluding Hubei Province, Hubei Province, and Wuhan City was 8, 9, 6, and 6 days, respectively. Conclusion The time from diagnosis to death of COVID-19 patients varied significantly, with median times of 6-9 days across regions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
32. MATHEMATICAL ANALYSIS OF CYCLE LENGTH-AGE STRUCTURED CELL POPULATION WITH AGGREGATE TRANSITION RULE: ASYNCHRONOUS GROWTH PROPERTY.
- Author
BOULANOUAR, MOHAMED
- Subjects
MATHEMATICAL analysis, CELL populations, AGGREGATED data, SEMIGROUP algebras, MATHEMATICAL models
- Abstract
This work is a continuation of [1], in which we analyzed a mathematical model of a structured cell population. Each cell is distinguished by its cycle length and its age. The daughter cells are correlated to the whole cell population through the Aggregate Transition Rule. We then investigate the asymptotic behavior of the generated semigroup, which allows us to obtain the Asynchronous Growth Property of the whole cell population. [ABSTRACT FROM AUTHOR]
- Published
- 2021
33. MATHEMATICAL ANALYSIS OF CYCLE LENGTH-AGE STRUCTURED CELL POPULATION WITH AGGREGATE TRANSITION RULE: WELL-POSEDNESS.
- Author
-
BOULANOUAR, MOHAMED
- Subjects
MATHEMATICAL analysis ,CELL populations ,AGGREGATED data ,SEMIGROUP algebras ,MATHEMATICAL models - Abstract
In this work, we analyze a mathematical model of a structured cell population. Each cell is distinguished by its cycle length and by its age. The daughter cells are correlated to the whole cell population through the Aggregate Transition Rule. We then prove that this mathematical model is governed by a C0-semigroup and investigate some of its properties. [ABSTRACT FROM AUTHOR]
- Published
- 2021
34. Atropos Evidence Network tops 300M patients with addition of Forian and Syndesis to its searchable network.
- Author
-
Beavins, Emma
- Subjects
GENERATIVE artificial intelligence ,ARTIFICIAL intelligence - Abstract
With 300 million patients represented, Atropos Health says its Evidence Network is now the largest federated healthcare data network in the industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
35. RECOMMENDATIONS FOR INSTITUTIONAL AND GOVERNMENTAL MANAGEMENT OF GENDER INFORMATION.
- Author
-
ASHLEY, FLORENCE
- Subjects
GENDER identity ,INFORMATION resources management ,PUBLIC records ,AGGREGATED data ,RESEARCH - Abstract
Gender information management has become an area of increased concern and tension in recent years due to the parallel rise of trans visibility and the increase of government surveillance. With this Article, I aim to provide a structured and principled analytical framework for managing gender information in a manner that is responsive to different institutional contexts. Part I sketches the ethical considerations and principles which guide my recommendations. Whereas ethical considerations are the values which underlie my recommendations (the why), the proposed principles provide us with conceptual tools to bridge the why, when, and how of gender information management. Part II explores four different contexts in which gender information should be gathered and recorded and makes recommendations specific to each of those contexts. These four contexts are: administrative records, special programs, aggregate assessment, and research. Part III sketches how and what gender information should be requested, recorded, and recounted, when justified under the recommendations. [ABSTRACT FROM AUTHOR]
- Published
- 2020
36. Nuevas perspectivas sobre el Problema de la Unidad Espacial Modificable (PUEM) en relación con la representación cartográfica de enfermedades raras.
- Author
-
Sánchez-Díaz, Germán, Alonso-Ferreira, Verónica, Posada de la Paz, Manuel, and Escobar, Francisco
- Subjects
- *
PHYSIOGRAPHIC provinces , *MEDICAL geography , *RARE diseases , *NEIGHBORHOODS , *VISUALIZATION - Abstract
This study analyses the effect of the modifiable areal unit problem (MAUP) by comparing mortality data for a specific rare disease (Huntington) at three levels of spatial aggregation in Spain. The objective is to compare mortality indicators and cartographic visualisations in order to advise soundly on the optimal aggregation level according to population, covered area, and number of cases. We designed an adjacency ratio to observe the effect of neighbourhood relationships among three geographic units: province; district (comarca); municipality. For each level of aggregation, we computed epidemiological indicators of mortality as well as local indicators of spatial association. Maps were plotted with user-defined intervals to compare visual and statistical differences. MAUP-related effects are particularly noticeable in relatively infrequent events such as rare diseases. We found that the district level displayed the highest stability in the adjacency ratio and showed optimal characteristics for spatial resolution and the amount of information revealed through plotting. This helps in choosing the working scale, or level of aggregation, that can be used with other diseases as a first step towards more advanced epidemiological analyses. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
37. Analysing domestic tourism flows at the provincial level in Spain by using spatial gravity models.
- Author
-
Alvarez‐Diaz, Marcos, D'Hombres, Beatrice, Ghisetti, Claudia, and Pontarollo, Nicola
- Subjects
DOMESTIC tourism ,AGGREGATED data ,ECONOMETRIC models ,RECREATION - Abstract
Domestic tourism represents a large share of the total tourism volume in Spain, but it is still an under‐researched topic. This study focuses on the determinants of domestic flows in Spain at provincial level. The prior assumption is that domestic tourism demand may be affected by specific local conditions that previous studies, mostly based on more aggregate data, would hardly capture. A gravity model and various spatial econometric models are estimated assuming alternative spatial weighting matrices. Results suggest that income and relative prices affect tourism demand in Spanish provinces as well as weather, natural amenities, infrastructures, and recreational activities. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
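The core of such a gravity model is a log-linear demand equation. A minimal sketch follows, assuming synthetic data and illustrative covariate names (origin income, relative prices, distance) rather than the authors' specification; their spatial econometric extensions and weighting matrices are not reproduced here.
```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400  # synthetic origin-destination province pairs
gdp_o = rng.lognormal(10, 0.5, n)   # origin income
price = rng.lognormal(0, 0.2, n)    # relative prices at destination
dist = rng.lognormal(5, 0.7, n)     # origin-destination distance

# Gravity form: flows rise with origin income, fall with prices and distance
log_flow = 1.0 + 0.8 * np.log(gdp_o) - 1.2 * np.log(price) \
           - 0.9 * np.log(dist) + rng.normal(0, 0.3, n)

X = sm.add_constant(np.column_stack([np.log(gdp_o), np.log(price), np.log(dist)]))
fit = sm.OLS(log_flow, X).fit()
print(fit.params)  # recovers the elasticities up to sampling noise
```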
38. Optimising lot sizing and order scheduling with non-linear production rates.
- Author
-
Neidigh, Robert O. and Harrison, Terry P.
- Subjects
PRODUCTION scheduling ,PRODUCTIVITY accounting ,MATHEMATICAL programming ,MANUFACTURING processes ,AGGREGATED data ,LINEAR statistical models - Abstract
This research focuses on developing an optimum production schedule in a process with a non-linear production rate. Non-linear production processes may exhibit an increasing production rate as the lot size increases, which results in increasing efficiency in per-unit production. The degree to which this learning is carried forward into the next lot varies by process. Sometimes the learning effect experiences a 100% carryover into the next lot, but other times some learning is forgotten and there is less than a 100% carryover. We consider processes in which the learning effect is completely forgotten from lot to lot. In practice non-linear processes are often treated as linear. That is, the production data are collected and aggregated over time and an average production rate is calculated which leads to inaccuracies in the production schedule. Here we use a discretised linear model to approximate the non-linear process. Production occurs in discrete time periods within which the amount produced is known. This enables a production schedule to be determined that minimises production and holding costs. A dynamic programming model that starts with the latest demand and progresses towards the earliest demand is used to solve the single-product single-machine problem. The model is tested using the production function from the PR#2 grinding process at CTS Reeves, a manufacturing firm in Carlisle, Pa. Solution times are determined for 50, 100, 200, 500, 1500, and 3000 periods. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
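The abstract describes a dynamic program over discrete periods. A simplified sketch of that idea, assuming a Wagner-Whitin-style recursion with an illustrative concave lot-cost function standing in for the non-linear production rate; the paper's own model runs from latest to earliest demand and uses the PR#2 grinding data, neither of which is reproduced here.
```python
import math

demand = [40, 60, 30, 80, 50]   # per-period demands (illustrative)
hold = 0.5                       # holding cost per unit per period
setup = 100.0                    # fixed cost per lot

def prod_cost(q):
    # Illustrative non-linear production cost: per-unit cost falls as lot
    # size grows (within-lot learning, fully forgotten between lots).
    return 0.0 if q == 0 else 8.0 * q ** 0.9

T = len(demand)
best = [math.inf] * (T + 1)      # best[t] = min cost to cover periods 0..t-1
best[0] = 0.0
choice = [0] * (T + 1)
for t in range(1, T + 1):
    for j in range(t):           # one lot produced in period j covers j..t-1
        q = sum(demand[j:t])
        holding = sum(hold * demand[k] * (k - j) for k in range(j, t))
        c = best[j] + setup + prod_cost(q) + holding
        if c < best[t]:
            best[t], choice[t] = c, j

# Recover the lot schedule by walking the choices backwards
t, lots = T, []
while t > 0:
    j = choice[t]
    lots.append((j, sum(demand[j:t])))
    t = j
print("min cost:", round(best[T], 2), "lots (period, size):", lots[::-1])
```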
39. On influence of clustering population on accuracy of population total estimation.
- Author
-
Wywiał, Janusz L. and Sitek, Grzegorz A.
- Subjects
- *
CLUSTER sampling , *PROBABILITY theory - Abstract
The paper compares the accuracy of using cluster samples and using stratified samples to estimate a population total. Several clustering algorithms are used to partition a finite population into strata or clusters. Several variants of stratified sampling designs and one-stage cluster sampling designs, including those dependent on various inclusion probabilities, are taken into account. The accuracies of the estimators are compared using simulation experiments. The results of this paper let us conclude that partitioning a population into clusters could significantly improve the accuracy of estimating the total using sampling dependent on inclusion probabilities proportional to the aggregated auxiliary variable. Moreover, the considered estimators based on cluster sampling designs could be easily used in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
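A minimal simulation of the paper's key idea can be sketched with a Hansen-Hurwitz estimator under probability-proportional-to-size sampling of clusters, where selection probabilities are proportional to an aggregated auxiliary variable. The population, auxiliary variable, and sample size below are synthetic assumptions, not the authors' experimental design.
```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic population partitioned into 50 clusters
clusters = [rng.lognormal(3, 0.6, rng.integers(20, 80)) for _ in range(50)]
totals = np.array([c.sum() for c in clusters])
true_total = totals.sum()

# Aggregated auxiliary variable correlated with the cluster totals
aux = totals * rng.uniform(0.8, 1.2, 50)
p = aux / aux.sum()              # selection probs proportional to aux

def hansen_hurwitz(n=10):
    # PPS-with-replacement draw of n clusters; HH estimator of the total
    idx = rng.choice(50, size=n, p=p)
    return np.mean(totals[idx] / p[idx])

est = [hansen_hurwitz() for _ in range(2000)]
print(f"true total {true_total:.0f}, HH mean {np.mean(est):.0f}, "
      f"rel. SE {np.std(est) / true_total:.3f}")
```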
40. One-stage dose-response meta-analysis for aggregated data.
- Author
-
Crippa, Alessio, Discacciati, Andrea, Bottai, Matteo, Spiegelman, Donna, and Orsini, Nicola
- Subjects
- *
DOSE-response relationship in biochemistry , *AGGREGATED data , *META-analysis , *RANDOM effects model , *SPECIFICATION (Civil law) , *COFFEE , *COMPUTER simulation , *CAUSES of death , *EXPERIMENTAL design , *REGRESSION analysis - Abstract
The standard two-stage approach for estimating non-linear dose-response curves from aggregated data typically excludes studies with fewer than three exposure groups. We develop the one-stage method as a linear mixed model and present the main aspects of the methodology, including model specification, estimation, testing, prediction, goodness-of-fit, model comparison, and quantification of between-studies heterogeneity. Using both fictitious and real data from a published meta-analysis, we illustrate the main features of the proposed methodology and compare it to a traditional two-stage analysis. In a one-stage approach, the pooled curve and estimates of the between-studies heterogeneity are based on the whole set of studies without any exclusion. Thus, even complex curves (splines, spike at zero exposure) defined by several parameters can be estimated. We show how the one-stage method facilitates several applications, in particular quantification of heterogeneity over the exposure range, prediction of marginal and conditional curves, and comparison of alternative models. The one-stage method for meta-analysis of non-linear curves is implemented in the dosresmeta R package. It is particularly suited for dose-response meta-analyses of aggregated data where the complexity of the research question is better addressed by including all the studies. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
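The one-stage method is implemented in the dosresmeta R package; as a rough Python analogue, a linear mixed model with a random dose slope per study pools all studies, including those with only two exposure groups. This sketch ignores the within-study covariance of the log relative risks that the actual method models via generalised least squares, and all data are synthetic assumptions.
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
rows = []
for study in range(8):
    slope = 0.05 + rng.normal(0, 0.02)   # study-specific dose slope
    # Studies contribute 2-4 exposure groups; none is excluded
    doses = rng.choice([0, 1, 2, 4, 6], size=rng.integers(2, 5), replace=False)
    for dose in doses:
        rows.append({"study": study, "dose": dose,
                     "logrr": slope * dose + rng.normal(0, 0.05)})
df = pd.DataFrame(rows)

# Pooled dose-response curve with a random dose slope per study
fit = smf.mixedlm("logrr ~ dose", df, groups="study", re_formula="~dose").fit()
print(fit.summary())
```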
41. Estimating Term Structure Phenomena From Data Aggregated over Time.
- Author
-
Cargill, Thomas F. and Meyer, Robert A.
- Subjects
YIELD curve (Finance) ,INTEREST rates ,AGGREGATED data ,TIME series analysis ,ECONOMETRIC models ,ESTIMATION theory - Abstract
The article examines the results of the empirical research reported in two frequency-domain studies and keeps the discussion open by explaining further some of the econometric aspects of estimation from data aggregated over time, since aggregation may introduce a certain amount of bias on its own. To determine the effect of time aggregation, the paper examines the distributed-lag relationship between long-term and short-term interest rates for various digitizing intervals over the same span of time.
- Published
- 1974
- Full Text
- View/download PDF
42. Modeling Field-Level Conservation Tillage Adoption with Aggregate Choice Data
- Author
-
Tara Wade, Lyubov Kurkalova, and Silvia Secchi
- Subjects
aggregated data ,conservation tillage ,estimated subsidies ,logit model ,Agriculture
This empirical study of conservation tillage adoption relies on the logit model applied to field-level information on agents' attributes and county-aggregated measures of agents' choices. The methodology treats the aggregated data as an expected value, namely the area-weighted group average of individual probabilities of choosing conservation tillage, subject to a measurement error. Using 2002 and 2004 data for Iowa, we estimate field-level costs of the adoption of conservation tillage. The results indicate that adoption is significantly affected by soil characteristics and crop rotation and highlight the heterogeneity in adoption costs when controlling for these characteristics.
- Published
- 2016
- Full Text
- View/download PDF
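The estimation idea, treating each county's observed adoption share as the area-weighted average of field-level logit probabilities plus measurement error, can be sketched as a non-linear least-squares problem. The attributes, weights, and error scale below are assumptions for illustration, not the authors' data or their exact estimator.
```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n_counties, fields_per = 30, 50
beta_true = np.array([-0.5, 1.2, -0.8])

# Field-level attributes (intercept, slope class, rotation dummy) by county
X = [np.column_stack([np.ones(fields_per),
                      rng.normal(0, 1, fields_per),
                      rng.integers(0, 2, fields_per)]) for _ in range(n_counties)]
w = [np.full(fields_per, 1 / fields_per) for _ in range(n_counties)]  # area weights

def county_share(beta, Xc, wc):
    p = 1 / (1 + np.exp(-Xc @ beta))   # individual adoption probabilities
    return wc @ p                       # area-weighted expected county share

# Observed aggregate shares = expected value + measurement error
y = np.array([county_share(beta_true, Xc, wc) for Xc, wc in zip(X, w)]) \
    + rng.normal(0, 0.02, n_counties)

def obj(b):
    return sum((y[i] - county_share(b, X[i], w[i])) ** 2
               for i in range(n_counties))

res = minimize(obj, np.zeros(3), method="BFGS")
print("estimated beta:", res.x.round(2), "(true:", beta_true, ")")
```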
43. Rule Learning in Healthcare and Health Services Research
- Author
-
Wojtusiak, Janusz, Kacprzyk, Janusz, Series editor, Jain, Lakhmi C, Series editor, Dua, Sumeet, editor, Acharya, U. Rajendra, editor, and Dua, Prerna, editor
- Published
- 2014
- Full Text
- View/download PDF
44. Uncertainties and implications of applying aggregated data for spatial modelling of atmospheric ammonia emissions.
- Author
-
Hellsten, S., Dragosits, U., Place, C.J., Dore, A.J., Tang, Y.S., and Sutton, M.A.
- Subjects
AMMONIA ,EMISSIONS (Air pollution) ,EUTROPHICATION ,GEOGRAPHIC spatial analysis ,AGRICULTURE - Abstract
Ammonia emissions vary greatly at a local scale, and effects (eutrophication, acidification) occur primarily close to sources. Therefore it is important that spatially distributed emission estimates are located as accurately as possible. The main source of ammonia emissions is agriculture, and therefore agricultural survey statistics are the most important input data to an ammonia emission inventory alongside per-activity estimates of emission potential. In the UK, agricultural statistics are collected at farm level, but are aggregated to parish level, NUTS-3 level or regular grid resolution for distribution to users. In this study, the Modifiable Areal Unit Problem (MAUP), associated with such amalgamation, is investigated in the context of assessing the spatial distribution of ammonia sources for emission inventories. England was used as a test area to study the effects of the MAUP. Agricultural survey data at farm level (point data) were obtained under license and amalgamated to different areal units or zones: regular 1-km, 5-km, 10-km grids and parish level, before they were imported into the emission model. The results of using the survey data at different levels of amalgamation were assessed to estimate the effects of the MAUP on the spatial inventory. The analysis showed that the size and shape of aggregation zones applied to the farm-level agricultural statistics strongly affect the location of the emissions estimated by the model. If the zones are too small, this may result in false emission “hot spots”, i.e., artificially high emission values that are in reality not confined to the zone to which they are allocated. Conversely, if the zones are too large, detail may be lost and emissions smoothed out, which may give a false impression of the spatial patterns and magnitude of emissions in those zones. The results of the study indicate that the MAUP has a significant effect on the location and local magnitude of emissions in spatial inventories where amalgamated, zonal data are used. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
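The zone-size effect described here is easy to reproduce: aggregating the same synthetic point-source emissions to 1-km, 5-km, and 10-km grids changes both the apparent hot spots and the spatial detail. A minimal sketch with made-up farm locations and emission values, not the licensed survey data used in the study:
```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic farm-level point emissions inside a 100 km x 100 km region,
# clustered around a few "livestock" centres
centres = rng.uniform(0, 100, (5, 2))
pts = np.vstack([c + rng.normal(0, 3, (200, 2)) for c in centres])
emis = rng.gamma(2.0, 1.0, len(pts))

def gridded(cell_km):
    bins = np.arange(0, 100 + cell_km, cell_km)
    h, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=[bins, bins],
                             weights=emis)
    return h / cell_km ** 2      # emission density per km^2

for cell in (1, 5, 10):
    d = gridded(cell)
    print(f"{cell:>2} km grid: max density {d.max():7.1f}, "
          f"nonzero cells {np.count_nonzero(d)}")
# Small zones concentrate emissions into spurious hot spots, while large
# zones smooth them out: the MAUP effect described in the abstract.
```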
45. THE IMPACT OF REGIONAL AGE DISTRIBUTION ON ENTREPRENEURSHIP IN CANADA.
- Author
-
Weiqiu Yu and Liang Ma
- Subjects
AGE distribution ,ENTREPRENEURSHIP ,BIRTH rate ,AGGREGATED data ,BIG data - Abstract
A sustained decline in mortality and fertility rates during the twentieth century has resulted in a shift towards older populations worldwide. Ageing directly impacts business and the overall economy of a nation. Some studies have shown that the decision to become an entrepreneur is a regional phenomenon. A study of the effect of demographic factors and ageing on business startups in Germany, using an aggregate data set, found an inverted U-shaped relationship between the regional age distribution and start-up activity in a region. This paper examines the relationship between age distribution and regional business activities in Canada using a longitudinal data set from 1988 to 2014. Results confirm that differences in the age distribution contribute to the variation in business activities in Canada. Moreover, our findings suggest that the age-specific likelihood of becoming an entrepreneur changes over time, indicating the existence of age-specific peer effects in Canada. [ABSTRACT FROM AUTHOR]
- Published
- 2018
46. The Aggregate Association Index applied to stratified 2 × 2 tables: Application to the 1893 election data in New Zealand.
- Author
-
Tran, Duy, Beh, Eric J., and Hudson, Irene L.
- Subjects
- *
AGGREGATED data , *DATA management , *INFERENCE (Logic) , *DATA protection , *DATA analysis - Abstract
Data aggregation often occurs due to data collection methods or confidentiality laws imposed by government and institutional organisations. This kind of practice is carried out to ensure that an individual's privacy is protected, but it results in only selective information being distributed. In this case, the availability of only aggregate data makes it difficult to draw conclusions about the association between categorical variables. This issue lies at the heart of Ecological Inference (EI) and is of growing concern for data analysts, especially those dealing with the aggregate analysis of a single, or multiple, 2 × 2 contingency tables. Currently, a number of EI approaches are available and provide the analyst with tools to analyse aggregated data, but their success has been mixed due to the variety of assumptions made about the individual-level data, or the models developed to analyse them. As an alternative to ecological inference, one may consider the Aggregate Association Index (AAI). This index gives the analyst an indication of the likely association structure between two categorical variables of a single 2 × 2 contingency table when the individual-level, or joint frequency/proportion, data are unknown. To date, the AAI has been developed for the analysis of a single 2 × 2 table. Hence, the purpose of this paper is to extend the application of the AAI to the case where aggregated data from multiple 2 × 2 tables (i.e. stratified 2 × 2 tables) require analysis. To illustrate this new extension of the AAI, New Zealand voting data from 1893 are studied with a focus on gender. These data comprise fifty-five electorates, for each of which only the marginal information of a 2 × 2 table is available. The importance of this New Zealand voting data is that it was in the 1893 election that gender equality in voting at a national level was recognised for the first time in the world. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
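The starting point for analyses of this kind is the classical Duncan-Davis bounds on an unknown conditional proportion of a 2 × 2 table when only the marginals are available; the AAI quantifies association across the range of those bounds. A minimal sketch of the bounds themselves (the index and its stratified extension are not reproduced here), with hypothetical enrolment and turnout figures:
```python
def bounds_p1(r1, r2, c1):
    """Duncan-Davis bounds on P(column 1 | row 1) for a 2x2 table when
    only the marginals are known: row totals r1, r2 and column-1 total c1.
    The unknown cell count n11 satisfies max(0, c1 - r2) <= n11 <= min(r1, c1).
    """
    n11_min = max(0, c1 - r2)
    n11_max = min(r1, c1)
    return n11_min / r1, n11_max / r1

# Hypothetical electorate: 1000 women and 1200 men enrolled, 1500 votes cast
lo, hi = bounds_p1(r1=1000, r2=1200, c1=1500)
print(f"share of women voting lies in [{lo:.2f}, {hi:.2f}]")
```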
47. Geostatistical disaggregation of polygon maps of average crop yields by area-to-point kriging.
- Author
-
Brus, D.J., Boogaard, H., Ceccarelli, T., Orton, T.G., Traore, S., and Zhang, M.
- Subjects
- *
CROP yields , *GEOLOGICAL statistics , *POLYGONS , *KRIGING , *AGRICULTURAL surveys - Abstract
Crop yield data are often available as statistics of areas, such as administrative units, generated by national agricultural surveys and censuses. This paper shows that such areal data can be used in area-to-point kriging (ATP kriging) to estimate the crop yield at the nodes of a fine grid that discretizes the study area, so that a more detailed map of the crop yield is obtained. The theory behind ATP kriging is explained, and illustrated with a one-dimensional simulation study and two real-world case studies. Vegetation, precipitation, temperature and soil data were used as potential covariates in the spatial trend part of the geostatistical model. ATP kriging requires the covariogram at point support, which can be recovered from the areal data by restricted maximum likelihood. The standard errors of the estimated variogram parameters can then be obtained by the Fisher information matrix. The average yields of only 17 administrative units in Shandong province (China) were not enough to obtain reliable estimates of the covariogram at point support. Also the ranges of the regional averages of the covariates were very narrow, so that the model must be extrapolated in the largest part of the study area. We were more confident about the covariogram parameters estimated from 45 provinces in Burkina Faso. We conclude that ATP kriging is an interesting method for disaggregation of spatially averaged crop yields. Contrary to other downscaling methods ATP kriging is founded on statistical theory, and consequently provides estimates of the precision of the disaggregated yields. Shortcomings are related to the uncertainty in the estimated covariogram parameters, as well as to the extrapolation of the model outside the range of the regional means of the covariates. Opportunities for future advancements are the use of modelled yields as covariates and the introduction of expert knowledge at different levels. For the latter a Bayesian approach to ATP kriging can be advantageous, introducing prior knowledge about the model parameters, as well as accounting for uncertainty about the model parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
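The essence of ATP kriging is that area-to-area and area-to-point covariances are obtained by averaging a point-support covariogram over a discretisation of each areal unit. A one-dimensional ordinary-kriging sketch, assuming the covariogram is known (in the paper it is recovered from the areal data by restricted maximum likelihood) and ignoring the trend covariates; all yields and geometry are illustrative:
```python
import numpy as np

def cov(h, sill=1.0, rng_km=30.0):
    # Point-support exponential covariogram (assumed known here)
    return sill * np.exp(-h / rng_km)

# Discretise each "administrative unit" (1-D strips for brevity) by points
units = [np.linspace(a, a + 20, 15) for a in (0, 20, 40, 60, 80)]
z_area = np.array([3.1, 2.4, 2.9, 4.0, 3.3])   # areal mean yields (illustrative)

def c_aa(u, v):   # area-to-area covariance: average of point covariances
    return cov(np.abs(u[:, None] - v[None, :])).mean()

def c_ap(u, x0):  # area-to-point covariance
    return cov(np.abs(u - x0)).mean()

n = len(units)
C = np.array([[c_aa(u, v) for v in units] for u in units])
# Ordinary-kriging system with the unbiasedness constraint
A = np.zeros((n + 1, n + 1))
A[:n, :n], A[:n, n], A[n, :n] = C, 1.0, 1.0

x0 = 37.0                                      # prediction point
b = np.append([c_ap(u, x0) for u in units], 1.0)
lam = np.linalg.solve(A, b)[:n]
print(f"ATP-kriged yield at x={x0}: {lam @ z_area:.2f}")
```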
48. ArCo: An R package to Estimate Artificial Counterfactuals.
- Author
-
Fonseca, Yuri R., Masini, Ricardo P., Medeiros, Marcelo C., and Vasconcelos, Gabriel F. R.
- Subjects
- *
COUNTERFACTUALS (Logic) , *AGGREGATED data , *DATA mining - Abstract
In this paper we introduce the ArCo package for R, which consists of a set of functions to implement the Artificial Counterfactual (ArCo) methodology for estimating causal effects of an intervention (treatment) on aggregated data when a control group is not necessarily available. The ArCo method is a two-step procedure: in the first stage, a counterfactual is estimated from a large panel of time series from a pool of untreated peers; in the second stage, the average treatment effect over the post-intervention sample is computed. Standard inferential procedures are available. The package is illustrated with both simulated and real datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
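A minimal Python sketch of the two-step procedure described in the abstract (the package itself is written for R): the first stage fits the treated series on the pool of untreated peers over the pre-intervention sample, and the second stage averages the gap between the observed and counterfactual series after the intervention. The data, the penalised first-stage estimator, and the effect size are illustrative assumptions.
```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(11)
T, T0, n_peers = 120, 80, 15          # T0 = intervention date
peers = rng.normal(0, 1, (T, n_peers)).cumsum(axis=0) * 0.1

# Treated unit tracks a few peers; a level shift of 0.5 starts at T0
y = peers[:, :3] @ np.array([0.5, 0.3, 0.2]) + rng.normal(0, 0.1, T)
y[T0:] += 0.5

# Step 1: counterfactual model fitted on the pre-intervention sample only
model = LassoCV(cv=5).fit(peers[:T0], y[:T0])
y_cf = model.predict(peers[T0:])

# Step 2: average treatment effect over the post-intervention sample
ate = (y[T0:] - y_cf).mean()
print(f"estimated ATE: {ate:.2f} (true effect 0.5)")
```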
49. A concept for measuring network proximity of regions in R&D networks.
- Author
-
Wanzenböck, Iris
- Subjects
RESEARCH & development ,LOCATION marketing ,EXTERNALITIES ,AGGREGATED data ,PRODUCTIVITY accounting - Abstract
This paper proposes a new measure for assessing the network proximity between aggregated units, based on disaggregated information on the network distance of actors. Specific focus is on R&D network structures between regions. We introduce a weighted version of the proximity measure, related to the idea that direct and indirect linkages carry different types of knowledge. First-order proximity arising from direct cross-regional linkages is distinguished from higher-order network proximity resulting from indirect linkages in the R&D network. We use a macroeconomic application, in which we analyse the productivity effects of R&D network spillovers across regions, to illustrate the usefulness of a proximity measure for aggregated units. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
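One way to operationalise the distinction between first-order and higher-order network proximity is to discount walks of increasing length through the inter-regional collaboration matrix. The decay weighting below is an illustrative assumption, not the paper's exact measure.
```python
import numpy as np

# Illustrative inter-regional R&D collaboration counts (direct linkages)
A = np.array([[0, 4, 0, 1],
              [4, 0, 2, 0],
              [0, 2, 0, 3],
              [1, 0, 3, 0]], dtype=float)

def network_proximity(A, decay=0.5, max_order=4):
    """Weighted proximity combining direct (first-order) and indirect
    (higher-order) linkages: walks of length k are discounted by
    decay**(k-1). The weighting scheme is an illustrative assumption."""
    P, Ak = np.zeros_like(A), A.copy()
    for k in range(1, max_order + 1):
        P += decay ** (k - 1) * Ak
        Ak = Ak @ A               # walks one step longer
    P /= P.max()                  # normalise to [0, 1]
    np.fill_diagonal(P, 0.0)      # proximity to self is not of interest
    return P

print(network_proximity(A).round(2))
```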
50. Estimating Risk Preferences in the Field.
- Author
-
Barseghyan, Levon, Molinari, Francesca, O'Donoghue, Ted, and Teitelbaum, Joshua C.
- Subjects
RISK assessment ,SURVEYS ,EXPECTED utility ,UTILITY theory ,AGGREGATED data - Abstract
We survey the literature on estimating risk preferences using field data. We concentrate our attention on studies in which risk preferences are the focal object and estimating their structure is the core enterprise. We review a number of models of risk preferences, including both expected utility (EU) theory and non-EU models, that have been estimated using field data, and we highlight issues related to identification and estimation of such models using field data. We then survey the literature, giving separate treatment to research that uses individual-level data (e.g., property-insurance data) and research that uses aggregate data (e.g., betting-market data). We conclude by discussing directions for future research. (JEL C51, D11, D81, D82, D83, G22, I13) [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF