21 results
Search Results
2. Using Tensor Completion Method to Achieving Better Coverage of Traffic State Estimation from Sparse Floating Car Data.
- Author
-
Ran, Bin, Song, Li, Zhang, Jian, Cheng, Yang, and Tan, Huachun
- Subjects
TRAFFIC engineering, ESTIMATION theory, PROBLEM solving, STATISTICAL correlation, MISSING data (Statistics) - Abstract
Traffic state estimation from the floating car system is a challenging problem. The low penetration rate and random distribution of floating cars mean that available samples usually cover only part of the road network's space and time points. To obtain wide-coverage traffic state from the floating car system, many methods have been proposed to estimate the traffic state of uncovered links. However, these methods cannot provide the traffic state of the entire road network. In this paper, traffic state estimation is transformed into a missing data imputation problem, and a tensor completion framework is proposed to estimate the missing traffic states. A tensor is constructed to model the traffic state, in which observed entries are derived directly from the floating car system and unobserved traffic states are modeled as missing entries of the constructed tensor. The constructed traffic state tensor represents the spatial and temporal correlations of traffic data and encodes the multi-way properties of the traffic state. The advantage of the proposed approach is that it can fully mine and utilize the multi-dimensional inherent correlations of the traffic state. We tested the proposed approach on a well-calibrated simulation network. Experimental results demonstrated that the proposed approach yields reliable traffic state estimation from very sparse floating car data, particularly when the floating car penetration rate is below 1%. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
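The core idea of the abstract above, treating unobserved links and times as missing entries of a low-rank tensor, can be sketched as follows. This is a minimal hard-impute-style completion on a toy rank-1 tensor; it is an illustrative stand-in, not the authors' algorithm, and all data are invented.

```python
import numpy as np

def complete_tensor(T, mask, rank=1, n_iter=50):
    """Impute missing entries of a 3-way tensor by iterating a truncated-SVD
    approximation of its mode-1 unfolding (a simple stand-in for the paper's
    tensor-completion framework)."""
    X = np.where(mask, T, T[mask].mean())            # init missing with observed mean
    flat = X.reshape(X.shape[0], -1)
    flat_mask = mask.reshape(flat.shape)
    flat_T = T.reshape(flat.shape)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(flat, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-r approximation
        flat = np.where(flat_mask, flat_T, low_rank)      # keep observed entries fixed
    return flat.reshape(T.shape)

# Toy "traffic state" tensor (link x day x time-of-day), rank-1 by construction,
# with three unobserved entries standing in for uncovered links/times.
speed = np.einsum('i,j,k->ijk', [40.0, 60.0], [1.0, 0.9, 1.1], [1.0, 0.5, 0.8, 1.0])
mask = np.ones(speed.shape, dtype=bool)
mask[0, 1, 2] = mask[1, 2, 0] = mask[0, 0, 3] = False
est = complete_tensor(speed, mask)
max_err = np.abs(est - speed)[~mask].max()
```

Because the toy tensor is exactly rank-1 and each missing entry shares a column with an observed one, the iteration recovers the held-out speeds closely; real floating-car data would need a higher rank and a proper tensor (rather than unfolded-matrix) decomposition.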
3. Large-scale probabilistic identification of boreal peatlands using Google Earth Engine, open-access satellite data, and machine learning.
- Author
-
DeLancey, Evan Ross, Kariyeva, Jahan, Bried, Jason T., and Hird, Jennifer N.
- Subjects
MACHINE learning, TAIGA ecology, TAIGAS, AQUATIC sciences, PHYSICAL sciences, EARTH sciences, ENVIRONMENTAL sciences - Abstract
Freely-available satellite data streams and the ability to process these data on cloud-computing platforms such as Google Earth Engine have made frequent, large-scale landcover mapping at high resolution a real possibility. In this paper we apply these technologies, along with machine learning, to the mapping of peatlands–a landcover class that is critical for preserving biodiversity, helping to address climate change impacts, and providing ecosystem services, e.g., carbon storage–in the Boreal Forest Natural Region of Alberta, Canada. We outline a data-driven, scientific framework that: compiles large amounts of Earth observation data sets (radar, optical, and LiDAR); examines the extracted variables for suitability in peatland modelling; optimizes model parameterization; and finally, predicts peatland occurrence across a large boreal area (397,958 km2) of Alberta at 10 m spatial resolution (equalling 3.9 billion pixels across Alberta). The resulting peatland occurrence model shows an accuracy of 87% and a kappa statistic of 0.57 when compared to our validation data set. Differentiating peatlands from mineral wetlands achieved an accuracy of 69% and kappa statistic of 0.37. This data-driven approach is applicable at large geopolitical scales (e.g., provincial, national) for wetland and landcover inventories that support long-term, responsible resource management. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
4. Best Match: New relevance search for PubMed.
- Author
-
Fiorini, Nicolas, Canese, Kathi, Starchenko, Grisha, Kireev, Evgeny, Kim, Won, Miller, Vadim, Osipov, Maxim, Kholodov, Michael, Ismagilov, Rafis, Mohan, Sunil, Ostell, James, and Lu, Zhiyong
- Subjects
SEARCH engines, SEARCH algorithms, INTERNET searching, DATA mining, MEDICAL literature - Abstract
PubMed is a free search engine for biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature—about two articles are added every minute on average—finding and retrieving the most relevant papers for a given query is increasingly challenging. We present Best Match, a new relevance search algorithm for PubMed that leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the traditional date sort order. The Best Match algorithm is trained with past user searches with dozens of relevance-ranking signals (factors), the most important being the past usage of an article, publication date, relevance score, and type of article. This new algorithm demonstrates state-of-the-art retrieval performance in benchmarking experiments as well as an improved user experience in real-world testing (over 20% increase in user click-through rate). Since its deployment in June 2017, we have observed a significant increase (60%) in PubMed searches with relevance sort order: it now assists millions of PubMed searches each week. In this work, we hope to increase the awareness and transparency of this new relevance sort option for PubMed users, enabling them to retrieve information more effectively. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
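The weighted combination of relevance signals described in the Best Match abstract can be illustrated with a toy scorer. The signal names, weights, and articles below are invented for illustration; the real ranker is machine-learned from past user searches, not hand-weighted.

```python
# Toy relevance scorer: each article gets a score from a weighted sum of
# ranking signals, and results are sorted by score instead of by date.
weights = {'bm25': 1.0, 'past_usage': 0.5, 'recency': 0.3}   # invented weights

articles = [
    {'id': 'old-classic', 'bm25': 2.1, 'past_usage': 3.0, 'recency': 0.1},
    {'id': 'new-niche',   'bm25': 1.2, 'past_usage': 0.2, 'recency': 1.0},
    {'id': 'best-match',  'bm25': 2.5, 'past_usage': 2.0, 'recency': 0.8},
]

def score(article):
    """Weighted sum of relevance signals (a stand-in for a learned ranker)."""
    return sum(weights[k] * article[k] for k in weights)

ranked = sorted(articles, key=score, reverse=True)
```

Under a date sort, 'new-niche' would come first; under the signal-weighted sort, the heavily used and textually relevant articles rise to the top, which is the behavioral change the abstract reports.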
5. Accurate and fast path computation on large urban road networks: A general approach.
- Author
-
Song, Qing, Li, Meng, and Li, Xiaolei
- Subjects
TRANSPORTATION, TRAFFIC engineering, ROADS, NAVIGATION, ALGORITHMS - Abstract
Accurate and fast path computation is essential for applications such as onboard navigation systems and traffic network routing. While a number of heuristic algorithms have been developed in recent years for faster path queries, their accuracy often falls far short of satisfactory. In this paper, we first develop an agglomerative graph partitioning method for generating highly balanced traverse-distance partitions, and we construct a three-level graph model based on the partition scheme for structuring the urban road network. Then, we propose a new hierarchical path computation algorithm, which benefits from the hierarchical graph model and utilizes a region pruning strategy to significantly reduce the search space without compromising accuracy. Finally, we present a detailed experimental evaluation on the real urban road network of New York City; the experimental results demonstrate the effectiveness of the proposed approach in generating optimal paths quickly and facilitating real-time routing applications. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
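The region-pruning idea in the abstract above can be sketched with a two-level query: a shortest path on a small region-overlay graph first selects which partitions to keep, and the node-level search is then restricted to them. This is a simplified illustration on an invented five-node graph, not the paper's three-level algorithm, and the min-crossing-edge overlay used here is only a heuristic in general.

```python
import heapq

def dijkstra(graph, src, dst, allowed=None):
    """Dijkstra returning (distance, path); `allowed` optionally restricts
    the search space, mimicking region pruning."""
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            path = [u]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, float('inf')):
            continue
        for v, w in graph.get(u, []):
            if allowed is not None and v not in allowed:
                continue
            if d + w < dist.get(v, float('inf')):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    return float('inf'), []

def pruned_query(graph, region_of, src, dst):
    """Two-level query: a region-overlay shortest path picks the regions to
    keep; the node-level search then ignores all other regions."""
    overlay = {}
    for u in graph:
        for v, w in graph[u]:
            ru, rv = region_of[u], region_of[v]
            if ru != rv:
                nbrs = overlay.setdefault(ru, {})
                nbrs[rv] = min(w, nbrs.get(rv, float('inf')))
    ograph = {r: list(nb.items()) for r, nb in overlay.items()}
    _, region_path = dijkstra(ograph, region_of[src], region_of[dst])
    keep = set(region_path)
    allowed = {n for n in region_of if region_of[n] in keep}
    return dijkstra(graph, src, dst, allowed)

# Invented toy network: regions A and B are directly linked; region C is a detour.
edges = [('a1', 'a2', 1), ('a2', 'b1', 2), ('b1', 'b2', 1),
         ('a1', 'c1', 10), ('c1', 'b2', 10)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))
region_of = {'a1': 'A', 'a2': 'A', 'b1': 'B', 'b2': 'B', 'c1': 'C'}

full = dijkstra(graph, 'a1', 'b2')
pruned = pruned_query(graph, region_of, 'a1', 'b2')
```

Here the pruned search skips region C entirely yet still returns the optimal distance of 4, which is the behavior the paper aims for at city scale.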
6. An efficient General Transit Feed Specification (GTFS) enabled algorithm for dynamic transit accessibility analysis.
- Author
-
Fayyaz S., S. Kiavash, Liu, Xiaoyue Cathy, and Zhang, Guohui
- Subjects
METROPOLITAN areas, PUBLIC transit, TRAVEL time (Traffic engineering), POPULATION density, POPULATION biology - Abstract
The social functions of urbanized areas are highly dependent on and supported by convenient access to public transportation systems, particularly for less privileged populations with constrained auto ownership. To accurately evaluate public transit accessibility, it is critical to capture the spatiotemporal variation of transit services. This can be achieved by measuring the shortest paths or minimum travel time between origin-destination (OD) pairs at each time-of-day (e.g. every minute). In recent years, General Transit Feed Specification (GTFS) data has been gaining popularity for between-station travel time estimation due to its interoperability in spatiotemporal analytics. Many software packages, such as ArcGIS, have developed toolboxes to enable travel time estimation with GTFS. They perform reasonably well in calculating travel time between OD pairs for a specific time-of-day (e.g. 8:00 AM), yet can become computationally inefficient and impractical as data dimensions increase (e.g. all times-of-day on a large network). In this paper, we introduce a new algorithm that is computationally elegant and mathematically efficient to address this issue. An open-source toolbox written in C++ is developed to implement the algorithm. We implemented the algorithm on the City of St. George’s transit network to showcase the accessibility analysis enabled by the toolbox. The experimental evidence shows a significant reduction in computational time. The proposed algorithm and toolbox are easily transferable to other transit networks, allowing transit agencies and researchers to perform high-resolution transit performance analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
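The efficiency trick the abstract alludes to, computing travel times for every departure minute without re-running a search per minute, can be sketched for a single direct line. The trips below are invented GTFS-style (departure, arrival) pairs in minutes after midnight; one backward sweep over time yields the whole profile.

```python
# Trips from stop A to stop B as (departure_min, arrival_min),
# e.g. (480, 500) departs 8:00 and arrives 8:20. Invented data.
trips = [(480, 500), (495, 515), (510, 528), (540, 560)]

def travel_time_profile(trips, t_start, t_end):
    """Minimum A->B travel time (wait + ride) for every departure minute in
    [t_start, t_end), computed in one backward sweep instead of one scan per
    minute: sweeping t downward, each trip becomes usable exactly once."""
    trips = sorted(trips)
    best = [None] * (t_end - t_start)        # None = no remaining trip
    earliest_arr = float('inf')
    i = len(trips) - 1
    for t in range(t_end - 1, t_start - 1, -1):
        while i >= 0 and trips[i][0] >= t:   # trips departing at/after t
            earliest_arr = min(earliest_arr, trips[i][1])
            i -= 1
        if earliest_arr < float('inf'):
            best[t - t_start] = earliest_arr - t
    return best

profile = travel_time_profile(trips, 470, 545)
```

Departing at 8:00 sharp takes 20 minutes; departing one minute later means waiting for the 8:15 trip, so the travel time jumps to 34 minutes. This is the minute-by-minute variation the paper argues accessibility measures must capture; a real implementation would combine such profiles across transfers.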
7. Relay discovery and selection for large-scale P2P streaming.
- Author
-
Zhang, Chengwei, Wang, Angela Yunxian, and Hei, Xiaojun
- Subjects
PEER-to-peer architecture (Computer networks), ERROR analysis in mathematics, ESTIMATION theory, HASHING, NUMERICAL analysis - Abstract
In peer-to-peer networks, application relays are commonly used to provide various networking services. Service performance often improves significantly if a relay is selected appropriately based on its network location. In this paper, we study the location-aware relay discovery and selection problem for large-scale P2P streaming networks. In these large-scale and dynamic overlays, discovering a sufficiently large relay candidate set, and then selecting a well-performing relay from it, incurs significant communication and computation cost. Network location can be measured directly or indirectly, with tradeoffs among timeliness, overhead, and accuracy. Based on a measurement study and the associated error analysis, we demonstrate that indirect measurements, such as King and Internet Coordinate Systems (ICS), can only achieve a coarse estimate of peers’ network locations, and that methods based on purely indirect measurements cannot lead to a good relay selection. Using three publicly available RTT data sets, we also demonstrate significant error amplification in the commonly used “best-out-of-K” selection methodology. We propose a two-phase approach to achieve efficient relay discovery and accurate relay selection. Indirect measurements are used to narrow the field to a small number of high-quality relay candidates, and the final relay selection is then refined by direct probing. This two-phase approach enjoys an efficient implementation using a Distributed Hash Table (DHT). When the DHT is constructed, the node keys carry location information and are generated scalably using indirect measurements, such as ICS coordinates. Relay discovery is achieved efficiently using DHT-based search. We evaluated various aspects of this DHT-based approach, including the DHT indexing procedure, key generation under peer churn, and message costs. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
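The two-phase idea in the abstract, cheap indirect estimates to shortlist candidates, then expensive direct probes only on the shortlist, can be sketched as below. Peers, coordinates, and RTTs are all simulated; the real system derives coordinates from an ICS and probes over the network.

```python
import math
import random

random.seed(7)

# Simulated peers with 2-D network coordinates (the "indirect" view).
peers = {f'p{i}': (random.uniform(0, 100), random.uniform(0, 100))
         for i in range(50)}
target = (50.0, 50.0)

def coord_rtt(p):
    """Phase 1 estimate: distance in coordinate space (cheap, noisy)."""
    x, y = peers[p]
    return math.hypot(x - target[0], y - target[1])

def probe_rtt(p):
    """Phase 2 'direct' probe: true latency, simulated here as the
    coordinate estimate plus error, since coordinates are only approximate."""
    return coord_rtt(p) + random.uniform(-5, 5)

K = 5
# Phase 1: indirect measurements narrow 50 peers down to K candidates.
candidates = sorted(peers, key=coord_rtt)[:K]
# Phase 2: direct probing of only the K candidates picks the relay.
relay = min(candidates, key=probe_rtt)
```

Only K direct probes are issued instead of one per peer, which is the overhead reduction that makes the approach viable at streaming-overlay scale.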
8. Can co-authorship networks be used to predict author research impact? A machine-learning based analysis within the field of degenerative cervical myelopathy research
- Author
-
Noah Grodzinski, Benjamin Davies, and Ben Grodzinski
- Subjects
Bibliometrics, Machine Learning, Artificial Neural Networks, Network Analysis, Authorship, Spinal Cord Diseases, Neurodegenerative Diseases, Research Assessment, Forecasting - Abstract
Introduction: Degenerative Cervical Myelopathy (DCM) is a common and disabling condition, with a relatively modest research capacity. In order to accelerate knowledge discovery, the AO Spine RECODE-DCM project has recently established the top priorities for DCM research. Uptake of these priorities within the research community will require their effective dissemination, which can be supported by identifying key opinion leaders (KOLs). In this paper, we aim to identify KOLs using artificial intelligence. We produce and explore a DCM co-authorship network, to characterise researchers’ impact within the research field. Methods: Through a bibliometric analysis of 1674 scientific papers in the DCM field, a co-authorship network was created. For each author, statistics about their connections to the co-authorship network (and so the nature of their collaboration) were generated. Using these connectedness statistics, a neural network was used to predict H-Index for each author (as a proxy for research impact). The neural network was retrospectively validated on an unseen author set. Results: DCM research is regionally clustered, with strong collaboration across some international borders (e.g., North America) but not others (e.g., Western Europe). In retrospective validation, the neural network achieves a correlation coefficient of 0.86 (p
Discussion: Analysis of the neural network shows that the nature of collaboration strongly impacts an author’s research visibility, and therefore suitability as a KOL. This also suggests greater collaboration within the DCM field could help to improve both individual research visibility and global synergy.
- Published
- 2021
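The pipeline in the abstract above, connectedness statistics in, research-impact prediction out, can be sketched with a single linear layer trained by gradient descent instead of the paper's neural network. Everything here is synthetic: the two features, the H-Index generator, and the weights are invented for illustration.

```python
import random

random.seed(0)

def make_author(_):
    """Synthetic author: two connectedness features and a noisy H-Index
    that (by construction) depends linearly on them."""
    n_coauthors = random.randint(1, 60)
    intl_links = random.randint(0, n_coauthors)
    h_index = 0.3 * n_coauthors + 0.5 * intl_links + random.gauss(0, 1)
    return (n_coauthors, intl_links), h_index

train = [make_author(i) for i in range(200)]

# One linear unit trained by batch gradient descent on squared error.
w, b, lr = [0.0, 0.0], 0.0, 0.0005
for _ in range(2000):
    gw, gb = [0.0, 0.0], 0.0
    for (x1, x2), y in train:
        err = (w[0] * x1 + w[1] * x2 + b) - y
        gw[0] += err * x1
        gw[1] += err * x2
        gb += err
    w[0] -= lr * gw[0] / len(train)
    w[1] -= lr * gw[1] / len(train)
    b -= lr * gb / len(train)

# "Retrospective validation" on unseen synthetic authors.
held_out = [make_author(i) for i in range(50)]
preds = [w[0] * x1 + w[1] * x2 + b for (x1, x2), _ in held_out]
actual = [y for _, y in held_out]

def corr(a, c):
    ma, mc = sum(a) / len(a), sum(c) / len(c)
    cov = sum((x - ma) * (y - mc) for x, y in zip(a, c))
    return cov / (sum((x - ma) ** 2 for x in a) ** 0.5
                  * sum((y - mc) ** 2 for y in c) ** 0.5)

r = corr(preds, actual)
```

Because the synthetic target really is a function of the features, the held-out correlation is high; the paper's contribution is showing that real co-authorship connectedness carries comparable signal for real H-Index values.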
9. Mapping technological innovation dynamics in artificial intelligence domains: Evidence from a global patent analysis
- Author
-
Na Liu, Philip Shapira, Xiaoxu Yue, and Jiancheng Guan
- Subjects
Artificial Intelligence, Patents, Inventions, Machine Learning, Support Vector Machines, Intellectual Property, Diffusion of Innovation, Network Analysis - Abstract
Artificial intelligence (AI) is emerging as a technology at the center of many political, economic, and societal debates. This paper formulates a new AI patent search strategy and applies this to provide a landscape analysis of AI innovation dynamics and technology evolution. The paper uses patent analyses, network analyses, and source path link count algorithms to examine AI spatial and temporal trends, cooperation features, cross-organization knowledge flow and technological routes. Results indicate a growing yet concentrated, non-collaborative and multi-path development and protection profile for AI patenting, with cross-organization knowledge flows based mainly on interorganizational knowledge citation links.
- Published
- 2021
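The "source path link count" family of algorithms mentioned in the abstract weights each citation edge by how many source-to-sink paths traverse it, which highlights the main technological trajectory. Below is a simplified search-path-count variant on an invented five-patent citation DAG; exact SPLC definitions differ in how intermediate start points are treated.

```python
from functools import lru_cache

def splc_edge_weights(dag):
    """Edge traversal counts over all source-to-sink paths in a citation DAG.
    `dag` maps a node to its successors (edges point forward in time)."""
    nodes = set(dag) | {v for vs in dag.values() for v in vs}
    succs = {n: dag.get(n, []) for n in nodes}
    preds = {n: [] for n in nodes}
    for u in nodes:
        for v in succs[u]:
            preds[v].append(u)

    @lru_cache(maxsize=None)
    def from_source(n):          # paths reaching n from any source
        return 1 if not preds[n] else sum(from_source(p) for p in preds[n])

    @lru_cache(maxsize=None)
    def to_sink(n):              # paths from n to any sink
        return 1 if not succs[n] else sum(to_sink(s) for s in succs[n])

    # Every source->sink path through edge (u, v) combines a path into u
    # with a path out of v, so the edge count factorizes.
    return {(u, v): from_source(u) * to_sink(v)
            for u in nodes for v in succs[u]}

# Invented DAG: patent A spawns two branches that reconverge at D, then E.
dag = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E']}
edge_weight = splc_edge_weights(dag)
```

The bottleneck edge D-E carries both paths and gets the highest count, so a main-path search would route through it.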
10. A personalized channel recommendation and scheduling system considering both section video clips and full video clips
- Author
-
SeungGwan Lee and Daeho Lee
- Subjects
IPTV, Video Recording, Broadcasting, Communications Media, Data Mining, Information Retrieval, Scheduling, Communication Channels - Abstract
With the convergence of various broadcasting systems, the amount of content available on mobile terminals, including IPTV, has significantly increased. In this paper, we propose a system that enables users to schedule programs considering both section video clips and full video clips, based on a method for detecting users with similar preferences. Since the content can be classified according to program, the proposed method can store the programs a user desires and thus create and schedule a kind of individual channel. Experimental results comparing existing channel recommendation methods with the program recommendation method proposed in this paper show that the proposed method achieves higher prediction accuracy.
- Published
- 2018
11. Assessing the role of transmission chains in the spread of HIV-1 among men who have sex with men in Quebec, Canada.
- Author
-
Villandré, Luc, Labbe, Aurélie, Brenner, Bluma, Ibanescu, Ruxandra-Ilinca, Roger, Michel, and Stephens, David A.
- Subjects
HIV infection transmission, MEN who have sex with men, PHYLOGENY, MAXIMUM likelihood statistics - Abstract
Background: Phylogenetics has been used to investigate HIV transmission among men who have sex with men. This study compares several methodologies to elucidate the role of transmission chains in the dynamics of HIV spread in Quebec, Canada. Methods: The Quebec Human Immunodeficiency Virus (HIV) genotyping program database now includes viral sequences from close to 4,000 HIV-positive individuals classified as Men who have Sex with Men (MSMs), collected between 1996 and early 2016. Assessment of chain expansion may depend on the partitioning scheme used, and so we produce estimates from several methods: the conventional Bayesian and maximum likelihood-bootstrap methods, in combination with a variety of schemes for applying a maximum distance criterion, and two other algorithms, DM-PhyClus, a Bayesian algorithm that produces a measure of uncertainty for proposed partitions, and the Gap Procedure, a fast non-phylogenetic approach. Sequences obtained from individuals in the Primary HIV Infection (PHI) stage serve to identify incident cases. We focus on the period ranging from January 1st 2012 to February 1st 2016. Results and conclusion: The analyses reveal considerable overlap between chain estimates obtained from conventional methods, thus leading to similar estimates of recent temporal expansion. The Gap Procedure and DM-PhyClus suggest, however, moderately different chains. Nevertheless, all estimates stress that longer, older chains are responsible for a sizeable proportion of the sampled incident cases among MSMs. Curbing the HIV epidemic will require strategies aimed specifically at preventing such growth. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
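The maximum-distance partitioning criterion discussed in the abstract can be sketched as single-linkage clustering under a pairwise-distance threshold, in the spirit of fast non-phylogenetic approaches like the Gap Procedure. The sequences and the 1.5% threshold below are toy values, not the study's data or cutoffs.

```python
def hamming(a, b):
    """Normalized Hamming distance between equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cluster(seqs, max_dist=0.015):
    """Single-linkage clusters: sequences join a chain if ANY member is
    within max_dist (union-find with path halving)."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if hamming(seqs[i], seqs[j]) <= max_dist:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy 100-nt sequences: three near-identical (a putative chain), one distant.
seqs = ["ACGTACGTAC" * 10,
        "ACGTACGTAC" * 10,
        ("ACGTACGTAC" * 10)[:-1] + "G",   # one mismatch: distance 0.01
        "TGCATGCATG" * 10]                # far from everything
chains = cluster(seqs)
```

Real analyses compute distances under an evolutionary model on aligned pol sequences rather than raw Hamming distance, but the partitioning logic is the same.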
12. Evaluating the role of land cover and climate uncertainties in computing gross primary production in Hawaiian Island ecosystems.
- Author
-
Kimball, Heather L., Selmants, Paul C., Moreno, Alvaro, Running, Steve W., and Giardina, Christian P.
- Subjects
LAND cover, PRIMARY productivity (Biology), ISLAND ecology, FLUX (Energy), MODIS (Spectroradiometer) - Abstract
Gross primary production (GPP) is the Earth’s largest carbon flux into the terrestrial biosphere and plays a critical role in regulating atmospheric chemistry and global climate. The Moderate Resolution Imaging Spectroradiometer (MODIS)-MOD17 data product is a widely used remote sensing-based model that provides global estimates of spatiotemporal trends in GPP. When the MOD17 algorithm is applied to regional scale heterogeneous landscapes, input data from coarse resolution land cover and climate products may increase uncertainty in GPP estimates, especially in high productivity tropical ecosystems. We examined the influence of using locally specific land cover and high-resolution local climate input data on MOD17 estimates of GPP for the State of Hawaii, a heterogeneous and discontinuous tropical landscape. Replacing the global land cover data input product (MOD12Q1) with Hawaii-specific land cover data reduced statewide GPP estimates by ~8%, primarily because the Hawaii-specific land cover map had less vegetated land area compared to the global land cover product. Replacing coarse resolution GMAO climate data with Hawaii-specific high-resolution climate data also reduced statewide GPP estimates by ~8% because of the higher spatial variability of photosynthetically active radiation (PAR) in the Hawaii-specific climate data. The combined use of both Hawaii-specific land cover and high-resolution Hawaii climate data inputs reduced statewide GPP by ~16%, suggesting equal and independent influence on MOD17 GPP estimates. Our sensitivity analyses within a heterogeneous tropical landscape suggest that refined global land cover and climate data sets may contribute to an enhanced MOD17 product at a variety of spatial scales. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
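The structure of the MOD17 light-use-efficiency model that the abstract's sensitivity analysis rests on can be sketched as follows. The ramp breakpoints and maximum light-use efficiency below are illustrative placeholder values, not entries from the official Biome Property Look-Up Table.

```python
def ramp(x, lo, hi):
    """Linear ramp scalar: 0 at/below lo, 1 at/above hi (MOD17-style)."""
    if hi == lo:
        return 1.0
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def gpp_mod17(sw_rad, fpar, tmin, vpd, lue_max=0.001165,
              tmin_lo=-8.0, tmin_hi=9.09, vpd_lo=650.0, vpd_hi=3100.0):
    """Daily GPP (kg C m-2 day-1) following the MOD17 structure:
    GPP = LUEmax * f(Tmin) * f(VPD) * fPAR * PAR, with PAR = 0.45 * SWrad.
    sw_rad in MJ m-2 day-1, tmin in deg C, vpd in Pa; parameter defaults
    are illustrative only."""
    par = 0.45 * sw_rad
    t_scalar = ramp(tmin, tmin_lo, tmin_hi)        # cold temperatures cut GPP
    v_scalar = 1.0 - ramp(vpd, vpd_lo, vpd_hi)     # high VPD (dry air) cuts GPP
    return lue_max * t_scalar * v_scalar * fpar * par
```

The paper's point falls directly out of this structure: fPAR comes via the land cover classification and PAR via the climate product, so coarse versions of either input propagate straight into the GPP estimate.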
13. Deploying a quantum annealing processor to detect tree cover in aerial imagery of California.
- Author
-
Boyda, Edward, Basu, Saikat, Ganguly, Sangram, Michaelis, Andrew, Mukhopadhyay, Supratik, and Nemani, Ramakrishna R.
- Subjects
QUANTUM annealing, COMPUTER vision, AERIAL photography, REMOTE sensing, GROUND cover plants - Abstract
Quantum annealing is an experimental and potentially breakthrough computational technology for handling hard optimization problems, including problems of computer vision. We present a case study in training a production-scale classifier of tree cover in remote sensing imagery, using early-generation quantum annealing hardware built by D-Wave Systems, Inc. Beginning within a known boosting framework, we train decision stumps on texture features and vegetation indices extracted from four-band, one-meter-resolution aerial imagery from the state of California. We then impose a regularized quadratic training objective to select an optimal voting subset from among these stumps. The votes of the subset define the classifier. For optimization, the logical variables in the objective function map to quantum bits in the hardware device, while quadratic couplings encode as the strength of physical interactions between the quantum bits. Hardware design limits the number of couplings between these basic physical entities to five or six. To account for this limitation in mapping large problems to the hardware architecture, we propose a truncation and rescaling of the training objective through a trainable metaparameter. The boosting process on our basic 108- and 508-variable problems, thus constituted, returns classifiers that incorporate a diverse range of color- and texture-based metrics and discriminate tree cover with accuracies as high as 92% in validation and 90% on a test scene encompassing the open space preserves and dense suburban build of Mill Valley, CA. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
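The quadratic stump-selection objective described in the abstract can be sketched as a QBoost-style binary optimization. Because the objective below is quadratic in the binary selection vector, it maps directly onto a QUBO for an annealer; here it is brute-forced classically on an invented three-stump, four-sample toy problem.

```python
import itertools

def qubo_objective(w, H, y, lam=0.1):
    """QBoost-style objective: squared error of the fixed-scale committee
    vote plus a sparsity penalty lam * |w|. With the 1/n scaling the cost
    is quadratic in the binary w, i.e. a genuine QUBO."""
    n = len(w)
    cost = lam * sum(w)
    for s in range(len(y)):
        vote = sum(w[i] * H[i][s] for i in range(n)) / n
        cost += (vote - y[s]) ** 2
    return cost

# Toy stumps: rows are weak classifiers' votes (+1/-1) on four samples.
H = [[+1, +1, -1, -1],     # accurate stump
     [+1, +1, -1, -1],     # accurate stump
     [-1, +1, +1, -1]]     # noisy stump
y = [+1, +1, -1, -1]       # ground-truth labels

# Classical stand-in for the annealer: enumerate all 2^3 subsets.
best = min(itertools.product([0, 1], repeat=len(H)),
           key=lambda w: qubo_objective(w, H, y))
```

The optimizer keeps the two accurate stumps and drops the noisy one; on the hardware, each w[i] becomes a qubit and each quadratic cross-term a coupling strength, which is where the five-to-six-coupling hardware limit the abstract mentions starts to bite.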
14. Fuel shortages during hurricanes: Epidemiological modeling and optimal control
- Author
-
Sirish Namilae, Dahai Liu, Sabique Islam, and Richard J. Prazenica
- Subjects
Hurricanes, Cyclonic Storms, Fuel Shortages, Gasoline, Disaster Planning, Emergency Management, Optimal Control, Kalman Filter, Epidemic Models, Florida - Abstract
Hurricanes are powerful agents of destruction with significant socioeconomic impacts. A persistent problem during the large-scale evacuations that hurricanes trigger in the southeastern United States is fuel shortage. Computational models can aid in emergency preparedness and help mitigate the impacts of hurricanes. In this paper, we model hurricane fuel shortages using the SIR epidemic model. We utilize crowd-sourced data corresponding to Hurricanes Irma and Florence to parametrize the model. An estimation technique based on the Unscented Kalman Filter (UKF) is employed to evaluate the SIR dynamic parameters. Finally, an optimal control approach for refueling, based on a vaccination analogue, is presented to effectively reduce fuel shortages under a resource constraint. We find the basic reproduction number corresponding to fuel shortages in Miami during Hurricane Irma to be 3.98. Using the control model, we estimated the level of intervention needed to mitigate the fuel-shortage epidemic. For example, our results indicate that for Naples-Fort Myers during Hurricane Irma, a per capita refueling rate of 0.1 for 2.2 days would have reduced the peak fuel shortage from 55% to 48%, and a refueling rate of 0.75 for half a day before landfall would have reduced it to 37%.
- Published
- 2019
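The SIR framing in the abstract above can be sketched with a simple forward-Euler integration, using the reported Miami basic reproduction number R0 = 3.98 to set beta/gamma. The refueling control is modeled crudely as an extra removal rate; the specific rate values, initial conditions, and time units are invented for illustration, not the paper's fitted parameters.

```python
def sir_peak(beta, gamma, extra_recovery=0.0,
             s0=0.99, i0=0.01, dt=0.01, t_max=60.0):
    """Forward-Euler SIR; returns the peak 'infected' (fuel-short) fraction.
    `extra_recovery` plays the role of the refueling control: a vaccination
    analogue that removes fuel-short stations faster."""
    s, i, r = s0, i0, 0.0
    peak = i
    for _ in range(int(t_max / dt)):
        new_inf = beta * s * i * dt
        new_rec = (gamma + extra_recovery) * i * dt
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
        peak = max(peak, i)
    return peak

beta, gamma = 1.0, 1.0 / 3.98      # chosen so beta/gamma = R0 = 3.98 (Miami, Irma)
peak_no_ctrl = sir_peak(beta, gamma)
peak_ctrl = sir_peak(beta, gamma, extra_recovery=0.1)
```

With R0 near 4 the uncontrolled peak engulfs roughly 40% of stations, and raising the removal rate lowers that peak, which is the qualitative effect the paper quantifies with its optimal-control formulation.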
15. Navigating optimal treaty-shopping routes using a multiplex network model
- Author
-
Sung Jae Park, Kyu-Min Lee, and Jae-Suk Yang
- Subjects
Tax Treaties, Taxation, Tax Avoidance, Income Tax, Multiplex Networks, Multinational Corporations, Foreign Direct Investment, Centrality, Network Analysis - Abstract
The international tax treaty system is a highly integrated and complex network. In this system, many multinational enterprises (MNEs) explore ways of reducing taxes by choosing optimal detour routes. Treaty abuse by these MNEs causes significant loss of tax revenues for many countries, but there is no systematic way of regulating their actions. However, it may be helpful to find a way of detecting the optimal routes by which MNEs avoid taxes and observe the effects of this behavior. In this paper, we investigate the international tax treaty network system of foreign investment channels based on real data and introduce a novel measure of tax-routing centrality and other centralities via network analysis. Our analysis of tax routing in a multiplex network reveals not only various tax-minimizing routes and their rates, but also new paths which cannot be found by navigating a single network layer. In addition, we identify strongly connected components of the multiplex tax treaty system with minimal tax shopping routes; more than 80 countries are included in this system. This means that there are far more pathways to be observed than can be detected on any individual layer. We provide a unified framework for analyzing the international tax treaty system and observing the effects of tax avoidance by MNEs.
- Published
- 2021
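The route-finding step in the abstract can be sketched on a single layer of the multiplex network. Repeated withholding is multiplicative (the income surviving a route is the product of (1 - rate) over its hops), so maximizing it becomes an additive shortest-path problem via -log(1 - rate). The countries and rates below are invented; the paper uses real treaty data and multiple layers.

```python
import heapq
import math

# Hypothetical withholding-tax rates on flows between jurisdictions
# (one layer of a multiplex treaty network; all rates invented).
rates = {
    ('US', 'KR'): 0.15,                       # direct route
    ('US', 'NL'): 0.05, ('NL', 'KR'): 0.05,   # conduit via NL
    ('US', 'HU'): 0.10, ('HU', 'KR'): 0.10,   # conduit via HU
}

def best_route(rates, src, dst):
    """Route maximizing surviving income: Dijkstra on -log(1 - rate)
    weights, returning (path, fraction of income kept)."""
    graph = {}
    for (u, v), r in rates.items():
        graph.setdefault(u, []).append((v, -math.log(1.0 - r)))
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], math.exp(-dist[dst])

route, kept = best_route(rates, 'US', 'KR')
```

In this toy setup the detour through NL keeps 90.25% of the income versus 85% for the direct route, exactly the kind of treaty-shopping path the paper's centrality measures are designed to surface.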
16. Economic development and wage inequality: A complex system analysis
- Author
-
Emanuele Pugliese, Luciano Pietronero, and Angelica Sbardella
- Subjects
Economic Development, Wage Inequality, Complex Systems, Economic Geography, Labor Economics, Salaries and Fringe Benefits, Industrialization, Income Inequality Metrics, United States - Abstract
By borrowing methods from complex system analysis, in this paper we analyze the features of the complex relationship that links the development and industrialization of a country to economic inequality. To do this, we identify industrialization as a combination of a monetary index, GDP per capita, and a recently introduced measure of the complexity of an economy, the Fitness. We first explore these relations on a global scale over the period 1990-2008, focusing on two different dimensions of inequality: the capital share of income and a Theil measure of wage inequality. In both cases, the movement of inequality follows a pattern similar to the one theorized by Kuznets in the fifties. We then narrow the object of study and concentrate on wage inequality within the United States. By employing data on wages and employment for the approximately 3100 US counties over the interval 1990-2014, we generalize the Fitness-Complexity algorithm for counties and NAICS sectors, and we investigate wage inequality between industrial sectors within counties. At this scale, in the early nineties we recover a behavior similar to the global one, while in more recent years we uncover a trend reversal: wage inequality monotonically increases as industrialization levels grow. Hence, at the county level, net of the social and institutional factors that differ among countries, we observe not only an upturn in inequality but also a change in the structure of the relation between wage inequality and development.
- Published
- 2017
17. Large-scale probabilistic identification of boreal peatlands using Google Earth Engine, open-access satellite data, and machine learning
- Author
-
Jahan Kariyeva, Jennifer N. Hird, Evan R. DeLancey, and Jason T. Bried
- Subjects
Topography ,Earth observation ,Peat ,010504 meteorology & atmospheric sciences ,Earth, Planet ,0211 other engineering and technologies ,Marine and Aquatic Sciences ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,Geographical locations ,Alberta ,Ecosystem services ,Machine Learning ,Remote Sensing ,Bogs ,Taiga ,Satellite imagery ,Lidar ,Multidisciplinary ,Applied Mathematics ,Simulation and Modeling ,Biodiversity ,Physical Sciences ,Medicine ,Engineering and Technology ,Algorithms ,Research Article ,Freshwater Environments ,Conservation of Natural Resources ,Canada ,Computer and Information Sciences ,Science ,Climate Change ,Climate change ,Research and Analysis Methods ,Machine learning ,Machine Learning Algorithms ,Surface Water ,Artificial Intelligence ,Resource management ,Ecosystem ,Fens ,021101 geological & geomatics engineering ,0105 earth and related environmental sciences ,Landforms ,Radar ,business.industry ,Ecology and Environmental Sciences ,Aquatic Environments ,Geomorphology ,Carbon ,Boreal ,Wetlands ,North America ,Earth Sciences ,Environmental science ,Artificial intelligence ,Hydrology ,People and places ,Scale (map) ,business ,computer ,Mathematics - Abstract
Freely-available satellite data streams and the ability to process these data on cloud-computing platforms such as Google Earth Engine have made frequent, large-scale landcover mapping at high resolution a real possibility. In this paper we apply these technologies, along with machine learning, to the mapping of peatlands (a landcover class that is critical for preserving biodiversity, addressing climate change impacts, and providing ecosystem services such as carbon storage) in the Boreal Forest Natural Region of Alberta, Canada. We outline a data-driven, scientific framework that: compiles large Earth observation data sets (radar, optical, and LiDAR); examines the extracted variables for suitability in peatland modelling; optimizes model parameterization; and finally, predicts peatland occurrence across a large boreal area (397,958 km²) of Alberta at 10 m spatial resolution (equalling 3.9 billion pixels across Alberta). The resulting peatland occurrence model shows an accuracy of 87% and a kappa statistic of 0.57 when compared to our validation data set. Differentiating peatlands from mineral wetlands achieved an accuracy of 69% and a kappa statistic of 0.37. This data-driven approach is applicable at large geopolitical scales (e.g., provincial, national) for wetland and landcover inventories that support long-term, responsible resource management.
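The accuracy and kappa statistics reported above can be computed from any classification confusion matrix; a short sketch of the standard formulas follows (the example matrix in the test is hypothetical, not the paper's validation data):

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    Kappa discounts the agreement expected by chance alone.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    p_o = np.trace(cm) / total                                  # observed agreement
    p_e = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total**2    # chance agreement
    return p_o, (p_o - p_e) / (1.0 - p_e)
```

A high accuracy with a modest kappa (as in the 87% / 0.57 result above) typically signals class imbalance, where chance agreement is already high.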
- Published
- 2019
18. Using Tensor Completion Method to Achieving Better Coverage of Traffic State Estimation from Sparse Floating Car Data
- Author
-
Yang Cheng, Jian Zhang, Huachun Tan, Li Song, and Bin Ran
- Subjects
Computer science ,Aviation ,Intelligence ,lcsh:Medicine ,Social Sciences ,Transportation ,02 engineering and technology ,Geographical locations ,Mathematical and Statistical Techniques ,0202 electrical engineering, electronic engineering, information engineering ,Range (statistics) ,Computer Science::Networking and Internet Architecture ,Psychology ,lcsh:Science ,Intelligent transportation system ,Principal Component Analysis ,Multidisciplinary ,geography.geographical_feature_category ,Applied Mathematics ,Simulation and Modeling ,05 social sciences ,Floating car data ,Transportation Infrastructure ,Physical Sciences ,Engineering and Technology ,020201 artificial intelligence & image processing ,Algorithm ,Algorithms ,Statistics (Mathematics) ,Network analysis ,Research Article ,Optimization ,Computer and Information Sciences ,Research and Analysis Methods ,Civil Engineering ,Wisconsin ,0502 economics and business ,Computer Simulation ,Tensor ,Statistical Methods ,Traffic generation model ,050210 logistics & transportation ,geography ,business.industry ,lcsh:R ,Cognitive Psychology ,Biology and Life Sciences ,Computing Methods ,United States ,Roads ,ComputerSystemsOrganization_MISCELLANEOUS ,Multivariate Analysis ,North America ,Cognitive Science ,lcsh:Q ,State (computer science) ,People and places ,business ,Automobiles ,Mathematics ,Water well ,Neuroscience - Abstract
Traffic state estimation from a floating car system is a challenging problem. Because of the low penetration rate and random distribution, the available floating car samples usually cover only part of the space and time points of the road network. To obtain wide-ranging traffic state information from the floating car system, many methods have been proposed to estimate the traffic state of uncovered links. However, these methods cannot provide the traffic state of the entire road network. In this paper, traffic state estimation is reformulated as a missing data imputation problem, and a tensor completion framework is proposed to estimate the missing traffic state. A tensor is constructed to model the traffic state, in which observed entries are derived directly from the floating car system and unobserved traffic states are modeled as missing entries of the constructed tensor. The constructed traffic state tensor can represent the spatial and temporal correlations of traffic data and encode the multi-way properties of traffic state. The advantage of the proposed approach is that it can fully mine and utilize the multi-dimensional inherent correlations of traffic state. We tested the proposed approach on a well-calibrated simulation network. Experimental results demonstrate that the proposed approach yields reliable traffic state estimates from very sparse floating car data, particularly when the floating car penetration rate is below 1%.
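The completion idea, imputing unobserved entries under a low-rank assumption while keeping observed entries fixed, can be sketched on a single unfolding (matricization) of the traffic tensor. This is a generic iterated truncated-SVD scheme, not the paper's specific algorithm; the rank and iteration count are illustrative assumptions.

```python
import numpy as np

def complete_lowrank(X, mask, rank=2, n_iter=200):
    """Impute missing entries of a 2-D unfolding by iterated truncated SVD.

    X: 2-D array (one unfolding of the traffic tensor).
    mask: boolean array, True where the entry was observed (floating car data).
    Missing entries start at the observed mean and are refined each iteration.
    """
    Y = np.where(mask, X, X[mask].mean())
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-r approximation
        Y = np.where(mask, X, low)                    # keep observed entries fixed
    return Y
```

The alternation between the low-rank projection and resetting the observed entries is what lets spatial-temporal correlations (encoded in the low-rank structure) fill in the gaps.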
- Published
- 2016
19. Avian Influenza Risk Surveillance in North America with Online Media
- Author
-
Lauren Yee and Colin Robertson
- Subjects
Epidemiology ,lcsh:Medicine ,Social Sciences ,Disease Outbreaks ,Animal Diseases ,0302 clinical medicine ,Sociology ,Zoonoses ,Medicine and Health Sciences ,030212 general & internal medicine ,lcsh:Science ,Statistical Data ,Multidisciplinary ,Warning system ,Animal Behavior ,Social Communication ,Geography ,Infectious Diseases ,Data extraction ,Social Networks ,Veterinary Diseases ,Vertebrates ,Physical Sciences ,The Internet ,Algorithms ,Network Analysis ,Statistics (Mathematics) ,Research Article ,Avian Influenza ,Computer and Information Sciences ,Infectious Disease Control ,030231 tropical medicine ,Twitter ,Context (language use) ,Disease Surveillance ,Digital media ,Birds ,03 medical and health sciences ,Animal Influenza ,Animals ,Social media ,Internet ,Behavior ,Operationalization ,business.industry ,lcsh:R ,Organisms ,Biology and Life Sciences ,Data science ,Communications ,Risk perception ,Influenza in Birds ,Infectious Disease Surveillance ,North America ,Amniotes ,lcsh:Q ,Animal Migration ,Veterinary Science ,business ,Social Media ,Zoology ,Mathematics - Abstract
The use of Internet-based sources of information for health surveillance applications has increased in recent years, as a greater share of social and media activity happens through online channels. The potential surveillance values of online information about emergent health events include early warning, situational awareness, risk perception, and evaluation of health messaging, among others. The challenge in harnessing these sources of data lies in the vast number of potential sources to monitor and in developing the tools to translate dynamic, unstructured content into actionable information. In this paper we investigated the use of one social media outlet, Twitter, for surveillance of avian influenza (AI) risk in North America. We collected AI-related messages over a five-month period and compared these to official surveillance records of AI outbreaks. A fully automated data extraction and analysis pipeline was developed to acquire, structure, and analyze social media messages in an online context. Two methods of outbreak detection, a static threshold and a cumulative-sum dynamic threshold, both based on a time series model of normal activity, were evaluated for their ability to discern important periods of AI-related messaging and media activity. Our findings show that peaks in activity were related to real-world events, with outbreaks in Nigeria, France, and the USA receiving the most attention, while those in China were less evident in the social media data. Topic models found themes related to specific AI events for the dynamic threshold method, while many themes for the static method were ambiguous. Further analyses of these data might focus on quantifying the bias in coverage and the relation between outbreak characteristics and detectability in social media data.
Finally, while the analyses here focused on broad themes and trends, there is likely additional value in developing methods for identifying low-frequency messages, operationalizing this methodology into a comprehensive system for visualizing patterns extracted from the Internet, and integrating these data with other sources of information such as wildlife, environmental, and agricultural data.
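The two detection schemes the abstract compares, a static threshold and a one-sided cumulative-sum (CUSUM) dynamic threshold, can be sketched as follows. The drift parameter `k`, decision limit `h`, and baseline statistics are illustrative assumptions, not the study's fitted values.

```python
import numpy as np

def static_alarms(counts, baseline_mean, baseline_std, z=2.0):
    """Flag time steps whose message count exceeds a fixed z-score threshold."""
    return counts > baseline_mean + z * baseline_std

def cusum_alarms(counts, baseline_mean, baseline_std, k=0.5, h=4.0):
    """One-sided CUSUM: accumulate standardized excesses, alarm when S > h."""
    s, alarms = 0.0, []
    for x in counts:
        z = (x - baseline_mean) / baseline_std
        s = max(0.0, s + z - k)      # drift k absorbs normal fluctuation
        alarms.append(s > h)
        if s > h:
            s = 0.0                  # reset the statistic after an alarm
    return np.array(alarms)
```

The key difference: a static threshold only catches single large spikes, while CUSUM accumulates evidence and also catches sustained moderate elevations in activity.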
- Published
- 2016
20. Accurate and fast path computation on large urban road networks: A general approach
- Author
-
Meng Li, Xiaolei Li, and Qing Song
- Subjects
Optimization ,Computer and Information Sciences ,Traverse ,Urban Population ,Computer science ,Heuristic (computer science) ,Computation ,New York ,0211 other engineering and technologies ,lcsh:Medicine ,Social Sciences ,Transportation ,02 engineering and technology ,Fast path ,Research and Analysis Methods ,Civil Engineering ,Geographical locations ,Sociology ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Pruning (decision trees) ,lcsh:Science ,021103 operations research ,Multidisciplinary ,Heuristic ,Applied Mathematics ,Simulation and Modeling ,lcsh:R ,Graph partition ,Transportation Infrastructure ,United States ,Navigation ,Roads ,Signaling Networks ,Hierarchical clustering ,Social Networks ,Computer engineering ,Physical Sciences ,North America ,Path (graph theory) ,Engineering and Technology ,lcsh:Q ,New York City ,People and places ,Routing (electronic design automation) ,Algorithms ,Mathematics ,Network Analysis ,Research Article - Abstract
Accurate and fast path computation is essential for applications such as onboard navigation systems and traffic network routing. While a number of heuristic algorithms have been developed in recent years for faster path queries, their accuracy often falls far short of satisfactory. In this paper, we first develop an agglomerative graph partitioning method for generating well-balanced traverse-distance partitions, and we construct a three-level graph model based on the partition scheme for structuring the urban road network. Then, we propose a new hierarchical path computation algorithm, which benefits from the hierarchical graph model and utilizes a region pruning strategy to significantly reduce the search space without compromising accuracy. Finally, we present a detailed experimental evaluation on the real urban road network of New York City; the results demonstrate the effectiveness of the proposed approach in generating optimal paths quickly and in facilitating real-time routing applications.
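The region-pruning idea, restricting the search to nodes inside candidate partitions, can be illustrated with a pruned Dijkstra sketch. This is a generic illustration; the paper's three-level graph model and partitioning scheme are not reproduced here, and the `allowed` set stands in for whatever candidate regions the hierarchy would select.

```python
import heapq

def dijkstra(graph, src, dst, allowed=None):
    """Shortest path with optional region pruning.

    graph: {node: [(neighbor, weight), ...]}.
    allowed: optional set of nodes the search may expand into; nodes
    outside it are skipped, shrinking the search space (the destination
    is always admitted so a path can terminate).
    """
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                          # stale queue entry
        for v, w in graph.get(u, []):
            if allowed is not None and v not in allowed and v != dst:
                continue                      # prune nodes outside candidate regions
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    path, node = [], dst                      # reconstruct dst -> src
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return dist[dst], path[::-1]
```

Note the tradeoff the paper addresses: pruning preserves optimality only if the allowed regions are guaranteed to contain a shortest path; an overly aggressive region set yields a fast but suboptimal route.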
- Published
- 2018
21. Relay discovery and selection for large-scale P2P streaming
- Author
-
Angela Yunxian Wang, Chengwei Zhang, and Xiaojun Hei
- Subjects
Computer and Information Sciences ,Computer science ,Vector Spaces ,lcsh:Medicine ,02 engineering and technology ,Research and Analysis Methods ,Geographical locations ,law.invention ,Computer Communication Networks ,Relay ,law ,0202 electrical engineering, electronic engineering, information engineering ,Overhead (computing) ,Computer Networks ,lcsh:Science ,Selection (genetic algorithm) ,Ohio ,Internet ,Key generation ,Multidisciplinary ,business.industry ,Applied Mathematics ,Simulation and Modeling ,Node (networking) ,lcsh:R ,ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS ,020206 networking & telecommunications ,Relays ,United States ,Algebra ,Linear Algebra ,Physical Sciences ,North America ,Cryptography ,Engineering and Technology ,Bandwidth (Computing) ,lcsh:Q ,020201 artificial intelligence & image processing ,Electronics ,People and places ,business ,Mathematics ,Algorithms ,Research Article ,Computer network - Abstract
In peer-to-peer networks, application relays are commonly used to provide various networking services. Service performance often improves significantly if a relay is selected appropriately based on its network location. In this paper, we studied the location-aware relay discovery and selection problem for large-scale P2P streaming networks. In these large-scale and dynamic overlays, it incurs significant communication and computation cost to discover a sufficiently large relay candidate set and then to select one relay with good performance. Network location can be measured directly or indirectly, with tradeoffs among timeliness, overhead, and accuracy. Based on a measurement study and the associated error analysis, we demonstrate that indirect measurements, such as King and Internet Coordinate Systems (ICS), can only achieve a coarse estimate of peers' network locations, and that methods based on purely indirect measurements cannot lead to a good relay selection. We also demonstrate, using three publicly available RTT data sets, that there is significant error amplification in the commonly used "best-out-of-K" selection methodology. We propose a two-phase approach to achieve efficient relay discovery and accurate relay selection. Indirect measurements are used to narrow the field to a small number of high-quality relay candidates, and the final relay selection is refined by direct probing. This two-phase approach admits an efficient implementation using a Distributed Hash Table (DHT). When the DHT is constructed, the node keys carry location information and are generated scalably from indirect measurements, such as ICS coordinates. Relay discovery is then achieved efficiently via DHT-based search. We evaluated various aspects of this DHT-based approach, including the DHT indexing procedure, key generation under peer churn, and message costs.
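The two-phase selection, shortlisting by indirect coordinate estimates and refining with direct probes, can be sketched as below. The Euclidean-coordinate metric, candidate set, and `probe_rtt` callback are illustrative assumptions; the paper's DHT-based discovery is not reproduced here.

```python
import math

def select_relay(my_coord, candidates, probe_rtt, k=3):
    """Two-phase relay selection.

    Phase 1: rank all candidates by a cheap indirect estimate (distance
    between network coordinates, as an ICS would provide) and keep the
    top k. Phase 2: issue one direct RTT probe per shortlisted relay and
    pick the true minimum.

    candidates: {relay_id: coordinate tuple}
    probe_rtt: callable mapping relay_id -> measured RTT (ms)
    """
    def estimate(coord):
        return math.dist(my_coord, coord)     # indirect, coarse estimate
    shortlist = sorted(candidates, key=lambda r: estimate(candidates[r]))[:k]
    return min(shortlist, key=probe_rtt)      # direct probing refines the choice
```

The test below shows why the refinement matters: a purely indirect ranking would pick the coordinate-nearest relay, while the direct probes reveal a better relay within the shortlist.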
- Published
- 2017