Author: "Charles C. Taylor" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Charles C. Taylor"' showing total 157 results

1. Bayesian CART models for aggregate claim modeling.

Author: Yaojun Zhang, Lanpeng Ji, Georgios Aivaliotis, and Charles C. Taylor
Published: 2024
Full Text: View/download PDF

2. Interval forecasts based on regression trees for streaming data.

Author: Xin Zhao, Stuart Barber, Charles C. Taylor, and Zoka Milan
Published: 2021
Full Text: View/download PDF

3. A New Approach to Measuring Distances in Dense Graphs.

Author: Fatimah A. Almulhim, Peter A. Thwaites, and Charles C. Taylor
Published: 2018
Full Text: View/download PDF

4. Sparse modelling of cancer patients' survival based on genomic copy number alterations.

Author: Khaled Alqahtani, Charles C. Taylor, Henry M. Wood, and Arief Gusnanto
Published: 2022
Full Text: View/download PDF

5. Classification tree methods for panel data using wavelet-transformed time series.

Author: Xin Zhao, Stuart Barber, Charles C. Taylor, and Zoka Milan
Published: 2018
Full Text: View/download PDF

6. Cross-validation is safe to use.

Author: Ross D. King, Oghenejokpeme I. Orhobor, and Charles C. Taylor
Published: 2021
Full Text: View/download PDF

7. Kernel regression for errors-in-variables problems in the circular domain

Author: Marco Di Marzio, Stefania Fensore, and Charles C. Taylor
Subjects: Statistics and Probability, Statistics, Probability and Uncertainty
Abstract: We study the problem of estimating a regression function when the predictor and/or the response are circular random variables in the presence of measurement errors. We propose estimators whose weight functions are deconvolution kernels defined according to the nature of the involved variables. We derive the asymptotic properties of the proposed estimators and consider possible generalizations and extensions. We provide some simulation results and a real data case study to illustrate and compare the proposed methods.
Published: 2023
Full Text: View/download PDF

8. Estimating optimal window size for analysis of low-coverage next-generation sequence data.

Author: Arief Gusnanto, Charles C. Taylor, Ibrahim Nafisah, Henry M. Wood, Pamela Rabbitts, and Stefano Berri
Published: 2014
Full Text: View/download PDF

9. The package: nonparametric regression using local rotation matrices in

Author: Giovanni Lafratta, Charles C. Taylor, Marco Di Marzio, and Stefania Fensore
Subjects: Statistics and Probability, 021103 operations research, Applied Mathematics, 0211 other engineering and technologies, Nonparametric statistics, 02 engineering and technology, Rotation matrix, 01 natural sciences, Regression, Bias reduction, Nonparametric regression, 010104 statistics & probability, Modeling and Simulation, Statistics, Singular value decomposition, 0101 mathematics, Statistics, Probability and Uncertainty, MIT License, Mathematics
Abstract: The package implements nonparametric (smooth) regression for spherical data in , and is freely available from the Comprehensive Archive Network (CRAN), licensed under the MIT License. It can be use...
Published: 2021
Full Text: View/download PDF

10. Spatio-temporal forecasting using wavelet transform-based decision trees with application to air quality and covid-19 forecasting

Author: Xin Zhao, Stuart Barber, Charles C. Taylor, Xiaokai Nie, and Wenqian Shen
Subjects: Statistics and Probability, Articles, Statistics, Probability and Uncertainty
Abstract: We develop a new method that combines a decision tree with a wavelet transform to forecast time series data with spatial spillover effects. The method can not only improve prediction but also give good interpretability of the time series mechanism. As a feature exploration method, the wavelet transform represents information at different resolution levels, which may improve the performance of decision trees. The method is applied to simulated data, air pollution and COVID time series data sets. In the simulation, Haar, LA8, D4 and D6 wavelets are compared, with the Haar wavelet having the best performance. In the air pollution application, by using wavelet transform-based decision trees, the temporal effect of air quality index including autoregressive and seasonal effects can be described as well as the spatial correlation effect. To describe the spillover spatial effect in contiguous regions, a spatial weight is constructed to improve the modeling performance. The results show that air quality index has autoregressive, seasonal and spatial spillover effects. The wavelet transformed variables have a better forecasting performance and enhanced interpretability than the original variables. For the COVID time series of cumulative cases, spatial weighted variables are not selected which shows the lock-down policies are truly effective.
Published: 2022

11. A comparison of block and semi-parametric bootstrap methods for variance estimation in spatial statistics.

Author: N. Iranpanah, Mohsen Mohammadzadeh, and Charles C. Taylor
Published: 2011
Full Text: View/download PDF

12. Properties and approximate p-value calculation of the Cramer test

Author: Arief Gusnanto, Charles C. Taylor, Alison Telford, and Henry M. Wood
Subjects: Statistics and Probability, Anderson–Darling test, Applied Mathematics, Cumulative distribution function, Variance (accounting), Test (assessment), Distribution (mathematics), Modeling and Simulation, Cramér–von Mises criterion, Statistics, p-value, Statistics, Probability and Uncertainty, Null hypothesis, Mathematics
Abstract: Two-sample tests are probably the most commonly used tests in statistics. These tests generally address one aspect of the samples' distribution, such as mean or variance. When the null hypothesis is that two distributions are equal, the Anderson–Darling (AD) test, which is developed from the Cramer–von Mises (CvM) test, is generally employed. Unfortunately, we find that the AD test often fails to identify true differences when the differences are complex: they are not only in terms of mean, variance and/or skewness but also in terms of multi-modality. In such cases, we find that the Cramer test, a modification of the CvM test, performs well. However, the adoption of the Cramer test in routine analysis is hindered by the fact that the mean, variance and skewness of the test statistic are not available, which results in the problem of calculating the associated p-value. For this purpose, we propose a new method for obtaining a p-value by approximating the distribution of the test statistic by a generalized Pareto distribution. By approximating the distribution in this way, the calculation of the p-value is much faster than, e.g., the bootstrap method, especially for large n. We have observed that this approximation enables the Cramer test to have proper control of type-I error. A simulation study indicates that the Cramer test is as powerful as other tests in simple cases and more powerful in more complicated cases.
Published: 2020
Full Text: View/download PDF

13. Density estimation for circular data observed with errors

Author: Charles C. Taylor, Stefania Fensore, Marco Di Marzio, and Agnese Panzera
Subjects: Statistics and Probability, General Immunology and Microbiology, Applied Mathematics, Estimator, General Medicine, Density estimation, General Biochemistry, Genetics and Molecular Biology, Bias, Simple (abstract algebra), Kernel (statistics), Computer Simulation, Deconvolution, General Agricultural and Biological Sciences, Equivalence (measure theory), Fourier series, Algorithm, Smoothing, Mathematics
Abstract: Until now the problem of estimating circular densities when data are observed with errors has been mainly treated by Fourier series methods. We propose kernel-based estimators exhibiting simple construction and easy implementation. Specifically, we consider three different approaches: the first one is based on the equivalence between kernel estimators using data corrupted with different levels of error. This proposal appears to be totally unexplored, despite its potential for application also in the Euclidean setting. The second approach relies on estimators whose weight functions are circular deconvolution kernels. Due to the periodicity of the involved densities, it requires ad hoc mathematical tools. Finally, the third one is based on the idea of correcting extra bias of kernel estimators which use contaminated data and is essentially an adaptation of the standard theory to the circular case. For all the proposed estimators, we derive asymptotic properties, provide some simulation results, and also discuss some possible generalizations and extensions. Real data case studies are also included.
Published: 2022

14. Evaluating Usefulness for Dynamic Classification.

Author: Gholamreza Nakhaeizadeh, Charles C. Taylor, and Carsten Lanquillon
Published: 1998

15. Automatic bandwidth selection for circular density estimation.

Author: Charles C. Taylor
Published: 2008
Full Text: View/download PDF

16. Learning in Dynamically Changing Domains: Theory Revision and Context Dependence Issues.

Author: Charles C. Taylor and Gholamreza Nakhaeizadeh
Published: 1997
Full Text: View/download PDF

17. Statistical Aspects of Classification in Drifting Populations.

Author: Charles C. Taylor, Gholamreza Nakhaeizadeh, and G. Kunisch
Published: 1997

18. The Poisson Index: a new probabilistic model for protein-ligand binding site similarity.

Author: J. R. Davies, Richard M. Jackson, Kanti V. Mardia, and Charles C. Taylor
Published: 2007
Full Text: View/download PDF

19. Classification of type I-censored bivariate data.

Author: Matthew J. Langdon, Charles C. Taylor, and Robert M. West
Published: 2007
Full Text: View/download PDF

20. Hierarchical Bayesian modelling of spatial age-dependent mortality.

Author: N. Miklós Arató, Ian L. Dryden, and Charles C. Taylor
Published: 2006
Full Text: View/download PDF

21. An Understanding of Muscle Fibre Images.

Author: Charles C. Taylor, Mohammed Reza Faghihi, and Ian L. Dryden
Published: 1995
Full Text: View/download PDF

22. On boosting kernel density methods for multivariate data: density estimation and classification.

Author: Marco Di Marzio and Charles C. Taylor
Published: 2005
Full Text: View/download PDF

23. Kernel density classification and boosting: an L2 analysis.

Author: Marco Di Marzio and Charles C. Taylor
Published: 2005
Full Text: View/download PDF

24. Chain plot: a tool for exploiting bivariate temporal structures.

Author: Charles C. Taylor and András Zempléni
Published: 2004
Full Text: View/download PDF

25. Statistical Methods in Learning.

Author: A. Sutherland, Bob Henery, Rafael Molina 0001, Charles C. Taylor, and Ross D. King
Published: 1992
Full Text: View/download PDF

26. Procrustes shape analysis of triangulations of a two coloured point pattern.

Author: Mohammed Reza Faghihi, Charles C. Taylor, and Ian L. Dryden
Published: 1999
Full Text: View/download PDF

27. Interval forecasts based on regression trees for streaming data

Author: Stuart Barber, Charles C. Taylor, Zoka Milan, and Xin Zhao
Subjects: Statistics and Probability, Computer science, Test data generation, Applied Mathematics, Autoregressive conditional heteroskedasticity, CPU time, Inference, 02 engineering and technology, Interval (mathematics), 01 natural sciences, Regression, Computer Science Applications, 010104 statistics & probability, Tree (data structure), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Autoregressive integrated moving average, 0101 mathematics, Algorithm
Abstract: In forecasting, we often require interval forecasts instead of just a specific point forecast. To track streaming data effectively, this interval forecast should reliably cover the observed data and yet be as narrow as possible. To achieve this, we propose two methods based on regression trees: one ensemble method and one method based on a single tree. For the ensemble method, we use weighted results from the most recent models, and for the single-tree method, we retain one model until it becomes necessary to train a new model. We propose a novel method to update the interval forecast adaptively using root mean square prediction errors calculated from the latest data batch. We use wavelet-transformed data to capture long time variable information and conditional inference trees for the underlying regression tree model. Results show that both methods perform well, having good coverage without the intervals being excessively wide. When the underlying data generation mechanism changes, their performance is initially affected but can recover relatively quickly as time proceeds. The method based on a single tree performs the best in computational (CPU) time compared to the ensemble method. When compared to ARIMA and GARCH modelling, our methods achieve better or similar coverage and width but require considerably less CPU time.
Published: 2019
Full Text: View/download PDF

28. Fluid shear stress stimulates breast cancer cells to display invasive and chemoresistant phenotypes while upregulating PLAU in a 3D bioreactor

Author: Caymen Novak, Catherine Z. Liu, Eric N. Horst, Charles C. Taylor, and Geeta Mehta
Subjects: 0106 biological sciences, 0301 basic medicine, Breast Neoplasms, Bioengineering, 01 natural sciences, Applied Microbiology and Biotechnology, Article, Metastasis, Extracellular matrix, 03 medical and health sciences, Bioreactors, Breast cancer, Downregulation and upregulation, 010608 biotechnology, medicine, Shear stress, Humans, Neoplasm Invasiveness, Mechanotransduction, Tumor microenvironment, Chemistry, Membrane Proteins, medicine.disease, Neoplasm Proteins, Up-Regulation, Gene Expression Regulation, Neoplastic, 030104 developmental biology, Drug Resistance, Neoplasm, Cancer cell, MCF-7 Cells, Cancer research, Female, Stress, Mechanical, Shear Strength, Biotechnology
Abstract: Breast cancer cells experience a range of shear stresses in the tumor microenvironment (TME). However, most current in vitro three-dimensional (3D) models fail to systematically probe the effects of these biophysical stimuli on cancer cell metastasis, proliferation and chemoresistance. To investigate the roles of shear stress within the mammary and lung pleural effusion TME, a bioreactor capable of applying shear stress to cells within a 3D extracellular matrix was designed and characterized. Breast cancer cells were encapsulated within an interpenetrating network (IPN) hydrogel and subjected to shear stress of 5.4 dynes cm⁻² for 72 hours. Finite element modeling assessed shear stress profiles within the bioreactor. Cells exposed to shear stress had significantly higher cellular area and significantly lower circularity, indicating a motile phenotype. Stimulated cells were more proliferative than static controls and showed higher rates of chemoresistance to the anti-neoplastic drug paclitaxel. Fluid shear stress induced significant upregulation of the PLAU gene, and elevated urokinase activity was confirmed through zymography and activity assay. Overall, these results indicate that pulsatile shear stress promotes breast cancer cell proliferation, invasive potential, chemoresistance, and PLAU signaling.
Published: 2019
Full Text: View/download PDF

29. Kernel Circular Deconvolution Density Estimation

Author: Marco Di Marzio, Stefania Fensore, Charles C. Taylor, and Agnese Panzera
Subjects: Observational error, Kernel (statistics), Euclidean geometry, Estimator, Applied mathematics, Deconvolution, Density estimation, Data application, Mathematics
Abstract: We consider the problem of nonparametrically estimating a circular density from data contaminated by angular measurement errors. Specifically, we obtain a kernel-type estimator with weight functions that are reminiscent of deconvolution kernels. Here, differently from the Euclidean setting, discrete Fourier coefficients are involved rather than characteristic functions. We provide some simulation results along with a real data application.
Published: 2020
Full Text: View/download PDF

30. A New Approach to Measuring Distances in Dense Graphs

Author: Charles C. Taylor, Peter A. Thwaites, and Fatimah A. Almulhim
Subjects: Discrete mathematics, Computer science, k-means clustering, Graph theory, 01 natural sciences, Graph, 010305 fluids & plasmas, Hierarchical clustering, Vertex (geometry), Search algorithm, 0103 physical sciences, Adjacency matrix, 010306 general physics, Cluster analysis, MathematicsofComputing_DISCRETEMATHEMATICS
Abstract: The problem of computing distances and shortest paths between vertices in graphs is one of the fundamental issues in graph theory. It is of great importance in many different applications, for example, transportation and social network analysis. However, efficient shortest distance algorithms are still desired in many disciplines. Basically, the majority of dense graphs have ties between the shortest distances. Therefore, we consider a different approach and introduce a new measure to solve all-pairs shortest paths for undirected and unweighted graphs. This measures the shortest distance between any two vertices by considering the length and the number of all possible paths between them. The main aim of this new approach is to break the ties between equal shortest paths (SP), which can be obtained by the breadth-first search (BFS) algorithm, and distinguish meaningfully between these equal distances. Moreover, using the new measure in clustering produces higher quality results compared with SP. In our study, we apply two different clustering techniques, hierarchical clustering and K-means clustering, with four different graph models and for various numbers of clusters. We compare the results using a modularity function to check the quality of our clustering results.
Published: 2019
Full Text: View/download PDF

31. Kernel density classification for spherical data

Author: Agnese Panzera, Charles C. Taylor, Marco Di Marzio, and Stefania Fensore
Subjects: Statistics and Probability, 010104 statistics & probability, Field (physics), Global climate, 010102 general mathematics, Kernel density estimation, Nonparametric statistics, Applied mathematics, Decision rule, 0101 mathematics, Statistics, Probability and Uncertainty, 01 natural sciences, Mathematics
Abstract: Classifying observations coming from two different spherical populations by using a nonparametric method appears to be an unexplored field, although clearly worth pursuing. We propose some decision rules based on spherical kernel density estimation and we provide asymptotic L2 properties. A real-data application using global climate data is finally discussed.
Published: 2019

32. Geometry-based distance for clustering amino acids

Author: Arief Gusnanto, Charles C. Taylor, and Samira F. Abushilah
Subjects: Statistics and Probability, chemistry.chemical_classification, Quantitative Biology::Biomolecules, business.industry, Squared euclidean distance, Pattern recognition, Articles, Quantitative Biology::Genomics, Amino acid, Hierarchical clustering, chemistry, Artificial intelligence, Statistics, Probability and Uncertainty, Cluster analysis, business, Mathematics
Abstract: Clustering amino acids is one of the most challenging problems in functional and structural prediction of protein. Previous studies have proposed clusters based on measurements of physical and biochemical characteristics of the amino acids such as volume, area, hydrophilicity, polarity, hydrogen bonding, shape, and charge. These characteristics, although important, are less directly related to the protein structure compared to geometrical characteristics such as dihedral angles between amino acids. We propose using the p-value from a test of equality of dihedral-angle distributions as the basis of a distance measure for the clustering. In this novel approach, an energy test is modified to deal with bivariate angular data and the p-value is obtained via a permutation method. The results indicate that the clusters of amino acids have sensible interpretation where Glycine, Proline, and Asparagine each forms a distinct cluster. A simulation study suggests that this approach has good working characteristics to cluster amino acids.
Published: 2019
Full Text: View/download PDF

33. Local binary regression with spherical predictors

Author: Agnese Panzera, Marco Di Marzio, Charles C. Taylor, and Stefania Fensore
Subjects: Statistics and Probability, Polynomial regression, Statistics::Theory, 010102 general mathematics, Kernel density estimation, Local regression, Binary number, Estimator, 01 natural sciences, 010104 statistics & probability, Applied mathematics, Statistics::Methodology, Binary regression, 0101 mathematics, Statistics, Probability and Uncertainty, Mathematics
Abstract: We discuss local regression estimators when the predictor lies on the d-dimensional sphere and the response is binary. Unlike Di Marzio et al. (2018b), who introduce spherical kernel density classification, we build on the theory of local polynomial regression and local likelihood. Simulations and a real-data application illustrate the effectiveness of the proposals.
Published: 2019

34. Cross-validation is safe to use

Author: Oghenejokpeme I. Orhobor, Ross D. King, and Charles C. Taylor
Subjects: Human-Computer Interaction, Artificial Intelligence, Computer Networks and Communications, business.industry, Medicine, Computer Vision and Pattern Recognition, business, Software, Cross-validation, Reliability engineering
Published: 2021
Full Text: View/download PDF

35. Classification of form under heterogeneity and non-isotropic errors

Author: Arief Gusnanto, Farag Shuweihdi, and Charles C. Taylor
Subjects: Statistics and Probability, business.industry, Computation, Diagonal, Estimator, Pattern recognition, Euclidean distance matrix, computer.software_genre, Form classification, Weighting, Data mining, Artificial intelligence, Statistics, Probability and Uncertainty, business, computer, Classifier (UML), Shape analysis (digital geometry), Mathematics
Abstract: A number of areas related to learning under supervision have not been fully investigated, particularly the possibility of incorporating the method of classification into shape analysis. In this regard, practical ideas conducive to the improvement of form classification are the focus of interest. Our proposal is to employ a hybrid classifier built on Euclidean Distance Matrix Analysis (EDMA) and Procrustes distance, rather than generalised Procrustes analysis (GPA). In empirical terms, it has been demonstrated that there is notable difference between the estimated form and the true form when EDMA is used as the basis for computation. However, this does not seem to be the case when GPA is employed. With the assumption that no association exists between landmarks, EDMA and GPA are used to calculate the mean form and diagonal weighting matrix to build superimposing classifiers. As our findings indicate, with the use of EDMA estimators, the superimposing classifiers we propose work extremely well, as opposed to the use of GPA, as far as both simulated and real datasets are concerned.
Published: 2016
Full Text: View/download PDF

36. Practical performance of local likelihood for circular density estimation

Author: Agnese Panzera, Stefania Fensore, M. Di Marzio, and Charles C. Taylor
Subjects: Statistics and Probability, Normalization (statistics), education.field_of_study, Mathematical optimization, Estimation theory, Applied Mathematics, 05 social sciences, Population, Probability and statistics, Density estimation, 01 natural sciences, Likelihood principle, 010104 statistics & probability, Sample size determination, Modeling and Simulation, 0502 economics and business, 0101 mathematics, Statistics, Probability and Uncertainty, education, Likelihood function, Algorithm, 050205 econometrics, Mathematics
Abstract: Local likelihood has been mainly developed from an asymptotic point of view, with little attention to finite sample size issues. The present paper provides simulation evidence of how likelihood density estimation performs in practice from two points of view. First, we explore the impact of the normalization step of the final estimate; second, we show the effectiveness of higher order fits in identifying modes present in the population when small sample sizes are available. We refer to circular data; nevertheless, it is easily seen that our findings straightforwardly extend to the Euclidean setting, where they appear to be somewhat new.
Published: 2016
Full Text: View/download PDF

37. Nonparametric circular quantile regression

Author: Charles C. Taylor, Marco Di Marzio, and Agnese Panzera
Subjects: Statistics and Probability, Circular distribution, Applied Mathematics, 05 social sciences, Nonparametric statistics, Estimator, Inversion (meteorology), Conditional probability distribution, 01 natural sciences, Quantile regression, 010104 statistics & probability, Circular conditional distribution function, circular conditional quantiles, circular kernels, optimal smoothing degree, wind directions, 0502 economics and business, Statistics, Applied mathematics, Minification, 0101 mathematics, Statistics, Probability and Uncertainty, 050205 econometrics, Mathematics, Quantile
Abstract: We discuss nonparametric estimation of conditional quantiles of a circular distribution when the conditioning variable is either linear or circular. Two different approaches are pursued: inversion of a conditional distribution function estimator, and minimization of a smoothed check function. Local constant and local linear versions of both estimators are discussed. Simulation experiments and a real data case study are used to illustrate the usefulness of the methods.
Published: 2016
Full Text: View/download PDF

38. A note on nonparametric estimation of circular conditional densities

Author: M. Di Marzio, Charles C. Taylor, Agnese Panzera, and Stefania Fensore
Subjects: Statistics and Probability, Polynomial, Applied Mathematics, 05 social sciences, Nonparametric statistics, Estimator, Conditional probability distribution, Conditional expectation, 01 natural sciences, Quantile regression, 010104 statistics & probability, Modeling and Simulation, 0502 economics and business, Statistics, Applied mathematics, 0101 mathematics, Statistics, Probability and Uncertainty, Conditional variance, 050205 econometrics, Quantile, Mathematics
Abstract: The conditional density offers the most informative summary of the relationship between explanatory and response variables. We need to estimate it in place of the simple conditional mean when its shape is not well-behaved. A motivation for estimating conditional densities, specific to the circular setting, lies in the fact that a natural alternative of it, like quantile regression, could be considered problematic because circular quantiles are not rotationally equivariant. We treat conditional density estimation as a local polynomial fitting problem as proposed by Fan et al. [Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems. Biometrika. 1996;83:189–206] in the Euclidean setting, and discuss a class of estimators in the cases when the conditioning variable is either circular or linear. Asymptotic properties for some members of the proposed class are derived. The effectiveness of the methods for finite sample sizes is illustrated by simulation experiments a...
Published: 2016
Full Text: View/download PDF

39. Classification tree methods for panel data using wavelet-transformed time series

Author: Charles C. Taylor, Xin Zhao, Zoka Milan, and Stuart Barber
Subjects: Statistics and Probability, Interpretation (logic), Series (mathematics), business.industry, Computer science, Applied Mathematics, Decision tree learning, Pattern recognition, 02 engineering and technology, 01 natural sciences, Data type, 010104 statistics & probability, Computational Mathematics, Wavelet, Computational Theory and Mathematics, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, 0101 mathematics, Representation (mathematics), Scale (map), business, Panel data
Abstract: Wavelet-transformed variables can have better classification performance for panel data than using variables on their original scale. Examples are provided showing the types of data where using a wavelet-based representation is likely to improve classification accuracy. Results show that in most cases wavelet-transformed data have better or similar classification accuracy to the original data, and only select genuinely useful explanatory variables. Use of wavelet-transformed data provides localized mean and difference variables which can be more effective than the original variables, provide a means of separating “signal” from “noise”, and bring the opportunity for improved interpretation via the consideration of which resolution scales are the most informative. Panel data with multiple observations on each individual require some form of aggregation to classify at the individual level. Three different aggregation schemes are presented and compared using simulated data and real data gathered during liver transplantation. Methods based on aggregating individual level data before classification outperform methods which rely solely on the combining of time-point classifications.
Published: 2018

40. Statistical Estimate of Radon Concentration from Passive and Active Detectors in Doha

Author: Rifaat Hassona, Adil Yousef, Kassim Mwitondi, Ibrahim Al Sadig, and Charles C. Taylor
Subjects: Radon detection, spatio-temporal analyses, Information Systems and Management, Meteorology, 0211 other engineering and technologies, chemistry.chemical_element, Radon, unsupervised modelling, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Clustering, Local regression, radon detection, Data modeling, Spatio-temporal analyses, Visualisation, visualisation, Cluster analysis, 0105 earth and related environmental sciences, Potential impact, 021103 operations research, Data collection, estimation, Detector, lcsh:Z, lcsh:Bibliography. Library science. Information resources, Computer Science Applications, local regression, chemistry, Work (electrical), Environmental science, Estimation methods, Estimation, Unsupervised modelling, clustering, Information Systems
Abstract: Harnessing knowledge on the physical and natural conditions that affect our health, general livelihood and sustainability has long been at the core of scientific research. Health risks of ionising radiation from exposure to radon and radon decay products in homes, work and other public places entail developing novel approaches to modelling occurrence of the gas and its decaying products, in order to cope with the physical and natural dynamics in human habitats. Various data modelling approaches and techniques have been developed and applied to identify potential relationships among individual local meteorological parameters with a potential impact on radon concentrations, i.e., temperature, barometric pressure and relative humidity. In this first research work on radon concentrations in the State of Qatar, we present a combination of exploratory, visualisation and algorithmic estimation methods to try and understand the radon variations in and around the city of Doha. Data were obtained from the Central Radiation Laboratories (CRL) in Doha, gathered from 36 passive radon detectors deployed in various schools, residential and work places in and around Doha as well as from one active radon detector located at the CRL. Our key findings show high variations mainly attributable to technical variations in data gathering, as the equipment and devices appear to heavily influence the levels of radon detected. A parameter maximisation method applied to simulate data with similar behaviour to the data from the passive detectors in four of the neighbourhoods appears appropriate for estimating parameters in cases of data limitation. Data from the active detector exhibit interesting seasonal variations, with data clustering exhibiting two clearly separable groups, with passive and active detectors exhibiting a huge disagreement in readings. These patterns highlight challenges related to detection methods, in particular ensuring that deployed detectors and calculations of radon concentrations are adapted to local conditions. The study doesn't dwell much on building materials and makes rather fundamental assumptions, including an equal exhalation rate of radon from the soil across neighbourhoods, based on Doha's homogeneous underlying geological formation. The study also highlights potential extensions into the broader category of pollutants such as hydrocarbon, air particulate carbon monoxide and nitrogen dioxide at specific time periods of the year and particularly how they may tie in with global health institutions' requirements.
Published: 2018

41. Statistical analysis of particulate matter data in Doha, Qatar

Author: Charles C. Taylor, Kassim Mwitondi, and Adil Yousif
Subjects: Pollution, Data collection, Meteorology, media_common.quotation_subject, Outlier, Analyser, Environmental science, Sampling (statistics), Sample (statistics), Missing data, Wind speed, media_common
Abstract: Pollution in Doha is measured using passive, active and automatic sampling. In this paper we consider data automatically sampled in which various pollutants were continually collected and analysed every hour. At each station the sample is analysed on-line and in real time and the data is stored within the analyser, or a separate logger so it can be downloaded remotely by a modem. The accuracy produced enables pollution episodes to be analysed in detail and related to traffic flows, meteorology and other variables. Data has been collected hourly over more than 6 years at 3 different locations, with measurements available for various pollutants – for example, ozone, nitrogen oxides, sulphur dioxide, carbon monoxide, THC, methane and particulate matter (PM1.0, PM2.5 and PM10), as well as meteorological data such as humidity, temperature, and wind speed and direction. Despite much care in the data collection process, the resultant data has long stretches of missing values, when the equipment has malfunctioned – often as a result of more extreme conditions. Our analysis is twofold. Firstly, we consider ways to “clean” the data, by imputing missing values, including identified outliers. The second aspect specifically considers prediction of each particulate (PM1.0, PM2.5 and PM10) 24 hours ahead, using current (and previous) pollution and meteorological data. In this case, we use vector autoregressive models, compare with decision trees and propose variable selection criteria which explicitly adapt to missing data. Our results show that the regression tree models, with no variable transformations, perform the best, and that attempts to impute missing values are hampered by non-random missingness.
Published: 2018

42. Circular local likelihood

Author: Charles C. Taylor, Agnese Panzera, Marco Di Marzio, and Stefania Fensore
Subjects: Statistics and Probability, Polynomial, Bessel functions, Circular data, Density estimation, Log-likelihood, von Mises density, Logarithm, Basis (linear algebra), 05 social sciences, Kernel density estimation, Estimator, Function (mathematics), 01 natural sciences, 0506 political science, 010104 statistics & probability, 050602 political science & public administration, Applied mathematics, 0101 mathematics, Statistics, Probability and Uncertainty, Special case, Mathematics
Abstract: We introduce a class of local likelihood circular density estimators, which includes the kernel density estimator as a special case. The idea lies in optimizing a spatially weighted version of the log-likelihood function, where the logarithm of the density is locally approximated by a periodic polynomial. The use of von Mises density functions as weights reduces the computational burden. Also, we propose closed-form estimators which could form the basis of counterparts in the multidimensional Euclidean setting. Simulation results and a real data case study are used to evaluate the performance and illustrate the results.
Published: 2018

43. Nonparametric Rotations for Sphere-Sphere Regression

Author: Marco Di Marzio, Charles C. Taylor, and Agnese Panzera
Subjects: Statistics and Probability, Wahba's problem, 05 social sciences, Nonparametric statistics, Hypersphere, 01 natural sciences, Regression, Bias Reduction, Fisher’s Method of Scoring, Local Smoothing, Non-Rigid Rotation Estimation, Singular Value Decomposition, Skew-symmetric Matrices, Spherical Kernels, 010104 statistics & probability, Simple (abstract algebra), 0502 economics and business, Applied mathematics, 0101 mathematics, Statistics, Probability and Uncertainty, Rotation (mathematics), 050205 econometrics, Parametric statistics, Mathematics
Abstract: Regression of data represented as points on a hypersphere has traditionally been treated using parametric families of transformations that include the simple rigid rotation as an important, special case. On the other hand, nonparametric methods have generally focused on modeling a scalar response through a spherical predictor by representing the regression function as a polynomial, leading to component-wise estimation of a spherical response. We propose a very flexible, simple regression model where for each location of the manifold a specific rotation matrix is to be estimated. To make this approach tractable, we assume continuity of the regression function that, in turn, allows for approximations of rotation matrices based on a series expansion. It is seen that the nonrigidity of our technique motivates an iterative estimation within a Newton–Raphson learning scheme, which exhibits bias reduction properties. Extensions to general shape matching are also outlined. Both simulations and real data are used to illustrate the results. Supplementary materials for this article are available online.
Published: 2018
Full Text: View/download PDF

44. Nonparametric estimating equations for circular probability density functions and their derivatives

Author: Agnese Panzera, Charles C. Taylor, Stefania Fensore, and Marco Di Marzio
Subjects: Statistics and Probability, Mathematical optimization, Population, Fourier coefficients, Probability density function, Estimating equations, Trigonometric moments, 01 natural sciences, 010104 statistics & probability, Circular kernels, Density estimation, Jackknife, Sin-polynomials, von Mises density, 0502 economics and business, Applied mathematics, 0101 mathematics, education, 050205 econometrics, Mathematics, education.field_of_study, 05 social sciences, Nonparametric statistics, Estimator, Probability and statistics, Delta method, Statistics, Probability and Uncertainty
Abstract: We propose estimating equations whose unknown parameters are the values taken by a circular density and its derivatives at a point. Specifically, we solve equations which relate local versions of population trigonometric moments with their sample counterparts. Major advantages of our approach are: higher order bias without asymptotic variance inflation, closed form for the estimators, and absence of numerical tasks. We also investigate situations where the observed data are dependent. Theoretical results along with simulation experiments are provided.
Published: 2017
Full Text: View/download PDF

45. Estimating optimal window size for analysis of low-coverage next-generation sequence data

Author: Ibrahim Nafisah, Charles C. Taylor, Henry M. Wood, Stefano Berri, Arief Gusnanto, and Pamela Rabbitts
Subjects: Statistics and Probability, Lung Neoplasms, Computer science, Context (language use), computer.software_genre, Biochemistry, Humans, Molecular Biology, Likelihood Functions, Sequence, Genome, Human, High-Throughput Nucleotide Sequencing, Window (computing), Contrast (statistics), Genomics, Sequence Analysis, DNA, Function (mathematics), Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Step function, Data mining, Akaike information criterion, computer, Algorithm, Next generation sequence
Abstract: Motivation: Current high-throughput sequencing has greatly transformed genome sequence analysis. In the context of very low-coverage sequencing ( Results: We assume the reads density to be a step function. Given this model, we propose a data-based estimation of optimal window size based on Akaike’s information criterion (AIC) and cross-validation (CV) log-likelihood. By plotting the AIC and CV log-likelihood curve as a function of window size, we are able to estimate the optimal window size that minimizes AIC or maximizes CV log-likelihood. The proposed methods are of general purpose and we illustrate their application using low-coverage next-generation sequence datasets from real tumour samples and simulated datasets. Availability and implementation: An R package to estimate optimal window size is available at http://www1.maths.leeds.ac.uk/~arief/R/win/. Contact: a.gusnanto@leeds.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2014
Full Text: View/download PDF

46. Smooth estimation of circular cumulative distribution functions and quantiles

Author: Marco Di Marzio, Charles C. Taylor, and Agnese Panzera
Subjects: Statistics and Probability, Kernel method, Location parameter, Cumulative distribution function, Statistics, Nonparametric statistics, Estimator, Applied mathematics, Statistics, Probability and Uncertainty, Covariance, Empirical distribution function, Quantile, Mathematics
Abstract: Smooth nonparametric estimators based on a kernel method are proposed for cumulative distribution functions (CDFs) and quantiles of circular data. A sound motivation for this is that although for Euclidean data similar estimators have been widely studied, for circular data nothing similar seems to exist; albeit, remarkably, in the circular setting local methods are implemented more easily because of the absence of boundaries on the circle. The only alternative to our method seems to be the empirical CDF, which does not take into account circularity of data when the estimate is near the cut-point, as our local method naturally does. The definition of circular CDF is different from its Euclidean counterpart in many respects, and this will give rise to estimators exhibiting some ‘unusual’ features such as, for example, global efficiency measures containing a location parameter and a covariance term. Simulations along with real data case studies illustrate the findings.
Published: 2012
Full Text: View/download PDF

47. Mixtures of concentrated multivariate sine distributions with applications to bioinformatics

Author: Zhengzheng Zhang, Thomas Hamelryck, John T. Kent, Kanti V. Mardia, and Charles C. Taylor
Subjects: Statistics and Probability, Wishart distribution, Univariate distribution, Inverse-Wishart distribution, Matrix t-distribution, Statistics::Methodology, Matrix normal distribution, Multivariate t-distribution, Statistics, Probability and Uncertainty, Bioinformatics, Elliptical distribution, Normal-Wishart distribution, Mathematics
Abstract: Motivated by examples in protein bioinformatics, we study a mixture model of multivariate angular distributions. The distribution treated here (multivariate sine distribution) is a multivariate extension of the well-known von Mises distribution on the circle. The density of the sine distribution has an intractable normalizing constant and here we propose to replace it in the concentrated case by a simple approximation. We study the EM algorithm for this distribution and apply it to a practical example from protein bioinformatics.
Published: 2012
Full Text: View/download PDF

48. Validating protein structure using kernel density estimates

Author: Charles C. Taylor, Agnese Panzera, Marco Di Marzio, and Kanti V. Mardia
Subjects: Statistics and Probability, Quantitative Biology::Biomolecules, Mathematical optimization, Kernel density estimation, Conditional probability distribution, Density estimation, Multivariate kernel density estimation, Kernel embedding of distributions, Variable kernel density estimation, Test set, Kernel (statistics), Statistics, Probability and Uncertainty, Algorithm, Mathematics
Abstract: Measuring the quality of determined protein structures is a very important problem in bioinformatics. Kernel density estimation is a well-known nonparametric method which is often used for exploratory data analysis. Recent advances, which have extended previous linear methods to multi-dimensional circular data, give a sound basis for the analysis of conformational angles of protein backbones, which lie on the torus. By using an energy test, which is based on interpoint distances, we initially investigate the dependence of the angles on the amino acid type. Then, by computing tail probabilities which are based on amino-acid conditional density estimates, a method is proposed which permits inference on a test set of data. This can be used, for example, to validate protein structures, choose between possible protein predictions and highlight unusual residue angles.
Published: 2012
Full Text: View/download PDF

49. Non-parametric smoothing and prediction for nonlinear circular time series

Author: Agnese Panzera, Charles C. Taylor, and Marco Di Marzio
Subjects: Statistics and Probability, Mathematical optimization, Field (physics), Series (mathematics), Applied Mathematics, Nonparametric statistics, Nonlinear system, Applied mathematics, Time domain, Statistics, Probability and Uncertainty, Constant (mathematics), Cross-spectrum, Smoothing, Mathematics
Abstract: Not much research has been done in the field of circular time-series analysis. We propose a non-parametric theory for smoothing and prediction in the time domain for circular time-series data. Our model is based on local constant and local linear fitting estimates of a minimizer of an angular risk function. Both asymptotic arguments and empirical examples are used to describe the accuracy of our methods.
Published: 2012
Full Text: View/download PDF
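
Note: the local constant fit mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson-type estimator with von Mises weights and a one-step lag as predictor; the function name `local_constant_predict` and the concentration `kappa` are hypothetical choices for illustration, not taken from the paper.

```python
import math

def local_constant_predict(series, x, kappa=4.0):
    """Predict the angle following `x` (radians) from a circular time series.

    Each observed pair (previous angle, next angle) is weighted by a
    von Mises kernel centred at `x`; the prediction is the direction of
    the weighted resultant of the next angles.
    """
    s = 0.0
    c = 0.0
    for prev, nxt in zip(series[:-1], series[1:]):
        # Weight is 1 when prev == x and decays as prev moves away on the circle.
        w = math.exp(kappa * (math.cos(prev - x) - 1.0))
        s += w * math.sin(nxt)
        c += w * math.cos(nxt)
    # atan2 returns an angle in [-pi, pi], so wrap-around is handled.
    return math.atan2(s, c)

if __name__ == "__main__":
    angles = [0.1, 3.0] * 10
    print(local_constant_predict(angles, 0.1))
```

Averaging sines and cosines separately, rather than the raw angles, keeps the estimate meaningful when the responses straddle the cut-point of the circle.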

50. Kernel density estimation on the torus

Author: Marco Di Marzio, Agnese Panzera, and Charles C. Taylor
Subjects: Statistics and Probability, Applied Mathematics, Kernel density estimation, Torus, Density estimation, Multivariate kernel density estimation, Kernel method, Variable kernel density estimation, Calculus, Partial derivative, Applied mathematics, Statistics, Probability and Uncertainty, Smoothing, Mathematics
Abstract: Kernel density estimation for multivariate, circular data has been formulated only when the sample space is the sphere, but theory for the torus would also be useful. For data lying on a d-dimensional torus (d ≥ 1), we discuss kernel estimation of a density, its mixed partial derivatives, and their squared functionals. We introduce a specific class of product kernels whose order is suitably defined in such a way as to obtain L2-risk formulas whose structure can be compared to their Euclidean counterparts. Our kernels are based on circular densities; however, we also discuss smaller bias estimation involving negative kernels which are functions of circular densities. Practical rules for selecting the smoothing degree, based on cross-validation, bootstrap and plug-in ideas are derived. Moreover, we provide specific results on the use of kernels based on the von Mises density. Finally, real-data examples and simulation studies illustrate the findings.
Published: 2011
Full Text: View/download PDF
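
Note: a product kernel estimator of the kind this abstract describes might look like the sketch below, assuming von Mises kernels with one fixed concentration per dimension. The names `bessel_i0` and `torus_kde` and the concentrations are hypothetical; the paper's smoothing-selection rules and higher-order kernels are not reproduced here.

```python
import math

def bessel_i0(x, tol=1e-16):
    """Modified Bessel function of the first kind, order zero (power series)."""
    term = 1.0
    total = 1.0
    k = 0
    while term > tol * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def torus_kde(point, sample, kappas):
    """Density estimate at `point` on a d-torus using product von Mises kernels.

    `point` and each element of `sample` are length-d sequences of angles in
    radians; `kappas` holds one concentration (inverse smoothing) per dimension.
    """
    norm = 1.0
    for kappa in kappas:
        norm *= 2.0 * math.pi * bessel_i0(kappa)
    acc = 0.0
    for obs in sample:
        exponent = sum(k * math.cos(p - o) for k, p, o in zip(kappas, point, obs))
        acc += math.exp(exponent)
    return acc / (len(sample) * norm)

if __name__ == "__main__":
    data = [(0.1, 6.0), (3.0, 1.0), (6.2, 0.2)]
    print(torus_kde((0.0, 0.0), data, (3.0, 5.0)))
```

Because the kernel depends only on cosines of angular differences, the estimate is automatically periodic in every coordinate and needs no boundary correction.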
