28 results on '"Paul R. Trundle"'
Search Results
2. Classification of Heterogeneous Data Based on Data Type Impact on Similarity.
- Author
-
Najat Ali, Daniel Neagu, and Paul R. Trundle
- Published
- 2018
3. Bipartite Network Model for Inferring Hidden Ties in Crime Data.
- Author
-
Haruna Isah, Daniel Neagu, and Paul R. Trundle
- Published
- 2015
4. Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology.
- Author
-
Pritesh Mistry, Daniel Neagu, Paul R. Trundle, and Jonathan D. Vessey
- Published
- 2016
5. Social media analysis for product safety using text mining and sentiment analysis.
- Author
-
Haruna Isah, Paul R. Trundle, and Daniel Neagu
- Published
- 2014
6. Using computational methods for the prediction of drug vehicles.
- Author
-
Pritesh Mistry, Anna Palczewska, Daniel Neagu, and Paul R. Trundle
- Published
- 2014
7. Human Memory Assistance through Semantic-Based Text Processing.
- Author
-
Paul R. Trundle and Jianmin Jiang
- Published
- 2009
8. A Memory Management System towards Cognitive Assistance of Elderly People.
- Author
-
Fouad Khelifi, Jianmin Jiang, and Paul R. Trundle
- Published
- 2009
9. Development of Multi-output Neural Networks for Data Integration - A Case Study.
- Author
-
Paul R. Trundle, Daniel Neagu, Marian Viorel Craciun, and Qasim Chaudhry
- Published
- 2008
10. Multi-source Data Modelling: Integrating Related Data to Improve Model Performance.
- Author
-
Paul R. Trundle, Daniel Neagu, and Qasim Chaudhry
- Published
- 2007
11. Towards model governance in predictive toxicology.
- Author
-
Anna Palczewska, Xin Fu, Paul R. Trundle, Longzhi Yang, Daniel Neagu, Mick J. Ridley, and Kim Travis
- Published
- 2013
12. Medical image analysis with artificial neural networks.
- Author
-
Jianmin Jiang, Paul R. Trundle, and Jinchang Ren
- Published
- 2010
13. Social Media Analysis for Product Safety using Text Mining and Sentiment Analysis.
- Author
-
Haruna Isah, Daniel Neagu, and Paul R. Trundle
- Published
- 2015
14. Prediction of the effect of formulation on the toxicity of chemicals
- Author
-
John Paul Gosling, Daniel Neagu, Pritesh Mistry, Jonathan D. Vessey, Paul R. Trundle, and Antonio Sánchez-Ruiz
- Subjects
0301 basic medicine; business.industry; Computer science; Health, Toxicology and Mutagenesis; Decision tree; Toxicology; computer.software_genre; Random forest; 03 medical and health sciences; 030104 developmental biology; Text mining; Toxicity; Partial least squares regression; Data mining; business; Cluster analysis; computer; Statistical evidence
- Abstract
Two approaches for the prediction of which of two vehicles will result in lower toxicity for anticancer agents are presented. Machine-learning models are developed using decision tree, random forest and partial least squares methodologies and statistical evidence is presented to demonstrate that they represent valid models. Separately, a clustering method is presented that allows the ordering of vehicles by the toxicity they show for chemically-related compounds.
- Published
- 2017
15. Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets
- Author
-
Paul R. Trundle, Najat Ali, and Daniel Neagu
- Subjects
Computer science; business.industry; General Chemical Engineering; 05 social sciences; Data classification; General Engineering; 050301 education; General Physics and Astronomy; Pattern recognition; 02 engineering and technology; Euclidean distance; Binary data; 0202 electrical engineering, electronic engineering, information engineering; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; General Materials Science; Artificial intelligence; business; K nearest neighbour; 0503 education; Categorical variable; Test sample; Classifier (UML); General Environmental Science
- Abstract
Distance-based algorithms are widely used for data classification problems. The k-nearest neighbour classification (k-NN) is one of the most popular distance-based algorithms. This classification is based on measuring the distances between the test sample and the training samples to determine the final classification output. The traditional k-NN classifier works naturally with numerical data. The main objective of this paper is to investigate the performance of k-NN on heterogeneous datasets, where data can be described as a mixture of numerical and categorical features. For the sake of simplicity, this work considers only one type of categorical data, which is binary data. In this paper, several similarity measures are defined by combining well-known distances for numerical and binary data, and are used to investigate k-NN performance in classifying such heterogeneous data sets. The experiments used six heterogeneous datasets from different domains and two categories of measures. Experimental results showed that the proposed measures performed better for heterogeneous data than Euclidean distance, and that the challenges raised by the nature of heterogeneous data need personalised similarity measures adapted to the data characteristics.
- Published
- 2019
16. Vehicle Warranty Claim Prediction from Diagnostic Data Using Classification
- Author
-
Daniel Neagu, Andrew Sherratt, Paul R. Trundle, Felician Campean, and Denis Torgunov
- Subjects
Computer science; business.industry; 020208 electrical & electronic engineering; Warranty; Decision tree; Automotive industry; 02 engineering and technology; Machine learning; computer.software_genre; Field (computer science); Random forest; Support vector machine; On-board diagnostics; Binary classification; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer
- Abstract
This paper presents an approach to predict warranty repair claims on automotive units based on joint on-board diagnostic and historic warranty repair data. The problem is framed as binary classification, facilitating the applicability of a variety of machine learning techniques. The approach allows automotive manufacturers to make better use of the operational and failure data collected from the field, allowing for better spend forecast and more targeted vehicle health management interventions and campaigns. The performance of Support Vector Machines, Random Forests and Decision Trees on the data set thus obtained is evaluated and the results are presented, highlighting the importance of hyper-parameter tuning for the problem considered. It is shown that the modelling methods employed demonstrate comparable performance; however, the Decision Tree approach seems to perform the most consistently across the various target failure codes considered at this time.
- Published
- 2019
17. Classification of Heterogeneous Data Based on Data Type Impact on Similarity
- Author
-
Paul R. Trundle, Daniel Neagu, and Najat Ali
- Subjects
Data records; Computer science; Data classification; Decision tree; 02 engineering and technology; computer.software_genre; 01 natural sciences; Data type; 010104 statistics & probability; Statistical classification; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; 0101 mathematics; Categorical variable; Data objects; computer; Classifier (UML)
- Abstract
Real-world datasets are increasingly heterogeneous, showing a mixture of numerical, categorical and other feature types. The main challenge for mining heterogeneous datasets is how to deal with heterogeneity present in the dataset records. Although some existing classifiers (such as decision trees) can handle heterogeneous data in specific circumstances, the performance of such models may still be improved, because heterogeneity involves specific adjustments to similarity measurements and calculations. Moreover, heterogeneous data is still treated inconsistently and in an ad-hoc manner. In this paper, we study the problem of heterogeneous data classification: our purpose is to use heterogeneity as a positive feature of the data classification effort by consistently using the similarity between data objects. We address the heterogeneity issue by studying the impact of mixing data types in the calculation of data objects’ similarity. To reach our goal, we propose an algorithm to divide the initial data records based on pairwise similarity for classification subtasks, with the aim of increasing the quality of the data subsets and applying specialized classifier models to them. The performance of the proposed approach is evaluated on 10 publicly available heterogeneous data sets. The results show that the models achieve better performance for heterogeneous datasets when using the proposed similarity process.
- Published
- 2018
18. Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology
- Author
-
Daniel Neagu, Paul R. Trundle, Pritesh Mistry, and Jonathan D. Vessey
- Subjects
0301 basic medicine; Computer science; Process (engineering); business.industry; Decision tree; Computational intelligence; Computational toxicology; computer.software_genre; Machine learning; Theoretical Computer Science; Random forest; 03 medical and health sciences; 030104 developmental biology; Toxicity; Geometry and Topology; Data mining; Developmental Therapeutics Program; Artificial intelligence; business; computer; Software; Selection (genetic algorithm)
- Abstract
Drug vehicles are chemical carriers that provide beneficial aid to the drugs they bear. Taking advantage of their favourable properties can potentially allow the safer use of drugs that are considered highly toxic. A means for vehicle selection without experimental trial would therefore be of benefit in saving time and money for the industry. Although machine learning is increasingly used in predictive toxicology, to our knowledge there is no reported work in using machine learning techniques to model drug-vehicle relationships for vehicle selection to minimise toxicity. In this paper we demonstrate the use of data mining and machine learning techniques to process, extract and build models based on classifiers (decision trees and random forests) that allow us to predict which vehicle would be most suited to reduce a drug's toxicity. Using data acquired from the National Institutes of Health's (NIH) Developmental Therapeutics Program (DTP) we propose a methodology using an area under a curve (AUC) approach that allows us to distinguish which vehicle provides the best toxicity profile for a drug and build classification models based on this knowledge. Our results show that we can achieve prediction accuracies of 80% using random forest models whilst the decision tree models produce accuracies in the 70% region. We consider our methodology widely applicable within the scientific domain and beyond for comprehensively building classification models for the comparison of functional relationships between two variables.
- Published
- 2015
19. Towards model governance in predictive toxicology
- Author
-
Paul R. Trundle, Anna Palczewska, Longzhi Yang, Kim Z. Travis, Daniel Neagu, Mick Ridley, and Xin Fu
- Subjects
Engineering; Markup language; Knowledge management; Computer Networks and Communications; business.industry; Corporate governance; Library and Information Sciences; Reuse; Data governance; Risk analysis (engineering); Schema (psychology); New product development; Web application; Information governance; business; Information Systems
- Abstract
Efficient management of toxicity information as an enterprise asset is increasingly important for the chemical, pharmaceutical, cosmetics and food industries. Many organisations focus on better information organisation and reuse, in an attempt to reduce the costs of testing and manufacturing in the product development phase. Toxicity information is extracted not only from toxicity data but also from predictive models. Accurate and appropriately shared models can bring a number of benefits if we are able to make effective use of existing expertise. Although usage of existing models may provide high-impact insights into the relationships between chemical attributes and specific toxicological effects, they can also be a source of risk for incorrect decisions. Thus, there is a need to provide a framework for efficient model management. To address this gap, this paper introduces a concept of model governance that is based upon data governance principles. We extend the data governance processes by adding procedures that allow the evaluation of model use and governance for enterprise purposes. The core aspect of model governance is model representation. We propose six rules that form the basis of a model representation schema, called Minimum Information About a QSAR Model Representation (MIAQMR). As a proof-of-concept of our model governance framework we develop a web application called Model and Data Farm (MADFARM), in which models are described by the MIAQMR-ML markup language.
- Published
- 2013
20. HERMES: a FP7 funded project towards the development of a computer‐aided memory management system via intelligent computations
- Author
-
Arjan Geven, Jianmin Jiang, Fouad Khelifi, and Paul R. Trundle
- Subjects
Software; Memory management; Human–computer interaction; Computer science; business.industry; Computation; Rehabilitation; Pattern recognition (psychology); Computer-aided; Cognition; Speech processing; Semantics; business
- Abstract
In this article, we introduce a new concept in HERMES, the FP7 funded project in Europe, in developing technology innovations towards computer aided memory management via intelligent computation, and helping elderly people to overcome their decline in cognitive capabilities. In this project, an integrated computer aided memory management system is being developed from a strong interdisciplinary perspective, which brings together knowledge from gerontology to software and hardware integration. State-of-the-art techniques and algorithms for image, video and speech processing, pattern recognition and semantic summarisation are illustrated, and the objectives and strategy for HERMES are described. Also, more details on the software that has been implemented are provided, along with future development directions.
- Published
- 2009
21. Bipartite Network Model for Inferring Hidden Ties in Crime Data
- Author
-
Daniel Neagu, Paul R. Trundle, and Haruna Isah
- Subjects
Social and Information Networks (cs.SI); FOS: Computer and information sciences; Structure (mathematical logic); Physics - Physics and Society; Point (typography); Computer science; Law enforcement; FOS: Physical sciences; ComputingMilieux_LEGALASPECTSOFCOMPUTING; Computer Science - Social and Information Networks; Physics and Society (physics.soc-ph); Data science; Identification (information); Order (exchange); Bipartite graph; Crime data; Network analysis
- Abstract
Certain crimes are rarely committed by individuals but are carefully organised by groups of associates and affiliates loosely connected to each other, with a single individual or small group of individuals coordinating the overall actions. A common starting point in understanding the structural organisation of criminal groups is to identify the criminals and their associates. Situations arise in many criminal datasets where there is no direct connection among the criminals. In this paper, we investigate ties and community structure in crime data in order to understand the operations of both traditional and cyber criminals, as well as to predict the existence of organised criminal networks. Our contributions are twofold: we propose a bipartite network model for inferring hidden ties between actors who initiated an illegal interaction and objects affected by the interaction; we then validate the method in two case studies on pharmaceutical crime and underground forum data using standard network algorithms for structural and community analysis. The vertex level metrics and community analysis results obtained indicate the significance of our work in understanding the operations and structure of organised criminal networks which were not immediately obvious in the data. Identifying these groups and mapping their relationship to one another is essential in making more effective disruption strategies in the future. (8 pages)
- Published
- 2015
22. Using computational methods for the prediction of drug vehicles
- Author
-
Anna Palczewska, Pritesh Mistry, Paul R. Trundle, and Daniel Neagu
- Subjects
Drug; Intrinsic activity; Computer science; business.industry; media_common.quotation_subject; Toxicity reduction; Drug vehicle; Toxicity; Developmental Therapeutics Program; Biochemical engineering; Drug carrier; business; media_common; Pharmaceutical industry
- Abstract
Drug vehicles are chemical carriers that aid a drug's passage through an organism. Whilst they possess no intrinsic efficacy, they are designed to achieve desirable characteristics, which can include improving a drug's permeability and/or solubility, targeting a drug to a specific site or reducing a drug's toxicity, all of which are ideally achieved without compromising the efficacy of the drug. Whilst the majority of drug vehicle research is focused on the solubility and permeability issues of a drug, significant progress has been made on using vehicles for toxicity reduction. Achieving this can enable safer and more effective use of a potent drug against diseases such as cancer. From a molecular perspective, drugs activate or deactivate biochemical pathways through interactions with cellular macromolecules resulting in toxicity. For newly developed drugs such pathways are not always clearly understood but toxicity endpoints are still required as part of a drug's registration. An understanding of which vehicles could be used to ameliorate the unwanted toxicities of newly developed drugs would be highly desirable to the pharmaceutical industry. In this paper we demonstrate the use of different classifiers as a means to select vehicles best suited to avert a drug's toxic effects when no other information about a drug's characteristics is known. Through analysis of data acquired from the Developmental Therapeutics Program (DTP) we are able to establish a link between a drug's toxicity and vehicle used. We demonstrate that classification and selection of the appropriate vehicle can be made based on the similarity of drug choice.
- Published
- 2014
23. Human Memory Assistance through Semantic-Based Text Processing
- Author
-
Jianmin Jiang and Paul R. Trundle
- Subjects
Search terms; Memory management; Text processing; Multimedia; Computer science; Semantic interpretation; Human memory; Elderly people; Cognition; Context (language use); computer.software_genre; computer; Data science
- Abstract
The proportion of elderly people across the world is predicted to increase significantly in the next 50 years. Tools to assist the elderly with remaining independent must be developed now to reduce the impact this will have on future generations. Technological solutions have the potential to alleviate some of the problems associated with old age, particularly those associated with the deterioration of memory. This paper proposes an algorithm for semantic-based text processing within the context of a cognitive care platform for older people, and an implementation of the algorithm used within the EU FP7 project HERMES is introduced. The algorithm facilitates computerised human-like memory management through semantic interpretation of everyday events and textual search terms, and the utilisation of human language lexical resources.
- Published
- 2009
24. A Memory Management System towards Cognitive Assistance of Elderly People
- Author
-
Jianmin Jiang, Paul R. Trundle, and Fouad Khelifi
- Subjects
Software; Memory management; Multimedia; Computer science; business.industry; Perspective (graphical); Computer-aided; Elderly people; Cognition; business; computer.software_genre; computer
- Abstract
This paper describes technology innovations towards computer aided memory management via intelligent data processing, and helping elderly people to overcome their cognitive decline. The system, which integrates the functionalities to be delivered by HERMES, the FP7 funded project in Europe, aims at assisting users who suffer from memory decline due to aging with effective memory refreshment based on the correlation of textual, spoken, or visual data. In this project, the system is being developed from a strong interdisciplinary perspective, which brings together knowledge from gerontology to software and hardware implementation.
- Published
- 2009
25. A comparative study of machine learning algorithms applied to predictive toxicology data mining
- Author
-
Gongde Guo, Paul R. Trundle, Daniel Neagu, and Mark T. D. Cronin
- Subjects
Databases, Factual; Computer science; Trout; Feature selection; Toxicology; Machine learning; computer.software_genre; Quail; General Biochemistry, Genetics and Molecular Biology; Set (abstract data type); Phenols; Artificial Intelligence; Predictive Value of Tests; Toxicity Tests; Feature (machine learning); Animals; Selection (genetic algorithm); business.industry; Online machine learning; Reproducibility of Results; General Medicine; Bees; Class (biology); Data set; Medical Laboratory Technology; Range (mathematics); Daphnia; Data Interpretation, Statistical; Artificial intelligence; Data mining; business; computer; Algorithm; Algorithms
- Abstract
This paper reports results of a comparative study of widely used machine learning algorithms applied to predictive toxicology data mining. The machine learning algorithms involved were chosen in terms of their representability and diversity, and were extensively evaluated with seven toxicity data sets which were taken from real-world applications. Some results based on visual analysis of the correlations of different descriptors to the class values of chemical compounds, and on the relationships of the range of chosen descriptors to the performance of machine learning algorithms, are emphasised from our experiments. Some interesting findings relating to the data and the quality of the models are presented — for example, that no specific algorithm appears best for all seven toxicity data sets, and that up to five descriptors are sufficient for creating classification models for each toxicity data set with good accuracy. We suggest that, for a specific data set, model accuracy is affected by the feature selection method and model development technique. Models built with too many or too few descriptors are undesirable, and finding the optimal feature subset appears at least as important as selecting appropriate algorithms with which to build a final model.
- Published
- 2007
26. Algorithms for (Q)SAR model building
- Author
-
Paul R. Trundle, Daniel Neagu, Marco Pintore, Frank Lemke, Nadège Piclin, Qasim Chaudhry, Jacques R. Chrétien, Marian Craciun, Gongde Guo, and Johann-Adolf Müller
- Subjects
Soft computing; Complex data type; Quantitative structure–activity relationship; Artificial neural network; business.industry; Linear model; Machine learning; computer.software_genre; Range (mathematics); Probabilistic method; Artificial intelligence; Data mining; business; computer; Algorithm; Model building; Mathematics
- Abstract
The concept of mathematically relating biological activity with physicochemical properties of related chemical compounds emerged in the 1960s. Early quantitative structure–activity relationships (QSARs) were based on simple principles, such as substituent parameters, and linear mathematics. It was gradually realized that QSAR models based on such simplistic properties and statistical algorithms only worked well in certain well-defined situations. QSAR models for relatively simple sets of molecular data are still based on linear algorithms, but this approach has only a limited usefulness in finding multidimensional relational patterns in complex data sets. Linear models are also often hard to generalize across chemical classes and/or test species. This has led to the use of nonlinear algorithms and soft computing techniques, such as fuzzy systems, probabilistic methods, and artificial neural networks to decipher relational patterns in large, imprecise, and complex data sets. This shift in QSAR paradigm has made it possible to predict biological properties of a wide range of chemicals, which otherwise would be difficult, or impossible to determine experimentally.
- Published
- 2007
27. Development of Multi-output Neural Networks for Data Integration — A Case Study
- Author
-
Qasim Chaudhry, Paul R. Trundle, Marian Craciun, and Daniel Neagu
- Subjects
Neuro-fuzzy; Artificial neural network; business.industry; Computer science; Time delay neural network; Deep learning; computer.software_genre; Machine learning; Data type; Artificial intelligence; Data mining; Types of artificial neural networks; business; computer; Nervous system network models; Data integration
- Abstract
Despite the wide variety of algorithms that exist to build predictive models, it can still be difficult to make accurate predictions for unknown values for certain types of data. New and innovative techniques are needed to overcome the problems underlying these difficulties for poor quality data, or data with a lack of available training cases. In this paper the authors propose a technique for integrating data from related datasets with the aim of improving the accuracy of predictions using Artificial Neural Networks. An overall improvement in the prediction power of models was shown when using the integration algorithm, when compared to models constructed using non-integrated data.
- Published
- 2007
28. Hybrid systems
- Author
-
Frank Lemke, Marco Pintore, Johann-Adolf Müller, Severin Bumbaru, Jacques R. Chrétien, Daniel Neagu, Viorel Minzu, Gongde Guo, Silviu Augustin Stroia, Marian Craciun, Paul R. Trundle, Antonio Chana, Nicolas Amaury, Giuseppina Gini, and Emilio Benfenati
- Subjects
Chemical descriptors; Quantitative structure–activity relationship; business.industry; Nonlinear model; Hybrid system; Artificial intelligence; business; Hybrid model; Mathematics
- Abstract
Quantitative structure–activity relationship (QSAR) problems do not have, in general, linear solutions, and the problem is how to model those situations. Another consideration is that the nonlinear model should not be assumed but should emerge from data analysis. This chapter integrates the best models individually developed for each endpoint into a hybrid system for that endpoint. This has to be flexible to accept further inputs or modules, if available. Whereas inputs to the basic models are the chemical descriptors, inputs to the hybrid model are the n values predicted for each molecule by the n integrated models; the output is always the toxicity for that molecule. The basic theory behind the combinations, as well as the models obtained, is illustrated.
- Published
- 2007
Discovery Service for Jio Institute Digital Library