Author: "Paul R. Trundle" - Searchworks@Jio Institute Digital Library Search Results

11. Towards model governance in predictive toxicology.

Author: Anna Palczewska, Xin Fu, Paul R. Trundle, Longzhi Yang, Daniel Neagu, Mick J. Ridley, and Kim Travis
Published: 2013
Full Text: View/download PDF

12. Medical image analysis with artificial neural networks.

Author: Jianmin Jiang, Paul R. Trundle, and Jinchang Ren
Published: 2010
Full Text: View/download PDF

13. Social Media Analysis for Product Safety using Text Mining and Sentiment Analysis.

Author: Haruna Isah, Daniel Neagu, and Paul R. Trundle
Published: 2015

14. Prediction of the effect of formulation on the toxicity of chemicals

Author: John Paul Gosling, Daniel Neagu, Pritesh Mistry, Jonathan D. Vessey, Paul R. Trundle, and Antonio Sánchez-Ruiz
Subjects: 0301 basic medicine, business.industry, Computer science, Health, Toxicology and Mutagenesis, Decision tree, Toxicology, computer.software_genre, Random forest, 03 medical and health sciences, 030104 developmental biology, Text mining, Toxicity, Partial least squares regression, Data mining, business, Cluster analysis, computer, Statistical evidence
Abstract: Two approaches for the prediction of which of two vehicles will result in lower toxicity for anticancer agents are presented. Machine-learning models are developed using decision tree, random forest and partial least squares methodologies and statistical evidence is presented to demonstrate that they represent valid models. Separately, a clustering method is presented that allows the ordering of vehicles by the toxicity they show for chemically-related compounds.
Published: 2017
Full Text: View/download PDF

15. Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets

Author: Paul R. Trundle, Najat Ali, and Daniel Neagu
Subjects: Computer science, business.industry, General Chemical Engineering, 05 social sciences, Data classification, General Engineering, 050301 education, General Physics and Astronomy, Pattern recognition, 02 engineering and technology, Euclidean distance, Binary data, 0202 electrical engineering, electronic engineering, information engineering, General Earth and Planetary Sciences, 020201 artificial intelligence & image processing, General Materials Science, Artificial intelligence, business, K nearest neighbour, 0503 education, Categorical variable, Test sample, Classifier (UML), General Environmental Science
Abstract: Distance-based algorithms are widely used for data classification problems. The k-nearest neighbour classification (k-NN) is one of the most popular distance-based algorithms. This classification is based on measuring the distances between the test sample and the training samples to determine the final classification output. The traditional k-NN classifier works naturally with numerical data. The main objective of this paper is to investigate the performance of k-NN on heterogeneous datasets, where data can be described as a mixture of numerical and categorical features. For the sake of simplicity, this work considers only one type of categorical data, which is binary data. In this paper, several similarity measures are defined by combining well-known distances for numerical and binary data, and are used to investigate k-NN performance in classifying such heterogeneous data sets. The experiments used six heterogeneous datasets from different domains and two categories of measures. Experimental results showed that the proposed measures performed better for heterogeneous data than Euclidean distance, and that the challenges raised by the nature of heterogeneous data need personalised similarity measures adapted to the data characteristics.
Published: 2019
Full Text: View/download PDF
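
The idea described in the abstract of item 15 can be sketched as follows. This is a minimal illustration, not the measures defined in the paper: it assumes a weighted sum of Euclidean distance over the numerical features and Hamming distance over the binary features, and the names `mixed_distance`, `knn_predict` and the `weight` parameter are hypothetical.

```python
import math
from collections import Counter


def mixed_distance(a, b, numeric_idx, binary_idx, weight=0.5):
    """Combine two well-known distances for a record with mixed feature types.

    Assumption: a simple weighted sum of Euclidean distance (numerical part)
    and Hamming distance (binary part). The paper's own measures may differ.
    """
    euclid = math.sqrt(sum((a[i] - b[i]) ** 2 for i in numeric_idx))
    hamming = sum(a[i] != b[i] for i in binary_idx)
    return weight * euclid + (1 - weight) * hamming


def knn_predict(train_X, train_y, query, k, numeric_idx, binary_idx):
    """Return the majority label among the k training samples nearest to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: mixed_distance(pair[0], query, numeric_idx, binary_idx),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: two numerical features followed by two binary features.
train_X = [(0.1, 0.2, 0, 0), (0.2, 0.1, 0, 1), (0.9, 0.8, 1, 1), (0.8, 0.9, 1, 0)]
train_y = ["a", "a", "b", "b"]
prediction = knn_predict(train_X, train_y, (0.15, 0.15, 0, 0), 3, [0, 1], [2, 3])
```

In practice the numerical features would usually be scaled to a common range first, so that neither part of the distance dominates the other.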

16. Vehicle Warranty Claim Prediction from Diagnostic Data Using Classification

Author: Daniel Neagu, Andrew Sherratt, Paul R. Trundle, Felician Campean, and Denis Torgunov
Subjects: Computer science, business.industry, 020208 electrical & electronic engineering, Warranty, Decision tree, Automotive industry, 02 engineering and technology, Machine learning, computer.software_genre, Field (computer science), Random forest, Support vector machine, On-board diagnostics, Binary classification, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer
Abstract: This paper presents an approach to predict warranty repair claims on automotive units based on joint on-board diagnostic and historic warranty repair data. The problem is framed as binary classification, facilitating the applicability of a variety of machine learning techniques. The approach allows automotive manufacturers to make better use of the operational and failure data collected from the field, allowing for better spend forecast and more targeted vehicle health management interventions and campaigns. The performance of Support Vector Machines, Random Forests and Decision Trees on the data set thus obtained is evaluated and the results are presented, highlighting the importance of hyper-parameter tuning for the problem considered. It is shown that the modelling methods employed demonstrate comparable performance; however, the Decision Tree approach seems to perform the most consistently across the various target failure codes considered at this time.
Published: 2019
Full Text: View/download PDF

17. Classification of Heterogeneous Data Based on Data Type Impact on Similarity

Author: Paul R. Trundle, Daniel Neagu, and Najat Ali
Subjects: Data records, Computer science, Data classification, Decision tree, 02 engineering and technology, computer.software_genre, 01 natural sciences, Data type, 010104 statistics & probability, Statistical classification, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Data mining, 0101 mathematics, Categorical variable, Data objects, computer, Classifier (UML)
Abstract: Real-world datasets are increasingly heterogeneous, showing a mixture of numerical, categorical and other feature types. The main challenge for mining heterogeneous datasets is how to deal with heterogeneity present in the dataset records. Although some existing classifiers (such as decision trees) can handle heterogeneous data in specific circumstances, the performance of such models may still be improved, because heterogeneity involves specific adjustments to similarity measurements and calculations. Moreover, heterogeneous data is still treated inconsistently and in an ad-hoc manner. In this paper, we study the problem of heterogeneous data classification: our purpose is to use heterogeneity as a positive feature of the data classification effort by consistently using the similarity between data objects. We address the heterogeneity issue by studying the impact of mixing data types in the calculation of data objects’ similarity. To reach our goal, we propose an algorithm to divide the initial data records based on pairwise similarity for classification subtasks, with the aim of increasing the quality of the data subsets and applying specialized classifier models to them. The performance of the proposed approach is evaluated on 10 publicly available heterogeneous data sets. The results show that the models achieve better performance for heterogeneous datasets when using the proposed similarity process.
Published: 2018
Full Text: View/download PDF

18. Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology

Author: Daniel Neagu, Paul R. Trundle, Pritesh Mistry, and Jonathan D. Vessey
Subjects: 0301 basic medicine, Computer science, Process (engineering), business.industry, Decision tree, Computational intelligence, Computational toxicology, computer.software_genre, Machine learning, Theoretical Computer Science, Random forest, 03 medical and health sciences, 030104 developmental biology, Toxicity, Geometry and Topology, Data mining, Developmental Therapeutics Program, Artificial intelligence, business, computer, Software, Selection (genetic algorithm)
Abstract: Drug vehicles are chemical carriers that provide beneficial aid to the drugs they bear. Taking advantage of their favourable properties can potentially allow the safer use of drugs that are considered highly toxic. A means for vehicle selection without experimental trial would therefore be of benefit in saving time and money for the industry. Although machine learning is increasingly used in predictive toxicology, to our knowledge there is no reported work in using machine learning techniques to model drug-vehicle relationships for vehicle selection to minimise toxicity. In this paper we demonstrate the use of data mining and machine learning techniques to process, extract and build models based on classifiers (decision trees and random forests) that allow us to predict which vehicle would be most suited to reduce a drug's toxicity. Using data acquired from the National Institute of Health's (NIH) Developmental Therapeutics Program (DTP) we propose a methodology using an area under a curve (AUC) approach that allows us to distinguish which vehicle provides the best toxicity profile for a drug and build classification models based on this knowledge. Our results show that we can achieve prediction accuracies of 80 % using random forest models whilst the decision tree models produce accuracies in the 70 % region. We consider our methodology widely applicable within the scientific domain and beyond for comprehensively building classification models for the comparison of functional relationships between two variables.
Published: 2015
Full Text: View/download PDF

19. Towards model governance in predictive toxicology

Author: Paul R. Trundle, Anna Palczewska, Longzhi Yang, Kim Z. Travis, Daniel Neagu, Mick Ridley, and Xin Fu
Subjects: Engineering, Markup language, Knowledge management, Computer Networks and Communications, business.industry, Corporate governance, Library and Information Sciences, Reuse, Data governance, Risk analysis (engineering), Schema (psychology), New product development, Web application, Information governance, business, Information Systems
Abstract: Efficient management of toxicity information as an enterprise asset is increasingly important for the chemical, pharmaceutical, cosmetics and food industries. Many organisations focus on better information organisation and reuse, in an attempt to reduce the costs of testing and manufacturing in the product development phase. Toxicity information is extracted not only from toxicity data but also from predictive models. Accurate and appropriately shared models can bring a number of benefits if we are able to make effective use of existing expertise. Although usage of existing models may provide high-impact insights into the relationships between chemical attributes and specific toxicological effects, they can also be a source of risk for incorrect decisions. Thus, there is a need to provide a framework for efficient model management. To address this gap, this paper introduces a concept of model governance, that is based upon data governance principles. We extend the data governance processes by adding procedures that allow the evaluation of model use and governance for enterprise purposes. The core aspect of model governance is model representation. We propose six rules that form the basis of a model representation schema, called Minimum Information About a QSAR Model Representation (MIAQMR). As a proof-of-concept of our model governance framework we develop a web application called Model and Data Farm (MADFARM), in which models are described by the MIAQMR-ML markup language.
Published: 2013
Full Text: View/download PDF

20. HERMES: a FP7 funded project towards the development of a computer‐aided memory management system via intelligent computations

Author: Arjan Geven, Jianmin Jiang, Fouad Khelifi, and Paul R. Trundle
Subjects: Software, Memory management, Human–computer interaction, Computer science, business.industry, Computation, Rehabilitation, Pattern recognition (psychology), Computer-aided, Cognition, Speech processing, Semantics, business
Abstract: In this article, we introduce a new concept in HERMES, the FP7 funded project in Europe, in developing technology innovations towards computer aided memory management via intelligent computation, and helping elderly people to overcome their decline in cognitive capabilities. In this project, an integrated computer aided memory management system is being developed from a strong interdisciplinary perspective, which brings together knowledge from gerontology to software and hardware integration. State‐of‐the‐art techniques and algorithms for image, video and speech processing, pattern recognition, and semantic summarisation are illustrated, and the objectives and strategy for HERMES are described. Also, more details on the software that has been implemented are provided, along with the future development direction.
Published: 2009
Full Text: View/download PDF

21. Bipartite Network Model for Inferring Hidden Ties in Crime Data

Author: Daniel Neagu, Paul R. Trundle, and Haruna Isah
Subjects: Social and Information Networks (cs.SI), FOS: Computer and information sciences, Structure (mathematical logic), Physics - Physics and Society, Point (typography), Computer science, Law enforcement, FOS: Physical sciences, ComputingMilieux_LEGALASPECTSOFCOMPUTING, Computer Science - Social and Information Networks, Physics and Society (physics.soc-ph), Data science, Identification (information), Order (exchange), Bipartite graph, Crime data, Network analysis
Abstract: Certain crimes are hardly committed by individuals but carefully organised by a group of associates and affiliates loosely connected to each other, with a single or small group of individuals coordinating the overall actions. A common starting point in understanding the structural organisation of criminal groups is to identify the criminals and their associates. Situations arise in many criminal datasets where there is no direct connection among the criminals. In this paper, we investigate ties and community structure in crime data in order to understand the operations of both traditional and cyber criminals, as well as to predict the existence of organised criminal networks. Our contributions are twofold: we propose a bipartite network model for inferring hidden ties between actors who initiated an illegal interaction and objects affected by the interaction; we then validate the method in two case studies on pharmaceutical crime and underground forum data using standard network algorithms for structural and community analysis. The vertex level metrics and community analysis results obtained indicate the significance of our work in understanding the operations and structure of organised criminal networks which were not immediately obvious in the data. Identifying these groups and mapping their relationship to one another is essential in making more effective disruption strategies in the future. (8 pages)
Published: 2015
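
One common way to surface hidden ties in a bipartite actor-object network, as discussed in the abstract of item 21, is a one-mode projection: two actors are linked when they interacted with the same object. The sketch below is an assumption about how such an inference could look, not the authors' published method; the function name `infer_hidden_ties` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def infer_hidden_ties(interactions):
    """Project (actor, object) pairs onto actor-actor ties.

    Returns a dict mapping each sorted actor pair to the number of objects
    the two actors share. Assumption: a plain one-mode projection with
    shared-object counts as tie weights.
    """
    actors_by_object = defaultdict(set)
    for actor, obj in interactions:
        actors_by_object[obj].add(actor)

    ties = defaultdict(int)
    for actors in actors_by_object.values():
        # Every pair of actors touching the same object gets one unit of weight.
        for u, v in combinations(sorted(actors), 2):
            ties[(u, v)] += 1
    return dict(ties)


# Toy data: actors A, B, C and affected objects x, y.
interactions = [("A", "x"), ("B", "x"), ("B", "y"), ("C", "y"), ("A", "y")]
ties = infer_hidden_ties(interactions)
```

The resulting weighted actor graph could then be passed to standard community-detection or centrality algorithms, which is the kind of structural analysis the abstract mentions.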

22. Using computational methods for the prediction of drug vehicles

Author: Anna Palczewska, Pritesh Mistry, Paul R. Trundle, and Daniel Neagu
Subjects: Drug, Intrinsic activity, Computer science, business.industry, media_common.quotation_subject, Toxicity reduction, Drug vehicle, Toxicity, Developmental Therapeutics Program, Biochemical engineering, Drug carrier, business, media_common, Pharmaceutical industry
Abstract: Drug vehicles are chemical carriers that aid a drug's passage through an organism. Whilst they possess no intrinsic efficacy, they are designed to achieve desirable characteristics, which can include improving a drug's permeability and/or solubility, targeting a drug to a specific site or reducing a drug's toxicity, all of which are ideally achieved without compromising the efficacy of the drug. Whilst the majority of drug vehicle research is focused on the solubility and permeability issues of a drug, significant progress has been made on using vehicles for toxicity reduction. Achieving this can enable safer and more effective use of a potent drug against diseases such as cancer. From a molecular perspective, drugs activate or deactivate biochemical pathways through interactions with cellular macromolecules, resulting in toxicity. For newly developed drugs such pathways are not always clearly understood, but toxicity endpoints are still required as part of a drug's registration. An understanding of which vehicles could be used to ameliorate the unwanted toxicities of newly developed drugs would be highly desirable to the pharmaceutical industry. In this paper we demonstrate the use of different classifiers as a means to select vehicles best suited to avert a drug's toxic effects when no other information about a drug's characteristics is known. Through analysis of data acquired from the Developmental Therapeutics Program (DTP) we are able to establish a link between a drug's toxicity and vehicle used. We demonstrate that classification and selection of the appropriate vehicle can be made based on the similarity of drug choice.
Published: 2014
Full Text: View/download PDF

23. Human Memory Assistance through Semantic-Based Text Processing

Author: Jianmin Jiang and Paul R. Trundle
Subjects: Search terms, Memory management, Text processing, Multimedia, Computer science, Semantic interpretation, Human memory, Elderly people, Cognition, Context (language use), computer.software_genre, computer, Data science
Abstract: The proportion of elderly people across the world is predicted to increase significantly in the next 50 years. Tools to assist the elderly with remaining independent must be developed now to reduce the impact this will have on future generations. Technological solutions have the potential to alleviate some of the problems associated with old age, particularly those associated with the deterioration of memory. This paper proposes an algorithm for semantic-based text processing within the context of a cognitive care platform for older people, and an implementation of the algorithm used within the EU FP7 project HERMES is introduced. The algorithm facilitates computerised human-like memory management through semantic interpretation of everyday events and textual search terms, and the utilisation of human language lexical resources.
Published: 2009
Full Text: View/download PDF

24. A Memory Management System towards Cognitive Assistance of Elderly People

Author: Jianmin Jiang, Paul R. Trundle, and Fouad Khelifi
Subjects: Software, Memory management, Multimedia, Computer science, business.industry, Perspective (graphical), Computer-aided, Elderly people, Cognition, business, computer.software_genre, computer
Abstract: This paper describes technology innovations towards computer aided memory management via intelligent data processing, and helping elderly people to overcome their decline in cognitive capabilities. The system, which integrates the functionalities to be delivered by HERMES, the FP7 funded project in Europe, aims at assisting the user who suffers from memory decline due to aging with effective memory refreshment based on the correlation of textual, spoken, or visual data. In this project, the system is being developed from a strong interdisciplinary perspective, which brings together knowledge from gerontology to software and hardware implementation.
Published: 2009
Full Text: View/download PDF

25. A comparative study of machine learning algorithms applied to predictive toxicology data mining

Author: Gongde Guo, Paul R. Trundle, Daniel Neagu, and Mark T. D. Cronin
Subjects: Databases, Factual, Computer science, Trout, Feature selection, Toxicology, Machine learning, computer.software_genre, Quail, General Biochemistry, Genetics and Molecular Biology, Set (abstract data type), Phenols, Artificial Intelligence, Predictive Value of Tests, Toxicity Tests, Feature (machine learning), Animals, Selection (genetic algorithm), business.industry, Online machine learning, Reproducibility of Results, General Medicine, Bees, Class (biology), Data set, Medical Laboratory Technology, Range (mathematics), Daphnia, Data Interpretation, Statistical, Artificial intelligence, Data mining, business, computer, Algorithm, Algorithms
Abstract: This paper reports results of a comparative study of widely used machine learning algorithms applied to predictive toxicology data mining. The machine learning algorithms involved were chosen in terms of their representability and diversity, and were extensively evaluated with seven toxicity data sets which were taken from real-world applications. Some results based on visual analysis of the correlations of different descriptors to the class values of chemical compounds, and on the relationships of the range of chosen descriptors to the performance of machine learning algorithms, are emphasised from our experiments. Some interesting findings relating to the data and the quality of the models are presented — for example, that no specific algorithm appears best for all seven toxicity data sets, and that up to five descriptors are sufficient for creating classification models for each toxicity data set with good accuracy. We suggest that, for a specific data set, model accuracy is affected by the feature selection method and model development technique. Models built with too many or too few descriptors are undesirable, and finding the optimal feature subset appears at least as important as selecting appropriate algorithms with which to build a final model.
Published: 2007

26. Algorithms for (Q)SAR model building

Author: Paul R. Trundle, Daniel Neagu, Marco Pintore, Frank Lemke, Nadège Piclin, Qasim Chaudhry, Jacques R. Chrétien, Marian Craciun, Gongde Guo, and Johann-Adolf Müller
Subjects: Soft computing, Complex data type, Quantitative structure–activity relationship, Artificial neural network, business.industry, Linear model, Machine learning, computer.software_genre, Range (mathematics), Probabilistic method, Artificial intelligence, Data mining, business, computer, Algorithm, Model building, Mathematics
Abstract: The concept of mathematically relating biological activity with physicochemical properties of related chemical compounds emerged in the 1960s. Early quantitative structure–activity relationships (QSARs) were based on simple principles, such as substituent parameters, and linear mathematics. It was gradually realized that QSAR models based on such simplistic properties and statistical algorithms only worked well in certain well-defined situations. QSAR models for relatively simple sets of molecular data are still based on linear algorithms, but this approach has only a limited usefulness in finding multidimensional relational patterns in complex data sets. Linear models are also often hard to generalize across chemical classes and/or test species. This has led to the use of nonlinear algorithms and soft computing techniques, such as fuzzy systems, probabilistic methods, and artificial neural networks to decipher relational patterns in large, imprecise, and complex data sets. This shift in QSAR paradigm has made it possible to predict biological properties of a wide range of chemicals, which otherwise would be difficult, or impossible to determine experimentally.
Published: 2007
Full Text: View/download PDF

27. Development of Multi-output Neural Networks for Data Integration — A Case Study

Author: Qasim Chaudhry, Paul R. Trundle, Marian Craciun, and Daniel Neagu
Subjects: Neuro-fuzzy, Artificial neural network, business.industry, Computer science, Time delay neural network, Deep learning, computer.software_genre, Machine learning, Data type, Artificial intelligence, Data mining, Types of artificial neural networks, business, computer, Nervous system network models, Data integration
Abstract: Despite the wide variety of algorithms that exist to build predictive models, it can still be difficult to make accurate predictions for unknown values for certain types of data. New and innovative techniques are needed to overcome the problems underlying these difficulties for poor quality data, or data with a lack of available training cases. In this paper the authors propose a technique for integrating data from related datasets with the aim of improving the accuracy of predictions using Artificial Neural Networks. An overall improvement in the prediction power of models was shown when using the integration algorithm, when compared to models constructed using non-integrated data.
Published: 2007
Full Text: View/download PDF

28. Hybrid systems

Author: Frank Lemke, Marco Pintore, Johann-Adolf Müller, Severin Bumbaru, Jacques R. Chrétien, Daniel Neagu, Viorel Minzu, Gongde Guo, Silviu Augustin Stroia, Marian Craciun, Paul R. Trundle, Antonio Chana, Nicolas Amaury, Giuseppina Gini, and Emilio Benfenati
Subjects: Chemical descriptors, Quantitative structure–activity relationship, business.industry, Nonlinear model, Hybrid system, Artificial intelligence, business, Hybrid model, Mathematics
Abstract: Quantitative structure–activity relationship (QSAR) problems do not have, in general, linear solutions, and the problem is how to model those situations. Another consideration is that the nonlinear model should not be assumed but should emerge from data analysis. This chapter integrates the best models individually developed for each endpoint into a hybrid system for that endpoint. This has to be flexible to accept further inputs or modules, if available. Whereas inputs to the basic models are the chemical descriptors, input to the hybrid model are the n values predicted for each molecule by the n integrated models; the output is always the toxicity for that molecule. The basic theory behind the combinations, as well as the models obtained is illustrated.
Published: 2007
Full Text: View/download PDF


28 results on '"Paul R. Trundle"'

1. Vehicle Warranty Claim Prediction from Diagnostic Data Using Classification.

2. Classification of Heterogeneous Data Based on Data Type Impact on Similarity.

3. Bipartite Network Model for Inferring Hidden Ties in Crime Data.

4. Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology.

5. Social media analysis for product safety using text mining and sentiment analysis.

6. Using computational methods for the prediction of drug vehicles.

7. Human Memory Assistance through Semantic-Based Text Processing.

8. A Memory Management System towards Cognitive Assistance of Elderly People.

9. Development of Multi-output Neural Networks for Data Integration - A Case Study.

10. Multi-source Data Modelling: Integrating Related Data to Improve Model Performance.

11. Towards model governance in predictive toxicology.
