Author: "Iain Buchan" / Publisher: ieee - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Iain Buchan"' showing total 8 results

1. A fast and scalable high-throughput sequencing data error correction via oligomers

Author: Iain E. Buchan, Mattia C. F. Prosperi, and Franco Milicchio
Subjects: error correction, 0301 basic medicine, Computer science, 0206 medical engineering, Hash function, Inference, Genomics, 02 engineering and technology, computer.software_genre, De Bruijn graph, 03 medical and health sciences, symbols.namesake, Genetics, Artificial Intelligence, next generation sequencing, Health Informatics, Sanger sequencing, Agricultural and Biological Sciences (miscellaneous), Range (mathematics), 030104 developmental biology, Computational Mathematics, Scalability, symbols, Data mining, Error detection and correction, computer, 020602 bioinformatics, Biotechnology
Abstract: Next-generation sequencing (NGS) technologies have superseded the traditional Sanger sequencing approach in many experimental settings, given their tremendous yield and affordable cost. Nowadays it is possible to sequence any microbial organism or meta-genomic sample within hours, and to obtain a whole human genome in weeks. Nonetheless, NGS technologies are error-prone. Correcting errors is a challenge due to multiple factors, including the data sizes, the machine-specific and non-at-random characteristics of errors, and the error distributions. Errors in NGS experiments can hamper the subsequent data analysis and inference. This work proposes an error correction method based on the de Bruijn graph that can run on Gigabyte-sized data sets using normal desktop/laptop computers, ideal for genome sizes in the Megabase range, e.g. bacteria. The implementation makes extensive use of hashing techniques, and implements an A* algorithm for optimal error correction, minimizing the distance between an erroneous read and its possible replacement with the Needleman-Wunsch score. Our approach outperforms other popular methods both in terms of random access memory usage and computing times.
Published: 2016

2. Why Linked Data is Not Enough for Scientists

Author: John Ainsworth, Philip Couch, David Newman, David De Roure, Danius T. Michaelides, Mark Delderfield, Don Cruickshank, Shoaib Sufi, Ian Dunlop, Carole Goble, Matthew Gamble, Jiten Bhagat, Iain Buchan, Sean Bechhofer, Stuart Owen, and Paolo Missier
Subjects: Computer science, Computer Networks and Communications, media_common.quotation_subject, Context (language use), Cloud computing, Data publishing, 02 engineering and technology, Reuse, First class, Data modeling, World Wide Web, 03 medical and health sciences, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Quality (business), Research object, 030304 developmental biology, media_common, Publishing, 0303 health sciences, business.industry, Linked data, Sharing, Data science, Reproducibility, Data sharing, Hardware and Architecture, e-Science, Electronic publishing, Research Object, business, MyGrid, Software
Abstract: Scientific data represents a significant portion of the linked open data cloud and scientists stand to benefit from the data fusion capability this will afford. Publishing linked data into the cloud, however, does not ensure the required reusability. Publishing has requirements of provenance, quality, credit, attribution and methods to provide the reproducibility that enables validation of results. In this paper we make the case for a scientific data publication model on top of linked data and introduce the notion of Research Objects as first-class citizens for sharing and publishing. Highlights: We identify and characterise different aspects of reuse and reproducibility. We examine requirements for such reuse. We propose a scientific data publication model that layers on top of linked data publishing.
Published: 2016

3. Sharable simulations of public health for evidence based policy making

Author: Philip Couch, John Ainsworth, and Iain Buchan
Subjects: Functional programming, Context model, Knowledge management, Computer science, business.industry, Health care, Graph (abstract data type), Graphical model, business, Data science, Semantic Web, Data modeling, Evidence-based policy
Abstract: Local health policies are not as evidence based as they could be if the public health impacts of policies were easier to simulate. Here we address the inaccessibility of high quality models of public health and policy — presenting the concepts of a new simulation framework, IMPACT, built on Semantic Web principles. Model and simulation data are persisted with rich semantics and context to support sharing and interpretation. For this purpose, graph storage systems are explored alongside a new framework for mapping clinical data objects to graphical models. The computation employs functional programming for the parallelised simulation of locally representative populations/cohorts changing over time. The input data, model information and simulation results are mapped to social networks of policy making using the Work/Research Object and e-Lab paradigm that is emerging in E-Science.
Published: 2011

4. Shared genomics: A platform for emerging interpretation of genetic epidemiology

Author: Gareth Smith, Lee Kitching, P. Crowther, David C. Hoyle, Mark Delderfield, and Iain Buchan
Subjects: Annotation, Workflow, Genetic epidemiology, Process (engineering), Statistical genetics, Computer science, Interpretation (philosophy), Genomics, Web service, computer.software_genre, Data science, computer
Abstract: The study of the genetics of diseases has been revolutionised by the advent of genome-wide genotyping technologies. Increasingly, genome-wide association studies are being used to identify positions within the human genome that have a link with a disease condition. These new data sets require the use of distributed resources, both for the statistical analysis and for the interpretation of the analysis results. Aiding the latter will be crucial for the statistical analysis process to be successful. In this paper we report our experiences in developing a user-friendly High Performance Computing (HPC) statistical genetics analysis platform for use by clinical researchers. Specifically, we report work on supporting the interpretation process through the automatic annotation of the statistical analysis results with relevant biological information. Retrieval of the biological annotation is performed by high-volume invocation of multiple web services orchestrated via pre-existing scientific workflows. We also report work on developing tools to aid the capture and replay of the processes performed by a user when exploring analysis results.
Published: 2009

5. Federating health information systems to enable population level research

Author: Iain Buchan, John Ainsworth, and P. Crowther
Subjects: Information privacy, education.field_of_study, System of record, business.industry, Population, Data science, Health informatics, World Wide Web, Data extraction, Proof of concept, Health care, Medicine, business, education, Record linkage
Abstract: Epidemiology requires large-scale, high-resolution, representative population data sets; data extracted from electronic health record systems meets these criteria. However, within the UK, there is no single electronic health record, and the record of a patient's healthcare is fragmented over multiple systems and multiple organizations. In the SHORE project we have developed a proof of concept system that addresses these problems by retaining control of patient data at a local level, where it can be effectively interpreted and governed, and by overlaying on these data sources privacy-preserving record linkage providing a unified view of the health and care of the population.
Published: 2009

6. Shared Genomics: Accessible High Performance Computing for Genomic Medical Research

Author: Iain Buchan, David C. Hoyle, Mark Delderfield, Lee Kitching, and Gareth Smith
Subjects: Computer science, Genome-wide association study, Human genome, Genomics, Single-nucleotide polymorphism, Locus (genetics), Computational biology, Bioinformatics, Genotyping, Genome, DNA sequencing
Abstract: The study of the genetic causes of disease is entering a new era. Variations in DNA sequence between individuals at a single position (locus) within the human genome are termed single nucleotide polymorphisms (SNPs), and may lead to a frank disease state or a variation in normal physiology. By comparing and contrasting the genomes of people who have a disease with the genomes of people who don't, we can begin to identify those genetic loci which potentially play a role in the disease. Modern biotechnology allows for the genotyping of individuals at hundreds of thousands of genetic loci. Whilst metrics to quantify the statistical importance of a single locus are essentially of low complexity, for example calculation of a χ² statistic, within a genome-wide association study this process is repeated at every locus. In addition, the entire computational process is often repeated with a number of randomised data sets, necessary for estimation of the statistical significance. The large number of loci, the number of randomised data sets, and the rapid combinatorial increase when analysing multiple SNPs naturally dictate that a high performance computing (HPC) solution be developed. On a single-core machine, analysis of significant numbers of SNP pairs would take many years. Once statistical analysis of the data has been performed, results must be annotated with relevant information to aid biological interpretation and hypothesis generation; this is a standard, but not insubstantial, bioinformatic task.
Published: 2008

7. Experience in e-Science Requirements Engineering

Author: Rob Procter, Alistair Sutcliffe, O. de Bruijn, Sarah Thew, Colin C. Venters, John McNaught, and Iain Buchan
Subjects: Vision, Geographic information system, Knowledge management, Requirements engineering, Computer science, business.industry, Cognition, computer.software_genre, Systems analysis, Tacit knowledge, Middleware (distributed applications), e-Science, business, computer
Abstract: We describe the experience of using a combination of requirements engineering techniques (scenarios, storyboards, observation and workshops) in an e-science application to develop a geographical analysis tool for epidemiologists. Problems encountered were: eliciting tacit knowledge; and creating new visions and working practices for our users. The combination of techniques worked well, although observation of working practice was not so effective in this scientific domain, where activity is mainly cognitive.
Published: 2008

8. PsyGrid: Applying e-Science to Epidemiology

Author: I. Juma, Robert Harper, John Ainsworth, and Iain Buchan
Subjects: education.field_of_study, Data grid, Process (engineering), Computer science, Population, computer.software_genre, Data science, Semantic grid, Grid computing, Utility computing, Cohort, e-Science, education, computer
Abstract: The process of hypothesis-driven epidemiological research has three phases - the establishment and characterisation of a large, representative cohort from a geographically distributed population; the integration of the cohort data with other data sources to provide additional characterisation; the formulation of a hypothesis and generation of the corresponding predictions. Grid-computing technologies make possible secure, distributed collaboration, and the ability to share data sources, computational resources and storage resources across administrative boundaries. PsyGrid is an e-Science project established to apply grid-computing technologies to each of the three phases, with the aim of eliminating the obstacles that hinder epidemiological research. We describe a system for distributed cohort characterisation, and the first application to the study of First Episode Psychosis.
Published: 2006
