44 results on '"Tim Todman"'
Search Results
2. Non-deterministic event brokered computing
- Author
-
Andrew Brown, Tim Todman, Wayne Luk, David Thomas, Mark Vousden, Graeme Bragg, Jonny Beaumont, Simon Moore, Alex Yakovlev, and Ashur Rafiev
- Published
- 2022
- Full Text
- View/download PDF
3. Custom enhancements to networked processor templates
- Author
-
Tim Todman and Wayne Luk
- Subjects
Very-large-scale integration ,Task (computing) ,Template ,Software ,Parallel processing (DSP implementation) ,Computer science ,business.industry ,Embedded system ,Hardware acceleration ,Scalable parallelism ,Field-programmable gate array ,business
- Abstract
Processor templates are a well-established way to design for FPGA technology, easing the task of implementation by reducing it to choosing a template and writing software for it – while avoiding the need for hardware design experience and circumventing the installation and execution of FPGA design tools. Networked processor templates allow designers to achieve scalable parallelism by covering a network of processors, while retaining the simplicity of using processor templates. This paper proposes techniques to improve the performance of networked processor templates, by adding custom instructions to processors in the network. An approach has been developed to systematically choose parts of a design to implement as custom instructions. Performance and area models have also been devised to allow the prediction of performance and area usage of a design targeting a particular FPGA, enabling design trade-offs to be explored prior to implementation. The proposed approach has been evaluated on various applications including N-body simulation and dissipative particle dynamics, demonstrating its potential of delivering hardware acceleration based on custom instructions targeting state-of-the-art FPGAs.
- Published
- 2021
- Full Text
- View/download PDF
4. Exploring performance enhancement of event-driven processor networks
- Author
-
David B. Thomas, Tim Todman, and Wayne Luk
- Subjects
Pipeline transport ,Flow (mathematics) ,Event (computing) ,Computer science ,Design flow ,Real-time computing ,Electronic design automation ,Performance enhancement
- Abstract
Event-driven processor networks have been proposed as an effective way of exploiting recent advances in field-programmable technology. This paper explores an approach to enhancing the performance of event-driven processor networks for specific applications: attaching to the processor network accelerators with custom-designed logic. We present a design flow of this approach, and apply the flow to a heatplate application.
- Published
- 2020
- Full Text
- View/download PDF
5. Artisan: a meta-programming approach for codifying optimisation strategies
- Author
-
Eriko Nurvitadhi, Tim Todman, Jessica Vandebon, Jose G. F. Coutinho, and Wayne Luk
- Subjects
Technology ,Speedup ,Science & Technology ,Computer science ,business.industry ,Maintainability ,Multiple applications ,020207 software engineering ,02 engineering and technology ,Python (programming language) ,Metaprogramming ,020202 computer hardware & architecture ,Computer Science ,0202 electrical engineering, electronic engineering, information engineering ,Software engineering ,business ,Field-programmable gate array ,Computer Science, Hardware & Architecture ,computer ,computer.programming_language
- Abstract
This paper provides a novel compilation approach that addresses the complexity of mapping high-level descriptions to heterogeneous platforms, improving design productivity and maintainability. Our approach is based on a co-design methodology decoupling functional concerns from optimisation concerns, allowing two separate descriptions to be independently maintained by two types of programmers: application experts focus on algorithmic behaviour, while platform experts focus on the mapping process. Our approach supports two key requirements: (1) Customisable optimisations to rapidly capture a wide range of mapping strategies, and (2) Reusable strategies to allow optimisations to be described once and applied to multiple applications. To evaluate our approach, we develop Artisan, a meta-programming tool for codifying optimisation strategies using a high-level general-purpose programming language (Python 3), offering full design-flow orchestration of key components (source-code, third-party tools, and platforms). We evaluate Artisan using three case study applications and three reusable optimisation strategies, achieving at least 24 times speedup for each application on CPU and FPGA targets with little application developer effort.
- Published
- 2020
6. Lossy Multiport Memory
- Author
-
Wenguang Xu, Bowen P. Y. Kwan, Gary C.T. Chow, Tim Todman, and Wayne Luk
- Subjects
High memory ,Memory bank ,Address space ,Computer science ,Scalability ,Memory architecture ,0202 electrical engineering, electronic engineering, information engineering ,02 engineering and technology ,Parallel computing ,Lossy compression ,Field-programmable gate array ,020202 computer hardware & architecture ,Data compression
- Abstract
Supporting a high level of parallelism for statistical algorithms on FPGAs is often hindered by the uncertainty of random memory access and the associated difficulty of scheduling at runtime. By exploiting the statistical properties to tolerate memory request reordering and droppage, speed enhancement and resource reduction of the design can be optimized, leading to a more efficient and parallelizable design. This paper introduces a novel lossy multiport memory capable of high memory bandwidth, providing concurrent accesses to a single address space through multiple ports. The proposed architecture contains parallel memory banks connected by lossy switch networks to multiple input ports and local ring buffers. For 4 parallel read/write ports, our design reduces BRAM usage by 68% while having the operating frequency increased by 50% as compared to state-of-the-art memory designs. The drop rate of the design is 2% under full port utilization, and is reducible without altering the architecture at runtime. With a simple and scalable structure, this memory architecture can be scaled up to 64 parallel read/write ports and beyond, which outperforms most of the existing designs. Experiments show that this lossy memory can reduce slice usage by 90.8 times for random forest training and data compression, with a reduction in accuracy of only 2%.
- Published
- 2018
- Full Text
- View/download PDF
7. Using Statistical Assertions to Guide Self-Adaptive Systems
- Author
-
Stephan C. Stilkerich, Wayne Luk, and Tim Todman
- Subjects
Class (computer programming) ,lcsh:Computer engineering. Computer hardware ,Theoretical computer science ,Article Subject ,Cover (telecommunications) ,Computer science ,Response time ,lcsh:TK7885-7895 ,Self adaptive ,02 engineering and technology ,Software implementation ,020202 computer hardware & architecture ,Resource (project management) ,Computer engineering ,Hardware and Architecture ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Relevance (information retrieval) ,Pairwise comparison
- Abstract
Self-adaptive systems need to monitor themselves, to check their internal behaviour and design assumptions about runtime inputs and conditions. This kind of monitoring for self-adaptive systems can include collecting statistics about such systems themselves which can be computationally intensive (for detailed statistics) and hence time consuming, with possible negative impact on self-adaptive response time. To mitigate this limitation, we extend the technique of in-circuit runtime assertions to cover statistical assertions in hardware. The presented designs implement several statistical operators that can be exploited by self-adaptive systems; a novel optimization is developed for reducing the number of pairwise operators from O(N) to O(log N). To illustrate the practicability and industrial relevance of our proposed approach, we evaluate our designs, chosen from a class of possible application scenarios, for their resource usage and the tradeoffs between hardware and software implementations.
- Published
- 2014
- Full Text
- View/download PDF
8. In-Circuit Assertions and Exceptions for Reconfigurable Hardware Design
- Author
-
Tim Todman and Wayne Luk
- Subjects
010302 applied physics ,Speedup ,Low overhead ,Computer science ,Assertion ,02 engineering and technology ,01 natural sciences ,Reconfigurable computing ,020202 computer hardware & architecture ,Computer engineering ,0103 physical sciences ,Lookup table ,0202 electrical engineering, electronic engineering, information engineering ,Field-programmable gate array ,Formal verification ,Block (data storage)
- Abstract
We present an approach to enable run-time, in-circuit assertions and exceptions in reconfigurable hardware designs. Static, compile-time checking, including formal verification, can catch many errors before a reconfigurable design is implemented. However, many other errors cannot be caught by static approaches, including those due to run-time data. Our approach allows users to add run-time assertions and exceptions to a design, giving multiple ways to handle run-time errors. We also allow imprecise assertions and exceptions, so that the origin of a failed assertion or raised exception is blurred. Users can take advantage of exception imprecision to trade performance for accurate location of errors. Our work includes a high-level approach to adding assertions and exceptions to a design, a concrete implementation for Maxeler streaming designs, and an evaluation. Results show low overhead for supporting assertions and exceptions in hardware design targeting FPGAs. For example, the cost of including assertions lies between 5% in lookup tables and 15% in Block RAMs in addition to the area used by the original design, due to logic used to implement assertion conditions, and buffers used to store assertion results. Furthermore, imprecision gives immediate benefits and up to 48% speedup over precise exceptions.
- Published
- 2017
- Full Text
- View/download PDF
9. Transparent In-Circuit Assertions for FPGAs
- Author
-
Tim Todman, Eddie Hung, Wayne Luk, Engineering & Physical Science Research Council (EPSRC), Commission of the European Communities, and Engineering & Physical Science Research Council (E
- Subjects
Technology ,Computer Hardware & Architecture ,Computer science ,Exception handling ,Logic simulation ,0102 computer and information sciences ,02 engineering and technology ,Circuits and systems ,01 natural sciences ,Programmable logic array ,electronic design automation and methodology ,Engineering ,VHDL ,design automation ,0202 electrical engineering, electronic engineering, information engineering ,reconfigurable logic ,logic design ,Electrical and Electronic Engineering ,Field-programmable gate array ,Computer Science, Hardware & Architecture ,Register-transfer level ,computer.programming_language ,field programmable gate arrays ,1006 Computer Hardware ,Science & Technology ,business.industry ,Hardware description language ,0906 Electrical And Electronic Engineering ,Engineering, Electrical & Electronic ,Computer Graphics and Computer-Aided Design ,020202 computer hardware & architecture ,010201 computation theory & mathematics ,Embedded system ,Computer Science ,Software design ,Computer Science, Interdisciplinary Applications ,Place and route ,business ,computer ,integrated circuits ,Software
- Abstract
Commonly used in software design, assertions are statements placed into a design to ensure that its behavior matches that expected by a designer. Although assertions apply equally to hardware design, they are typically supported only for logic simulation, and discarded prior to physical implementation. We propose a new hardware design language-agnostic language for describing latency-insensitive assertions and novel methods to add such assertions transparently to an already placed-and-routed circuit without affecting the existing design. We also describe how this language and associated methods can be used to implement semi-transparent exception handling. The key to this paper is that by treating hardware assertions and exceptions as being oblivious or less sensitive to latency, assertion logic need only use spare FPGA resources. We use network-flow techniques to route necessary signals to assertions via spare flip-flops, eliminating any performance degradation, even on large designs (92% of slices in one test). Experimental evaluation shows zero impact on critical-path delay, even on large benchmarks operating above 200 MHz, at the cost of a small power penalty.
- Published
- 2016
10. Optimizing Hardware Design by Composing Utility-Directed Transformations
- Author
-
Qiang Liu, George A. Constantinides, Tim Todman, and Wayne Luk
- Subjects
business.industry ,Computer science ,Transformation (music) ,Theoretical Computer Science ,Logic synthesis ,Computational Theory and Mathematics ,Parallel processing (DSP implementation) ,Computer architecture ,Hardware and Architecture ,Gate array ,business ,Field-programmable gate array ,Software ,Computer hardware ,Compile time
- Abstract
Utility-directed transformations involve changing a design to optimize for given constraints while preserving behavior. These changes are often achieved by techniques such as linear programming or geometric programming. We present a systematic approach composing multiple utility-directed transformations for optimizing and mapping a sequential design onto a customizable parallel computing platform such as a Field-Programmable Gate Array (FPGA). Our aim is to enable automatic design optimization at compile time. Design goals specified by users drive the design transformations. Each utility-directed transformation achieves part of the overall goal, and multiple utility-directed transformations, connected by pattern-directed transformations, are composed to fulfill the overall design requirements. The utility-directed transformations in this work produce performance-optimized designs by exploiting data reuse, MapReduce, and pipelining for the target parallel computing platform. Moreover, it is shown that performing transformations in different orders allows users to trade speed for resources, and design performance for compile time. Several applications are used to evaluate this approach on FPGAs. The system performance of a 64-bit matrix multiplication is shown to improve up to 98 times compared to the original design, in the target hardware platform.
- Published
- 2012
- Full Text
- View/download PDF
11. Field‐programmable gate arrays and quantum Monte Carlo: Power efficient coprocessing for scalable high‐performance computing
- Author
-
Wayne Luk, Hugh G. A. Burton, Jonathan R. R. Kimmitt, Alex J. W. Thom, Tim Todman, Shurui Li, and Salvatore Cardamone
- Subjects
Physics ,010304 chemical physics ,Quantum Monte Carlo ,Power efficient ,Condensed Matter Physics ,Supercomputer ,01 natural sciences ,Atomic and Molecular Physics, and Optics ,Computational science ,0103 physical sciences ,Scalability ,Variational Monte Carlo ,Physical and Theoretical Chemistry ,010306 general physics ,Field-programmable gate array
- Published
- 2019
- Full Text
- View/download PDF
12. Automated Mapping of the MapReduce Pattern onto Parallel Computing Platforms
- Author
-
Tim Todman, George A. Constantinides, Wayne Luk, and Qiang Liu
- Subjects
Computer science ,Pipeline (computing) ,Parallel computing ,Theoretical Computer Science ,Tree structure ,Hardware and Architecture ,Control and Systems Engineering ,Gate array ,Modeling and Simulation ,Signal Processing ,Pattern recognition (psychology) ,Parallelism (grammar) ,Data-intensive computing ,Geometric programming ,Field-programmable gate array ,Information Systems
- Abstract
The MapReduce pattern can be found in many important applications, and can be exploited to significantly improve system parallelism. Unlike previous work, in which designers explicitly specify how to exploit the pattern, we develop a compilation approach for mapping applications with the MapReduce pattern automatically onto Field-Programmable Gate Array (FPGA) based parallel computing platforms. We formulate the problem of mapping the MapReduce pattern to hardware as a geometric programming model; this model exploits loop-level parallelism and pipelining to give an optimal implementation on given hardware resources. The approach is capable of handling single and multiple nested MapReduce patterns. Furthermore, we explore important variations of MapReduce, such as using a linear structure rather than a tree structure for merging intermediate results generated in parallel. Results for six benchmarks show that our approach can find performance-optimal designs in the design space, improving system performance by up to 170 times compared to the initial designs on the target platform.
- Published
- 2010
- Full Text
- View/download PDF
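The abstract of result 12 contrasts a tree structure with a linear structure for merging the intermediate results of a MapReduce pattern. The following is a minimal software sketch of that contrast only, not the paper's compilation approach or its geometric programming model. The function names `linear_merge` and `tree_merge` and the sum-of-squares workload are assumptions made for illustration.

```python
from functools import reduce
from operator import add


def linear_merge(values, op):
    """Merge a non-empty list left to right.

    Returns (result, sequential_steps); each merge waits for the previous one.
    """
    result = reduce(op, values)
    return result, len(values) - 1


def tree_merge(values, op):
    """Merge a non-empty list pairwise, level by level.

    Returns (result, levels); merges within one level are independent,
    so hardware could perform them in parallel.
    """
    level = list(values)
    levels = 0
    while len(level) > 1:
        # Combine neighbours; an unpaired last element is carried forward.
        paired = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])
        level = paired
        levels += 1
    return level[0], levels


# Map phase (hypothetical workload): square each input independently.
partials = [x * x for x in range(8)]

# Reduce phase: same result, different merge depth.
linear_result, linear_steps = linear_merge(partials, add)
tree_result, tree_levels = tree_merge(partials, add)
```

For eight partial results the linear form needs seven dependent merges while the tree form needs three levels, which is the kind of latency versus resource trade-off such a variation exposes.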
13. Self-aware Hardware Acceleration of Financial Applications on a Heterogeneous Cluster
- Author
-
Mark Salmon, Stewart Denholm, Ce Guo, Tobias Becker, Andreea-Ingrid Funie, Tim Todman, Maciej Kurek, and Wayne Luk
- Subjects
Finance ,Fitness function ,Speedup ,Computer science ,business.industry ,Software design pattern ,Hardware acceleration ,Genetic programming ,Field-programmable gate array ,business ,Hill climbing ,Reconfigurable computing
- Abstract
This chapter describes self-awareness in four financial applications. We apply some of the design patterns of Chapter 5 and techniques of Chapter 7. We describe three applications briefly, highlighting the links to self-awareness and self-expression. The applications are (i) a hybrid genetic programming and particle swarm optimisation approach for high-frequency trading, with fitness function evaluation accelerated by FPGA; (ii) an adaptive point process model for currency trading, accelerated by FPGA hardware; (iii) an adaptive line arbitrator synthesising high-reliability and low-latency feeds from redundant data feeds (A/B feeds) using FPGA hardware. Finally, we describe in more detail a generic optimisation approach for reconfigurable designs automating design optimisation, using reconfigurable hardware to speed up the optimisation process, applied to applications including a quadrature-based financial application. In each application, the hardware-accelerated self-aware approaches give significant benefits: up to 55× speedup for hardware-accelerated design optimisation compared to software hill climbing.
- Published
- 2016
- Full Text
- View/download PDF
14. Self-adaptive Hardware Acceleration on a Heterogeneous Cluster
- Author
-
Wayne Luk, Tim Todman, and Xinyu Niu
- Subjects
Software ,Computer science ,business.industry ,Distributed computing ,Message Passing Interface ,Cluster (physics) ,Hardware acceleration ,Symmetric multiprocessor system ,Heterogeneous cluster ,business ,Throughput (business) ,Telecommunications network
- Abstract
Building a cluster of computers is a common technique to significantly improve the throughput of computationally intensive applications. Communication networks connect hundreds to thousands of compute nodes to form a cluster system, where a parallelisable application workload is distributed into the compute nodes. Theoretically, heterogeneous clusters with various types of processing units are more efficient than homogeneous clusters, since some types of processing units perform better than others on certain applications. A heterogeneous cluster can achieve better cluster performance by adapting cluster configurations to assign applications to processing elements that fit well with the applications. In this chapter we describe how to build a heterogeneous cluster that can adapt to application requirements. Section 9.1 provides an overview of heterogeneous computing. Section 9.2 presents the commonly used hardware and software architectures of heterogeneous clusters. Section 9.3 discusses the use of self-awareness and self-adaptivity in two runtime scenarios of a heterogeneous cluster, and Section 9.4 presents the experimental results. Finally, Section 9.5 discusses approaches to formally verify the developed applications.
- Published
- 2016
- Full Text
- View/download PDF
15. EXTRA : towards the exploitation of eXascale technology for reconfigurable architectures
- Author
-
Ana Lucia Varbanescu, Amit Kulkarni, Dirk Stroobandt, Andreas Brokalakis, Tobias Becker, Wayne Luk, Xinyu Niu, Michael Huebner, Antonis Nikitakis, Elias Vansteenkiste, Muhammed Al Kadi, Donatella Sciuto, Tim Todman, Dionisios Pnevmatikatos, Alex J. W. Thom, George Charitopoulos, Marco D. Santambrogio, Georgi Gaydadjiev, Catalin Bogdan Ciobanu, and Commission of the European Communities
- Subjects
Open platform ,Technology and Engineering ,Computer science ,Computer Networks and Communications ,Overhead (engineering) ,02 engineering and technology ,reconfigurable platform ,3301 Architecture ,Hardware and Architecture ,Competitive advantage ,Field (computer science) ,4009 Electronics, Sensors and Digital Hardware ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,QUANTUM MONTE-CARLO ,33 Built Environment and Design ,Field-programmable gate array ,FPGA ,40 Engineering ,business.industry ,EXTRA ,Control reconfiguration ,Chip ,Supercomputer ,020202 computer hardware & architecture ,Computer architecture ,Embedded system ,HPC ,exascale ,3303 Design ,business ,Run-time reconfiguration
- Abstract
© 2016 IEEE. To handle the stringent performance requirements of future exascale-class applications, High Performance Computing (HPC) systems need ultra-efficient heterogeneous compute nodes. To reduce power and increase performance, such compute nodes will require hardware accelerators with a high degree of specialization. Ideally, dynamic reconfiguration will be an intrinsic feature, so that specific HPC application features can be optimally accelerated, even if they regularly change over time. In the EXTRA project, we create a new and flexible exploration platform for developing reconfigurable architectures, design tools and HPC applications with run-time reconfiguration built-in as a core fundamental feature instead of an add-on. EXTRA covers the entire stack from architecture up to the application, focusing on the fundamental building blocks for run-time reconfigurable exascale HPC systems: new chip architectures with very low reconfiguration overhead, new tools that truly take reconfiguration as a central design concept, and applications that are tuned to maximally benefit from the proposed run-time reconfiguration techniques. Ultimately, this open platform will improve Europe's competitive advantage and leadership in the field.
- Published
- 2016
16. In-circuit temporal monitors for runtime verification of reconfigurable designs
- Author
-
Stephan C. Stilkerich, Wayne Luk, and Tim Todman
- Subjects
High-level verification ,business.industry ,Computer science ,Runtime verification ,Reconfigurable computing ,Intelligent verification ,Software ,Computer architecture ,Hardware register ,Embedded system ,Hardware compatibility list ,business ,Field-programmable gate array ,Shift register ,Register-transfer level ,Electronic circuit
- Abstract
We present designs for in-circuit monitoring of custom hardware designs implemented in reconfigurable hardware. The monitors check hardware designs against temporal logic specifications. Compared to previous work, which uses custom hardware to monitor software, our designs can run at higher speeds and make better use of hardware resources, such as shift registers and embedded memory blocks. We evaluate our monitor circuits on example hardware designs targeting FPGA implementation, showing that they have low overhead in terms of circuit area, and can run at the same speed as the circuits they monitor.
- Published
- 2015
- Full Text
- View/download PDF
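The abstract of result 16 describes monitors that check designs against temporal logic specifications and make use of shift registers. As a rough software model of that idea, the sketch below checks one bounded-response property ("every request is acknowledged within k cycles") with a k-bit shift register of outstanding requests. The property, the function name `bounded_response_monitor` and the trace format are assumptions for illustration; the paper's actual monitor circuits are not reproduced here.

```python
def bounded_response_monitor(trace, k):
    """Check that every request is acknowledged within k cycles.

    trace: iterable of (req, ack) boolean pairs, one pair per clock cycle.
    k: positive bound on the number of cycles a request may wait.
    Returns the cycle index of the first violation, or None if none occurs.
    """
    # pending[i] is True when a request has been waiting i + 1 cycles.
    pending = [False] * k
    for cycle, (req, ack) in enumerate(trace):
        if ack:
            # An acknowledgement satisfies every outstanding request.
            pending = [False] * k
        # The bit about to leave the register has waited k cycles unanswered.
        expired = pending[-1]
        # Shift: a new request enters unless it is acknowledged immediately.
        pending = [req and not ack] + pending[:-1]
        if expired:
            return cycle
    return None


# Hypothetical traces: the first acknowledges in time, the second never does.
ok_trace = [(True, False), (False, False), (False, True)]
bad_trace = [(True, False), (False, False), (False, False)]

ok_result = bounded_response_monitor(ok_trace, 2)
bad_result = bounded_response_monitor(bad_trace, 2)
```

The per-cycle work is a clear, a shift and a test of one bit, which is why this style of monitor maps naturally onto a hardware shift register running at the clock rate of the monitored circuit.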
17. Customisable Hardware Compilation
- Author
-
Wayne Luk, Jose G. F. Coutinho, and Tim Todman
- Subjects
business.industry ,Computer science ,Pipeline (computing) ,Metalanguage ,computer.software_genre ,Theoretical Computer Science ,High-level design ,Resource (project management) ,Imperative programming ,Computer architecture ,Hardware and Architecture ,High-level programming language ,Compiler ,business ,computer ,Software ,Computer hardware ,Information Systems ,Abstraction (linguistics)
- Abstract
Hardware compilers for high-level languages are increasingly recognised to be the key to reducing the productivity gap for advanced circuit development in general, and for reconfigurable designs in particular. This paper explains how customisable frameworks for hardware compilation can enable rapid design exploration, and reusable and extensible hardware optimisation. It describes such a framework, based on a parallel imperative language, which supports multiple levels of design abstraction, transformational development, optimisation by compiler passes, and metalanguage facilities. Our approach has been used in producing designs for applications such as signal and image processing, with different trade-offs in performance and resource usage.
- Published
- 2005
- Full Text
- View/download PDF
18. FASTER: Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration
- Author
-
Dionisios Pnevmatikatos, Tobias Becker, Christian Pilato, Dirk Stroobandt, Elias Vansteenkiste, Andreas Brokalakis, Donatella Sciuto, Ioannis Papaefstathiou, Danilo Pau, Peter Böhm, Georgi Gaydadjiev, Karel Bruneel, Catalin Bogdan Ciobanu, Wayne Luk, Marco D. Santambrogio, Tim Todman, Xinyu Niu, Karel Heyse, Tom Davidson, Kyprianos Papadimitriou, and Oliver Pell
- Subjects
Partial reconfiguration ,Computer science ,business.industry ,Computer Networks and Communications ,Reconfigurable computing ,Micro-reconfiguration ,Verification ,Control reconfiguration ,Runtime system ,Acceleration ,Software ,Computer architecture ,Dynamic reconfiguration ,Adaptive computing,Configurable computing systems,Reconfigurable computing systems,adaptive computing systems,adaptive computing,configurable computing systems,reconfigurable computing systems ,Hardware and Architecture ,Artificial Intelligence ,Embedded system ,business ,Field-programmable gate array
- Abstract
The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) EU FP7 project aims to ease the design and implementation of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving high performance and extending product functionality and lifetime via the addition of new features that operate at hardware speed. However, designing a changing hardware system is both challenging and time-consuming. FASTER facilitates the use of reconfigurable technology by providing a complete methodology enabling designers to easily specify, analyze, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. Our tool-chain supports both coarse- and fine-grain FPGA reconfiguration, while during execution a flexible run-time system manages the reconfigurable resources. We target three applications from different domains. We explore the way each application benefits from reconfiguration, and then we assess them and the FASTER tools in terms of performance, area consumption and accuracy of analysis. Presented on: Microprocessors and Microsystems
- Published
- 2015
19. Transparent insertion of latency-oblivious logic onto FPGAs
- Author
-
Tim Todman, Eddie Hung, and Wayne Luk
- Subjects
Computer science ,business.industry ,Embedded system ,Spare part ,Clock rate ,Latency (engineering) ,business ,Field-programmable gate array ,Electronic circuit
- Abstract
We present an approach for inserting latency-oblivious functionality into pre-existing FPGA circuits transparently. To ensure transparency — that such modifications do not affect the design's maximum clock frequency — we insert any additional logic post place-and-route, using only the spare resources that were not consumed by the pre-existing circuit. The typical challenge with adding new functionality into existing circuits incrementally is that spare FPGA resources to host this functionality must be located close to the input signals that it requires, in order to minimise the impact of routing delays. In congested designs, however, such co-location is often not possible. We overcome this challenge by using flow techniques to pipeline and route signals from where they originate, potentially in a region of high resource congestion, into a region of low congestion capable of hosting new circuitry, at the expense of latency. We demonstrate and evaluate our approach by augmenting realistic designs with self-monitoring circuitry, which is not sensitive to latency. We report results on circuits operating over 200MHz and show that our insertions have no impact on timing, are 2–4 times faster than compile-time insertion, and incur only a small power overhead.
- Published
- 2014
- Full Text
- View/download PDF
20. Runtime assertions and exceptions for streaming systems
- Author
-
Wayne Luk and Tim Todman
- Subjects
Low overhead ,business.industry ,Computer science ,Programming language ,Exception handling ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Reconfigurable computing ,020202 computer hardware & architecture ,Work (electrical) ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,business ,Formal verification ,computer
- Abstract
We present an approach to enable run-time, in-circuit assertions and exceptions in reconfigurable hardware designs. Static, compile-time checking, including formal verification, can catch many errors before a reconfigurable design is implemented. However, many other errors cannot be caught by static approaches, including those due to run-time data. Our approach allows users to add run-time assertions and exceptions to a design, giving multiple ways to handle run-time errors. Our work includes an abstract approach to adding assertions and exceptions to a design, a concrete implementation for Maxeler streaming designs, and an evaluation. Results show low overhead for adding exceptions to a design.
- Published
- 2013
- Full Text
- View/download PDF
21. Verification of streaming hardware and software codesigns
- Author
-
Wayne Luk, Tim Todman, and Peter Boehm
- Subjects
Hardware architecture ,business.industry ,Computer science ,Software construction ,Software development ,Hardware compatibility list ,Hardware acceleration ,Software verification and validation ,business ,Formal methods ,Computer hardware ,Software verification
- Abstract
We present an approach to verifying the codesign of software and hardware. Our approach verifies that a reference design, perhaps a straightforward software implementation, is equivalent to a design combining software and reconfigurable hardware, possibly using runtime reconfiguration. Our approach combines symbolic simulation with equivalence checking to compare symbolic output expressions. Whilst our implementation uses C-style software and streaming hardware based on Maxeler designs, our approach is modular and could generalize to other software or hardware inputs. We evaluate our approach by applying it to several kernels, including one used for geoengineering.
- Published
- 2012
- Full Text
- View/download PDF
22. Verification of streaming designs by combining symbolic simulation and equivalence checking
- Author
-
Wayne Luk and Tim Todman
- Subjects
Model checking ,Theoretical computer science ,Logic synthesis ,Computer science ,Symbolic trajectory evaluation ,Formal equivalence checking ,Symbolic simulation ,Equivalence (formal languages) ,Field-programmable gate array ,Bottleneck
- Abstract
As design complexity grows, verification becomes a bottleneck in design development and implementation. This paper describes a novel approach for verifying reconfigurable streaming designs, based on symbolic simulation and equivalence checking. Compared with numerical simulation, symbolic simulation provides a more informative way of showing a design behaved as expected; equivalence checking enables automatic checking of equivalence of symbolic expressions. Our approach has been implemented for designs targeting Maxeler technologies, using an easy-to-use symbolic simulator and the Yices equivalence checker, together with other facilities such as an output combiner to support an automated verification flow. Several benchmarks, including one-dimensional convolution and finite difference computation, are used to evaluate the proposed approach.
- Published
- 2012
- Full Text
- View/download PDF
23. Reconfigurable Design Automation by High-Level Exploration
- Author
-
Tim Todman and Wayne Luk
- Subjects
Logic synthesis ,Computer architecture ,Design space exploration ,Computer science ,High-level synthesis ,Electronic design automation ,Field-programmable gate array ,Reconfigurable computing ,Selection (genetic algorithm) - Abstract
This paper describes a novel approach for design automation of general-purpose reconfigurable computing applications, which combines design space exploration with transformation-based high-level feedback of performance results obtained from a detailed implementation. This approach enhances effectiveness of high-level exploration by using performance estimates to guide the selection of applicable transformations. The impact of the transformations on metrics such as speed and area is pre-characterised, so that appropriate transformations likely to contribute to meeting design requirements are selected. A prototype system supporting this approach has been developed, and promising results have been obtained.
- Published
- 2012
- Full Text
- View/download PDF
24. Smart technologies for effective reconfiguration: The FASTER approach
- Author
-
Dionisios Pnevmatikatos, Kyprianos Papadimitriou, Dirk Stroobandt, Donatella Sciuto, Christian Pilato, Andrea Cazzaniga, Georgi Gaydadjiev, A. Bonetto, Wayne Luk, Tobias Becker, Marco D. Santambrogio, Tim Todman, Tom Davidson, Gianluca Durelli, Indrusiak, LS, Gogniat, G, and Voros, N
- Subjects
010302 applied physics ,Technology and Engineering ,dynamic reconfiguration ,business.industry ,Computer science ,Control reconfiguration ,System requirements specification ,02 engineering and technology ,Electrical Engineering, Electronic Engineering, Information Engineering ,User requirements document ,01 natural sciences ,Toolchain ,020202 computer hardware & architecture ,Software ,Logic synthesis ,Embedded system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Informatics,computer science,informatics ,Adaptation (computer science) ,business ,Field-programmable gate array ,FPGA - Abstract
Summarization: Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers' needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows. Presented on
- Published
- 2012
- Full Text
- View/download PDF
25. The hArtes Tool Chain
- Author
-
Tim Todman, Vlad Mihai Sima, Anna Antola, Kamana Sigdel, Christian Pilato, Maria Teresa Chiaradia, Antonio Cerruto, Alberto Morea, Wayne Luk, A. Michelotti, Koen Bertels, Raffaele Nutricato, Ariano Lattanzi, Donatella Sciuto, Marco Lattuada, Fabrizio Ferrandi, Jose Gabriel de Figueiredo Coutinho, Roel J. Meeuws, Yuet Ming Lam, Yana Yankova, Emanuele Ciavattini, and Ferruccio Bettarelli
- Subjects
Intermediate language ,Digital signal processor ,Task Mapping Tabu ,Digital Signal Processor ,List Task Graph ,Intermediate Representation ,Chain (algebraic topology) ,Computer architecture ,Computer science ,Task mapping ,Legacy code ,Digital Signal Processor, Task Mapping Tabu, List Task Graph, Intermediate Representation ,Tabu list - Abstract
This chapter describes the different design steps needed to go from legacy code to a transformed application that can be efficiently mapped on the hArtes platform.
- Published
- 2012
26. Novel design methods and a tool flow for unleashing dynamic reconfiguration
- Author
-
Christian Pilato, Dionisios Pnevmatikatos, Marco D. Santambrogio, Tobias Becker, Catalin Bogdan Ciobanu, Dirk Stroobandt, Tim Todman, Kyprianos Papadimitriou, Wayne Luk, Xinyu Niu, Timothy N. Davidson, and Georgi Gaydadjiev
- Subjects
Flexibility (engineering) ,Technology and Engineering ,dynamic reconfiguration ,Computer science ,Distributed computing ,Design flow ,Control reconfiguration ,Reconfigurable computing ,Runtime system ,Computer architecture ,Formal specification ,High-level synthesis ,Informatics,computer science,informatics ,Adaptation (computer science) ,Field-programmable gate array ,FPGA - Abstract
Summarization: During the last few years, there is an increasing interest in mixing software and hardware to serve efficiently different applications. This is due to the heterogeneity characterizing the tasks of an application which require the presence of resources from both worlds, software and hardware. Controlling effectively these resources through an integrated tool flow is a challenging problem and towards this direction only a few efforts exist. In fact, a framework that seamlessly exploits both resources of a platform for executing efficiently an application has not yet come into existence. Moreover, reconfigurable computing often incorporated in such platforms due to its high flexibility and customization, has not yet taken off due to the lack of exploiting its full capabilities. Thus, the capability of reconfigurable devices such as Field Programmable Gate Arrays (FPGAs) to be dynamically reconfigured, i.e. reprogramming part of the chip while other parts of the same chip remain functional, has not yet taken off even in small-scale basis. The inherent difficulty in using the tools to control this technology has kept it back from being adopted by academia and industry alike. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a design methodology and a tool flow that will enable designers to implement effectively and easily a system specification on a platform combining software and reconfigurable resources. The FASTER framework accepts as input a high-level description of the application and the architectural details of the target platform, and through certain steps it can enable the full use of the capabilities of the platform, while at the same time it should be flexible enough so as to balance efficiently performance, power and area. One of the main novelties is the incorporation of partial reconfiguration as an explicit design concept at an early stage of the design flow. 
We target different applications from the embedded, desktop and high-performance computing domains. In all cases we will demonstrate the effectiveness of the proposed framework in exploiting the inherent parallelism of applications and enabling the runtime adaptation of the platforms to the changing needs of the applications. Presented on
- Published
- 2012
27. Customizable Composition and Parameterization of Hardware Design Transformations
- Author
-
Qiang Liu, Tim Todman, Wayne Luk, and George A. Constantinides
- Subjects
Linear programming ,business.industry ,Computer science ,Programming language ,computer.software_genre ,Matrix multiplication ,Transformation (music) ,High-level programming language ,High-level synthesis ,Electronic design automation ,Compiler ,business ,Geometric programming ,computer ,Computer hardware - Abstract
A promising approach to high-level design is to start initially with an obvious but possibly inefficient design, and apply multiple transformations to meet design goals. Many hardware compilation tools support a fixed recipe of applying design transformations, but designers have few options to adapt the recipe without re-writing the tools themselves. In addition, complex transformations based on linear programming and geometric programming are often not included. This paper proposes a new approach that enables designers to customize the composition and parameterization of different types of design transformations in a unified framework, using a high-level language to control a transformation engine to automate the application of design transformations. Our approach is implemented by a tool based on the Python language and the ROSE compiler framework, which supports both syntax-directed transformations such as loop coalescing, and goal-directed transformations such as geometric programming. We illustrate how customizing the composition and parameterization of design transformations can lead to designs with different trade-offs in performance, resource usage, and energy efficiency. We evaluate our approach on benchmarks including matrix multiplication, Monte Carlo simulation of Asian options, edge detection, FIR filtering, and motion estimation.
- Published
- 2010
- Full Text
- View/download PDF
28. Combining optimizations in automated low power design
- Author
-
Qiang Liu, Tim Todman, and Wayne Luk
- Published
- 2010
- Full Text
- View/download PDF
29. A Scripting Engine for Combining Design Transformations
- Author
-
Wayne Luk, Tim Todman, Qiang Liu, and George A. Constantinides
- Subjects
LOOP (programming language) ,Scripting language ,Programming language ,Computer science ,High-level synthesis ,Electronic design automation ,Compiler ,Computational linguistics ,Geometric programming ,computer.software_genre ,Field-programmable gate array ,computer - Abstract
This paper describes a scripting engine based on the Python language and the ROSE compiler framework. Our scripting engine supports hardware design involving both syntax-directed transformations such as loop coalescing, and goal-directed transformations such as geometric programming. We show how customizing the composition and parametrization of design transformations can lead to designs with different trade-offs in performance and resource usage.
- Published
- 2010
- Full Text
- View/download PDF
30. Automatic optimisation of MapReduce designs by geometric programming
- Author
-
Tim Todman, Qiang Liu, George A. Constantinides, and Wayne Luk
- Subjects
Source data ,Operator (computer programming) ,Tree structure ,Computer science ,Computation ,Bandwidth (signal processing) ,Linear complex structure ,Parallel computing ,Geometric programming ,Associative property - Abstract
Many important applications can be expressed using the MapReduce pattern, where a computation is decomposed into a Map phase on which each element of source data is independently operated, followed by a Reduce phase in which the mapped elements are combined with an associative operator. We develop an approach for compiling applications with the MapReduce pattern into parallel hardware. Using optimisation techniques based on geometric programming, we map the computation onto a resource-constrained architecture. Furthermore, we explore important variations of MapReduce, such as making the Reduce a linear structure rather than a tree structure. Results for four benchmarks show that our approach can improve system performance by up to 170 times compared to the initial designs.
- Published
- 2009
- Full Text
- View/download PDF
31. A high-level compilation toolchain for heterogeneous systems
- Author
-
Y.M. Lam, Tim Todman, W.S. Wong, Qiang Liu, Kong Woei Susanto, Jose G. F. Coutinho, W.G. Osborne, and Wayne Luk
- Subjects
Digital signal processor ,Task (computing) ,Computer architecture ,business.industry ,Computer science ,Embedded system ,Process (computing) ,Multiprocessing ,business ,External Data Representation ,Field-programmable gate array ,Digital signal processing ,Toolchain - Abstract
This paper describes Harmonic, a toolchain that targets multiprocessor heterogeneous systems comprising different types of processing elements such as general-purpose processors (GPPs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs) from a high-level C program. The main goal of Harmonic is to improve an application by partitioning and optimising each part of the program, and selecting the most appropriate processing element in the system to execute each part. The core tools include a task transformation engine, a mapping selector, a data representation optimiser, and a hardware synthesiser. We also use the C language with source-annotations as intermediate representation for the toolchain, making it easier for users to understand and to control the compilation process.
- Published
- 2009
- Full Text
- View/download PDF
32. Optimising designs by combining model-based and pattern-based transformations
- Author
-
Tim Todman, Jose G. F. Coutinho, Qiang Liu, George A. Constantinides, and Wayne Luk
- Subjects
Object-oriented programming ,Random access memory ,Theoretical computer science ,Transformation (function) ,Computer science ,Data reuse ,System on a chip ,Pattern matching - Abstract
We present a methodology for optimising designs written in high-level descriptions, combining mathematical model-based transformations with syntax-driven pattern-matching transformations, showing how the two kinds of transformation can benefit each other. We evaluate this methodology by implementing an instance, combining a model-based transformation for data reuse with pattern-based transformations to improve its output. Results for three benchmarks show the implemented framework can improve system performance by up to 57 times.
- Published
- 2009
- Full Text
- View/download PDF
33. Cube: A 512-FPGA cluster
- Author
-
Philip H. W. Leong, Oskar Mencer, Tim Todman, Wayne Luk, Stephen Craimer, Ming Yee Wong, and Kuen Hung Tsoi
- Subjects
Xeon ,business.industry ,Computer science ,Key space ,Supercomputer ,Application software ,computer.software_genre ,Search engine ,Scalability ,Cube ,business ,Field-programmable gate array ,computer ,Computer hardware - Abstract
Cube, a massively-parallel FPGA-based platform, is presented. The machine is made from boards each containing 64 FPGA devices and eight boards can be connected in a cube structure for a total of 512 FPGA devices. With high bandwidth systolic inter-FPGA communication and a flexible programming scheme, the result is a low power, high density and scalable supercomputing machine suitable for various large scale parallel applications. An RC4 key search engine was built as a demonstration application. In a fully implemented Cube, the engine can perform a full search on the 40-bit key space within 3 minutes, this being 359 times faster than a multi-threaded software implementation running on a 2.5GHz Intel Quad-Core Xeon processor.
- Published
- 2009
- Full Text
- View/download PDF
34. Smart Enumeration: A Systematic Approach to Exhaustive Search
- Author
-
Brittle Tsoi, Wayne Luk, Tim Todman, Oskar Mencer, and Haohuan Fu
- Subjects
Computer Science::Hardware Architecture ,Mathematical optimization ,Speedup ,Computer engineering ,Computer science ,Key (cryptography) ,Enumeration ,Brute-force search ,Beam search ,Heuristics ,Massively parallel ,Reconfigurable computing - Abstract
This paper explores the potential of smart enumeration: enumeration of a design space giving the effect of exhaustive search, while using heuristics to order and reduce the search space. We characterise smart enumeration as having several key properties, including carefully chosen problem domains and techniques to speed up the search, such as those that exploit symmetry. We also generate reconfigurable hardware to accelerate part of the search. Our approach has been applied to technology mapping for field-programmable gate arrays, optimising area and power consumption.
- Published
- 2009
- Full Text
- View/download PDF
35. Design Validation by Symbolic Simulation and Equivalence Checking: A Case Study in Memory Optimization for Image Manipulation
- Author
-
Jose G. F. Coutinho, Kong Woei Susanto, Wayne Luk, and Tim Todman
- Subjects
Validation rule ,Theoretical computer science ,Computer science ,Encoding (memory) ,Formal equivalence checking ,Process (computing) ,Symbolic simulation ,Data mining ,Symbolic execution ,computer.software_genre ,Formal verification ,Equivalence (measure theory) ,computer - Abstract
Design optimization exploration is a key element in finding an optimal resource utilization. The exploration process applies optimizations iteratively; after applying each optimization, the result has to be validated. The research challenge for formal verification is to develop an efficient design validation flow and increase the quality of the validation. In this paper, we propose an automated validation flow to check the functional equivalence of the source design and its optimized version. This approach is based on a symbolic simulation technique to obtain the design properties and automatically check them using an equivalence checker. The novelty of this approach includes the use of model simplification techniques, such as if-conversion and loop-conversion, and state encoding to ease validation analysis.
- Published
- 2009
- Full Text
- View/download PDF
36. Improving Bounds for FPGA Logic Minimization
- Author
-
Oskar Mencer, Tim Todman, Haofan Fu, and Wayne Luk
- Subjects
Programmable logic device ,Digital electronics ,Logic synthesis ,Sequential logic ,business.industry ,Computer science ,Lookup table ,Logic family ,business ,Algorithm ,Programmable logic array ,Logic optimization - Abstract
We present a methodology for improving the bounds of combinational designs implemented on networks of lookup tables, moving them closer to the theoretical minimum. Our work effectively extends optimality to span logic minimization and technology mapping. We obtain a proof of optimality by restricting ourselves to 4-input look-up tables (LUTs) and generating all possible circuits up to a certain area or latency depending on the optimization mode. Since simple-minded generation would take a long time, we develop levels of abstraction (steps) and techniques to restrict and order the search space, and produce results in practical time. We use logic decomposition to break up large designs, using the resulting trees to guide our search and prune the search space. The price of this optimality is that we are limited to small blocks; however, such blocks can be used to build larger designs.
- Published
- 2007
- Full Text
- View/download PDF
37. Reconfigurable Designs for Radiosity
- Author
-
H. Styles, Tim Todman, Wayne Luk, and P. Baker
- Subjects
Computer graphics ,Logic synthesis ,Software ,Data parallelism ,business.industry ,Computer science ,Image quality ,Computation ,Retargeting ,Parallel computing ,Field-programmable gate array ,business - Abstract
We develop reconfigurable designs to support radiosity, a computer graphics algorithm for producing highly realistic images of artificial scenes, but which is computationally expensive. We implement radiosity using stochastic raytracing, which affords both instruction-level and data parallelism. Our designs are parameterisable by bitwidth, allowing trade-offs between image quality and computation speed. We measure the speed of our designs for a Xilinx XC2V6000 device in the Celoxica RC2000 platform: at 53 MHz it can run up to five times faster than a software implementation on an Athlon MP 2600+ processor at 2.1 GHz. We estimate that retargeting our design for a Virtex-4 XCVSX55 device can result in over 160 times software speed, while a Spartan-3 XC3S5000 device can run more than 40 times faster than the software implementation.
- Published
- 2005
- Full Text
- View/download PDF
38. Memory optimisations for high-resolution imaging
- Author
-
Tim Todman and Wayne Luk
- Subjects
Memory bank ,Speedup ,Computer science ,Template matching ,Memory architecture ,Process (computing) ,Control reconfiguration ,Parallel computing ,Auxiliary memory ,Reconfigurable computing - Abstract
The rapid advance in imaging technology has led to the increasing availability of high-resolution digital images. For instance, the latest film scanners can produce more than 6 million pixels with 12-bit colour. We explore the opportunities of working with these high-resolution images on reconfigurable hardware, focusing on memory optimisations to support their effective processing. We consider template matching and related algorithms, and propose a runtime reconfiguration scheme which allows large template-matching algorithms to be implemented on small FPGAs. We present a partitioning scheme which allows the FPGA to process images larger than its local, external memory banks; alternatively, the scheme facilitates exploitation of concurrency offered by multiple memory banks in FPGA systems. We have developed parametric models to analyse these designs and to explore their potential. It is shown that, for instance, our column caching scheme can support linear speedup with respect to the number of columns, even when the reconfiguration time is large.
- Published
- 2005
- Full Text
- View/download PDF
39. Methods and Tools for High-Resolution Imaging
- Author
-
Wayne Luk and Tim Todman
- Subjects
Pixel ,Channel (digital image) ,Computer science ,business.industry ,Template matching ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Memory bank ,Digital image processing ,Computer vision ,Artificial intelligence ,business ,Image resolution - Abstract
Film and video sequences are increasingly being digitised, allowing image processing operations to be applied to them in the digital domain. For film in particular, images are digitised at the limit of available scanners: each frame may contain 3000 by 2000 pixels, with 16 bits per colour channel. We investigate the consequences of working with these high-resolution images on FPGAs. We consider template matching and related algorithms, and derive a performance model to establish bounds on performance and to predict which optimisations may be fruitful. An architecture generator has been developed which can generate optimised implementations given image resolution, the FPGA platform architecture, and a description of the image processing algorithm.
- Published
- 2004
- Full Text
- View/download PDF
40. Real-time extensions to a C-like hardware description language
- Author
-
Wayne Luk and Tim Todman
- Subjects
Worst case timing analysis ,Computer science ,Programming language ,Hardware description language ,occam ,Parallel computing ,computer.software_genre ,Application software ,Parallel processing (DSP implementation) ,Simple (abstract algebra) ,Field-programmable gate array ,Timeout ,computer ,computer.programming_language - Abstract
Handel-C is a language for compilation into hardware. It is based on C with support for parallel execution and user-defined data size. We present extensions to a Handel-C like language for real-time applications. Our extensions are based on those of Occam 2, a language related to Handel-C. We show that our extensions, though simple, can implement the basic real-time idioms of timed wait and timeout.
- Published
- 2003
- Full Text
- View/download PDF
41. Combining imperative and declarative hardware descriptions
- Author
-
Tim Todman and Wayne Luk
- Subjects
Imperative programming ,Cobble ,Computer science ,business.industry ,Programming language ,Hardware description language ,business ,computer.software_genre ,Implementation ,computer ,Computer hardware ,Declarative programming ,computer.programming_language - Abstract
This paper describes an approach for hardware development that involves both imperative and declarative descriptions. The imperative descriptions are mainly used for algorithm and application development; they are based on Cobble, a sequential imperative language extended with facilities for parallel computation and arbitrary-sized variables, similar to the Handel-C language. Operators in Cobble can be produced using the declarative language Pebble, which supports efficient bit-level design. We introduce the use of meta-information, such as information about latency and throughput, for Pebble descriptions, to enable Cobble programs to adapt to different implementations of operators in Pebble. The optimisation of designs by transforming the Cobble and Pebble descriptions is presented.
- Published
- 2003
- Full Text
- View/download PDF
42. Reconfigurable computing: architectures and design methods
- Author
-
Tim Todman, Steven J. E. Wilton, Wayne Luk, George A. Constantinides, Oskar Mencer, and Peter Y. K. Cheung
- Subjects
Virtex ,Engineering ,Speedup ,business.industry ,Integrated circuit ,Reconfigurable computing ,Theoretical Computer Science ,law.invention ,Microprocessor ,Computational Theory and Mathematics ,Computer architecture ,Hardware and Architecture ,law ,Embedded system ,Stratix ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,Design methods ,Field-programmable gate array ,business - Abstract
Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Altera Stratix II and Xilinx Virtex 4 FPGA devices. The authors identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.
- Published
- 2005
- Full Text
- View/download PDF
43. Knowledge Transfer in Automatic Optimisation of Reconfigurable Designs
- Author
-
Tim Todman, Maciej Kurek, Wayne Luk, and Marc Peter Deisenroth
- Subjects
Speedup ,Computer science ,Computational finance ,ComputingMethodologies_MISCELLANEOUS ,Control engineering ,02 engineering and technology ,computer.software_genre ,Support vector machine ,Reduction (complexity) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,020201 artificial intelligence & image processing ,Algorithm design ,Data mining ,Knowledge transfer ,computer - Abstract
This paper presents a novel approach for automatic optimisation of reconfigurable design parameters based on knowledge transfer. The key idea is to make use of insights derived from optimising related designs to benefit future optimisations. We show how to use designs targeting one device to speed up optimisation of another device. The proposed approach is evaluated based on various applications including computational finance and seismic imaging. It is capable of achieving up to 35% reduction in optimisation time in producing designs with similar performance, compared to alternative optimisation methods.
- Full Text
- View/download PDF
44. FASTER: Facilitating analysis and synthesis technologies for effective reconfiguration
- Author
-
Marco D. Santambrogio, Ioannis Papaefstathiou, Oliver Pell, Dirk Stroobandt, Andreas Brokalakis, Karel Bruneel, Donatella Sciuto, Wayne Luk, Dionisios Pnevmatikatos, Christian Pilato, Tobias Becker, M. Robart, Kyprianos Papadimitriou, Tim Todman, and Georgi Gaydadjiev
- Subjects
run-time reconfiguration ,tools for reconfiguration ,Computer and Information Science ,business.industry ,computer.internet_protocol ,Computer science ,Control reconfiguration ,02 engineering and technology ,Supercomputer ,relocation ,Reconfigurable computing ,020202 computer hardware & architecture ,run-time system ,Software ,Parallel processing (DSP implementation) ,partial reconfiguration ,020204 information systems ,Embedded system ,Component-based software engineering ,reconfigurable computing ,0202 electrical engineering, electronic engineering, information engineering ,business ,Field-programmable gate array ,computer ,XML - Abstract
Summarization: The FASTER project aims to ease the definition, implementation and use of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving better performance and extending product functionality and lifetime via the addition of new features that work at hardware speed. This is a clear advantage over the more straightforward software component adaptivity. However, designing a changing hardware system is both challenging and time consuming. The FASTER project will facilitate the use of reconfigurable technology by providing a complete methodology that enables designers to easily specify, analyse, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. To better adapt to different application requirements, the tool-chain will support both region-based and micro-reconfiguration and provide a flexible run-time system that will efficiently manage the reconfigurable resources. We will use applications from the embedded, high performance computing, and desktop domains to demonstrate the potential benefits of the FASTER tools on metrics such as performance, power consumption and total ownership cost. Presented at: 15th Euromicro Conference on Digital System Design (DSD)