36 results on '"Reconfigurable hardware"'
Search Results
2. An Overview of Reconfigurable Hardware in Embedded Systems
- Author
-
Garcia, Philip, Compton, Katherine, Schulte, Michael, Blem, Emily, and Fu, Wenyin
- Published
- 2006
- Full Text
- View/download PDF
3. A Dynamic Reconfigurable Hardware/Software Architecture for Object Tracking in Video Streams
- Author
-
Mühlbauer, Felix and Bobda, Christophe
- Published
- 2006
- Full Text
- View/download PDF
4. A Dynamic Reconfigurable Hardware/Software Architecture for Object Tracking in Video Streams
- Author
-
Christophe Bobda and Felix Muhlbauer
- Subjects
Hardware architecture, Focus (computing), General Computer Science, Computer science, business.industry, lcsh:Electronics, lcsh:TK7800-8360, Reconfigurable computing, Data flow diagram, Software, Feature (computer vision), Control and Systems Engineering, Video tracking, Embedded system, Software architecture, business, Computer Science (all)
- Abstract
This paper presents the design and implementation of a feature tracker on an embedded reconfigurable hardware system. Contrary to other works, the focus here is on the efficient hardware/software partitioning of the feature tracker algorithm, a viable data flow management, as well as an efficient use of memory and processor features. The implementation is done on a Xilinx Spartan 3 evaluation board and the results provided show the superiority of our implementation compared to the other works.
- Published
- 2006
5. Implementation of a reconfigurable ASIP for high throughput low power DFT/DCT/FIR engine
- Author
-
Hassan, Hanan M, Mohammed, Karim, and Shalash, Ahmed F
- Published
- 2012
- Full Text
- View/download PDF
6. Run-Time HW/SW Scheduling of Data Flow Applications on Reconfigurable Architectures.
- Author
-
Ghaffari, Fakhreddine, Miramond, Benoit, and Verdier, François
- Subjects
COMPUTER architecture, COMPUTER input-output equipment, DATA flow computing, ELECTRONIC data processing, COMPUTER graphics, IMAGE processing
- Abstract
This paper presents an efficient dynamic, run-time hardware/software scheduling approach. The scheduling heuristic maps the different tasks of a highly dynamic application online in such a way that the total execution time is minimized. We consider soft real-time, data-flow-graph-oriented applications for which the execution time is a function of the nature of the input data. The target architecture is composed of two processors connected to a dynamically reconfigurable hardware accelerator. Our approach takes advantage of the reconfiguration property of the considered architecture to adapt the treatment to the system dynamics. We compare our heuristic with another similar approach. We present the results of our scheduling method on several image processing applications. Our experiments include simulation and synthesis results on a Virtex V-based platform. These results show better performance than existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
7. An FPGA Implementation of a Parallelized MT19937 Uniform Random Number Generator.
- Author
-
Sriram, Vinay and Kearney, David
- Subjects
COMPUTERS, RANDOM number generators, FIELD programmable gate arrays, SYSTEMS engineering, EMBEDDED computer systems
- Abstract
Recent times have witnessed an increase in the use of high-performance reconfigurable computing for accelerating large-scale simulations. A characteristic of such simulations, like infrared (IR) scene simulation, is the use of large quantities of uncorrelated random numbers. It is therefore of interest to have a fast uniform random number generator implemented in reconfigurable hardware. While there have been previous attempts to accelerate the MT19937 pseudouniform random number generator using FPGAs, we believe that we can substantially improve on the previous implementations to develop a higher-throughput and more area-time-efficient design. Due to the potential for parallel implementation of random number generators, designs that have both a small area footprint and high throughput are to be preferred to ones that have high throughput but significant extra area requirements. In this paper, we first present a single-port design and then present an enhanced 624-port hardware implementation of the MT19937 algorithm. The 624-port hardware implementation, when implemented on a Xilinx XC2VP70-6 FPGA chip, has a throughput of 119.6 × 10^9 32-bit random numbers per second, which is more than 17x that of the previously best published uniform random number generator. Furthermore, it has the lowest area-time metric of all the currently published FPGA-based pseudouniform random number generators. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
8. Communication-Oriented Design Space Exploration for Reconfigurable Architectures.
- Author
-
Bossuet, Lilian, Gogniat, Guy, and Philippe, Jean-Luc
- Subjects
COMPUTER architecture, COMPUTER engineering, FIELD programmable gate arrays, ALGORITHMS, INTEGER programming
- Abstract
Many academic works in computer engineering focus on reconfigurable architectures and associated tools. Fine-grain architectures, such as field programmable gate arrays (FPGAs), are the best-known structures of reconfigurable hardware. Dedicated tools (generic or specific) allow for the exploration of their design space to choose the best architecture characteristics and/or to explore the application characteristics. The aim is to increase the synergy between the application and the architecture in order to get the best performance. However, there is no generic tool to perform such an exploration for coarse-grain or heterogeneous-grain architectures; only a small number of very specific tools are able to explore a limited set of architectures. To address this gap, in this paper we propose a new design space exploration approach adapted to fine- and coarse-grain granularities. Our approach combines algorithmic and architecture explorations. It relies on an automatic estimation tool which computes the hierarchical distribution of communication and the usage rate of the architectural processing resources for the architecture under exploration. Such an approach supports the rapid definition of efficient reconfigurable architectures dedicated to one or several applications. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
9. ARM-FPGA-based platform for reconfigurable wireless communication systems using partial reconfiguration.
- Author
-
Rihani, Mohamad-Al-Fadl, Mroue, Mohamad, Prévotet, Jean-Christophe, Nouvel, Fabienne, and Mohanna, Yasser
- Subjects
FIELD programmable gate arrays, WIRELESS communications, SYSTEMS on a chip
- Abstract
Today, wireless devices generally feature multiple radio access technologies (LTE, WIFI, WIMAX, ...) to handle a rich variety of standards or technologies. These devices should be intelligent and autonomous enough either to reach a given level of performance or to automatically select the best available wireless technology according to standards availability. On the hardware side, system on chip (SoC) devices integrate processors and field-programmable gate array (FPGA) logic fabrics on the same chip with fast interconnection. This allows designing software/hardware systems and implementing new techniques and methodologies that greatly improve the performance of communication systems. In these devices, dynamic partial reconfiguration (DPR) is a well-known technique for reconfiguring only a specific area within the FPGA while other parts continue to operate independently. To evaluate when it is advantageous to perform DPR, adaptive techniques have been proposed. They consist of reconfiguring parts of the system automatically according to specific parameters. In this paper, an intelligent wireless communication system that implements an adaptive OFDM-based transmitter and performs a vertical handover in heterogeneous networks is presented. A unified physical layer for WIFI-WIMAX networks is also proposed. The system was implemented and tested on a ZedBoard, which features a Xilinx Zynq-7000 SoC. The performance of the system is described, and simulation results are presented in order to validate the proposed architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
10. FPGA Supercomputing Platforms, Architectures, and Techniques for Accelerating Computationally Complex Algorithms.
- Author
-
Sriram, Vinay and Leeser, Miriam
- Subjects
RAPID prototyping, FIELD programmable gate arrays
- Abstract
The article discusses various reports published within the issue, including one on a rapid prototyping platform and design flow for the design of on-chip motion controllers, one on the use of a high-performance reconfigurable supercomputer built from both general-purpose processors and field programmable gate arrays (FPGAs), and one on the implementation of an FPGA-based face detector using a neural network.
- Published
- 2009
- Full Text
- View/download PDF
11. A hybrid fixed-function and microprocessor solution for high-throughput broad-phase collision detection.
- Author
-
Woulfe, Muiris and Manzke, Michael
- Subjects
MICROPROCESSORS, COLLISION detection (Computer animation), HYBRID systems
- Abstract
We present a hybrid system spanning a fixed-function microarchitecture and a general-purpose microprocessor, designed to amplify the throughput and decrease the power dissipation of collision detection relative to what can be achieved using CPUs or GPUs alone. The primary component is one of the two novel microarchitectures designed to perform the principal elements of broad-phase collision detection. Both microarchitectures consist of pipelines comprising a plurality of memories, which rearrange the input into a format that maximises parallelism and bandwidth. The two microarchitectures are combined with the remainder of the system through an original method for sharing data between a ray tracer and the collision-detection microarchitectures to minimise data structure construction costs. We effectively demonstrate our system using several benchmarks of varying object counts. These benchmarks reveal that, for over one million objects, our design achieves an acceleration of 812× relative to a CPU and an acceleration of 161× relative to a GPU. We also achieve energy efficiencies that enable the mitigation of silicon power-density challenges, while making the design amenable to both mobile and wearable computing devices. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
12. A Programmable Video Platform and Its Application Mapping Framework Using the Target Application's SystemC Models.
- Author
-
Daewoong Kim, Kilhyung Cha, Do-Sun Hong, Soonwoo Choi, and Soo-Ik Chae
- Subjects
HIGH definition video recording, DECODERS (Electronics), BIT rate, DISTRIBUTED operating systems (Computers), ASYNCHRONOUS transfer mode, COMPUTER operating systems
- Abstract
HD video applications can be represented with multiple tasks consisting of tightly coupled multiple threads. Each task requires massive computation, and their communication can be categorized as asynchronous distributed small data transfers and large streaming data transfers. In this paper, we propose a high-performance programmable video platform that consists of four processing element (PE) clusters. Each PE cluster runs a task in the video application with RISC cores, a hardware operating system kernel (HOSK), and task-specific accelerators. PE clusters are connected with two separate point-to-point networks: one for asynchronous distributed controls and the other for heavy streaming data transfers among the tasks. Furthermore, we developed an application mapping framework, with which parallel executable code can be obtained from a manually developed SystemC model of the target application without knowing the detailed architecture of the video platform. To show the effectiveness of the platform and its mapping framework, we also present mapping results for an H.264/AVC 720p decoder/encoder and a VC-1 720p decoder at 30 fps, assuming that the platform operates at 200 MHz. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
13. Word Length Selection Method for Controller Implementation on FPGAs Using the VHDL-2008 Fixed-Point and Floating-Point Packages.
- Author
-
Urriza, I., Barragán, L. A., Navarro, D., Artigas, J. I., and Lucia, O.
- Subjects
DIGITAL control systems, FIELD programmable gate arrays, FLOATING-point arithmetic, SIMULATION methods & models, SWITCHING power supplies, DIGITAL electronics
- Abstract
This paper presents a word length selection method for the implementation of digital controllers in both fixed-point and floating-point hardware on FPGAs. This method uses the new types defined in the VHDL-2008 fixed-point and floating-point packages. These packages allow customizing the word length of fixed- and floating-point representations and shorten the design cycle by simplifying the design of arithmetic operations. The method performs bit-true simulations in order to determine the word length needed to represent the constant coefficients and the internal signals of the digital controller while maintaining the control system specifications. A mixed-signal simulation tool is used to simulate the closed-loop system as a whole in order to analyze the impact of the quantization effects and loop delays on the control system performance. The method is applied to implement a digital controller for a switching power converter. The digital circuit is implemented on an FPGA, and the simulations are experimentally verified. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
14. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing.
- Author
-
Cordes, Ben and Leeser, Miriam
- Subjects
FIELD programmable gate arrays, ALGORITHMS, ELECTRONIC systems, SYNTHETIC aperture radar, IMAGING systems
- Abstract
High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
15. OLLAF: A Fine Grained Dynamically Reconfigurable Architecture for OS Support.
- Author
-
Garcia, Samuel and Granado, Bertrand
- Subjects
SYSTEMS engineering, SYSTEMS design, EMBEDDED computer systems, INTEGRATED circuits, COMPUTER operating systems
- Abstract
Fine Grained Dynamically Reconfigurable Architectures (FGDRAs) offer flexibility for embedded systems together with high power processing efficiency by exploiting optimization opportunities at the architectural level, thanks to their fine configuration granularity. But this increases design complexity, which should be abstracted by tools and the operating system. In order to have a usable solution, a good inter-overlapping between tools, OS, and platform must exist. In this paper we present OLLAF, an FGDRA specially designed to efficiently support an OS. The studies presented here show the contribution of this architecture in terms of hardware context management and preemption support. They also show the gain that can be obtained by using OLLAF instead of a classical FPGA, in terms of context management and preemption overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
16. Multicore Software-Defined Radio Architecture for GNSS Receiver Signal Processing.
- Author
-
Hurskainen, Heikki, Raasakka, Jussi, Ahonen, Tapani, and Nurmi, Jari
- Subjects
SOFTWARE radio, EMBEDDED computer systems, GLOBAL Positioning System, DIGITAL signal processing, WIRELESS communications
- Abstract
We describe a multicore Software-Defined Radio (SDR) architecture for Global Navigation Satellite System (GNSS) receiver implementation. A GNSS receiver picks up very low power signals from multiple satellites and then uses dedicated processing to demodulate and measure the exact timing of these signals, from which the user's position, velocity, and time (PVT) can be estimated. Three GNSS SDR architectures are discussed: (1) a hardware-based SDR that is feasible for embedded devices but relatively expensive, (2) a pure SDR approach that has a high level of flexibility and a low bill of materials but is not yet suited for handheld applications, and (3) a novel architecture that uses a programmable array of multiple processing cores and exhibits both flexibility and potential for mobile devices. We present the CRISP project, in which the multicore architecture will be realized, along with a numerical analysis of the application requirements of the platform's processing cores and network payload. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
17. An Open Framework for Rapid Prototyping of Signal Processing Applications.
- Author
-
Pelcat, Maxime, Piat, Jonathan, Wipliez, Matthieu, Aridhi, Slaheddine, and Nezan, Jean-François
- Subjects
EMBEDDED computer systems, PROTOTYPES, SIGNAL processing, ALGORITHMS, RAPID prototyping
- Abstract
Embedded real-time applications in communication systems have significant timing constraints, thus requiring multiple computation units. Manually exploring the potential parallelism of an application deployed on multicore architectures is very time-consuming. This paper presents an open-source Eclipse-based framework which aims to facilitate the exploration and development processes in this context. The framework includes a generic graph editor (Graphiti), a graph transformation library (SDF4J), and an automatic mapper/scheduler tool with simulation and code generation capabilities (PREESM). The input of the framework is composed of a scenario description and two graphs: one graph describes an algorithm and the second describes an architecture. The rapid prototyping results of a 3GPP Long-Term Evolution (LTE) algorithm on a multicore digital signal processor illustrate both the features and the capabilities of this framework. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
18. Efficient Processing of a Rainfall Simulation Watershed on an FPGA-Based Architecture with Fast Access to Neighbourhood Pixels.
- Author
-
Lee Seng Yeong, Christopher Wing Hong Ngau, Li-Minn Ang, and Kah Phooi Seng
- Subjects
FIELD programmable gate arrays, WATERSHEDS, ARCHITECTURE, RAINFALL
- Abstract
This paper describes a hardware architecture to implement the watershed algorithm using rainfall simulation. The speed of the architecture is increased by utilizing a multiple memory bank approach to allow parallel access to the neighbourhood pixel values. In a single read cycle, the architecture is able to obtain all five values of the centre and four neighbours for a 4-connectivity watershed transform. The storage requirement of the multiple bank implementation is the same as a single bank implementation by using a graph-based memory bank addressing scheme. The proposed rainfall watershed architecture consists of two parts. The first part performs the arrowing operation and the second part assigns each pixel to its associated catchment basin. The paper describes the architecture datapath and control logic in detail and concludes with an implementation on a Xilinx Spartan-3 FPGA. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
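The parallel neighbourhood access described in the abstract of entry 18 can be illustrated with a minimal sketch. It assumes a simple modular bank mapping, `(x + 2*y) mod 5`, chosen only for illustration; the abstract does not give the paper's graph-based addressing scheme, so the function names and the mapping below are hypothetical. The sketch shows that the centre pixel and its four 4-connected neighbours always fall into five different banks, so all five values could in principle be read in one cycle while each pixel is still stored only once.

```python
# Hypothetical sketch: conflict-free bank assignment for a 4-connected
# neighbourhood. The mapping (x + 2*y) mod 5 is an assumption made for
# illustration; it is not taken from the paper.

NUM_BANKS = 5


def bank_of(x: int, y: int) -> int:
    """Return the index of the memory bank that stores pixel (x, y)."""
    return (x + 2 * y) % NUM_BANKS


def neighbourhood(x: int, y: int) -> list[tuple[int, int]]:
    """Return the centre pixel followed by its four 4-connected neighbours."""
    return [(x, y), (x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]


def banks_touched(x: int, y: int) -> list[int]:
    """Return the banks accessed when reading the neighbourhood of (x, y)."""
    return [bank_of(px, py) for px, py in neighbourhood(x, y)]


if __name__ == "__main__":
    # Print the bank indices for a few sample pixels.
    for x, y in [(0, 0), (3, 7), (10, 4)]:
        print((x, y), banks_touched(x, y))
```

With this mapping, horizontal neighbours shift the bank index by ±1 and vertical neighbours by ±2, which are all distinct modulo 5. Border handling (neighbours outside the image) is left out of the sketch.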
19. A Prototyping Virtual Socket System-On-Platform Architecture with a Novel ACQPPS Motion Estimator for H.264 Video Encoding Applications.
- Author
-
Yifeng Qiu and Badawy, Wael
- Subjects
EMBEDDED computer systems, STREAMING technology, BASIC Stamp computers, ALGORITHMS, ENCODING
- Abstract
H.264 delivers high-quality streaming video for various applications. The coding tools involved in H.264, however, make its video codec implementation very complicated, raising the need for algorithm optimization and hardware acceleration. In this paper, a novel adaptive crossed quarter polar pattern search (ACQPPS) algorithm is proposed to realize enhanced inter prediction for H.264. Moreover, an efficient prototyping system-on-platform architecture is also presented, which can be used to realize an H.264 baseline profile encoder with the support of an integrated ACQPPS motion estimator and related video IP accelerators. The implementation results show that the ACQPPS motion estimator can achieve very high estimated image quality, comparable to that of the full search method in terms of peak signal-to-noise ratio (PSNR), while keeping the complexity at an extremely low level. With the integrated IP accelerators and optimized techniques, the proposed system-on-platform architecture sufficiently supports H.264 real-time encoding at low cost. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
20. A Platform for the Development and the Validation of HWIP Components Starting from Reference Software Specifications.
- Author
-
Lucarz, Christophe, Mattavelli, Marco, and Dubois, Julien
- Subjects
DIGITAL signal processing, EMBEDDED computer systems, COMPUTER architecture, COMPUTER systems, ELECTRONIC systems, COMPUTER software
- Abstract
Signal processing algorithms become more and more efficient as a result of the development of new standards. This is particularly true in the field of video compression. However, with each improvement in efficiency and functionality, the complexity of the algorithms also increases. Textual specifications, which in the past were the original form of specification, have been replaced by reference software, which has become the starting point of any design flow leading to implementation. Therefore, designing an embedded application has become equivalent to porting generic software to a possibly heterogeneous embedded platform. Such an operation is getting more and more difficult because of the increased algorithm complexity and the wide range of architectural solutions. This paper describes a new platform that supports a step-by-step mapping of reference software (i.e., generic and nonoptimized software) into software and hardware implementations. The platform provides a seamless interface between the software and hardware environments, with profiling capabilities for the analysis of data transfers between hardware and software. Such profiling capabilities help the designer to achieve different implementations aimed at specific objectives, such as the optimization of hardware processing resources or of the memory architecture, or the minimization of data transfers to reach low-power designs. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
21. Smart Camera Based on Embedded HW/SW Coprocessor.
- Author
-
Mosqueron, Romuald, Dubois, Julien, Mattavelli, Marco, and Mauvilet, David
- Subjects
DIGITAL image processing, DIGITAL cameras, COMPUTER architecture, COMPUTER systems, ALGORITHMS
- Abstract
This paper describes an image acquisition and processing system based on a new coprocessor architecture designed for CMOS sensor imaging. The system exploits the full potential of CMOS selective-access imaging technology because the coprocessor unit is integrated into the image acquisition loop. The acquisition and coprocessing architecture is compatible with the majority of CMOS sensors. It enables the dynamic selection of a wide variety of acquisition modes as well as the reconfiguration and implementation of high-performance image preprocessing algorithms (calibration, filtering, denoising, binarization, pattern recognition). Furthermore, the processing and the data transfer from the CMOS sensor to the processor can be operated simultaneously to increase the achievable performance. The coprocessor architecture has been designed to obtain a unit that can be configured on the fly, in terms of the type and number of chained processing stages (up to 8 successive predefined preprocessing stages), during the image acquisition process, which can be defined by the user according to each specific application requirement. Examples of acquisition and processing performance are reported and compared to classical image acquisition systems based on standard modular PC platforms. The experimental results show a considerable increase in the achievable performance. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
22. Bridging MoCs in SystemC Specifications of Heterogeneous Systems.
- Author
-
Damm, Markus, Haase, Jan, Grimm, Christoph, Herrera, Fernando, and Villar, Eugenio
- Subjects
SIMULATION methods & models, METHODOLOGY, DIGITAL libraries, HETEROGENEOUS computing, PARALLEL processing
- Abstract
In order to get an efficient specification and simulation of a heterogeneous system, the choice of an appropriate model of computation (MoC) for each system part is essential. The choice depends on the design domain (e.g., analogue or digital) and on the abstraction level suitable for specifying and analysing the aspects considered to be important in each system part. In practice, the MoC choice is implicitly made by selecting a suitable language and a simulation tool for each system part. This approach requires the connection of different languages and simulation tools when the specification and simulation of the system are considered as a whole. SystemC is able to support a more unified specification methodology and simulation environment for heterogeneous systems, since it is extensible by libraries that support additional MoCs. A major requisite of these libraries is to provide means to connect system parts which are specified using different MoCs. However, these connection means usually do not provide enough flexibility to select and tune the right conversion semantics in a mixed-level specification, simulation, and refinement process. In this article, converter channels are presented: a flexible approach for MoC connection within a SystemC environment consisting of three extensions, namely SystemC-AMS, HetSC, and OSSS+R. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
23. System-on-Chip Environment: A SpecC-Based Framework for Heterogeneous MPSoC Design.
- Author
-
Dömer, Rainer, Gerstlauer, Andreas, Junyu Peng, Dongwan Shin, Lukai Cai, Haobo Yu, Abdi, Samar, and Gajski, Daniel D.
- Subjects
EMBEDDED computer systems, AUTOMATION, INDUSTRIAL productivity, SYSTEM analysis, COMPUTER integrated manufacturing systems
- Abstract
The constantly growing complexity of embedded systems is a challenge that drives the development of novel design automation techniques. C-based system-level design addresses the complexity challenge by raising the level of abstraction and integrating the design processes for the heterogeneous system components. In this article, we present a comprehensive design framework, the system-on-chip environment (SCE), which is based on the influential SpecC language and methodology. SCE implements a top-down system design flow based on a specify-explore-refine paradigm with support for heterogeneous target platforms consisting of custom hardware components, embedded software processors, dedicated IP blocks, and complex communication bus architectures. Starting from an abstract specification of the desired system, models at various levels of abstraction are automatically generated through successive step-wise refinement, resulting in a pin- and cycle-accurate system implementation. The seamless integration of automatic model generation, estimation, and verification tools enables rapid design space exploration and efficient MPSoC implementation. Using a large set of industrial-strength examples with a wide range of target architectures, our experimental results demonstrate the effectiveness of our framework and show significant productivity gains in design time. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
24. Reconfiguration Management in the Context of RTOS-Based HW/SW Embedded Systems.
- Author
-
Eustache, Yvan and Diguet, Jean-Philippe
- Subjects
CONFIGURATION management, INTEGRATED circuits, EMBEDDED computer systems, COMPUTER networks, INFORMATION technology
- Abstract
This paper presents a safe and efficient solution to manage asynchronous configurations of dynamically reconfigurable systems-on-chip. We first define our unified RTOS-based framework for HW/SW task communication and configuration management. Then three issues are discussed and solutions are given: the formalization of configuration space modeling including its different dimensions, the synchronization of configuration that mainly addresses the question of task configuration ordering, and the configuration coherency that solves the way a task accepts a new configuration. Finally, we present the global method and give some implementation figures from a smart camera case study. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
25. DART: A Functional-Level Reconfigurable Architecture for High Energy Efficiency.
- Author
-
Pillement, Sébastien, Sentieys, Olivier, and David, Raphaël
- Subjects
COMPUTER architecture, MOBILE communication systems, TELECOMMUNICATION systems, MULTIMEDIA systems, EMBEDDED computer systems, COMPUTER systems
- Abstract
Flexibility has become a major concern in the development of multimedia and mobile communication systems, alongside the classical high-performance and low-energy-consumption constraints. The use of general-purpose processors solves flexibility problems but fails to cope with the increasing demand for energy efficiency. This paper presents the DART architecture, based on the functional-level reconfiguration paradigm, which allows a significant improvement in energy efficiency. DART is built around a hierarchical interconnection network allowing high flexibility while keeping the power overhead low. To enable specific optimizations, DART supports two modes of reconfiguration. The compilation framework is built using compilation and high-level synthesis techniques. A 3G mobile communication application has been implemented as a proof of concept. The energy distribution within the architecture and the physical implementation are also discussed. Finally, the VLSI design of a 0.13 μm CMOS SoC implementing a specialized DART cluster is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
26. A Flexible System Level Design Methodology Targeting Run-Time Reconfigurable FPGAs.
- Author
-
Berthelot, Florent, Nouvel, Fabienne, and Houzet, Dominique
- Subjects
ADAPTIVE computing systems, COMPUTER systems, ELECTRONIC systems, COMPUTER input-output equipment, FIELD programmable gate arrays, COMPUTER networks
- Abstract
Reconfigurable computing is certainly one of the most important emerging research topics in digital processing architectures over the last few years. The introduction of run-time reconfiguration (RTR) on FPGAs requires appropriate design flows and methodologies to fully exploit this new functionality. For that purpose, we present an automatic design generation methodology for heterogeneous architectures based on DSPs and FPGAs that eases and speeds up RTR implementation. We focus on how to take into account the specificities of partially reconfigurable components from a high-level specification during the design generation steps. This method automatically generates designs for both the fixed and the partially reconfigurable parts of an FPGA, with automatic management of the reconfiguration process. Furthermore, this automatic design generation enables a reconfiguration prefetching technique to minimize reconfiguration latency and buffer-merging techniques to minimize the memory requirements of the generated design. This concept has been applied to different wireless access schemes based on a combination of OFDM and CDMA techniques. This implementation example illustrates the benefits of the proposed design methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
27. Design Flow Instantiation for Run-Time Reconfigurable Systems: A Case Study.
- Author
-
Yang Qu, Tiensyrjä, Kari, Soininen, Juha-Pekka, and Nurmi, Jari
- Subjects
ADAPTIVE computing systems ,COMPUTER systems ,ELECTRONIC systems ,COMPUTER input-output equipment ,COMPUTER software - Abstract
Reconfigurable system is a promising alternative to deliver both flexibility and performance at the same time. New reconfigurable technologies and technology-dependent tools have been developed, but a complete overview of the whole design flow for run-time reconfigurable systems is missing. In this work, we present a design flow instantiation for such systems using a real-life application. The design flow is roughly divided into two parts: system level and implementation. At system level, our supports for hardware resource estimation and performance evaluation are applied. At implementation level, technology-dependent tools are used to realize the run-time reconfiguration. The design case is part of a WCDMA decoder on a commercially available reconfigurable platform. The results show that using run-time reconfiguration can save over 40% area when compared to a functionally equivalent fixed system and achieve 30 times speedup in processing time when compared to a functionally equivalent pure software design. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
28. System-Platforms-Based SystemC TLM Design of Image Processing Chains for Embedded Applications.
- Author
-
Cheema, Muhammad Omer, Lacassagne, Lionel, and Hammami, Omar
- Subjects
EMBEDDED computer systems ,SYSTEMS design ,SYSTEMS development ,IMAGE processing ,COMPUTER software ,COMPUTER input-output equipment ,AUTOMATION - Abstract
Intelligent vehicle design is a complex task which requires multidomains modeling and abstraction. Transaction-level modeling (TLM) and component-based software development approaches accelerate the process of an embedded system design and simulation and hence improve the overall productivity. On the other hand, system-level design languages facilitate the fast hardware synthesis at behavioral level of abstraction. In this paper, we introduce an approach for hardware/software codesign of image processing applications targeted towards intelligent vehicle that uses platform-based SystemC TLM and component-based software design approaches along with HW synthesis using SystemC to accelerate system design and verification process. Our experiments show the effectiveness of our methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
29. Reconfigurable On-Board Vision Processing for Small Autonomous Vehicles.
- Author
-
Fife, Wade S. and Archibald, James K.
- Subjects
FIELD programmable gate arrays ,EMBEDDED computer systems ,OPTICAL quality control ,SIGNAL processing ,ALGORITHMS - Abstract
This paper addresses the challenge of supporting real-time vision processing on-board small autonomous vehicles. Local vision gives increased autonomous capability, but it requires substantial computing power that is difficult to provide given the severe constraints of small size and battery-powered operation. We describe a custom FPGA-based circuit board designed to support research in the development of algorithms for image-directed navigation and control. We show that the FPGA approach supports real-time vision algorithms by describing the implementation of an algorithm to construct a three-dimensional (3D) map of the environment surrounding a small mobile robot. We show that FPGAs are well suited for systems that must be flexible and deliver high levels of performance, especially in embedded settings where space and power are significant concerns. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
30. A Predictive NoC Architecture for Vision Systems Dedicated to Image Analysis.
- Author
-
Fresse, Virginie, Aubert, Alain, and Bochard, Nathalie
- Subjects
COMPUTER architecture ,COMPUTER vision ,IMAGE analysis ,ALGORITHMS ,VELOCIMETRY - Abstract
The aim of this paper is to describe an adaptive and predictive FPGA embedded architecture for vision systems dedicated to image analysis. A large panel of image analysis algorithms with some common characteristics must be mapped onto this architecture. Major characteristics of such algorithms are extracted to define the architecture. This architecture must easily adapt its structure to algorithm modifications. According to required modifications, few parts must be either changed or adapted. An NoC approach is used to break the hardware resources down as stand-alone blocks and to improve predictability and reuse aspects. Moreover, this architecture is designed using a globally asynchronous locally synchronous approach so that each local part can be optimized separately to run at its best frequency. Timing and resource prediction models are presented. With these models, the designer defines and evaluates the appropriate structure before the implementation process. The implementation of a particle image velocimetry algorithm illustrates this adaptation. Experimental results and predicted results are close enough to validate our prediction models for PIV algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
31. Examining the Viability of FPGA Supercomputing.
- Author
-
Craven, Stephen and Athanas, Peter
- Subjects
FIELD programmable gate arrays ,PROGRAMMABLE logic devices ,HIGH performance computing ,ELECTRONIC data processing ,COMPUTER systems ,COMPARATIVE studies - Abstract
For certain applications, custom computational hardware created using field programmable gate arrays (FPGAs) can produce significant performance improvements over processors, leading some in academia and industry to call for the inclusion of FPGAs in supercomputing clusters. This paper presents a comparative analysis of FPGAs and traditional processors, focusing on floating point performance and procurement costs, revealing economic hurdles in the adoption of FPGAs for general high-performance computing (HPC). [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
32. The Chameleon Architecture for Streaming DSP Applications.
- Author
-
Smit, Gerard J. M., Kokkeler, André B. J., Wolkotte, Pascal T., Hölzenspies, Philip K. F., van de Burgwal, Marcel D., and Heysters, Paul M.
- Subjects
DIGITAL signal processing ,IMAGE processing ,ADAPTIVE computing systems ,COMPUTER interfaces ,DIGITAL electronics ,COMPUTER systems - Abstract
We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm² in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
33. An FPGA Implementation of a Parallelized MT19937 Uniform Random Number Generator
- Author
-
David Kearney, Vinay Sriram, Sriram, Vinay Bajee, and Kearney, David Andrew
- Subjects
Pseudorandom number generator ,General Computer Science ,Random number generation ,Computer science ,lcsh:Electronics ,lcsh:TK7800-8360 ,Parallel computing ,32-bit ,Reconfigurable computing ,Lavarand ,Control and Systems Engineering ,Hardware random number generator ,Field-programmable gate array ,Throughput (business) ,Computer Science(all) - Abstract
Recent times have witnessed an increase in use of high-performance reconfigurable computing for accelerating large-scale simulations. A characteristic of such simulations, like infrared (IR) scene simulation, is the use of large quantities of uncorrelated random numbers. It is therefore of interest to have a fast uniform random number generator implemented in reconfigurable hardware. While there have been previous attempts to accelerate the MT19937 pseudouniform random number generator using FPGAs we believe that we can substantially improve the previous implementations to develop a higher throughput and more area-time efficient design. Due to the potential for parallel implementation of random numbers generators, designs that have both a small area footprint and high throughput are to be preferred to ones that have the high throughput but with significant extra area requirements. In this paper, we first present a single port design and then present an enhanced 624 port hardware implementation of the MT19937 algorithm. The 624 port hardware implementation when implemented on a Xilinx XC2VP70-6 FPGA chip has a throughput of 119.6 × 10^9 32-bit random numbers per second which is more than 17x that of the previously best published uniform random number generator. Furthermore it has the lowest area time metric of all the currently published FPGA-based pseudouniform random number generators.
- Published
- 2009
34. Run-Time HW/SW Scheduling of Data Flow Applications on Reconfigurable Architectures
- Author
-
Benoit Miramond, Fakhreddine Ghaffari, François Verdier, Equipes Traitement de l'Information et Systèmes (ETIS - UMR 8051), and Ecole Nationale Supérieure de l'Electronique et de ses Applications (ENSEA)-Centre National de la Recherche Scientifique (CNRS)-CY Cergy Paris Université (CY)
- Subjects
[INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR] ,Virtex ,General Computer Science ,Computer science ,business.industry ,Cycles per instruction ,lcsh:Electronics ,Control reconfiguration ,lcsh:TK7800-8360 ,02 engineering and technology ,Reconfigurable computing ,020202 computer hardware & architecture ,Scheduling (computing) ,Data flow diagram ,Software ,Computer architecture ,Control and Systems Engineering ,020204 information systems ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,business ,ComputingMilieux_MISCELLANEOUS ,Data-flow analysis ,Computer Science(all) - Abstract
This paper presents an efficient dynamic and run-time Hardware/Software scheduling approach. This scheduling heuristic consists in mapping online the different tasks of a highly dynamic application in such a way that the total execution time is minimized. We consider soft real-time data flow graph oriented applications for which the execution time is function of the input data nature. The target architecture is composed of two processors connected to a dynamically reconfigurable hardware accelerator. Our approach takes advantage of the reconfiguration property of the considered architecture to adapt the treatment to the system dynamics. We compare our heuristic with another similar approach. We present the results of our scheduling method on several image processing applications. Our experiments include simulation and synthesis results on a Virtex V-based platform. These results show a better performance against existing methods.
- Published
- 2009
- Full Text
- View/download PDF
35. Reconfigurable Computing and Hardware/Software Codesign.
- Author
-
Plaks, Toomas P., Santambrogio, Marco D., and Sciuto, Donatella
- Subjects
EMBEDDED computer systems ,COMPUTER systems - Abstract
The article discusses various reports published within the issue, including one by Y. Qu and colleagues on a design flow instantiation for run-time reconfigurable system and G. B. Knerr and colleagues on approaches for system partitioning into a consistent design framework for wireless embedded systems.
- Published
- 2008
- Full Text
- View/download PDF
36. Communication-Oriented Design Space Exploration for Reconfigurable Architectures
- Author
-
Lilian Bossuet, Jean-Luc Philippe, Guy Gogniat, Laboratoire de l'intégration, du matériau au système (IMS), Centre National de la Recherche Scientifique (CNRS)-Institut Polytechnique de Bordeaux-Université Sciences et Technologies - Bordeaux 1, Laboratoire d'Electronique des Systèmes TEmps Réel (LESTER), Centre National de la Recherche Scientifique (CNRS)-Université de Bretagne Sud (UBS), Université Sciences et Technologies - Bordeaux 1-Institut Polytechnique de Bordeaux-Centre National de la Recherche Scientifique (CNRS), and Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR] ,Focus (computing) ,General Computer Science ,Design space exploration ,Computer science ,lcsh:Electronics ,lcsh:TK7800-8360 ,020207 software engineering ,02 engineering and technology ,Reconfigurable computing ,Database-centric architecture ,020202 computer hardware & architecture ,Set (abstract data type) ,Computer architecture ,Control and Systems Engineering ,0202 electrical engineering, electronic engineering, information engineering ,Architecture ,Field-programmable gate array ,Design space ,Computer Science(all) - Abstract
Many academic works in computer engineering focus on reconfigurable architectures and associated tools. Fine-grain architectures, field programmable gate arrays (FPGAs), are the most well-known structures of reconfigurable hardware. Dedicated tools (generic or specific) allow for the exploration of their design space to choose the best architecture characteristics and/or to explore the application characteristics. The aim is to increase the synergy between the application and the architecture in order to get the best performance. However, there is no generic tool to perform such an exploration for coarse-grain or heterogeneous-grain architectures, just a small number of very specific tools are able to explore a limited set of architectures. To address this major lack, in this paper we propose a new design space exploration approach adapted to fine- and coarse-grain granularities. Our approach combines algorithmic and architecture explorations. It relies on an automatic estimation tool which computes the communication hierarchical distribution and the architectural processing resources use rate for the architecture under exploration. Such an approach forwards the rapid definition of efficient reconfigurable architectures dedicated to one or several applications.
- Published
- 2007
- Full Text
- View/download PDF