420 results for "Dual modular redundancy"
Search Results
2. An On-chip Router Architecture for Dependable Multicore Processor
- Author
- Kise, Kenji and Asai, Shojiro, editor
- Published
- 2019
3. An FPGA-based network system with service-uninterrupted remote functional update.
- Author
- Tan, Tze Hon, Chia Yee Ooi, and Marsono, Muhammad Nadzir
- Subjects
SENSOR placement; GATE array circuits; TELECOMMUNICATION systems; 5G networks; WIRELESS sensor networks; ACQUISITION of data
- Abstract
The recent emergence of 5G network enables mass wireless sensors deployment for internet-of-things (IoT) applications. In many cases, IoT sensors in monitoring and data collection applications are required to operate continuously and active at all time (24/7) to ensure all data are sampled without loss. Field-programmable gate array (FPGA)-based systems exhibit a balanced processing throughput and datapath flexibility. Specifically, datapath flexibility is acquired from the FPGA-based system architecture that supports dynamic partial reconfiguration feature. However, device functional update can cause interruption to the application servicing, especially in an FPGA-based system. This paper presents a standalone FPGA-based system architecture that allows remote functional update without causing service interruption by adopting a redundancy mechanism in the application datapath. By utilizing dynamic partial reconfiguration, only the updating datapath is temporarily inactive while the rest of the circuitry, including the redundant datapath, remain active. Hence, there is no service interruption and downtime when a remote functional update takes place due to the existence of redundant application datapath, which is critical for network and communication systems. The proposed architecture has a significant impact for application in FPGA-based systems that have little or no tolerance in service interruption. [ABSTRACT FROM AUTHOR]
- Published
- 2021
4. Radiation Hardened Digital Direct Synthesizer With CORDIC for Spaceborne Applications
- Author
- Luis Alberto Aranda, Francisco Garcia-Herrero, Luis Esteban, Alfonso Sanchez-Macian, and Juan Antonio Maestro
- Subjects
CORDIC; digital signal processing; dual modular redundancy; fault tolerance; radiation; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
- Abstract
The Coordinate Rotation Digital Computer algorithm (CORDIC) is a simple mechanism to compute a set of elementary functions, such as trigonometric functions, using fixed-point devices. It is widely adopted, also in applications running in harsh environments such as space, where radiation is a cause of errors in nanoelectronic devices. A single event upset in a configuration bit of a Field Programmable Gate Array (FPGA) can completely change the behavior of the implemented circuit, so it is important to detect and reconfigure the FPGA when this happens. Dual modular redundancy is the typical method to detect errors in electronic circuits, but it has an important overhead in area and power consumption and it does not provide any additional functionality apart from the activation of the FPGA reconfiguration trigger in presence of error. This paper presents two ad-hoc techniques to protect the Digital Direct Synthesizer with CORDIC when it is implemented into an FPGA, with limited overhead in terms of area and power consumption when compared with the traditional solution. The first solution slightly increases the percentage of undetected errors, about 11%, reducing to almost half the area overhead of the circuit. The second solution introduces a trade-off between the percentage of error detection and the precision of the trigonometric output of the CORDIC by means of a polymorphic structure with lower area resources than the existing solutions. This last proposal allows the system to increase the precision of the digital synthesis signal under absence of errors or to activate the error protection in scenarios with external disturbances such as radiation.
- Published
- 2020
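The abstract of result 4 describes DMR as a detection-only mechanism: two copies of a circuit run side by side, and a mismatch fires the reconfiguration trigger. The idea can be sketched in software as below. This is a minimal illustration only, with plain Python callables standing in for the FPGA modules; the names `dmr_execute` and `on_mismatch` are hypothetical and are not taken from the paper.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def dmr_execute(replica_a: Callable[[], T],
                replica_b: Callable[[], T],
                on_mismatch: Callable[[], None]) -> Tuple[T, bool]:
    """Run two redundant replicas and compare their outputs.

    DMR can only detect a disagreement. It cannot tell which replica is
    wrong, so recovery (for example reconfigure and retry) is left to the
    caller.
    """
    out_a = replica_a()
    out_b = replica_b()
    error_detected = out_a != out_b
    if error_detected:
        on_mismatch()  # hypothetical stand-in for the reconfiguration trigger
    return out_a, error_detected
```

When the flag is set, the returned value should not be trusted: with only two copies there is no way to decide which one was upset.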
5. Simple and Low-Cost Heartbeat-Based Dual Modular Redundant Systems for Wireless Sensor Networks
- Author
- Na, Jongwhoa; Lee, Dongwoo; Zorigbold, Munkh; Lee, Dongmin; and Moon, Sungyup (authors); Angrisani, Leopoldo; Arteaga, Marco; Panigrahi, Bijaya Ketan; Chakraborty, Samarjit; Chen, Jiming; Chen, Shanben; Chen, Tan Kay; Dillmann, Ruediger; Duan, Haibin; Ferrari, Gianluigi; Ferre, Manuel; Hirche, Sandra; Jabbari, Faryar; Jia, Limin; Kacprzyk, Janusz; Khamis, Alaa; Kroeger, Torsten; Liang, Qilian; Ming, Tan Cher; Minker, Wolfgang; Misra, Pradeep; Möller, Sebastian; Mukhopadhyay, Subhas; Ning, Cun-Zheng; Nishida, Toyoaki; Pascucci, Federica; Qin, Yong; Seng, Gan Woon; Veiga, Germano; Wu, Haitao; and Zhang, Junjie James (series editors); Park, James J.; Loia, Vincenzo; Yi, Gangman; and Sung, Yunsick (editors)
- Published
- 2018
6. Cross-Layer Dual Modular Redundancy Hardened Scheme of Flip-Flop Design Based on Sense-Amplifier.
- Author
- Huang, Zhengfeng, Su, Zian, Ni, Tianming, Xu, Qi, Qi, Haochen, Lu, Yingchun, Eric, Manzi, and Xu, Hui
- Subjects
LOGIC circuits; MODULAR design; REDUNDANCY in engineering
- Abstract
As the demand for low-power and high-speed logic circuits increases, the design of differential flip-flops based on sense-amplifier (SAFF), which have excellent power and speed characteristics, has become more and more popular. Conventional SAFF (Con SAFF) and improved SAFF designs focus more on the improvement of speed and power consumption, but ignore their Single-Event-Upset (SEU) sensitivity. In fact, SAFF is more susceptible to particle impacts due to the small voltage swing required for differential input in the master stage. Based on the SEU vulnerability of SAFF, this paper proposes a novel scheme, namely cross-layer Dual Modular Redundancy (DMR), to improve the robustness of SAFF. That is, unit-level DMR technology is performed in the master stage, while transistor-level stacking technology is used in the slave stage. This scheme can be applied to some current typical SAFF designs, such as Con SAFF, Strollo SAFF, Ahmadi SAFF, Jeong SAFF, etc. Detailed HSPICE simulation results demonstrate that hardened SAFF designs can not only fully tolerate the Single Node Upset of sensitive nodes, but also partially tolerate the Double Node Upset caused by charge sharing. Besides, compared with the conventional DMR hardened scheme, the proposed cross-layer DMR hardened scheme not only has the same fault-tolerant characteristics, but also greatly reduces the delay, area and power consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2021
7. Design and Verification of a Dual Modular Redundancy Hamming Code.
- Author
- 乔冰涛, 吴旭凡, 刘海静, 王正, and 董业民
- Subjects
INTEGRATED circuits; HAMMING codes; DELAY lines; SOFT errors; MEMORY; FLAGS
- Abstract
Copyright of Journal of Harbin Institute of Technology. Social Sciences Edition / Haerbin Gongye Daxue Xuebao. Shehui Kexue Ban is the property of Harbin Institute of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2020
8. A Highly Robust and Low-Power Real-Time Double Node Upset Self-Healing Latch for Radiation-Prone Applications
- Author
- Sandeep Kumar and Atin Mukherjee
- Subjects
Computer science; Hardware_PERFORMANCEANDRELIABILITY; Fault injection; Upset; Power (physics); Hardware and Architecture; Robustness (computer science); Node (circuits); Sensitivity (control systems); Hardware_ARITHMETICANDLOGICSTRUCTURES; Electrical and Electronic Engineering; Dual modular redundancy; Software; Simulation; Hardware_LOGICDESIGN; Voltage
- Abstract
This work presents a single event double node upset (SEDNU) self-healing (DNUSH) latch to meet the high-robustness requirement of the applications used in a harsh radiation environment. The DNUSH latch is based on dual modular redundancy and mainly employs C-elements and inverters, forming multi-feedback interlocked loops to retain the correct data even after the radiation event. The self-healing capability of the proposed latch is successfully shown by the fault injection simulation using Synopsys HSPICE. Simulation results show that the proposed latch can self-heal from all SEDNUs, consumes low power even for high-speed operations, and has the least power-delay-area product (PDAP) compared to the existing SEDNU resilient latches. The proposed latch offers on average 51.25% improvement in speed, 22.67% saving in power consumption, and 59.74% lower PDAP compared to the existing SEDNU resilient latches. In addition, the sensitivity assessment of the proposed latch against the process, voltage, and temperature (PVT) variations are found to be either low or equivalent to the reference latches.
- Published
- 2021
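The latch in result 8 is built mainly from C-elements. As a rough behavioural sketch (an assumption for illustration, not the circuit from the paper), a textbook non-inverting Muller C-element copies its inputs to the output when they agree and holds its previous output when they disagree. Radiation-hardened latches often use an inverting variant, which is not modelled here. The class name `CElement` is hypothetical.

```python
class CElement:
    """Behavioural model of a two-input, non-inverting Muller C-element."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial

    def update(self, a: int, b: int) -> int:
        # Inputs agree: the output follows them.
        # Inputs disagree: the output keeps its previous value, so an upset
        # on a single input does not propagate.
        if a == b:
            self.out = a
        return self.out
```

The hold behaviour is what lets redundant storage nodes filter a single upset: the flipped node disagrees with its partner, so the element keeps the last agreed value.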
9. A Comparison of Dual Modular Redundancy and Concurrent Error Detection in Finite Impulse Response Filters Implemented in SRAM-Based FPGAs Through Fault Injection.
- Author
- Aranda, Luis Alberto, Reviriego, Pedro, and Maestro, Juan Antonio
- Abstract
Compared with application specific integrated circuits (ASICs), static random access memory (SRAM)-based field programmable gate arrays (FPGAs) respond differently to radiation due to the configuration memory vulnerability. In this brief, the differences between the permanent error model for SRAM-based FPGAs due to configuration memory single event upsets (SEUs), and the ASIC SEU error model are put into perspective for error detection schemes. In particular, a concurrent error detection (CED) technique for finite impulse response filters in ASICs is implemented and evaluated in an SRAM-based FPGA through fault injection emulation. This method is compared with a dual modular redundancy (DMR) scheme in order to obtain a common behavior. The analysis of experimental data indicates that the CED technique has less undetected errors than DMR. However, our exhaustive fault injection tests reveal that false positive detections are more likely to occur in CED, since the error detection branch uses more FPGA resources than the DMR comparator. This phenomenon, which is negligible in ASICs, implies a partial or complete unnecessary reconfiguration, so it should be considered in SRAM-based FPGAs. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
10. Timing diversity as a protective mechanism
- Author
- Mischa Mostl, Rolf Ernst, Anika Christmann, and Robin Hapka
- Subjects
Ethernet; Software; Exploit; business.industry; Computer science; Reliability (computer networking); Overhead (engineering); Redundancy (engineering); Avionics; Dual modular redundancy; business; Reliability engineering
- Abstract
Dual modular redundancy (DMR) is not only an established solution for systems with high reliability demands, it is even required in aviation certification standards such as DO-254 [5, Clause 2.3.1]. A safety critical avionic application such as the flight control system is designed with up to 6-fold redundancy and the Avionics Full-Duplex Ethernet (AFDX) communication network is also based on the DMR. Even in the automotive domain, DMR is a well known solution. ISO26262 [3, Part 6, Clause 7.4.13] also suggests heterogeneous or diverse redundancy for safety-critical applications including software which must be redundantly executed on independent hardware components to avoid failure due to hardware errors. We exploit this mandatory software redundancy to master timing errors of critical software with minimum additional overhead.
- Published
- 2021
11. An FPGA-based network system with service-uninterrupted remote functional update
- Author
- Chia Yee Ooi, Tze Hon Tan, and Muhammad Nadzir Marsono
- Subjects
Dual modular redundancy; Dynamic partial reconfiguration; General Computer Science; business.industry; Computer science; Control reconfiguration; Communications system; NetFPGA; Gate array; Embedded system; Datapath; Redundancy (engineering); Systems architecture; Electrical and Electronic Engineering; Hardware_ARITHMETICANDLOGICSTRUCTURES; business; Field-programmable gate array; Service-uninterrupted remote functional update
- Abstract
The recent emergence of 5G network enables mass wireless sensors deployment for internet-of-things (IoT) applications. In many cases, IoT sensors in monitoring and data collection applications are required to operate continuously and active at all time (24/7) to ensure all data are sampled without loss. Field-programmable gate array (FPGA)-based systems exhibit a balanced processing throughput and datapath flexibility. Specifically, datapath flexibility is acquired from the FPGA-based system architecture that supports dynamic partial reconfiguration feature. However, device functional update can cause interruption to the application servicing, especially in an FPGA-based system. This paper presents a standalone FPGA-based system architecture that allows remote functional update without causing service interruption by adopting a redundancy mechanism in the application datapath. By utilizing dynamic partial reconfiguration, only the updating datapath is temporarily inactive while the rest of the circuitry, including the redundant datapath, remain active. Hence, there is no service interruption and downtime when a remote functional update takes place due to the existence of redundant application datapath, which is critical for network and communication systems. The proposed architecture has a significant impact for application in FPGA-based systems that have little or no tolerance in service interruption.
- Published
- 2021
12. Multi-objective redundancy hardening with optimal task mapping for independent tasks on multi-cores
- Author
- Xin Yao, Bin Li, Zhigang Zeng, Huanhuan Chen, and Bo Yuan
- Subjects
Triple modular redundancy; 0209 industrial biotechnology; Job shop scheduling; Computer science; Evolutionary algorithm; Computational intelligence; 02 engineering and technology; Fault detection and isolation; Theoretical Computer Science; 020901 industrial engineering & automation; Computer engineering; 0202 electrical engineering, electronic engineering, information engineering; Redundancy (engineering); Memetic algorithm; Systems design; 020201 artificial intelligence & image processing; Geometry and Topology; Dual modular redundancy; Software
- Abstract
The rate of transient faults has increased significantly as the technology scales up. The tolerance of transient faults has become an important issue in the system design. Dual modular redundancy (DMR) and triple modular redundancy (TMR) are two commonly used techniques that can achieve fault detection and masking through executing redundant tasks. As DMR and TMR have different time and cost overheads, we must carefully determine which one should be used for each task (i.e., task hardening) to achieve the optimal system design. Furthermore, for multi-core systems, the system-level design includes the allocation of cores for the tasks (i.e., task mapping) as well. This paper aims at task hardening and mapping simultaneously for independent tasks on multi-cores with heterogeneous performances, in order to minimize the maximum completion time of all tasks (i.e., makespan). We demonstrate that once task hardening is given, task mapping of independent tasks can be achieved by employing min–max-weight perfect matching with a polynomial time complexity. Besides, as there is a trade-off between cost and time performance, we propose a multi-objective memetic algorithm (MOMA)-based task hardening method to obtain a set of solutions with different numbers of cores (i.e., costs), so the designer can choose different solutions according to different requirements. The key idea of the MOMA is to incorporate problem-specific knowledge into the global search of evolutionary algorithms. Our experimental studies have demonstrated the effectiveness of the proposed method and have shown that by combining the results of MOMA and MOEA we can provide a designer with a highly accurate set of solutions within a reasonable amount of time.
- Published
- 2019
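Result 12 contrasts DMR (fault detection) with TMR (fault masking). The difference can be sketched as follows, assuming replica results are hashable Python values; the helper names `dmr_check` and `tmr_vote` are hypothetical and do not come from the paper.

```python
from collections import Counter


def dmr_check(a, b):
    """Return (value, ok). Two replicas can detect a mismatch but not mask it."""
    return (a, True) if a == b else (None, False)


def tmr_vote(a, b, c):
    """Return the majority of three replica results, masking a single fault."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: more than one replica disagrees")
    return value
```

The third replica is the extra cost that buys masking, which is the time and cost trade-off the abstract refers to when choosing a hardening level per task.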
13. Dual-modular-redundancy and dual-level error-interception based triple-node-upset tolerant latch designs for safety-critical applications
- Author
- Patrick Girard, Jun Zhou, Zhihui He, Zhengfeng Huang, Aibin Yan, Tianming Ni, Jie Cui, Xiaoqing Wen, Anhui University [Hefei], Anhui Polytechnic University, Kyushu Institute of Technology, TEST (TEST), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), and Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
- Subjects
business.industry; Computer science; 020208 electrical & electronic engineering; General Engineering; Fault tolerance; 02 engineering and technology; self-recoverability; Upset; 020202 computer hardware & architecture; Dual (category theory); fault-tolerance; low-cost; triple-node upset; latch design; 0202 electrical engineering, electronic engineering, information engineering; Node (circuits); Interception; [SPI.NANO]Engineering Sciences [physics]/Micro and nanotechnologies/Microelectronics; business; Dual modular redundancy; Computer hardware
- Abstract
This paper presents a dual-modular-redundancy and dual-level error-interception based triple-node-upset (TNU) tolerant latch design (namely DDETT) for safety-critical applications. The DDETT latch comprises two parallel single-node-upset self-recoverable cells to store values and three C-elements to intercept errors. Both of the two cells are constructed from triple mutually-feeding-back 2-input C-elements, and the cells feed two internal C-elements for first-level error-interception. Moreover, the two internal C-elements feed an output-stage C-element for second-level error-interception, making the DDETT latch TNU-tolerant in that it can tolerate any possible TNU. This paper further presents a low-cost version of the DDETT latch, namely LCDDETT. The LCDDETT latch uses two dual-interlocked-storage-cells (DICEs) to store values and uses dual-level error-interception to tolerate any possible TNU with cost-effectiveness. Simulation results not only confirm the TNU-tolerance of the proposed latches but also demonstrate that the delay-power-area products of the DDETT and LCDDETT latches are reduced by approximately 34% and 58%, respectively.
- Published
- 2021
14. FT-BLAS: A High Performance BLAS Implementation With Online Fault Tolerance
- Author
- Elisabeth Giem, Zizhong Chen, Quan Fan, Yujia Zhai, Kai Zhao, and Jinyang Liu
- Subjects
FOS: Computer and information sciences; 020203 distributed computing; Computer Science - Performance; Low overhead; On the fly; Computer science; Reliability (computer networking); MathematicsofComputing_NUMERICALANALYSIS; Fault tolerance; 010103 numerical & computational mathematics; 02 engineering and technology; Parallel computing; 01 natural sciences; Basic Linear Algebra Subprograms; Performance (cs.PF); Computer Science - Distributed, Parallel, and Cluster Computing; 0202 electrical engineering, electronic engineering, information engineering; Distributed, Parallel, and Cluster Computing (cs.DC); SIMD; 0101 mathematics; Dual modular redundancy
- Abstract
Basic Linear Algebra Subprograms (BLAS) is a core library in scientific computing and machine learning. This paper presents FT-BLAS, a new implementation of BLAS routines that not only tolerates soft errors on the fly, but also provides comparable performance to modern state-of-the-art BLAS libraries on widely-used processors such as Intel Skylake and Cascade Lake. To accommodate the features of BLAS, which contains both memory-bound and computing-bound routines, we propose a hybrid strategy to incorporate fault tolerance into our brand-new BLAS implementation: duplicating computing instructions for memory-bound Level-1 and Level-2 BLAS routines and incorporating an Algorithm-Based Fault Tolerance mechanism for computing-bound Level-3 BLAS routines. Our high performance and low overhead are obtained from delicate assembly-level optimization and a kernel-fusion approach to the computing kernels. Experimental results demonstrate that FT-BLAS offers high reliability and high performance -- faster than Intel MKL, OpenBLAS, and BLIS by up to 3.50%, 22.14% and 21.70%, respectively, for routines spanning all three levels of BLAS we benchmarked, even under hundreds of errors injected per minute. (Camera-ready version at ICS'21: International Conference on Supercomputing 2021, with ISBN updated.)
- Published
- 2021
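Result 14 mentions Algorithm-Based Fault Tolerance (ABFT) for Level-3 routines. A common textbook form of ABFT for matrix multiplication uses a column-checksum identity: the column sums of C = A·B must equal (column sums of A)·B. The sketch below illustrates that identity in pure Python. It is an assumption-laden toy, not the FT-BLAS implementation (which the abstract describes as assembly-level), and the names `matmul` and `abft_check` are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two row-major lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def abft_check(a, b, c, tol=1e-9):
    """Check C against the column-checksum identity (e^T A) B = e^T C."""
    col_sums_a = [sum(col) for col in zip(*a)]   # e^T A
    expected = matmul([col_sums_a], b)[0]        # (e^T A) B
    actual = [sum(col) for col in zip(*c)]       # e^T C
    return all(abs(e - x) <= tol for e, x in zip(expected, actual))
```

The check costs one extra vector-matrix product rather than a second full multiplication, which is why checksum schemes are attractive for compute-bound routines compared with straightforward duplication.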
15. Design methodology for fault tolerant ASICs.
- Author
- Petrovic, Vladimir, Ilic, Marko, Schoof, Gunter, and Stamenkovic, Zoran
- Abstract
The sensitivity of application specific integrated circuits (ASICs) to the single event effects (SEE) can induce failures of the systems which are exposed to increased radiation levels in the space and on the ground. This paper presents a design methodology for a full fault tolerant ASIC that is immune to the single event upsets (SEU) in sequential logic, the single event transients (SET) in combinational logic and the single event latchup (SEL). The dual modular redundancy (DMR) and a SEL power-switch (SPS) are the basis for a modified ASIC design flow. Measurement results have proven the correct functionality of DMR and SPS circuits, as well as a high fault tolerance of implemented ASICs along with moderate overhead in respect of power consumption and occupied silicon area. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
16. Fast error detection through efficient use of hardwired resources in FPGAs.
- Author
- Nazar, Gabriel L. and Carro, Luigi
- Abstract
Providing high reliability for FPGAs is a demanding task, as such devices may be subject to faults in the configuration bitstream, altering the specified function. Traditional modular redundancy remains the most used technique, due to its high fault coverage and low performance overhead. When high availability and strict real-time deadlines must be considered, however, a short mean time to repair also becomes crucial. The use of fine-grained modules can accelerate error detection, fault diagnosis and bitstream correction, but with increased area costs. In this work, we propose the use of hardwired resources found in state-of-the-art FPGAs to provide fast and area efficient fine-grained error detection. Experimental results show an average speed up in error detection of 7.68 times with only 3.2% more area overhead, when compared to coarse-grained modular redundancy. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
17. Improvements and recent updates of persistent fault analysis on block ciphers
- Author
- Xinjie Zhao, Xiaoxuan Lou, Shivam Bhasin, Shize Guo, Bolin Yang, Guorui Xu, Kui Ren, and Fan Zhang
- Subjects
Constraint (information theory); Computer engineering; Feature (computer vision); Computer science; business.industry; Fault injection; Dual modular redundancy; Fault (power engineering); Encryption; business; Implementation; Block cipher
- Abstract
Persistence is an intrinsic property of many errors, yet it has not attracted enough attention for years. In this chapter, the feature of persistence is applied to fault attacks (FAs), and the persistent FA is proposed. Different from traditional FAs, adversaries can prepare the fault injection stage before the encryption stage, which relaxes the constraint of tightly coupled time synchronization. The persistent fault analysis (PFA) is elaborated on different implementations of AES-128, especially fault-hardened implementations based on dual modular redundancy (DMR). Our experimental results show that PFA is quite simple and efficient in breaking these typical implementations. To show the feasibility and practicability of our attack, a case study is illustrated on a few countermeasures of masking. This work puts forward a new direction of FAs and can be extended to attack other implementations under more interesting scenarios.
- Published
- 2020
18. Simulation-based system reliability analysis of electrohydraulic actuator with dual modular redundancy
- Author
- Andreev, Maxim, Kolesnikov, Artem, Grätz, Uwe, Gundermann, Julia, and Dresdner Verein zur Förderung der Fluidtechnik e. V. Dresden
- Subjects
reliability; Computer science; ddc:621.3; Reliability engineering; 12th International Fluid Power Conference, failure detection system, reliability, digital twin, machine learning; machine learning; 12. IFK, Fehlererkennungssystem, Zuverlässigkeit, digitaler Zwilling, maschinelles Lernen; 12th International Fluid Power Conference; digital twin; failure detection system; ddc:620; Actuator; Dual modular redundancy; Simulation based; Reliability (statistics)
- Abstract
This paper describes the failure detection system of an electro-hydraulic actuator with dual modular redundancy based on a hybrid twin™ concept. Hybrid twin™ is a combination of a virtual twin that operates in parallel with the actuator and represents its ideal behaviour, and a digital twin that identifies possible failures using the sensor reading residuals. Simulation-based system reliability analysis helps to generate a dataset for training the digital twin using machine learning algorithms. A systematic failure detection approach based on decision trees and the process of analysing the quality of the result are described.
- Published
- 2020
19. Software-Only Triple Diverse Redundancy on GPUs for Autonomous Driving Platforms
- Author
- Carles Hernandez, Jaume Abella, Leonidas Kosmidis, Sergi Alcaide Portet, Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, and Barcelona Supercomputing Center
- Subjects
Triple modular redundancy; Single fault; Vehicles autònoms; Computer science; Computation; Autonomous vehicles; GPU; 02 engineering and technology; 01 natural sciences; Fault detection and isolation; Software; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Redundancy (engineering); CCF; Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]; Microprocessors; 010302 applied physics; business.industry; Unitats de processament gràfic; 020202 computer hardware & architecture; TMR; Embedded system; Autonomous driving; Microprocessadors; Dual modular redundancy; business; Error detection and correction; Graphics processing units; Fault detection
- Abstract
Autonomous driving (AD) imposes the need for safe computations in high-performance computing (HPC) components such as GPUs, thus with capabilities to detect and recover from errors since a safe state may not exist anymore. This can be achieved with Triple Modular Redundancy (TMR) for computation components. Furthermore, error detection capabilities need to provide some form of diversity to avoid the case where a single fault leads all redundant executions to the same error, which would go undetected. In our past work, we assessed GPUs against dual modular redundancy (DMR) with diversity, showing their potential and limitations to provide diverse redundancy building on reset and restart for recovery. However, such a recovery scheme may be too slow for some applications. This paper proposes a software-only solution to deliver diverse TMR on commercial off-the-shelf (COTS) GPUs. Our work details how staggered execution can be achieved and assesses the performance of TMR on COTS GPUs. Moreover, we identify those elements where diversity cannot be guaranteed and provide some discussion comparing the case of DMR and TMR for those elements. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871467 (SELENE). Leonidas Kosmidis has been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under a Juan de la Cierva Formacion postdoctoral fellowship with number FJCI-2017-34095.
- Published
- 2020
20. Evaluating the Impact of Repetition, Redundancy, Scrubbing, and Partitioning on 28-nm FPGA Reliability Through Neutron Testing
- Author
- Paolo Rech, Ogun O. Kibar, Ken Mai, and Prashanth Mohan
- Subjects
Triple modular redundancy; Nuclear and High Energy Physics; 010308 nuclear & particles physics; Computer science; Hardware_PERFORMANCEANDRELIABILITY; 01 natural sciences; Computational science; Nuclear Energy and Engineering; Logic gate; 0103 physical sciences; Redundancy (engineering); Static random-access memory; Hardware_ARITHMETICANDLOGICSTRUCTURES; Electrical and Electronic Engineering; Error detection and correction; Dual modular redundancy; Field-programmable gate array; Radiation hardening
- Abstract
SRAM-based field-programmable gate arrays (FPGAs) are widely deployed in space and high-radiation environments, but they exhibit vulnerability to radiation effects. Designs can be hardened against radiation effects with design-side countermeasures such as redundancy, scrubbing, and partitioning. Through neutron tests, we investigate the impact of these design-side countermeasures on 28-nm FPGAs. We specifically address not only the provided radiation hardness but also the resource utilization and performance overheads. In addition, we evaluate the efficacy of repeating the operation after error detection. The results show that using coarse-grained and fine-grained triple modular redundancy (TMR) over dual modular redundancy (DMR) improves the failure cross section by $3.29\times$ and $11.49\times$, respectively. The partitioning scheme that we used does not show a significant effect on radiation hardness. Using an internal scrubber and repeating the operation after a failure further decreases DMR, coarse-grained TMR, and fine-grained TMR cross sections by $5.10\times$, $1.85\times$, and $1.18\times$, respectively.
- Published
- 2019
21. A Dual Modular Redundancy Scheme for CPU–FPGA Platform-Based Systems
- Author
- Igor Brandao Machado Matsuo, Long Zhao, and Wei-Jen Lee
- Subjects
Computer science; business.industry; media_common.quotation_subject; 05 social sciences; Monitoring system; Data storage system; Industrial and Manufacturing Engineering; Control and Systems Engineering; Gate array; Embedded system; Voting; 0502 economics and business; Task analysis; Redundancy (engineering); 050211 marketing; Electrical and Electronic Engineering; Dual modular redundancy; business; Field-programmable gate array; 050203 business & management; media_common
- Abstract
This paper presents a practical view of how to implement a dual modular redundancy (DMR) scheme in a central processing unit–field-programmable gate array (CPU–FPGA) heterogeneous platform-based system, which is also described. FPGAs can be valuable resources when determinism and fast response/acquisition rates are required, while processing large volumes of data. On the other side, CPUs are affordable options for most other processing tasks, especially less frequent tasks, such as data recording. A heterogeneous platform is herein proposed and aims to achieve a reliable, however cost-effective solution. Subsequently, this paper proposes a method to implement a DMR scheme for monitoring systems by means of self-monitoring schemes, health indicators, and a voter that is not a physical switch connected to the monitoring devices, but is software-implemented within the interface system after the data storage system. This proved to be an effective way to deal with disagreements between units when an even number of units are being used, thus not being possible to use a majority-based voting system. The implemented design was thoroughly tested, showing effectiveness in terms of redundancy with improved reliability.
- Published
- 2018
- Full Text
- View/download PDF
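The software-implemented arbitration described in the abstract of result 21 can be illustrated with a minimal sketch. It assumes each unit reports a value together with a single boolean health indicator; the names `UnitReading`, `select_output`, and `tolerance` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class UnitReading:
    value: float   # measurement reported by one redundant unit
    healthy: bool  # outcome of that unit's self-monitoring checks (assumed boolean)


def select_output(a: UnitReading, b: UnitReading,
                  tolerance: float = 1e-6) -> Tuple[Optional[float], str]:
    """Pick an output for a two-unit (DMR) arrangement.

    With only two units there is no majority, so health indicators
    decide which unit to trust when one of them reports a fault.
    """
    if a.healthy and b.healthy:
        if abs(a.value - b.value) <= tolerance:
            return a.value, "agree"
        # Both units claim to be healthy but disagree: flag it, do not guess.
        return None, "unresolved"
    if a.healthy:
        return a.value, "unit_b_faulty"
    if b.healthy:
        return b.value, "unit_a_faulty"
    return None, "both_faulty"
```

Returning `None` for the unresolved case is a design choice of this sketch: a two-unit system can detect such a disagreement but cannot mask it without extra information.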
22. Feedback-Based Low-Power Soft-Error-Tolerant Design for Dual-Modular Redundancy
- Author
-
Han Jie, Yufeng Li, Xuan Zeng, Jie Chen, Jianhao Hu, Bruce F. Cockburn, Fan Yang, and Yan Li
- Subjects
Very-large-scale integration ,Markov random field ,Computer science ,media_common.quotation_subject ,020208 electrical & electronic engineering ,Markov process ,02 engineering and technology ,020202 computer hardware & architecture ,symbols.namesake ,Soft error ,Computer engineering ,Hardware and Architecture ,Voting ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,symbols ,Electrical and Electronic Engineering ,Dual modular redundancy ,Software ,media_common - Abstract
Triple-modular redundancy (TMR), which consists of three identical modules and a voting circuit, is a common architecture for soft-error tolerance. However, the original TMR suffers from two major drawbacks: the large area overhead and the vulnerability of the voter. In order to overcome these drawbacks, we propose a new complementary dual-modular redundancy (CDMR) scheme for mitigating the effect of soft errors. Inspired by the Markov random field (MRF) theory, a two-stage voting system is implemented in CDMR, including a first-stage optimal MRF structure and a second-stage high-performance merging unit. The CDMR scheme can reduce the voting circuit area by 20% while saving the area of one redundant module, achieving at least 26% error-rate reduction at an ultralow supply voltage of 0.25 V with 8.33% faster timing compared to previous voter designs.
- Published
- 2018
- Full Text
- View/download PDF
23. A Comparison of Dual Modular Redundancy and Concurrent Error Detection in Finite Impulse Response Filters Implemented in SRAM-Based FPGAs Through Fault Injection
- Author
-
Juan Antonio Maestro, Pedro Reviriego, and Luis Alberto Aranda
- Subjects
Finite impulse response ,010308 nuclear & particles physics ,Computer science ,business.industry ,020208 electrical & electronic engineering ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,Fault injection ,01 natural sciences ,Application-specific integrated circuit ,Embedded system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Static random-access memory ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,Electrical and Electronic Engineering ,Error detection and correction ,Dual modular redundancy ,Field-programmable gate array ,business - Abstract
Compared with application specific integrated circuits (ASICs), static random access memory (SRAM)-based field programmable gate arrays (FPGAs) respond differently to radiation due to the configuration memory vulnerability. In this brief, the differences between the permanent error model for SRAM-based FPGAs due to configuration memory single event upsets (SEUs), and the ASIC SEU error model are put into perspective for error detection schemes. In particular, a concurrent error detection (CED) technique for finite impulse response filters in ASICs is implemented and evaluated in an SRAM-based FPGA through fault injection emulation. This method is compared with a dual modular redundancy (DMR) scheme in order to obtain a common behavior. The analysis of experimental data indicates that the CED technique has fewer undetected errors than DMR. However, our exhaustive fault injection tests reveal that false positive detections are more likely to occur in CED, since the error detection branch uses more FPGA resources than the DMR comparator. This phenomenon, which is negligible in ASICs, implies a partial or complete unnecessary reconfiguration, so it should be considered in SRAM-based FPGAs.
- Published
- 2018
- Full Text
- View/download PDF
24. Availability analysis of safety grade multiple redundant controller used in advanced nuclear safety systems
- Author
-
Gee Yong Park, Hyun Gook Kang, Dong Hoon Kim, and Kwang Seop Son
- Subjects
Triple modular redundancy ,Computer science ,020209 energy ,Programmable logic controller ,02 engineering and technology ,01 natural sciences ,010305 fluids & plasmas ,Reliability engineering ,Nuclear Energy and Engineering ,Backplane ,0103 physical sciences ,Fault coverage ,0202 electrical engineering, electronic engineering, information engineering ,Unavailability ,Dual modular redundancy ,Error detection and correction ,Mean time to repair - Abstract
We analyze the availability of the Safety Programmable Logic Controller (SPLC), which has multiple redundant architectures. In the SPLC, the input/output and processor modules are configured as triple modular redundancy (TMR), and the backplane bus, power, and communication modules are configured as dual modular redundancy (DMR). The voting logics for the redundant architectures are based on forwarding error detection, meaning that the receivers perform the voting logics based on the status information of the transmitters. To analyze the availability of the SPLC, we construct a Markov model and simplify it by adopting the system unavailability rate. The results show that the fault coverage factor should be ≥0.8 and the Mean Time To Repair (MTTR) should be ≤100 h in order to satisfy the requirement that the availability of the safety grade PLC be ≥0.995. We also evaluate the availability of the SPLC in comparison with other PLCs, such as the simplex, processor DMR (pDMR), and independent TMR (iTMR) PLCs used in existing nuclear safety systems. The availability of the SPLC is higher than those of the simplex and pDMR PLCs but lower than that of the iTMR PLC over one month, which is the interval of the periodic off-line test and inspection. This is because, assuming that operation is in its early stage, the number of redundant modules used in a PLC contributes more to availability than the number of fault-masking methods, such as voting logics, used in the PLC. However, the availability of iTMR, which has many redundant modules but only one voting logic, decreases quickly and eventually becomes the lowest after 8000 h. In addition, if the MTTR of each module in the PLC is required to be increased to 200 h, the availability of the SPLC would be better than that of iTMR.
- Published
- 2018
- Full Text
- View/download PDF
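To make the role of MTTR in the abstract of result 24 concrete, the following sketch uses the textbook steady-state availability of a single repairable module, MTTF / (MTTF + MTTR), and combines independent modules as 1-out-of-2 (DMR) and 2-out-of-3 (TMR). This is a simplification and not the paper's Markov model: it assumes independent repair and perfect fault coverage, and the function names are hypothetical.

```python
def steady_state_availability(mttf_h: float, mttr_h: float) -> float:
    """Long-run fraction of time one repairable module is up."""
    return mttf_h / (mttf_h + mttr_h)


def dmr_availability(a: float) -> float:
    """1-out-of-2: the pair is up unless both modules are down."""
    return 1.0 - (1.0 - a) ** 2


def tmr_availability(a: float) -> float:
    """2-out-of-3: up when at least two of three modules are up."""
    return 3.0 * a ** 2 - 2.0 * a ** 3
```

Under these assumptions, a longer MTTR lowers the per-module availability and therefore the availability of every redundant arrangement built from it. Imperfect fault coverage, which the abstract treats explicitly, would reduce these values further.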
25. Optimal reliability design of a system with k-out-of-n subsystems considering redundancy strategies
- Author
-
Heungseob Kim
- Subjects
Triple modular redundancy ,0209 industrial biotechnology ,Mathematical optimization ,021103 operations research ,Markov chain ,Computer science ,0211 other engineering and technologies ,02 engineering and technology ,Industrial and Manufacturing Engineering ,Parallel genetic algorithm ,Reliability engineering ,020901 industrial engineering & automation ,Approximation error ,Matrix analytic method ,Redundancy (engineering) ,Reliability design ,Safety, Risk, Reliability and Quality ,Dual modular redundancy - Abstract
This study presents new reliability models for k-out-of-n systems using a structured continuous-time Markov chain. The approach makes it easy to identify characteristics of the system lifetime, such as reliability and expected lifetime. To demonstrate this advantage, analysis results are presented for the multi-spectral camera system, the only payload of the Korea Multi-Purpose Satellite-2, as an example of a real system. Furthermore, existing studies on k-out-of-n systems with standby redundancy have provided only an approximated reliability. In this paper, it is confirmed that the approximation error can affect the reliability design of a system. Moreover, new versions of two reliability optimization problems, the redundancy allocation problem (RAP) and the reliability-redundancy allocation problem (RRAP), are proposed. In order to maximize system reliability, they additionally determine the redundancy strategy, either active or standby redundancy, for each k-out-of-n subsystem, going beyond the traditional problems. A parallel genetic algorithm is proposed for an RRAP modeled by nonlinear mixed integer programming.
- Published
- 2017
- Full Text
- View/download PDF
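As background for the k-out-of-n systems named in result 25, the standard closed form for active redundancy with n identical, independent components of reliability p is R = Σ_{i=k}^{n} C(n, i) p^i (1 − p)^(n − i). The sketch below evaluates that textbook formula only; it is not the Markov-chain model of the paper, and the function name is hypothetical.

```python
import math


def k_out_of_n_reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent components work.

    Assumes identical components, each working with probability p,
    all operating at once (active redundancy).
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))
```

With k = 2 and n = 3 this reduces to the familiar TMR expression 3p² − 2p³, and with k = 1 and n = 2 to the duplex expression 1 − (1 − p)².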
26. Fully Programmable Redundancy Circuits for STT-MRAM
- Author
-
Sang-Gyu Park and Dong-gi Lee
- Subjects
010302 applied physics ,Triple modular redundancy ,Magnetoresistive random-access memory ,Hardware_MEMORYSTRUCTURES ,Comparator ,business.industry ,Computer science ,020208 electrical & electronic engineering ,Array data type ,02 engineering and technology ,01 natural sciences ,Electronic, Optical and Magnetic Materials ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Electrical and Electronic Engineering ,Dual modular redundancy ,business ,Computer hardware ,Random access ,Electronic circuit - Abstract
We propose fully programmable redundancy schemes for spin-transfer-torque magnetic random access memories (STT-MRAMs). To store redundancy information, these schemes use magnetic tunnel junctions (MTJs), which are core memory elements of STT-MRAMs. This can greatly simplify the fabrication process of STT-MRAMs. Furthermore, it also allows reprogramming of the redundancy information after packaging or even during normal use by end-users without requiring any special high-voltage setup. We propose two redundancy schemes. First, we propose an address comparator, which uses MTJs and is a direct replacement of a conventional address comparator. Second, we propose a scheme in which the redundancy circuits share the storage cells and read–write peripheral circuits with the normal data array structure.
- Published
- 2017
- Full Text
- View/download PDF
27. Variation-Aware Reliable Many-Core System Design by Exploiting Inherent Core Redundancy
- Author
-
Ching-Yao Chou, An-Yeu Wu, Huai-Ting Li, Yuan-Ting Hsieh, and Wei-Ching Chu
- Subjects
010302 applied physics ,Triple modular redundancy ,Multi-core processor ,business.industry ,Computer science ,Distributed computing ,02 engineering and technology ,01 natural sciences ,020202 computer hardware & architecture ,Soft error ,Hardware and Architecture ,Robustness (computer science) ,Embedded system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Systems design ,Electrical and Electronic Engineering ,business ,Dual modular redundancy ,Software - Abstract
Reliability issues are more severe in multi/many-core systems because of the integration of more devices in advanced technology nodes. To achieve robust computing in nanoscale designs, many circuit-level and architecture-level redundancy techniques had been proposed, which pose large fixed silicon area overhead and a lack of flexibility. In recent years, some methods have exploited the “inherent core redundancy” of many-core systems to implicitly implement N-modular redundant (NMR) subsystems to achieve area-efficient fault-tolerant computing. However, while facing the different levels of soft error rate, task vulnerability, and task significance in the many-core system, existing core-level redundancy methods become ineffective. To achieve robust computation in many-core systems with intercore variations and mixed workloads, we propose a variation-aware core-level redundancy scheme. Two novel approaches are presented in this scheme: 1) we construct NMR tables that store the degree of redundancy using mathematical models for systems affected by these variations and 2) we dynamically allocate each replicated task to a proper core with variation-aware mapping algorithms to achieve high reliability. Based on a modified multicore simulator, Sniper-Transient Error Process Variation (TEVR), the experimental results show that the proposed scheme can increase the reliability by 47.92% and achieve the energy saving of 39% compared with conventional core-level redundancy methods.
- Published
- 2017
- Full Text
- View/download PDF
28. A new model for the redundancy allocation problem with component mixing and mixed redundancy strategy
- Author
-
Ali Zeinal Hamadani and Hadi Gholinezhad
- Subjects
Triple modular redundancy ,0209 industrial biotechnology ,Mathematical optimization ,Reliability optimization ,021103 operations research ,Optimization problem ,Exploit ,0211 other engineering and technologies ,02 engineering and technology ,Industrial and Manufacturing Engineering ,020901 industrial engineering & automation ,Decision variables ,Redundancy (engineering) ,Safety, Risk, Reliability and Quality ,Dual modular redundancy ,Algorithm ,Mathematics - Abstract
This paper develops a new model for redundancy allocation problem. In this paper, like many recent papers, the choice of the redundancy strategy is considered as a decision variable. But, in our model each subsystem can exploit both active and cold-standby strategies simultaneously. Moreover, the model allows for component mixing such that components of different types may be used in each subsystem. The problem, therefore, boils down to determining the types of components, redundancy levels, and number of active and cold-standby units of each type for each subsystem to maximize system reliability by considering such constraints as available budget, weight, and space. Since RAP belongs to the NP-hard class of optimization problems, a genetic algorithm (GA) is developed for solving the problem. Finally, the performance of the proposed algorithm is evaluated by applying it to a well-known test problem from the literature with relatively satisfactory results.
- Published
- 2017
- Full Text
- View/download PDF
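The difference between the active and cold-standby strategies mentioned in result 28 can be seen in a small worked comparison. The sketch assumes a 1-out-of-n subsystem of identical components with exponential lifetimes (failure rate `rate`), perfect switching, and no failures while in standby; these are textbook simplifications rather than the model of the paper, and the function names are hypothetical.

```python
import math


def active_parallel_reliability(rate: float, t: float, n: int) -> float:
    """All n units run from time 0; the subsystem fails when all have failed."""
    return 1.0 - (1.0 - math.exp(-rate * t)) ** n


def cold_standby_reliability(rate: float, t: float, n: int) -> float:
    """One unit runs at a time; spares start only when the running unit fails.

    The subsystem lifetime is then Erlang(n, rate), so the reliability is
    the probability of fewer than n failures in a Poisson process by time t.
    """
    x = rate * t
    return math.exp(-x) * sum(x ** i / math.factorial(i) for i in range(n))
```

Under these idealized assumptions cold standby never does worse than active redundancy for the same n, which is one reason the choice of strategy is worth treating as a decision variable. Imperfect switching can reverse that ordering.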
29. Fault-tolerant digital systems development using triple modular redundancy
- Author
-
Cs. Szász and R. Şinca
- Subjects
Triple modular redundancy ,Environmental Engineering ,Computer science ,business.industry ,Materials Science (miscellaneous) ,020208 electrical & electronic engineering ,05 social sciences ,General Engineering ,050301 education ,Fault tolerance ,02 engineering and technology ,Management Science and Operations Research ,Control system ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Systems design ,Digital control ,Dual modular redundancy ,business ,0503 education ,Control bus ,Information Systems - Abstract
The paper presents a fault-tolerant digital system design and development strategy for the implementation of high-reliability hardware architectures. Starting from the general consideration that digital hardware systems play a key role in the implementation of a wide range of control systems, a triple modular redundancy (TMR) solution is proposed for development. To this end, the well-known 1-bit majority voter configuration has been extended and generalized to the full control bus of a digital control system. Computer simulations show that the proposed hardware solution fulfills all the theoretical expectations and can be used for experimental tests and implementation. The presented design solution and conclusions are well suited to generalization for a wide range of fault-tolerant digital systems development, ranging from reliable and safe servo control applications to high-reliability parallel and distributed computing hardware architectures.
- Published
- 2017
- Full Text
- View/download PDF
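The idea in result 29 of extending the 1-bit majority voter to a whole bus can be sketched in software with bitwise operations, since the 1-bit voter function (a AND b) OR (b AND c) OR (a AND c) applies independently to every bit position. The function name and the `width` parameter are hypothetical; the paper's hardware design is not reproduced here.

```python
def majority_vote_bus(a: int, b: int, c: int, width: int = 8) -> int:
    """Bitwise 2-out-of-3 majority over three copies of a bus word.

    Each output bit takes the value held by at least two of the three
    inputs, so an error confined to one copy is masked.
    """
    mask = (1 << width) - 1  # keep only the bits that exist on the bus
    return ((a & b) | (b & c) | (a & c)) & mask
```

For example, `majority_vote_bus(0b1010, 0b1010, 0b0101)` yields `0b1010`, because the third copy is outvoted on every bit.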
30. A Long Duration Transient Resilient Pipeline Scheme
- Author
-
Erol Koser and Walter Stechele
- Subjects
010302 applied physics ,Standard cell ,Combinational logic ,Engineering ,business.industry ,020208 electrical & electronic engineering ,Transistor ,02 engineering and technology ,01 natural sciences ,Electronic, Optical and Magnetic Materials ,law.invention ,Viterbi decoder ,law ,Embedded system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Technology scaling ,Redundancy (engineering) ,Electrical and Electronic Engineering ,Safety, Risk, Reliability and Quality ,Dual modular redundancy ,business ,Short duration ,Computer hardware - Abstract
Single event transients (SETs) in combinational logic remain an important topic in the reliability domain. SETs were traditionally relatively short in comparison to the clock period. The majority of the countermeasures utilizes this property. However, advances in technology scaling will reverse the ratio for complementary metal-oxide semiconductor devices. Investigations show that SETs may last up to multiple clock cycles in the future. So-called long duration transients (LDTs) corrupt almost all available countermeasures. This paper presents a new methodology to tackle LDTs. Dual modular redundancy (DMR) is used to detect any corruption of the application logic. A new micro-rollback scheme expands the DMR architecture with fault correction capabilities. The concept is also capable of handling single event upsets and timing violations. The correction penalty is two clock cycles. The approach was implemented and verified in a Viterbi decoder architecture. The scheme utilizes a newly designed History Cell. The History Cell introduces an area overhead of 97% and a power overhead of 110%, compared to a standard cell DFF.
- Published
- 2017
- Full Text
- View/download PDF
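The combination of DMR detection with rollback described in result 30 can be illustrated by a software analogy. Two copies compute each step from the last agreed state; on a mismatch both results are discarded and the step is repeated from that saved state. This is only a behavioral sketch: the paper's History Cell and its two-cycle penalty are circuit-level details not modeled here, and `run_dmr_with_rollback`, `step`, and `corrupt` are hypothetical names.

```python
def run_dmr_with_rollback(step, initial_state, inputs, corrupt=None, max_retries=3):
    """Run step(state, x) in duplicate and retry on disagreement.

    corrupt(index, attempt, value) is an optional fault injector applied
    to copy A only, so that the comparison has something to detect.
    Returns (final_state, number_of_rollbacks).
    """
    state = initial_state
    rollbacks = 0
    for index, x in enumerate(inputs):
        history = state  # last state on which both copies agreed
        for attempt in range(max_retries + 1):
            copy_a = step(history, x)
            copy_b = step(history, x)
            if corrupt is not None:
                copy_a = corrupt(index, attempt, copy_a)
            if copy_a == copy_b:
                state = copy_a  # commit only when the copies match
                break
            rollbacks += 1  # mismatch: drop both results, restart from history
        else:
            raise RuntimeError(f"persistent mismatch at input {index}")
    return state, rollbacks
```

A transient fault costs one repeated step, while a fault that persists across every retry is reported rather than silently committed.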
31. Reliability models for a nonrepairable system with heterogeneous components having a phase-type time-to-failure distribution
- Author
-
Heungseob Kim and Pansoo Kim
- Subjects
Flexibility (engineering) ,0209 industrial biotechnology ,Engineering ,021103 operations research ,Stochastic modelling ,business.industry ,0211 other engineering and technologies ,02 engineering and technology ,Industrial and Manufacturing Engineering ,Reliability engineering ,Continuous-time Markov chain ,020901 industrial engineering & automation ,Benchmark (computing) ,Redundancy (engineering) ,Phase-type distribution ,Safety, Risk, Reliability and Quality ,Dual modular redundancy ,business ,Reliability (statistics) - Abstract
This research paper presents practical stochastic models for designing and analyzing the time-dependent reliability of nonrepairable systems. The models are formulated for nonrepairable systems with heterogeneous components having phase-type time-to-failure distributions by a structured continuous time Markov chain (CTMC). The versatility of the phase-type distributions enhances the flexibility and practicality of the systems. By virtue of these benefits, studies in reliability engineering can be more advanced than the previous studies. This study attempts to solve a redundancy allocation problem (RAP) by using these new models. The implications of mixing components, redundancy levels, and redundancy strategies are simultaneously considered to maximize the reliability of a system. An imperfect switching case in a standby redundant system is also considered. Furthermore, the experimental results for a well-known RAP benchmark problem are presented to demonstrate the approximating error of the previous reliability function for a standby redundant system and the usefulness of the current research.
- Published
- 2017
- Full Text
- View/download PDF
32. Performance Analysis of Transient Fault-Injection and Fault-Tolerant System for Digital Circuits on FPGA
- Author
-
Dinesha P and Sharath Kumar Y N
- Subjects
Digital electronics ,Triple modular redundancy ,General Computer Science ,business.industry ,Computer science ,020209 energy ,020208 electrical & electronic engineering ,Fault tolerance ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,Fault injection ,Fault (power engineering) ,Fault coverage ,0202 electrical engineering, electronic engineering, information engineering ,Dual modular redundancy ,business ,Field-programmable gate array ,Computer hardware ,Hardware_LOGICDESIGN - Abstract
A Fault-Tolerant System is necessary to improve the reliability of digital circuits in the presence of injected faults, and it also improves system performance through better Fault Coverage. In this work, an efficient Transient Fault-Injection System (FIS) and Fault-Tolerant System (FTS) are designed for digital circuits. The FIS includes Berlekamp Massey Algorithm (BMA) based LFSRs, with fault logic followed by a one-hot encoder register, which generates the faults. The FTS is designed using Triple-Modular-Redundancy (TMR) and Dual-Modular-Redundancy (DMR). The TMR module is designed using the Majority Voter Logic (MVL), and the DMR module is designed using Self-Voter Logic (SVL), for both synchronous and asynchronous digital circuits. Four different MVL approaches are designed in the TMR module. The FIS-FTS module is designed in the Xilinx-ISE 14.7 environment and implemented on an Artix-7 FPGA. The synthesis results, including chip area, gate count, delay, and power, are analyzed along with fault tolerance and coverage for the given digital circuits. The fault tolerance is analyzed using the Modelsim simulator. The FIS-FTS module achieves an average fault coverage of 99.17% for both synchronous and asynchronous circuits.
- Published
- 2020
- Full Text
- View/download PDF
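The fault-injection path outlined in result 32, an LFSR driving a one-hot encoder that selects which bit to disturb, can be approximated by the sketch below. It swaps in a fixed, widely used 16-bit Fibonacci LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1) instead of the BMA-derived LFSRs of the paper, and the function names and the modulo-based position choice are assumptions made for illustration.

```python
def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR by one step (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def one_hot_fault_mask(state: int, width: int) -> int:
    """One-hot word with a single 1 at a pseudo-random bit position."""
    return 1 << (state % width)


def inject_faults(value: int, width: int, seed: int = 0xACE1, count: int = 1) -> int:
    """Flip `count` pseudo-randomly chosen bits of a `width`-bit value."""
    if seed == 0:
        raise ValueError("an all-zero LFSR state never changes")
    state, faulty = seed, value
    for _ in range(count):
        state = lfsr16_step(state)
        faulty ^= one_hot_fault_mask(state, width)  # XOR flips the selected bit
    return faulty
```

Because the sequence is deterministic for a given seed, a campaign built this way is repeatable, which makes it easier to compare voter designs on the same fault set.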
33. D2NN
- Author
-
Yannan Liu, Yu Li, Min Li, Qiang Xu, Bo Luo, and Ye Tian
- Subjects
021110 strategic, defence & security studies ,Computer science ,Distributed computing ,0211 other engineering and technologies ,02 engineering and technology ,Fault injection ,Fault injection attack ,Robustness (computer science) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Deep neural networks ,Dual modular redundancy ,MNIST database ,Medical systems - Abstract
Deep Neural Networks (DNNs) have attracted mainstream adoption in various application domains. Their reliability and security are therefore serious concerns in those safety-critical applications such as surveillance and medical systems. In this paper, we propose a novel dual modular redundancy framework for DNNs, namely D2NN, which is able to tradeoff the system robustness with overhead in a fine-grained manner. We evaluate D2NN framework with DNN models trained on MNIST and CIFAR10 datasets under fault injection attacks, and experimental results demonstrate the efficacy of our proposed solution.
- Published
- 2019
- Full Text
- View/download PDF
34. Persistent Fault Injection in FPGA via BRAM Modification
- Author
-
Guorui Xu, Bin Shao, Xinjie Zhao, Kui Ren, Fan Zhang, Bolin Yang, and Yiran Zhang
- Subjects
050101 languages & linguistics ,business.industry ,Computer science ,05 social sciences ,Inversive ,02 engineering and technology ,Fault injection ,Fault (power engineering) ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,business ,Dual modular redundancy ,Field-programmable gate array ,Countermeasure (computer) ,Block cipher - Abstract
The feasibility of persistent fault analysis relies on special faults which can persist in all the rounds of block ciphers. This prerequisite can be positioned as a good fit into the FPGA scenario, which however has not been carefully exploited ever before. In this paper, we propose the persistent fault attack on the block cipher AES-128 implemented in FPGA where a new type of persistent fault is induced with the technique of Block RAM (BRAM) modification. The details of persistent fault injection are elaborated, especially on how the target bits of AES in BRAM can be identified and how they can be altered. Our experimental results show that: with the proposed attack, a simple statistical analysis can extract the secret key of AES-128 with S-Box implemented in BRAMs and protected by the countermeasure of inversive decryption based dual modular redundancy.
- Published
- 2019
- Full Text
- View/download PDF
35. Soft-Error Tolerance Depending on Supply Voltage by Heavy Ions on Radiation-Hardened Flip Flops in a 65 nm Bulk Process
- Author
-
Kazutoshi Kobayashi, Jun Furuta, Mitsunori Ebara, and Yuto Tsukita
- Subjects
Materials science ,Bistability ,010308 nuclear & particles physics ,business.industry ,Radiation ,FLOPS ,01 natural sciences ,Ion ,Soft error ,0103 physical sciences ,Optoelectronics ,Dual modular redundancy ,business ,Order of magnitude ,Voltage - Abstract
We evaluated the soft-error tolerance under heavy-ion irradiation of several types of flip-flops (FFs), called transmission-gate FF (TGFF), Dual Interlocked Storage Cell FF (DICEFF), Bistable Cross-coupled Dual Modular Redundancy FF (BCDMRFF), and BCDMRFF with Set and Reset (BCDMRFFSR), in a 65 nm bulk process. Radiation-hardened FFs are stronger against soft errors than a standard TGFF by two or three orders of magnitude. DICEFF has higher soft-error tolerance than BCDMRFF for low-LET heavy ions below 40 MeV·cm²/mg, while BCDMRFF is stronger against soft errors than DICEFF for high-LET ions above 40 MeV·cm²/mg. DICEFF becomes weaker as the supply voltage is lowered, while BCDMRFF has higher soft-error tolerance than DICEFF when the supply voltage is less than 1.0 V.
- Published
- 2019
- Full Text
- View/download PDF
36. A Knapsack Methodology for Hardware-based DMR Protection against Soft Errors in Superscalar Out-of-Order Processors
- Author
-
Marcelo Brandalero, Douglas Maciel Cardoso, Jose Rodrigo Azambuja, Luciano Agostini, Antonio Carlos Schneider Beck, Rafael Billig Tonetto, and Gabriel L. Nazar
- Subjects
Reduction (complexity) ,Out-of-order execution ,Computer engineering ,Knapsack problem ,Heuristic (computer science) ,Computer science ,Dual modular redundancy ,Resilience (network) ,Fault detection and isolation ,Vulnerability (computing) - Abstract
High-performance superscalar processors have been adopted to satisfy the rising demand for processing applications of ever-growing complexity. This extra complexity, added to the increasing vulnerability of transistors due to technology scaling, poses a great challenge since these effects have also been proven to affect ground-level safety-critical applications. To increase microarchitectural resilience, designers may adopt Dual Modular Redundancy (DMR), which offers full fault detection. However, given that DMR incurs high area and energy overheads, we propose a design-time methodology aiming to achieve the best tradeoff between resilience and area overhead, decreasing DMR costs and maintaining acceptable detection levels for such a complex design. This is done by adopting the Knapsack Problem (KSP) as a heuristic to identify the optimal micro-architectural structures that should be duplicated to achieve target resilience with the smallest possible area overhead. By injecting over 800k faults in 12 significant micro-architectural structures of different versions of the complex Berkeley Out-of-Order Machine (BOOM) superscalar processor modeled with RTL accuracy, we compare this optimal strategy against a greedy one, showing that 90% of vulnerability reduction may be achieved with 50.6% and 107.8% area overheads for the optimal and greedy strategies, respectively.
- Published
- 2019
- Full Text
- View/download PDF
37. A Radiation Hard Sense Circuit for Spin Transfer Torque Random Access Memory
- Author
-
Saba Mohammadi, Seyed Mohammadjavad Seyed Talebi, Masoomeh Jasemi, Michael Green, and Nader Bagherzadeh
- Subjects
Random access memory ,Hardware_MEMORYSTRUCTURES ,Computer science ,business.industry ,Reliability (computer networking) ,Transistor ,Electrical engineering ,Spin-transfer torque ,Hardware_PERFORMANCEANDRELIABILITY ,Sense (electronics) ,Radiation ,law.invention ,CMOS ,law ,business ,Dual modular redundancy - Abstract
Spin transfer torque random access memory (STT-RAM) is a fast, scalable, and non-volatile memory technology. These characteristics make STT-RAM one of the best candidates among memories that can be used for space applications. Although the STT-RAM cell itself is immune to high-energy particles, its sensing circuit might be severely affected by radiation. In this work, we first extensively analyze the effect of radiation on the STT-RAM sense circuit and then propose a radiation-hardened circuit. Using dual modular redundancy and a voting scheme, the radiation susceptibility of the sense circuit is eliminated. The sense circuit is implemented in 45 nm CMOS technology. Simulation results show that the proposed circuit is immune to radiation pulses up to 400 Krad.
- Published
- 2019
- Full Text
- View/download PDF
38. Reduced length redundancy adaptive protection for the cascaded integrator-comb interpolation filter on FPGA
- Author
-
Juan Antonio Maestro, Alfonso Sanchez-Macian, and Kyle W. Gear
- Subjects
010302 applied physics ,Triple modular redundancy ,business.industry ,Computer science ,020208 electrical & electronic engineering ,Control reconfiguration ,02 engineering and technology ,Condensed Matter Physics ,01 natural sciences ,Atomic and Molecular Physics, and Optics ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials ,Filter (video) ,Integrator ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Overhead (computing) ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,Electrical and Electronic Engineering ,Safety, Risk, Reliability and Quality ,Dual modular redundancy ,business ,Field-programmable gate array ,Computer hardware - Abstract
Cascaded Integrator-Comb filters are a popular choice of filter for use as an interpolator in FPGAs due to their efficient multiplierless design, such as within an on-board satellite communication module. Electronics in space have the problem of being highly susceptible to cosmic radiation, due to the lack of magnetosphere and therefore require some form of protection from Single Event Upsets (SEUs). General techniques exist such as Dual Modular Redundancy (DMR) and Triple Modular Redundancy (TMR), but these may not be desirable due to their large area overhead and power consumption cost. Proposed is a reduced length redundancy technique that offers variable protection for varying power and area resources that could be used with partial reconfiguration to provide dynamically adaptive SEU protection depending on the current environment.
- Published
- 2021
- Full Text
- View/download PDF
39. Soft-Error-Tolerant Dual-Modular-Redundancy Architecture with Repair and Retry Scheme for Memory-Control Circuit on FPGA
- Author
-
Tadanobu Toba, Makoto Saen, and Yusuke Kanno
- Subjects
010302 applied physics ,Scheme (programming language) ,010308 nuclear & particles physics ,Computer science ,business.industry ,Memory control ,01 natural sciences ,Electronic, Optical and Magnetic Materials ,Soft error ,Embedded system ,0103 physical sciences ,Electrical and Electronic Engineering ,Architecture ,Field-programmable gate array ,Dual modular redundancy ,business ,computer ,computer.programming_language
- Published
- 2017
- Full Text
- View/download PDF
40. Compiler-Directed Soft Error Detection and Recovery to Avoid DUE and SDC via Tail-DMR
- Author
-
Changhee Jung, Qingrui Liu, Dongyoon Lee, and Devesh Tiwari
- Subjects
010302 applied physics ,Speedup ,Computer science ,Detector ,Real-time computing ,Exception handling ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,020202 computer hardware & architecture ,Soft error ,Hardware and Architecture ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Compiler ,Latency (engineering) ,Error detection and correction ,Dual modular redundancy ,computer ,Algorithm ,Software - Abstract
This article presents Clover, a compiler-directed soft error detection and recovery scheme for lightweight soft error resilience. The compiler carefully generates soft-error-tolerant code based on idempotent processing without explicit checkpoints. During program execution, Clover relies on a small number of acoustic wave detectors deployed in the processor to identify soft errors by sensing the wave made by a particle strike. To cope with DUEs (detected unrecoverable errors) caused by the sensing latency of error detection, Clover leverages a novel selective instruction duplication technique called tail-DMR (dual modular redundancy) that provides a region-level error containment. Once a soft error is detected by either the sensors or the tail-DMR, Clover takes care of the error as in the case of exception handling. To recover from the error, Clover simply redirects program control to the beginning of the code region where the error is detected. The experimental results demonstrate that the average runtime overhead is only 26%, which is a 75% reduction compared to that of the state-of-the-art soft error resilience technique. In addition, this article evaluates an alternative technique called tail-wait, comparing it to Clover. According to the evaluation with the different processor configurations and the various error detection latencies, Clover turns out to be a superior technique, achieving 1.06 to 3.49 × speedup over the tail-wait.
- Published
- 2016
- Full Text
- View/download PDF
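The recovery idea described in the Clover abstract above (redirect control to the start of an idempotent region once an error is detected) can be illustrated with a minimal Python sketch. It is not Clover's compiler pass and does not model its acoustic sensors. The names `run_region_dmr` and `make_faulty_sum` and the injected bit flip are hypothetical, and duplication here covers the whole region rather than only its tail.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_region_dmr(region: Callable[[], T], max_retries: int = 3) -> T:
    """Run an idempotent region twice and compare (dual modular redundancy).

    On a mismatch, execution restarts from the beginning of the region.
    This is only safe because the region is assumed to be idempotent,
    i.e. it never overwrites its own inputs.
    """
    for _ in range(max_retries + 1):
        first = region()
        second = region()
        if first == second:
            return first
    raise RuntimeError("persistent mismatch: error does not look transient")


def make_faulty_sum(values, fault_on_call):
    """Hypothetical test helper: a region that sums `values`, with one
    chosen call corrupted to imitate a soft error."""
    calls = {"n": 0}

    def region():
        calls["n"] += 1
        total = sum(values)
        if calls["n"] == fault_on_call:
            total ^= 1 << 4  # simulated single-bit flip in the result
        return total

    return region, calls
```

With a fault injected on the first call, the two copies disagree once, the region is re-executed, and the second pair agrees, so the caller sees the correct sum at the cost of two extra executions.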
41. Redundancy allocation problem for k-out-of-n systems with a choice of redundancy strategies
- Author
-
Ali Zeinal Hamadani, Mahsa Aghaei, and Mostafa Abouei Ardakan
- Subjects
Triple modular redundancy ,0209 industrial biotechnology ,Reliability optimization ,021103 operations research ,Computer science ,0211 other engineering and technologies ,02 engineering and technology ,Common method ,Industrial and Manufacturing Engineering ,020901 industrial engineering & automation ,Decision variables ,Choice of redundancy strategies ,ddc:650 ,Redundancy (engineering) ,Dual modular redundancy ,k-out-of-n system ,Integer programming ,Algorithm ,Redundancy allocation problem - Abstract
Adding redundant components is a common way to increase the reliability of a system, and deciding how to allocate them is known as the redundancy allocation problem (RAP). Some RAP studies have focused on k-out-of-n systems. However, all of these studies assumed a predetermined active or standby strategy for each subsystem. In this paper, for the first time, we propose a k-out-of-n system with a choice of redundancy strategies: a k-out-of-n series-parallel system is considered in which the redundancy strategy can be chosen for each subsystem. In other words, in the proposed model the redundancy strategy is an additional decision variable, and an exact method based on integer programming is used to obtain the optimal solution. As the optimization of RAP belongs to the NP-hard class of problems, a modified version of the genetic algorithm (GA) is also developed. The exact method and the proposed GA are applied to a well-known test problem, and the results demonstrate the efficiency of the new approach compared with previous studies.
- Published
- 2016
- Full Text
- View/download PDF
42. Evaluation and optimization of the mixed redundancy strategy in cloud-based systems
- Author
-
Chun Tan, Xueliang Zhao, Pan He, Zhihao Zheng, and Yue Yuan
- Subjects
020203 distributed computing ,Mathematical optimization ,Markov chain ,Computer Networks and Communications ,Computer science ,business.industry ,Real-time computing ,Active redundancy ,Markov process ,Cloud computing ,02 engineering and technology ,symbols.namesake ,Strategy ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,symbols ,Electrical and Electronic Engineering ,Dual modular redundancy ,Greedy algorithm ,business - Abstract
Mixed redundancy strategies are generally used in cloud-based systems, with node switch mechanisms that differ from those of traditional fault-tolerant strategies. Existing studies often concentrate on optimizing a single strategy in a cloud computing environment and ignore the impact of mixed redundancy strategies. Therefore, a model is proposed to evaluate and optimize the reliability and performance of cloud-based degraded systems subject to a mixed active and cold standby redundancy strategy. In this strategy, node switching is triggered by a continual monitoring and detection mechanism when active nodes fail. To evaluate the transient availability and the expected job completion rate of systems with this kind of strategy, a continuous-time Markov chain model is built on the state transition process and a numerical method is used to solve the model. To choose the optimal redundancy for the mixed strategy under system constraints, a greedy search algorithm is proposed after sensitivity analysis. Illustrative examples are presented to explain the process of calculating the transient probability of each system state and, in turn, the availability and performance of the whole system. It is shown that a near-optimal redundancy solution can be obtained using the optimization method. A comparison with the optimization of the traditional mixed redundancy strategy shows that system behavior differs between the two kinds of mixed strategy and that less redundancy is assigned for the new type of mixed strategy under the same system constraint.
- Published
- 2016
- Full Text
- View/download PDF
43. A redundancy strategy for minimizing cost in systems with non-disjoint subsystems under reliability constraint
- Author
-
Debasis Bhattacharya and Soma Roychowdhury
- Subjects
Triple modular redundancy ,021110 strategic, defence & security studies ,Mathematical optimization ,021103 operations research ,Computational complexity theory ,Total cost ,Strategy and Management ,0211 other engineering and technologies ,Active redundancy ,02 engineering and technology ,Redundancy (engineering) ,Systems design ,Safety, Risk, Reliability and Quality ,Dual modular redundancy ,Time complexity ,Mathematics - Abstract
The present paper solves a redundancy allocation problem under reliability constraint. To improve the system reliability, use of redundancy, i.e., use of additional components above the minimum number of components required for the system to operate, is a common practice. But it increases the total cost of the system as well. Thus the problem of allocating redundancy to coherent systems with competing choices of system-components needs to be optimally resolved so that the cost of adding redundancy is minimized. In this paper the problem of redundancy allocation is solved by minimizing the total cost subject to meeting a pre-assigned reliability target. The decision variables here are the number of redundancies. The methodology developed here can be applied to any coherent system, simple or complex. The computational complexity increases with the increase in complexity of the system design. The methodology of solving the redundancy allocation problem developed here yields a deterministic optimal solution, which is a polynomial time solution of an established NP-hard problem. Numerical examples have been included to illustrate the method developed here. The sensitivity of the optimal solution, augmented system reliability and related cost of using redundancy has been studied with respect to the specified reliability targets. No assumption about the form of the component life distribution has been made in the study.
- Published
- 2016
- Full Text
- View/download PDF
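The problem statement in the abstract above (minimize total cost subject to a reliability target, with the number of redundancies as decision variables) can be made concrete with a small Python sketch. It is an assumption-laden illustration, not the paper's method: the system is restricted to a series arrangement of parallel subsystems with independent components, the search is plain exhaustive enumeration rather than the polynomial-time procedure the authors describe, and `system_reliability`, `cheapest_allocation`, and `max_per_stage` are hypothetical names.

```python
from itertools import product
from math import prod


def system_reliability(counts, comp_rel):
    """Reliability of a series system whose stage i holds counts[i]
    independent parallel components, each with reliability comp_rel[i]."""
    return prod(1 - (1 - r) ** n for n, r in zip(counts, comp_rel))


def cheapest_allocation(comp_rel, comp_cost, target, max_per_stage=5):
    """Enumerate component counts per stage and return the cheapest
    (cost, counts) pair that meets the reliability target, or None if
    no allocation within max_per_stage reaches it."""
    best = None
    for counts in product(range(1, max_per_stage + 1), repeat=len(comp_rel)):
        if system_reliability(counts, comp_rel) < target:
            continue
        cost = sum(n * c for n, c in zip(counts, comp_cost))
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best
```

For two stages with component reliabilities 0.9 and 0.8, unit costs 2 and 1, and a target of 0.95, the search settles on two components per stage. Enumeration grows exponentially with the number of stages, so it is only practical for toy instances like this one.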
44. Mathematical Modeling and Examination of the Effects of Structural Redundancy in a Class of Computer-Based Fault Tolerant Systems
- Author
-
Mariya Hristova
- Subjects
Triple modular redundancy ,Mean time between failures ,General Computer Science ,Mathematical model ,010308 nuclear & particles physics ,Computer science ,Real-time computing ,Computer based ,Active redundancy ,Fault tolerance ,02 engineering and technology ,01 natural sciences ,Reliability engineering ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,020201 artificial intelligence & image processing ,Dual modular redundancy - Abstract
The present article models and examines k˅n systems, in particular Triple modular redundancy (2˅3) and 3˅5. The aim of the study is to derive mathematical models, which are used for determining the impact of structural redundancy (the number of channels n and the threshold of the quorum function k) on the reliability of the system. The probability of failure-free operation p and the Mean Time Between Failures (MTBF) are used as reliability indicators.
- Published
- 2016
- Full Text
- View/download PDF
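The two reliability indicators named in the abstract above can be computed for small cases with the standard k-out-of-n formulas. The sketch below reads "2˅3" and "3˅5" as 2-out-of-3 and 3-out-of-5 voting, which is an interpretation of the abstract's notation. It also assumes independent, identical channels, a perfect voter, no repair, and exponentially distributed channel lifetimes for the MTBF. None of these assumptions is confirmed by the abstract, and the function names are hypothetical.

```python
from math import comb


def k_of_n_reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent channels work,
    where each channel works with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def k_of_n_mtbf(k: int, n: int, failure_rate: float) -> float:
    """Mean time to system failure without repair, for exponential
    channel lifetimes with the given constant failure rate."""
    return sum(1 / i for i in range(k, n + 1)) / failure_rate
```

For p = 0.9, 2-out-of-3 voting gives 3p² − 2p³ = 0.972. Under the same assumptions its MTBF is 5/(6λ), which is shorter than the 1/λ of a single channel: majority voting improves reliability over short missions but not mean lifetime.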
45. Integrating physical level design and high level synthesis for simultaneous multi-cycle transient and multiple transient fault resiliency of application specific datapath processors
- Author
-
Anirban Sengupta and Deepak Kachave
- Subjects
010302 applied physics ,Digital electronics ,Engineering ,business.industry ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,Condensed Matter Physics ,Fault (power engineering) ,01 natural sciences ,Atomic and Molecular Physics, and Optics ,020202 computer hardware & architecture ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials ,Abstraction layer ,Embedded system ,High-level synthesis ,0103 physical sciences ,Datapath ,0202 electrical engineering, electronic engineering, information engineering ,Transient (computer programming) ,Electrical and Electronic Engineering ,Physical design ,Safety, Risk, Reliability and Quality ,business ,Dual modular redundancy - Abstract
Radiation-induced faults in digital systems have started to attract major attention in recent years due to growing reliability concerns for future technologies. For future technologies, multiple transient faults (MTF) originating from a single radiation hit are expected to occur more frequently. Further, due to continuous massive scaling in device geometry, a particle with a moderate linear energy transfer (LET) value is expected to affect more than one module/device when it strikes. Additionally, the incessant escalation in operating speed as technology evolves has increased the likelihood of multi-cycle transient (MCT) faults in digital circuits. This calls for novel solutions that concurrently tackle multi-cycle transient and multiple transient fault resiliency at a higher design abstraction level, such as the behavioral level. This paper proposes a novel approach for generating designs that are simultaneously resilient to multi-cycle transient and multiple transient faults during high level synthesis (HLS) of application specific datapath processors, using the framework of dual modular redundancy. Results of the proposed approach on benchmarks indicate that it generates low-cost MCT–MTF resilient designs during HLS within acceptable runtime.
- Published
- 2016
- Full Text
- View/download PDF
46. A distributed lightweight Redundancy aware Topology Control Protocol for wireless sensor networks
- Author
-
Nadjib Badache, Manel Chenait, and Bahia Zebbane
- Subjects
020203 distributed computing ,Computer Networks and Communications ,Topology control ,business.industry ,Computer science ,Distributed computing ,020206 networking & telecommunications ,02 engineering and technology ,Energy conservation ,Key distribution in wireless sensor networks ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Electrical and Electronic Engineering ,Dual modular redundancy ,business ,Wireless sensor network ,Information Systems ,Computer network ,Efficient energy use - Abstract
A wireless sensor network (WSN) consists of a large number of randomly deployed sensor nodes, and in many cases it is impossible to replace sensors when a node fails. Applications therefore tend to deploy more nodes than necessary to cope with possible node failures and to increase the network lifetime, which creates sensing and communication redundancy. However, sensors in the same region may collect and forward the same information, which wastes energy. In this paper, we propose a distributed Lightweight Redundancy aware Topology Control Protocol (LRTCP) for wireless sensor networks. It exploits the sensor redundancy in the same region by dividing the network into groups so that a connected backbone can be maintained by keeping a minimum of working nodes and turning off the redundant ones. LRTCP identifies nodes that are equivalent in terms of communication based on their redundancy degrees with respect to some eligibility rules. Simulation results indicate that, compared with existing distributed topology control algorithms, LRTCP improves network capacity and energy efficiency.
- Published
- 2016
- Full Text
- View/download PDF
47. Modular design of parallel robots with static redundancy
- Author
-
Fengfeng Xi and Amin Moosavian
- Subjects
Self-reconfiguring modular robot ,020301 aerospace & aeronautics ,0209 industrial biotechnology ,Engineering ,business.industry ,Mechanical Engineering ,Parallel manipulator ,Bioengineering ,Fault tolerance ,Control engineering ,02 engineering and technology ,Modular design ,Computer Science Applications ,Morphing ,020901 industrial engineering & automation ,0203 mechanical engineering ,Mechanics of Materials ,Redundancy (engineering) ,Robot ,business ,Dual modular redundancy - Abstract
Modularity in design for most mechanical systems can lead to higher reliability and reduction in cost for the design and build of the system. In this paper, the modular design of a new class of reconfigurable parallel robots is discussed. These robots can attain static redundancy without actuation redundancy and are ideal for applications that require fault tolerance and enhanced stiffness but may not necessarily require redundant actuation. Such applications can be found in the area of wing morphing for performance improvement. It is shown that a universal modular limb connectivity type exists that can be adopted for modular design of statically redundant parallel robots, regardless of the degree of static redundancy of the system. Additionally, the modular design approach is demonstrated through a case study for a morphing wing application.
- Published
- 2016
- Full Text
- View/download PDF
48. Cross-Layer Dual Modular Redundancy Hardened Scheme of Flip-Flop Design Based on Sense-Amplifier
- Author
-
Haochen Qi, Su Zi'an, Tianming Ni, Hui Xu, Lu Yingchun, Manzi Eric, Zhengfeng Huang, and Qi Xu
- Subjects
Scheme (programming language) ,Sense amplifier ,Computer science ,020208 electrical & electronic engineering ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,General Medicine ,020202 computer hardware & architecture ,law.invention ,Power (physics) ,Hardware and Architecture ,law ,Single event upset ,Logic gate ,0202 electrical engineering, electronic engineering, information engineering ,Electronic engineering ,Electrical and Electronic Engineering ,Differential (infinitesimal) ,Dual modular redundancy ,computer ,Flip-flop ,computer.programming_language - Abstract
As the demand for low-power and high-speed logic circuits increases, the design of differential flip-flops based on sense-amplifier (SAFF), which have excellent power and speed characteristics, has become more and more popular. Conventional SAFF (Con SAFF) and improved SAFF designs focus more on the improvement of speed and power consumption, but ignore their Single-Event-Upset (SEU) sensitivity. In fact, SAFF is more susceptible to particle impacts due to the small voltage swing required for differential input in the master stage. Based on the SEU vulnerability of SAFF, this paper proposes a novel scheme, namely cross-layer Dual Modular Redundancy (DMR), to improve the robustness of SAFF. That is, unit-level DMR technology is performed in the master stage, while transistor-level stacking technology is used in the slave stage. This scheme can be applied to some current typical SAFF designs, such as Con SAFF, Strollo SAFF, Ahmadi SAFF, Jeong SAFF, etc. Detailed HSPICE simulation results demonstrate that hardened SAFF designs can not only fully tolerate the Single Node Upset of sensitive nodes, but also partially tolerate the Double Node Upset caused by charge sharing. Besides, compared with the conventional DMR hardened scheme, the proposed cross-layer DMR hardened scheme not only has the same fault-tolerant characteristics, but also greatly reduces the delay, area and power consumption.
- Published
- 2020
- Full Text
- View/download PDF
49. Radiation-Induced Soft Errors
- Author
-
Masanori Hashimoto, Kazutoshi Wakabayashi, Takao Onoye, Jun Furuta, Eishi H. Ibe, Kazutoshi Kobayashi, Hiroshi Kawaguchi, Masahiko Yoshimoto, Yukio Mitsuyama, Makoto Sugihara, Shusuke Yoshimoto, Hiroyuki Ochi, Hidetoshi Onodera, and Hiroyuki Kanbara
- Subjects
010302 applied physics ,Memory hierarchy ,010308 nuclear & particles physics ,business.industry ,Computer science ,CPU cache ,Fault tolerance ,Hardware_PERFORMANCEANDRELIABILITY ,01 natural sciences ,Soft error ,Robustness (computer science) ,Embedded system ,0103 physical sciences ,Cache ,Static random-access memory ,business ,Dual modular redundancy - Abstract
We will begin with a quick but thorough look at the effects of faults, errors and failures, caused by terrestrial neutrons originating from cosmic rays, on terrestrial electronic systems in a variety of industries. Mitigation measures, taken at various levels of the design hierarchy from the physical to the system level against neutron-induced adverse effects, are then introduced. Challenges for retaining robustness under future technology development are also discussed. Such challenges in mitigation approaches are featured for SRAMs (Static Random Access Memories), FFs (Flip-Flops), FPGAs (Field Programmable Gate Arrays) and computer systems, as exemplified in the following articles: (i) Layout-aware neutron-induced soft-error simulation and fault tolerant design techniques are introduced for 6T SRAMs. The PNP layout is proposed instead of the conventional NPN layout, and its robustness is demonstrated by using the Monte Carlo simulator PHITS. (ii) RHBD (Radiation-Hardened By Design) FFs hardened by using specially designed redundant techniques are extensively evaluated. The BCDMR (Bistable Cross-Coupled Dual Modular Redundancy) FF is proposed in order to avoid MCU (Multi-Cell Upset) impacts on FF reliability. Its robustness is demonstrated through a set of neutron irradiation tests. (iii) CGRA (Coarse-Grained Reconfigurable Architecture) is proposed for FPGA-chip-level tolerance. Prototype CGRA-FPGA chips are manufactured and their robustness is demonstrated under alpha particle/neutron irradiation tests. (iv) Simulation techniques for failures in a heterogeneous computer system with a memory hierarchy consisting of a register file, an L1 cache, an L2 cache and a main memory are also proposed in conjunction with masking effects of faults/errors.
- Published
- 2018
- Full Text
- View/download PDF
50. A CPU-FPGA heterogeneous platform-based monitoring system and redundant mechanisms
- Author
-
Wei-Jen Lee, Igor Brandao Machado Matsuo, and Long Zhao
- Subjects
010308 nuclear & particles physics ,business.industry ,Computer science ,Monitoring system ,Symmetric multiprocessor system ,01 natural sciences ,Gate array ,Data redundancy ,Embedded system ,0103 physical sciences ,Redundancy (engineering) ,Central processing unit ,business ,Field-programmable gate array ,Dual modular redundancy - Abstract
This paper presents a practical view of how to implement a Dual Modular Redundancy (DMR) scheme in a monitoring system based on a CPU-FPGA (Central Processing Unit – Field-Programmable Gate Array) heterogeneous platform, and also describes the monitoring system itself. FPGAs in a monitoring system can be valuable resources when it is important to have either a reprogrammable system or fast response/acquisition rates when processing large volumes of data. CPUs, on the other hand, are affordable options for most other processing tasks. A heterogeneous platform is proposed that aims to achieve a reliable yet cost-effective solution. The paper then focuses on matters such as synchronization between units, data redundancy and self-monitoring schemes. The implemented design was thoroughly tested, showing effectiveness in terms of redundancy with improved reliability.
- Published
- 2018
- Full Text
- View/download PDF