301. FDIR method using an embedded timecode in packets for SpaceWire-D: SpaceWire networks and protocols, short paper
- Author
Hiroto Namikoshi, Michiya Hayama, and Isao Odagi
- Subjects
Triple modular redundancy, Engineering, Network packet, Timecode, Robustness (computer science), Embedded system, Redundancy (engineering), Fault tolerance, Fault detection and isolation, SpaceWire
- Abstract
In satellite systems, triple modular redundancy (TMR) with three interconnected CPUs is widely used to improve tolerance to single-event upsets (SEUs) and single-event transients (SETs). Fault Detection, Isolation and Recovery (FDIR) functionality, which isolates a faulty CPU and switches to a redundant CPU automatically, is also used to improve system robustness. However, FDIR does not work correctly in the following two cases. First, an SEU or SET may cause an unnecessary link occupation on the SpaceWire network; the resulting communication failure makes the voting and fault detection mechanisms work incorrectly. Second, it is difficult for the master CPU to classify the cause of a fault that combines more than one failure mode. This paper proposes a novel FDIR method to overcome both cases. The proposed method masks the output signals of the SpaceWire interface with the error signal output by the voter, which enables the system to reset the link and report faults automatically. Furthermore, the CPUs exchange a signal obtained by applying an exclusive-OR (XOR) operation to the calculation results and a Timecode, which improves the granularity of fault classification. Finally, this paper evaluates by computer simulation the recovery time of the system under a double fault that includes link occupation. The simulation results show that the proposed method recovers the system as fast as a method that uses only a timeout mechanism.
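The abstract does not give the encoding or the classification rules, so the following Python sketch is only one possible reading of the XOR idea, not the authors' implementation. It assumes each CPU sends its result XORed with the current time-code, that the receiver compares the word against a locally known reference result, and that the time value is 6 bits wide as in SpaceWire time-codes. The names `embed`, `vote`, and `classify` and the fault labels are hypothetical.

```python
from collections import Counter

TIMECODE_MASK = 0x3F  # assumption: 6-bit time value, as in SpaceWire time-codes


def embed(result: int, timecode: int) -> int:
    """Word a CPU sends to its peers: result XOR time-code (hypothetical encoding)."""
    return result ^ (timecode & TIMECODE_MASK)


def vote(words):
    """Majority vote over three words.

    Returns (majority value or None, error flag). The error flag is set
    whenever the three inputs are not unanimous.
    """
    value, count = Counter(words).most_common(1)[0]
    if count >= 2:
        return value, count < 3
    return None, True


def classify(word: int, reference: int, timecode: int) -> str:
    """Classify one peer's word against a local reference result and time-code."""
    # If the sender's result was correct, this equals the sender's time-code.
    implied = word ^ reference
    if implied == (timecode & TIMECODE_MASK):
        return "ok"
    if implied <= TIMECODE_MASK:
        # Only bits inside the time-code field differ (hypothetical label).
        return "timecode_mismatch"
    return "result_mismatch"


if __name__ == "__main__":
    tc = 5
    words = [embed(0x1234, tc), embed(0x1234, tc), embed(0x1234, tc - 1)]
    majority, error = vote(words)
    print(majority == embed(0x1234, tc), error)
    print([classify(w, 0x1234, tc) for w in words])
```

In this sketch the `error` flag returned by `vote` plays the role of the voter's error signal; a real design would use it in hardware to mask the SpaceWire interface outputs. A fault confined to the low 6 bits of the result is indistinguishable from a stale time-code here, which is a limitation of this simplified reading.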
- Published
- 2016