201. Fault-tolerant message switching based on wormhole switching and backtracking
- Author
M. Sueishi, Masato Kitakami, and Hideo Ito
- Subjects
business.industry, Network packet, Computer science, Throughput, Fault tolerance, Hardware_PERFORMANCEANDRELIABILITY, Parallel computing, Message switching, chemistry.chemical_compound, chemistry, Header, Overhead (computing), Concurrent computing, business, Wormhole switching, Computer network
- Abstract
Parallel computers are now widely used for applications that require many calculations. In a NO Remote memory Access (NORA) parallel computer, many processors are connected by communication links, and calculation results are obtained through communication among the processors. The message switching method, which controls message transmission in the parallel computer, is one of the most important factors in improving its performance. Since a parallel computer includes many processors, its failure rate is very high, and many fault-tolerant switching methods have been proposed. The existing methods have problems, however, such as low communication throughput, low fault-tolerance capability, and large hardware overhead. We propose a fault-tolerant switching method that improves on wormhole switching. The proposed method inserts dummy flits, which carry no information, after the header flit, the first flit of the packet. By overwriting a dummy flit with the header flit, backtracking is implemented without hardware overhead. Computer simulations show that in a 16 by 16 2D torus, for example, the throughput of the proposed method is almost equal to that of existing methods that require large hardware overhead, provided the number of faulty nodes is less than 40.
- Published
- 2004