FogROS2-FT: Fault Tolerant Cloud Robotics

Authors :
Chen, Kaiyuan
Hari, Kush
Chung, Trinity
Wang, Michael
Tian, Nan
Juette, Christian
Ichnowski, Jeffrey
Ren, Liu
Kubiatowicz, John
Stoica, Ion
Goldberg, Ken
Publication Year :
2024

Abstract

Cloud robotics enables robots to offload complex computational tasks to cloud servers for performance and ease of management. However, cloud compute can be costly, cloud services can suffer occasional downtime, and connectivity between the robot and cloud can be prone to variations in network Quality-of-Service (QoS). We present FogROS2-FT (Fault Tolerant) to mitigate these issues by introducing a multi-cloud extension that automatically replicates independent stateless robotic services, routes requests to these replicas, and directs the first response back. With replication, robots can still benefit from cloud computations even when a cloud service provider is down or there is low QoS. Additionally, many cloud computing providers offer low-cost spot computing instances that may shut down unpredictably. Normally, these low-cost instances would be inappropriate for cloud robotics, but the fault-tolerant nature of FogROS2-FT allows them to be used reliably. We demonstrate FogROS2-FT's fault tolerance capabilities in 3 cloud-robotics scenarios in simulation (visual object detection, semantic segmentation, motion planning) and 1 physical robot experiment (scan-pick-and-place). Running on the same hardware specification, FogROS2-FT achieves motion planning with up to 2.2x cost reduction and up to a 5.53x reduction in 99th-percentile (P99) long-tail latency. FogROS2-FT reduces the P99 long-tail latency of object detection and semantic segmentation by 2.0x and 2.1x, respectively, under network slowdown and resource contention.

Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems 2024 Best Paper Finalist
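
The abstract describes replicating a stateless service, sending each request to every replica, and returning whichever response arrives first. The Python sketch below illustrates that general pattern only; it is not FogROS2-FT code and does not use its API. The names `call_replica`, `first_response`, and `demo`, the replica labels, and the simulated delays are all hypothetical, chosen for illustration.

```python
import asyncio


async def call_replica(name, delay, payload, fail=False):
    """Hypothetical stand-in for one stateless cloud service replica."""
    await asyncio.sleep(delay)  # simulated network and compute latency
    if fail:
        raise ConnectionError(f"{name} is unavailable")
    return name, payload.upper()


async def first_response(calls):
    """Run every replica call concurrently; return the first success."""
    tasks = [asyncio.ensure_future(c) for c in calls]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except Exception:
                continue  # this replica failed; wait for the next reply
        raise RuntimeError("all replicas failed")
    finally:
        for task in tasks:
            task.cancel()  # slower replicas are no longer needed


def demo():
    async def run():
        return await first_response([
            call_replica("cloud_a", 0.01, "plan", fail=True),  # provider down
            call_replica("cloud_b", 0.05, "plan"),
            call_replica("cloud_c", 0.20, "plan"),              # slow link
        ])
    return asyncio.run(run())


if __name__ == "__main__":
    print(demo())  # ('cloud_b', 'PLAN')
```

Because the simulated service is stateless, any replica's answer is acceptable, so a failed or slow replica only affects latency when it would otherwise have been the fastest responder.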

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2412.05408
Document Type :
Working Paper