SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed & Timed Transport with Recycled Entropies

Authors :
Bonato, Tommaso
Kabbani, Abdul
De Sensi, Daniele
Pan, Rong
Le, Yanfang
Raiciu, Costin
Handley, Mark
Schneider, Timo
Blach, Nils
Ghalayini, Ahmad
Alves, Daniel
Papamichael, Michael
Caulfield, Adrian
Hoefler, Torsten
Publication Year :
2024

Abstract

With the rapid growth of machine learning (ML) workloads in datacenters, existing congestion control (CC) algorithms fail to deliver the required performance at scale. ML traffic is bursty and bulk-synchronous and thus requires quick reaction and strong fairness. We show that existing CC algorithms that use delay as a main signal react too slowly and are not always fair. We design SMaRTT, a simple sender-based CC algorithm that combines delay, ECN, and optional packet trimming for fast and precise window adjustments. At the core of SMaRTT lies the novel QuickAdapt algorithm that accurately estimates the bandwidth at the receiver. We show how to combine SMaRTT with a new per-packet traffic load-balancing algorithm called REPS to effectively reroute packets around congested hotspots as well as flaky or failing links. Our evaluation shows that SMaRTT alone outperforms EQDS, Swift, BBR, and MPRDMA by up to 50% on modern datacenter networks.

Comment: Fixed typo and wrong y-axis of one plot

Details

Database :
OAIster
Publication Type :
Electronic Resource
Accession number :
edsoai.on1438542368
Document Type :
Electronic Resource