Distributed Learning and Inference Systems: A Networking Perspective

Authors :
Moussa, Hesham G.
Akhavain, Arashmid
Hosseini, S. Maryam
McCormick, Bill
Publication Year :
2025

Abstract

Machine learning models have achieved, and in some cases surpassed, human-level performance in various tasks, mainly through centralized training of static models and the use of large models stored in centralized clouds for inference. However, this centralized approach has several drawbacks, including privacy concerns, high storage demands, a single point of failure, and significant computing requirements. These challenges have driven interest in developing alternative decentralized and distributed methods for AI training and inference. Distribution introduces additional complexity, as it requires managing multiple moving parts. To address these complexities and fill a gap in the development of distributed AI systems, this work proposes a novel framework, Data and Dynamics-Aware Inference and Training Networks (DA-ITN). The different components of DA-ITN and their functions are explored, and the associated challenges and research areas are highlighted.

Comment: This paper has been submitted to IEEE Network magazine and is still under review.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2501.05323
Document Type :
Working Paper