Architecting Efficient, Large-Scale AI: An Algorithm-System Co-Design Approach
- Author
Hsia, Samuel Cheng-Yuan
- Subjects
- Computer Architecture, Generative AI, Hardware Accelerators, Hardware-Software Co-Design, Machine Learning, Recommender Systems, Computer science, Computer engineering, Electrical engineering
- Abstract
Driven by significant advancements in algorithmic techniques and the emergence of new multimodal generative applications, deep learning has entered the era of "large-scale AI". As leading models dramatically increase in size and complexity, their hardware and software requirements become significantly more demanding. If efficient solutions are not developed in a timely manner, model exploration will grind to a halt and at-scale serving will be infeasible. End-to-end co-design solutions must address three key themes: the unique technical challenges posed by the large-scale nature of these models, the distinct requirements of training versus inference, and the critical need for efficiency. This dissertation presents three case studies for navigating the complexities of large-scale AI. The first case study presents a cross-stack characterization of large-scale models, identifying performance bottlenecks and potential avenues for optimization across different system layers. The second case study explores redesigning embedding-centric models through data- and hardware-aware observations, aiming for substantial improvements from novel embedding representations. The third case study develops tools that help researchers gain better insight into the mapping of increasingly complex models onto physical infrastructure, addressing the logistical and operational challenges of deploying large-scale AI systems in data centers. Looking forward, the dissertation identifies areas for future research, including co-design strategies tailored for embedding-driven, multimodal AI models and the role of reliability versus resiliency in data center-scale training environments. Collectively, this work contributes to the foundational understanding and practical advancement of large-scale AI technology, setting a course for future innovations in the field.
- Published
- 2024