A lightweight performance proxy for deep‐learning model training on Amazon SageMaker.

Authors :: Keller Tesser, Rafael
Marques, Alvaro
Borin, Edson
Source :: Concurrency & Computation: Practice & Experience; 6/25/2024, Vol. 36 Issue 14, p1-22, 22p
Publication Year :: 2024
Abstract :: Cloud computing has become popular for training deep‐learning (DL) models, avoiding the costs of acquiring and maintaining on‐premise systems. SageMaker is a cloud service that automates the execution of DL workloads. Its features include automatic hyperparameter optimization and use of spot instances. Nonetheless, it does not assist in selecting the right instance type for a workload. In public clouds, rent price depends on the configuration of the chosen instance type. Advanced and faster instances are typically more expensive, but not always the best choice. To select the optimal instance type, users must compare the workload's relative performance (and hence cost) on several candidates. Building on the execution profiles of multiple DL applications, we model the performance and cost of training DL applications on SageMaker and propose a lightweight technique to estimate these at low temporal and monetary cost. This method is a performance proxy that can be used to replace more expensive performance measurement procedures. So, it could speed up any technique that relies on such measurements. We show how it can help cloud customers seeking suitable instance types to train DL models, and that it can accurately predict the performance of different instance types when training these models on SageMaker. [ABSTRACT FROM AUTHOR]

Subjects :: ESTIMATION theory
CLOUD computing
PRICES
DEEP learning

Details

Language :: English
ISSN :: 1532-0626
Volume :: 36
Issue :: 14
Database :: Complementary Index
Journal :: Concurrency & Computation: Practice & Experience
Publication Type :: Academic Journal
Accession number :: 177418729
Full Text :: https://doi.org/10.1002/cpe.8104

