1. Scaling-laws for Large Time-series Models
- Authors
Edwards, Thomas D. P., Alvey, James, Alsing, Justin, Nguyen, Nam H., and Wandelt, Benjamin D.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Scaling laws for large language models (LLMs) have provided useful guidance on how to train ever larger models for predictable performance gains. Time series forecasting shares a similar sequential structure to language and is amenable to large-scale transformer architectures. Here we show that foundational decoder-only time series transformer models exhibit analogous scaling behavior to LLMs, while architectural details (aspect ratio and number of heads) have a minimal effect over broad ranges. We assemble a large corpus of heterogeneous time series data on which to train, and establish, for the first time, power-law scaling relations with respect to parameter count, dataset size, and training compute, spanning five orders of magnitude (a minimal power-law fit is sketched after this entry).
- Comment
8 pages, 3 figures
- Published
2024
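
The abstract reports power-law scaling relations between loss and parameter count, dataset size, and training compute. As a rough illustration only, and not the paper's code, the Python sketch below fits a generic single-variable power law L(N) = a * N^(-alpha) to hypothetical (parameter count, loss) pairs in log-log space; the data values and the one-variable form are assumptions for demonstration.

```python
import numpy as np

# Hypothetical (parameter count, validation loss) pairs spanning four
# orders of magnitude -- illustrative values, not the paper's data.
params = np.array([1e5, 1e6, 1e7, 1e8, 1e9])
loss = np.array([2.10, 1.55, 1.18, 0.92, 0.74])

# A power law L(N) = a * N**(-alpha) is linear in log-log space:
# log L = log a - alpha * log N, so a least-squares line fit recovers
# both the exponent alpha and the prefactor a.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
alpha, a = -slope, np.exp(intercept)

print(f"alpha ~ {alpha:.3f}, a ~ {a:.2f}")

# Extrapolation to a larger model is how such fits are typically used.
print(f"predicted loss at N=1e10: {a * 1e10 ** (-alpha):.3f}")
```

In the LLM scaling-law literature, analogous fits are made jointly over parameters, data, and compute; the single-variable fit above is only the simplest instance of the relation the abstract describes.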