
A General-Purpose Transferable Predictor for Neural Architecture Search

Authors:
Han, Fred X.
Mills, Keith G.
Chudak, Fabian
Riahi, Parsa
Salameh, Mohammad
Zhang, Jialin
Lu, Wei
Jui, Shangling
Niu, Di
Publication Year:
2023

Abstract

Understanding and modelling the performance of neural architectures is key to Neural Architecture Search (NAS). Performance predictors have seen widespread use in low-cost NAS and achieve high ranking correlations between predicted and ground truth performance in several NAS benchmarks. However, existing predictors are often designed based on network encodings specific to a predefined search space and are therefore not generalizable to other search spaces or new architecture families. In this paper, we propose a general-purpose neural predictor for NAS that can transfer across search spaces, by representing any given candidate Convolutional Neural Network (CNN) with a Computation Graph (CG) that consists of primitive operators. We further combine our CG network representation with Contrastive Learning (CL) and propose a graph representation learning procedure that leverages the structural information of unlabeled architectures from multiple families to train CG embeddings for our performance predictor. Experimental results on NAS-Bench-101, 201 and 301 demonstrate the efficacy of our scheme as we achieve strong positive Spearman Rank Correlation Coefficient (SRCC) on every search space, outperforming several Zero-Cost Proxies, including Synflow and Jacov, which are also generalizable predictors across search spaces. Moreover, when using our proposed general-purpose predictor in an evolutionary neural architecture search algorithm, we can find high-performance architectures on NAS-Bench-101 and find a MobileNetV3 architecture that attains 79.2% top-1 accuracy on ImageNet.

Comment: Accepted to SDM2023; version includes supplementary material; 12 Pages, 3 Figures, 6 Tables
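The abstract evaluates predictors by the Spearman Rank Correlation Coefficient (SRCC) between predicted and ground-truth architecture performance. The following is a minimal sketch of how that metric is computed; it is not the paper's code, and the predictor scores and accuracies below are illustrative values only.

```python
# Spearman Rank Correlation Coefficient (SRCC): rank both score lists,
# then apply the classic formula rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)).

def ranks(xs):
    # Rank each value (1 = smallest); assumes no ties for simplicity.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def srcc(pred, truth):
    rp, rt = ranks(pred), ranks(truth)
    n = len(pred)
    d2 = sum((a - b) ** 2 for a, b in zip(rp, rt))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical predictor scores vs. ground-truth accuracies for five
# candidate architectures (made-up numbers for illustration).
predicted = [0.62, 0.71, 0.55, 0.80, 0.68]
ground_truth = [0.901, 0.923, 0.887, 0.940, 0.915]

print(srcc(predicted, ground_truth))  # identical rankings -> 1.0
```

A predictor only needs to order architectures correctly, not match their absolute accuracies, which is why rank correlation rather than mean error is the headline metric for NAS predictors.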

Details

Database:
arXiv
Publication Type:
Report
Accession number:
edsarx.2302.10835
Document Type:
Working Paper
Full Text:
https://doi.org/10.1137/1.9781611977653.ch81