The recently introduced twin-width of a graph $G$ is the minimum integer $d$ such that $G$ has a $d$-contraction sequence, that is, a sequence of $\left| V(G) \right|-1$ iterated vertex identifications for which the overall maximum number of red edges incident to a single vertex is at most $d$, where a red edge appears between two sets of identified vertices if they are not homogeneous in $G$ (neither fully adjacent nor fully non-adjacent). We show that if a graph admits a $d$-contraction sequence, then it also has a linear-arity tree of $f(d)$-contractions, for some function $f$. Informally, if we accept a worse twin-width bound, we can choose the next contraction from a set of $\Theta(\left| V(G) \right|)$ pairwise disjoint pairs of vertices. This has two main consequences. First, it allows us to show that every bounded twin-width class is small, i.e., has at most $n!c^n$ graphs labeled by $[n]$, for some constant $c$. This unifies and extends the same result for bounded treewidth graphs [Beineke and Pippert, JCT '69], proper subclasses of permutation graphs [Marcus and Tardos, JCTA '04], and proper minor-free classes [Norine et al., JCTB '06]. It implies in turn that bounded-degree graphs, interval graphs, and unit disk graphs have unbounded twin-width. The second consequence is an $O(\log n)$-adjacency labeling scheme for bounded twin-width graphs, confirming several cases of the implicit graph conjecture. We then explore the small conjecture that, conversely, every small hereditary class has bounded twin-width. The conjecture passes many tests. Inspired by sorting networks of logarithmic depth, we show that $\log_{\Theta(\log \log d)}n$-subdivisions of $K_n$ (a small class when $d$ is constant) have twin-width at most $d$. We obtain a rather sharp converse with a surprisingly direct proof: the $\log_{d+1}n$-subdivision of $K_n$ has twin-width at least $d$. Secondly, graphs with bounded stack or queue number (also small classes) have bounded twin-width.
These sparse classes are surprisingly rich since they contain certain (small) classes of expanders. Thirdly, we show that cubic expanders obtained by iterated random 2-lifts from $K_4$ [Bilu and Linial, Combinatorica '06] also have bounded twin-width. These graphs are related to so-called separable permutations and also form a small class. We suggest a promising connection between the small conjecture and group theory. Finally, we define a robust notion of sparse twin-width. We show that for a hereditary class $\mathcal C$ of bounded twin-width the following five conditions are equivalent: (1) every graph in $\mathcal C$ has no $K_{t,t}$ subgraph for some fixed $t$, (2) every graph in $\mathcal C$ has an adjacency matrix without a $d$-by-$d$ division with a 1 entry in each of the $d^2$ cells for some fixed $d$, (3) every graph in $\mathcal C$ has at most linearly many edges, (4) the subgraph closure of $\mathcal C$ has bounded twin-width, and (5) $\mathcal C$ has bounded expansion. We discuss how sparse classes with similar behavior with respect to clique subdivisions compare to bounded sparse twin-width.

Mathematics Subject Classifications: 68R10, 05C30, 05C48

Keywords: Twin-width, small classes, expanders, clique subdivisions, sparsity
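
The definition of a $d$-contraction sequence given in the abstract can be made concrete with a small sketch. The Python function below is illustrative only and is not taken from the paper: the name `max_red_degree`, the vertex labelling `0..n-1`, and the convention that the second vertex of each pair is merged into the first are all assumptions made here. It replays a given sequence of vertex identifications and reports the largest red degree encountered, so a sequence is a $d$-contraction sequence exactly when the returned value is at most $d$.

```python
def max_red_degree(n, edges, contractions):
    """Replay a contraction sequence and return the largest red degree seen.

    n            -- number of vertices, labelled 0..n-1 (assumed convention)
    edges        -- iterable of pairs (u, v), the edges of the input graph
    contractions -- list of pairs (u, v); v is merged into u (assumed convention)
    """
    black = {v: set() for v in range(n)}  # ordinary (homogeneous) adjacencies
    red = {v: set() for v in range(n)}    # red (non-homogeneous) adjacencies
    for u, v in edges:
        black[u].add(v)
        black[v].add(u)

    worst = 0
    for u, v in contractions:
        # Every vertex adjacent (black or red) to u or to v.
        nbrs = (black[u] | black[v] | red[u] | red[v]) - {u, v}
        # The edge stays black only if it was black towards both u and v.
        new_black = (black[u] & black[v]) - {u, v}
        # Any other neighbour sees the merged vertex inhomogeneously: red.
        new_red = nbrs - new_black

        # Delete v from the graph.
        for w in black.pop(v):
            black[w].discard(v)
        for w in red.pop(v):
            red[w].discard(v)
        # Detach the old edges of u, then attach the recomputed ones.
        for w in black[u]:
            black[w].discard(u)
        for w in red[u]:
            red[w].discard(u)
        black[u], red[u] = new_black, new_red
        for w in new_black:
            black[w].add(u)
        for w in new_red:
            red[w].add(u)

        worst = max(worst, max(len(r) for r in red.values()))
    return worst


# Example: the path 0 - 1 - 2 - 3 under two different sequences.
path = [(0, 1), (1, 2), (2, 3)]
from_one_end = [(0, 1), (0, 2), (0, 3)]   # contract along the path
ends_first = [(0, 3), (0, 1), (0, 2)]     # identify the two endpoints first
```

On this path, `max_red_degree(4, path, from_one_end)` evaluates to 1 while `max_red_degree(4, path, ends_first)` evaluates to 2, which illustrates that the quality of a sequence depends on the order of identifications. The sketch only checks a given sequence; it does not search for an optimal one.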