Author: "Cagnetta, Francesco" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cagnetta, Francesco"' showing total 23 results


1. Towards a theory of how the structure of language is acquired by deep neural networks

Author: Cagnetta, Francesco and Wyart, Matthieu
Subjects: Computer Science - Computation and Language, Condensed Matter - Disordered Systems and Neural Networks, Computer Science - Machine Learning
Abstract: How much data is required to learn the structure of a language via next-token prediction? We study this question for synthetic datasets generated via a Probabilistic Context-Free Grammar (PCFG) -- a tree-like generative model that captures many of the hierarchical structures found in natural languages. We determine token-token correlations analytically in our model and show that they can be used to build a representation of the grammar's hidden variables, the longer the range the deeper the variable. In addition, a finite training set limits the resolution of correlations to an effective range, whose size grows with that of the training set. As a result, a Language Model trained with increasingly many examples can build a deeper representation of the grammar's structure, thus reaching good performance despite the high dimensionality of the problem. We conjecture that the relationship between training set size and effective range of correlations holds beyond our synthetic datasets. In particular, our conjecture predicts how the scaling law for the test loss behaviour with training set size depends on the length of the context window, which we confirm empirically in Shakespeare's plays and Wikipedia articles., Comment: NeurIPS 2024
Published: 2024

2. How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model

Author: Cagnetta, Francesco, Petrini, Leonardo, Tomasini, Umberto M., Favero, Alessandro, and Wyart, Matthieu
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, enabling them to build a hierarchy of abstract, low-dimensional data representations. However, how many training examples are required to learn such representations remains unknown. To quantitatively study this question, we introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images. The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class. In turn, each feature corresponds to a group of sub-features chosen among several equivalent ones and so on, following a hierarchy of composition rules. We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups. Moreover, the number of data required corresponds to the point where correlations between low-level features and classes become detectable. Overall, our results indicate how deep networks overcome the curse of dimensionality by building invariant representations, and provide an estimate of the number of data required to learn a hierarchical task., Comment: 9 pages, 8 figures
Published: 2023
Full Text: View/download PDF

3. Kernels, Data & Physics

Author: Cagnetta, Francesco, Oliveira, Deborah, Sabanayagam, Mahalakshmi, Tsilivis, Nikolaos, and Kempe, Julia
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches. The notes discuss the so-called NTK approach to problems in machine learning, which consists of gaining an understanding of generally unsolvable problems by finding a tractable kernel formulation. The notes are mainly focused on practical applications such as data distillation and adversarial robustness; examples of inductive bias are also discussed., Comment: These are notes from the lecture of Julia Kempe given at the summer school "Statistical Physics & Machine Learning", that took place in Les Houches School of Physics in France from 4th to 29th July 2022
Published: 2023

4. How deep convolutional neural networks lose spatial information with training

Author: Tomasini, Umberto M., Petrini, Leonardo, Cagnetta, Francesco, and Wyart, Matthieu
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the net. This loss of sensitivity correlates with performance, and surprisingly correlates with a gain of sensitivity to white noise acquired during training. These facts are unexplained, and as we demonstrate still hold when white noise is added to the images of the training set. Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by both spatial and channel pooling, (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling and (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling. The scalings are found to depend on the presence of strides in the net architecture. We find that the increased sensitivity to noise is due to the perturbing noise piling up during pooling, after being rectified by ReLU units.
Published: 2022

5. What Can Be Learnt With Wide Convolutional Neural Networks?

Author: Cagnetta, Francesco, Favero, Alessandro, and Wyart, Matthieu
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Understanding how convolutional neural networks (CNNs) can efficiently learn high-dimensional functions remains a fundamental challenge. A popular belief is that these models harness the local and hierarchical structure of natural data such as images. Yet, we lack a quantitative understanding of how such structure affects performance, e.g., the rate of decay of the generalisation error with the number of training samples. In this paper, we study infinitely-wide deep CNNs in the kernel regime. First, we show that the spectrum of the corresponding kernel inherits the hierarchical structure of the network, and we characterise its asymptotics. Then, we use this result together with generalisation bounds to prove that deep CNNs adapt to the spatial scale of the target function. In particular, we find that if the target function depends on low-dimensional subsets of adjacent input variables, then the decay of the error is controlled by the effective dimensionality of these subsets. Conversely, if the target function depends on the full set of input variables, then the error decay is controlled by the input dimension. We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN with randomly-initialised parameters. Interestingly, we find that, despite their hierarchical structure, the functions generated by infinitely-wide deep CNNs are too rich to be efficiently learnable in high dimension.
Published: 2022

6. Learning sparse features can lead to overfitting in neural networks

Author: Petrini, Leonardo, Cagnetta, Francesco, Vanden-Eijnden, Eric, and Wyart, Matthieu
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: It is widely believed that the success of deep networks lies in their ability to learn a meaningful representation of the features of the data. Yet, understanding when and how this feature learning improves performance remains a challenge: for example, it is beneficial for modern architectures trained to classify images, whereas it is detrimental for fully-connected networks trained for the same task on the same data. Here we propose an explanation for this puzzle, by showing that feature learning can perform worse than lazy training (via random feature kernel or the NTK) as the former can lead to a sparser neural representation. Although sparsity is known to be essential for learning anisotropic data, it is detrimental when the target function is constant or smooth along certain directions of input space. We illustrate this phenomenon in two settings: (i) regression of Gaussian random functions on the d-dimensional unit sphere and (ii) classification of benchmark datasets of images. For (i), we compute the scaling of the generalization error with number of training points, and show that methods that do not learn features generalize better, even when the dimension of the input space is large. For (ii), we show empirically that learning features can indeed lead to sparse and thereby less smooth representations of the image predictors. This fact is plausibly responsible for deteriorating the performance, which is known to be correlated with smoothness along diffeomorphisms.
Published: 2022

7. Locality defeats the curse of dimensionality in convolutional teacher-student scenarios

Author: Favero, Alessandro, Cagnetta, Francesco, and Wyart, Matthieu
Subjects: Statistics - Machine Learning, Condensed Matter - Disordered Systems and Neural Networks, Computer Science - Machine Learning
Abstract: Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge. We study this problem within a teacher-student framework for kernel regression, using `convolutional' kernels inspired by the neural tangent kernel of simple convolutional architectures of given filter size. Using heuristic methods from physics, we find in the ridgeless case that locality is key in determining the learning curve exponent $\beta$ (that relates the test error $\epsilon_t\sim P^{-\beta}$ to the size of the training set $P$), whereas translational invariance is not. In particular, if the filter size of the teacher $t$ is smaller than that of the student $s$, $\beta$ is a function of $s$ only and does not depend on the input dimension. We confirm our predictions on $\beta$ empirically. We conclude by proving, using a natural universality assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to similar learning curve exponents to those we obtain in the ridgeless case., Comment: 32 pages, 7 figures
Published: 2021
Full Text: View/download PDF

8. A renormalization group study of the dynamics of active membranes: universality classes and scaling laws

Author: Cagnetta, Francesco, Skultety, Viktor, Evans, Martin R., and Marenduzzo, Davide
Subjects: Condensed Matter - Statistical Mechanics, Condensed Matter - Soft Condensed Matter
Abstract: Motivated by experimental observations of patterning at the leading edge of motile eukaryotic cells, we introduce a general model for the dynamics of nearly-flat fluid membranes driven from within by an ensemble of activators. We include, in particular, a kinematic coupling between activator density and membrane slope which generically arises whenever the membrane has a non-vanishing normal speed. We unveil the phase diagram of the model by means of a perturbative field-theoretical renormalization group analysis. Due to the aforementioned kinematic coupling the natural dynamical scaling is acoustic, that is the dynamical critical exponent is 1. However, as soon as the normal velocity of the membrane is tuned to zero, the system crosses over to diffusive dynamic scaling in mean field. Distinct critical points can be reached depending on how the limit of vanishing velocity is realised: in each of them corrections to scaling due to nonlinear coupling terms must be taken into account. The detailed analysis of these critical points reveals novel scaling regimes which can be accessed with perturbative methods, together with signs of strong coupling behaviour, which establishes a promising ground for further non-perturbative calculations. Our results unify several previous studies on the dynamics of active membranes, while also identifying nontrivial scaling regimes which cannot be captured by passive theories of fluctuating interfaces and are relevant for the physics of living membranes.
Published: 2021
Full Text: View/download PDF

9. Universal properties of active membranes

Author: Cagnetta, Francesco, Skultety, Viktor, Evans, Martin R., and Marenduzzo, Davide
Subjects: Condensed Matter - Statistical Mechanics, Condensed Matter - Soft Condensed Matter
Abstract: We put forward a general field theory for membranes with embedded activators and analyse their critical properties using renormalization group techniques. Depending on the membrane-activator coupling, we find a crossover between acoustic and diffusive scaling regimes, with mean-field dynamical critical exponents z = 1 and 2 respectively. We argue that the acoustic scaling, which is exact in all spatial dimensions, is a suitable candidate for the universal description of the spatiotemporal patterns observed at the leading edge of motile cells. Furthermore, one-loop corrections to the diffusive mean-field exponents reveal universal behaviour distinct from the Kardar-Parisi-Zhang scaling of passive interfaces and signs of strong-coupling behaviour., Comment: 5 pages, 3 figures
Published: 2021
Full Text: View/download PDF

10. Work Fluctuations in the Active Ornstein-Uhlenbeck Particle model

Author: Semeraro, Massimiliano, Suma, Antonio, Petrelli, Isabella, Cagnetta, Francesco, and Gonnella, Giuseppe
Subjects: Condensed Matter - Statistical Mechanics
Abstract: We study the large deviations of the power injected by the active force for an Active Ornstein-Uhlenbeck Particle (AOUP), free or in a confining potential. For the free-particle case, we compute the rate function analytically in d-dimensions from a saddle-point expansion, and numerically in two dimensions by a) direct sampling of the active work in numerical solutions of the AOUP equations and b) Legendre-Fenchel transform of the scaled cumulant generating function obtained via a cloning algorithm. The rate function presents asymptotically linear branches on both sides and it is independent of the system's dimensionality, apart from a multiplicative factor. For the confining potential case, we focus on two-dimensional systems and obtain the rate function numerically using both methods a) and b). We find a different scenario for harmonic and anharmonic potentials: in the former case, the phenomenology of fluctuations is analogous to that of a free particle, but the rate function might be non-analytic; in the latter case the rate functions are analytic, but fluctuations are realised by entirely different means, which rely strongly on the particle-potential interaction. Finally, we check the validity of a fluctuation relation for the active work distribution. In the free-particle case, the relation is satisfied with a slope proportional to the bath temperature. The same slope is found for the harmonic potential, regardless of activity, and for an anharmonic potential with low activity. In the anharmonic case with high activity, instead, we find a different slope which is equal to an effective temperature obtained from the fluctuation-dissipation theorem., Comment: 33 pages, 12 figures
Published: 2021
Full Text: View/download PDF

11. Kinetic roughening in active interfaces

Author: Cagnetta, Francesco, Evans, Martin R., and Marenduzzo, Davide
Subjects: Condensed Matter - Statistical Mechanics, Condensed Matter - Soft Condensed Matter
Abstract: The essential features of many interfaces driven out of equilibrium are described by the same equation---the Kardar-Parisi-Zhang (KPZ) equation. How do living interfaces, such as the cell membrane, fit into this picture? In an endeavour to answer such a question, we proposed in [F. Cagnetta, M. R. Evans, D. Marenduzzo, PRL 120, 258001 (2018)] an idealised model for the membrane of a moving cell. Here we discuss how the addition of simple ingredients inspired by the dynamics of the membrane of moving cells affects common kinetic roughening theories such as the KPZ and Edwards-Wilkinson equations., Comment: 5 pages, 4 figures, FisMat 2019
Published: 2019
Full Text: View/download PDF

12. Active interfaces : a universal approach

Author: Cagnetta, Francesco, Evans, Martin, and Marenduzzo, Davide
Subjects: 530.13, statistical mechanics, lattice models, Kardar-Parisi-Zhang equation, KPZ equation, Renormalisation group techniques
Abstract: This thesis proposes and characterises a stochastic model of an active interface within the framework of statistical mechanics. Statistical methods have indeed proven successful in probing the dynamics of kinetically roughened interfaces, producing results which fill a wide, 40-year-long literature. The principle of universality, according to which large scales and long times screen a system's intimate details, provides a means to systematise such knowledge: many growing interfaces, for instance, are described by the same equation, the Kardar-Parisi-Zhang (KPZ) equation. To what extent living interfaces fit into the picture is still an open question, a question this thesis attempts to answer by drawing inspiration from the membrane of moving cells. Here, the aforementioned universality principle can be used as a road roller to pave our way into the crowded and highly dynamic environment of the cell membrane. The hope is that of smoothening all, and only, the irrelevant asperities, minor attributes whose account does yield an insight nowhere near the effort they require. The result of our crude approximation, that is the model presented in the thesis, can be thoroughly analysed with numerical and analytical methods: its main features turn out to match qualitatively those of actual membranes. In addition, the model allows for a rigorous derivation of the field equations which govern its large scales and long times properties. Scaling arguments then show that these equations include all the relevant ingredients, so as to corroborate the crude approximations made at the beginning. The model presented can thus be concluded to be a reasonable candidate for the universal description of active interfaces and reveal the signature features that can be looked for in experiments.
Published: 2020
Full Text: View/download PDF

13. Efficiency of one-dimensional active transport conditioned on motility

Author: Cagnetta, Francesco and Mallmin, Emil
Subjects: Condensed Matter - Statistical Mechanics
Abstract: By conditioning a stochastic process on the value of an observable, one obtains a new stochastic process with different properties. We apply this idea in the context of active matter, and condition interacting self-propelled particles on their individual motility. Using the effective process formalism from dynamical large deviations theory, we derive the interactions that actuate the imposed mobility against jamming interactions in two toy models---the totally asymmetric exclusion process and run-and-tumble particles, in the case of two or three particles. We provide a framework which takes into account the energy-consumption required for self-propulsion, and address the question of how energy-efficient the emergent interactions are. Upon conditioning, run-and-tumble particles develop an alignment interaction and achieve a higher gain in efficiency than TASEP particles. A point of diminishing returns in efficiency is reached beyond a certain level of conditioning. With recourse to a general formula for the change in energy efficiency upon conditioning, we conclude that the most significant gains occur when there are large fluctuations in mobility to exploit. From a detailed comparison of the emergent effective interaction in a two- versus a three-body scenario, we discover evidence of a screening effect which suggests that conditioning can produce topological rather than metric interactions., Comment: 11 pages, 9 figures
Published: 2019
Full Text: View/download PDF

14. Inviscid limit of the active interface equations

Author: Cagnetta, Francesco and Evans, Martin R.
Subjects: Condensed Matter - Statistical Mechanics
Abstract: We present a detailed solution of the active interface equations in the inviscid limit. The active interface equations were previously introduced as a toy model of membrane-protein systems: they describe a stochastic interface where growth is stimulated by inclusions which themselves move on the interface. In the inviscid limit, the equations reduce to a pair of coupled conservation laws. After discussing how the inviscid limit is obtained, we turn to the corresponding Riemann problem: the solution of the set of conservation laws with discontinuous initial condition. In particular, by considering two physically meaningful initial conditions, a giant trough and a giant peak in the interface, we elucidate the generation of shock waves and rarefaction fans in the system. Then, by combining several Riemann problems, we construct an oscillating solution of the active interface with periodic boundary conditions. The existence of this oscillating state reflects the reciprocal coupling between the two conserved quantities in our system., Comment: 22 pages, 11 figures
Published: 2019
Full Text: View/download PDF

15. Statistical mechanics of a single active slider on a fluctuating interface

Author: Cagnetta, Francesco, Evans, Martin R., and Marenduzzo, Davide
Subjects: Condensed Matter - Statistical Mechanics
Abstract: We study the statistical mechanics of a single active slider on a fluctuating interface, by means of numerical simulations and theoretical arguments. The slider, which moves by definition towards the interface minima, is active as it also stimulates growth of the interface. Even though such a particle has no counterpart in thermodynamic systems, active sliders may provide a simple model for ATP-dependent membrane proteins that activate cytoskeletal growth. We find a wide range of dynamical regimes according to the ratio between the timescales associated with the slider motion and the interface relaxation. If the interface dynamics is slow, the slider behaves like a random walker in a random environment which, furthermore, is able to escape environmental troughs by making them grow. This results in different dynamic exponents for the interface and the particle: the former behaves as an Edwards-Wilkinson surface with dynamic exponent 2 whereas the latter has dynamic exponent 3/2. When the interface is fast, we get sustained ballistic motion with the particle surfing a membrane wave created by itself. However, if the interface relaxes immediately (i.e., it is infinitely fast), particle motion becomes symmetric and goes back to diffusive. Due to such a rich phenomenology, we propose the active slider as a toy model of fundamental interest in the field of active membranes and, generally, whenever the system constituent can alter the environment by spending energy., Comment: 13 pages, 19 figures
Published: 2018
Full Text: View/download PDF

16. Large fluctuations and dynamic phase transition in a system of self-propelled particles

Author: Cagnetta, Francesco, Corberi, Federico, Gonnella, Giuseppe, and Suma, Antonio
Subjects: Condensed Matter - Statistical Mechanics
Abstract: We study the statistics, in stationary conditions, of the work $W_\tau$ done by the active force in different systems of self-propelled particles in a time $\tau$. We show the existence of a critical value $W_\tau ^\dag$ such that fluctuations with $W_\tau >W_\tau ^\dag$ correspond to configurations where interaction between particles plays a minor role whereas those with $W_\tau < W_\tau ^\dag$ represent states with single particles dragged by clusters. This two-fold behavior is fully mirrored by the probability distribution $P(W_\tau)$ of the work, which does not obey the large-deviation principle for $W_\tau
Published: 2017
Full Text: View/download PDF

17. Strong anomalous diffusion of the phase of a chaotic pendulum

Author: Cagnetta, Francesco, Gonnella, Giuseppe, Mossa, Alessandro, and Ruffo, Stefano
Subjects: Nonlinear Sciences - Chaotic Dynamics, Condensed Matter - Statistical Mechanics, 70K55, 82C70, 34C28
Abstract: In this letter we consider the phase diffusion of a harmonically driven undamped pendulum and show that it is anomalous in the strong sense. The role played by the fractal properties of the phase space is highlighted, providing an illustration of the link between deterministic chaos and anomalous transport. Finally, we build a stochastic model which reproduces most properties of the original Hamiltonian system by alternating ballistic flights and random diffusion., Comment: 6 pages, 6 figures
Published: 2015
Full Text: View/download PDF

18. How deep convolutional neural networks lose spatial information with training

Author: Tomasini, Umberto M., Petrini, Leonardo, Cagnetta, Francesco, and Wyart, Matthieu
Published: 2023
Full Text: View/download PDF

19. How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model

Author: Petrini, Leonardo, Cagnetta, Francesco, Tomasini, Umberto M., Favero, Alessandro, and Wyart, Matthieu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Learning generic high-dimensional tasks is notably hard, as it requires a number of training data exponential in the dimension. Yet, deep convolutional neural networks (CNNs) have shown remarkable success in overcoming this challenge. A popular hypothesis is that learnable tasks are highly structured and that CNNs leverage this structure to build a low-dimensional representation of the data. However, little is known about how much training data they require, and how this number depends on the data structure. This paper answers this question for a simple classification task that seeks to capture relevant aspects of real data: the Random Hierarchy Model. In this model, each of the $n_c$ classes corresponds to $m$ synonymic compositions of high-level features, which are in turn composed of sub-features through an iterative process repeated $L$ times. We find that the number of training data $P^*$ required by deep CNNs to learn this task (i) grows asymptotically as $n_c m^L$, which is only polynomial in the input dimensionality; (ii) coincides with the training set size such that the representation of a trained network becomes invariant to exchanges of synonyms; (iii) corresponds to the number of data at which the correlations between low-level features and classes become detectable. Overall, our results indicate how deep CNNs can overcome the curse of dimensionality by building invariant representations, and provide an estimate of the number of data required to learn a task based on its hierarchically compositional structure.
Published: 2023
Full Text: View/download PDF

20. Kinetic roughening in active interfaces

Author: Cagnetta, Francesco, Evans, Martin R., and Marenduzzo, Davide
Subjects: Physics, QC1-999
Abstract: The essential features of many interfaces driven out of equilibrium are described by the same equation—the Kardar-Parisi-Zhang (KPZ) equation. How do living interfaces, such as the cell membrane, fit into this picture? In an endeavour to answer such a question, we proposed in [F. Cagnetta, M. R. Evans, D. Marenduzzo, PRL 120, 258001 (2018)] an idealised model for the membrane of a moving cell. Here we discuss how the addition of simple ingredients inspired by the dynamics of the membrane of moving cells affects common kinetic roughening theories such as the KPZ and Edwards-Wilkinson equations.
Published: 2020
Full Text: View/download PDF

21. Kinetic roughening in active interfaces

Author: Cagnetta, Francesco, Evans, Martin R., and Marenduzzo, Davide
Subjects: cond-mat.soft, cond-mat.stat-mech
Abstract: The essential features of many interfaces driven out of equilibrium are described by the same equation---the Kardar-Parisi-Zhang (KPZ) equation. How do living interfaces, such as the cell membrane, fit into this picture? In an endeavour to answer such a question, we proposed in [F. Cagnetta, M. R. Evans, D. Marenduzzo, PRL 120, 258001 (2018)] an idealised model for the membrane of a moving cell. Here we discuss how the addition of simple ingredients inspired by the dynamics of the membrane of moving cells affects common kinetic roughening theories such as the KPZ and Edwards-Wilkinson equations.
Published: 2020

22. Kinetic roughening in active interfaces.

Author: Puppin, E., Cagnetta, Francesco, Evans, Martin R., and Marenduzzo, Davide
Subjects: *CHEMICAL kinetics, *CELL membranes, *ERYTHROCYTES, *BIOLOGICAL interfaces, *ION channels
Abstract: The essential features of many interfaces driven out of equilibrium are described by the same equation—the Kardar-Parisi-Zhang (KPZ) equation. How do living interfaces, such as the cell membrane, fit into this picture? In an endeavour to answer such a question, we proposed in [F. Cagnetta, M. R. Evans, D. Marenduzzo, PRL 120, 258001 (2018)] an idealised model for the membrane of a moving cell. Here we discuss how the addition of simple ingredients inspired by the dynamics of the membrane of moving cells affects common kinetic roughening theories such as the KPZ and Edwards-Wilkinson equations. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

23. Locality defeats the curse of dimensionality in convolutional teacher-student scenarios

Author: Favero, Alessandro, Cagnetta, Francesco, and Wyart, Matthieu
Subjects: FOS: Computer and information sciences, Statistics and Probability, Computer Science - Machine Learning, deep learning, FOS: Physical sciences, Machine Learning (stat.ML), Statistical and Nonlinear Physics, Disordered Systems and Neural Networks (cond-mat.dis-nn), Condensed Matter - Disordered Systems and Neural Networks, Machine Learning (cs.LG), analysis of algorithms, machine learning, learning theory, Statistics - Machine Learning, Statistics, Probability and Uncertainty
Abstract: Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge. We study this problem within a teacher-student framework for kernel regression, using `convolutional' kernels inspired by the neural tangent kernel of simple convolutional architectures of given filter size. Using heuristic methods from physics, we find in the ridgeless case that locality is key in determining the learning curve exponent $\beta$ (that relates the test error $\epsilon_t\sim P^{-\beta}$ to the size of the training set $P$), whereas translational invariance is not. In particular, if the filter size of the teacher $t$ is smaller than that of the student $s$, $\beta$ is a function of $s$ only and does not depend on the input dimension. We confirm our predictions on $\beta$ empirically. We conclude by proving, using a natural universality assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to similar learning curve exponents to those we obtain in the ridgeless case., Comment: 32 pages, 7 figures
