1. Can neural networks acquire a structural bias from raw linguistic data?
- Author
Warstadt, Alex and Bowman, Samuel R.
- Subjects
inductive bias, structure dependence, BERT, learnability of grammar, poverty of the stimulus, neural network, self-supervised learning
- Abstract
We evaluate whether BERT, a widely used neural network for sentence processing, acquires an inductive bias towards forming structural generalizations through pretraining on raw data. We conduct four experiments testing its preference for structural vs. linear generalizations in different structure-dependent phenomena. We find that BERT makes a structural generalization in 3 out of 4 empirical domains (subject-auxiliary inversion, reflexive binding, and verb tense detection in embedded clauses) but makes a linear generalization when tested on NPI licensing. We argue that these results are the strongest evidence so far from artificial learners supporting the proposition that a structural bias can be acquired from raw data. If this conclusion is correct, it is tentative evidence that some linguistic universals can be acquired by learners without innate biases. However, the precise implications for human language acquisition are unclear, as humans learn language from significantly less data than BERT.
- Published
2020