Structured Penalties for Log-linear Language Models
- Source :
- EMNLP-Empirical Methods in Natural Language Processing, Oct 2013, Seattle, United States. pp. 233-243
- Publication Year :
- 2013
- Publisher :
- HAL CCSD, 2013.
Abstract
- Language models can be formalized as log-linear regression models where the input features represent previously observed contexts up to a certain length m. The complexity of existing algorithms to learn the parameters by maximum likelihood scales linearly in nd, where n is the length of the training corpus and d is the number of observed features. We present a model that grows logarithmically in d, making it possible to efficiently leverage longer contexts. We account for the sequential structure of natural language using tree-structured penalized objectives to avoid overfitting and achieve better generalization.
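The formalization in the abstract can be illustrated with a minimal Python sketch. It assumes one indicator feature per (context suffix, next word) pair, for suffixes of length 0 to m, and a softmax over the vocabulary. The function names, the toy vocabulary, and the weight values are hypothetical. The sketch covers only scoring; it does not implement the paper's tree-structured penalties or its learning algorithm.

```python
import math
from collections import defaultdict


def suffix_features(context, m):
    """Return the suffixes of `context` of length 0..m, shortest first."""
    context = tuple(context)
    longest = min(m, len(context))
    return [context[len(context) - k:] for k in range(longest + 1)]


def next_word_probs(weights, context, vocab, m):
    """Log-linear next-word distribution.

    The score of word w is the sum of the weights of all active
    (suffix, w) features; probabilities are a softmax over `vocab`.
    """
    suffixes = suffix_features(context, m)
    scores = [sum(weights[(s, w)] for s in suffixes) for w in vocab]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {w: e / z for w, e in zip(vocab, exps)}


# Toy example with hypothetical weights (unset features default to 0.0).
vocab = ["cat", "dog", "sat"]
weights = defaultdict(float)
weights[(("the",), "cat")] = 1.0
probs = next_word_probs(weights, ["saw", "the"], vocab, m=2)
```

With m = 2 and the context "saw the", the active suffixes are the empty context, ("the",), and ("saw", "the"), so each candidate word touches at most m + 1 features. Larger m adds longer suffixes and therefore more distinct features overall, which is why the abstract is concerned with how cost depends on d.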
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.dedup.wf.001..db51b09d360b61880ee0b54d982405c8