Search

Author search for "Geiger, Atticus": 38 results

Search Results

1. Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small

2. Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations

3. Updating CLIP to Prefer Descriptions Over Captions

4. ReFT: Representation Finetuning for Language Models

5. pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

6. RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

7. A Reply to Makelov et al. (2023)'s 'Interpretability Illusion' Arguments

8. Linear Representations of Sentiment in Large Language Models

9. Rigorously Assessing Natural Language Explanations of Neurons

10. ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning

11. Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

12. Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

13. Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability

14. Causal Abstraction with Soft Interventions

15. Causal Proxy Models for Concept-Based Model Explanations

16. A Semantics for Causing, Enabling, and Preventing Verbs Using Structural Causal Models

17. CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior

18. Causal Distillation for Language Models

19. Inducing Causal Structure for Interpretable Neural Networks

20. Causal Abstractions of Neural Networks

21. Dynabench: Rethinking Benchmarking in NLP

22. DynaSent: A Dynamic Benchmark for Sentiment Analysis

23. Relational reasoning and generalization using non-symbolic neural networks

24. Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation

25. Posing Fair Generalization Tasks for Natural Language Inference

26. Stress-Testing Neural Models of Natural Language Inference with Multiply-Quantified Sentences

28. Causal Abstraction for Faithful Model Interpretation

32. Relational Reasoning and Generalization Using Nonsymbolic Neural Networks

33. Causal Distillation for Language Models

35. Dynabench: Rethinking Benchmarking in NLP
