Search

Your search for "Ankit Singh" returned 62 results in total.

Search Constraints

Author: "Ankit Singh"
Database: arXiv
Search Results

1. A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

2. A Statistical Framework for Data-dependent Retrieval-Augmented Models

3. Analysis of Plan-based Retrieval for Grounded Text Generation

4. Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond

5. Efficient Document Ranking with Learnable Late Interactions

6. Cascade-Aware Training of Language Models

7. Faster Cascades via Speculative Decoding

8. Language Model Cascades: Token-level uncertainty and beyond

9. Mechanics of Next Token Prediction with Self-Attention

10. From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

11. DistillSpec: Improving Speculative Decoding via Knowledge Distillation

12. What do larger image classifiers memorise?

13. Think before you speak: Training Language Models With Pause Tokens

14. $\mu$2mech: a Software Package Combining Microstructure Modeling and Mechanical Property Prediction

15. When Does Confidence-Based Cascade Deferral Suffice?

16. On the Role of Attention in Prompt-tuning

17. ResMem: Learn what you can and memorize the rest

18. Supervision Complexity and its Role in Knowledge Distillation

19. EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval

20. Large Language Models with Controllable Working Memory

21. The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

22. Generalization Properties of Retrieval-based Models

23. A Fourier Approach to Mixture Learning

24. Teacher Guided Training: An Efficient Framework for Knowledge Transfer

25. ELM: Embedding and Logit Margins for Long-Tail Learning

26. FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients

27. When in Doubt, Summon the Titans: Efficient Inference with Large Models

28. Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

29. Distilling Double Descent

30. On the Reproducibility of Neural Network Predictions

31. Modifying Memories in Transformer Models

32. Long-tail learning via logit adjustment

33. Adversarial robustness via robust low rank representations

34. $O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers

35. Why distillation helps: a statistical perspective

36. Doubly-stochastic mining for heterogeneous retrieval

37. Federated Learning with Only Positive Labels

38. Robust Large-Margin Learning in Hyperbolic Space

39. Reliable Distributed Clustering with Redundant Data Assignment

40. Low-Rank Bottleneck in Multi-head Attention Models

41. Achieving Multi-Port Memory Performance on Single-Port Memory with Coding Techniques

42. Are Transformers universal approximators of sequence-to-sequence functions?

43. Sampled Softmax with Random Fourier Features

44. The Generalized Lasso for Sub-gaussian Measurements with Dithered Quantization

45. Robust Gradient Descent via Moment Encoding with LDPC Codes

46. Representation Learning and Recovery in the ReLU Model

47. Lifting high-dimensional nonlinear models with Gaussian regressors

48. MDS Code Constructions with Small Sub-packetization and Near-optimal Repair Bandwidth

49. Associative Memory using Dictionary Learning and Expander Decoding

50. A Note on Secure Minimum Storage Regenerating Codes
