Search

Your search for "Zha, Sheng" returned 105 results

Search Results

1. Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning

2. DEM: Distribution Edited Model for Training with Mixed Data Distributions

3. Pre-training Differentially Private Models with Limited Public Data

4. Extreme Miscalibration and the Illusion of Adversarial Robustness

5. Zero redundancy distributed learning with differential privacy

6. On the accuracy and efficiency of group-wise clipping in differentially private optimization

7. Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

8. Coupling public and private gradient provably helps optimization

9. HYTREL: Hypergraph-enhanced Tabular Data Representation Learning

10. Large Language Models of Code Fail at Completing Code with Potential Bugs

11. Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

12. Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic

13. Differentially Private Optimization on Large Model at Small Cost

14. Differentially Private Bias-Term Fine-tuning of Foundation Models

15. Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger

16. Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

18. Meta-learning via Language Model In-context Tuning

19. Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

21. Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual

22. GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

23. Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

24. Just-in-Time Dynamic-Batching

26. Question Type Guided Attention in Visual Question Answering

29. Preparation and characterisation of wheat starch-based aerogels for procyanidin encapsulation to enhance stability.

30. Preparation of phillyrin/cyclodextrin inclusion complexes and study of their physical properties, solubility enhancement, molecular docking and antioxidant activity.

35. Python Array API Standard: Toward Array Interoperability in the Scientific Python Ecosystem

37. Differentially Private Bias-Term only Fine-tuning of Foundation Models

39. Meta-learning via Language Model In-context Tuning

41. Context, Language Modeling, and Multimodal Data in Finance

45. Question Type Guided Attention in Visual Question Answering

50. A cross-sectional study of affective, psychiatric, cognitive disorders and motor complications of Parkinson's disease.
