17 results for "Mingze Wang"
Search Results
2. How Transformers Implement Induction Heads: Approximation and Optimization Analysis.
3. Incorporate LLMs with Influential Recommender System.
4. Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training.
5. The Implicit Bias of Gradient Noise: A Symmetry Perspective.
6. RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model.
7. Improving Generalization and Convergence by Enhancing Implicit Regularization.
8. Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling.
9. Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
10. The Noise Geometry of Stochastic Gradient Descent: A Quantitative and Analytical Characterization.
11. Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks.
12. Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling.
13. Q-YOLO: Efficient Inference for Real-time Object Detection.
14. Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars.
15. Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks.
16. When does SGD favor flat minima? A quantitative characterization via linear stability.
17. Generalization Error Bounds for Deep Neural Networks Trained by SGD.
Discovery Service for Jio Institute Digital Library