Author: "Wei, Fuxuan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wei, Fuxuan"' showing total 10 results

1. Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

Author: Wang, Yixuan, Luo, Xianzhen, Wei, Fuxuan, Liu, Yijun, Zhu, Qingfu, Zhang, Xuanyu, Yang, Qing, Xu, Dongliang, and Che, Wanxiang
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to a new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning stage of the large language model. The training method simply introduces some noise at the input for the model to learn the denoising task. It significantly enhances the parallel decoding capability of the model without affecting the original task capability. In addition, we propose a tree-based retrieval-augmented Jacobi (TR-Jacobi) decoding strategy to further improve the inference speed of MSN models. Experiments in both the general and code domains have shown that MSN can improve inference speed by 2.3-2.7x without compromising model performance. The MSN model also achieves comparable acceleration ratios to the SOTA model with additional model structure on Spec-Bench.
Comment: EMNLP 2024, camera ready
Published: 2024

2. CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding

Author: Qin, Libo, Wei, Fuxuan, Chen, Qiguang, Zhou, Jingxuan, Huang, Shijue, Si, Jiasheng, Lu, Wenpeng, and Che, Wanxiang
Subjects: Computer Science - Computation and Language
Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this problem, we present the pioneering work of Cross-task Interactive Prompting (CroPrompt) for SLU, which enables the model to interactively leverage the information exchange across the correlated tasks in SLU. Additionally, we further introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection. We conduct extensive experiments on the standard SLU benchmark and the results reveal that CroPrompt consistently outperforms the existing prompting approaches. In addition, the multi-task self-consistency mechanism can effectively ease the error propagation issue, thereby enhancing the performance. We hope this work can inspire more research on cross-task prompting for SLU.
Published: 2024

3. Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages

Author: Qin, Libo, Chen, Qiguang, Wei, Fuxuan, Huang, Shijue, and Che, Wanxiang
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Chain-of-thought (CoT) is capable of eliciting models to explicitly generate reasoning paths, thus promoting reasoning accuracy and attracting increasing attention. Specifically, zero-shot CoT achieves remarkable improvements in a wide range of reasoning tasks by simply instructing the LLM with the prompt "Let's think step by step!". Despite the success of zero-shot CoT, the existing zero-shot prompting techniques remain limited to a single language, making it challenging to generalize to other languages and hindering global development. In this work, we introduce cross-lingual prompting (CLP), aiming to improve zero-shot CoT reasoning across languages. Specifically, CLP consists of two main components: (1) cross-lingual alignment prompting and (2) task-specific solver prompting. The cross-lingual alignment prompting is responsible for aligning representations across different languages, whereas the task-specific solver prompting is used to generate the final chain of thoughts and results for the reasoning task. In addition, we further introduce cross-lingual self-consistent prompting (CLSP) to ensemble different reasoning paths across languages. Our experimental evaluations on several benchmarks demonstrate that CLP and CLSP significantly outperform the existing prompting methods and achieve state-of-the-art performance. We hope this work will inspire further breakthroughs in cross-lingual CoT.
Comment: Accepted at EMNLP2023 Main Conference
Published: 2023

4. HIT-SCIR at MMNLU-22: Consistency Regularization for Multilingual Spoken Language Understanding

Author: Zheng, Bo, Li, Zhouyang, Wei, Fuxuan, Chen, Qiguang, Qin, Libo, and Che, Wanxiang
Subjects: Computer Science - Computation and Language
Abstract: Multilingual spoken language understanding (SLU) consists of two sub-tasks, namely intent detection and slot filling. To improve the performance of these two sub-tasks, we propose to use consistency regularization based on a hybrid data augmentation strategy. The consistency regularization enforces the predicted distributions for an example and its semantically equivalent augmentation to be consistent. We conduct experiments on the MASSIVE dataset under both full-dataset and zero-shot settings. Experimental results demonstrate that our proposed method improves the performance on both intent detection and slot filling tasks. Our system ranked 1st in the MMNLU-22 competition under the full-dataset setting.
Comment: Accepted by EMNLP2022 MMNLU-22 Workshop. The winner of the MMNLU-22 Competition Full Dataset Task. Code is available at https://github.com/bozheng-hit/MMNLU-22-HIT-SCIR
Published: 2023

5. NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Author: Dhole, Kaustubh D., Gangal, Varun, Gehrmann, Sebastian, Gupta, Aadesh, Li, Zhenhao, Mahamood, Saad, Mahendiran, Abinaya, Mille, Simon, Shrivastava, Ashish, Tan, Samson, Wu, Tongshuang, Sohl-Dickstein, Jascha, Choi, Jinho D., Hovy, Eduard, Dusek, Ondrej, Ruder, Sebastian, Anand, Sajant, Aneja, Nagender, Banjade, Rabin, Barthe, Lisa, Behnke, Hanna, Berlot-Attwell, Ian, Boyle, Connor, Brun, Caroline, Cabezudo, Marco Antonio Sobrevilla, Cahyawijaya, Samuel, Chapuis, Emile, Che, Wanxiang, Choudhary, Mukund, Clauss, Christian, Colombo, Pierre, Cornell, Filip, Dagan, Gautier, Das, Mayukh, Dixit, Tanay, Dopierre, Thomas, Dray, Paul-Alexis, Dubey, Suchitra, Ekeinhor, Tatiana, Di Giovanni, Marco, Goyal, Tanya, Gupta, Rishabh, Hamla, Louanes, Han, Sang, Harel-Canada, Fabrice, Honore, Antoine, Jindal, Ishan, Joniak, Przemyslaw K., Kleyko, Denis, Kovatchev, Venelin, Krishna, Kalpesh, Kumar, Ashutosh, Langer, Stefan, Lee, Seungjae Ryan, Levinson, Corey James, Liang, Hualou, Liang, Kaizhao, Liu, Zhexiong, Lukyanenko, Andrey, Marivate, Vukosi, de Melo, Gerard, Meoni, Simon, Meyer, Maxime, Mir, Afnan, Moosavi, Nafise Sadat, Muennighoff, Niklas, Mun, Timothy Sum Hon, Murray, Kenton, Namysl, Marcin, Obedkova, Maria, Oli, Priti, Pasricha, Nivranshu, Pfister, Jan, Plant, Richard, Prabhu, Vinay, Pais, Vasile, Qin, Libo, Raji, Shahab, Rajpoot, Pawan Kumar, Raunak, Vikas, Rinberg, Roy, Roberts, Nicolas, Rodriguez, Juan Diego, Roux, Claude, S., Vasconcellos P. H., Sai, Ananya B., Schmidt, Robin M., Scialom, Thomas, Sefara, Tshephisho, Shamsi, Saqib N., Shen, Xudong, Shi, Haoyue, Shi, Yiwen, Shvets, Anna, Siegel, Nick, Sileo, Damien, Simon, Jamie, Singh, Chandan, Sitelew, Roman, Soni, Priyank, Sorensen, Taylor, Soto, William, Srivastava, Aman, Srivatsa, KV Aditya, Sun, Tony, T, Mukund Varma, Tabassum, A, Tan, Fiona Anting, Teehan, Ryan, Tiwari, Mo, Tolkiehn, Marie, Wang, Athena, Wang, Zijian, Wang, Gloria, Wang, Zijie J., Wei, Fuxuan, Wilie, Bryan, Winata, Genta Indra, Wu, Xinyi, Wydmański, Witold, Xie, Tianbao, Yaseen, Usama, Yee, Michael A., Zhang, Jing, and Zhang, Yue
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
Comment: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter
Published: 2021

6. GL-GIN: Fast and Accurate Non-Autoregressive Model for Joint Multiple Intent Detection and Slot Filling

Author: Qin, Libo, Wei, Fuxuan, Xie, Tianbao, Xu, Xiao, Che, Wanxiang, and Liu, Ting
Subjects: Computer Science - Computation and Language
Abstract: Multi-intent SLU can handle multiple intents in an utterance, which has attracted increasing attention. However, the state-of-the-art joint models heavily rely on autoregressive approaches, resulting in two issues: slow inference speed and information leakage. In this paper, we explore a non-autoregressive model for joint multiple intent detection and slot filling, achieving faster and more accurate inference. Specifically, we propose a Global-Locally Graph Interaction Network (GL-GIN), in which a local slot-aware graph interaction layer models slot dependency to alleviate the uncoordinated slots problem, while a global intent-slot graph interaction layer models the interaction between multiple intents and all slots in the utterance. Experimental results on two public datasets show that our framework achieves state-of-the-art performance while being 11.5 times faster.
Comment: Accepted at ACL2021 (main conference)
Published: 2021

7. Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages

Author: Qin, Libo, primary, Chen, Qiguang, additional, Wei, Fuxuan, additional, Huang, Shijue, additional, and Che, Wanxiang, additional
Published: 2023

8. Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization

Author: Qin, Libo, primary, Wei, Fuxuan, additional, Ni, Minheng, additional, Zhang, Yue, additional, Che, Wanxiang, additional, Li, Yangming, additional, and Liu, Ting, additional
Published: 2022

9. HIT-SCIR at MMNLU-22: Consistency Regularization for Multilingual Spoken Language Understanding

Author: Zheng, Bo, primary, Li, Zhouyang, additional, Wei, Fuxuan, additional, Chen, Qiguang, additional, Qin, Libo, additional, and Che, Wanxiang, additional
Published: 2022

10. GL-GIN: Fast and Accurate Non-Autoregressive Model for Joint Multiple Intent Detection and Slot Filling

Author: Qin, Libo, primary, Wei, Fuxuan, additional, Xie, Tianbao, additional, Xu, Xiao, additional, Che, Wanxiang, additional, and Liu, Ting, additional
Published: 2021
