Author: "Zheng, Boyuan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zheng, Boyuan"' showing total 209 results


1. Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

Author: Gou, Boyu, Wang, Ruohan, Zheng, Boyuan, Xie, Yanan, Chang, Cheng, Shu, Yiheng, Sun, Huan, and Su, Yu
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Multimodal large language models (MLLMs) are transforming the capabilities of graphical user interface (GUI) agents, facilitating their transition from controlled simulations to complex, real-world applications across various platforms. However, the effectiveness of these agents hinges on the robustness of their grounding capability. Current GUI agents predominantly utilize text-based representations such as HTML or accessibility trees, which, despite their utility, often introduce noise, incompleteness, and increased computational overhead. In this paper, we advocate a human-like embodiment for GUI agents that perceive the environment entirely visually and directly take pixel-level operations on the GUI. The key is visual grounding models that can accurately map diverse referring expressions of GUI elements to their coordinates on the GUI across different platforms. We show that a simple recipe, which includes web-based synthetic data and slight adaptation of the LLaVA architecture, is surprisingly effective for training such visual grounding models. We collect the largest dataset for GUI visual grounding so far, containing 10M GUI elements and their referring expressions over 1.3M screenshots, and use it to train UGround, a strong universal visual grounding model for GUI agents. Empirical results on six benchmarks spanning three categories (grounding, offline agent, and online agent) show that 1) UGround substantially outperforms existing visual grounding models for GUI agents, by up to 20% absolute, and 2) agents with UGround outperform state-of-the-art agents, despite the fact that existing agents use additional text-based input while ours only uses visual perception. These results provide strong support for the feasibility and promises of GUI agents that navigate the digital world as humans do.
Published: 2024

2. Interpretable Robotic Manipulation from Language

Author: Zheng, Boyuan, Zhou, Jianlong, and Chen, Fang
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Humans naturally employ linguistic instructions to convey knowledge, a process that proves significantly more complex for machines, especially within the context of multitask robotic manipulation environments. Natural language, moreover, serves as the primary medium through which humans acquire new knowledge, presenting a potentially intuitive bridge for translating concepts understandable by humans into formats that can be learned by machines. In pursuit of facilitating this integration, we introduce an explainable behavior cloning agent, named Ex-PERACT, specifically designed for manipulation tasks. This agent is distinguished by its hierarchical structure, which incorporates natural language to enhance the learning process. At the top level, the model is tasked with learning a discrete skill code, while at the bottom level, the policy network translates the problem into a voxelized grid and maps the discretized actions to voxel grids. We evaluate our method across eight challenging manipulation tasks utilizing the RLBench benchmark, demonstrating that Ex-PERACT not only achieves competitive policy performance but also effectively bridges the gap between human instructions and machine execution in complex environments.
Published: 2024

3. Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning

Author: Zhang, Hai, Zheng, Boyuan, Ji, Tianying, Liu, Jinhang, Guo, Anqi, Zhao, Junqiao, and Li, Lanqing
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Offline meta reinforcement learning (OMRL) has emerged as a promising approach for interaction avoidance and strong generalization performance by leveraging pre-collected data and meta-learning techniques. Previous context-based approaches predominantly rely on the intuition that alternating optimization between the context encoder and the policy can lead to performance improvements, as long as the context encoder follows the principle of maximizing the mutual information between the task variable $M$ and its latent representation $Z$ ($I(Z;M)$) while the policy adopts the standard offline reinforcement learning (RL) algorithms conditioning on the learned task representation. Despite promising results, the theoretical justification of performance improvements for such intuition remains underexplored. Inspired by the return discrepancy scheme in the model-based RL field, we find that the previous optimization framework can be linked with the general RL objective of maximizing the expected return, thereby explaining performance improvements. Furthermore, after scrutinizing this optimization framework, we find it ignores the variation of the task representation in the alternating optimization process, which weakens the condition necessary for monotonic performance improvements, and may therefore violate the monotonicity. We name this issue task representation shift and theoretically prove that the monotonic performance improvements can be guaranteed with appropriate context encoder updates. We use different settings to rein in the task representation shift on three widely adopted training objectives concerning maximizing $I(Z;M)$ across different data qualities. Empirical results show that reining in the task representation shift can indeed improve performance.
Published: 2024

4. Separation and recovery of uranium and beryllium from acidic sulphate solutions by solvent extraction using mixture of organophosphonic extractants

Author: Xia, Hongyang, Zheng, Boyuan, Zhao, Xu, Huang, Yanting, Wang, Hongqiang, Hu, Eming, Lei, Zhiwu, Hu, Fang, and Wang, Qingliang
Published: 2024
Full Text: View/download PDF

5. A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

Author: Mo, Lingbo, Liao, Zeyi, Zheng, Boyuan, Su, Yu, Xiao, Chaowei, and Sun, Huan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Language agents powered by large language models (LLMs) have seen exploding development. Their capability of using language as a vehicle for thought and communication lends an incredible level of flexibility and versatility. People have quickly capitalized on this capability to connect LLMs to a wide range of external components and environments: databases, tools, the Internet, robotic embodiment, etc. Many believe an unprecedentedly powerful automation technology is emerging. However, new automation technologies come with new safety risks, especially for intricate systems like language agents. There is a surprisingly large gap between the speed and scale of their development and deployment and our understanding of their safety risks. Are we building a house of cards? In this position paper, we present the first systematic effort in mapping adversarial attacks against language agents. We first present a unified conceptual framework for agents with three major components: Perception, Brain, and Action. Under this framework, we present a comprehensive discussion and propose 12 potential attack scenarios against different components of an agent, covering different attack strategies (e.g., input manipulation, adversarial demonstrations, jailbreaking, backdoors). We also draw connections to successful attack strategies previously applied to LLMs. We emphasize the urgency to gain a thorough understanding of language agent risks before their widespread deployment.
Published: 2024

6. Dual-View Visual Contextualization for Web Navigation

Author: Kil, Jihyung, Song, Chan Hee, Zheng, Boyuan, Deng, Xiang, Su, Yu, and Chao, Wei-Lun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Automatic web navigation aims to build a web agent that can follow language instructions to execute complex and diverse tasks on real-world websites. Existing work primarily takes HTML documents as input, which define the contents and action spaces (i.e., actionable elements and operations) of webpages. Nevertheless, HTML documents may not provide a clear task-related context for each element, making it hard to select the right (sequence of) actions. In this paper, we propose to contextualize HTML elements through their "dual views" in webpage screenshots: each HTML element has its corresponding bounding box and visual content in the screenshot. We build upon the insight -- web developers tend to arrange task-related elements nearby on webpages to enhance user experiences -- and propose to contextualize each element with its neighbor elements, using both textual and visual features. The resulting representations of HTML elements are more informative for the agent to take action. We validate our method on the recently released Mind2Web dataset, which features diverse navigation domains and tasks on real-world websites. Our method consistently outperforms the baseline in all the scenarios, including cross-task, cross-website, and cross-domain ones.
Comment: Accepted to CVPR 2024
Published: 2024

7. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

Author: Shen, Lingfeng, Tan, Weiting, Chen, Sihao, Chen, Yunmo, Zhang, Jingyu, Xu, Haoran, Zheng, Boyuan, Koehn, Philipp, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages, we observe that (1) LLMs tend to generate unsafe responses much more often when a malicious prompt is written in a lower-resource language, and (2) LLMs tend to generate more irrelevant responses to malicious prompts in lower-resource languages. To understand where the discrepancy can be attributed, we study the effect of instruction tuning with reinforcement learning from human feedback (RLHF) or supervised finetuning (SFT) on the HH-RLHF dataset. Surprisingly, while training with high-resource languages improves model alignment, training in lower-resource languages yields minimal improvement. This suggests that the bottleneck of cross-lingual alignment is rooted in the pretraining stage. Our findings highlight the challenges in cross-lingual LLM safety, and we hope they inform future research in this direction.
Published: 2024

8. GPT-4V(ision) is a Generalist Web Agent, if Grounded

Author: Zheng, Boyuan, Gou, Boyu, Kil, Jihyung, Sun, Huan, and Su, Yu
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: The recent development on large multimodal models (LMMs), especially GPT-4V(ision) and Gemini, has been quickly expanding the capability boundaries of multimodal models beyond traditional tasks like image captioning and visual question answering. In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website. We propose SEEACT, a generalist web agent that harnesses the power of LMMs for integrated visual understanding and acting on the web. We evaluate on the recent MIND2WEB benchmark. In addition to standard offline evaluation on cached websites, we enable a new online evaluation setting by developing a tool that allows running web agents on live websites. We show that GPT-4V presents a great potential for web agents -- it can successfully complete 51.1% of the tasks on live websites if we manually ground its textual plans into actions on the websites. This substantially outperforms text-only LLMs like GPT-4 or smaller models (FLAN-T5 and BLIP-2) specifically fine-tuned for web agents. However, grounding still remains a major challenge. Existing LMM grounding strategies like set-of-mark prompting turn out to be not effective for web agents, and the best grounding strategy we develop in this paper leverages both the HTML structure and visuals. Yet, there is still a substantial gap with oracle grounding, leaving ample room for further improvement. All code, data, and evaluation tools are available at https://github.com/OSU-NLP-Group/SeeAct.
Published: 2024

9. MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Author: Yue, Xiang, Ni, Yuansheng, Zhang, Kai, Zheng, Tianyu, Liu, Ruoqi, Zhang, Ge, Stevens, Samuel, Jiang, Dongfu, Ren, Weiming, Sun, Yuxuan, Wei, Cong, Yu, Botao, Yuan, Ruibin, Sun, Renliang, Yin, Ming, Zheng, Boyuan, Yang, Zhenzhu, Liu, Yibo, Huang, Wenhao, Sun, Huan, Su, Yu, and Chen, Wenhu
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning. MMMU includes 11.5K meticulously collected multimodal questions from college exams, quizzes, and textbooks, covering six core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, and Tech & Engineering. These questions span 30 subjects and 183 subfields, comprising 30 highly heterogeneous image types, such as charts, diagrams, maps, tables, music sheets, and chemical structures. Unlike existing benchmarks, MMMU focuses on advanced perception and reasoning with domain-specific knowledge, challenging models to perform tasks akin to those faced by experts. The evaluation of 14 open-source LMMs as well as the proprietary GPT-4V(ision) and Gemini highlights the substantial challenges posed by MMMU. Even the advanced GPT-4V and Gemini Ultra only achieve accuracies of 56% and 59% respectively, indicating significant room for improvement. We believe MMMU will stimulate the community to build next-generation multimodal foundation models towards expert artificial general intelligence.
Comment: CVPR 2024 Oral
Published: 2023

10. Mind2Web: Towards a Generalist Agent for the Web

Author: Deng, Xiang, Gu, Yu, Zheng, Boyuan, Chen, Shijie, Stevens, Samuel, Wang, Boshi, Sun, Huan, and Su, Yu
Subjects: Computer Science - Computation and Language
Abstract: We introduce Mind2Web, the first dataset for developing and evaluating generalist agents for the web that can follow language instructions to complete complex tasks on any website. Existing datasets for web agents either use simulated websites or only cover a limited set of websites and tasks, thus not suitable for generalist web agents. With over 2,000 open-ended tasks collected from 137 websites spanning 31 domains and crowdsourced action sequences for the tasks, Mind2Web provides three necessary ingredients for building generalist web agents: 1) diverse domains, websites, and tasks, 2) use of real-world websites instead of simulated and simplified ones, and 3) a broad spectrum of user interaction patterns. Based on Mind2Web, we conduct an initial exploration of using large language models (LLMs) for building generalist web agents. While the raw HTML of real-world websites are often too large to be fed to LLMs, we show that first filtering it with a small LM significantly improves the effectiveness and efficiency of LLMs. Our solution demonstrates a decent level of performance, even on websites or entire domains the model has never seen before, but there is still a substantial room to improve towards truly generalizable agents. We open-source our dataset, model implementation, and trained models (https://osu-nlp-group.github.io/Mind2Web) to facilitate further research on building a generalist agent for the web.
Comment: Website: https://osu-nlp-group.github.io/Mind2Web. Updated with supplementary material. NeurIPS'23 Spotlight
Published: 2023

11. Beryllium adsorption from beryllium mining wastewater with novel porous lotus leaf biochar modified with PO₄³⁻/NH₄⁺ multifunctional groups (MLLB)

Author: Zhao, Xu, Wang, Qingliang, Sun, Yige, Li, Haoshuai, Lei, Zhiwu, Zheng, Boyuan, Xia, Hongyang, Su, Yucheng, Ali, Khan Muhammad Yaruq, Wang, Hongqiang, and Hu, Fang
Published: 2024
Full Text: View/download PDF

12. Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency

Author: Shen, Lingfeng, Tan, Weiting, Zheng, Boyuan, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: With growing capabilities of large language models, prompting them has become the dominant way to access them. This has motivated the development of strategies for automatically selecting effective language prompts. In this paper, we introduce prompt flatness, a new metric to quantify the expected utility of a language prompt. This metric is inspired by flatness regularization in statistical learning that quantifies the robustness of the model towards its parameter perturbations. We provide theoretical foundations for this metric and its relationship with other prompt selection metrics, providing a comprehensive understanding of existing methods. Empirically, we show that combining prompt flatness with existing metrics improves both performance and sample efficiency. Our metric outperforms the previous prompt selection metrics with an average increase of 5% in accuracy and 10% in Pearson correlation across 6 classification benchmarks.
Published: 2023

13. Genetic Imitation Learning by Reward Extrapolation

Author: Zheng, Boyuan, Zhou, Jianlong, and Chen, Fang
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Machine Learning
Abstract: Imitation learning demonstrates remarkable performance in various domains. However, imitation learning is also constrained by many prerequisites. The research community has done intensive research to alleviate these constraints, such as adding the stochastic policy to avoid unseen states, eliminating the need for action labels, and learning from the suboptimal demonstrations. Inspired by the natural reproduction process, we proposed a method called GenIL that integrates the Genetic Algorithm with imitation learning. The involvement of the Genetic Algorithm improves the data efficiency by reproducing trajectories with various returns and assists the model in estimating more accurate and compact reward function parameters. We tested GenIL in both Atari and Mujoco domains, and the result shows that it successfully outperforms the previous extrapolation methods over extrapolation accuracy, robustness, and overall policy performance when input data is limited.
Published: 2023

14. Explaining Imitation Learning through Frames

Author: Zheng, Boyuan, Zhou, Jianlong, Liu, Chunjie, Li, Yiqiao, and Chen, Fang
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: As one of the prevalent methods to achieve automation systems, Imitation Learning (IL) presents a promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, the corresponding research on the explainability of IL models is still limited. Inspired by the recent approaches in explainable artificial intelligence methods, we proposed a model-agnostic explaining framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model from the randomized masked demonstrations and uses the conventional evaluation outcome environment returns as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The result shows that R2RISE successfully distinguishes important frames from the demonstrations.
Published: 2023

15. GANExplainer: GAN-based Graph Neural Networks Explainer

Author: Li, Yiqiao, Zhou, Jianlong, Zheng, Boyuan, and Chen, Fang
Subjects: Computer Science - Machine Learning
Abstract: With the rapid deployment of graph neural networks (GNNs) based techniques into a wide range of applications such as link prediction, node classification, and graph classification, the explainability of GNNs has become an indispensable component for predictive and trustworthy decision-making. Thus, it is critical to explain why graph neural network (GNN) makes particular predictions for them to be believed in many applications. Some GNNs explainers have been proposed recently. However, they lack to generate accurate and real explanations. To mitigate these limitations, we propose GANExplainer, based on Generative Adversarial Network (GAN) architecture. GANExplainer is composed of a generator to create explanations and a discriminator to assist with the Generator development. We investigate the explanation accuracy of our models by comparing the performance of GANExplainer with other state-of-the-art methods. Our empirical results on synthetic datasets indicate that GANExplainer improves explanation accuracy by up to 35% compared to its alternatives.
Published: 2022

16. An Empirical Study on Finding Spans

Author: Gu, Weiwei, Zheng, Boyuan, Chen, Yunmo, Chen, Tongfei, and Van Durme, Benjamin
Subjects: Computer Science - Computation and Language
Abstract: We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. We focus on approaches that can be employed in training end-to-end information extraction systems, and find there is no definitive solution without considering task properties, and provide our observations to help with future design choices: 1) a tagging approach often yields higher precision while span enumeration and boundary prediction provide higher recall; 2) span type information can benefit a boundary prediction approach; 3) additional contextualization does not help span finding in most cases.
Comment: Accepted to EMNLP 2022
Published: 2022

17. Multilingual Coreference Resolution in Multiparty Dialogue

Author: Zheng, Boyuan, Xia, Patrick, Yarmohammadi, Mahsa, and Van Durme, Benjamin
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Existing multiparty dialogue datasets for entity coreference resolution are nascent, and many challenges are still unaddressed. We create a large-scale dataset, Multilingual Multiparty Coref (MMC), for this task based on TV transcripts. Due to the availability of gold-quality subtitles in multiple languages, we propose reusing the annotations to create silver coreference resolution data in other languages (Chinese and Farsi) via annotation projection. On the gold (English) data, off-the-shelf models perform relatively poorly on MMC, suggesting that MMC has broader coverage of multiparty coreference than prior datasets. On the silver data, we find success both using it for data augmentation and training from scratch, which effectively simulates the zero-shot cross-lingual setting.
Published: 2022

18. Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation

Author: Feng, Yukun, Li, Feng, Song, Ziang, Zheng, Boyuan, and Koehn, Philipp
Subjects: Computer Science - Artificial Intelligence
Abstract: The Transformer architecture has led to significant gains in machine translation. However, most studies focus on only sentence-level translation without considering the context dependency within documents, leading to the inadequacy of document-level coherence. Some recent research tried to mitigate this issue by introducing an additional context encoder or translating with multiple sentences or even the entire document. Such methods may lose the information on the target side or have an increasing computational complexity as documents get longer. To address such problems, we introduce a recurrent memory unit to the vanilla Transformer, which supports the information exchange between the sentence and previous context. The memory unit is recurrently updated by acquiring information from sentences, and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy, in which the model is first trained at the sentence level and then finetuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation and our model has an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming the previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.
Comment: Accepted by NAACL-2022 Findings
Published: 2022
Full Text: View/download PDF

19. Promoting neurovascularized bone regeneration with a novel 3D printed inorganic-organic magnesium silicate/PLA composite scaffold

Author: Wang, Zhaozhen, Zheng, Boyuan, Yu, Xiaolu, Shi, Yiwan, Zhou, Xinting, Gao, Botao, He, Fupo, Tam, Man Seng, Wang, Huajun, Cheang, Lek Hang, Zheng, Xiaofei, and Wu, Tingting
Published: 2024
Full Text: View/download PDF

20. Study on removal of beryllium from uranium beryllium ore wastewater by acid leaching activated carbon and its mechanism

Author: Zhao, Xu, Xia, Hongyang, Su, Yucheng, Wang, Hongqiang, Hu, Eming, Hu, Fang, Lei, Zhiwu, Wang, Qingliang, Zhou, Chunze, Zheng, Boyuan, and Hu, Pengfei
Published: 2023
Full Text: View/download PDF

21. A Super-robust Armoured Superhydrophobic Surface with Excellent Anti-icing Ability

Author: Wang, Peng, Zhao, Hui, Zheng, Boyuan, Guan, Ximei, Sun, Bin, Liao, Yongli, Yue, Ying, Duan, Wei, and Ding, Haimin
Published: 2023
Full Text: View/download PDF

22. Nonlinear attitude tracking control for four-wheel-leg-vehicle with series active suspension

Author: Wu, Liang, Zheng, Boyuan, Zhang, Weizhou, Youn, Iljoong, and Jia, Weiwei
Published: 2024
Full Text: View/download PDF

23. Frontiers in Laser Additive Manufacturing Technology

Author: Yang, Yongqiang, Jiang, Renwu, Han, Changjun, Chen, Jiaqi, Li, Haoran, Wang, Yan, Tang, Jinrong, Zhou, Heng, Hu, Weinan, Zheng, Boyuan, Liu, Zixin, Song, Changhui, and Wang, Di
Published: 2024
Full Text: View/download PDF

24. An eco-friendly porous hydrogel adsorbent based on dextran/phosphate/amino for efficient removal of Be(II) from aqueous solution

Author: Zhao, Xu, Wang, Qingliang, Sun, Yige, Li, Haoshuai, Lei, Zhiwu, Zheng, Boyuan, Xia, Hongyang, Su, Yucheng, Ali, Khan Muhammad Yaruq, Wang, Hongqiang, and Hu, Fang
Published: 2024
Full Text: View/download PDF

25. Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning

Author: Wang, Cunxiang, Zheng, Boyuan, Niu, Yuchen, and Zhang, Yue
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: To quantitatively and intuitively explore the generalization ability of pre-trained language models (PLMs), we have designed several tasks of arithmetic and logical reasoning. We analyse how well PLMs generalize both when the test data is in the same distribution as the train data and when it is different. For the latter analysis, we have also designed a cross-distribution test set in addition to the in-distribution test set. We conduct experiments on one of the most advanced and publicly released generative PLMs, BART. Our research finds that the PLMs can easily generalize when the distribution is the same; however, it is still difficult for them to generalize out of the distribution.
Comment: Accepted by NLPCC2021
Published: 2021

26. Imitation Learning: Progress, Taxonomies and Challenges

Author: Zheng, Boyuan, Verma, Sunny, Zhou, Jianlong, Tsang, Ivor, and Chen, Fang
Subjects: Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Imitation learning aims to extract knowledge from human experts' demonstrations or artificially created agents in order to replicate their behaviors. Its success has been demonstrated in areas such as video games, autonomous driving, robotic simulations and object manipulation. However, this replicating process could be problematic, such as the performance is highly dependent on the demonstration quality, and most trained agents are limited to perform well in task-specific environments. In this survey, we provide a systematic review on imitation learning. We first introduce the background knowledge from development history and preliminaries, followed by presenting different taxonomies within Imitation Learning and key milestones of the field. We then detail challenges in learning strategies and present research opportunities with learning policy from suboptimal demonstration, voice instructions and other associated optimization schemes.
Published: 2021

27. Exceptional strength-ductility synergy achieved by spinodal decomposition in a high Cu content high-entropy alloy

Author: Wu, Yidong, Dong, Zhao, Zheng, Boyuan, Liu, Xuli, and Hui, Xidong
Published: 2024
Full Text: View/download PDF

28. SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning

Author: Zheng, Boyuan, Yang, Xiaoyu, Ruan, Yu-Ping, Ling, Zhenhua, Liu, Quan, Wei, Si, and Zhu, Xiaodan
Subjects: Computer Science - Computation and Language
Abstract: This paper introduces the SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM). This shared task is designed to help evaluate the ability of machines in representing and understanding abstract concepts. Given a passage and the corresponding question, a participating system is expected to choose the correct answer from five candidates of abstract concepts in a cloze-style machine reading comprehension setup. Based on two typical definitions of abstractness, i.e., the imperceptibility and nonspecificity, our task provides three subtasks to evaluate the participating models. Specifically, Subtask 1 aims to evaluate how well a system can model concepts that cannot be directly perceived in the physical world. Subtask 2 focuses on models' ability in comprehending nonspecific concepts located high in a hypernym hierarchy given the context of a passage. Subtask 3 aims to provide some insights into models' generalizability over the two types of abstractness. During the SemEval-2021 official evaluation period, we received 23 submissions to Subtask 1 and 28 to Subtask 2. The participating teams additionally made 29 submissions to Subtask 3. The leaderboard and competition website can be found at https://competitions.codalab.org/competitions/26153. The data and baseline code are available at https://github.com/boyuanzheng010/SemEval2021-Reading-Comprehension-of-Abstract-Meaning.
Published: 2021

29. Effective Removal of Beryllium from Industrial Wastewater by Alkali-Leaching Activated Carbon

Author: Zhao, Xu, Zheng, Boyuan, Xia, Hongyang, Su, Yucheng, Wang, Hongqiang, Hu, Eming, Hu, Pengfei, Hu, Fang, Lei, Zhiwu, and Wang, Qingliang
Published: 2023
Full Text: View/download PDF

30. Study on additive and subtractive manufacturing of high-quality surface parts enabled by picosecond laser

Author: Zheng, Boyuan, Trofimov, Vyacheslav, Yang, Yongqiang, Liu, Linqing, Feng, Yongwei, Zheng, Zhantu, Huang, Jinhui, and Wang, Di
Published: 2023
Full Text: View/download PDF

31. The Inner Link between Apolipoprotein E and the risk of Alzheimer’s disease

Author: Zheng, Boyuan, primary
Published: 2024
Full Text: View/download PDF

32. Beryllium adsorption from beryllium mining wastewater with novel porous lotus leaf biochar modified with PO₄³⁻/NH₄⁺ multifunctional groups (MLLB).

Author: Zhao, Xu, Wang, Qingliang, Sun, Yige, Li, Haoshuai, Lei, Zhiwu, Zheng, Boyuan, Xia, Hongyang, Su, Yucheng, Ali, Khan Muhammad Yaruq, Wang, Hongqiang, and Hu, Fang
Subjects: PRECIPITATION (Chemistry), COMPLEXATION reactions, BERYLLIUM, ADSORPTION capacity, HYDROXYL group
Abstract: Wastewater produced in beryllium mining seriously affects ecological balance and causes great environmental pressure. We designed a novel porous lotus leaf biochar modified with PO₄³⁻/NH₄⁺ multifunctional groups (MLLB) and used it for beryllium (Be) removal from beryllium mining wastewater. Kinetic and thermodynamic experiments showed that the adsorption capacity (Qe) of Be with MLLB from the simulated beryllium mining wastewater could reach 40.38 g kg⁻¹ (35 °C, pH = 5.5), and the adsorption process was spontaneous and endothermic. The dispersion coefficient Kd of Be with MLLB was 2.6 × 10⁴ mL g⁻¹, which proved that MLLB had strong selective adsorption capacity for Be. Phosphoric acid, ammonia, and hydroxyl groups on the MLLB surface would complex with Be to form Be(OH)₂ and Be(NH₄)PO₄ complexation products, which implied that surface complexation and precipitation reactions might coexist in the adsorption process. The above results showed that MLLB could effectively adsorb Be and prevent beryllium exposure in a beryllium mining process. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Two-Stage Robust Optimization of Integrated Energy Systems Considering Uncertainty in Carbon Source Load.

Author: Li, Na, Zheng, Boyuan, Wang, Guanxiong, Liu, Wenjie, Guo, Dongxu, Zou, Linna, and Pan, Chongchao
Subjects: CARBON emissions, ROBUST optimization, RENEWABLE energy sources, LINEAR programming, MATHEMATICAL optimization
Abstract: Integrated Energy Systems (IESs) interconnect various energy networks to achieve coordinated planning and optimized operation among heterogeneous energy subsystems, making them a hot topic in current energy research. However, with the high integration of renewable energy sources, their fluctuation characteristics introduce uncertainties to the entire system, including the corresponding indirect carbon emissions from electricity. To address these issues, this paper constructs a two-stage, three-layer robust optimization operation model for IESs from day-ahead to intra-day. The model analyzes the uncertainties in carbon emission intensity at grid-connected nodes, as well as the uncertainty characteristics of photovoltaic, wind turbine, and cooling, heating, and electricity loads, expressed using polyhedral uncertainty sets. It standardizes the modeling of internal equipment in the IES, introduces carbon emission trading mechanisms, and constructs a low-carbon economic model, transforming the objective function and constraints into a compact form. The column-and-constraint generation algorithm is applied to transform the three-layer model into a single-layer main problem and a two-layer subproblem for iterative solution. The Karush–Kuhn–Tucker (KKT) condition is used to convert the two-layer subproblem into a linear programming model. A case study conducted on a park shows that while the introduction of uncertainty optimization increases system costs and carbon emissions compared to deterministic optimization, the scheduling strategy is more stable, significantly reducing the impact of uncertainties on the system. Moreover, the proposed strategy reduces total costs by 5.03% and carbon emissions by 1.25% compared to scenarios considering only source load uncertainty, fully verifying that the proposed method improves the economic and low-carbon performance of the system. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Strategic optimization operations in the integrated energy system through multitime scale comprehensive demand response

Author: Zheng, Boyuan, primary, Hou, Xiaowang, additional, Xu, Shouhan, additional, Jin, Tai, additional, Liu, Wenjie, additional, Li, Na, additional, Guo, Dongxu, additional, and Pan, Chongchao, additional
Published: 2024
Full Text: View/download PDF

35. The ecological and economic impacts of the comprehensive implementation of electric buses in New York City

Author: Zheng, Boyuan, primary, Yu, Miaoxin, additional, Yang, Yue, additional, and Zi, Yumeng, additional
Published: 2024
Full Text: View/download PDF

36. Open-Circuit Fault-Tolerant Control for DTP-SynRM Based on Rotation Coordinate Transformation

Author: Li, Bingjun, Zou, Jibin, Xu, Yongxiang, Zheng, Boyuan, and Xiao, Lijun
Abstract: This article proposes an open-circuit fault-tolerant strategy for a dual-three-phase synchronous reluctance motor (DTP-SynRM) based on rotation coordinate transformation. A DTP-SynRM vector space decoupling model is established to achieve complete decoupling of voltage, current, and flux equations. Then, the influence of the fault state on the current and the torque fluctuations of the motor is analyzed. The proposed fault-tolerant strategy converts the second harmonic of the current through rotational coordinate transformation, thereby directly controlling the second harmonic of the current using a proportional–integral regulator in the new coordinate system. The simulation results prove the analysis and the effectiveness of the proposed strategy.
Published: 2024
Full Text: View/download PDF

37. Scrutinize What We Ignore: Reining Task Representation Shift In Context-Based Offline Meta Reinforcement Learning

Author: Zhang, Hai, Zheng, Boyuan, Guo, Anqi, Ji, Tianying, Heng, Pheng-Ann, Zhao, Junqiao, and Li, Lanqing
Abstract: Offline meta reinforcement learning (OMRL) has emerged as a promising approach for interaction avoidance and strong generalization performance by leveraging pre-collected data and meta-learning techniques. Previous context-based approaches predominantly rely on the intuition that maximizing the mutual information between the task and the task representation ($I(Z;M)$) can lead to performance improvements. Despite achieving attractive results, the theoretical justification of performance improvement for such intuition has been lacking. Motivated by the return discrepancy scheme in the model-based RL field, we find that maximizing $I(Z;M)$ can be interpreted as consistently raising the lower bound of the expected return for a given policy conditioning on the optimal task representation. However, this optimization process ignores the task representation shift between two consecutive updates, which may lead to performance improvement collapse. To address this problem, we turn to use the framework of performance difference bound to consider the impacts of task representation shift explicitly. We demonstrate that by reining the task representation shift, it is possible to achieve monotonic performance improvements, thereby showcasing the advantage against previous approaches. To make it practical, we design an easy yet highly effective algorithm RETRO (REining Task Representation shift in context-based Offline meta reinforcement learning) with only adding one line of code compared to the backbone. Empirical results validate its state-of-the-art (SOTA) asymptotic performance, training stability and training-time consumption on MuJoCo and MetaWorld benchmarks.
Published: 2024

38. High-Frequency Current Harmonic Analysis and Suppression in Dual Three-Phase PMSMs With Advanced Carrier Phase-Shift PWM

Author: Zheng, Boyuan, primary, Zou, Jibin, additional, Xu, Yongxiang, additional, Yu, Guodong, additional, Wang, Lu, additional, and Zanchetta, Pericle, additional
Published: 2024
Full Text: View/download PDF

39. Explaining Imitation Learning Through Frames

Author: Zheng, Boyuan, primary, Zhou, Jianlong, additional, Liu, Chunjie, additional, Li, Yiqiao, additional, and Chen, Fang, additional
Published: 2024
Full Text: View/download PDF

40. Postfault Strategy for Dual Three-Phase PMSM With Reduced Current Loops Considering Leakage Inductance

Author: Zheng, Boyuan, Wang, Xinyuan, Xu, Yongxiang, Zou, Jibin, Yu, Guodong, and Zanchetta, Pericle
Abstract: One of the main advantages of dual three-phase permanent magnet synchronous machines (DTP-PMSMs) is their fault-tolerant capability. However, the complex fault-tolerant controller structure, high copper loss and unneglectable torque ripple under postfault operation limit their further development. The inaccurate transmission of control signals under the open-phase fault (OPF) is considered in this article. Undesirable second-order torque ripple and 3rd order current harmonics are induced by the transmission error between the controller output voltage and the actual stator voltage according to theoretical analysis. To cope with this issue, a fault tolerant control strategy based on voltage compensation is developed for DTP-PMSMs under OPF without introducing extra complex control schemes. The current references are derived in sinusoidal forms with minimum copper loss. Only three current loops are utilized in the proposed fault-tolerant strategy, compared to four current loops under normal operation, without changing the control strategy and controller parameters. In terms of compensation performance, since the leakage inductance affects the voltage compensation, a parameter acquisition method based on sine wave current injection is proposed. The proposed strategy does not induce heavy computation burden and has a good steady-state and dynamic response. Simulation and experiment results prove the theoretical analysis and the superiority of the proposed strategy.
Published: 2024
Full Text: View/download PDF

41. Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning

Author: Wang, Cunxiang, primary, Zheng, Boyuan, additional, Niu, Yuchen, additional, and Zhang, Yue, additional
Published: 2021
Full Text: View/download PDF

42. Fault detection for sucker rod pump based on motor power

Author: Zheng, Boyuan, Gao, Xianwen, and Li, Xiangyu
Published: 2019
Full Text: View/download PDF

43. Diagnosis of Sucker Rod Pump based on generating dynamometer cards

Author: Zheng, Boyuan, Gao, Xianwen, and Li, Xiangyu
Published: 2019
Full Text: View/download PDF

44. Study on additive and subtractive manufacturing using picosecond laser micromachining

Author: Courvoisier, François, Lecler, Sylvain, Pfleging, Wilhelm, Zheng, Boyuan, Trofimov, Vyacheslav, Yang, Yongqiang, Wang, Meng, Tai, Zhiheng, Yan, Zhongwei, Wang, Yan, and Wang, Di
Published: 2024
Full Text: View/download PDF

45. Aldehyde dehydrogenase 2 activation ameliorates cyclophosphamide-induced acute cardiotoxicity via detoxification of toxic aldehydes and suppression of cardiac cell death

Author: Liu, Wenwen, Zhai, Xiaoxuan, Wang, Wenjun, Zheng, Boyuan, Zhang, Zhenxiao, Fan, Xinhui, Chen, Yuguo, and Wang, Jiali
Published: 2018
Full Text: View/download PDF

46. Assessment of treatment efficacy using surface-enhanced Raman spectroscopy analysis of urine in rats with kidney transplantation or kidney disease

Author: Feng, Shijian, Zhou, Lan, Lin, Duo, Zhao, Jianhua, Guan, Qiunong, Zheng, Boyuan, Wang, Kunjie, Li, Hong, Chen, Rong, Zeng, Haishan, and Du, Caigan
Published: 2019
Full Text: View/download PDF

47. Surface modification of the biodegradable cardiovascular stent material Mg–Zn–Y–Nd alloy via conjugating REDV peptide for better endothelialization

Author: Chen, Li, Li, Jingan, Wang, Shuo, Zhu, Shijie, Zhu, Chao, Zheng, Boyuan, Yang, Ge, and Guan, Shaokang
Published: 2018
Full Text: View/download PDF

48. High efficiency frequency conversion based on cascaded processes under large phase mismatching between some interacting pulses

Author: Schunemann, Peter G., Zheng, Boyuan, Trofimov, Vyacheslav A., Kharitonov, Dmitry M., Fedotov, Mikhail V., Yang, Yongqiang, and Wang, Di
Published: 2024
Full Text: View/download PDF

49. Sucker rod pumping diagnosis using valve working position and parameter optimal continuous hidden Markov model

Author: Zheng, Boyuan and Gao, Xianwen
Published: 2017
Full Text: View/download PDF

50. Effects of Uncertainty and Knowledge Graph on Perception of Fairness

Author: Zhou, Jianlong, primary, Zheng, Boyuan, additional, and Chen, Fang, additional
Published: 2023
Full Text: View/download PDF
