8,851 results for "PENG Hao"
Search Results
202. Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
- Author
-
Fu, Yao, Ou, Litu, Chen, Mingyu, Wan, Yuhao, Peng, Hao, and Khot, Tushar
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning - Abstract
As large language models (LLMs) are continuously being developed, their evaluation becomes increasingly important yet challenging. This work proposes Chain-of-Thought Hub, an open-source evaluation suite on the multi-step reasoning capabilities of large language models. We are interested in this setting for two reasons: (1) from the behavior of the GPT and PaLM model families, we observe that complex reasoning is likely to be a key differentiator between weaker and stronger LLMs; (2) we envisage large language models becoming the next-generation computational platform and fostering an ecosystem of new LLM-based applications; this naturally requires the foundation models to perform complex tasks that often involve the composition of linguistic and logical operations. Our approach is to compile a suite of challenging reasoning benchmarks to track the progress of LLMs. Our current results show that: (1) model scale clearly correlates with reasoning capabilities; (2) as of May 2023, Claude-v1.3 and PaLM-2 are the only two models comparable with GPT-4, while open-sourced models still lag behind; (3) LLaMA-65B performs closely to code-davinci-002, indicating that with successful further development such as reinforcement learning from human feedback (RLHF), it has great potential to come close to GPT-3.5-Turbo. Our results also suggest that for the open-source efforts to catch up, the community may focus more on building better base models and exploring RLHF., Comment: Preprint. Code at https://github.com/FranxYao/chain-of-thought-hub
- Published
- 2023
203. Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
- Author
-
Fu, Yao, Peng, Hao, Khot, Tushar, and Lapata, Mirella
- Subjects
Computer Science - Computation and Language - Abstract
We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a buyer and a seller, respectively. They aim to reach a deal with the buyer targeting a lower price and the seller a higher one. A third language model, playing the critic, provides feedback to a player to improve the player's negotiation strategies. We let the two agents play multiple rounds, using previous negotiation history and AI feedback as in-context demonstrations to improve the model's negotiation strategy iteratively. We use different LLMs (GPT and Claude) for different roles and use the deal price as the evaluation metric. Our experiments reveal multiple intriguing findings: (1) Only a subset of the language models we consider can self-play and improve the deal price from AI feedback; weaker models either do not understand the game's rules or cannot incorporate AI feedback for further improvement. (2) Models' abilities to learn from the feedback differ when playing different roles. For example, it is harder for Claude-instant to improve as the buyer than as the seller. (3) When unrolling the game to multiple rounds, stronger agents can consistently improve their performance by meaningfully using previous experiences and iterative AI feedback, yet have a higher risk of breaking the deal. We hope our work provides insightful initial explorations of having models autonomously improve each other with game playing and AI feedback., Comment: Preprint. Code at https://github.com/FranxYao/GPT-Bargaining
- Published
- 2023
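The round structure described in entry 203 can be made concrete with a minimal sketch. Everything here is illustrative: `chat` is a stand-in for whatever LLM API is used (the paper uses GPT and Claude models), and the model names and prompts are invented for the example, not taken from the paper.

```python
# Illustrative sketch of the buyer/seller/critic loop in entry 203.
def chat(model: str, prompt: str) -> str:
    # A real implementation would call an LLM API here; this stub just echoes.
    return f"[{model} reply to: {prompt[:40]}...]"

def play_round(history: str, feedback: str) -> str:
    """One negotiation round, conditioned on prior rounds and AI feedback."""
    seller = chat("seller-llm",
                  f"You are the seller; aim for a high price.\n{history}\n{feedback}")
    buyer = chat("buyer-llm",
                 f"You are the buyer; aim for a low price.\nSeller said: {seller}")
    return f"Seller: {seller}\nBuyer: {buyer}"

history, feedback = "", ""
for round_idx in range(4):                     # unroll the game over rounds
    transcript = play_round(history, feedback)
    # The critic reads the transcript and suggests strategy improvements,
    # which serve as in-context demonstrations in the next round.
    feedback = chat("critic-llm", f"Critique the seller's strategy:\n{transcript}")
    history += f"\n--- Round {round_idx} ---\n{transcript}"
```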
204. LeTI: Learning to Generate from Textual Interactions
- Author
-
Wang, Xingyao, Peng, Hao, Jabbarvand, Reyhaneh, and Ji, Heng
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Software Engineering - Abstract
Fine-tuning pre-trained language models (LMs) is essential for enhancing their capabilities. Existing techniques commonly fine-tune on input-output pairs (e.g., instruction tuning) or with numerical rewards that gauge the output quality (e.g., RLHF). We explore LMs' potential to learn from textual interactions (LETI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback. Our focus is the code generation task, where the model produces code based on natural language instructions. This setting invites a natural and scalable way to acquire textual feedback: the error messages and stack traces from code execution using a Python interpreter. LETI iteratively fine-tunes the model, using the LM objective, on a concatenation of natural language instructions, LM-generated programs, and textual feedback. Prepended to this fine-tuning text, a binary reward token is used to differentiate correct and buggy solutions. LETI requires no ground-truth outputs for training and even outperforms a fine-tuned baseline that does. LETI not only improves the performance of LMs on a code generation dataset MBPP, but also generalizes to other datasets. Trained on MBPP, it achieves comparable or better performance than the base LMs on unseen problems in HumanEval. Furthermore, compared to binary feedback, we observe that textual feedback leads to improved generation quality and sample efficiency, achieving the same performance with fewer than half of the gradient steps. LETI is equally applicable in natural language tasks when they can be formulated as code generation, which we empirically verified on event argument extraction., Comment: NAACL 2024 Findings
- Published
- 2023
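Entry 204's training-text construction is easy to sketch: execute the generated program, capture its error messages as textual feedback, and prepend a binary reward token. The token strings and field layout below are assumptions for illustration, not the paper's exact format.

```python
import subprocess
import sys

GOOD, BAD = "<|good|>", "<|bad|>"   # illustrative reward tokens

def execution_feedback(program: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Run an LM-generated program; return a success flag and stderr text."""
    result = subprocess.run([sys.executable, "-c", program],
                            capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0, result.stderr.strip()

def build_training_text(instruction: str, program: str) -> str:
    """Concatenate reward token, instruction, program, and textual feedback;
    the LM objective is then applied to this whole string."""
    ok, feedback = execution_feedback(program)
    return "\n".join([GOOD if ok else BAD, instruction, program, feedback])

print(build_training_text("Print the square of 7.", "print(7 ** 2)"))
```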
205. AMD: Autoregressive Motion Diffusion
- Author
-
Han, Bo, Peng, Hao, Dong, Minjing, Ren, Yi, Shen, Yixuan, and Xu, Chang
- Subjects
Computer Science - Multimedia - Abstract
Human motion generation aims to produce plausible human motion sequences according to various conditional inputs, such as text or audio. Despite the feasibility of existing methods in generating motion based on short prompts and simple motion patterns, they encounter difficulties when dealing with long prompts or complex motions. The challenges are two-fold: 1) the scarcity of human motion-captured data for long prompts and complex motions; and 2) the high diversity of human motions in the temporal domain and the substantial divergence of distributions from conditional modalities, leading to a many-to-many mapping problem when generating motion with complex and long texts. In this work, we address these gaps by 1) elaborating the first dataset pairing long textual descriptions and 3D complex motions (HumanLong3D), and 2) proposing an autoregressive motion diffusion model (AMD). Specifically, AMD integrates the text prompt at the current timestep with the text prompt and action sequences at the previous timestep as conditional information to predict the current action sequences in an iterative manner. Furthermore, we present its generalization for X-to-Motion with "No Modality Left Behind", enabling the generation of high-definition and high-fidelity human motions based on user-defined modality input., Comment: accepted by AAAI 2024
- Published
- 2023
206. Contrastive Graph Clustering in Curvature Spaces
- Author
-
Sun, Li, Wang, Feiyang, Ye, Junda, Peng, Hao, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning - Abstract
Graph clustering is a longstanding research topic, and has achieved remarkable success with deep learning methods in recent years. Nevertheless, we observe that several important issues largely remain open. On the one hand, graph clustering from the geometric perspective is appealing but has rarely been touched before, as it lacks a promising space for geometric clustering. On the other hand, contrastive learning boosts deep graph clustering but usually struggles in either graph augmentation or hard sample mining. To bridge this gap, we rethink the problem of graph clustering from a geometric perspective and, to the best of our knowledge, make the first attempt to introduce a heterogeneous curvature space to the graph clustering problem. Correspondingly, we present a novel end-to-end contrastive graph clustering model named CONGREGATE, addressing geometric graph clustering with Ricci curvatures. To support geometric clustering, we construct a theoretically grounded Heterogeneous Curvature Space where deep representations are generated via the product of the proposed fully Riemannian graph convolutional nets. Thereafter, we train the graph clusters by an augmentation-free reweighted contrastive approach where we pay more attention to both hard negatives and hard positives in our curvature space. Empirical results on real-world graphs show that our model outperforms the state-of-the-art competitors., Comment: Accepted by IJCAI'23
- Published
- 2023
207. Hierarchical State Abstraction Based on Structural Information Principles
- Author
-
Zeng, Xianghua, Peng, Hao, Li, Angsheng, Liu, Chunyang, He, Lifang, and Yu, Philip S.
- Subjects
Computer Science - Artificial Intelligence - Abstract
State abstraction optimizes decision-making by ignoring irrelevant environmental information in reinforcement learning with rich observations. Nevertheless, recent approaches focus on adequate representational capacities, resulting in essential information loss and affecting their performance on challenging tasks. In this article, we propose a novel mathematical Structural Information principles-based State Abstraction framework, namely SISA, from the information-theoretic perspective. Specifically, an unsupervised, adaptive hierarchical state clustering method without requiring manual assistance is presented, and meanwhile, an optimal encoding tree is generated. On each non-root tree node, a new aggregation function and condition structural entropy are designed to achieve hierarchical state abstraction and compensate for sampling-induced essential information loss in state abstraction. Empirical evaluations on a visual gridworld domain and six continuous control benchmarks demonstrate that, compared with five SOTA state abstraction approaches, SISA significantly improves mean episode reward and sample efficiency up to 18.98 and 44.44%, respectively. Besides, we experimentally show that SISA is a general framework that can be flexibly integrated with different representation-learning objectives to improve their performance further.
- Published
- 2023
208. Hyperbolic Geometric Graph Representation Learning for Hierarchy-imbalance Node Classification
- Author
-
Fu, Xingcheng, Wei, Yuecen, Sun, Qingyun, Yuan, Haonan, Wu, Jia, Peng, Hao, and Li, Jianxin
- Subjects
Computer Science - Machine Learning, Computer Science - Social and Information Networks - Abstract
Learning unbiased node representations for imbalanced samples in the graph has become a remarkable and important topic. For the graph, a significant challenge is that the topological properties of the nodes (e.g., locations, roles) are unbalanced (topology-imbalance), in addition to the number of labeled training nodes (quantity-imbalance). Existing studies on topology-imbalance focus on the location or the local neighborhood structure of nodes, ignoring the global underlying hierarchical properties of the graph, i.e., hierarchy. In the real-world scenario, the hierarchical structure of graph data reveals important topological properties of graphs and is relevant to a wide range of applications. We find that training labeled nodes with different hierarchical properties have a significant impact on node classification tasks and confirm this in our experiments. It is well known that hyperbolic geometry has a unique advantage in representing the hierarchical structure of graphs. Therefore, we attempt to explore the hierarchy-imbalance issue for node classification of graph neural networks with a novel perspective of hyperbolic geometry, including its characteristics and causes. Then, we propose a novel hyperbolic geometric hierarchy-imbalance learning framework, named HyperIMBA, to alleviate the hierarchy-imbalance issue caused by uneven hierarchy-levels and cross-hierarchy connectivity patterns of labeled nodes. Extensive experimental results demonstrate the superior effectiveness of HyperIMBA for hierarchy-imbalance node classification tasks., Comment: Accepted by Web Conference (WWW) 2023
- Published
- 2023
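Entry 208 exploits hyperbolic geometry's affinity for hierarchical structure. For orientation, the sketch below computes the textbook Poincaré-ball distance often used in hyperbolic graph learning; it is not claimed to be the exact space or metric HyperIMBA uses.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between points inside the unit Poincare ball:
    d(u, v) = arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))."""
    duv = float(np.dot(u - v, u - v))
    denom = (1.0 - float(np.dot(u, u))) * (1.0 - float(np.dot(v, v)))
    return float(np.arccosh(1.0 + 2.0 * duv / denom))

# Distances blow up near the boundary, which is what lets the ball embed
# tree-like hierarchies with low distortion.
print(poincare_distance(np.zeros(2), np.array([0.9, 0.0])))
```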
209. Graph Collaborative Signals Denoising and Augmentation for Recommendation
- Author
-
Fan, Ziwei, Xu, Ke, Dong, Zhang, Peng, Hao, Zhang, Jiawei, and Yu, Philip S.
- Subjects
Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Machine Learning - Abstract
Graph collaborative filtering (GCF) is a popular technique for capturing high-order collaborative signals in recommendation systems. However, GCF's bipartite adjacency matrix, which defines the neighbors being aggregated based on user-item interactions, can be noisy for users/items with abundant interactions and insufficient for users/items with scarce interactions. Additionally, the adjacency matrix ignores user-user and item-item correlations, which can limit the scope of beneficial neighbors being aggregated. In this work, we propose a new graph adjacency matrix that incorporates user-user and item-item correlations, as well as a properly designed user-item interaction matrix that balances the number of interactions across all users. To achieve this, we pre-train a graph-based recommendation method to obtain users/items embeddings, and then enhance the user-item interaction matrix via top-K sampling. We also augment the symmetric user-user and item-item correlation components to the adjacency matrix. Our experiments demonstrate that the enhanced user-item interaction matrix with improved neighbors and lower density leads to significant benefits in graph-based recommendation. Moreover, we show that the inclusion of user-user and item-item correlations can improve recommendations for users with both abundant and insufficient interactions. The code is at https://github.com/zfan20/GraphDA., Comment: Short Paper Accepted by SIGIR 2023, 6 pages
- Published
- 2023
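The densification step in entry 209, enhancing the user-item matrix via top-K sampling from pre-trained embeddings, might look roughly like the following; the embedding source, K, and the toy data are assumptions, and the paper's per-user balancing and its user-user/item-item components are omitted.

```python
import numpy as np

def augment_interactions(R: np.ndarray, U: np.ndarray, V: np.ndarray, k: int = 2):
    """Add, for each user, the k highest-scoring unseen items according to
    pre-trained user (U) and item (V) embeddings."""
    scores = U @ V.T                 # predicted user-item affinities
    scores[R > 0] = -np.inf          # never re-add observed interactions
    R_aug = R.copy()
    for u in range(R.shape[0]):
        R_aug[u, np.argsort(scores[u])[-k:]] = 1.0
    return R_aug

rng = np.random.default_rng(0)
R = (rng.random((4, 6)) > 0.7).astype(float)      # toy binary interactions
U, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
print(augment_interactions(R, U, V))
```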
210. Effective and Stable Role-Based Multi-Agent Collaboration by Structural Information Principles
- Author
-
Zeng, Xianghua, Peng, Hao, and Li, Angsheng
- Subjects
Computer Science - Artificial Intelligence - Abstract
Role-based learning is a promising approach to improving the performance of Multi-Agent Reinforcement Learning (MARL). Nevertheless, without manual assistance, current role-based methods cannot guarantee stably discovering a set of roles to effectively decompose a complex task, as they assume either a predefined role structure or practical experience for selecting hyperparameters. In this article, we propose a mathematical Structural Information principles-based Role Discovery method, namely SIRD, and then present a SIRD optimizing MARL framework, namely SR-MARL, for multi-agent collaboration. The SIRD transforms role discovery into a hierarchical action space clustering. Specifically, the SIRD consists of structuralization, sparsification, and optimization modules, where an optimal encoding tree is generated to perform abstracting to discover roles. The SIRD is agnostic to specific MARL algorithms and can be flexibly integrated with various value function factorization approaches. Empirical evaluations on the StarCraft II micromanagement benchmark demonstrate that, compared with state-of-the-art MARL algorithms, the SR-MARL framework improves the average test win rate by 0.17%, 6.08%, and 3.24%, and reduces the deviation by 16.67%, 30.80%, and 66.30%, under easy, hard, and super hard scenarios., Comment: 9 pages, 8 figures, 2 references
- Published
- 2023
211. Electric and magnetic conductivities in magnetized fermion systems
- Author
-
Peng, Hao-Hao, Sheng, Xin-Li, Pu, Shi, and Wang, Qun
- Subjects
Nuclear Theory, High Energy Physics - Theory - Abstract
Using the Wigner function approach with the relaxation time approximation, we calculate the electric and magnetic conductivities of a fermion system in a strong magnetic field. The linear response to a perturbation of the electromagnetic fields on top of the constant background magnetic field is calculated. The Wigner function is separated into an equilibrium part in the background magnetic field and an off-equilibrium part induced by the perturbative fields. The analytical expression for the equilibrium part and the corresponding equilibrium conditions are given. For the off-equilibrium part, we obtain the kinetic equation at the leading order in $\hbar$ from the master equation of the Wigner function. When the perturbative fields depend only on the proper time, the off-equilibrium part can be solved analytically, from which the vector and axial vector currents are obtained. We obtain the longitudinal and transverse Ohm conductivities as well as the Hall conductivity as the linear response of the vector current to the perturbative electric field. The behaviors of these conductivities as functions of the evolution time, relaxation time, particle mass, and strength of the background magnetic field are investigated both analytically and numerically., Comment: 25 pages, 6 figures
- Published
- 2023
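For orientation, the longitudinal, transverse, and Hall conductivities reported in entry 211 enter through the standard linear-response decomposition of the induced current with respect to the background field direction $\hat{\mathbf{b}}=\mathbf{B}/B$; this is the generic textbook form, not the paper's specific result:

```latex
\mathbf{j} = \sigma_{\parallel}\,(\mathbf{E}\cdot\hat{\mathbf{b}})\,\hat{\mathbf{b}}
           + \sigma_{\perp}\,\hat{\mathbf{b}}\times(\mathbf{E}\times\hat{\mathbf{b}})
           + \sigma_{H}\,\hat{\mathbf{b}}\times\mathbf{E}
```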
212. SE-GSL: A General and Effective Graph Structure Learning Framework through Structural Entropy Optimization
- Author
-
Zou, Dongcheng, Peng, Hao, Huang, Xiang, Yang, Renyu, Li, Jianxin, Wu, Jia, Liu, Chunyang, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Graph Neural Networks (GNNs) are de facto solutions to structural data learning. However, they are susceptible to low-quality and unreliable structures, which have been the norm rather than the exception in real-world graphs. Existing graph structure learning (GSL) frameworks still lack robustness and interpretability. This paper proposes a general GSL framework, SE-GSL, through structural entropy and the graph hierarchy abstracted in the encoding tree. Particularly, we exploit the one-dimensional structural entropy to maximize embedded information content when auxiliary neighbourhood attributes are fused to enhance the original graph. A new scheme of constructing optimal encoding trees is proposed to minimize the uncertainty and noises in the graph whilst assuring proper community partition in hierarchical abstraction. We present a novel sample-based mechanism for restoring the graph structure via node structural entropy distribution. It increases the connectivity among nodes with larger uncertainty in lower-level communities. SE-GSL is compatible with various GNN models and enhances the robustness towards noisy and heterophily structures. Extensive experiments show significant improvements in the effectiveness and robustness of structure learning and node representation learning., Comment: 12 pages, 5 figures, accepted by WWW2023
- Published
- 2023
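Entry 212's starting point, the one-dimensional structural entropy, has a compact definition: with graph volume $\mathrm{vol}(G)=\sum_v d_v$, it is $H^1(G) = -\sum_v \frac{d_v}{\mathrm{vol}(G)}\log_2\frac{d_v}{\mathrm{vol}(G)}$. Below is a minimal sketch for an undirected, unweighted edge list; the paper's higher-dimensional encoding-tree machinery is not reproduced here.

```python
import math
from collections import Counter

def structural_entropy_1d(edges: list[tuple[int, int]]) -> float:
    """One-dimensional structural entropy of an undirected, unweighted graph:
    H1(G) = -sum_v (d_v / vol) * log2(d_v / vol), vol = sum of degrees."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    vol = sum(degree.values())
    return -sum((d / vol) * math.log2(d / vol) for d in degree.values())

# A 4-cycle: every node has degree 2, so H1 = log2(4) = 2 bits.
print(structural_entropy_1d([(0, 1), (1, 2), (2, 3), (3, 0)]))
```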
213. FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection
- Author
-
Yang, Yingguang, Yang, Renyu, Peng, Hao, Li, Yangyang, Li, Tong, Liao, Yong, and Zhou, Pengyuan
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Social bot detection is of paramount importance to the resilience and security of online social platforms. The state-of-the-art detection models are siloed and have largely overlooked a variety of data characteristics from multiple cross-lingual platforms. Meanwhile, the heterogeneity of data distribution and model architecture makes it intricate to devise an efficient cross-platform and cross-model detection framework. In this paper, we propose FedACK, a new federated adversarial contrastive knowledge distillation framework for social bot detection. We devise a GAN-based federated knowledge distillation mechanism for efficiently transferring knowledge of data distribution among clients. In particular, a global generator is used to extract the knowledge of the global data distribution and distill it into each client's local model. We leverage a local discriminator to enable customized model design and use a local generator for data enhancement with hard-to-decide samples. Local training is conducted as multi-stage adversarial and contrastive learning to enable consistent feature spaces among clients and to constrain the optimization direction of local models, reducing the divergences between local and global models. Experiments demonstrate that FedACK outperforms the state-of-the-art approaches in terms of accuracy, communication efficiency, and feature space consistency., Comment: Accepted by the ACM Web Conference 2023 (WWW'23)
- Published
- 2023
214. Reinforcement Learning Guided Multi-Objective Exam Paper Generation
- Author
-
Shang, Yuhu, Luo, Xuexiong, Wang, Lihong, Peng, Hao, Zhang, Xiankun, Ren, Yimeng, and Liang, Kun
- Subjects
Computer Science - Machine Learning, Computer Science - Computers and Society - Abstract
To reduce the repetitive and complex work of instructors, the exam paper generation (EPG) technique has become a salient topic in the intelligent education field, which targets generating high-quality exam papers automatically according to instructor-specified assessment criteria. The current advances utilize the ability of heuristic algorithms to optimize several well-known objective constraints, such as difficulty degree, number of questions, etc., for producing optimal solutions. However, in real scenarios, considering other equally relevant objectives (e.g., distribution of exam scores, skill coverage) is extremely important. Besides, how to develop an automatic multi-objective solution that finds an optimal subset of questions from a huge search space of large-sized question datasets and thus composes a high-quality exam paper is urgent but non-trivial. To this end, we skillfully design a reinforcement learning guided Multi-Objective Exam Paper Generation framework, termed MOEPG, to simultaneously optimize three exam domain-specific objectives including difficulty degree, distribution of exam scores, and skill coverage. Specifically, to accurately measure the skill proficiency of the examinee group, we first employ deep knowledge tracing to model the interaction information between examinees and response logs. We then design the flexible Exam Q-Network, a function approximator, which automatically selects the appropriate question to update the exam paper composition process. Later, MOEPG divides the decision space into multiple subspaces to better guide the updated direction of the exam paper. Through extensive experiments on two real-world datasets, we demonstrate that MOEPG is feasible in addressing the multiple dilemmas of the exam paper generation scenario.
- Published
- 2023
215. FedsNet: the real-time network for pedestrian detection based on RT-DETR
- Author
-
Peng, Hao and Chen, Shiqiang
- Published
- 2024
216. Performance analysis of a new multifunctional aircraft environmental control system under variable operating conditions
- Author
-
Shangguan, Zhen, Wei, Xinyi, Peng, Hao, and Cheng, Qing
- Published
- 2024
217. Message passing approach to analyze the robustness of hypergraph
- Author
-
Peng, Hao, Qian, Cheng, Zhao, Dandan, Zhong, Ming, Han, Jianmin, Li, Runchao, and Wang, Wei
- Subjects
Physics - Physics and Society - Abstract
Hypergraph networks are closer to real life because they can reflect higher-order interactions, so researchers have begun using them to build models for real-world networks. The mean-field approach is the current tool for studying the percolation problem on hypergraph networks. However, we found that when there is a loop in the hypergraph network, the calculated results using this approach deviate from the real results. Therefore, in this paper, we rephrase the percolation on the hypergraph network as a message passing process, thus obtaining a message passing approach. Our proposed approach has been tested in several hypergraph networks with loops, and the experimental results are more accurate than those under the mean-field approach. This is helpful to analyze and understand the robustness of hypergraph networks with loops. In addition, we also specifically analyzed how four different types of loops affect the accuracy of the experiment. Our proposed message passing approach also provides another way to study percolation on hypergraph networks., Comment: 13 pages, 6 figures
- Published
- 2023
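For reference, on an ordinary graph the message-passing treatment of bond percolation that entry 217 generalizes to hypergraphs takes the standard form below, where $p$ is the occupation probability, $\partial i$ the neighborhood of node $i$, $u_{i\to j}$ the probability that node $i$ fails to connect $j$ to the giant component, and $S$ the expected giant-component fraction (textbook form, not the paper's hypergraph recursion):

```latex
u_{i\to j} = 1 - p + p \prod_{k \in \partial i \setminus \{j\}} u_{k\to i},
\qquad
S = 1 - \frac{1}{N}\sum_{i} \prod_{k \in \partial i} u_{k\to i}
```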
218. A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT
- Author
-
Zhou, Ce, Li, Qian, Li, Chen, Yu, Jun, Liu, Yixin, Wang, Guangjing, Zhang, Kai, Ji, Cheng, Yan, Qiben, He, Lifang, Peng, Hao, Li, Jianxin, Wu, Jia, Liu, Ziwei, Xie, Pengtao, Xiong, Caiming, Pei, Jian, Yu, Philip S., and Sun, Lichao
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning - Abstract
Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is trained on large-scale data which provides a reasonable parameter initialization for a wide range of downstream applications. BERT learns bidirectional encoder representations from Transformers, which are trained on large datasets as contextual language models. Similarly, the generative pretrained transformer (GPT) method employs Transformers as the feature extractor and is trained using an autoregressive paradigm on large datasets. Recently, ChatGPT has shown promising success with large language models, applying an autoregressive language model with zero-shot or few-shot prompting. The remarkable achievements of PFMs have brought significant breakthroughs to various fields of AI. Numerous studies have proposed different methods, raising the demand for an updated survey. This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities. The review covers the basic components and existing pretraining methods used in natural language processing, computer vision, and graph learning. Additionally, it explores advanced PFMs used for different data modalities and unified PFMs that consider data quality and quantity. The review also discusses research related to the fundamentals of PFMs, such as model efficiency and compression, security, and privacy. Finally, the study provides key implications, future research directions, challenges, and open problems in the field of PFMs. Overall, this survey aims to shed light on the research of the PFMs on scalability, security, logical reasoning ability, cross-domain learning ability, and the user-friendly interactive ability for artificial general intelligence., Comment: 99 pages, 16 figures
- Published
- 2023
219. A Comprehensive Survey on Automatic Knowledge Graph Construction
- Author
-
Zhong, Lingfeng, Wu, Jia, Li, Qian, Peng, Hao, and Wu, Xindong
- Subjects
Computer Science - Information Retrieval - Abstract
Automatic knowledge graph construction aims to manufacture structured human knowledge. To this end, much effort has historically been spent extracting informative fact patterns from different data sources. However, more recently, research interest has shifted to acquiring conceptualized structured knowledge beyond informative data. In addition, researchers have also been exploring new ways of handling sophisticated construction tasks in diversified scenarios. Thus, there is a demand for a systematic review of paradigms to organize knowledge structures beyond data-level mentions. To meet this demand, we comprehensively survey more than 300 methods to summarize the latest developments in knowledge graph construction. A knowledge graph is built in three steps: knowledge acquisition, knowledge refinement, and knowledge evolution. The processes of knowledge acquisition are reviewed in detail, including obtaining entities with fine-grained types and their conceptual linkages to knowledge graphs; resolving coreferences; and extracting entity relationships in complex scenarios. The survey covers models for knowledge refinement, including knowledge graph completion, and knowledge fusion. Methods to handle knowledge evolution are also systematically presented, including condition knowledge acquisition, condition knowledge graph completion, and knowledge dynamic. We present the paradigms to compare the distinction among these methods along the axis of the data environment, motivation, and architecture. Additionally, we also provide briefs on accessible resources that can help readers to develop practical knowledge graph systems. The survey concludes with discussions on the challenges and possible directions for future exploration., Comment: This paper contains 50 pages and 22 figures. This paper is submitted to ACM Computing Surveys
- Published
- 2023
220. Specializing Smaller Language Models towards Multi-Step Reasoning
- Author
-
Fu, Yao, Peng, Hao, Ou, Litu, Sabharwal, Ashish, and Khot, Tushar
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning - Abstract
The surprising ability of Large Language Models (LLMs) to perform well on complex reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very large-scale models (100+ billion parameters). We show that such abilities can, in fact, be distilled down from GPT-3.5 ($\ge$ 175B) to T5 variants ($\le$ 11B). We propose model specialization, to specialize the model's ability towards a target task. The hypothesis is that large models (commonly viewed as larger than 100B) have strong modeling power, but are spread on a large spectrum of tasks. Small models (commonly viewed as smaller than 10B) have limited model capacity, but if we concentrate their capacity on a specific target task, the model can achieve a decent performance improvement. We use multi-step math reasoning as our testbed because it is a very typical emergent ability. We show two important aspects of model abilities: (1) there exists a very complex balance/tradeoff between language models' multi-dimensional abilities; (2) by paying the price of decreased generic ability, we can clearly lift up the scaling curve of models smaller than 10B towards a specialized multi-step math reasoning ability. We further give comprehensive discussions about important design choices for better generalization, including the tuning data format, the start model checkpoint, and a new model selection method. We hope our practice and discoveries can serve as an important attempt towards specialized smaller models in the new research paradigm set by LLMs., Comment: Preprint
- Published
- 2023
221. Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation
- Author
-
Fan, Ziwei, Liu, Zhiwei, Peng, Hao, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval - Abstract
Self-supervised sequential recommendation significantly improves recommendation performance by maximizing mutual information with well-designed data augmentations. However, the mutual information estimation is based on the calculation of Kullback-Leibler (KL) divergence, with several limitations, including asymmetrical estimation, the exponential need of the sample size, and training instability. Also, existing data augmentations are mostly stochastic and can potentially break sequential correlations with random modifications. These two issues motivate us to investigate an alternative robust mutual information measurement capable of modeling uncertainty and alleviating KL divergence limitations. To this end, we propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization (MStein) for sequential recommendation. We propose the Wasserstein Discrepancy Measurement to measure the mutual information between augmented sequences. Wasserstein Discrepancy Measurement builds upon the 2-Wasserstein distance, which is more robust, more efficient in small batch sizes, and able to model the uncertainty of stochastic augmentation processes. We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement. Extensive experiments on four benchmark datasets demonstrate the effectiveness of MStein over baselines. More quantitative analyses show the robustness against perturbations and training efficiency in batch size. Finally, improvements analysis indicates better representations of popular users or items with significant uncertainty. The source code is at https://github.com/zfan20/MStein., Comment: Updated with the correction of the asymmetric mistake on the mutual information connection
- Published
- 2023
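Entry 221's Wasserstein Discrepancy Measurement builds on the 2-Wasserstein distance. For the diagonal-Gaussian representations often used in stochastic sequence embeddings it has a simple closed form, sketched below; the Gaussian parameterization is an assumption for illustration, not the paper's exact setup.

```python
import numpy as np

def w2_diagonal_gaussians(mu1, sigma1, mu2, sigma2) -> float:
    """Closed-form 2-Wasserstein distance between N(mu1, diag(sigma1^2)) and
    N(mu2, diag(sigma2^2)): W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2."""
    return float(np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)))

mu1, s1 = np.zeros(4), np.ones(4)
mu2, s2 = 0.5 * np.ones(4), 1.5 * np.ones(4)
print(w2_diagonal_gaussians(mu1, s1, mu2, s2))   # sqrt(1.0 + 1.0) = 1.414...
```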
222. Unbiased and Efficient Self-Supervised Incremental Contrastive Learning
- Author
-
Ji, Cheng, Li, Jianxin, Peng, Hao, Wu, Jia, Fu, Xingcheng, Sun, Qingyun, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Contrastive Learning (CL) has proven to be a powerful self-supervised approach for a wide range of domains, including computer vision and graph representation learning. However, the incremental learning issue of CL has rarely been studied, which limits its use in real-world applications. Contrastive learning discriminates samples against negative ones drawn from a noise distribution, and this distribution changes in incremental scenarios. Therefore, fitting only the change of the data while ignoring the change of the noise distribution causes bias, and directly retraining results in low efficiency. To bridge this research gap, we propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of (i) a novel Incremental InfoNCE (NCE-II) loss function that estimates the change of the noise distribution for old data to guarantee no bias with respect to retraining, and (ii) a meta-optimization with a deep reinforced Learning Rate Learning (LRL) mechanism, which adaptively learns the learning rate according to the status of the training process and achieves the fast convergence that is critical for incremental learning. Theoretically, the proposed ICL is equivalent to retraining, based on solid mathematical derivation. In practice, extensive experiments in different domains demonstrate that, without retraining a new model, ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
- Published
- 2023
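Entry 222's NCE-II loss extends the standard InfoNCE objective. For reference, a minimal NumPy version of vanilla InfoNCE with cosine similarity is sketched below; the incremental correction terms that make NCE-II equivalent to retraining are not reproduced.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1) -> float:
    """Vanilla InfoNCE: -log(exp(s+/t) / (exp(s+/t) + sum_k exp(s_k/t)))
    with cosine similarity as the score function."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    pos = np.exp(cos(anchor, positive) / temperature)
    neg = sum(np.exp(cos(anchor, n) / temperature) for n in negatives)
    return float(-np.log(pos / (pos + neg)))

rng = np.random.default_rng(0)
a = rng.normal(size=16)
print(info_nce(a, a + 0.1 * rng.normal(size=16),
               [rng.normal(size=16) for _ in range(8)]))
```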
223. Parallel Multi-Extended State Observers Based ADRC with Application to High-Speed Precision Motion Stage
- Author
-
Tang, Guojie, Xue, Wenchao, Peng, Hao, Zhao, Yanlong, and Yang, Zhijun
- Subjects
Electrical Engineering and Systems Science - Systems and Control, Mathematics - Optimization and Control - Abstract
In this paper, a parallel multi-extended state observer (ESO) based active disturbance rejection control approach is proposed to achieve the desired tracking performance by automatically selecting the estimation values that lead to the least tracking error. First, the relationship between the estimation error of the ESO and the tracking error of the output is quantitatively studied for a single ESO of general order. In particular, an algorithm for calculating the tracking error caused by a single ESO's estimation error is constructed. Moreover, by timely evaluating the least tracking error caused by different ESOs, a novel switching ADRC approach with parallel multi-ESOs is proposed. In addition, the stability of the algorithm is rigorously proved. Furthermore, the proposed ADRC is applied to a high-speed precision motion stage, which has large nonlinear uncertainties and elastic deformation disturbances near the dead zone of friction. The experimental results show that the parallel multi-ESO based ADRC achieves higher tracking performance than the traditional single-ESO based ADRC., Comment: 10 pages, 9 figures
- Published
- 2023
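At the core of entry 223 is the extended state observer. A minimal single linear ESO for a second-order plant $\ddot{x} = f(t) + b_0 u$ is sketched below (Euler-discretized, with illustrative gains); the paper's contribution is running several such observers in parallel and switching to the estimate yielding the least tracking error.

```python
import numpy as np

def eso_step(z, y, u, dt, b0=1.0, beta=(30.0, 300.0, 1000.0)):
    """One Euler step of a 3rd-order linear ESO for x'' = f(t) + b0*u.
    z = [x_hat, xdot_hat, f_hat]; f_hat tracks the total disturbance f."""
    e = y - z[0]                                  # output estimation error
    return np.array([z[0] + dt * (z[1] + beta[0] * e),
                     z[1] + dt * (z[2] + b0 * u + beta[1] * e),
                     z[2] + dt * (beta[2] * e)])

# Toy run: constant disturbance f = 2, zero input; f_hat should approach 2.
dt, x, xdot, z = 1e-3, 0.0, 0.0, np.zeros(3)
for _ in range(20000):
    xdot += dt * 2.0                              # plant: x'' = 2, u = 0
    x += dt * xdot
    z = eso_step(z, x, 0.0, dt)
print(z)                                          # z[2] close to 2.0
```

The gains (30, 300, 1000) place all observer poles at -10, i.e. the common bandwidth parameterization $(\beta_1,\beta_2,\beta_3)=(3\omega_o,\,3\omega_o^2,\,\omega_o^3)$ with $\omega_o=10$.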
224. State of the Art and Potentialities of Graph-level Learning
- Author
-
Yang, Zhenyu, Zhang, Ge, Wu, Jia, Yang, Jian, Sheng, Quan Z., Xue, Shan, Zhou, Chuan, Aggarwal, Charu, Peng, Hao, Hu, Wenbin, Hancock, Edwin, and Liò, Pietro
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Graphs have a superior ability to represent relational data, like chemical compounds, proteins, and social networks. Hence, graph-level learning, which takes a set of graphs as input, has been applied to many tasks including comparison, regression, classification, and more. Traditional approaches to learning a set of graphs heavily rely on hand-crafted features, such as substructures. But while these methods benefit from good interpretability, they often suffer from computational bottlenecks as they cannot skirt the graph isomorphism problem. Conversely, deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations. As a result, these deep graph learning methods have been responsible for many successes. Yet, there is no comprehensive survey that reviews graph-level learning starting with traditional learning and moving through to the deep learning approaches. This article fills this gap and frames the representative algorithms into a systematic taxonomy covering traditional learning, graph-level deep neural networks, graph-level graph neural networks, and graph pooling. To ensure a thoroughly comprehensive survey, the evolutions, interactions, and communications between methods from four different branches of development are also examined. This is followed by a brief review of the benchmark data sets, evaluation metrics, and common downstream applications. The survey concludes with a broad overview of 12 current and future directions in this booming field.
- Published
- 2023
225. Self-organization Preserved Graph Structure Learning with Principle of Relevant Information
- Author
-
Sun, Qingyun, Li, Jianxin, Yang, Beining, Fu, Xingcheng, Peng, Hao, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Most Graph Neural Networks follow the message-passing paradigm, assuming the observed structure depicts the ground-truth node relationships. However, this fundamental assumption cannot always be satisfied, as real-world graphs are always incomplete, noisy, or redundant. How to reveal the inherent graph structure in a unified way remains under-explored. We propose PRI-GSL, a Graph Structure Learning framework guided by the Principle of Relevant Information, providing a simple and unified framework for identifying the self-organization and revealing the hidden structure. PRI-GSL learns a structure that contains the most relevant yet least redundant information quantified by von Neumann entropy and Quantum Jensen-Shannon divergence. PRI-GSL incorporates the evolution of quantum continuous walk with graph wavelets to encode node structural roles, showing in which way the nodes interplay and self-organize with the graph structure. Extensive experiments demonstrate the superior effectiveness and robustness of PRI-GSL., Comment: Accepted by AAAI 2023
- Published
- 2022
226. Sensitivity analysis of biological washout and depth selection for a machine learning based dose verification framework in proton therapy
- Author
-
Yu, Shixiong, Liu, Yuxiang, Hu, Zongsheng, Zhang, Haozhao, Qi, Pengyu, and Peng, Hao
- Subjects
Physics - Medical Physics, Computer Science - Machine Learning - Abstract
Dose verification based on proton-induced positron emitters is a promising quality assurance tool and may leverage the strength of artificial intelligence. To move a step closer towards practical application, the sensitivity analysis of two factors needs to be performed: biological washout and depth selection. A bi-directional recurrent neural network (RNN) model was developed. The training dataset was generated based upon a CT image-based phantom (abdomen region) and multiple beam energies/pathways, using Monte-Carlo simulation (1 mm spatial resolution, no biological washout). For the modeling of biological washout, a simplified analytical model was applied to change raw activity profiles over a period of 5 minutes, incorporating both physical decay and biological washout. For the study of depth selection (a challenge linked to multi field/angle irradiation), truncations were applied at different window lengths (100, 125, 150 mm) to raw activity profiles. Finally, the performance of a worst-case scenario was examined by combining both factors (depth selection: 125 mm, biological washout: 5 mins). The accuracy was quantitatively evaluated in terms of range uncertainty, mean absolute error (MAE) and mean relative error (MRE). Our proposed AI framework shows good immunity to the perturbation associated with the two factors. The detection of proton-induced positron emitters, combined with machine learning, has great potential to implement online patient-specific verification in proton therapy.
- Published
- 2022
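The two perturbations studied in entry 226 can be emulated on a raw activity profile in a few lines; the half-life, washout rate, and window length below are placeholders for illustration, not the paper's parameters.

```python
import numpy as np

def perturb_profile(activity, t_min=5.0, half_life_min=20.0,
                    washout_rate_per_min=0.05, window_mm=125):
    """Apply combined physical decay + biological washout over t_min minutes,
    then truncate to a depth-selection window (1 sample per mm)."""
    lam_phys = np.log(2) / half_life_min
    decayed = activity * np.exp(-(lam_phys + washout_rate_per_min) * t_min)
    return decayed[:window_mm]

profile = np.exp(-np.linspace(0.0, 3.0, 200))     # toy 200 mm activity profile
print(perturb_profile(profile).shape)             # -> (125,)
```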
227. Generation of subcycle isolated attosecond pulses by pumping ionizing gating
- Author
-
Wu, Zhaohui, Peng, Hao, Zeng, Xiaoming, Li, Zhaoli, Zhang, Zhimeng, Cao, Huabao, Fu, Yuxi, Wang, Xiaodong, Wang, Xiao, Mu, Jie, Zuo, Yanlei, Riconda, C., Weber, S., and Su, Jingqin
- Subjects
Physics - Optics - Abstract
We present a novel approach named pumping ionizing gating (PIG) for the generation of isolated attosecond pulses (IAPs). In this regime, a short laser is used to ionize a pre-existing gas grating, creating a fast-extending plasma grating (FEPG) having an ionization front propagating with the velocity of light. A low-intensity, long counterpropagating pump pulse is then reflected by a very narrow region of the ionization front, only where the Bragg condition for resonant reflection is satisfied. Consequently, the pump reflection is confined within a sub-cycle region called the PIG, and forms a wide-band coherent IAP in combination with the frequency up-conversion effect due to the plasma gradient. This approach results in a new scheme to generate IAPs from long picosecond pump pulses. Three-dimensional (3D) simulations show that a 1.6-ps, 1-μm pump pulse can be used to generate a 330 as laser pulse with a peak intensity approximately 33 times that of the pump and a conversion efficiency of around 0.1%. These results highlight the potential of the PIG method for generating IAPs with high conversion efficiency and peak intensity., Comment: Provides a new way to generate isolated attosecond pulses (IAPs) with a picosecond pump, which has the potential to boost the IAP energy to the joule level
- Published
- 2022
228. Radio continuum and OH line emission of high-z OH megamaser galaxies
- Author
-
Wu, Zhongzu, Sotnikova, Yu. V., Zhang, Bo, Mufakharov, T., Zhu, Ming, Jiang, Peng, Chen, Yongjun, Shen, Zhiqiang, Sun, Chun, Peng, Hao, and Wu, Hong
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
We present a study of the arcsecond-scale radio continuum and OH line emission of a sample of known OH megamaser (OHM) galaxies with $z \geq$ 0.15 using archival Very Large Array (VLA) data, together with the results of our pilot Five-hundred-meter Aperture Spherical radio Telescope (FAST) observations of 12 of these OHM galaxies. The arcsecond-scale resolution images show that the OH emission is distributed in one compact structure and spatially associated with radio continuum emission. Furthermore, nearly all the fitted components are likely smaller than the beam size ($\sim$ 1.4"), which indicates that the broad OH line profiles of these sources originated from one masing region or that more components are distributed on sub-arcsecond scales. The radio parameters, including brightness temperature, spectral index, and q-index, show no significant differences with the low-redshift OHM galaxies, which have significantly lower OH line luminosities. Because these parameters are indicators of the central power sources (AGN, starburst, or both), our results indicate that the presence of a radio AGN in the nuclei may not be essential for the formation of OH emission. Over 1/3 of the OHMs in this sample (6/17) show possible variable features likely caused by interstellar scintillation due to small angular sizes. We might underestimate this value because these sources are associated with this sample's highest OH line flux densities. Those with low OH line flux densities might need higher-sensitivity observations to study the variabilities. These results support the compact nature of OH maser emission and a starburst origin for the OHMs in our selected sample., Comment: 25 pages, 7 figures, accepted by A&A
- Published
- 2022
229. Research on Digital Requirements Management for Nuclear Safety-Class DCS System Design
- Author
-
Liu, Ying, Zhang, Xu, He, Ming-jing, Peng, Hao, and Li, Xi; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
230. Research on Decoupling and Isolation Design of Nuclear Safety-Class DCS
- Author
-
Yao, Ying-fan, Zang, Kai-yu, Yang, Hao-qin, Peng, Hao, Zhang, De-qian, and Liu, Quan-dong; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
231. Design and Implementation of Security Audit System for Nuclear Safety-Class DCS
- Author
-
Feng, Shi-man, Zhao, Wen-yue, Peng, Hao, Luo, Xiao-jun, and Yao, Ying-fan; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
232. Application of CNN-LSTM in Predicting the Status of Control Drive Plug-Ins
- Author
-
Ge, Zhen-di, Yao, Zhang, Peng, Hao, Yang, Zheng-ji, Wen, Yi, and Zhu, Yu-lin; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
233. Nuclear Safety-Class DCS Design Based on Simulation with Independent Stations of Same-Configuration Technology
- Author
-
Zhang, Xu, Deng, Xiao-jun, Yang, Jing-hua, Peng, Hao, and Pu, Wen-de; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
234. Research on Improving the Engineering Design of Nuclear Safety-Class DCS Based on Process Optimization
- Author
-
Zhang, Xu, Feng, Shi-man, Peng, Hao, Chen, Shi-yong, Liu, Quan-dong, and Pu, Wen-de; edited by Gu, Pengfei, Xu, Yang, Chen, Weihua, Wang, Zhongqiu, Sun, Yongbin, and Liu, Zheming
- Published
- 2024
235. Field Test Research on Bit Speed Increase of Risk Exploration Well in Daqing Paleo Central Uplift Zone
- Author
-
Chang, Lei, Yang, Li-jing, Li, Ji-feng, Zhao, Ying-nan, Wang, Hong-ying, and Wang, Peng-hao; edited by Lin, Jia'en
- Published
- 2024
236. Improvement of corrosion resistance in NaOH solution and glass forming ability of as-cast Mg-based bulk metallic glasses by microalloying
- Author
-
Peng Hao, Li Shuangshou, and Huang Tianyou
- Subjects
Mg-Ni-based alloy, glass forming ability, corrosion resistance, Technology, Manufactures, TS1-2301 - Abstract
The influences of the addition of Ag on the glass forming ability (GFA) and corrosion behavior were investigated in the Mg-Ni-based alloy system by X-ray diffraction (XRD) and electrochemical polarization in 0.1 mol/L NaOH solution. Results show that the GFA of the Mg-Ni-based BMGs can be improved dramatically by the addition of an appropriate amount of Ag, and that the added Ag improves the corrosion resistance of the Mg-Ni-based bulk metallic glass. The large difference in atomic size and the large negative mixing enthalpy in the alloy system contribute to the high GFA. The added Ag increases the forming speed and the stability of the passive film, which helps to decrease the passivation current density and to improve the corrosion resistance of the Mg-Ni-based bulk metallic glass.
- Published
- 2011
237. Finite Element Analysis for the Self-Loosening Behavior of the Bolted Joint with a Superelastic Shape Memory Alloy
- Author
-
Xiangjun Jiang, Jin Huang, Yongkun Wang, Baotong Li, Jingli Du, and Peng Hao
- Subjects
shape memory alloys, finite element modeling, ratcheting, self-loosening of bolt, Technology, Electrical engineering. Electronics. Nuclear engineering, TK1-9971, Engineering (General). Civil engineering (General), TA1-2040, Microscopy, QH201-278.5, Descriptive and experimental mechanics, QC120-168.85 - Abstract
A macroscopic constitutive model is proposed in this research to reproduce the uniaxial transition ratcheting behaviors of the superelastic shape memory alloy (SMA) undergoing cyclic loading, based on the cosine-type phase transition equation with the initial martensite evolution coefficient that provides the predictive residual martensite accumulation evolution and the nonlinear feature of the hysteresis loop. The calculated results are compared with the experimental results to show the validity of the present computational procedure in transition ratcheting. Finite element implementation for the self-loosening behavior of the superelastic SMA bolt is then carried out based on the proposed constitutive model to analyze the curves of the stress-strain response of the bolt bar, the clamping-force reduction law, the dissipation-energy change law of the bolted joint for different external loading cases, and the preload force of the bolt.
- Published
- 2018
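The listing does not reproduce the cosine-type phase transition equation the model builds on. For orientation only, the classical Liang-Rogers cosine kinetics for the forward (austenite-to-martensite) transformation takes the form below; the coefficients a_M and b_M and the initial fraction \xi_0 follow that classical model and are not necessarily the paper's exact choices.

    \xi = \frac{1-\xi_0}{2}\,\cos\!\left[a_M\,(T - M_f) + b_M\,\sigma\right] + \frac{1+\xi_0}{2},
    \qquad a_M = \frac{\pi}{M_s - M_f},

where \xi is the martensite volume fraction, \sigma the applied stress, T the temperature, and M_s, M_f the martensite start and finish temperatures.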
238. Investigation on Microstructure of Beetle Elytra and Energy Absorption Properties of Bio-Inspired Honeycomb Thin-Walled Structure under Axial Dynamic Crushing
- Author
-
Jianxun Du and Peng Hao
- Subjects
elytra ,microstructure ,impact loading ,aluminum alloy ,hierarchy order ,Chemistry ,QD1-999 - Abstract
The beetle elytron must be lightweight so the beetle can fly easily, yet stiff enough to protect its body and hind wings from outside damage. The honeycomb sandwich structure inside the elytron meets both requirements. In the present work, the microstructure of the beetle elytron, including its biological layers and thin-walled honeycombs, is observed by scanning electron microscopy and discussed. A new bionic honeycomb structure (BHS) with different hierarchy orders of filling cells, inspired by the internal structure of the elytron, is established. The energy absorption of bionic models with different filling cell sizes is then compared using the nonlinear finite element software LS-DYNA (Livermore Software Technology Corp., Livermore, CA, USA). Numerical results show that the absorbed energy of the bionic honeycomb structures increases markedly with filling cell size. The findings indicate that the second-order bionic honeycomb structure clearly outperforms conventional structures filled with honeycombs and shows great potential for novel energy absorption equipment.
- Published
- 2018
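The abstract reports absorbed energy without defining the metric; in axial-crushing studies of this kind, the standard measures are the total energy absorption EA, the specific energy absorption SEA, and the mean crushing force, defined as follows (these definitions are supplied for context, not quoted from the paper):

    EA = \int_0^{d} F(x)\,\mathrm{d}x, \qquad
    SEA = \frac{EA}{m}, \qquad
    F_{\mathrm{mean}} = \frac{EA}{d},

where F(x) is the crushing force, d the crushing distance, and m the mass of the structure.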
239. Self-Supervised Continual Graph Learning in Adaptive Riemannian Spaces
- Author
-
Sun, Li, Ye, Junda, Peng, Hao, Wang, Feiyang, and Yu, Philip S.
- Subjects
Computer Science - Machine Learning - Abstract
Continual graph learning routinely finds its role in a variety of real-world applications where graph data with different tasks arrive sequentially. Despite the success of prior works, it still faces great challenges. On the one hand, existing methods work in zero-curvature Euclidean space and largely ignore the fact that curvature varies over the incoming graph sequence. On the other hand, continual learners in the literature rely on abundant labels, but labeling graphs in practice is particularly hard, especially for graphs that emerge continuously on the fly. To address these challenges, we explore a challenging yet practical problem: self-supervised continual graph learning in adaptive Riemannian spaces. In this paper, we propose a novel self-supervised Riemannian Graph Continual Learner (RieGrace). In RieGrace, we first design an Adaptive Riemannian GCN (AdaRGCN), a unified GCN coupled with a neural curvature adapter, so that the Riemannian space is shaped by the learnt curvature adaptive to each graph. Then, we present a Label-free Lorentz Distillation approach, in which we create a teacher-student AdaRGCN pair for the graph sequence. The student successively performs intra-distillation from itself and inter-distillation from the teacher so as to consolidate knowledge without catastrophic forgetting. In particular, we propose a theoretically grounded Generalized Lorentz Projection for contrastive distillation in Riemannian space. Extensive experiments on benchmark datasets show the superiority of RieGrace, and we additionally investigate how curvature changes over the graph sequence., Comment: Accepted by AAAI 2023 (Main Track), 9 pages, 4 figures
- Published
- 2022
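The Lorentz (hyperboloid) model underlying the distillation can be made concrete with the standard lift of Euclidean features onto a hyperboloid of given curvature. The Python sketch below shows only that standard construction, not the paper's Generalized Lorentz Projection; shapes and names are illustrative.

    import torch

    def lift_to_lorentz(x: torch.Tensor, c: float = 1.0) -> torch.Tensor:
        # Place features on the hyperboloid of curvature -c, i.e. the set of
        # points y with Lorentz inner product -y_0^2 + ||y_1:||^2 = -1/c.
        y0 = torch.sqrt(1.0 / c + (x * x).sum(dim=-1, keepdim=True))
        return torch.cat([y0, x], dim=-1)

    # Example: map 16-dimensional node embeddings onto a curvature -0.5 hyperboloid.
    z = lift_to_lorentz(torch.randn(32, 16), c=0.5)  # -> shape [32, 17]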
240. Anomalous magnetohydrodynamics with temperature-dependent electric conductivity and application to the global polarization
- Author
-
Peng, Hao-Hao, Wu, Sihao, Wang, Ren-jie, She, Duan, and Pu, Shi
- Subjects
High Energy Physics - Phenomenology ,Nuclear Theory - Abstract
We derive solutions of relativistic anomalous magnetohydrodynamics with longitudinal Bjorken boost invariance and transverse electromagnetic fields in the presence of a temperature- or energy-density-dependent electric conductivity. We consider equations of state in the high-temperature limit or in the high chiral chemical potential limit. We obtain both perturbative analytic solutions up to order $\hbar$ and numerical solutions for our configurations of initial electromagnetic fields and Bjorken flow velocity. Our results show that a temperature- or energy-density-dependent electric conductivity plays an important role in the decay of the energy density and electromagnetic fields. We also apply our results to the splitting of the global polarization of $\Lambda$ and $\bar{\Lambda}$ hyperons induced by the magnetic fields. Our results for the splitting of the global polarization disagree with the experimental data in low-energy collisions, which implies that the contribution from the gradient of the chemical potential may dominate there.
- Published
- 2022
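As a point of reference for the field-free limit these solutions reduce to, ideal Bjorken flow with a conformal equation of state gives the familiar power-law decay of the energy density (the electromagnetic source terms and the conductivity $\sigma(T)$ modify this baseline in the paper's setting):

    \frac{\mathrm{d}e}{\mathrm{d}\tau} + \frac{e + p}{\tau} = 0,
    \qquad p = \frac{e}{3}
    \;\Longrightarrow\;
    e(\tau) = e_0 \left(\frac{\tau_0}{\tau}\right)^{4/3}.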
241. Recent Advancements of Artificial Intelligence in Particle Therapy
- Author
-
Peng, Hao, Wu, Chao, Nguyen, Dan, Schuemann, Jan, Mairani, Andrea, Pu, Yuehu, and Jiang, Steve
- Subjects
Physics - Medical Physics - Abstract
We are in a golden age of progress in artificial intelligence (AI). Radiotherapy, due to its technology-intensive nature as well as direct human-machine interactions, is perfectly suited to benefit from AI to enhance accuracy and efficiency. Over the past few years, the vast majority of AI research has been published in the field of photon therapy, while applications of AI specifically targeted at particle therapy remain scarcely investigated. There are two distinct differences between photon therapy and particle therapy: beam interaction physics (photons vs. charged particles) and beam delivery mode (e.g. IMRT/VMAT vs. pencil beam scanning). As a result, different strategies of AI deployment are required for these two radiotherapy modalities. In this article, we present a comprehensive survey of the recent literature focusing exclusively on AI-powered particle therapy. Six major aspects are covered: treatment planning, dose calculation, range and dose verification, image guidance, quality assurance, and adaptive replanning. A number of perspectives, as well as potential challenges and common pitfalls, are also discussed.
- Published
- 2022
242. MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction
- Author
-
Wang, Xiaozhi, Chen, Yulin, Ding, Ning, Peng, Hao, Wang, Zimu, Lin, Yankai, Han, Xu, Hou, Lei, Li, Juanzi, Liu, Zhiyuan, Li, Peng, and Zhou, Jie
- Subjects
Computer Science - Computation and Language - Abstract
The diverse relationships among real-world events, including coreference, temporal, causal, and subevent relations, are fundamental to understanding natural language. However, two drawbacks of existing datasets limit event relation extraction (ERE) tasks: (1) Small scale. Due to the annotation complexity, the scale of existing datasets is limited, which is insufficient to train and evaluate data-hungry models. (2) Absence of unified annotation. Different types of event relations naturally interact with each other, but existing datasets only cover limited relation types at once, which prevents models from taking full advantage of relation interactions. To address these issues, we construct a unified large-scale human-annotated ERE dataset, MAVEN-ERE, with improved annotation schemes. It contains 103,193 event coreference chains, 1,216,217 temporal relations, 57,992 causal relations, and 15,841 subevent relations, larger than existing datasets for all ERE tasks by at least an order of magnitude. Experiments show that ERE on MAVEN-ERE is quite challenging, and that considering relation interactions with joint learning can improve performance. The dataset and source code can be obtained from https://github.com/THU-KEG/MAVEN-ERE., Comment: Accepted at EMNLP 2022. Camera-ready version
- Published
- 2022
243. COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
- Author
-
Peng, Hao, Wang, Xiaozhi, Hu, Shengding, Jin, Hailong, Hou, Lei, Li, Juanzi, Liu, Zhiyuan, and Liu, Qun
- Subjects
Computer Science - Computation and Language - Abstract
Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only evaluate the factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind text, designing probes for it is hard. Inspired by knowledge representation schemata, we comprehensively evaluate the conceptual knowledge of PLMs with three tasks that probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in context, respectively. For these tasks, we collect and annotate 24k data instances covering 393 concepts, forming COPEN, a COnceptual knowledge Probing bENchmark. Extensive experiments on PLMs of different sizes and types show that existing PLMs systematically lack conceptual knowledge and suffer from various spurious correlations. We believe this is a critical bottleneck for realizing human-like cognition in PLMs. COPEN and our code are publicly released at https://github.com/THU-KEG/COPEN., Comment: Accepted by EMNLP 2022
- Published
- 2022
244. How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
- Author
-
Hassid, Michael, Peng, Hao, Rotem, Daniel, Kasai, Jungo, Montero, Ivan, Smith, Noah A., and Schwartz, Roy
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
The attention mechanism is considered the backbone of the widely-used Transformer architecture. It contextualizes the input by computing input-specific attention matrices. We find that this mechanism, while powerful and elegant, is not as important as typically thought for pretrained language models. We introduce PAPA, a new probing method that replaces the input-dependent attention matrices with constant ones -- the average attention weights over multiple inputs. We use PAPA to analyze several established pretrained Transformers on six downstream tasks. We find that without any input-dependent attention, all models achieve competitive performance -- an average relative drop of only 8% from the probing baseline. Further, little or no performance drop is observed when replacing half of the input-dependent attention matrices with constant (input-independent) ones. Interestingly, we show that better-performing models lose more from applying our method than weaker models, suggesting that the utilization of the input-dependent attention mechanism might be a factor in their success. Our results motivate research on simpler alternatives to input-dependent attention, as well as on methods for better utilization of this mechanism in the Transformer architecture., Comment: Findings of EMNLP 2022
- Published
- 2022
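The probe lends itself to a compact sketch: average the attention maps recorded over many inputs into one constant matrix, then contextualize values with that matrix instead of the input-dependent softmax. The Python below assumes pre-extracted maps padded to a common length; PAPA's exact aggregation and evaluation pipeline may differ.

    import torch

    def constant_attention(attn_maps: list[torch.Tensor]) -> torch.Tensor:
        # Average per-input attention maps (each [heads, seq, seq], assumed
        # padded to a common length) into one input-independent matrix.
        return torch.stack(attn_maps).mean(dim=0)

    def apply_constant_attention(v: torch.Tensor, const_attn: torch.Tensor) -> torch.Tensor:
        # Mix the value vectors with the constant matrix in place of
        # softmax(QK^T / sqrt(d)): [heads, seq, seq] @ [heads, seq, d].
        return const_attn @ v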
245. Ranking-based Group Identification via Factorized Attention on Social Tripartite Graph
- Author
-
Yang, Mingdai, Liu, Zhiwei, Yang, Liangwei, Liu, Xiaolong, Wang, Chen, Peng, Hao, and Yu, Philip S.
- Subjects
Computer Science - Social and Information Networks ,Computer Science - Artificial Intelligence - Abstract
Due to the proliferation of social media, a growing number of users search for and join group activities in their daily life. This creates a need to study the ranking-based group identification (RGI) task, i.e., recommending groups to users. The major challenge in this task is how to effectively and efficiently leverage both the item interactions and group participation in users' online behavior. Though recent Graph Neural Networks (GNNs) succeed in simultaneously aggregating social and user-item interaction information, they nevertheless fail to comprehensively resolve the RGI task. In this paper, we propose a novel GNN-based framework named Contextualized Factorized Attention for Group identification (CFAG). We devise tripartite graph convolution layers to aggregate information from different types of neighborhoods among users, groups, and items. To cope with the data sparsity issue, we devise a novel propagation augmentation (PA) layer based on our proposed factorized attention mechanism. PA layers efficiently learn the relatedness of non-neighbor nodes to improve information propagation to users. Experimental results on three benchmark datasets verify the superiority of CFAG. Additional detailed investigations demonstrate the effectiveness of the proposed framework., Comment: 9 pages. Accepted by WSDM'23. Github: https://github.com/mdyfrank/CFAG
- Published
- 2022
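A generic flavor of the tripartite aggregation can be sketched as one propagation step in which each user pools messages from joined groups and interacted items. This is an illustrative message-passing step over a user-group-item graph under assumed shapes, not CFAG's factorized-attention layer.

    import torch

    def tripartite_user_update(u, g, i, A_ug, A_ui, W_g, W_i):
        # u: [U, d] users, g: [G, d] groups, i: [I, d] items;
        # A_ug: [U, G], A_ui: [U, I] are float 0/1 membership/interaction matrices.
        ug = A_ug / A_ug.sum(dim=1, keepdim=True).clamp(min=1)  # mean over joined groups
        ui = A_ui / A_ui.sum(dim=1, keepdim=True).clamp(min=1)  # mean over interacted items
        return torch.relu(ug @ g @ W_g + ui @ i @ W_i + u)      # pooled messages + self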
246. Sequential Recommendation with Auxiliary Item Relationships via Multi-Relational Transformer
- Author
-
Fan, Ziwei, Liu, Zhiwei, Wang, Chen, Huang, Peijie, Peng, Hao, and Yu, Philip S.
- Subjects
Computer Science - Information Retrieval ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Sequential Recommendation (SR) models user dynamics and predicts the next preferred items based on the user history. Existing SR methods model the 'was interacted before' item-item transitions observed in sequences, which can be viewed as one item relationship. However, multiple auxiliary item relationships exist in real-world scenarios, e.g., items from similar brands or with similar content. Auxiliary item relationships describe item-item affinities in multiple different semantics and alleviate the long-standing cold start problem in recommendation, yet modeling them in SR remains a significant challenge. To simultaneously model high-order item-item transitions in sequences and auxiliary item relationships, we propose a Multi-relational Transformer capable of modeling auxiliary item relationships for SR (MT4SR). Specifically, we first propose a novel self-attention module that incorporates arbitrary item relationships and weights them accordingly. Second, we regularize intra-sequence item relationships with a novel regularization module that supervises the attention computation. Third, for inter-sequence item relationship pairs, we introduce a novel inter-sequence related-items modeling module. Finally, we conduct experiments on four benchmark datasets and demonstrate the effectiveness of MT4SR over state-of-the-art methods, together with improvements on the cold start problem. The code is available at https://github.com/zfan20/MT4SR., Comment: Accepted to BigData 2022. The code is at https://github.com/zfan20/MT4SR
- Published
- 2022
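Relation-biased self-attention of the kind the abstract describes can be sketched by shifting the attention logits with a learned bias per item-item relation. The Python below is a generic sketch under assumed shapes (one scalar bias per relation type), not MT4SR's exact module.

    import torch

    def relation_aware_attention(q, k, v, rel_ids, rel_bias):
        # q, k, v: [seq, d]; rel_ids: [seq, seq] integer relation type per item
        # pair (e.g. same brand, similar content); rel_bias: [num_relations]
        # learned scalars.
        logits = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        logits = logits + rel_bias[rel_ids]  # shift logits by the pair's relation bias
        return torch.softmax(logits, dim=-1) @ v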
247. DAGAD: Data Augmentation for Graph Anomaly Detection
- Author
-
Liu, Fanzhen, Ma, Xiaoxiao, Wu, Jia, Yang, Jian, Xue, Shan, Beheshti, Amin, Zhou, Chuan, Peng, Hao, Sheng, Quan Z., and Aggarwal, Charu C.
- Subjects
Computer Science - Machine Learning - Abstract
Graph anomaly detection aims to distinguish abnormal nodes that behave differently from the benign ones, which account for the majority of graph-structured instances. Despite receiving increasing attention from both academia and industry, existing research on this task still suffers from two critical issues when learning informative anomalous behavior from graph data. For one thing, anomalies are usually hard to capture because of their subtle abnormal behavior and the shortage of background knowledge about them, which causes severe anomalous sample scarcity. Meanwhile, the overwhelming majority of objects in real-world graphs are normal, bringing the class imbalance problem as well. To bridge these gaps, this paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs, equipped with three specially designed modules: 1) an information fusion module employing graph neural network encoders to learn representations, 2) a graph data augmentation module that fertilizes the training set with generated samples, and 3) an imbalance-tailored learning module to discriminate the distributions of the minority (anomalous) and majority (normal) classes. A series of experiments on three datasets proves that DAGAD outperforms ten state-of-the-art baseline detectors on various widely used metrics, together with an extensive ablation study validating the strength of the proposed modules., Comment: Regular paper accepted by the 22nd IEEE International Conference on Data Mining (ICDM 2022)
- Published
- 2022
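The imbalance-tailored learning module targets the same problem as the common inverse-frequency reweighting of the classification loss. The Python sketch below shows that common recipe as a stand-in, not DAGAD's own module.

    import torch
    import torch.nn.functional as F

    def imbalance_weighted_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Weight each class by inverse frequency so the rare anomalous class
        # is not swamped by the abundant normal class.
        counts = torch.bincount(labels, minlength=logits.shape[1]).float()
        weights = counts.sum() / (counts.clamp(min=1) * logits.shape[1])
        return F.cross_entropy(logits, labels, weight=weights)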
248. Modeling Context With Linear Attention for Scalable Document-Level Translation
- Author
-
Wu, Zhaofeng, Peng, Hao, Pappas, Nikolaos, and Smith, Noah A.
- Subjects
Computer Science - Computation and Language - Abstract
Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents as their attention layers have quadratic complexity in the sequence length. Recent efforts on efficient attention improve scalability, but their effect on document translation remains unexplored. In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. We evaluate the model on IWSLT 2015 and OpenSubtitles 2018 against the transformer, demonstrating substantially increased decoding speed on long sequences with similar or better BLEU scores. We show that sentential gating further improves translation quality on IWSLT., Comment: Findings of EMNLP 2022
- Published
- 2022
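Linear attention of the kind investigated here replaces softmax(QK^T)V, which is quadratic in sequence length, with phi(Q)(phi(K)^T V) for a positive feature map phi, which is linear. The Python sketch below uses the common elu+1 feature map as a stand-in (Peng et al., 2021, train random features instead) and omits the sentential gate.

    import torch

    def linear_attention(q, k, v):
        # q, k: [seq, d]; v: [seq, d_v]. phi(x) = elu(x) + 1 keeps features positive.
        fq = torch.nn.functional.elu(q) + 1
        fk = torch.nn.functional.elu(k) + 1
        kv = fk.transpose(-2, -1) @ v                            # [d, d_v], built once
        z = fq @ fk.sum(dim=-2, keepdim=True).transpose(-2, -1)  # [seq, 1] normalizer
        return (fq @ kv) / z.clamp(min=1e-6)                     # [seq, d_v]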
249. Transparency Helps Reveal When Language Models Learn Meaning
- Author
-
Wu, Zhaofeng, Merrill, William, Peng, Hao, Beltagy, Iz, and Smith, Noah A.
- Subjects
Computer Science - Computation and Language - Abstract
Many current NLP systems are built from language models trained to optimize unsupervised objectives on large amounts of raw text. Under what conditions might such a procedure acquire meaning? Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations (i.e., languages with strong transparency), both autoregressive and masked language models successfully learn to emulate semantic relations between expressions. However, when denotations are changed to be context-dependent with the language otherwise unmodified, this ability degrades. Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well. We show this failure relates to the context-dependent nature of natural language form-meaning mappings., Comment: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2023. Author's final version (pre-MIT Press publication)
- Published
- 2022
250. Complexity-Based Prompting for Multi-Step Reasoning
- Author
-
Fu, Yao, Peng, Hao, Sabharwal, Ashish, Clark, Peter, and Khot, Tushar
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
We study the task of prompting large-scale language models to perform multi-step reasoning. Existing work shows that when prompted with a chain of thoughts (CoT), sequences of short sentences describing intermediate reasoning steps towards a final answer, large language models can generate new reasoning chains and predict answers for new inputs. A central question is which reasoning examples make the most effective prompts. In this work, we propose complexity-based prompting, a simple and effective example selection scheme for multi-step reasoning. We show that prompts with higher reasoning complexity, i.e., chains with more reasoning steps, achieve substantially better performance on multi-step reasoning tasks over strong baselines. We further extend our complexity-based criteria from prompting (selecting inputs) to decoding (selecting outputs), where we sample multiple reasoning chains from the model, then choose the majority of generated answers from complex reasoning chains (over simple chains). When used to prompt GPT-3 and Codex, our approach substantially improves multi-step reasoning accuracy and achieves new state-of-the-art (SOTA) performance on three math benchmarks (GSM8K, MultiArith, and MathQA) and two BigBenchHard tasks (Date Understanding and Penguins), with an average +5.3 and up to +18 accuracy improvements. Compared with existing example selection schemes like manual tuning or retrieval-based selection, selection based on reasoning complexity is intuitive, easy to implement, and annotation-efficient. Further results demonstrate the robustness of performance gains from complex prompts under format perturbation and distribution shift., Comment: Preprint
- Published
- 2022
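Both halves of the method, selecting the most complex annotated examples as the prompt and voting only over the most complex sampled chains, reduce to a few lines. The Python below counts reasoning steps by newlines as one simple proxy; the names and the step-counting heuristic are illustrative.

    from collections import Counter

    def select_complex_examples(examples, k=8):
        # Keep the k annotated {"question", "chain", "answer"} examples whose
        # chains contain the most reasoning steps.
        return sorted(examples, key=lambda e: e["chain"].count("\n"), reverse=True)[:k]

    def complexity_voted_answer(samples, top=5):
        # samples: list of (chain_text, answer) pairs drawn from the model.
        # Majority-vote over answers, restricted to the `top` most complex chains.
        ranked = sorted(samples, key=lambda s: s[0].count("\n"), reverse=True)[:top]
        return Counter(answer for _, answer in ranked).most_common(1)[0][0]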