Author: "Zhu Zihao" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhu Zihao"' showing total 500 results

1. EAIRiskBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents

Author: Zhu, Zihao, Wu, Bingzhe, Zhang, Zhengyou, Han, Lei, and Wu, Baoyuan
Subjects: Computer Science - Artificial Intelligence
Abstract: Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. The emergence of foundation models as the "brain" of EAI agents for high-level task planning has shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. For instance, a housekeeping robot lacking sufficient risk awareness might place a metal container in a microwave, potentially causing a fire. To address these critical safety concerns, comprehensive pre-deployment risk assessments are imperative. This study introduces EAIRiskBench, a novel framework for automated physical risk assessment in EAI scenarios. EAIRiskBench employs a multi-agent cooperative system that leverages various foundation models to generate safety guidelines, create risk-prone scenarios, perform task planning, and evaluate safety systematically. Utilizing this framework, we construct EAIRiskDataset, comprising diverse test cases across various domains, encompassing both textual and visual scenarios. Our comprehensive evaluation of state-of-the-art foundation models reveals alarming results: all models exhibit high task risk rates (TRR), with an average of 95.75% across all evaluated models. To address these challenges, we further propose two prompting-based risk mitigation strategies. While these strategies demonstrate some efficacy in reducing TRR, the improvements are limited, still indicating substantial safety concerns. This study provides the first large-scale assessment of physical risk awareness in EAI agents. Our findings underscore the critical need for enhanced safety measures in EAI systems and provide valuable insights for future research directions in developing safer embodied artificial intelligence systems.
Published: 2024

2. BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning

Author: Wu, Baoyuan, Chen, Hongrui, Zhang, Mingda, Zhu, Zihao, Wei, Shaokui, Yuan, Danni, Zhu, Mingli, Wang, Ruotong, Liu, Li, and Shen, Chao
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: As an emerging approach to explore the vulnerability of deep neural networks (DNNs), backdoor learning has attracted increasing interest in recent years, and many seminal backdoor attack and defense algorithms are being developed successively or concurrently, in the status of a rapid arms race. However, mainly due to the diverse settings, and the difficulties of implementation and reproducibility of existing works, there is a lack of a unified and standardized benchmark of backdoor learning, causing unfair comparisons or unreliable conclusions (e.g., misleading, biased or even false conclusions). Consequently, it is difficult to evaluate the current progress and design the future development roadmap of this literature. To alleviate this dilemma, we build a comprehensive benchmark of backdoor learning called BackdoorBench. Our benchmark makes three valuable contributions to the research community. 1) We provide an integrated implementation of state-of-the-art (SOTA) backdoor learning algorithms (currently including 20 attack and 32 defense algorithms), based on an extensible modular-based codebase. 2) We conduct comprehensive evaluations with 5 poisoning ratios, based on 4 models and 4 datasets, leading to 11,492 pairs of attack-against-defense evaluations in total. 3) Based on the above evaluations, we present abundant analysis from 10 perspectives via 18 useful analysis tools, and provide several inspiring insights about backdoor learning. We hope that our efforts could build a solid foundation of backdoor learning to facilitate researchers to investigate existing algorithms, develop more innovative algorithms, and explore the intrinsic mechanism of backdoor learning. Finally, we have created a user-friendly website at http://backdoorbench.com, which collects all important information of BackdoorBench, including codebase, docs, leaderboard, and model zoo.
Comment: Substantial extensions based on our previous conference version "Backdoorbench: A comprehensive benchmark of backdoor learning" published at NeurIPS D&B Track 2022. 20 backdoor attack algorithms, 32 backdoor defense algorithms, 11000+ pairs of attack-against-defense evaluations, 10 analyses, 18 analysis tools
Published: 2024

3. LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

Author: Zhu, Zihao, Tao, Tianli, Tao, Yitian, Deng, Haowen, Cai, Xinyi, Wu, Gaofeng, Wang, Kaidong, Tang, Haifeng, Zhu, Lixuan, Gu, Zhuoyang, Huang, Jiawei, Shen, Dinggang, and Zhang, Han
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infant brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets with missing time points. This limitation significantly impedes subsequent neuroscience and clinical modeling. Yet, existing deep generative models are facing difficulties in missing brain image completion, due to sparse data and the nonlinear, dramatic contrast/geometric variations in the developing brain. We propose LoCI-DiffCom, a novel Longitudinal Consistency-Informed Diffusion model for infant brain image Completion, which integrates the images from preceding and subsequent time points to guide a diffusion model for generating high-fidelity missing data. Our designed LoCI module can work on highly sparse sequences, relying solely on data from two temporal points. Despite wide separation and diversity between age time points, our approach can extract individualized developmental features while ensuring context-aware consistency. Our experiments on a large infant brain MR dataset demonstrate its effectiveness with consistent performance on missing infant brain MR completion even in big gap scenarios, aiding in better delineation of early developmental trajectories.
Published: 2024

4. Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-training via Differentiable Rendering of Line Segments

Author: Takimoto, Yusuke, Takehara, Hikari, Sato, Hiroyuki, Zhu, Zihao, and Zheng, Bo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
Abstract: In the film and gaming industries, achieving a realistic hair appearance typically involves the use of strands originating from the scalp. However, reconstructing these strands from observed surface images of hair presents significant challenges. The difficulty in acquiring Ground Truth (GT) data has led state-of-the-art learning-based methods to rely on pre-training with manually prepared synthetic CG data. This process is not only labor-intensive and costly but also introduces complications due to the domain gap when compared to real-world data. In this study, we propose an optimization-based approach that eliminates the need for pre-training. Our method represents hair strands as line segments growing from the scalp and optimizes them using a novel differentiable rendering algorithm. To robustly optimize a substantial number of slender explicit geometries, we introduce 3D orientation estimation utilizing global optimization, strand initialization based on Laplace's equation, and reparameterization that leverages geometric connectivity and spatial proximity. Unlike existing optimization-based methods, our method is capable of reconstructing internal hair flow in an absolute direction. Our method exhibits robust and accurate inverse rendering, surpassing the quality of existing methods and significantly improving processing speed.
Comment: CVPR 2024
Published: 2024

5. skelemap: skeleton-based boundary growth for efficient and automated cartogram generation

Author: Wang, Yunchao, Sun, Guodao, Zhu, Zihao, Li, Tong, and Liang, Ronghua
Published: 2024

6. Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

Author: Guo, Lianghu, Tao, Tianli, Cai, Xinyi, Zhu, Zihao, Huang, Jiawei, Zhu, Lixuan, Gu, Zhuoyang, Tang, Haifeng, Zhou, Rui, Han, Siyan, Liang, Yan, Yang, Qing, Shen, Dinggang, and Zhang, Han
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, making longitudinal infant brain atlas construction and developmental trajectory delineation quite challenging. Thanks to the development of an AI-based generative model, neuroimage completion has become a powerful technique to retain as much available data as possible. However, current image completion methods usually suffer from inconsistency within each individual subject in the time dimension, compromising the overall quality. To solve this problem, our paper proposed a two-stage cascaded diffusion model, Cas-DiffCom, for dense and longitudinal 3D infant brain MRI completion and super-resolution. We applied our proposed method to the Baby Connectome Project (BCP) dataset. The experiment results validate that Cas-DiffCom achieves both individual consistency and high fidelity in longitudinal infant brain image completion. We further applied the generated infant brain images to two downstream tasks, brain tissue segmentation and developmental trajectory delineation, to declare its task-oriented potential in the neuroscience field.
Published: 2024

7. Enhanced Few-Shot Class-Incremental Learning via Ensemble Models

Author: Zhu, Mingli, Zhu, Zihao, Chen, Sihong, Chen, Chen, and Wu, Baoyuan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Few-shot class-incremental learning (FSCIL) aims to continually fit new classes with limited training data, while maintaining the performance of previously learned classes. The main challenges are overfitting the rare new training samples and forgetting old classes. While catastrophic forgetting has been extensively studied, the overfitting problem has attracted less attention in FSCIL. To tackle the overfitting challenge, we design a new ensemble model framework combined with data augmentation to boost generalization. In this way, the enhanced model works as a library storing abundant features to guarantee fast adaptation to downstream tasks. Specifically, the multi-input multi-output ensemble structure is applied with a spatial-aware data augmentation strategy, aiming at diversifying the feature extractor and alleviating overfitting in incremental sessions. Moreover, self-supervised learning is also integrated to further improve the model generalization. Comprehensive experimental results show that the proposed method can indeed mitigate the overfitting problem in FSCIL, and outperform the state-of-the-art methods.
Published: 2024

8. Multi-condensate lengths with degenerate excitation gaps in BaNi$_2$As$_2$ revealed by muon spin relaxation study

Author: Chen, Kaiwen, Zhu, Zihao, Xie, Yaofeng, Hillier, Adrian D., Lord, James S., Dai, Pengcheng, and Shu, Lei
Subjects: Condensed Matter - Superconductivity
Abstract: The recently discovered (Ba,Sr)Ni$_2$As$_2$ family provides an ideal platform for investigating the interaction between electronic nematicity and superconductivity. Here we report the muon spin relaxation ($\mu$SR) measurements on BaNi$_2$As$_2$. Transverse-field $\mu$SR experiments indicate that the temperature dependence of superfluid density is best fitted with a single-band $s$-wave model. On the other hand, the magnetic penetration depth $\lambda$ shows magnetic field dependence, which contradicts the single-band fully-gapped scenario. Zero-field $\mu$SR experiments indicate the absence of spontaneous magnetic field in the superconducting state, showing the preservation of time-reversal symmetry in the superconducting state. Our $\mu$SR experiments suggest that BaNi$_2$As$_2$ is a fully-gapped multiband superconductor. The superconducting gap amplitudes of each band are nearly the same while different bands exhibit different coherence lengths. The present work helps to elucidate the controversial superconducting property of this parent compound, paving the way for further research on doping the system with Sr to enhance superconductivity.
Comment: Accepted by Phys. Rev. B
Published: 2024

9. EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

Author: Liu, Haiyang, Zhu, Zihao, Becherini, Giorgio, Peng, Yichen, Su, Mingyang, Zhou, You, Zhe, Xuefei, Iwamoto, Naoya, Zheng, Bo, and Black, Michael J.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose EMAGE, a framework to generate full-body human gestures from audio and masked gestures, encompassing facial, local body, hands, and global movements. To achieve this, we first introduce BEAT2 (BEAT-SMPLX-FLAME), a new mesh-level holistic co-speech dataset. BEAT2 combines a MoShed SMPL-X body with FLAME head parameters and further refines the modeling of head, neck, and finger movements, offering a community-standardized, high-quality 3D motion captured dataset. EMAGE leverages masked body gesture priors during training to boost inference performance. It involves a Masked Audio Gesture Transformer, facilitating joint training on audio-to-gesture generation and masked gesture reconstruction to effectively encode audio and body gesture hints. Encoded body hints from masked gestures are then separately employed to generate facial and body movements. Moreover, EMAGE adaptively merges speech features from the audio's rhythm and content and utilizes four compositional VQ-VAEs to enhance the results' fidelity and diversity. Experiments demonstrate that EMAGE generates holistic gestures with state-of-the-art performance and is flexible in accepting predefined spatial-temporal gesture inputs, generating complete, audio-synchronized results. Our code and dataset are available at https://pantomatrix.github.io/EMAGE/
Comment: Fix typos; Conflict of Interest Disclosure; CVPR Camera Ready; Project Page: https://pantomatrix.github.io/EMAGE/
Published: 2023

10. BlackboxBench: A Comprehensive Benchmark of Black-box Adversarial Attacks

Author: Zheng, Meixi, Yan, Xuanchen, Zhu, Zihao, Chen, Hongrui, and Wu, Baoyuan
Subjects: Computer Science - Cryptography and Security
Abstract: Adversarial examples are well-known tools to evaluate the vulnerability of deep neural networks (DNNs). Although lots of adversarial attack algorithms have been developed, it is still challenging in the practical scenario that the model's parameters and architectures are inaccessible to the attacker/evaluator, i.e., black-box adversarial attacks. Due to the practical importance, there has been rapid progress from recent algorithms, reflected by the quick increase in attack success rate and the quick decrease in query numbers to the target model. However, there is a lack of thorough evaluations and comparisons among these algorithms, causing difficulties in tracking the real progress, analyzing advantages and disadvantages of different technical routes, as well as designing the future development roadmap of this field. Thus, in this work, we aim at building a comprehensive benchmark of black-box adversarial attacks, called BlackboxBench. It mainly provides: 1) a unified, extensible and modular-based codebase, implementing 25 query-based attack algorithms and 30 transfer-based attack algorithms; 2) comprehensive evaluations: we evaluate the implemented algorithms against several mainstream model architectures on 2 widely used datasets (CIFAR-10 and a subset of ImageNet), leading to 14,106 evaluations in total; 3) thorough analysis and new insights, as well as analytical tools. The website and source codes of BlackboxBench are available at https://blackboxbench.github.io/ and https://github.com/SCLBD/BlackboxBench/, respectively.
Comment: 37 pages, 29 figures
Published: 2023

11. Defenses in Adversarial Machine Learning: A Survey

Author: Wu, Baoyuan, Wei, Shaokui, Zhu, Mingli, Zheng, Meixi, Zhu, Zihao, Zhang, Mingda, Chen, Hongrui, Yuan, Danni, Liu, Li, and Liu, Qingshan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Adversarial phenomenon has been widely observed in machine learning (ML) systems, especially in those using deep neural networks, describing that ML systems may produce inconsistent and incomprehensible predictions with humans at some particular cases. This phenomenon poses a serious security threat to the practical application of ML systems, and several advanced attack paradigms have been developed to explore it, mainly including backdoor attacks, weight attacks, and adversarial examples. For each individual attack paradigm, various defense paradigms have been developed to improve the model robustness against the corresponding attack paradigm. However, due to the independence and diversity of these defense paradigms, it is difficult to examine the overall robustness of an ML system against different kinds of attacks. This survey aims to build a systematic review of all existing defense paradigms from a unified perspective. Specifically, from the life-cycle perspective, we factorize a complete machine learning system into five stages, including pre-training, training, post-training, deployment, and inference stages, respectively. Then, we present a clear taxonomy to categorize and review representative defense methods at each individual stage. The unified perspective and presented taxonomies not only facilitate the analysis of the mechanism of each defense paradigm but also help us to understand connections and differences among different defense paradigms, which may inspire future research to develop more advanced, comprehensive defenses.
Comment: 21 pages, 5 figures, 2 tables, 237 reference papers
Published: 2023

12. C5: toward better conversation comprehension and contextual continuity for ChatGPT

Author: Liang, Pan, Ye, Danwei, Zhu, Zihao, Wang, Yunchao, Xia, Wang, Liang, Ronghua, and Sun, Guodao
Published: 2024

13. Superconducting Properties of La$_2$(Cu$_{1-x}$Ni$_x$)$_5$As$_3$O$_2$: A $\rm \mu$SR Study

Author: Wu, Qiong, Chen, Kaiwen, Zhu, Zihao, Tan, Cheng, Yang, Yanxing, Li, Xin, Shiroka, Toni, Chen, Xu, Guo, Jiangang, Chen, Xiaolong, and Shu, Lei
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Strongly Correlated Electrons
Abstract: We report the results of muon spin rotation and relaxation ($\rm \mu$SR) measurements on the recently discovered layered Cu-based superconducting material La$_{2}($Cu$_{1-x}$Ni$_{x}$)$_{5}$As$_{3}$O$_{2}$ ($x =$ 0.40, 0.45). Transverse-field $\rm \mu$SR experiments on both samples show that the temperature dependence of superfluid density is best described by a two-band model. The absolute values of zero-temperature magnetic penetration depth $\lambda_{\rm ab}(0)$ were found to be 427(1.7) nm and 422(1.5) nm for $x =$ 0.40 and 0.45, respectively. Both compounds are located between the unconventional and the standard BCS superconductors in the Uemura plot. No evidence of time-reversal symmetry (TRS) breaking in the superconducting state is suggested by zero-field $\rm \mu$SR measurements.
Published: 2023

14. VDC: Versatile Data Cleanser based on Visual-Linguistic Inconsistency by Multimodal Large Language Models

Author: Zhu, Zihao, Zhang, Mingda, Wei, Shaokui, Wu, Bingzhe, and Wu, Baoyuan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: The role of data in building AI systems has recently been emphasized by the emerging concept of data-centric AI. Unfortunately, in the real world, datasets may contain dirty samples, such as poisoned samples from backdoor attack, noisy labels in crowdsourcing, and even hybrids of them. The presence of such dirty samples makes the DNNs vulnerable and unreliable. Hence, it is critical to detect dirty samples to improve the quality and reliability of the dataset. Existing detectors only focus on detecting poisoned samples or noisy labels, and are often prone to weak generalization when dealing with dirty samples from other domains. In this paper, we find that a commonality of various dirty samples is visual-linguistic inconsistency between images and associated labels. To capture the semantic inconsistency between modalities, we propose versatile data cleanser (VDC) leveraging the surpassing capabilities of multimodal large language models (MLLM) in cross-modal alignment and reasoning. It consists of three consecutive modules: the visual question generation module to generate insightful questions about the image; the visual question answering module to acquire the semantics of the visual content by answering the questions with MLLM; followed by the visual answer evaluation module to evaluate the inconsistency. Extensive experiments demonstrate its superior performance and generalization to various categories and types of dirty samples. The code is available at https://github.com/zihao-ai/vdc.
Comment: Accepted to ICLR 2024
Published: 2023

15. C5: Towards Better Conversation Comprehension and Contextual Continuity for ChatGPT

Author: Liang, Pan, Ye, Danwei, Zhu, Zihao, Wang, Yunchao, Xia, Wang, Liang, Ronghua, and Sun, Guodao
Subjects: Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs), such as ChatGPT, have demonstrated outstanding performance in various fields, particularly in natural language understanding and generation tasks. In complex application scenarios, users tend to engage in multi-turn conversations with ChatGPT to keep contextual information and obtain comprehensive responses. However, human forgetting and model contextual forgetting remain prominent issues in multi-turn conversation scenarios, which challenge the users' conversation comprehension and contextual continuity for ChatGPT. To address these challenges, we propose an interactive conversation visualization system called C5, which includes Global View, Topic View, and Context-associated Q&A View. The Global View uses the GitLog diagram metaphor to represent the conversation structure, presenting the trend of conversation evolution and supporting the exploration of locally salient features. The Topic View is designed to display all the question and answer nodes and their relationships within a topic using the structure of a knowledge graph, thereby displaying the relevance and evolution of conversations. The Context-associated Q&A View consists of three linked views, which allow users to explore individual conversations deeply while providing specific contextual information when posing questions. The usefulness and effectiveness of C5 were evaluated through a case study and a user study.
Published: 2023

16. Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy

Author: Zhu, Zihao, Zhang, Mingda, Wei, Shaokui, Shen, Li, Fan, Yanbo, and Wu, Baoyuan
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Data-poisoning based backdoor attacks aim to insert backdoor into models by manipulating training datasets without controlling the training process of the target model. Existing attack methods mainly focus on designing triggers or fusion strategies between triggers and benign samples. However, they often randomly select samples to be poisoned, disregarding the varying importance of each poisoning sample in terms of backdoor injection. A recent selection strategy filters a fixed-size poisoning sample pool by recording forgetting events, but it fails to consider the remaining samples outside the pool from a global perspective. Moreover, computing forgetting events requires significant additional computing resources. Therefore, how to efficiently and effectively select poisoning samples from the entire dataset is an urgent problem in backdoor attacks. To address it, firstly, we introduce a poisoning mask into the regular backdoor training loss. We suppose that a backdoored model trained with hard poisoning samples has a stronger backdoor effect on easy ones, which can be implemented by hindering the normal training process (i.e., maximizing loss w.r.t. mask). To further integrate it with normal training process, we then propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through a min-max optimization. Specifically, the outer loop aims to achieve the backdoor attack goal by minimizing the loss based on the selected samples, while the inner loop selects hard poisoning samples that impede this goal by maximizing the loss. After several rounds of adversarial training, we finally select effective poisoning samples with high contribution. Extensive experiments on benchmark datasets demonstrate the effectiveness and efficiency of our approach in boosting backdoor attack performance.
Published: 2023
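
The min-max selection described in this abstract can be written down schematically. The following is only a sketch under stated assumptions, not the authors' exact objective: the symbols $\theta$ (model parameters), $m$ (binary poisoning mask), $\alpha$ (poisoning ratio), $N$ (dataset size), $\tilde{x}_i$ (triggered version of $x_i$), $t$ (target label), and the cardinality constraint on $m$ are all introduced here for illustration and do not appear in the record.

```latex
\min_{\theta} \;
\max_{\substack{m \in \{0,1\}^{N} \\ \|m\|_{0} = \lfloor \alpha N \rfloor}} \;
\sum_{i=1}^{N}
\Big[ (1 - m_i)\, \ell\big(f_{\theta}(x_i),\, y_i\big)
      + m_i\, \ell\big(f_{\theta}(\tilde{x}_i),\, t\big) \Big]
```

Read this way, the inner maximization over $m$ picks the samples that are hardest to backdoor, and the outer minimization over $\theta$ trains the model on that selection, which matches the inner-loop and outer-loop roles the abstract describes.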

17. Versatile Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers

Author: Wang, Ruotong, Chen, Hongrui, Zhu, Zihao, Liu, Li, and Wu, Baoyuan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security
Abstract: Deep neural networks (DNNs) can be manipulated to exhibit specific behaviors when exposed to specific trigger patterns, without affecting their performance on benign samples, dubbed backdoor attack. Currently, implementing backdoor attacks in physical scenarios still faces significant challenges. Physical attacks are labor-intensive and time-consuming, and the triggers are selected in a manual and heuristic way. Moreover, expanding digital attacks to physical scenarios faces many challenges due to their sensitivity to visual distortions and the absence of counterparts in the real world. To address these challenges, we define a novel trigger called the Visible, Semantic, Sample-Specific, and Compatible (VSSC) trigger, to be effective, stealthy and robust simultaneously, which can also be effectively deployed in the physical scenario using corresponding objects. To implement the VSSC trigger, we propose an automated pipeline comprising three modules: a trigger selection module that systematically identifies suitable triggers leveraging large language models, a trigger insertion module that employs generative models to seamlessly integrate triggers into images, and a quality assessment module that ensures the natural and successful insertion of triggers through vision-language models. Extensive experimental results and analysis validate the effectiveness, stealthiness, and robustness of the VSSC trigger. It can not only maintain robustness under visual distortions but also demonstrates strong practicality in the physical scenario. We hope that the proposed VSSC trigger and implementation approach could inspire future studies on designing more practical triggers in backdoor attacks.
Comment: 23 pages, 21 figures, 18 tables
Published: 2023

18. Immunocompatible elastomer with increased resistance to the foreign body response

Author: Zhou, Xianchi, Lu, Zhouyu, Cao, Wenzhong, Zhu, Zihao, Chen, Yifeng, Ni, Yanwen, Liu, Zuolong, Jia, Fan, Ye, Yang, Han, Haijie, Yao, Ke, Liu, Weifeng, Wang, Youxiang, Ji, Jian, and Zhang, Peng
Published: 2024

19. Genome-wide characterization of Remorin gene family and their responsive expression to abiotic stresses and plant hormone in Brassica napus

Author: Sun, Nan, Zhou, Jiale, Liu, Yanfeng, Li, Dong, Xu, Xin, Zhu, Zihao, Xu, Xuesheng, Zhan, Renhui, Zhang, Hongxia, and Wang, Limin
Published: 2024

20. Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective

Author: Wu, Baoyuan, Zhu, Zihao, Liu, Li, Liu, Qingshan, He, Zhaofeng, and Lyu, Siwei
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, which may make inconsistent or unexpected predictions with humans. Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system, such as backdoor attack occurring at the pre-training, in-training and inference stage; weight attack occurring at the post-training, deployment and inference stage; adversarial attack occurring at the inference stage. However, although these adversarial paradigms share a common goal, their developments are almost independent, and there is still no big picture of AML. In this work, we aim to provide a unified perspective to the AML community to systematically review the overall progress of this field. We first provide a general definition of AML, and then propose a unified mathematical framework covering existing attack paradigms. According to the proposed unified framework, we build a full taxonomy to systematically categorize and review existing representative methods for each paradigm. Besides, using this unified framework, it is easy to figure out the connections and differences among different attack paradigms, which may inspire future researchers to develop more advanced attack paradigms. Finally, to facilitate the viewing of the built taxonomy and the related literature in adversarial machine learning, we further provide a website, i.e., http://adversarial-ml.com, where the taxonomies and literature will be continuously updated.
Comment: 35 pages, 4 figures, 10 tables, 313 reference papers
Published: 2023

21. Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning

Author: Li, Longkang, Liang, Siyuan, Zhu, Zihao, Ding, Chris, Zha, Hongyuan, and Wu, Baoyuan
Subjects: Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing
Abstract: The permutation flow shop scheduling (PFSS), aiming at finding the optimal permutation of jobs, is widely used in manufacturing systems. When solving large-scale PFSS problems, traditional optimization algorithms such as heuristics could hardly meet the demands of both solution accuracy and computational efficiency, thus learning-based methods have recently garnered more attention. Some work attempts to solve the problems by reinforcement learning methods, which suffer from slow convergence issues during training and are still not accurate enough regarding the solutions. To that end, we propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately. Moreover, in order to extract better feature representations of input jobs, we incorporate the graph structure as the encoder. The extensive experiments reveal that our proposed model obtains significant promotion and presents excellent generalizability in large-scale problems with up to 1000 jobs. Compared to the state-of-the-art reinforcement learning method, our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average. The code is available at: https://github.com/longkangli/PFSS-IL.
Comment: 6 figures, 11 tables, AAAI 2024
Published: 2022

22. Investigation of Peptides Containing Branched-Chain Amino Acids from Arthrospira platensis Through A Peptidomics Workflow

Author: Guo, Yinuo, Wu, Linrong, Zhu, Zihao, Hou, Hu, and Wang, Yanchao
Published: 2024
Full Text: View/download PDF

23. BackdoorBench: A Comprehensive Benchmark of Backdoor Learning

Author: Wu, Baoyuan, Chen, Hongrui, Zhang, Mingda, Zhu, Zihao, Wei, Shaokui, Yuan, Danni, and Shen, Chao
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Backdoor learning is an emerging and vital topic for studying the vulnerability of deep neural networks (DNNs). Many pioneering backdoor attack and defense methods are being proposed, successively or concurrently, in the status of a rapid arms race. However, we find that the evaluations of new methods are often unthorough to verify their claims and accurate performance, mainly due to the rapid development, diverse settings, and the difficulties of implementation and reproducibility. Without thorough evaluations and comparisons, it is not easy to track the current progress and design the future development roadmap of the literature. To alleviate this dilemma, we build a comprehensive benchmark of backdoor learning called BackdoorBench. It consists of an extensible modular-based codebase (currently including implementations of 8 state-of-the-art (SOTA) attacks and 9 SOTA defense algorithms) and a standardized protocol of complete backdoor learning. We also provide comprehensive evaluations of every pair of 8 attacks against 9 defenses, with 5 poisoning ratios, based on 5 models and 4 datasets, thus 8,000 pairs of evaluations in total. We present abundant analysis from different perspectives about these 8,000 evaluations, studying the effects of different factors in backdoor learning. All codes and evaluations of BackdoorBench are publicly available at https://backdoorbench.github.io.
Comment: Accepted at NeurIPS 2022 Datasets and Benchmarks Track; 44 pages; 8 backdoor attacks; 9 backdoor defenses; 8,000 pairs of attack-defense evaluations; several analysis and 5 analysis tools
Published: 2022

24. From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering

Author: Zhu, Zihao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In order to achieve a general visual question answering (VQA) system, it is essential to learn to answer deeper questions that require compositional reasoning on the image and external knowledge. Meanwhile, the reasoning process should be explicit and explainable to understand the working mechanism of the model. It is effortless for human but challenging for machines. In this paper, we propose a Hierarchical Graph Neural Module Network (HGNMN) that reasons over multi-layer graphs with neural modules to address the above issues. Specifically, we first encode the image by multi-layer graphs from the visual, semantic and commonsense views since the clues that support the answer may exist in different modalities. Our model consists of several well-designed neural modules that perform specific functions over graphs, which can be used to conduct multi-step reasoning within and between different graphs. Compared to existing modular networks, we extend visual reasoning from one graph to more graphs. We can explicitly trace the reasoning process according to module weights and graph attentions. Experiments show that our model not only achieves state-of-the-art performance on the CRIC dataset but also obtains explicit and explainable reasoning procedures.
Published: 2022

25. VAC2: Visual Analysis of Combined Causality in Event Sequences

Author: Zhu, Sujia, Shen, Yue, Zhu, Zihao, Xia, Wang, Chang, Baofeng, Liang, Ronghua, and Sun, Guodao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Identifying causality behind complex systems plays a significant role in different domains, such as decision making, policy implementations, and management recommendations. However, existing causality studies on temporal event sequences data mainly focus on individual causal discovery, which is incapable of exploiting combined causality. To fill the absence of combined causes discovery on temporal event sequence data, eliminating and recruiting principles are defined to balance the effectiveness and controllability on cause combinations. We also leverage the Granger causality algorithm based on the reactive point processes to describe impelling or inhibiting behavior patterns among entities. In addition, we design an informative and aesthetic visual metaphor of "electrocircuit" to encode aggregated causality for ensuring our causality visualization is non-overlapping and non-intersecting. Diverse sorting strategies and aggregation layout are also embedded into our parallel-based, directed and weighted hypergraph for illustrating combined causality. Our developed combined causality visual analysis system can help users effectively explore combined causes as well as an individual cause. This interactive system supports multi-level causality exploration with diverse ordering strategies and a focus and context technique to help users obtain different levels of information abstraction. The usefulness and effectiveness of the system are further evaluated by conducting a pilot user study and two case studies on event sequence data.
Published: 2022

26. A Guide to Quantify Arabidopsis Seedling Thermomorphogenesis at Single Timepoints and by Interval Monitoring

Author: Janitza, Philipp, primary, Zhu, Zihao, additional, Anwer, Muhammad Usman, additional, van Zanten, Martijn, additional, and Delker, Carolin, additional
Published: 2024
Full Text: View/download PDF

27. BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis

Author: Liu, Haiyang, Zhu, Zihao, Iwamoto, Naoya, Peng, Yichen, Li, Zhengqing, Zhou, You, Bozkurt, Elif, and Zheng, Bo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Graphics, Computer Science - Machine Learning, Computer Science - Multimedia
Abstract: Achieving realistic, vivid, and human-like synthesized conversational gestures conditioned on multi-modal data is still an unsolved problem due to the lack of available datasets, models and standard evaluation metrics. To address this, we build Body-Expression-Audio-Text dataset, BEAT, which has i) 76 hours, high-quality, multi-modal data captured from 30 speakers talking with eight different emotions and in four different languages, ii) 32 million frame-level emotion and semantic relevance annotations. Our statistical analysis on BEAT demonstrates the correlation of conversational gestures with facial expressions, emotions, and semantics, in addition to the known correlation with audio, text, and speaker identity. Based on this observation, we propose a baseline model, Cascaded Motion Network (CaMN), which consists of above six modalities modeled in a cascaded architecture for gesture synthesis. To evaluate the semantic relevancy, we introduce a metric, Semantic Relevance Gesture Recall (SRGR). Qualitative and quantitative experiments demonstrate metrics' validness, ground truth data quality, and baseline's state-of-the-art performance. To the best of our knowledge, BEAT is the largest motion capture dataset for investigating human gestures, which may contribute to a number of different research fields, including controllable gesture synthesis, cross-modality analysis, and emotional gesture recognition. The data, code and model are available on https://pantomatrix.github.io/BEAT/.
Comment: 28 pages, 15 figures, Accepted by ECCV2022
Published: 2022

28. Spatio-temporal dynamics of net primary productivity and the economic value of Spartina alterniflora in the coastal regions of China

Author: Wei, Sijie, Zhu, Zihao, and Wang, Shoubing
Published: 2024
Full Text: View/download PDF

29. Novel pleuromutilin derivatives conjugated with phenyl-sulfide and boron-containing moieties as potent antibacterial agents against antibiotic-resistant bacteria

Author: Luo, Xinyu, Wu, Guangxu, Feng, Jing, Zhang, Jie, Fu, Hengjian, Yu, Hang, Han, Zunsheng, Nie, Wansen, Zhu, Zihao, Liu, Bo, Pan, Weidong, Li, Beibei, Wang, Yan, Zhang, Chi, Li, Tianlei, Zhang, Wenxuan, and Wu, Song
Published: 2024
Full Text: View/download PDF

30. TE-Spikformer: Temporal-enhanced spiking neural network with transformer

Author: Gao, ShouWei, Fan, XiangYu, Deng, XingYang, Hong, ZiChao, Zhou, Hao, and Zhu, ZiHao
Published: 2024
Full Text: View/download PDF

31. An elastomer with in situ generated pure zwitterionic surfaces for fibrosis-resistant implants

Author: Zhou, Xianchi, Cao, Wenzhong, Chen, Yongcheng, Zhu, Zihao, Lai, Yuxian, Liu, Zuolong, Jia, Fan, Lu, Zhouyu, Han, Haijie, Yao, Ke, Wang, Youxiang, Ji, Jian, and Zhang, Peng
Published: 2024
Full Text: View/download PDF

32. The development of fishery-photovoltaic complementary industry and the studies on its environmental, ecological and economic effects in China: A review

Author: Zhu, Zihao, Song, Zijie, Xu, Sihan, Wang, Shoubing, Chen, Xingyu, Wang, Yongshuang, and Zhu, Zhenhua
Published: 2024
Full Text: View/download PDF

33. Establishment of a workflow for high-throughput identification of anti-inflammatory peptides from sea cucumbers

Author: Jiang, Bingxue, Liu, Jinqiu, Zhu, Zihao, Fu, Linlan, Chang, Yaoguang, Wang, Yanchao, and Xue, Changhu
Published: 2024
Full Text: View/download PDF

34. Edge dislocation with [001] [formula omitted] induced positive magnetoresistance in BaTiO3/La0.66Sr0.33MnO3 heterostructure

Author: Wang, Zhaoyang, Zhu, Zihao, Yang, Hui, Sun, Fei, Zhang, Yi, Zhang, Xiaoyue, Zhang, Bangmin, and Zheng, Yue
Published: 2024
Full Text: View/download PDF

35. First principles calculation and corrosion resistance study of Fe60CrxNi40-x medium entropy alloy

Author: Wang, Wei, Wu, Dongting, Gao, Yu, Zhu, Zihao, and Zou, Yong
Published: 2024
Full Text: View/download PDF

36. Material-structure integrated design for ultra-broadband microwave metamaterial absorber

Author: Peng, Mengyue, Qin, Faxiang, Zhou, Liping, Wei, Huijie, Zhu, Zihao, and Shen, Xiaopeng
Subjects: Physics - Applied Physics, Physics - Optics
Abstract: We propose herein a method of material-structure integrated design for broadband absorption of dielectric metamaterial, which is achieved by combination of genetic algorithm and simulation platform. A multi-layered metamaterial absorber with an ultra-broadband absorption from 5.3 to 18 GHz (a relative bandwidth of as high as 109%) is realized numerically and experimentally. In addition, simulated results demonstrate the proposed metamaterial exhibits good incident angle and polarization tolerance, which also are significant criteria for practical applications. By investigating the working principle with theoretical calculation and numerical simulation, it can be found that merging of multiple resonance modes encompassing quarter-wavelength interference cancellation, spoof surface plasmon polariton mode, dielectric resonance mode and grating mode is responsible for a remarkable ultra-broadband absorption. Analysis of respective contribution of material and structure indicates that either of them plays an indispensable role in activating different resonance modes, and symphony of material and structure is essential to afford desirable target performance. The material-structure integrated design philosophy highlights the superiority of coupling material and structure and provides an effective comprehensive optimization strategy for dielectric metamaterials.
Comment: 26 pages, 8 figures
Published: 2021
Full Text: View/download PDF

37. Lignin-derived dual-function red light carbon dots for hypochlorite detection and anti-counterfeiting

Author: Chang, Yixuan, Kong, Fanwei, Zhu, Zihao, Wang, Ziai, Chen, Chunxia, Li, Xiaobai, and Ma, Hongwei
Published: 2023
Full Text: View/download PDF

38. Towards better pattern enhancement in temporal evolving set visualization

Author: Zhu, Zihao, Shen, Yue, Zhu, Sujia, Zhang, Gefei, Liang, Ronghua, and Sun, Guodao
Published: 2023
Full Text: View/download PDF

39. A wireless fluorescent sensing device for on-site closed-loop detection of hydrazine levels in the environment

Author: Zhu, Zihao, Song, Ke, Li, Xiaobai, Chen, Yu, Kong, Fanwei, Mo, Wanqi, Cheng, Zhiyong, Yang, Shilong, and Ma, Hongwei
Published: 2024
Full Text: View/download PDF

40. Covalently grafted human serum albumin coating mitigates the foreign body response against silicone implants in mice

Author: Zhou, Xianchi, Hao, Hongye, Chen, Yifeng, Cao, Wenzhong, Zhu, Zihao, Ni, Yanwen, Liu, Zuolong, Jia, Fan, Wang, Youxiang, Ji, Jian, and Zhang, Peng
Published: 2024
Full Text: View/download PDF

41. MCR-Net: A Multi-Step Co-Interactive Relation Network for Unanswerable Questions on Machine Reading Comprehension

Author: Peng, Wei, Hu, Yue, Yu, Jing, Xing, Luxi, Xie, Yuqiang, Zhu, Zihao, and Sun, Yajing
Subjects: Computer Science - Computation and Language
Abstract: Question answering systems usually use keyword searches to retrieve potential passages related to a question, and then extract the answer from passages with the machine reading comprehension methods. However, many questions tend to be unanswerable in the real world. In this case, it is significant and challenging how the model determines when no answer is supported by the passage and abstains from answering. Most of the existing systems design a simple classifier to determine answerability implicitly without explicitly modeling mutual interaction and relation between the question and passage, leading to the poor performance for determining the unanswerable questions. To tackle this problem, we propose a Multi-Step Co-Interactive Relation Network (MCR-Net) to explicitly model the mutual interaction and locate key clues from coarse to fine by introducing a co-interactive relation module. The co-interactive relation module contains a stack of interaction and fusion blocks to continuously integrate and fuse history-guided and current-query-guided clues in an explicit way. Experiments on the SQuAD 2.0 and DuReader datasets show that our model achieves a remarkable improvement, outperforming the BERT-style baselines in literature. Visualization analysis also verifies the importance of the mutual interaction between the question and passage.
Comment: Accepted to ICASSP 2021
Published: 2021

42. Intrinsic new properties of a quantum spin liquid

Author: Yang, Yanxing, Li, Xin, Tan, Cheng, Zhu, Zihao, Zhang, Jian, Ding, Zhaofeng, Wu, Qiong, Chen, Changshen, Shiroka, Toni, Xia, Yuanhua, MacLaughlin, Douglas E., Varma, Chandra M., and Shu, Lei
Subjects: Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Materials Science
Abstract: Quantum fluctuations are expected to lead to highly entangled spin-liquid states in certain two-dimensional spin-1/2 compounds. We have synthesized and measured thermodynamic properties and muon spin relaxation rates in the copper-based two-dimensional triangular-lattice spin liquids Lu$_3$Cu$_2$Sb$_3$O$_{14}$ and Lu$_3$CuZnSb$_3$O$_{14}$. The former is the least disordered of this kind discovered to date. Magnetic entropy generation at high temperatures has been ruled out after carefully correcting for the lattice specific heat. Surprisingly, roughly half of the magnetic entropy is missing down to temperatures of O(10$^{-3}$) the exchange energy, independent of magnetic field up to $g\mu_B H \gtrsim k_B\Theta_W$, where $\Theta_W$ is the Weiss temperature. The magnetic specific heat divided by temperature $C_M(T)/T$ and muon spin relaxation rate $\lambda(T)$ are both temperature-independent at low temperatures, followed by logarithmic decreases with increasing temperature. This behavior can be simply characterized by scale-invariant time-dependent fluctuations with a single parameter. Since no cooperative effects due to impurities are observed, the measured properties are intrinsic. They are evidence that in Lu$_3$Cu$_2$Sb$_3$O$_{14}$ massive quantum fluctuations lead to either a gigantic specific heat peak from singlet excitations at very low temperatures or, perhaps less likely, an extensively degenerate possibly topological singlet ground state.
Published: 2021

43. Real-time evaluating temperature-dependent interfacial shear strength of thermoplastic composites based on stress impedance effect of magnetic fibers

Author: Li, Yunlong, Feng, Tangfeng, Wang, Yunfei, Zhu, Zihao, Peng, Hua-Xin, Xu, Peng, and Qin, Faxiang
Published: 2024
Full Text: View/download PDF

44. Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering

Author: Yu, Jing, Zhu, Zihao, Wang, Yujing, Zhang, Weifeng, Hu, Yue, and Tan, Jianlong
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Knowledge-based Visual Question Answering (KVQA) requires external knowledge beyond the visible content to answer questions about an image. This ability is challenging but indispensable to achieve general VQA. One limitation of existing KVQA solutions is that they jointly embed all kinds of information without fine-grained selection, which introduces unexpected noises for reasoning the correct answer. How to capture the question-oriented and information-complementary evidence remains a key challenge to solve the problem. Inspired by the human cognition theory, in this paper, we depict an image by multiple knowledge graphs from the visual, semantic and factual views. Thereinto, the visual graph and semantic graph are regarded as image-conditioned instantiation of the factual graph. On top of these new representations, we re-formulate Knowledge-based Visual Question Answering as a recurrent reasoning process for obtaining complementary evidence from multimodal information. To this end, we decompose the model into a series of memory-based reasoning steps, each performed by a Graph-based Read, Update, and Control (GRUC) module that conducts parallel reasoning over both visual and semantic information. By stacking the modules multiple times, our model performs transitive reasoning and obtains question-oriented concept representations under the constraint of different modalities. Finally, we perform graph neural networks to infer the global-optimal answer by jointly considering all the concepts. We achieve a new state-of-the-art performance on three popular benchmark datasets, including FVQA, Visual7W-KB and OK-VQA, and demonstrate the effectiveness and interpretability of our model with extensive experiments.
Comment: Accepted at Pattern Recognition. arXiv admin note: substantial text overlap with arXiv:2006.09073
Published: 2020
Full Text: View/download PDF

45. Muon Spin Relaxation and fluctuating magnetism in the pseudogap phase of YBa$_{2}$Cu$_{3}$O$_{y}$

Author: Zhu, Zihao, Zhang, Jian, Ding, Zhaofeng, Tan, Cheng, Chen, Changsheng, Wu, Qiong, Yang, Yanxing, Bernal, Oscar O., Ho, Pei-Chun, Morris, Gerald D., Koda, Akihiro, Hillier, Adrian D., Cottrell, Stephen P., Baker, Peter J., Biswas, Pabitra K., Qian, Jun, Yao, Xin, MacLaughlin, Douglas E., and Shu, Lei
Subjects: Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Superconductivity
Abstract: We report results of a muon spin relaxation study of slow magnetic fluctuations in the pseudogap phase of underdoped single-crystalline YBa$_{2}$Cu$_{3}$O$_{y}$, $y = 6.77$ and 6.83. The dependence of the dynamic muon spin relaxation rate on applied magnetic field yields the rms magnitude~$B\mathrm{_{loc}^{rms}}$ and correlation time~$\tau_c$ of fluctuating local fields at muon sites. The observed relaxation rates do not decrease with decreasing temperature~$T$ below the pseudogap onset at $T^\ast$, as would be expected for a conventional magnetic transition; both $B\mathrm{_{loc}^{rms}}$ and $\tau_c$ are roughly constant in the pseudogap phase down to the superconducting transition. Corresponding NMR relaxation rates are estimated to be too small to be observable. Our results put strong constraints on theories of the anomalous pseudogap magnetism in YBa$_{2}$Cu$_{3}$O$_{y}$.
Published: 2020
Full Text: View/download PDF

46. DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

Author: Jiang, Xiaoze, Yu, Jing, Sun, Yajing, Qin, Zengchang, Zhu, Zihao, Hu, Yue, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Visual Dialogue task requires an agent to be engaged in a conversation with human about an image. The ability of generating detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards the design that advocates both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. Therefore, the responses contain more detailed and non-repetitive descriptions while maintaining the semantic accuracy. Furthermore, DAM is flexible to cooperate with existing visual dialogue encoders and adaptive to the encoder structures by constraining the information selection mode in DAM. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM.
Comment: Accepted by IJCAI 2020. SOLE copyright holder is IJCAI (International Joint Conferences on Artificial Intelligence)
Published: 2020

47. Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering

Author: Zhu, Zihao, Yu, Jing, Wang, Yujing, Sun, Yajing, Hu, Yue, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Fact-based Visual Question Answering (FVQA) requires external knowledge beyond visible content to answer questions about an image, which is challenging but indispensable to achieve general VQA. One limitation of existing FVQA solutions is that they jointly embed all kinds of information without fine-grained selection, which introduces unexpected noises for reasoning the final answer. How to capture the question-oriented and information-complementary evidence remains a key challenge to solve the problem. In this paper, we depict an image by a multi-modal heterogeneous graph, which contains multiple layers of information corresponding to the visual, semantic and factual features. On top of the multi-layer graph representations, we propose a modality-aware heterogeneous graph convolutional network to capture evidence from different layers that is most relevant to the given question. Specifically, the intra-modal graph convolution selects evidence from each modality and cross-modal graph convolution aggregates relevant information across different modalities. By stacking this process multiple times, our model performs iterative reasoning and predicts the optimal answer by analyzing all question-oriented evidence. We achieve a new state-of-the-art performance on the FVQA task and demonstrate the effectiveness and interpretability of our model with extensive experiments.
Published: 2020

48. Persistent spin dynamics and absence of spin freezing in the $H$-$T$ phase diagram of the 2D triangular antiferromagnet YbMgGaO$_4$

Author: Ding, Zhaofeng, Zhu, Zihao, Zhang, Jian, Tan, Cheng, Yang, Yanxing, MacLaughlin, Douglas E., and Shu, Lei
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: We report results of muon spin relaxation and rotation ($\mu$SR) experiments on the spin-liquid candidate~YbMgGaO$_{4}$. No static magnetism $\gtrsim 0.003\mu_B$ per Yb ion, ordered or disordered, is observed down to 22~mK, a factor of two lower in temperature than previous measurements. Persistent (temperature-independent) spin dynamics are observed up to 0.20~K and at least 1~kOe, thus extending previous zero-field $\mu$SR results over a substantial region of the $H$-$T$ phase diagram. Knight shift measurements in a 10-kOe transverse field reveal two lines with nearly equal amplitudes. Inhomogeneous muon depolarization in a longitudinal field, previously characterized by stretched-exponential relaxation due to spatial inhomogeneity, is fit equally well with two exponentials, also of equal amplitudes. We attribute these results to two interstitial muon sites in the unit cell, rather than disorder or other spatial distribution. Further evidence for this attribution is found from agreement between the ratio of the two measured relaxation rates and calculated mean-square local Yb$^{3+}$ dipolar fields at candidate muon sites. Zero-field data can be understood as a combination of two-exponential dynamic relaxation and quasistatic nuclear dipolar fields.
Published: 2020
Full Text: View/download PDF

49. Spinon Fermi surface spin liquid in a triangular lattice antiferromagnet NaYbSe$_2$

Author: Dai, Peng-Ling, Zhang, Gaoning, Xie, Yaofeng, Duan, Chunruo, Gao, Yonghao, Zhu, Zihao, Feng, Erxi, Tao, Zhen, Huang, Chien-Lung, Cao, Huibo, Podlesnyak, Andrey, Granroth, Garrett E., Voneshen, David, Wang, Shun, Tan, Guotai, Morosan, Emilia, Wang, Xia, Lin, Hai-Qing, Shu, Lei, Chen, Gang, Guo, Yanfeng, Lu, Xingye, and Dai, Pengcheng
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: Triangular lattice of rare-earth ions with interacting effective spin-$1/2$ local moments is an ideal platform to explore the physics of quantum spin liquids (QSLs) in the presence of strong spin-orbit coupling, crystal electric fields, and geometrical frustration. The Yb delafossites, NaYbCh$_2$ (Ch=O, S, Se) with Yb ions forming a perfect triangular lattice, have been suggested to be candidates for QSLs. Previous thermodynamics, nuclear magnetic resonance, and muon spin rotation measurements on NaYbCh$_2$ have supported the suggestion of the QSL ground states. The key signature of a QSL, the spin excitation continuum, arising from the spin quantum number fractionalization, has not been observed. Here we perform both elastic and inelastic neutron scattering measurements as well as detailed thermodynamic measurements on high-quality single-crystalline NaYbSe$_2$ samples to confirm the absence of long-range magnetic order down to 40 mK, and further reveal a clear signature of magnetic excitation continuum extending from 0.1 to 2.5 meV. The comparison between the structure of the magnetic excitation spectra and the theoretical expectation from the spinon continuum suggests that the ground state of NaYbSe$_2$ is a QSL with a spinon Fermi surface.
Comment: 18 pages, 6 figures
Published: 2020
Full Text: View/download PDF

50. Coordinating matching, rebalancing and charging of electric ride-hailing fleet under hybrid requests

Author: Yu, Xinlian, Zhu, Zihao, Mao, Haijun, Hua, Mingzhuang, Li, Dawei, Chen, Jingxu, and Xu, Hongli
Published: 2023
Full Text: View/download PDF
