Author: "Liu, Ziyi" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Liu, Ziyi"' showing total 1,758 results

1. Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics

Author: Lee, Isabelle, Lum, Joshua, Liu, Ziyi, and Yogatama, Dani
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: While interpretability research has shed light on some internal algorithms utilized by transformer-based LLMs, reasoning in natural language, with its deep contextuality and ambiguity, defies easy categorization. As a result, formulating clear and motivating questions for circuit analysis that rely on well-defined in-domain and out-of-domain examples required for causal interventions is challenging. Although significant work has investigated circuits for specific tasks, such as indirect object identification (IOI), deciphering natural language reasoning through circuits remains difficult due to its inherent complexity. In this work, we take initial steps to characterize causal reasoning in LLMs by analyzing clear-cut cause-and-effect sentences like "I opened an umbrella because it started raining," where causal interventions may be possible through carefully crafted scenarios using GPT-2 small. Our findings indicate that causal syntax is localized within the first 2-3 layers, while certain heads in later layers exhibit heightened sensitivity to nonsensical variations of causal sentences. This suggests that models may infer reasoning by (1) detecting syntactic cues and (2) isolating distinct heads in the final layers that focus on semantic relationships. Comment: 12 pages
Published: 2024

2. An Efficient System for Automatic Map Storytelling -- A Case Study on Historical Maps

Author: Liu, Ziyi, Affolter, Claudio, Wu, Sidi, Chen, Yizi, and Hurni, Lorenz
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Historical maps provide valuable information and knowledge about the past. However, as they often feature non-standard projections, hand-drawn styles, and artistic elements, it is challenging for non-experts to identify and interpret them. While existing image captioning methods have achieved remarkable success on natural images, their performance on maps is suboptimal as maps are underrepresented in their pre-training process. Despite the recent advance of GPT-4 in text recognition and map captioning, it still has a limited understanding of maps, as its performance wanes when texts (e.g., titles and legends) in maps are missing or inaccurate. Besides, it is inefficient or even impractical to fine-tune the model with users' own datasets. To address these problems, we propose a novel and lightweight map-captioning counterpart. Specifically, we fine-tune the state-of-the-art vision-language model CLIP to generate captions relevant to historical maps and enrich the captions with GPT-3.5 to tell a brief story regarding where, what, when and why of a given map. We propose a novel decision tree architecture to only generate captions relevant to the specified map type. Our system shows invariance to text alterations in maps. The system can be easily adapted and extended to other map types and scaled to a larger map captioning system. The code is open-sourced at https://github.com/claudaff/automatic-map-storytelling.
Published: 2024

3. Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood

Author: Liu, Ziyi, Attias, Idan, and Roy, Daniel M.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the fundamental problem of sequential probability assignment, also known as online learning with logarithmic loss, with respect to an arbitrary, possibly nonparametric hypothesis class. Our goal is to obtain a complexity measure for the hypothesis class that characterizes the minimax regret and to determine a general, minimax optimal algorithm. Notably, the sequential $\ell_{\infty}$ entropy, extensively studied in the literature (Rakhlin and Sridharan, 2015, Bilodeau et al., 2020, Wu et al., 2023), was shown to not characterize minimax risk in general. Inspired by the seminal work of Shtarkov (1987) and Rakhlin, Sridharan, and Tewari (2010), we introduce a novel complexity measure, the contextual Shtarkov sum, corresponding to the Shtarkov sum after projection onto a multiary context tree, and show that the worst case log contextual Shtarkov sum equals the minimax regret. Using the contextual Shtarkov sum, we derive the minimax optimal strategy, dubbed contextual Normalized Maximum Likelihood (cNML). Our results hold for sequential experts, beyond binary labels, which are settings rarely considered in prior work. To illustrate the utility of this characterization, we provide a short proof of a new regret upper bound in terms of sequential $\ell_{\infty}$ entropy, unifying and sharpening state-of-the-art bounds by Bilodeau et al. (2020) and Wu et al. (2023). Comment: To appear in NeurIPS 2024
Published: 2024

4. TEAM: Temporal Adversarial Examples Attack Model against Network Intrusion Detection System Applied to RNN

Author: Liu, Ziyi, Ye, Dengpan, Tang, Long, Zhang, Yunming, and Deng, Jiacheng
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: With the development of artificial intelligence, neural networks play a key role in network intrusion detection systems (NIDS). Despite their tremendous advantages, neural networks are susceptible to adversarial attacks. To improve the reliability of NIDS, much research has been conducted and many solutions have been proposed. However, the existing solutions rarely consider adversarial attacks against recurrent neural networks (RNN) with time steps, which would greatly affect the application of NIDS in the real world. Therefore, we first propose a novel RNN adversarial attack model based on feature reconstruction called Temporal adversarial Examples Attack Model (TEAM), which applies to time series data and reveals the potential connection between adversarial examples and time steps in RNN. That is, the past adversarial examples within the same time steps can trigger further attacks on current or future original examples. Moreover, TEAM leverages Time Dilation (TD) to effectively mitigate the temporal effect among adversarial examples within the same time steps. Experimental results show that in most attack categories, TEAM improves the misjudgment rate of NIDS on both black and white boxes, making the misjudgment rate reach more than 96.68%. Meanwhile, the maximum increase in the misjudgment rate of the NIDS for subsequent original samples exceeds 95.57%.
Published: 2024

5. avaTTAR: Table Tennis Stroke Training with On-body and Detached Visualization in Augmented Reality

Author: Ma, Dizhi, Hu, Xiyun, Shi, Jingyu, Patel, Mayank, Jain, Rahul, Liu, Ziyi, Zhu, Zhengzhe, and Ramani, Karthik
Subjects: Computer Science - Human-Computer Interaction
Abstract: Table tennis stroke training is a critical aspect of player development. We designed a new augmented reality (AR) system, avaTTAR, for table tennis stroke training. The system provides both "on-body" (first-person view) and "detached" (third-person view) visual cues, enabling users to visualize target strokes and correct their attempts effectively with this dual-perspective setup. By employing a combination of pose estimation algorithms and IMU sensors, avaTTAR captures and reconstructs the 3D body pose and paddle orientation of users during practice, allowing real-time comparison with expert strokes. Through a user study, we affirm avaTTAR's capacity to amplify player experience and training results.
Published: 2024

6. Bulk high-temperature superconductivity in the high-pressure tetragonal phase of bilayer La2PrNi2O7

Author: Wang, Ningning, Wang, Gang, Shen, Xiaoling, Hou, Jun, Luo, Jun, Ma, Xiaoping, Yang, Huaixin, Shi, Lifen, Dou, Jie, Feng, Jie, Yang, Jie, Shi, Yunqing, Ren, Zhian, Ma, Hanming, Yang, Pengtao, Liu, Ziyi, Liu, Yue, Zhang, Hua, Dong, Xiaoli, Wang, Yuxin, Jiang, Kun, Hu, Jiangping, Calder, Stuart, Yan, Jiaqiang, Sun, Jianping, Wang, Bosen, Zhou, Rui, Uwatoko, Yoshiya, and Cheng, Jinguang
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Strongly Correlated Electrons
Abstract: The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the filamentary nature with low superconducting volume fraction. The presence of a novel "1313" polymorph and competing R-P phases obscured proper identification of the phase for HTSC. Thus, achieving bulk HTSC and identifying the phase at play are the most prominent tasks at present. Here, we address these issues in the praseodymium (Pr)-doped La2PrNi2O7 polycrystalline samples. We find that the substitution of Pr for La effectively inhibits the intergrowth of different R-P phases, resulting in a nearly pure bilayer structure. For La2PrNi2O7, a pressure-induced orthorhombic-to-tetragonal structural transition takes place at Pc ~ 11 GPa, above which HTSC emerges gradually upon further compression. The superconducting transition temperatures at 18-20 GPa reach Tc^onset = 82.5 K and Tc^zero = 60 K, which are the highest values among known nickelate superconductors. More importantly, bulk HTSC was verified by detecting clear diamagnetic signals below ~75 K corresponding to an estimated superconducting volume fraction ~ 57(5)% at 20 GPa. Our results not only resolve the existing controversies but also illuminate directions for exploring bulk HTSC in the bilayer nickelates.
Published: 2024

7. Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals

Author: Liu, Ziyi, Attias, Idan, and Roy, Daniel M.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this work, we investigate the problem of adapting to the presence or absence of causal structure in multi-armed bandit problems. In addition to the usual reward signal, we assume the learner has access to additional variables, observed in each round after acting. When these variables $d$-separate the action from the reward, existing work in causal bandits demonstrates that one can achieve strictly better (minimax) rates of regret (Lu et al., 2020). Our goal is to adapt to this favorable "conditionally benign" structure, if it is present in the environment, while simultaneously recovering worst-case minimax regret, if it is not. Notably, the learner has no prior knowledge of whether the favorable structure holds. In this paper, we establish the Pareto optimal frontier of adaptive rates. We prove upper and matching lower bounds on the possible trade-offs in the performance of learning in conditionally benign and arbitrary environments, resolving an open question raised by Bilodeau et al. (2022). Furthermore, we are the first to obtain instance-dependent bounds for causal bandits, by reducing the problem to the linear bandit setting. Finally, we examine the common assumption that the marginal distributions of the post-action contexts are known and show that a nontrivial estimate is necessary for better-than-worst-case minimax rates. Comment: Accepted to ICML 2024
Published: 2024

8. InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context

Author: Liu, Ziyi, Anand, Abhishek, Zhou, Pei, Huang, Jen-tse, and Zhao, Jieyu
Subjects: Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) have demonstrated the potential to mimic human social intelligence. However, most studies focus on simplistic and static self-report or performance-based tests, which limits the depth and validity of the analysis. In this paper, we developed a novel framework, InterIntent, to assess LLMs' social intelligence by mapping their ability to understand and manage intentions in a game setting. We focus on four dimensions of social intelligence: situational awareness, self-regulation, self-awareness, and theory of mind. Each dimension is linked to a specific game task: intention selection, intention following, intention summarization, and intention guessing. Our findings indicate that while LLMs exhibit high proficiency in selecting intentions, achieving an accuracy of 88%, their ability to infer the intentions of others is significantly weaker, trailing human performance by 20%. Additionally, game performance correlates with intention understanding, highlighting the importance of the four components towards success in this game. These findings underline the crucial role of intention understanding in evaluating LLMs' social intelligence and highlight the potential of using social deduction games as a complex testbed to enhance LLM evaluation. InterIntent contributes a structured approach to bridging the evaluation gap in social intelligence within multiplayer games.
Published: 2024

9. DIP-Watermark: A Double Identity Protection Method Based on Robust Adversarial Watermark

Author: Zhang, Yunming, Ye, Dengpan, Xie, Caiyun, Shen, Sipeng, Liu, Ziyi, Deng, Jiacheng, and Tang, Long
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: The wide deployment of Face Recognition (FR) systems poses privacy risks. One countermeasure is adversarial attack, deceiving unauthorized malicious FR, but it also disrupts regular identity verification of trusted authorizers, exacerbating the potential threat of identity impersonation. To address this, we propose the first double identity protection scheme based on traceable adversarial watermarking, termed DIP-Watermark. DIP-Watermark employs a one-time watermark embedding to deceive unauthorized FR models and allows authorizers to perform identity verification by extracting the watermark. Specifically, we propose an information-guided adversarial attack against FR models. The encoder embeds an identity-specific watermark into the deep feature space of the carrier, guiding recognizable features of the image to deviate from the source identity. We further adopt a collaborative meta-optimization strategy compatible with sub-tasks, which regularizes the joint optimization direction of the encoder and decoder. This strategy enhances the representation of universal carrier features, mitigating multi-objective optimization conflicts in watermarking. Experiments confirm that DIP-Watermark achieves significant attack success rates and traceability accuracy on state-of-the-art FR models, exhibiting remarkable robustness that outperforms the existing privacy protection methods using adversarial attacks and deep watermarking, or simple combinations of the two. Our work potentially opens up new insights into proactive protection for FR privacy.
Published: 2024

10. STAT: Towards Generalizable Temporal Action Localization

Author: Liu, Yangcen, Liu, Ziyi, Zhai, Yuanhao, Li, Wen, Doerman, David, and Yuan, Junsong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Weakly-supervised temporal action localization (WTAL) aims to recognize and localize action instances with only video-level labels. Despite the significant progress, existing methods suffer from severe performance degradation when transferring to different distributions and thus may hardly adapt to real-world scenarios. To address this problem, we propose the Generalizable Temporal Action Localization task (GTAL), which focuses on improving the generalizability of action localization methods. We observed that the performance decline can be primarily attributed to the lack of generalizability to different action scales. To address this problem, we propose STAT (Self-supervised Temporal Adaptive Teacher), which leverages a teacher-student structure for iterative refinement. Our STAT features a refinement module and an alignment module. The former iteratively refines the model's output by leveraging contextual information and helps adapt to the target scale. The latter improves the refinement process by promoting a consensus between student and teacher models. We conduct extensive experiments on three datasets, THUMOS14, ActivityNet1.2, and HACS, and the results show that our method significantly improves on the baseline methods under the cross-distribution evaluation setting, even approaching the same-distribution evaluation performance. Comment: 14 pages, LaTeX
Published: 2024

11. Conversational Disease Diagnosis via External Planner-Controlled Large Language Models

Author: Sun, Zhoujian, Luo, Cheng, Liu, Ziyi, and Huang, Zhengxing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The development of large language models (LLMs) has brought unprecedented possibilities for artificial intelligence (AI) based medical diagnosis. However, the application prospects of LLMs in real diagnostic scenarios are still unclear because they are not adept at collecting patient data proactively. This study presents an LLM-based diagnostic system that enhances planning capabilities by emulating doctors. Our system involves two external planners to handle planning tasks. The first planner employs a reinforcement learning approach to formulate disease screening questions and conduct initial diagnoses. The second planner uses LLMs to parse medical guidelines and conduct differential diagnoses. By utilizing real patient electronic medical record data, we constructed simulated dialogues between virtual patients and doctors and evaluated the diagnostic abilities of our system. We demonstrated that our system obtained impressive performance in both disease screening and differential diagnosis tasks. This research represents a step towards more seamlessly integrating AI into clinical settings, potentially enhancing the accuracy and accessibility of medical diagnostics. Comment: Work in Progress
Published: 2024

12. Noncentrosymmetric Nowotny Chimney Ladder Ferromagnet Cr4Ge7 with a High Curie Temperature of ~ 207 K

Author: Yu, Zhenhai, Zhou, Kaijuan, Hou, Xiaofei, Chen, Xuejiao, Tao, Zhen, Ye, Yunguan, Xia, Wei, Li, Zhongyang, Zhao, Jinggeng, Wu, Wei, Liu, Ziyi, Wang, Xia, Yu, Na, Cheng, Jinguang, Luo, Jianlin, Zhang, Qiang, Pomjakushin, Vladimir, Zhong, Zhicheng, Rui, Soh Jian, Lu, Xingye, and Guo, Yanfeng
Subjects: Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Materials Science
Abstract: Noncentrosymmetric magnets usually host intriguing magnetic interactions inherent to the crystal structure with broken inversion symmetry, which can give rise to rich magnetic behaviors. We report herein the high-pressure synthesis, crystal structure, magnetizations and magnetic structure of a so-called Nowotny chimney ladder compound Cr4Ge7. Our analysis of the powder neutron diffraction data revises the crystal structure to a noncentrosymmetric space group (P-4c2, No.116). It exhibits two magnetic orders within the temperature range of 2 - 400 K. The first order at ~ 207 K associated with a small magnetic moment of ~ 0.75 μB is assigned to a commensurate ferromagnetic structure with a propagation vector k = (0, 0, 0). The weak itinerant ferromagnetic nature should be caused by the complex Cr spin orders from different Wyckoff positions. The second order at ~ 18 K is assumed to arise from a competition between the Dzyaloshinskii-Moriya and Heisenberg interactions. The results provide an excellent platform for studying intricate interactions between various magnetic exchanges as well as for the exploration of high temperature exotic magnetic properties which host potential applications in next-generation spintronics. Comment: 21 pages, 5 figures, 2 tables; Supporting Information is not included
Published: 2024

13. Reusability report: exploring the utility of variational graph encoders for predicting molecular toxicity in drug design

Author: Li, Ruijiang, Lu, Jiang, Liu, Ziyi, Yi, Duoyun, Wan, Mengxuan, Zhang, Yixin, Zan, Peng, He, Song, and Bo, Xiaochen
Published: 2024

14. Mechanical Properties Test and Failure Analysis of Composite Foam Sandwich Structure in Ramp-Down Zone

Author: Yang, Kang, Liu, Ziyi, Yang, Yong, Zhou, Guoqing, Su, Changqing, and Feng, Huan
Published: 2024

15. Bulk high-temperature superconductivity in pressurized tetragonal La2PrNi2O7

Author: Wang, Ningning, Wang, Gang, Shen, Xiaoling, Hou, Jun, Luo, Jun, Ma, Xiaoping, Yang, Huaixin, Shi, Lifen, Dou, Jie, Feng, Jie, Yang, Jie, Shi, Yunqing, Ren, Zhian, Ma, Hanming, Yang, Pengtao, Liu, Ziyi, Liu, Yue, Zhang, Hua, Dong, Xiaoli, Wang, Yuxin, Jiang, Kun, Hu, Jiangping, Nagasaki, Shoko, Kitagawa, Kentaro, Calder, Stuart, Yan, Jiaqiang, Sun, Jianping, Wang, Bosen, Zhou, Rui, Uwatoko, Yoshiya, and Cheng, Jinguang
Published: 2024

16. Dietary vitamin intake and cancer risk in patients with chronic kidney disease: results from the National Health and Nutrition Examination Survey (2007–2018)

Author: Li, Jiyuan, Liu, Ziyi, Xie, Xubiao, Peng, Longkai, Dai, Helong, Gao, Chen, Mao, Wendan, Yuan, Wenjia, Zhao, Xue, Zhang, Hongliang, and Peng, Fenghua
Published: 2024

17. Multivitamins co-intake can reduce the prevalence of kidney stones: a large-scale cross-sectional study

Author: Zeng, Hongbo, Liu, Ziyi, He, Yunhui, Chen, Huixiang, He, Jun, Liu, Mingke, Wu, Shuiqing, He, Haiqing, Huang, Changkun, and Xu, Ran
Published: 2024

18. A conformal mapping approach to broadband nonlinear optics on chip

Author: Huang, Chunyu, Luo, Yu, Zhao, Yule, Ma, Xiaofei, Yan, Zhiwei, Liu, Ziyi, Sheng, Chong, Zhu, Shining, and Liu, Hui
Subjects: Physics - Optics
Abstract: Integrated nonlinear optical devices play an important role in modern optical communications. However, conventional on-chip optical devices with homogeneous or periodic translation dimensions generally have limited bandwidth when applied to nonlinear optical applications. To date, a general method to design compact nonlinear optical devices over a broadband continuous frequency range is lacking. In this work, we propose a general strategy based on transformation optics (TO) to design curved accelerating waveguides (CAWs) with spatially gradient curvatures able to achieve broadband nonlinear frequency conversion on chip. Through rigorous analytical calculation, we show that increasing the acceleration (i.e. gradient in the waveguide curvature) broadens the output signal spectrum in the nonlinear process. In the experiment, we take the sum-frequency generation for infrared signal upconversion (SFG-ISU) as the example and fabricate a variety of CAWs using thin-film lithium niobate on insulator (LNOI). Efficient SFG is observed over a broadband continuous spectrum. Our conformal mapping approach offers a platform for various nonlinear optical processes and works in any frequency range, including visible, infrared and terahertz bands. Apart from LNOI, our approach is also compatible with other nonlinear materials, such as silicon, silicon nitride and chalcogenide glasses. Comment: 15 pages, 4 figures
Published: 2024

19. APOE–NOTCH axis governs elastogenesis during human cardiac valve remodeling

Author: Liu, Ziyi, Liu, Yu, Yu, Zhiyun, Tan, Cheng, Pek, Nicole, O’Donnell, Anna, Wu, Angeline, Glass, Ian, Winlaw, David S., Guo, Minzhe, Spence, Jason R., Chen, Ya-Wen, Yutzey, Katherine E., Miao, Yifei, and Gu, Mingxia
Published: 2024

20. Transcatheter Arterial Embolization with Bleomycin-Lipiodol of Hepatic Hemangiomas: Safety, Efficacy and Predictors of Response

Author: Zhao, Dan, Xie, Lingli, Makamure, Joyman, Liu, Ziyi, Zhang, Lijie, Li, Qing, Zhang, Xin, Zhao, Yazhuo, Zheng, Chuansheng, Shi, Liangrong, and Liang, Bin
Published: 2024

21. FDA-Approved Tedizolid Phosphate Prevents Cisplatin-Induced Hearing Loss Without Decreasing Its Anti-tumor Effect

Author: Yao, Zhiwei, Xiao, Yu, Li, Wen, Kong, Shuhui, Tu, Hailong, Guo, Siwei, Liu, Ziyi, Ma, Lushun, Qiao, Ruifeng, Wang, Song, Chang, Miao, Zhao, Xiaoxu, Zhang, Yuan, Xu, Lei, Sun, Daqing, and Fu, Xiaolong
Published: 2024

22. Self-Contradictory Reasoning Evaluation and Detection

Author: Liu, Ziyi, Sanyal, Soumya, Lee, Isabelle, Du, Yongkang, Gupta, Rahul, Liu, Yang, and Zhao, Jieyu
Subjects: Computer Science - Computation and Language
Abstract: In a plethora of recent work, large language models (LLMs) demonstrated impressive reasoning ability, but many proposed downstream reasoning tasks only focus on final answers. Two fundamental questions persist: 1) how consistent is the reasoning, and 2) can models detect unreliable reasoning? In this paper, we investigate self-contradictory (Self-Contra) reasoning, where the model reasoning does not support its answers. To answer 1), we define and assess the Self-Contra rate across three datasets and delve into finer-grained categories of Self-Contra reasoning. We find that LLMs often contradict themselves in reasoning tasks involving contextual information understanding or commonsense. The model may generate correct answers by taking shortcuts in reasoning or overlooking contextual evidence, leading to compromised reasoning. For 2), we task the state-of-the-art model GPT-4 with identifying Self-Contra reasoning and finer-grained fallacies. We find that finer-grained categories enhanced detection can improve GPT-4's ability to detect Self-Contra. However, it is only able to detect Self-Contra with a 52.2% F1 score, much lower compared to 66.7% for humans. Our results indicate that current LLMs lack the robustness necessary for reliable reasoning and we emphasize the urgent need for establishing best practices in comprehensive reasoning evaluations beyond pure performance-based metrics.
Published: 2023

23. Observation of high-temperature superconductivity in the high-pressure tetragonal phase of La2PrNi2O7-δ

Author: Wang, Gang, Wang, Ningning, Wang, Yuxin, Shi, Lifen, Shen, Xiaoling, Hou, Jun, Ma, Hanming, Yang, Pengtao, Liu, Ziyi, Zhang, Hua, Dong, Xiaoli, Sun, Jianping, Wang, Bosen, Jiang, Kun, Hu, Jiangping, Uwatoko, Yoshiya, and Cheng, Jinguang
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Strongly Correlated Electrons
Abstract: The recent discovery of high-temperature superconductivity in the Ruddlesden-Popper phase La3Ni2O7 under high pressure marks a significant breakthrough in the field of 3d transition-metal oxide superconductors. For an emerging novel class of high-Tc superconductors, it is crucial to find more analogous superconducting materials with a dedicated effort toward broadening the scope of nickelate superconductors. Here, we report on the observation of high-Tc superconductivity in the high-pressure tetragonal I4/mmm phase of La2PrNi2O7 above ~10 GPa, which is distinct from the reported orthorhombic Fmmm phase of La3Ni2O7 above 14 GPa. For La2PrNi2O7, the onset and the zero-resistance temperatures of superconductivity reach Tc^onset = 78.2 K and Tc^zero = 40 K at 15 GPa. This superconducting phase shares a similar structural symmetry with many cuprate superconductors, providing a fresh platform to investigate underlying mechanisms of nickelate superconductors. Comment: 19 pages and 6 figures
Published: 2023

24. Dual Defense: Adversarial, Traceable, and Invisible Robust Watermarking against Face Swapping

Author: Zhang, Yunming, Ye, Dengpan, Xie, Caiyun, Tang, Long, Chen, Chuanxi, Liu, Ziyi, and Deng, Jiacheng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The malicious applications of deep forgery, represented by face swapping, have introduced security threats such as misinformation dissemination and identity fraud. While some research has proposed the use of robust watermarking methods to trace the copyright of facial images for post-event traceability, these methods cannot effectively prevent the generation of forgeries at the source and curb their dissemination. To address this problem, we propose a novel comprehensive active defense mechanism that combines traceability and adversariality, called Dual Defense. Dual Defense invisibly embeds a single robust watermark within the target face to actively respond to sudden cases of malicious face swapping. It disrupts the output of the face swapping model while maintaining the integrity of watermark information throughout the entire dissemination process. This allows for watermark extraction at any stage of image tracking for traceability. Specifically, we introduce a watermark embedding network based on original-domain feature impersonation attack. This network learns robust adversarial features of target facial images and embeds watermarks, seeking a well-balanced trade-off between watermark invisibility, adversariality, and traceability through perceptual adversarial encoding strategies. Extensive experiments demonstrate that Dual Defense achieves optimal overall defense success rates and exhibits promising universality in anti-face swapping tasks and dataset generalization ability. It maintains impressive adversariality and traceability in both original and robust settings, surpassing current forgery defense methods that possess only one of these capabilities, including CMUA-Watermark, Anti-Forgery, FakeTagger, or PGD methods.
Published: 2023

25. Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance

Author: Zhang, Jesse, Zhang, Jiahui, Pertsch, Karl, Liu, Ziyi, Ren, Xiang, Chang, Minsuk, Sun, Shao-Hua, and Lim, Joseph J.
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We propose BOSS, an approach that automatically learns to solve new long-horizon, complex, and meaningful tasks by growing a learned skill library with minimal supervision. Prior work in reinforcement learning requires expert supervision, in the form of demonstrations or rich reward functions, to learn long-horizon tasks. Instead, our approach BOSS (BOotStrapping your own Skills) learns to accomplish new tasks by performing "skill bootstrapping," where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set. This bootstrapping phase is guided by large language models (LLMs) that inform the agent of meaningful skills to chain together. Through this process, BOSS builds a wide range of complex and useful behaviors from a basic set of primitive skills. We demonstrate through experiments in realistic household environments that agents trained with our LLM-guided bootstrapping procedure outperform those trained with naive bootstrapping as well as prior unsupervised skill acquisition methods on zero-shot execution of unseen, long-horizon tasks in new environments. Website at clvrai.com/boss. Comment: CoRL 2023 (Oral); 24 pages, 11 figures
Published: 2023

26. A Universal Scheme for Dynamic Partitioned Shortest Path Index

Author: Zhang, Mengxuan, Zhou, Xinjie, Li, Lei, Liu, Ziyi, Trajcevski, Goce, Huang, Yan, and Zhou, Xiaofang
Subjects: Computer Science - Databases
Abstract: Shortest path (SP) computation is the fundamental operation in various networks such as urban networks, logistic networks, communication networks, social networks, etc. With the development of technology and societal expansions, those networks tend to be massive. This, in turn, causes deteriorated performance of SP computation, and graph partitioning is commonly leveraged to scale up the SP algorithms. However, the partitioned shortest path (PSP) index has never been systematically investigated and theoretically analyzed, and there is a lack of experimental comparison among different PSP indexes. Moreover, few studies have explored PSP index maintenance in dynamic networks. Therefore, in this paper, we systematically analyze the dynamic PSP index by proposing a universal scheme for it. Specifically, we first propose two novel partitioned shortest path strategies (No-boundary and Post-boundary strategies) to improve the performance of PSP indexes and design the corresponding index maintenance approaches to deal with dynamic scenarios. Then we categorize the partition methods from the perspective of partition structure to facilitate the selection of partition methods in the PSP index. Furthermore, we propose a universal scheme for designing the PSP index by coupling its three dimensions (i.e. PSP strategy, partition structure, and SP algorithm). Based on this scheme, we propose five new PSP indexes with prominent performance in either query or update efficiency. Lastly, extensive experiments are implemented to demonstrate the effectiveness of the proposed PSP scheme, with valuable guidance provided on the PSP index design.
Published: 2023

27. Pressure-induced superconductivity in polycrystalline La3Ni2O7

Author: Wang, Gang, Wang, Ningning, Hou, Jun, Ma, Liang, Shi, Lifen, Ren, Zhian, Gu, Yadong, Shen, Xiaoling, Ma, Hanming, Yang, Pengtao, Liu, Ziyi, Guo, Haizhong, Sun, Jianping, Zhang, Guangming, Yan, Jiaqiang, Wang, Bosen, Uwatoko, Yoshiya, and Cheng, Jinguang
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Strongly Correlated Electrons
Abstract: We synthesized polycrystalline La3Ni2O7 samples by using the sol-gel method without post-annealing under high oxygen pressure, and then measured temperature-dependent resistivity under various hydrostatic pressures up to 14.5 GPa in a cubic anvil cell apparatus. We find that the density-wave-like anomaly in resistivity is progressively suppressed with increasing pressure and the resistivity drop corresponding to the onset of superconductivity emerges at pressure as low as 7 GPa. Zero resistivity is achieved at 9 GPa below 6.6 K, which increases quickly with pressure to 35.6 K at 14.5 GPa. The observation of zero-resistance state in the polycrystalline La3Ni2O7 samples under high pressures not only corroborates the recent report of superconductivity in the pressurized La3Ni2O7 crystals but also facilitates further studies on this emerging family of nickelate high-Tc superconductors., Comment: 12 pages, 4 figures
Published: 2023
Full Text: View/download PDF

28. What To Do (and Not to Do) with Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study

Author: Chiu, Albert, Lan, Xingchen, Liu, Ziyi, and Xu, Yiqing
Subjects: Statistics - Methodology, Economics - Econometrics, Statistics - Applications
Abstract: Two-way fixed effects (TWFE) models are ubiquitous in causal panel analysis in political science. However, recent methodological discussions challenge their validity in the presence of heterogeneous treatment effects (HTE) and violations of the parallel trends assumption (PTA). This burgeoning literature has introduced multiple estimators and diagnostics, leading to confusion among empirical researchers on two fronts: the reliability of existing results based on TWFE models and the current best practices. To address these concerns, we examined, replicated, and reanalyzed 37 articles from three leading political science journals that employed observational panel data with binary treatments. Using six newly introduced HTE-robust estimators, along with diagnostics tests and uncertainty measures that are robust to PTA violations, we find that only a small minority of studies are highly robust. Although HTE-robust estimates tend to be broadly consistent with TWFE estimates, discrepancies in point estimates, increased measures of uncertainty, and potential PTA violations call into question many results that were already on the margins of statistical significance. We offer recommendations for improving practice in empirical research based on these findings.
Published: 2023

29. SOAR: Scene-debiasing Open-set Action Recognition

Author: Zhai, Yuanhao, Liu, Ziyi, Wu, Zhenyu, Wu, Yi, Zhou, Chunluan, Doermann, David, Yuan, Junsong, and Hua, Gang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep learning models have a risk of utilizing spurious clues to make predictions, such as recognizing actions based on the background scene. This issue can severely degrade the open-set action recognition performance when the testing samples have different scene distributions from the training samples. To mitigate this problem, we propose a novel method, called Scene-debiasing Open-set Action Recognition (SOAR), which features an adversarial scene reconstruction module and an adaptive adversarial scene classification module. The former prevents the decoder from reconstructing the video background given video features, and thus helps reduce the background information in feature learning. The latter aims to confuse scene type classification given video features, with a specific emphasis on the action foreground, and helps to learn scene-invariant information. In addition, we design an experiment to quantify the scene bias. The results indicate that the current open-set action recognizers are biased toward the scene, and our proposed SOAR method better mitigates such bias. Furthermore, our extensive experiments demonstrate that our method outperforms state-of-the-art methods, and the ablation studies confirm the effectiveness of our proposed modules., Comment: Accepted to ICCV 2023, code:https://github.com/yhZhai/SOAR
Published: 2023

30. High-precision real-time autonomous driving target detection based on YOLOv8

Author: Liu, Huixin, Lu, Guohua, Li, Mingxi, Su, Weihua, Liu, Ziyi, Dang, Xu, and Zang, Dongyuan
Published: 2024
Full Text: View/download PDF

31. Optimizing Perovskite Thin‐Film Parameter Spaces with Machine Learning‐Guided Robotic Platform for High‐Performance Perovskite Solar Cells

Author: Zhang, Jiyun, Liu, Bowen, Liu, Ziyi, Wu, Jianchang, Arnold, Simon, Shi, Hongyang, Osterrieder, Tobias, Hauch, Jens A, Wu, Zhenni, Luo, Junsheng, Wagner, Jerrit, Berger, Christian G, Stubhan, Tobias, Schmitt, Frederik, Zhang, Kaicheng, Sytnyk, Mykhailo, Heumueller, Thomas, Sutter‐Fella, Carolin M, Peters, Ian Marius, Zhao, Yicheng, and Brabec, Christoph J
Subjects: Macromolecular and Materials Chemistry, Chemical Sciences, Physical Chemistry, closed-loop optimization, efficient and stable devices, machine learning, manufacturing optimization, perovskite thin films, PL characterization, robotic platform, Materials Engineering, Interdisciplinary Engineering, Macromolecular and materials chemistry, Materials engineering
Abstract: Simultaneously optimizing the processing parameters of functional thin films remains a challenge. The design and utilization of a fully automated platform called SPINBOT is presented for the engineering of solution‐processed functional thin films. The SPINBOT is capable of performing experiments with high sampling variability through the unsupervised processing of hundreds of substrates with exceptional experimental control. Through the iterative optimization process enabled by the Bayesian optimization (BO) algorithm, the SPINBOT explores an intricate parameter space, continuously improving the quality and reproducibility of the produced thin films. This machine learning (ML)‐guided reliable SPINBOT platform enables the acceleration of the optimization process of perovskite solar cells via a simple photoluminescence characterization of films. As a result, this study arrives at an optimal film that, when processed into a solar cell in an ambient atmosphere, immediately yields a champion power conversion efficiency (PCE) of 21.6% with satisfactory performance reproducibility. The unsealed devices retain 90% of their initial efficiency after 1100 h of continuous operation at 60–65 °C under metal‐halide lamps. It is anticipated that the integration of robotic platforms with the intelligent algorithm will facilitate the widespread adoption of effective autonomous experimentation to address the evolving needs and constraints within the materials science research community.
Published: 2023

32. Dental stem cell dynamics in periodontal ligament regeneration: from mechanism to application

Author: Wen, Shuyi, Zheng, Xiao, Yin, Wuwei, Liu, Yushan, Wang, Ruijie, Zhao, Yaqi, Liu, Ziyi, Li, Cong, Zeng, Jincheng, and Rong, Mingdeng
Published: 2024
Full Text: View/download PDF

33. Comprehensive assessment of heavy slow resistance training and high-dose therapeutic ultrasound in managing patellar tendinopathy, a randomized single-blind controlled trial

Author: Xiao, Liufeng, Zhou, Heng, He, Jia, Liu, Hua, Li, Yongchao, Liu, Ziyi, and Hu, Hao
Published: 2024
Full Text: View/download PDF

34. Molecular quantification of fritillariae cirrhosae bulbus and its adulterants

Author: Liu, Ziyi, Pei, Yifei, Chen, Tiezhu, Yang, Zemin, Jiang, Wenjun, Feng, Xue, and Li, Xiwen
Published: 2024
Full Text: View/download PDF

35. The effects of environments and adhesive layer thickness on the failure modes of composite material bonded joints

Author: Yang, Kang, Feng, Huan, Li, Pengyang, Ji, Shude, Lv, Zan, and Liu, Ziyi
Published: 2024
Full Text: View/download PDF

36. Fine-grained recognition of bitter gourd maturity based on Improved YOLOv5-seg model

Author: Jiang, Sheng, Ao, Jiangbo, Yang, Hualin, Xie, Fangnan, Liu, Ziyi, Yang, Shanglin, Wei, Yichen, and Deng, Xijin
Published: 2024
Full Text: View/download PDF

37. Study on the rationality of small diameter metallic airway stent in treatment of tracheal stenosis in injured rabbits

Author: Li, Xiaoxiao, Wang, Changguo, Liu, Ziyi, Wu, Kai, Yang, Zhenyu, Zeng, Daxiong, Lin, Dang, and Jiang, Junhong
Published: 2024
Full Text: View/download PDF

38. Exploring the combined cooling effect of street canyon geometry and the surrounding built environment

Author: Liu, Ziyi, Hu, Lihui, Chen, Huilin, Li, Zexun, and Jiang, Ling
Published: 2024
Full Text: View/download PDF

39. MFU-Net: a deep multimodal fusion network for breast cancer segmentation with dual-layer spectral detector CT

Author: Yang, Aisen, Xu, Lulu, Qin, Na, Huang, Deqing, Liu, Ziyi, and Shu, Jian
Published: 2024
Full Text: View/download PDF

40. Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-Text Rationales

Author: Joshi, Brihi, Liu, Ziyi, Ramnath, Sahana, Chan, Aaron, Tong, Zhewei, Nie, Shaoliang, Wang, Qifan, Choi, Yejin, and Ren, Xiang
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Among the remarkable emergent capabilities of large language models (LMs) is free-text rationalization; beyond a certain scale, large LMs are capable of generating seemingly useful rationalizations, which in turn, can dramatically enhance their performances on leaderboards. This phenomenon raises a question: can machine generated rationales also be useful for humans, especially when lay humans try to answer questions based on those machine rationales? We observe that human utility of existing rationales is far from satisfactory, and expensive to estimate with human studies. Existing metrics like task performance of the LM generating the rationales, or similarity between generated and gold rationales are not good indicators of their human utility. While we observe that certain properties of rationales like conciseness and novelty are correlated with their human utility, estimating them without human involvement is challenging. We show that, by estimating a rationale's helpfulness in answering similar unseen instances, we can measure its human utility to a better extent. We also translate this finding into an automated score, GEN-U, that we propose, which can help improve LMs' ability to generate rationales with better human utility, while maintaining most of its task performance. Lastly, we release all code and collected data with this project., Comment: Accepted at ACL 2023
Published: 2023

41. Handling Concept Drift in Global Time Series Forecasting

Author: Liu, Ziyi, Godahewa, Rakshitha, Bandara, Kasun, and Bergmeir, Christoph
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Machine learning (ML) based time series forecasting models often require and assume certain degrees of stationarity in the data when producing forecasts. However, in many real-world situations, the data distributions are not stationary and they can change over time while reducing the accuracy of the forecasting models, which in the ML literature is known as concept drift. Handling concept drift in forecasting is essential for many ML methods in use nowadays; however, prior work only proposes methods to handle concept drift in the classification domain. To fill this gap, we explore concept drift handling methods in particular for Global Forecasting Models (GFM) which recently have gained popularity in the forecasting domain. We propose two new concept drift handling methods, namely: Error Contribution Weighting (ECW) and Gradient Descent Weighting (GDW), based on a continuous adaptive weighting concept. These methods use two forecasting models which are separately trained with the most recent series and all series, and finally, the weighted average of the forecasts provided by the two models is considered as the final forecast. Using LightGBM as the underlying base learner, in our evaluation on three simulated datasets, the proposed models achieve significantly higher accuracy than a set of statistical benchmarks and LightGBM baselines across four evaluation metrics., Comment: 23 pages, 3 figures, 2 tables
Published: 2023

42. Percolation-induced resistivity drop in cold-pressed LuH2

Author: Wang, Ningning, Hou, Jun, Liu, Ziyi, Shan, Pengfei, Chai, Congcong, Jin, Shifeng, Wang, Xiao, Long, Youwen, Liu, Yue, Zhang, Hua, Dong, Xiaoli, and Cheng, Jinguang
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Materials Science, Condensed Matter - Strongly Correlated Electrons
Abstract: The stoichiometric bulk LuH2 is a paramagnetic metal with high electrical conductivity comparable to simple metals. Here we show that the resistivity of cold-pressed (CP) LuH2 samples varies sensitively upon modifying the grain size or surface conditions via the grinding process, i.e., the CP pellets made of commercially purchased LuH2 powder remain metallic but exhibit thousands of times higher resistivity, while additional grinding of LuH2 powders in air further enhances the resistivity and even results in weakly localized behaviors. For these CP samples, interestingly, we can occasionally observe abrupt resistivity drops at high temperatures, which also show dependences on magnetic fields and electrical current. Measurements of variable-temperature XRD, magnetic susceptibility, and specific heat exclude the possibilities of structural, magnetic, and superconducting transitions for the observed resistivity drops. Instead, we tentatively attribute these above observations to the presence of insulating layers on the grain surface due to the modification of hydrogen stoichiometry or the pollution by oxygen/nitrogen. Percolation of the metallic grains through the insulating surfaces can explain the sudden drop in resistivity. The present results thus call for caution in asserting the resistivity drops as superconductivity and invalidate the background subtraction in analyzing the resistivity data., Comment: 17 pages, 10 figures
Published: 2023
Full Text: View/download PDF

43. Learning Adaptable Risk-Sensitive Policies to Coordinate in Multi-Agent General-Sum Games

Author: Liu, Ziyi and Fang, Yongchun
Subjects: Computer Science - Multiagent Systems
Abstract: In general-sum games, the interaction of self-interested learning agents commonly leads to socially worse outcomes, such as defect-defect in the iterated stag hunt (ISH). Previous works address this challenge by sharing rewards or shaping their opponents' learning process, which require too strong assumptions. In this paper, we demonstrate that agents trained to optimize expected returns are more likely to choose a safe action that leads to guaranteed but lower rewards. However, there typically exists a risky action that leads to higher rewards in the long run only if agents cooperate, e.g., cooperate-cooperate in ISH. To overcome this, we propose using action value distribution to characterize the decision's risk and corresponding potential payoffs. Specifically, we present Adaptable Risk-Sensitive Policy (ARSP). ARSP learns the distributions over agent's return and estimates a dynamic risk-seeking bonus to discover risky coordination strategies. Furthermore, to avoid overfitting training opponents, ARSP learns an auxiliary opponent modeling task to infer opponents' types and dynamically alter corresponding strategies during execution. Empirically, agents trained via ARSP can achieve stable coordination during training without accessing opponent's rewards or learning process, and can adapt to non-cooperative opponents during execution. To the best of our knowledge, it is the first method to learn coordination strategies between agents both in iterated prisoner's dilemma (IPD) and iterated stag hunt (ISH) without shaping opponents or rewards, and can adapt to opponents with distinct strategies during execution. Furthermore, we show that ARSP can be scaled to high-dimensional settings., Comment: arXiv admin note: substantial text overlap with arXiv:2205.15859
Published: 2023

44. Transcriptomic profiling reveals color variation mechanism of Fritillaria cirrhosa for the molecular plant breeding

Author: Wang, Ye, Yang, Zemin, Wang, Xinyue, Liu, Ziyi, Xie, Huigan, Fu, Shaobing, Gao, Dan, and Li, Xiwen
Published: 2024
Full Text: View/download PDF

45. Cross-Modality Cardiac Insight Transfer: A Contrastive Learning Approach to Enrich ECG with CMR Features

Author: Ding, Zhengyao, Hu, Yujian, Li, Ziyu, Zhang, Hongkun, Wu, Fei, Xiang, Yilang, Li, Tian, Liu, Ziyi, Chu, Xuesen, Huang, Zhengxing, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
Published: 2024
Full Text: View/download PDF

46. The Query on the Pre-Monitoring Obligation of Short Video Platforms : --Taking China's First Copyright Infringement Case in the Context of Algorithm Recommendation as an Example

Author: Liu, Ziyi, Appolloni, Andrea, Series Editor, Caracciolo, Francesco, Series Editor, Ding, Zhuoqi, Series Editor, Gogas, Periklis, Series Editor, Huang, Gordon, Series Editor, Nartea, Gilbert, Series Editor, Ngo, Thanh, Series Editor, Striełkowski, Wadim, Series Editor, Dou, Peng, editor, and Zhang, Keying, editor
Published: 2024
Full Text: View/download PDF

47. Learning Adaptable Risk-Sensitive Policies to Coordinate in Multi-agent General-Sum Games

Author: Liu, Ziyi, Fang, Yongchun, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

48. Subgroup analysis for the functional linear model

Author: Sun, Yifan, Liu, Ziyi, and Wang, Wu
Subjects: Statistics - Methodology
Abstract: Classical functional linear regression models the relationship between a scalar response and a functional covariate, where the coefficient function is assumed to be identical for all subjects. In this paper, the classical model is extended to allow heterogeneous coefficient functions across different subgroups of subjects. The greatest challenge is that the subgroup structure is usually unknown to us. To this end, we develop a penalization-based approach which innovatively applies the penalized fusion technique to simultaneously determine the number and structure of subgroups and coefficient functions within each subgroup. An effective computational algorithm is derived. We also establish the oracle properties and estimation consistency. Extensive numerical simulations demonstrate its superiority compared to several competing methods. The analysis of an air quality dataset leads to interesting findings and improved predictions., Comment: 24 pages, 9 figures
Published: 2022

49. Clustered Federated Learning based on Nonconvex Pairwise Fusion

Author: Yu, Xue, Liu, Ziyi, Wang, Wu, and Sun, Yifan
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: This study investigates clustered federated learning (FL), one of the formulations of FL with non-i.i.d. data, where the devices are partitioned into clusters and each cluster optimally fits its data with a localized model. We propose a clustered FL framework that incorporates a nonconvex penalty to pairwise differences of parameters. Without a priori knowledge of the set of devices in each cluster and the number of clusters, this framework can autonomously estimate cluster structures. To implement the proposed framework, we introduce a novel clustered FL method called Fusion Penalized Federated Clustering (FPFC). Building upon the standard alternating direction method of multipliers (ADMM), FPFC can perform partial updates at each communication round and allows parallel computation with variable workload. These strategies significantly reduce the communication cost while ensuring privacy, making it practical for FL. We also propose a new warmup strategy for hyperparameter tuning in FL settings and explore the asynchronous variant of FPFC (asyncFPFC). Theoretical analysis provides convergence guarantees for FPFC with general losses and establishes the statistical convergence rate under a linear model with squared loss. Extensive experiments have demonstrated the superiority of FPFC compared to current methods, including robustness and generalization capability., Comment: 46 pages, 9 figures
Published: 2022

50. XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models

Author: Lee, Dong-Ho, Kadakia, Akshen, Joshi, Brihi, Chan, Aaron, Liu, Ziyi, Narahari, Kiran, Shibuya, Takashi, Mitani, Ryosuke, Sekiya, Toshiyuki, Pujara, Jay, and Ren, Xiang
Subjects: Computer Science - Computation and Language
Abstract: NLP models are susceptible to learning spurious biases (i.e., bugs) that work on some datasets but do not properly reflect the underlying task. Explanation-based model debugging aims to resolve spurious biases by showing human users explanations of model behavior, asking users to give feedback on the behavior, then using the feedback to update the model. While existing model debugging methods have shown promise, their prototype-level implementations provide limited practical utility. Thus, we propose XMD: the first open-source, end-to-end framework for explanation-based model debugging. Given task- or instance-level explanations, users can flexibly provide various forms of feedback via an intuitive, web-based UI. After receiving user feedback, XMD automatically updates the model in real time, by regularizing the model so that its explanations align with the user feedback. The new model can then be easily deployed into real-world applications via Hugging Face. Using XMD, we can improve the model's OOD performance on text classification tasks by up to 18%., Comment: 6 pages, 7 figures. Project page: https://inklab.usc.edu/xmd/
Published: 2022
