Author: "An, Junwei" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"An, Junwei"' showing total 58,052 results

101. Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

Author: Cai, Weilin, Jiang, Juyong, Qin, Le, Cui, Junwei, Kim, Sunghun, and Huang, Jiayi
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models. However, the All-to-All communication intrinsic to expert parallelism constitutes a significant overhead, diminishing the MoE models' efficiency. Current optimization approaches offer some relief, yet they are constrained by the sequential interdependence of communication and computation operations. To address this limitation, we present a novel shortcut-connected MoE (ScMoE) architecture with an overlapping parallel strategy, which effectively decouples communication from its conventional sequence, allowing for a substantial overlap of 70% to 100% with computation. When compared with the prevalent top-2 MoE architecture, ScMoE demonstrates training speed improvements of 30% and 11%, and inference improvements of 40% and 15%, in our distributed environments with PCIe and NVLink hardware, respectively, where communication constitutes 60% and 15% of the total MoE time consumption. Building on the ScMoE architecture, we further implement an expert offloading strategy to facilitate memory-limited inference, optimizing latency through the overlap of expert migration. Additionally, extensive experiments and theoretical analyses indicate that ScMoE not only achieves comparable but in some instances surpasses the model quality of existing approaches.
Published: 2024

102. Normalized solutions for Sobolev critical Schrödinger equations on bounded domains

Author: Pierotti, Dario, Verzini, Gianmaria, and Yu, Junwei
Subjects: Mathematics - Analysis of PDEs, 35J20, 35B33, 35Q55, 35J61
Abstract: We study the existence and multiplicity of positive solutions with prescribed $L^2$-norm for the Sobolev critical Schrödinger equation on a bounded domain $\Omega\subset\mathbb{R}^N$, $N\ge3$: \[ -\Delta U = \lambda U + U^{2^{*}-1},\qquad U\in H^1_0(\Omega),\qquad \int_\Omega U^2\,dx = \rho^{2}, \] where $2^*=\frac{2N}{N-2}$. First, we consider a general bounded domain $\Omega$ in dimension $N\ge3$, with a restriction, only in dimension $N=3$, involving its inradius and first Dirichlet eigenvalue. In this general case we show the existence of a mountain pass solution on the $L^2$-sphere, for $\rho$ belonging to a subset of positive measure of the interval $(0,\rho^{**})$, for a suitable threshold $\rho^{**}>0$. Next, assuming that $\Omega$ is star-shaped, we extend the previous result to all values $\rho\in(0,\rho^{**})$. With respect to that of local minimizers, already known in the literature, the existence of mountain pass solutions in the Sobolev critical case is much more elusive. In particular, our proofs are based on the sharp analysis of the bounded Palais-Smale sequences, provided by a nonstandard adaptation of the Struwe monotonicity trick, that we develop. Comment: 24 pages
Published: 2024

103. Superionic Fluoride Gate Dielectrics with Low Diffusion Barrier for Advanced Electronics

Author: Meng, Kui, Li, Zeya, Chen, Peng, Ma, Xingyue, Huang, Junwei, Li, Jiayi, Qin, Feng, Qiu, Caiyu, Zhang, Yilin, Zhang, Ding, Deng, Yu, Yang, Yurong, Gu, Genda, Hwang, Harold Y., Xue, Qi-Kun, Cui, Yi, and Yuan, Hongtao
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Exploration of new dielectrics with large capacitive coupling is an essential topic in modern electronics when conventional dielectrics suffer from the leakage issue near breakdown limit. To address this looming challenge, we demonstrate that rare-earth-metal fluorides with extremely-low ion migration barriers can generally exhibit an excellent capacitive coupling over 20 $\mu$F cm$^{-2}$ (with an equivalent oxide thickness of ~0.15 nm and a large effective dielectric constant near 30) and great compatibility with scalable device manufacturing processes. Such static dielectric capability of superionic fluorides is exemplified by MoS$_2$ transistors exhibiting high on/off current ratios over 10$^8$, ultralow subthreshold swing of 65 mV dec$^{-1}$, and ultralow leakage current density of ~10$^{-6}$ A cm$^{-2}$. Therefore, the fluoride-gated logic inverters can achieve significantly higher static voltage gain values, surpassing ~167, compared to conventional dielectric. Furthermore, the application of fluoride gating enables the demonstration of NAND, NOR, AND, and OR logic circuits with low static energy consumption. Notably, the superconductor-to-insulator transition at the clean-limit Bi$_2$Sr$_2$CaCu$_2$O$_{8+\delta}$ can also be realized through fluoride gating. Our findings highlight fluoride dielectrics as a pioneering platform for advanced electronics applications and for tailoring emergent electronic states in condensed matters., Comment: 33 pages, 5 figures
Published: 2024

104. Even-integer Quantum Hall Effect in an Oxide Caused by Hidden Rashba Effect

Author: Wang, Jingyue, Huang, Junwei, Kaplan, Daniel, Zhou, Xuehan, Tan, Congwei, Zhang, Jing, Jin, Gangjian, Cong, Xuzhong, Zhu, Yongchao, Gao, Xiaoyin, Liang, Yan, Zuo, Huakun, Zhu, Zengwei, Zhu, Ruixue, Stern, Ady, Liu, Hongtao, Gao, Peng, Yan, Binghai, Yuan, Hongtao, and Peng, Hailin
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: In the presence of a high magnetic field, quantum Hall systems usually host both even- and odd-integer quantized states because of lifted band degeneracies. Selective control of these quantized states is challenging but essential to understand the exotic ground states and manipulate the spin textures. Here, we study the quantum Hall effect in Bi2O2Se thin films. In magnetic fields as high as 50 T, we observe only even-integer quantum Hall states, but no sign of odd-integer states. However, when reducing the thickness of the epitaxial Bi2O2Se film to one unit cell, we observe both odd- and even-integer states in this Janus (asymmetric) film grown on SrTiO3. By means of a Rashba bilayer model based on ab initio band structures of Bi2O2Se thin films, we can ascribe the absence of odd-integer states in thicker films to the hidden Rashba effect, where the local inversion symmetry breaking in two sectors of the [Bi2O2]2+ layer yields opposite Rashba spin polarizations, which compensate each other. In the one-unit-cell Bi2O2Se film grown on SrTiO3, the asymmetry introduced by the top surface and bottom interface induces a net polar field. The resulting global Rashba effect lifts the band degeneracies present in the symmetric case of thicker films. Comment: 6 Figures, 23 pages
Published: 2024

105. SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising

Author: Zhang, Runmin, Yu, Zhu, Sheng, Zehua, Ying, Jiacheng, Cao, Si-Yuan, Chen, Shu-Jie, Yang, Bailin, Li, Junwei, and Shen, Hui-Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Cross-spectral image guided denoising has shown great potential in recovering clean images with rich details, such as using the near-infrared image to guide the denoising process of the visible one. To obtain such image pairs, a feasible and economical way is to employ a stereo system, which is widely used on mobile devices. Current works attempt to generate an aligned guidance image to handle the disparity between two images. However, due to occlusion, spectral differences and noise degradation, the aligned guidance image generally exhibits ghosting and artifacts, leading to an unsatisfactory denoised result. To address this issue, we propose a one-stage transformer-based architecture, named SGDFormer, for cross-spectral Stereo image Guided Denoising. The architecture integrates the correspondence modeling and feature fusion of stereo images into a unified network. Our transformer block contains a noise-robust cross-attention (NRCA) module and a spatially variant feature fusion (SVFF) module. The NRCA module captures the long-range correspondence of two images in a coarse-to-fine manner to alleviate the interference of noise. The SVFF module further enhances salient structures and suppresses harmful artifacts through dynamically selecting useful information. Thanks to the above design, our SGDFormer can restore artifact-free images with fine structures, and achieves state-of-the-art performance on various datasets. Additionally, our SGDFormer can be extended to handle other unaligned cross-model guided restoration tasks such as guided depth super-resolution.
Published: 2024

106. VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting

Author: Tang, Yujin, Dong, Peijie, Tang, Zhenheng, Chu, Xiaowen, and Liang, Junwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Combining CNNs or ViTs with RNNs for spatiotemporal forecasting has yielded unparalleled results in predicting temporal and spatial dynamics. However, modeling extensive global information remains a formidable challenge; CNNs are limited by their narrow receptive fields, and ViTs struggle with the intensive computational demands of their attention mechanisms. The emergence of recent Mamba-based architectures has been met with enthusiasm for their exceptional long-sequence modeling capabilities, surpassing established vision models in efficiency and accuracy, which motivates us to develop an innovative architecture tailored for spatiotemporal forecasting. In this paper, we propose the VMRNN cell, a new recurrent unit that integrates the strengths of Vision Mamba blocks with LSTM. We construct a network centered on VMRNN cells to tackle spatiotemporal prediction tasks effectively. Our extensive evaluations show that our proposed approach secures competitive results on a variety of tasks while maintaining a smaller model size. Our code is available at https://github.com/yyyujintang/VMRNN-PyTorch. Comment: CVPR2024 Precognition Workshop
Published: 2024

107. Adversarially Masked Video Consistency for Unsupervised Domain Adaptation

Author: Zhu, Xiaoyu, Liang, Junwei, Huang, Po-Yao, and Hauptmann, Alex
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We study the problem of unsupervised domain adaptation for egocentric videos. We propose a transformer-based model to learn class-discriminative and domain-invariant feature representations. It consists of two novel designs. The first module is called Generative Adversarial Domain Alignment Network with the aim of learning domain-invariant representations. It simultaneously learns a mask generator and a domain-invariant encoder in an adversarial way. The domain-invariant encoder is trained to minimize the distance between the source and target domain. The masking generator, conversely, aims at producing challenging masks by maximizing the domain distance. The second is a Masked Consistency Learning module to learn class-discriminative representations. It enforces the prediction consistency between the masked target videos and their full forms. To better evaluate the effectiveness of domain adaptation methods, we construct a more challenging benchmark for egocentric videos, U-Ego4D. Our method achieves state-of-the-art performance on the Epic-Kitchen and the proposed U-Ego4D benchmark.
Published: 2024

108. RoDLA: Benchmarking the Robustness of Document Layout Analysis Models

Author: Chen, Yufan, Zhang, Jiaming, Peng, Kunyu, Zheng, Junwei, Liu, Ruiping, Torr, Philip, and Stiefelhagen, Rainer
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Before developing a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets. To cover realistic corruptions, we propose a perturbation taxonomy with 36 common document perturbations inspired by real-world document processing. Additionally, to better understand document perturbation impacts, we propose two metrics, Mean Perturbation Effect (mPE) for perturbation assessment and Mean Robustness Degradation (mRD) for robustness evaluation. Furthermore, we introduce a self-titled model, i.e., Robust Document Layout Analyzer (RoDLA), which improves attention mechanisms to boost extraction of robust features. Experiments on the proposed benchmarks (PubLayNet-P, DocLayNet-P, and M$^6$Doc-P) demonstrate that RoDLA obtains state-of-the-art mRD scores of 115.7, 135.4, and 150.4, respectively. Compared to previous methods, RoDLA achieves notable improvements in mAP of +3.8%, +7.1% and +12.1%, respectively., Comment: Accepted by CVPR 2024. Project page: https://yufanchen96.github.io/projects/RoDLA
Published: 2024

109. Understanding the Ranking Loss for Recommendation with Sparse User Feedback

Author: Lin, Zhutian, Pan, Junwei, Zhang, Shangyu, Wang, Ximei, Xiao, Xi, Huang, Shudong, Xiao, Lei, and Jiang, Jie
Subjects: Computer Science - Information Retrieval
Abstract: Click-through rate (CTR) prediction is a crucial area of research in online advertising. While binary cross entropy (BCE) has been widely used as the optimization objective for treating CTR prediction as a binary classification problem, recent advancements have shown that combining BCE loss with an auxiliary ranking loss can significantly improve performance. However, the full effectiveness of this combination loss is not yet fully understood. In this paper, we uncover a new challenge associated with the BCE loss in scenarios where positive feedback is sparse: the issue of gradient vanishing for negative samples. We introduce a novel perspective on the effectiveness of the auxiliary ranking loss in CTR prediction: it generates larger gradients on negative samples, thereby mitigating the optimization difficulties when using the BCE loss only and resulting in improved classification ability. To validate our perspective, we conduct theoretical analysis and extensive empirical evaluations on public datasets. Additionally, we successfully integrate the ranking loss into Tencent's online advertising system, achieving notable lifts of 0.70% and 1.26% in Gross Merchandise Value (GMV) for two main scenarios. The code is openly accessible at: https://github.com/SkylerLinn/Understanding-the-Ranking-Loss.
Published: 2024

110. T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image

Author: Zhang, Shijie, Jiang, Boyan, He, Keke, Zhu, Junwei, Tai, Ying, Wang, Chengjie, Zhang, Yinda, and Fu, Yanwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Pixel2Mesh (P2M) is a classical approach for reconstructing 3D shapes from a single color image through coarse-to-fine mesh deformation. Although P2M is capable of generating plausible global shapes, its Graph Convolution Network (GCN) often produces overly smooth results, causing the loss of fine-grained geometry details. Moreover, P2M generates non-credible features for occluded regions and struggles with the domain gap from synthetic data to real-world images, which is a common challenge for single-view 3D reconstruction methods. To address these challenges, we propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M. Specifically, we use a global Transformer to control the holistic shape and a local Transformer to progressively refine the local geometry details with graph-based point upsampling. To enhance real-world reconstruction, we present the simple yet effective Linear Scale Search (LSS), which serves as prompt tuning during the input preprocessing. Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show the generalization capability. Comment: Received by ICASSP 2024
Published: 2024

111. Observation of non-volatile anomalous Nernst effect in altermagnet with collinear Néel vector

Author: Han, Lei, Fu, Xizhi, He, Wenqing, Zhu, Yuxiang, Dai, Jiankun, Yang, Wenfeng, Zhu, Wenxuan, Bai, Hua, Chen, Chong, Wan, Caihua, Han, Xiufeng, Song, Cheng, Liu, Junwei, and Pan, Feng
Subjects: Condensed Matter - Materials Science
Abstract: The anomalous Nernst effect (ANE), a widely investigated transverse thermoelectric effect that converts waste heat into electrical energy with remarkable flexibility and integration capability, has recently been extended to antiferromagnets with non-collinear spin texture. ANE in a compensated magnet with collinear Néel vector would bring more opportunities to construct magnetic-field-immune and ultrafast transverse thermoelectric converters, but has long remained unachieved. This is because the degenerate band structure of traditional collinear compensated magnets excludes non-zero Berry curvature. Here, we realize non-volatile ANE in altermagnet Mn5Si3 thin film with collinear Néel vector, whose unique alternating spin-splitting band structure plays a vital role in creating non-zero Berry curvature and hotspots of anomalous Nernst conductivity near band intersections. Interestingly, ANE is relatively weak in stoichiometric Mn5Si3, but undergoes a sixfold enhancement through strategically raising the Fermi level by additional Mn doping, indicating a sensitive intrinsic influence of the specific location of the Fermi level on ANE in altermagnets. Moreover, our investigation reveals a unique Néel-vector-dependent temperature-scaling relationship of anomalous Nernst conductivity in Mn5Si3. Our work not only fills a longstanding gap by confirming the presence of non-volatile ANE in a collinear compensated magnet, but also sheds light on thermoelectric physics related to the exotic spin-splitting band structure in altermagnets. Comment: 25 pages, 4 figures
Published: 2024

112. TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation

Author: Liu, Yufei, Zhu, Junwei, Tang, Junshu, Zhang, Shijie, Zhang, Jiangning, Cao, Weijian, Wang, Chengjie, Wu, Yunsheng, and Huang, Dongjin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Texturing 3D humans with semantic UV maps remains a challenge due to the difficulty of acquiring reasonably unfolded UV. Despite recent text-to-3D advancements in supervising multi-view renderings using large text-to-image (T2I) models, issues persist with generation speed, text consistency, and texture quality, resulting in data scarcity among existing datasets. We present TexDreamer, the first zero-shot multimodal high-fidelity 3D human texture generation model. Utilizing an efficient texture adaptation finetuning strategy, we adapt a large T2I model to a semantic UV structure while preserving its original generalization capability. Leveraging a novel feature translator module, the trained model is capable of generating high-fidelity 3D human textures from either text or image within seconds. Furthermore, we introduce ArTicuLated humAn textureS (ATLAS), the largest high-resolution (1024 X 1024) 3D human texture dataset, which contains 50k high-fidelity textures with text descriptions. Comment: Project Page: https://ggxxii.github.io/texdreamer/
Published: 2024

113. Prioritized Semantic Learning for Zero-shot Instance Navigation

Author: Sun, Xinyu, Liu, Lizhao, Zhi, Hongyan, Qiu, Ronghe, and Liang, Junwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We study zero-shot instance navigation, in which the agent navigates to a specific object without using object annotations for training. Previous object navigation approaches apply the image-goal navigation (ImageNav) task (go to the location of an image) for pretraining, and transfer the agent to achieve object goals using a vision-language model. However, these approaches lead to issues of semantic neglect, where the model fails to learn meaningful semantic alignments. In this paper, we propose a Prioritized Semantic Learning (PSL) method to improve the semantic understanding ability of navigation agents. Specifically, a semantic-enhanced PSL agent is proposed and a prioritized semantic training strategy is introduced to select goal images that exhibit clear semantic supervision and relax the reward function from strict exact view matching. At inference time, a semantic expansion inference scheme is designed to preserve the same granularity level of the goal semantic as training. Furthermore, for the popular HM3D environment, we present an Instance Navigation (InstanceNav) task that requires going to a specific object instance with detailed descriptions, as opposed to the Object Navigation (ObjectNav) task where the goal is defined merely by the object category. Our PSL agent outperforms the previous state-of-the-art by 66% on zero-shot ObjectNav in terms of success rate and is also superior on the new InstanceNav task. Code will be released at https://github.com/XinyuSun/PSL-InstanceNav., Comment: Accepted by ECCV 2024
Published: 2024

114. The Wreaths of KHAN: Uniform Graph Feature Selection with False Discovery Rate Control

Author: Liang, Jiajun, Liu, Yue, Zhou, Doudou, Zhang, Sinian, and Lu, Junwei
Subjects: Mathematics - Statistics Theory, Quantitative Biology - Quantitative Methods, Statistics - Applications, Statistics - Methodology
Abstract: Graphical models find numerous applications in biology, chemistry, sociology, neuroscience, etc. While substantial progress has been made in graph estimation, it remains largely unexplored how to select significant graph signals with uncertainty assessment, especially those graph features related to topological structures including cycles (i.e., wreaths), cliques, hubs, etc. These features play a vital role in protein substructure analysis, drug molecular design, and brain network connectivity analysis. To fill the gap, we propose a novel inferential framework for general high dimensional graphical models to select graph features with false discovery rate controlled. Our method is based on the maximum of $p$-values from single edges that comprise the topological feature of interest, thus is able to detect weak signals. Moreover, we introduce the $K$-dimensional persistent Homology Adaptive selectioN (KHAN) algorithm to select all the homological features within $K$ dimensions with the uniform control of the false discovery rate over continuous filtration levels. The KHAN method applies a novel discrete Gram-Schmidt algorithm to select statistically significant generators from the homology group. We apply the structural screening method to identify the important residues of the SARS-CoV-2 spike protein during the binding process to the ACE2 receptors. We score the residues for all domains in the spike protein by the $p$-value weighted filtration level in the network persistent homology for the closed, partially open, and open states and identify the residues crucial for protein conformational changes and thus being potential targets for inhibition.
Published: 2024

115. GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time

Author: Li, Hao, Gao, Yuanyuan, Wu, Chenming, Zhang, Dingwen, Dai, Yalun, Zhao, Chen, Feng, Haocheng, Ding, Errui, Wang, Jingdong, and Han, Junwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents GGRt, a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses, complexity in processing high-resolution images, and lengthy optimization processes, thus facilitating stronger applicability of 3D Gaussian Splatting (3D-GS) in real-world scenarios. Specifically, we design a novel joint learning framework that consists of an Iterative Pose Optimization Network (IPO-Net) and a Generalizable 3D-Gaussians (G-3DG) model. With the joint learning mechanism, the proposed framework can inherently estimate robust relative pose information from the image observations and thus primarily alleviate the requirement of real camera poses. Moreover, we implement a deferred back-propagation mechanism that enables high-resolution training and inference, overcoming the resolution constraints of previous methods. To enhance the speed and efficiency, we further introduce a progressive Gaussian cache module that dynamically adjusts during training and inference. As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at $\ge$ 5 FPS and real-time rendering at $\ge$ 100 FPS. Through extensive experimentation, we demonstrate that our method outperforms existing NeRF-based pose-free techniques in terms of inference speed and effectiveness. It can also approach the real pose-based 3D-GS methods. Our contributions provide a significant leap forward for the integration of computer vision and computer graphics into practical applications, offering state-of-the-art results on LLFF, KITTI, and Waymo Open datasets and enabling real-time rendering for immersive experiences., Comment: Project page: https://3d-aigc.github.io/GGRt
Published: 2024

116. Skeleton-Based Human Action Recognition with Noisy Labels

Author: Xu, Yi, Peng, Kunyu, Wen, Di, Liu, Ruiping, Zheng, Junwei, Chen, Yufan, Zhang, Jiaming, Roitberg, Alina, Yang, Kailun, and Stiefelhagen, Rainer
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resulting in lower recognition quality. Despite its importance, addressing label noise for skeleton-based action recognition has been overlooked so far. In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark. Observations reveal that these baselines yield only marginal performance when dealing with sparse skeleton data. Consequently, we introduce a novel methodology, NoiseEraSAR, which integrates global sample selection, co-teaching, and Cross-Modal Mixture-of-Experts (CM-MOE) strategies, aimed at mitigating the adverse impacts of label noise. Our proposed approach demonstrates better performance on the established benchmark, setting new state-of-the-art standards. The source code for this study is accessible at https://github.com/xuyizdby/NoiseEraSAR. Comment: Accepted to IROS 2024
Published: 2024

117. Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems

Author: Su, Junwei, Zou, Difan, and Wu, Chuan
Subjects: Computer Science - Machine Learning
Abstract: Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice and plays an important role in the generalization of modern machine learning. However, prior research has revealed instances where the generalization performance of SGD is worse than ridge regression due to uneven optimization along different dimensions. Preconditioning offers a natural solution to this issue by rebalancing optimization across different directions. Yet, the extent to which preconditioning can enhance the generalization performance of SGD and whether it can bridge the existing gap with ridge regression remains uncertain. In this paper, we study the generalization performance of SGD with preconditioning for the least squares problem. We make a comprehensive comparison between preconditioned SGD and (standard and preconditioned) ridge regression. Our study makes several key contributions toward understanding and improving SGD with preconditioning. First, we establish excess risk bounds (generalization performance) for preconditioned SGD and ridge regression under an arbitrary preconditioning matrix. Second, leveraging the excess risk characterization of preconditioned SGD and ridge regression, we show that (through construction) there exists a simple preconditioning matrix that can make SGD comparable to (standard and preconditioned) ridge regression. Finally, we show that our proposed preconditioning matrix is straightforward enough to allow robust estimation from finite samples while maintaining a theoretical improvement. Our empirical results align with our theoretical findings, collectively showcasing the enhanced regularization effect of preconditioned SGD.
Published: 2024

118. Crystal design of altermagnetism

Author: Zhou, Zhiyuan, Cheng, Xingkai, Hu, Mengli, Liu, Junwei, Pan, Feng, and Song, Cheng
Subjects: Condensed Matter - Materials Science
Abstract: Symmetry plays a fundamental role in condensed matter. The unique entanglement between magnetic sublattices and alternating crystal environment in altermagnets provides a unique opportunity for designing magnetic space symmetry. There have been extensive experimental efforts concentrated on tuning the Néel vector to reconstruct altermagnetic symmetry. However, it remains challenging to modulate the altermagnetic symmetry through the crystal aspect. Here, the crystal design of altermagnetism is successfully realized by breaking glide mirrors and magnetic mirrors of the (0001) crystallographic plane in CrSb films via crystal distortion. We establish a locking relationship between altermagnetic symmetry and the emergent Dzyaloshinskii-Moriya (DM) vectors in different CrSb films, realizing an unprecedented room-temperature spontaneous anomalous Hall effect in an altermagnetic metal. The concept of exchange-coupling torques is broadened to include both antiferromagnetic exchange-coupling torque and DM torque. Their relationship is designable, determining electrical manipulation modes, e.g., field-assisted switching for CrSb(1-100)/Pt and field-free switching for W/CrSb(11-20). In particular, unprecedented field-free 100-percent switching of Néel vectors is realized by making these two torques parallel or antiparallel, dependent on Néel vector orientation. Besides unravelling the rich mechanisms for electrical manipulation of altermagnetism rooted in the broadened concept of exchange-coupling torques, we list other material candidates and propose that crystal design of altermagnetism would bring rich designability to magnonics, topology, etc. Comment: 23 pages, 4 figures
Published: 2024

119. BG-HGNN: Toward Scalable and Efficient Heterogeneous Graph Neural Network

Author: Su, Junwei, Mao, Lingjun, and Wu, Chuan
Subjects: Computer Science - Machine Learning
Abstract: Many computer vision and machine learning problems are modelled as learning tasks on heterogeneous graphs, featuring a wide array of relations from diverse types of nodes and edges. Heterogeneous graph neural networks (HGNNs) stand out as a promising neural model class designed for heterogeneous graphs. Built on traditional GNNs, existing HGNNs employ different parameter spaces to model the varied relationships. However, the practical effectiveness of existing HGNNs is often limited to simple heterogeneous graphs with few relation types. This paper first highlights and demonstrates that the standard approach employed by existing HGNNs inevitably leads to parameter explosion and relation collapse, making HGNNs less effective or impractical for complex heterogeneous graphs with numerous relation types. To overcome this issue, we introduce a novel framework, Blend&Grind-HGNN (BG-HGNN), which effectively tackles the challenges by carefully integrating different relations into a unified feature space manageable by a single set of parameters. This results in a refined HGNN method that is more efficient and effective in learning from heterogeneous graphs, especially when the number of relations grows. Our empirical studies illustrate that BG-HGNN significantly surpasses existing HGNNs in terms of parameter efficiency (up to 28.96 $\times$), training throughput (up to 8.12 $\times$), and accuracy (up to 1.07 $\times$).
Published: 2024

120. Continual All-in-One Adverse Weather Removal with Knowledge Replay on a Unified Network Structure

Author: Cheng, De, Ji, Yanling, Gong, Dong, Li, Yan, Wang, Nannan, Han, Junwei, and Zhang, Dingwen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: In real-world applications, image degeneration caused by adverse weather is always complex and changes with different weather conditions from days and seasons. Systems in real-world environments constantly encounter adverse weather conditions that are not previously observed. Therefore, it practically requires adverse weather removal models to continually learn from incrementally collected data reflecting various degeneration types. Existing adverse weather removal approaches, for either single or multiple adverse weathers, are mainly designed for a static learning paradigm, which assumes that the data of all types of degenerations to handle can be finely collected at one time before a single-phase learning process. They thus cannot directly handle the incremental learning requirements. To address this issue, we made the earliest effort to investigate the continual all-in-one adverse weather removal task, in a setting closer to real-world applications. Specifically, we develop a novel continual learning framework with effective knowledge replay (KR) on a unified network structure. Equipped with a principal component projection and an effective knowledge distillation mechanism, the proposed KR techniques are tailored for the all-in-one weather removal task. It considers the characteristics of the image restoration task with multiple degenerations in continual learning, and the knowledge for different degenerations can be shared and accumulated in the unified network structure. Extensive experimental results demonstrate the effectiveness of the proposed method to deal with this challenging task, which performs competitively to existing dedicated or joint training image restoration methods. Our code is available at https://github.com/xiaojihh/CL_all-in-one.
Published: 2024

121. On the Topology Awareness and Generalization Performance of Graph Neural Networks

Author: Su, Junwei and Wu, Chuan
Subjects: Computer Science - Machine Learning
Abstract: Many computer vision and machine learning problems are modelled as learning tasks on graphs, where graph neural networks (GNNs) have emerged as a dominant tool for learning representations of graph-structured data. A key feature of GNNs is their use of graph structures as input, enabling them to exploit the graphs' inherent topological properties, known as the topology awareness of GNNs. Despite the empirical successes of GNNs, the influence of topology awareness on generalization performance remains unexplored, particularly for node-level tasks that diverge from the assumption of data being independent and identically distributed (IID). The precise definition and characterization of the topology awareness of GNNs, especially concerning different topological features, are still unclear. This paper introduces a comprehensive framework to characterize the topology awareness of GNNs across any topological feature. Using this framework, we investigate the effects of topology awareness on GNN generalization performance. Contrary to the prevailing belief that enhancing the topology awareness of GNNs is always advantageous, our analysis reveals a critical insight: improving the topology awareness of GNNs may inadvertently lead to unfair generalization across structural groups, which might not be desired in some scenarios. Additionally, we conduct a case study using the intrinsic graph metric, the shortest-path distance, on various benchmark datasets. The empirical results of this case study confirm our theoretical insights. Moreover, we demonstrate the practical applicability of our framework by using it to tackle the cold-start problem in graph active learning.
Published: 2024

122. ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling

Author: Zheng, Kangjie, Long, Siyu, Lu, Tianyu, Yang, Junwei, Dai, Xinyu, Zhang, Ming, Nie, Zaiqing, Ma, Wei-Ying, and Zhou, Hao
Subjects: Quantitative Biology - Biomolecules, Computer Science - Computational Engineering, Finance, and Science, Computer Science - Machine Learning
Abstract: Protein language models have demonstrated significant potential in the field of protein engineering. However, current protein language models primarily operate at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting the capabilities of protein language models for applications involving both proteins and small molecules. In this paper, we propose ESM-AA (ESM All-Atom), a novel approach that enables atom-scale and residue-scale unified molecular modeling. ESM-AA achieves this by pre-training on multi-scale code-switch protein sequences and utilizing a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ESM-AA surpasses previous methods in protein-molecule tasks, demonstrating the full utilization of protein language models. Further investigations reveal that through unified molecular modeling, ESM-AA not only gains molecular knowledge but also retains its understanding of proteins. The source codes of ESM-AA are publicly released at https://github.com/zhengkangjie/ESM-AA.
Comment: ICML2024 camera-ready, update some experimental results, add github url, fix some typos
Published: 2024

123. Coordinated optimization of control parameters for improving the stability of wind-PV hybrid power systems under improved pelican optimization algorithm

Author: Liu, Peng, Wu, Yuchao, Sun, Junwei, and Zhao, Junhong
Published: 2024
Full Text: View/download PDF

124. The Association Between Blood Lead Levels and Urgency Urinary Incontinence Among Adult Females: A Retrospective Study Based on NHANES 2005–2020

Author: Wang, Junwei, Zhang, Cunming, and Zhang, Aiwei
Published: 2024
Full Text: View/download PDF

125. Homogeneous permeation and oriented crystallization in nanostructured mesopores for efficient and stable printable mesoscopic perovskite solar cells

Author: Zhang, Guodong, Cheng, Yanjie, Niu, Tingting, Zheng, Ziwei, Li, Zongwei, Xiang, Junwei, Gao, Qiaojiao, Xia, Minghao, Guo, Lijuan, Liu, Yiming, Zhang, Mengru, Tao, Yiran, Ran, Xueqin, Li, Mingjie, Xing, Guichuan, Xia, Yingdong, Chao, Lingfeng, Mei, Anyi, Han, Hongwei, and Chen, Yonghua
Published: 2024
Full Text: View/download PDF

126. Morphological Enrichment and Environmental Factors Correlation of Heavy Metals in Dominant Plants in Typical Manganese Ore Areas in Guizhou, China

Author: Huang, Mingqin, Cheng, Junwei, Zeng, Boping, and Cai, Shenwen
Published: 2024
Full Text: View/download PDF

127. Exploring the Potential of MIM-Manufactured Porous NiTi as a Vascular Drug Delivery Material

Author: Zhou, Yang, Wang, Tun, Lu, Peng, Wan, Zicheng, He, Hao, Wang, Junwei, Li, Dongyang, Li, Yimin, and Shu, Chang
Published: 2024
Full Text: View/download PDF

128. Analysis of Large-Scale In Situ Shear Tests of Sandy Gravel with Cobbles

Author: Jin, Junwei, Jin, Qianqian, Li, Mingyu, Liu, Bo, Zhao, Shiyong, and Wei, Yanqing
Published: 2024
Full Text: View/download PDF

129. Progress in the study of anti-tumor effects and mechanisms of vitexin

Author: Yang, Qiming, Huan, Rui, Meng, Defeng, Qi, Junwei, and Xia, Lei
Published: 2024
Full Text: View/download PDF

130. Multiphysics simulation and optimization of the in-situ beneficiation process of ilmenite minerals from lunar regolith

Author: Zhang, Peng, Liu, Xin, Liu, Guanghui, Dai, Wei, Yang, Hanzhe, Zheng, Haibo, Wang, Zhi, Niu, Ran, Bai, Yifan, Zhang, Yang, Liu, Chengbao, Yang, Ge, Yang, Junwei, and Zhang, Guang
Published: 2024
Full Text: View/download PDF

131. Peach cultivar ‘DaHongPao’: a promising resource for gummosis disease resistance from laboratory and field investigations

Author: Zhang, Dongmei, Xiang, Shu, Chi, Mengmeng, Huang, Xue, Zhu, Kaijie, Li, Guohuai, and Liu, Junwei
Published: 2024
Full Text: View/download PDF

132. Preparation of magnetic framework composites Fe3O4@MIL-100(Fe) and adsorption of lead in water

Author: Du, Meiling, Zhang, Jiabao, Rong, Junwei, Peng, Tao, Chen, Yinjie, Ji, Yanqin, and Guan, Yueping
Published: 2024
Full Text: View/download PDF

133. MLGAT: multi-layer graph attention networks for multimodal emotion recognition in conversations

Author: Wu, Jun, Wu, Junwei, Zheng, Yu, Zhan, Pengfei, Han, Min, Zuo, Gan, and Yang, Li
Published: 2024
Full Text: View/download PDF

134. Characteristics of Solute Transport Continuously Released from Coastal Unconfined Aquifers under the Tidal Action Based on Laboratory Experiment

Author: Guo, Min, Wan, Junwei, and Huang, Kun
Published: 2024
Full Text: View/download PDF

135. Poisoning medical knowledge using large language models

Author: Yang, Junwei, Xu, Hanwen, Mirzoyan, Srbuhi, Chen, Tong, Liu, Zixuan, Liu, Zequn, Ju, Wei, Liu, Luchen, Xiao, Zhiping, Zhang, Ming, and Wang, Sheng
Published: 2024
Full Text: View/download PDF

136. Even-integer quantum Hall effect in an oxide caused by a hidden Rashba effect

Author: Wang, Jingyue, Huang, Junwei, Kaplan, Daniel, Zhou, Xuehan, Tan, Congwei, Zhang, Jing, Jin, Gangjian, Cong, Xuzhong, Zhu, Yongchao, Gao, Xiaoyin, Liang, Yan, Zuo, Huakun, Zhu, Zengwei, Zhu, Ruixue, Stern, Ady, Liu, Hongtao, Gao, Peng, Yan, Binghai, Yuan, Hongtao, and Peng, Hailin
Published: 2024
Full Text: View/download PDF

137. Size Effect on Pore-Scale Variables and Heterogeneous Pore-Network Characteristics in Carbonate Rocks

Author: Shou, Yundong, Zhao, Zhi, Zhou, Xiaoping, and Chen, Junwei
Published: 2024
Full Text: View/download PDF

138. Nocturnal urination is associated with the presence of higher ventilatory chemosensitivity in patients with obstructive sleep apnea

Author: Dai, Lu, Guo, Junwei, Wang, Xiaona, Luo, Jinmei, Huang, Rong, and Xiao, Yi
Published: 2024
Full Text: View/download PDF

139. M-RRFS: A Memory-Based Robust Region Feature Synthesizer for Zero-Shot Object Detection

Author: Huang, Peiliang, Zhang, Dingwen, Cheng, De, Han, Longfei, Zhu, Pengfei, and Han, Junwei
Published: 2024
Full Text: View/download PDF

140. Experimental Investigation of Freezing Front Detection Behind Shield Tunnel Segments Using Ground-Penetrating Radar

Author: Yu, Xinhao, Gao, Wei, Li, Fangzheng, Yang, Diansen, Ding, Hang, Zhang, Jiwei, Wang, Lei, and Xu, Junwei
Published: 2024
Full Text: View/download PDF

141. Biogenic Selenium Nanoparticles Synthesized by L. brevis 23017 Enhance Aluminum Adjuvanticity and Make Up for its Disadvantage in Mice

Author: Zhang, Zheng, De, Xinqi, Sun, Weijiao, Liu, Runhang, Li, Yifan, Yang, Zaixing, Liu, Ning, Wu, Jingyi, Miao, Yaxin, Wang, Jiaqi, Wang, Fang, and Ge, Junwei
Published: 2024
Full Text: View/download PDF

142. SMYD2 Imparts Gemcitabine Resistance to Pancreatic Adenocarcinoma Cells by Upregulating EVI2A

Author: Jin, Lei, Qian, Daohai, Tang, Xiaolei, Huang, Yong, Zou, Junwei, and Wu, Zhaoying
Published: 2024
Full Text: View/download PDF

143. A novel surface temperature sensor and random forest-based welding quality prediction model

Author: Wang, Shugui, Cui, Yunxian, Song, Yuxin, Ding, Chenggang, Ding, Wanyu, and Yin, Junwei
Published: 2024
Full Text: View/download PDF

144. Positive leadership and employees’ pro-environmental behavior: a meta-analysis

Author: Zhang, Yajun, Duan, Chenglong, Zhang, Junwei, and Akhtar, Muhammad Naseer
Published: 2024
Full Text: View/download PDF

145. A memristor-based circuit design of avoidance learning with time delay and its application

Author: Sun, Junwei, Wang, Haojie, Xu, Yuanpeng, Liu, Peng, and Wang, Yanfeng
Published: 2024
Full Text: View/download PDF

146. Bayesian ensemble learning and Shapley additive explanations for fast estimation of slope stability with a physics-informed database

Author: Lei, Dongze, Ma, Junwei, Zhang, Guangcheng, Wang, Yankun, Deng, Xin, and Liu, Jiayu
Published: 2024
Full Text: View/download PDF

147. Study on the prognostic model for esophageal cancer survival based on blood indicators and probabilistic membrane system

Author: Wang, Yanfeng, Liu, Huaiyang, Li, Housheng, Jiang, Suxia, and Sun, Junwei
Published: 2024
Full Text: View/download PDF

148. Differences in reservoir quality within distributary channel belts at the braided river delta front in the Xinchang area of Western Sichuan’s 2nd member of the Xujiahe Formation: genesis and implications

Author: Zhao, Junwei, Zhang, Ling, Chen, Gongyang, Tian, Lei, Zheng, Xiaoli, and Wang, Heng
Published: 2024
Full Text: View/download PDF

149. Attributed network embedding model for exposing COVID-19 spread trajectory archetypes

Author: Ma, Junwei, Li, Bo, Li, Qingchun, Fan, Chao, and Mostafavi, Ali
Published: 2024
Full Text: View/download PDF

150. WeakCLIP: Adapting CLIP for Weakly-Supervised Semantic Segmentation

Author: Zhu, Lianghui, Wang, Xinggang, Feng, Jiapei, Cheng, Tianheng, Li, Yingyue, Jiang, Bo, Zhang, Dingwen, and Han, Junwei
Published: 2024
Full Text: View/download PDF
