44,626 results for "Chen, Liang"
Search Results
2. Tales of Hope, Tastes of Bitterness: Chinese Road Builders in Ethiopia by Miriam Driessen (review)
- Author
Chen, Liang
- Published
- 2022
3. The Faraday rotation measure of the M87 jet at 3.5mm with the Atacama Large Millimeter/submillimeter Array
- Author
Peng, Sijia, Lu, Ru-Sen, Goddi, Ciriaco, Krichbaum, Thomas P., Li, Zhiyuan, Liu, Ruo-Yu, Kim, Jae-Young, Nakamura, Masanori, Yuan, Feng, Chen, Liang, Marti-Vidal, Ivan, and Shen, Zhiqiang
- Subjects
Astrophysics - Astrophysics of Galaxies, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
Faraday rotation is an important probe of the magnetic fields and magnetized plasma around active galactic nuclei (AGN) jets. We present a Faraday rotation measure image of the M87 jet between 85.2 GHz and 101.3 GHz with a resolution of ~2" with the Atacama Large Millimeter/submillimeter Array (ALMA). We found that the rotation measure (RM) of the M87 core is $\rm (4.5\pm 0.4)\times10^{4}\ rad\ m^{-2}$ with a low linear polarization fraction of $\rm (0.88\pm 0.08)\%$. The spatial RM gradient in the M87 jet spans a wide range from $\sim -2\times10^4\rm~rad\ m^{-2}$ to $\sim 3\times10^4\rm~rad\ m^{-2}$ with a typical uncertainty of $0.3\times10^4\rm~rad\ m^{-2}$. A comparison with previous RM measurements of the core suggests that the Faraday rotation of the core may originate very close to the supermassive black hole (SMBH). Both an internal origin and an external screen with a rapidly varying emitting source could be possible. As for the jet, the RM gradient indicates a helical configuration of the magnetic field that persists up to kpc scale. Combined with the kpc-scale RM measurements at lower frequencies, we found that RM is frequency-dependent in the jet. One possible scenario to explain this dependence is that the kpc-scale jet has a trumpet-like shape and the jet coil unwinds near its end., Comment: 11 pages, 5 figures. Accepted for publication in ApJ. Comments are welcome
- Published
- 2024
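For reference, the rotation measure (RM) quoted in the abstract above is the standard coefficient relating the observed polarization angle $\chi$ to the squared wavelength; for an external Faraday screen it integrates the thermal electron density $n_e$ (in cm$^{-3}$) and the line-of-sight magnetic field $B_\parallel$ (in $\mu$G) along the path length (in pc):

```latex
\chi(\lambda) = \chi_0 + \mathrm{RM}\,\lambda^{2},
\qquad
\mathrm{RM} \approx 0.81 \int n_e\, B_{\parallel}\, \mathrm{d}l \ \ \mathrm{rad\ m^{-2}}
```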
4. Proto-OOD: Enhancing OOD Object Detection with Prototype Feature Similarity
- Author
Chen, Junkun, Mei, Jilin, Chen, Liang, Zhao, Fangzhou, and Hu, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
The limited training samples for object detectors commonly result in low accuracy in out-of-distribution (OOD) object detection. We have observed that feature vectors of the same class tend to cluster tightly in feature space, whereas those of different classes are more scattered. This insight motivates us to leverage feature similarity for OOD detection. Drawing on the concept of prototypes prevalent in few-shot learning, we introduce a novel network architecture, Proto-OOD, designed for this purpose. Proto-OOD enhances prototype representativeness through contrastive loss and identifies OOD data by assessing the similarity between input features and prototypes. It employs a negative embedding generator to create negative embeddings, which are then used to train the similarity module. Proto-OOD achieves a significantly lower FPR95 on the MS-COCO dataset and a higher mAP on the Pascal VOC dataset when utilizing Pascal VOC as the ID dataset and MS-COCO as the OOD dataset. Additionally, we identify limitations in existing evaluation metrics and propose an enhanced evaluation protocol., Comment: 14 pages
- Published
- 2024
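As a rough illustration of the prototype-similarity idea described in the abstract above (a minimal sketch, not the authors' implementation; the function names, mean-vector prototypes, and the 1 − max-cosine-similarity score are all assumptions):

```python
import numpy as np

def build_prototypes(features, labels):
    """Average the feature vectors of each in-distribution (ID) class."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def ood_score(x, prototypes):
    """Higher score = less similar to every class prototype = more likely OOD."""
    sims = [
        np.dot(x, p) / (np.linalg.norm(x) * np.linalg.norm(p))
        for p in prototypes.values()
    ]
    return 1.0 - max(sims)

# Tight ID clusters give near-zero scores; a vector pointing away from all
# prototypes scores high and can be thresholded as OOD.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
protos = build_prototypes(features, labels)
id_score = ood_score(np.array([1.0, 0.05]), protos)    # close to class 0
far_score = ood_score(np.array([-1.0, -1.0]), protos)  # far from both classes
```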
5. Towards a Unified View of Preference Learning for Large Language Models: A Survey
- Author
Gao, Bofei, Song, Feifan, Miao, Yibo, Cai, Zefan, Yang, Zhe, Chen, Liang, Hu, Helan, Xu, Runxin, Dong, Qingxiu, Zheng, Ce, Xiao, Wen, Zhang, Ge, Zan, Daoguang, Lu, Keming, Yu, Bowen, Liu, Dayiheng, Cui, Zeyu, Yang, Jian, Sha, Lei, Wang, Houfeng, Sui, Zhifang, Wang, Peiyi, Liu, Tianyu, and Chang, Baobao
- Subjects
Computer Science - Computation and Language
- Abstract
Large Language Models (LLMs) exhibit remarkably powerful capabilities. A crucial factor in achieving this success is aligning the LLM's output with human preferences. This alignment process often requires only a small amount of data to efficiently enhance the LLM's performance. While effective, research in this area spans multiple domains, and the methods involved are relatively complex to understand. The relationships between different methods have been under-explored, limiting the development of preference alignment. In light of this, we break down the existing popular alignment strategies into different components and provide a unified framework to study the current alignment strategies, thereby establishing connections among them. In this survey, we decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm. This unified view offers an in-depth understanding of existing alignment algorithms and also opens up possibilities to synergize the strengths of different strategies. Furthermore, we present detailed working examples of prevalent existing algorithms to facilitate a comprehensive understanding for the readers. Finally, based on our unified perspective, we explore the challenges and future research directions for aligning large language models with human preferences., Comment: 23 pages, 6 figures
- Published
- 2024
6. TeFF: Tracking-enhanced Forgetting-free Few-shot 3D LiDAR Semantic Segmentation
- Author
Zhou, Junbao, Mei, Jilin, Wu, Pengze, Chen, Liang, Zhao, Fangzhou, Zhao, Xijun, and Hu, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
- Abstract
In autonomous driving, 3D LiDAR plays a crucial role in understanding the vehicle's surroundings. However, newly emerging, unannotated objects present a few-shot learning problem for semantic segmentation. This paper addresses the limitations of current few-shot semantic segmentation methods by exploiting the temporal continuity of LiDAR data. Employing a tracking model to generate pseudo-ground-truths from a sequence of LiDAR frames, our method significantly augments the dataset, enhancing the model's ability to learn on novel classes. However, this approach introduces a data imbalance biased toward novel data, which presents a new challenge of catastrophic forgetting. To mitigate this, we incorporate LoRA, a technique that reduces the number of trainable parameters, thereby preserving the model's performance on base classes while improving its adaptability to novel classes. This work represents a significant step forward in few-shot 3D LiDAR semantic segmentation for autonomous driving. Our code is available at https://github.com/junbao-zhou/Track-no-forgetting.
- Published
- 2024
7. Equivalent Characterizations of the Aubin Property for Nonlinear Semidefinite Programming
- Author
Chen, Liang, Chen, Ruoning, Sun, Defeng, and Zhang, Liping
- Subjects
Mathematics - Optimization and Control, 49J53, 90C22, 90C31, 90C46
- Abstract
In this paper, we study the Aubin property of the Karush-Kuhn-Tucker solution mapping for the nonlinear semidefinite programming (NLSDP) problem at a locally optimal solution. In the literature, it is known that the Aubin property implies the constraint nondegeneracy by Fusek [SIAM J. Optim. 23 (2013), pp. 1041-1061] and the second-order sufficient condition by Ding et al. [SIAM J. Optim. 27 (2017), pp. 67-90]. Based on the Mordukhovich criterion, here we further prove that the strong second-order sufficient condition is also necessary for the Aubin property to hold. Consequently, several equivalent conditions including the strong regularity are established for NLSDP's Aubin property. Together with the recent progress made by Chen et al. on the equivalence between the Aubin property and the strong regularity for nonlinear second-order cone programming [arXiv:2406.13798v1 (2024)], this paper constitutes a significant step forward in characterizing the Aubin property for general non-polyhedral $C^2$-cone reducible constrained optimization problems.
- Published
- 2024
8. Magnetic Field of the Quasar 1604+159 from Parsec to Kilo-parsec Scale
- Author
Hu, Xu-Zhi, Hong, Xiaoyu, Zhao, Wei, Chen, Liang, Wang, Wei-Yang, and Wu, Linhui
- Subjects
Astrophysics - Astrophysics of Galaxies, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
We present a multi-frequency polarimetric study of the quasar 1604+159. The source was observed at the $L$ band with the American Very Long Baseline Array (VLBA) and the $L$, $X$, and $U$ bands with the Very Large Array (VLA). These observations provide different resolutions from mas to arcsec, enabling us to probe the morphology and magnetic field from tens of parsec to hundreds of kilo-parsec scale. We detect a symmetrical Fanaroff-Riley-Class-I-like structure. The source has several lobes and bulges, forming a cocoon shape. The polarization is normal to the edges of the structure, with high fractional polarization up to $\sim 60\%$. Two hotspots are observed at the eastern and western sides of the source, located symmetrically relative to the core. The flux density ratio ($>1.5$) between the two hotspots suggests the Doppler beaming effect exists at a large scale. The polarized emission in the hotspots also shows a symmetrical structure, oblique to the jet direction. In general, the jet propagates in a collimating structure with several bends. Polarization is also detected perpendicular to the local jet from $\sim$100 mas to $\sim$1 arcsec. The jet shows strong polarized intensity and high fractional polarization at the bending edges. We discuss the possible origins of the observed structure and magnetic field., Comment: 17 pages, accepted for publication in ApJ
- Published
- 2024
9. Two-Component gamma-ray Emission Spectrum and X-Ray Polarization of the Radio Galaxy Pictor A
- Author
Li, Jia-Xuan, Hu, Xin-Ke, Lian, Ji-Shun, Yu, Yu-Wei, Deng, Wei, Liu, Kuan, Zhang, Hai-Ming, Chen, Liang, and Zhang, Jin
- Subjects
Astrophysics - High Energy Astrophysical Phenomena
- Abstract
Pictor A is a $\gamma$-ray emitting radio galaxy and has a bright hotspot called WHS, located $\sim$4 arcmin away from the nucleus. In this letter, we present an analysis of its 16-year Fermi-LAT data and report the first Imaging X-ray Polarimetry Explorer (IXPE) observation for this source. Our analysis of the Fermi-LAT observations reveals evidence of two components in the average $\gamma$-ray spectrum of Pictor A, exhibiting a statistically significant hardening from $\Gamma^1_{\gamma}=3.25\pm0.15$ to $\Gamma^2_{\gamma}=1.81\pm0.07$ at a break energy of $2.46\pm0.09$ GeV. Evident $\gamma$-ray variability is observed in Pictor A. Interestingly, the variability is dominated by the component below the break energy, while the component above the break energy shows no variability. Furthermore, we find that a power-law function can adequately fit the spectrum during high-flux states, whereas a broken power-law is still required to explain the spectrum during the low-flux state. We suggest that the low-energy component originates from the nucleus, while the high-energy component primarily stems from WHS. The broadband spectral energy distributions of both the nucleus and WHS can be well represented by a simple leptonic model, with both $\gamma$-ray components attributed to the synchrotron-self-Compton (SSC) process. The analysis of IXPE data on the nucleus yields an upper limit to the polarization degree $\Pi_{\rm X}<$8.9\% in the 2--8 keV band, agreeing with its X-ray emission originating from SSC. However, $\Pi_{\rm X}=23.5\%\pm5.6\%$ is observed at a confidence level of $>99\%$ in the 5--7 keV band, and the possible physical origin of this narrow-energy-band polarization signal is discussed., Comment: 14 pages, 4 figures, 3 tables, submitted, comments are welcome
- Published
- 2024
10. L^2CL: Embarrassingly Simple Layer-to-Layer Contrastive Learning for Graph Collaborative Filtering
- Author
Jin, Xinzhou, Li, Jintang, Chen, Liang, Yu, Chenyun, Xie, Yuanzhen, Xie, Tao, Zhuo, Chengxiang, Li, Zang, and Zheng, Zibin
- Subjects
Computer Science - Information Retrieval, Computer Science - Machine Learning
- Abstract
Graph neural networks (GNNs) have recently emerged as an effective approach to model neighborhood signals in collaborative filtering. Along this research line, graph contrastive learning (GCL) demonstrates robust capabilities to address the supervision label shortage issue by generating massive self-supervised signals. Despite its effectiveness, GCL for recommendation suffers seriously from two main challenges: i) GCL relies on graph augmentation to generate semantically different views for contrasting, which could potentially disrupt key information and introduce unwanted noise; ii) current works for GCL primarily focus on contrasting representations using sophisticated network architectures (usually deep) to capture high-order interactions, which leads to increased computational complexity and suboptimal training efficiency. To this end, we propose L2CL, a principled Layer-to-Layer Contrastive Learning framework that contrasts representations from different layers. By aligning the semantic similarities between different layers, L2CL enables the learning of complex structural relationships and gets rid of the noise perturbation in stochastic data augmentation. Surprisingly, we find that L2CL, using only a one-hop contrastive learning paradigm, is able to capture intrinsic semantic structures and improve the quality of node representations, leading to a simple yet effective architecture. We also provide theoretical guarantees for L2CL in minimizing task-irrelevant information. Extensive experiments on five real-world datasets demonstrate the superiority of our model over various state-of-the-art collaborative filtering methods. Our code is available at https://github.com/downeykking/L2CL.
- Published
- 2024
11. Implementable Semismooth* Newton Methods for Generalized Equations are G-Semismooth Newton Methods
- Author
Chen, Liang, Sun, Defeng, and Zhang, Wangyongquan
- Subjects
Mathematics - Optimization and Control, 49J52, 49J53, 90C31, 90C33, 49M15
- Abstract
Semismooth* Newton methods have been proposed in recent years targeting multi-valued inclusion problems and have been successfully implemented to deal with several concrete generalized equations. In this paper, we show that these executable implementations are exactly the applications of G-semismooth Newton methods for solving nonsmooth equations localized from these generalized equations. This new understanding expands the breadth of G-semismooth Newton methods in theory, and more importantly, facilitates the design and implementation of practical Newton-type algorithms for solving generalized equations.
- Published
- 2024
12. PID: Physics-Informed Diffusion Model for Infrared Image Generation
- Author
Mao, Fangyuan, Mei, Jilin, Lu, Shun, Liu, Fuyang, Chen, Liang, Zhao, Fangzhou, and Hu, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Infrared imaging technology has gained significant attention for its reliable sensing ability in low-visibility conditions, prompting many studies to convert abundant RGB images to infrared images. However, most existing image translation methods treat infrared images as a stylistic variation, neglecting the underlying physical laws, which limits their practical application. To address these issues, we propose a Physics-Informed Diffusion (PID) model for translating RGB images to infrared images that adhere to physical laws. Our method leverages the iterative optimization of the diffusion model and incorporates strong physical constraints based on prior knowledge of infrared laws during training. This approach enhances the similarity between translated infrared images and the real infrared domain without adding extra training parameters. Experimental results demonstrate that PID significantly outperforms existing state-of-the-art methods. Our code is available at https://github.com/fangyuanmao/PID.
- Published
- 2024
13. UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
- Author
Zhao, Haozhe, Ma, Xiaojian, Chen, Liang, Si, Shuzheng, Wu, Rujie, An, Kaikai, Yu, Peiyu, Zhang, Minjia, Li, Qing, and Chang, Baobao
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct advantages: 1) It features a broader range of editing instructions by leveraging the creativity of large language models (LLMs) alongside in-context editing examples from human raters; 2) Its data sources are based on real images, including photographs and artworks, which provide greater diversity and reduced bias compared to datasets solely generated by text-to-image models; 3) It also supports region-based editing, enhanced by high-quality, automatically produced region annotations. Our experiments show that canonical diffusion-based editing baselines trained on UltraEdit set new records on MagicBrush and Emu-Edit benchmarks. Our analysis further confirms the crucial role of real image anchors and region-based editing data. The dataset, code, and models can be found in https://ultra-editing.github.io., Comment: 32 pages, 14 figures
- Published
- 2024
14. Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size
- Author
Lin, Lin, Zhao, Jingrong, Kubota, Naoto, Li, Zhelin, Lam, Yi-Li, Nguyen, Lauren P, Yang, Lu, Pokharel, Sheela P, Blue, Steven M, Yee, Brian A, Chen, Renee, Yeo, Gene W, Chen, Chun-Wei, Chen, Liang, and Zheng, Sika
- Subjects
Biomedical and Clinical Sciences, Neurosciences, Brain Disorders, Stem Cell Research - Nonembryonic - Non-Human, Stem Cell Research, Rare Diseases, Pediatric, Stem Cell Research - Embryonic - Non-Human, Congenital Structural Anomalies, Genetics, 1.1 Normal biological development and functioning, 2.1 Biological and endogenous factors, Neurological, Animals, Tumor Suppressor Protein p53, Mice, Brain, Mice, Knockout, Neural Stem Cells, Nonsense Mediated mRNA Decay, Epistasis, Genetic, Microcephaly, Cell Cycle, Cyclin-Dependent Kinase Inhibitor p21, RNA-Binding Proteins, EJC, PAX6, TBR2, Upf1, Upf3a, Upf3b, cell division, neurogenesis, p21, p53, progenitor cell competence, Psychology, Cognitive Sciences, Neurology & Neurosurgery, Biological psychology
- Abstract
Mutations in human nonsense-mediated mRNA decay (NMD) factors are enriched in neurodevelopmental disorders. We show that deletion of key NMD factor Upf2 in mouse embryonic neural progenitor cells causes perinatal microcephaly but deletion in immature neurons does not, indicating NMD's critical roles in progenitors. Upf2 knockout (KO) prolongs the cell cycle of radial glia progenitor cells, promotes their transition into intermediate progenitors, and leads to reduced upper-layer neurons. CRISPRi screening identified Trp53 knockdown rescuing Upf2KO progenitors without globally reversing NMD inhibition, implying marginal contributions of most NMD targets to the cell cycle defect. Integrated functional genomics shows that NMD degrades selective TRP53 downstream targets, including Cdkn1a, which, without NMD suppression, slow the cell cycle. Trp53KO restores the progenitor cell pool and rescues the microcephaly of Upf2KO mice. Therefore, one physiological role of NMD in the developing brain is to degrade selective TRP53 targets to control progenitor cell cycle and brain size.
- Published
- 2024
15. MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
- Author
Huang, Jinsheng, Chen, Liang, Guo, Taian, Zeng, Fu, Zhao, Yusheng, Wu, Bohan, Yuan, Ye, Zhao, Haozhe, Guo, Zhihui, Zhang, Yichi, Yuan, Jingyang, Ju, Wei, Liu, Luchen, Liu, Tianyu, Chang, Baobao, and Zhang, Ming
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Large Multimodal Models (LMMs) exhibit impressive cross-modal understanding and reasoning abilities, often assessed through multiple-choice questions (MCQs) that include an image, a question, and several options. However, many benchmarks used for such evaluations suffer from systematic biases. Remarkably, Large Language Models (LLMs) without any visual perception capabilities achieve non-trivial performance, undermining the credibility of these evaluations. To address this issue while maintaining the efficiency of MCQ evaluations, we propose MMEvalPro, a benchmark designed to avoid Type-I errors through a trilogy evaluation pipeline and more rigorous metrics. For each original question from existing benchmarks, human annotators augment it by creating one perception question and one knowledge anchor question through a meticulous annotation process. MMEvalPro comprises $2,138$ question triplets, totaling $6,414$ distinct questions. Two-thirds of these questions are manually labeled by human experts, while the rest are sourced from existing benchmarks (MMMU, ScienceQA, and MathVista). Compared with the existing benchmarks, our experiments with the latest LLMs and LMMs demonstrate that MMEvalPro is more challenging (the best LMM lags behind human performance by $31.73\%$, compared to an average gap of $8.03\%$ in previous benchmarks) and more trustworthy (the best LLM trails the best LMM by $23.09\%$, whereas the gap for previous benchmarks is just $14.64\%$). Our in-depth analysis explains the reason for the large performance gap and justifies the trustworthiness of evaluation, underscoring its significant potential for advancing future research., Comment: 21 pages, code released at https://github.com/chenllliang/MMEvalPro, Homepage at https://mmevalpro.github.io/
- Published
- 2024
16. A Random Integration Algorithm for High-dimensional Function Spaces
- Author
Chen, Liang, Xu, Minqiang, and Zhang, Haizhang
- Subjects
Mathematics - Numerical Analysis
- Abstract
We introduce a novel random integration algorithm that boasts both high convergence order and polynomial tractability for functions characterized by sparse frequencies or rapidly decaying Fourier coefficients. Specifically, for integration in the periodic isotropic Sobolev space and the isotropic Sobolev space with compact support, our approach attains a nearly optimal root mean square error (RMSE) bound. In contrast to previous nearly optimal algorithms, our method exhibits polynomial tractability, ensuring that the number of samples does not scale exponentially with increasing dimension. Our integration algorithm also enjoys a nearly optimal bound for the weighted Korobov space. Furthermore, the algorithm can be applied without prior knowledge of the weights, distinguishing it from the component-by-component algorithm. For integration in the Wiener algebra, the sample complexity of our algorithm is independent of the decay rate of the Fourier coefficients. The effectiveness of the integration is confirmed through numerical experiments.
- Published
- 2024
17. Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective
- Author
Liu, Yunfei, Li, Jintang, Chen, Yuehe, Wu, Ruofan, Wang, Ericbk, Zhou, Jing, Tian, Sheng, Shen, Shuheng, Fu, Xing, Meng, Changhua, Wang, Weiqiang, and Chen, Liang
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Graph clustering, a fundamental and challenging task in graph mining, aims to classify the nodes of a graph into several disjoint clusters. In recent years, graph contrastive learning (GCL) has emerged as a dominant line of research in graph clustering and advances the new state-of-the-art. However, GCL-based methods heavily rely on graph augmentations and contrastive schemes, which may potentially introduce challenges such as semantic drift and scalability issues. Another promising line of research involves the adoption of modularity maximization, a popular and effective measure for community detection, as the guiding principle for clustering tasks. Despite the recent progress, the underlying mechanism of modularity maximization is still not well understood. In this work, we dig into the hidden success of modularity maximization for graph clustering. Our analysis reveals strong connections between modularity maximization and graph contrastive learning, where positive and negative examples are naturally defined by modularity. In light of our results, we propose a community-aware graph clustering framework, coined MAGI, which leverages modularity maximization as a contrastive pretext task to effectively uncover the underlying information of communities in graphs while avoiding the problem of semantic drift. Extensive experiments on multiple graph datasets verify the effectiveness of MAGI in terms of scalability and clustering performance compared to state-of-the-art graph clustering methods. Notably, MAGI easily scales to a graph with 100M nodes while outperforming strong baselines., Comment: KDD 2024 research track. Code available at https://github.com/EdisonLeeeee/MAGI
- Published
- 2024
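As background for the modularity-maximization objective the abstract above builds on, here is a minimal NumPy sketch of Newman's modularity Q for a hard node partition (illustrative only; this is not the MAGI implementation):

```python
import numpy as np

def modularity(A, communities):
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)."""
    k = A.sum(axis=1)                      # node degrees
    two_m = k.sum()                        # twice the edge count
    c = np.asarray(communities)
    same = c[:, None] == c[None, :]        # delta(c_i, c_j)
    B = A - np.outer(k, k) / two_m         # modularity matrix
    return float((B * same).sum() / two_m)

# Two triangles joined by a single bridge edge: the natural two-community
# partition scores Q = 5/14 (~0.357), well above zero.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
q = modularity(A, [0, 0, 0, 1, 1, 1])
```

In the contrastive reading described by the abstract, node pairs with a positive entry of the modularity matrix B play the role of positive examples and negative entries the role of negatives.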
18. Aubin Property and Strong Regularity Are Equivalent for Nonlinear Second-Order Cone Programming
- Author
Chen, Liang, Chen, Ruoning, Sun, Defeng, and Zhu, Junyuan
- Subjects
Mathematics - Optimization and Control, 90C, 90C31, 90C46
- Abstract
This paper solves a fundamental open problem in variational analysis on the equivalence between the Aubin property and the strong regularity for nonlinear second-order cone programming (SOCP) at a locally optimal solution. We achieve this by introducing a reduction approach to the Aubin property characterized by the Mordukhovich criterion and a lemma of alternative choices on cones to replace the S-lemma used in Outrata and Ram\'irez [SIAM J. Optim. 21 (2011) 789-823] and Opazo, Outrata, and Ram\'irez [SIAM J. Optim. 27 (2017) 2141-2151], where the same SOCP was considered under the strict complementarity condition except for possibly only one block of constraints. As a byproduct, we also offer a new approach to the well-known result of Dontchev and Rockafellar [SIAM J. Optim. 6 (1996) 1087-1105] on the equivalence of the two concepts in conventional nonlinear programming.
- Published
- 2024
19. One Fits All: Learning Fair Graph Neural Networks for Various Sensitive Attributes
- Author
Zhu, Yuchang, Li, Jintang, Bian, Yatao, Zheng, Zibin, and Chen, Liang
- Subjects
Computer Science - Machine Learning
- Abstract
Recent studies have highlighted fairness issues in Graph Neural Networks (GNNs), where they produce discriminatory predictions against specific protected groups categorized by sensitive attributes such as race and age. While various efforts to enhance GNN fairness have made significant progress, these approaches are often tailored to specific sensitive attributes. Consequently, they necessitate retraining the model from scratch to accommodate changes in the sensitive attribute requirement, resulting in high computational costs. To gain deeper insights into this issue, we approach the graph fairness problem from a causal modeling perspective, where we identify the confounding effect induced by the sensitive attribute as the underlying reason. Motivated by this observation, we formulate the fairness problem in graphs from an invariant learning perspective, which aims to learn invariant representations across environments. Accordingly, we propose a graph fairness framework based on invariant learning, namely FairINV, which enables the training of fair GNNs to accommodate various sensitive attributes within a single training session. Specifically, FairINV incorporates sensitive attribute partition and trains fair GNNs by eliminating spurious correlations between the label and various sensitive attributes. Experimental results on several real-world datasets demonstrate that FairINV significantly outperforms state-of-the-art fairness approaches, underscoring its effectiveness. Our code is available via: https://github.com/ZzoomD/FairINV/., Comment: Accepted by KDD 2024
- Published
- 2024
20. Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
- Author
Liu, Qihao, Zeng, Zhanpeng, He, Ju, Yu, Qihang, Shen, Xiaohui, and Chen, Liang-Chieh
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation. While conventional approaches rely on convolutional U-Net architectures, recent Transformer-based designs have demonstrated superior performance and scalability. However, Transformer architectures, which tokenize input data (via "patchification"), face a trade-off between visual fidelity and computational complexity due to the quadratic nature of self-attention operations concerning token length. While larger patch sizes enable attention computation efficiency, they struggle to capture fine-grained visual details, leading to image distortions. To address this challenge, we propose augmenting the Diffusion model with the Multi-Resolution network (DiMR), a framework that refines features across multiple resolutions, progressively enhancing detail from low to high resolution. Additionally, we introduce Time-Dependent Layer Normalization (TD-LN), a parameter-efficient approach that incorporates time-dependent parameters into layer normalization to inject time information and achieve superior performance. Our method's efficacy is demonstrated on the class-conditional ImageNet generation benchmark, where DiMR-XL variants outperform prior diffusion models, setting new state-of-the-art FID scores of 1.70 on ImageNet 256 x 256 and 2.89 on ImageNet 512 x 512. Project page: https://qihao067.github.io/projects/DiMR, Comment: Introducing DiMR, a new diffusion backbone that surpasses all existing image generation models of various sizes on ImageNet 256 with only 505M parameters. Project page: https://qihao067.github.io/projects/DiMR
- Published
- 2024
21. Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations
- Author
Cao, Zhen, Aharonian, F., An, Q., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, della Volpe, D., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Heller, M., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Saiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. 
B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,High Energy Physics - Phenomenology - Abstract
In this work we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma-ray emission produced by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, since they have low astrophysical $\gamma$-ray backgrounds while containing large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. Constraints on the lifetime of dark matter in the decay mode are also derived., Comment: 17 pages, 12 figures, accepted by PRL
- Published
- 2024
22. An Image is Worth 32 Tokens for Reconstruction and Generation
- Author
-
Yu, Qihang, Weber, Mark, Deng, Xueqing, Shen, Xiaohui, Cremers, Daniel, and Chen, Liang-Chieh
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent advancements in generative models have highlighted the crucial role of image tokenization in the efficient synthesis of high-resolution images. Tokenization, which transforms images into latent representations, reduces computational demands compared to directly processing pixels and enhances the effectiveness and efficiency of the generation process. Prior methods, such as VQGAN, typically utilize 2D latent grids with fixed downsampling factors. However, these 2D tokenizations face challenges in managing the inherent redundancies present in images, where adjacent regions frequently display similarities. To overcome this issue, we introduce the Transformer-based 1-Dimensional Tokenizer (TiTok), an innovative approach that tokenizes images into 1D latent sequences. TiTok provides a more compact latent representation, yielding substantially more efficient and effective representations than conventional techniques. For example, a 256 x 256 x 3 image can be reduced to just 32 discrete tokens, a significant reduction from the 256 or 1024 tokens obtained by prior methods. Despite its compact nature, TiTok achieves performance competitive with state-of-the-art approaches. Specifically, using the same generator framework, TiTok attains 1.97 gFID, significantly outperforming the MaskGIT baseline by 4.21 on the ImageNet 256 x 256 benchmark. The advantages of TiTok become even more significant at higher resolutions. On the ImageNet 512 x 512 benchmark, TiTok not only outperforms the state-of-the-art diffusion model DiT-XL/2 (gFID 2.74 vs. 3.04), but also reduces the number of image tokens by 64x, leading to a 410x faster generation process. Our best-performing variant can significantly surpass DiT-XL/2 (gFID 2.13 vs. 3.04) while still generating high-quality samples 74x faster., Comment: A compact 1D Image Tokenization method, leading to SOTA generation performance while being substantially faster. Project page at https://yucornetto.github.io/projects/titok.html
- Published
- 2024
23. A $\gamma$-Ray Emitting Blazar at Redshift 3.64: Fermi-LAT and OVRO Observations of PKS 0201+113
- Author
-
Lei, Hai, Zhang, Ying-Kang, Jiang, Xiong, Kiehlmann, Sebastian, Readhead, Anthony C. S., Chen, Liang, Liao, Neng-Hui, and An, Tao
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
High-redshift ($z >3$) $\gamma$-ray blazars are rare, but they are crucial for our understanding of jet evolution, $\gamma$-ray production and propagation, and the growth of supermassive black holes in the early universe. A new analysis of Fermi-LAT data reveals a significant (5$\sigma$), spectrally soft ($\Gamma \simeq$ 3.0) $\gamma$-ray source in a specific 4-month epoch, cospatial with PKS 0201+113 ($z$ = 3.64). Monitoring of PKS 0201+113 at 15 GHz by the Owens Valley Radio Observatory 40 m Telescope from 2008 to 2023 shows a prominent flare that dominates the radio light curve. The maximum of the radio flare coincides with the $\gamma$-ray flare, strongly suggesting an association ($\textrm{p-value}=0.023$) between the $\gamma$-ray and the radio sources. PKS 0201+113 is only the third $\gamma$-ray blazar to be identified with $z> 3.5$, and it is the first such object to be identified by the detection of quasi-simultaneous $\gamma$-ray and radio flares. The jet properties of this peculiar blazar have been investigated. A detailed study of a two-zone leptonic model is presented that fits the broadband spectral energy distribution. An alternative scenario is also briefly discussed., Comment: 22 pages, 6 figures, 2 Tables, accepted for publication in ApJ
- Published
- 2024
24. Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting
- Author
-
Shin, Inkyu, Yu, Qihang, Shen, Xiaohui, Kweon, In So, Yoon, Kuk-Jin, and Chen, Liang-Chieh
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent advancements in zero-shot video diffusion models have shown promise for text-driven video editing, but challenges remain in achieving high temporal consistency. To address this, we introduce Video-3DGS, a 3D Gaussian Splatting (3DGS)-based video refiner designed to enhance temporal consistency in zero-shot video editors. Our approach utilizes a two-stage 3D Gaussian optimizing process tailored for editing dynamic monocular videos. In the first stage, Video-3DGS employs an improved version of COLMAP, referred to as MC-COLMAP, which processes original videos using a Masked and Clipped approach. For each video clip, MC-COLMAP generates point clouds for dynamic foreground objects and complex backgrounds. These point clouds are used to initialize two sets of 3D Gaussians (Frg-3DGS and Bkg-3DGS) that represent the foreground and background views, respectively. Both foreground and background views are then merged with a 2D learnable parameter map to reconstruct full views. In the second stage, we leverage the reconstruction ability developed in the first stage to impose temporal constraints on the video diffusion model. To demonstrate the efficacy of Video-3DGS in both stages, we conduct extensive experiments across two related tasks: Video Reconstruction and Video Editing. Video-3DGS trained with 3k iterations significantly improves video reconstruction quality (+3 PSNR and +7 PSNR increases) and training efficiency (1.9x and 4.5x faster) over NeRF-based and 3DGS-based state-of-the-art methods on the DAVIS dataset, respectively. Moreover, it enhances video editing by ensuring temporal consistency across 58 dynamic monocular videos., Comment: Project page at https://video-3dgs-project.github.io/
- Published
- 2024
25. State Space Models on Temporal Graphs: A First-Principles Study
- Author
-
Li, Jintang, Wu, Ruofan, Jin, Xinzhou, Ma, Boqun, Chen, Liang, and Zheng, Zibin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Over the past few years, research on deep graph learning has shifted from static graphs to temporal graphs in response to real-world complex systems that exhibit dynamic behaviors. In practice, temporal graphs are formalized as an ordered sequence of static graph snapshots observed at discrete time points. Sequence models such as RNNs or Transformers have long been the predominant backbone networks for modeling such temporal graphs. Yet, despite the promising results, RNNs struggle with long-range dependencies, while transformers are burdened by quadratic computational complexity. Recently, state space models (SSMs), which are framed as discretized representations of an underlying continuous-time linear dynamical system, have garnered substantial attention and achieved breakthrough advancements in independent sequence modeling. In this work, we undertake a principled investigation that extends SSM theory to temporal graphs by integrating structural information into the online approximation objective via the adoption of a Laplacian regularization term. The emergent continuous-time system introduces novel algorithmic challenges, thereby necessitating our development of GraphSSM, a graph state space model for modeling the dynamics of temporal graphs. Extensive experimental results demonstrate the effectiveness of our GraphSSM framework across various temporal graph benchmarks., Comment: Preprint; Code will be made available at https://github.com/EdisonLeeeee/GraphSSM
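The abstract's framing of SSMs as "discretized representations of an underlying continuous-time linear dynamical system" can be made concrete in a few lines. The sketch below applies a bilinear (Tustin) discretization to $x'(t) = Ax(t) + Bu(t)$, $y(t) = Cx(t)$ and runs the resulting recurrence over a scalar input sequence. The matrices are hypothetical toy values, and GraphSSM's Laplacian-regularized online approximation objective is not shown here.

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of x' = A x + B u, as used in
    S4-style SSM layers: returns (Abar, Bbar) for the recurrence
    x_k = Abar x_{k-1} + Bbar u_k."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - dt / 2 * A)
    return inv @ (I + dt / 2 * A), inv @ (dt * B)

def ssm_scan(A, B, C, u, dt=0.1):
    """Run the discretized SSM over a 1-D input sequence u, producing
    one scalar output per time step (e.g. per graph snapshot)."""
    Abar, Bbar = discretize(A, B, dt)
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = Abar @ x + Bbar * u_k   # linear state update
        ys.append(float(C @ x))     # linear readout
    return np.array(ys)

# Toy stable system: two decaying modes driven by a constant input.
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])
y = ssm_scan(A, B, C, np.ones(8))
```

In a temporal-graph setting, each `u_k` would be replaced by node features of the k-th snapshot; the linear recurrence is what lets SSMs avoid both the long-range-dependency issues of RNNs and the quadratic cost of attention.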
- Published
- 2024
26. Chiral quantum heating and cooling with an optically controlled ion
- Author
-
Bu, Jin-Tao, Zhang, Jian-Qi, Ding, Ge-Yi, Li, Jia-Chong, Zhang, Jia-Wei, Wang, Bin, Ding, Wen-Qiang, Yuan, Wen-Fei, Chen, Liang, Zhong, Qi, Keçebaş, Ali, Özdemir, Şahin K., Zhou, Fei, Jing, Hui, and Feng, Mang
- Subjects
Quantum Physics - Abstract
Quantum heat engines and refrigerators are open quantum systems, whose dynamics can be well understood using a non-Hermitian formalism. A prominent feature of non-Hermiticity is the existence of exceptional points (EPs), which have no counterpart in closed quantum systems. It has been shown in classical systems that dynamical encirclement in the vicinity of an EP, whether the loop includes the EP or not, can lead to chiral mode conversion. Here, we show that this is also valid for quantum systems when dynamical encircling is performed in the vicinity of their Liouvillian EPs (LEPs), which include the effects of quantum jumps and associated noise - an important quantum feature not present in previous works. We demonstrate, using a Paul-trapped ultracold ion, the first chiral quantum heating and refrigeration by dynamically encircling a closed loop in the vicinity of an LEP. We witness the cycling direction to be associated with the chirality and heat release (absorption) of the quantum heat engine (quantum refrigerator). Our experiments reveal that not only the adiabaticity breakdown but also the Landau-Zener-St\"uckelberg process plays an essential role during dynamic encircling, resulting in chiral thermodynamic cycles. Our observations contribute to the further understanding of chiral and topological features in non-Hermitian systems and pave the way to exploring the relation between chirality and quantum thermodynamics., Comment: Accepted by Light: Science & Applications
- Published
- 2024
- Full Text
- View/download PDF
27. Rankability-enhanced Revenue Uplift Modeling Framework for Online Marketing
- Author
-
He, Bowei, Weng, Yunpeng, Tang, Xing, Cui, Ziqiang, Sun, Zexu, Chen, Liang, He, Xiuqiang, and Ma, Chen
- Subjects
Computer Science - Machine Learning - Abstract
Uplift modeling has been widely employed in online marketing: by predicting the response difference between the treatment and control groups, it identifies the individuals sensitive to interventions like coupons or discounts. Compared with traditional \textit{conversion uplift modeling}, \textit{revenue uplift modeling} exhibits higher potential due to its direct connection with corporate income. However, previous works can hardly handle the continuous long-tail response distribution in revenue uplift modeling. Moreover, they have neglected to optimize the uplift ranking among different individuals, which is actually the core of uplift modeling. To address these issues, in this paper, we first utilize the zero-inflated lognormal (ZILN) loss to regress the responses and customize the corresponding modeling network, which can be adapted to different existing uplift models. Then, we study the ranking-related uplift modeling error from a theoretical perspective and propose two tighter error bounds as additional loss terms to the conventional response regression loss. Finally, we directly model the uplift ranking error for the entire population with a listwise uplift ranking loss. The experimental results on offline public and industrial datasets validate the effectiveness of our method for revenue uplift modeling. Furthermore, we conduct large-scale experiments on a prominent online fintech marketing platform, Tencent FiT, which further demonstrates the superiority of our method in real-world applications., Comment: Accepted by KDD 2024
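The ZILN loss mentioned in the abstract handles the long-tail response by combining a Bernoulli term for zero responses with a lognormal density for positive ones. Below is a minimal NumPy sketch of that negative log-likelihood; the head names (`p_logit`, `mu`, `log_sigma`) and shapes are illustrative assumptions, not the paper's network.

```python
import numpy as np

def ziln_loss(y, p_logit, mu, log_sigma):
    """Negative log-likelihood of the zero-inflated lognormal (ZILN) model.

    y: non-negative responses (e.g. revenue). Zeros are explained by the
    Bernoulli "inflation" component; positives by a lognormal density."""
    p = 1.0 / (1.0 + np.exp(-p_logit))   # probability of a positive response
    sigma = np.exp(log_sigma)            # keep the lognormal scale positive
    positive = y > 0
    y_safe = np.where(positive, y, 1.0)  # avoid log(0); masked out below
    lognormal_ll = (
        -np.log(y_safe * sigma * np.sqrt(2 * np.pi))
        - (np.log(y_safe) - mu) ** 2 / (2 * sigma**2)
    )
    ll = np.where(positive, np.log(p) + lognormal_ll, np.log(1.0 - p))
    return -ll.mean()
```

In practice, the three heads would be produced by the uplift model for each individual under treatment and control; the paper then adds its ranking-related error bounds and listwise ranking loss on top of this regression term.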
- Published
- 2024
28. Data quality control system and long-term performance monitor of the LHAASO-KM2A
- Author
-
Cao, Zhen, Aharonian, F., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Bian, W., Bukevich, A. V., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, H. X., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, J. H., Fang, K., Feng, C. F., Feng, H., Feng, L., Feng, S. H., Feng, X. T., Feng, Y., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., Hasan, M., He, H. H., He, H. N., He, J. Y., He, Y., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Karpikov, I., Kuleshov, D., Kurinov, K., Li, B. B., Li, C. M., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, Jian, Li, Jie, Li, K., Li, S. D., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, D. B., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Luo, Q., Luo, Y., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, L. J., Pattarakijwanich, P., Pei, Z. Y., Qi, J. C., Qi, M. Y., Qiao, B. Q., Qin, J. J., Raza, A., Ruffolo, D., Sáiz, A., Saeed, M., Semikoz, D., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, D. X., Sun, Q. N., Sun, X. N., Sun, Z. B., Takata, J., Tam, P. 
H. T., Tang, Q. W., Tang, R., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, Kai, Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, Q. W., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, Y. L., Xing, Y., Xiong, D. R., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, C. Y., Yang, F., Yang, F. F., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, W. X., Yao, Y. H., Yao, Z. G., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zhao, X. H., Zheng, F., Zhong, W. J., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, B. Y., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,High Energy Physics - Experiment ,Physics - Instrumentation and Detectors - Abstract
The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct the primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, the stability of reconstructed parameters, and the performance of the array based on observations of the Crab Nebula and the Moon shadow. This paper introduces the control system and its application to the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and the Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time-averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively., Comment: 15 pages, 9 figures
- Published
- 2024
29. DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment
- Author
-
Han, Jianhong, Chen, Liang, and Wang, Yupei
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Object detectors frequently encounter significant performance degradation when confronted with domain gaps between collected data (source domain) and data from real-world applications (target domain). To address this task, numerous unsupervised domain adaptive detectors have been proposed, leveraging carefully designed feature alignment techniques. However, these techniques primarily align instance-level features in a class-agnostic manner, overlooking the differences between extracted features from different categories, which results in only limited improvement. Furthermore, the scope of current alignment modules is often restricted to a limited batch of images, failing to learn the entire dataset-level cues, thereby severely constraining the detector's generalization ability to the target domain. To this end, we introduce a strong DETR-based detector named Domain Adaptive detection TRansformer (DATR) for unsupervised domain adaptation of object detection. Firstly, we propose the Class-wise Prototypes Alignment (CPA) module, which effectively aligns cross-domain features in a class-aware manner by bridging the gap between object detection task and domain adaptation task. Then, the designed Dataset-level Alignment Scheme (DAS) explicitly guides the detector to achieve global representation and enhance inter-class distinguishability of instance-level features across the entire dataset, which spans both domains, by leveraging contrastive learning. Moreover, DATR incorporates a mean-teacher based self-training framework, utilizing pseudo-labels generated by the teacher model to further mitigate domain bias. Extensive experimental results demonstrate superior performance and generalization capabilities of our proposed DATR in multiple domain adaptation scenarios. Code is released at https://github.com/h751410234/DATR., Comment: Manuscript submitted to IEEE Transactions on Image Processing
- Published
- 2024
30. Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO
- Author
-
Cao, Zhen, Aharonian, F., An, Q., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. 
B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zheng, J. H., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability on a timescale of a few months in the TeV band, which is consistent with low-frequency observations. Based on these observations, we report the detection of TeV $\gamma$-ray emission from the low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$\sigma$ with a best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that a compact, weak radio jet can efficiently accelerate particles and emit TeV photons., Comment: 11 pages, 5 figures
- Published
- 2024
31. Fair Graph Representation Learning via Sensitive Attribute Disentanglement
- Author
-
Zhu, Yuchang, Li, Jintang, Zheng, Zibin, and Chen, Liang
- Subjects
Computer Science - Machine Learning ,Computer Science - Computers and Society - Abstract
Group fairness for Graph Neural Networks (GNNs), which emphasizes algorithmic decisions neither favoring nor harming certain groups defined by sensitive attributes (e.g., race and gender), has gained considerable attention. In particular, the objective of group fairness is to ensure that the decisions made by GNNs are independent of the sensitive attribute. To achieve this objective, most existing approaches involve eliminating sensitive attribute information in node representations or algorithmic decisions. However, such ways may also eliminate task-related information due to its inherent correlation with the sensitive attribute, leading to a sacrifice in utility. In this work, we focus on improving the fairness of GNNs while preserving task-related information and propose a fair GNN framework named FairSAD. Instead of eliminating sensitive attribute information, FairSAD enhances the fairness of GNNs via Sensitive Attribute Disentanglement (SAD), which separates the sensitive attribute-related information into an independent component to mitigate its impact. Additionally, FairSAD utilizes a channel masking mechanism to adaptively identify the sensitive attribute-related component and subsequently decorrelates it. Overall, FairSAD minimizes the impact of the sensitive attribute on GNN outcomes rather than eliminating sensitive attributes, thereby preserving task-related information associated with the sensitive attribute. Furthermore, experiments conducted on several real-world datasets demonstrate that FairSAD outperforms other state-of-the-art methods by a significant margin in terms of both fairness and utility performance. Our source code is available at https://github.com/ZzoomD/FairSAD., Comment: Accepted by WWW 2024
- Published
- 2024
32. Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
- Author
-
Gao, Ziqi, Wang, Qichao, Chen, Aochuan, Liu, Zijing, Wu, Bingzhe, Chen, Liang, and Li, Jia
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Low-rank adaptation~(LoRA) has recently gained much interest in fine-tuning foundation models. It effectively reduces the number of trainable parameters by incorporating low-rank matrices $A$ and $B$ to represent the weight change, i.e., $\Delta W=BA$. Despite LoRA's progress, it faces storage challenges when handling extensive customization adaptations or larger base models. In this work, we aim to further compress trainable parameters by exploiting the powerful expressiveness of the Fourier transform. Specifically, we introduce FourierFT, which treats $\Delta W$ as a matrix in the spatial domain and learns only a small fraction of its spectral coefficients. With the trained spectral coefficients, we implement the inverse discrete Fourier transform to recover $\Delta W$. Empirically, our FourierFT method shows comparable or better performance with fewer parameters than LoRA on various tasks, including natural language understanding, natural language generation, instruction tuning, and image classification. For example, when performing instruction tuning on the LLaMA2-7B model, FourierFT surpasses LoRA with only 0.064M trainable parameters, compared to LoRA's 33.5M. Our code is released at \url{https://github.com/Chaos96/fourierft}., Comment: Accepted by ICML 2024
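The recipe in the abstract can be sketched in a few lines: fix a random set of frequency positions, keep one trainable coefficient per position, and recover $\Delta W$ with an inverse 2D DFT. The NumPy sketch below uses hypothetical dimensions and a hypothetical coefficient count; the actual FourierFT implementation trains `coef` by backpropagation and applies a learned scaling, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 768, 768   # weight shape (illustrative)
n_coef = 1000            # number of trainable spectral coefficients (illustrative)

# Randomly chosen, *fixed* frequency positions (not trained).
rows = rng.integers(0, d_out, n_coef)
cols = rng.integers(0, d_in, n_coef)

# The only trainable parameters: one real coefficient per chosen frequency.
coef = rng.normal(scale=0.02, size=n_coef)

# Scatter the coefficients into a sparse spectral matrix, then invert the
# 2D DFT to obtain the dense weight update Delta W.
spectrum = np.zeros((d_out, d_in), dtype=complex)
spectrum[rows, cols] = coef
delta_w = np.fft.ifft2(spectrum).real
```

The storage saving is arithmetic: this layer stores `n_coef` scalars (plus a shared random seed for the positions), whereas a rank-$r$ LoRA stores $r(d_{in}+d_{out})$ scalars, e.g. 1000 vs. 12288 for $r=8$ at these dimensions.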
- Published
- 2024
33. Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation
- Author
-
Zhao, Haozhe, Cai, Zefan, Si, Shuzheng, Chen, Liang, He, Yufeng, An, Kaikai, and Chang, Baobao
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large-scale multilingual Pretrained Language Models (mPLMs) yield impressive performance on cross-language tasks, yet significant performance disparities exist across different languages within the same mPLM. Previous studies endeavored to narrow these disparities by supervised fine-tuning of the mPLMs with multilingual data. However, obtaining labeled multilingual data is time-consuming, and fine-tuning an mPLM with limited labeled multilingual data merely encapsulates the knowledge specific to the labeled data. Therefore, we introduce ALSACE, which leverages the learned knowledge from the well-performing languages to guide under-performing ones within the same mPLM, eliminating the need for additional labeled multilingual data. Experiments show that ALSACE effectively mitigates language-level performance disparities across various mPLMs while showing competitive performance on different multilingual NLU tasks, ranging from full-resource to limited-resource settings. The code for our approach is available at https://github.com/pkunlp-icler/ALSACE., Comment: NAACL 2024
- Published
- 2024
34. COCONut: Modernizing COCO Segmentation
- Author
-
Deng, Xueqing, Yu, Qihang, Wang, Peng, Shen, Xiaohui, and Chen, Liang-Chieh
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In recent decades, the vision community has witnessed remarkable progress in visual recognition, partially owing to advancements in dataset benchmarks. Notably, the established COCO benchmark has propelled the development of modern detection and segmentation systems. However, the COCO segmentation benchmark has seen comparatively slow improvement over the last decade. Originally equipped with coarse polygon annotations for thing instances, it gradually incorporated coarse superpixel annotations for stuff regions, which were subsequently heuristically amalgamated to yield panoptic segmentation annotations. These annotations, executed by different groups of raters, have resulted not only in coarse segmentation masks but also in inconsistencies between segmentation types. In this study, we undertake a comprehensive reevaluation of the COCO segmentation annotations. By enhancing the annotation quality and expanding the dataset to encompass 383K images with more than 5.18M panoptic masks, we introduce COCONut, the COCO Next Universal segmenTation dataset. COCONut harmonizes segmentation annotations across semantic, instance, and panoptic segmentation with meticulously crafted high-quality masks, and establishes a robust benchmark for all segmentation tasks. To our knowledge, COCONut stands as the inaugural large-scale universal segmentation dataset, verified by human raters. We anticipate that the release of COCONut will significantly contribute to the community's ability to assess the progress of novel neural networks., Comment: Accepted at CVPR2024, data available at https://xdeng7.github.io/coconut.github.io/
- Published
- 2024
35. LHAASO-KM2A detector simulation using Geant4
- Author
-
Cao, Zhen, Aharonian, F., An, Q., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. 
B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zheng, J. H., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., and Zuo, X.
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
KM2A is one of the main sub-arrays of LHAASO, working on gamma-ray astronomy and cosmic-ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and for data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with a large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A, based on Geant4, is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to the full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.
- Published
- 2024
- Full Text
- View/download PDF
36. ViTamin: Designing Scalable Vision Models in the Vision-Language Era
- Author
-
Chen, Jieneng, Yu, Qihang, Shen, Xiaohui, Yuille, Alan, and Chen, Liang-Chieh
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent breakthroughs in vision-language models (VLMs) have started a new page in the vision community. The VLMs provide stronger and more generalizable feature embeddings than ImageNet-pretrained models, thanks to training on large-scale Internet image-text pairs. However, despite the amazing achievements of VLMs, vanilla Vision Transformers (ViTs) remain the default choice for the image encoder. Although the pure transformer has proven effective for text encoding, it remains questionable whether this also holds for image encoding, especially considering that the various types of networks proposed on the ImageNet benchmark are, unfortunately, rarely studied in VLMs. Due to the small data/model scale, the original conclusions of model design on ImageNet can be limited and biased. In this paper, we aim at building an evaluation protocol for vision models in the vision-language era under the contrastive language-image pretraining (CLIP) framework. We provide a comprehensive way to benchmark different vision models, covering their zero-shot performance and scalability in both model and training data sizes. To this end, we introduce ViTamin, a new vision model tailored for VLMs. ViTamin-L significantly outperforms ViT-L by 2.0% ImageNet zero-shot accuracy when using the same publicly available DataComp-1B dataset and the same OpenCLIP training scheme. ViTamin-L presents promising results on 60 diverse benchmarks, including classification, retrieval, open-vocabulary detection and segmentation, and large multi-modal models. When further scaling up the model size, our ViTamin-XL with only 436M parameters attains 82.9% ImageNet zero-shot accuracy, surpassing the 82.0% achieved by EVA-E, which has ten times more parameters (4.4B)., Comment: CVPR 2024; https://github.com/Beckschen/ViTamin
- Published
- 2024
37. A Catalyst‐Like System Enables Efficient Perovskite Solar Cells
- Author
-
Yang, Yuqian, Li, Guodong, Zhao, Lichen, Tan, Pengju, Li, Yu, Li, Shunde, Tan, Lina, Deng, Chunyan, Wang, Shibo, Zhao, Zhenzhu, Yuan, Chengjian, Ding, Honghe, Chen, Liang, Zhu, Junfa, Guan, Yong, Hou, Cheng‐Hung, Tang, Pengyi, Li, Quiyang, Liu, Hong, Yang, Yingguo, Abate, Antonio, Shyue, Jing‐Jong, Wu, Jihuai, Russell, Thomas P, and Hu, Qin
- Subjects
Macromolecular and Materials Chemistry ,Chemical Sciences ,Physical Chemistry ,Engineering ,Materials Engineering ,catalyst‐like system ,formation kinetics ,homogeneity ,multiscale structure ,perovskite solar cells ,Physical Sciences ,Nanoscience & Nanotechnology ,Chemical sciences ,Physical sciences - Abstract
High-quality perovskite films are essential for achieving high-performance optoelectronic devices; however, solution-processed perovskite films are known to suffer from compositional and structural inhomogeneity due to a lack of systematic control over the kinetics during film formation. Here, the microscopic homogeneity of perovskite films is successfully enhanced by modulating the conversion reaction kinetics using a catalyst-like system generated by a foaming agent. The chemical and structural evolution during this catalytic conversion is revealed by a multimodal synchrotron toolkit with spatial resolutions spanning many length scales. Combining these insights with computational investigations, a cyclic conversion pathway model is developed that yields exceptional perovskite homogeneity due to enhanced conversion, giving a power conversion efficiency of 24.51% for photovoltaic devices. This work establishes a systematic link between the processing of the precursor and the homogeneity of the perovskite films.
- Published
- 2024
38. SGHormer: An Energy-Saving Graph Transformer Driven by Spikes
- Author
-
Zhang, Huizhe, Li, Jintang, Chen, Liang, and Zheng, Zibin
- Subjects
Computer Science - Neural and Evolutionary Computing ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Graph Transformers (GTs), with their powerful representation learning ability, have achieved great success in a wide range of graph tasks. However, the outstanding performance of GTs comes at the cost of higher energy consumption and computational overhead. The complex structure and quadratic complexity of attention calculation in the vanilla transformer seriously hinder its scalability on large-scale graph data. Though existing methods have made strides in simplifying block combinations or the attention-learning paradigm to improve GTs' efficiency, energy-saving solutions originating from biologically plausible structures are rarely taken into consideration when constructing GT frameworks. To this end, we propose a new spiking-based graph transformer (SGHormer). It turns full-precision embeddings into sparse and binarized spikes to reduce memory and computational costs. The spiking graph self-attention and spiking rectify blocks in SGHormer explicitly capture global structure information and recover the expressive power of spiking embeddings, respectively. In experiments, SGHormer achieves performance comparable to other full-precision GTs with extremely low computational energy consumption. The results show that SGHormer makes remarkable progress in the field of low-energy GTs., Comment: 9 pages, 3 figures
- Published
- 2024
39. Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A
- Author
-
The LHAASO Collaboration, Cao, Zhen, Aharonian, F., An, Q., Axikegu, A., Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, della Volpe, D., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Heller, M., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. 
T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
We present measurements of the all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV, using data collected by LHAASO-KM2A between September 2021 and December 2022, based on a nearly composition-independent energy reconstruction method that achieves unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be $-2.7413 \pm 0.0004 \pm 0.0050$, while above the knee it is $-3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is heavier than that of helium over almost the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of $-0.1200 \pm 0.0003 \pm 0.0341$, equivalent to an increase in the abundance of light components. Above the knee, the mean logarithmic mass exhibits a power-law trend towards heavier components, the reverse of the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than of the medium-heavy components., Comment: 8 pages, 3 figures
- Published
- 2024
- Full Text
- View/download PDF
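The broken power-law spectrum described in the abstract above can be sketched numerically. This is only an illustration using the quoted knee position and indices; the smoothness parameter `s` and the normalization `phi0` are assumptions for the sketch, not fitted LHAASO values:

```python
import numpy as np

def all_particle_flux(E, E_knee=3.67, g1=-2.7413, g2=-3.128, phi0=1.0, s=10.0):
    """Smoothly broken power law in energy E (PeV): spectral index g1 below
    the knee at E_knee, bending to g2 above it; s controls the sharpness."""
    return phi0 * E**g1 * (1.0 + (E / E_knee)**s) ** ((g2 - g1) / s)

def log_slope(E1, E2, **kw):
    """Local power-law index of the flux measured between two energies."""
    return (np.log(all_particle_flux(E2, **kw))
            - np.log(all_particle_flux(E1, **kw))) / np.log(E2 / E1)
```

Well below the knee the local slope recovers the quoted index of about -2.74, and well above it the slope approaches -3.13, with the bend localized around 3.67 PeV.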
40. A Causal Inspired Early-Branching Structure for Domain Generalization
- Author
-
Chen, Liang, Zhang, Yong, Song, Yibing, Zhang, Zhen, and Liu, Lingqiao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Learning domain-invariant semantic representations is crucial for achieving domain generalization (DG), where a model is required to perform well on unseen target domains. One critical challenge is that standard training often results in entangled semantic and domain-specific features. Previous works suggest formulating the problem from a causal perspective and solving the entanglement problem by enforcing marginal independence between the causal (i.e., semantic) and non-causal (i.e., domain-specific) features. Despite its simplicity, the basic marginal-independence idea alone may be insufficient to identify the causal feature. By d-separation, we observe that the causal feature can be further characterized by being independent of the domain conditioned on the object, and we propose the following two strategies as complements to the basic framework. First, the observation implies that for the same object, the causal feature should not be associated with the non-causal feature, revealing that the common practice of obtaining the two features with a shared base feature extractor and two lightweight prediction heads might be inappropriate. To meet this constraint, we propose a simple early-branching structure, where the causal and non-causal feature branches share the first few blocks while diverging thereafter. Second, the observation implies that the causal feature remains invariant across different domains for the same object. To this end, we suggest that augmentation should be incorporated into the framework to better characterize the causal feature, and we further propose an effective random domain sampling scheme to fulfill the task. Theoretical and experimental results show that the two strategies are beneficial for the basic marginal-independence framework. Code is available at https://github.com/liangchen527/CausEB., Comment: Accepted by IJCV
- Published
- 2024
41. An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
- Author
-
Chen, Liang, Zhao, Haozhe, Liu, Tianyu, Bai, Shuai, Lin, Junyang, Zhou, Chang, and Chang, Baobao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
In this study, we identify an inefficient attention phenomenon in Large Vision-Language Models (LVLMs), notably within prominent models like LLaVA-1.5, QwenVL-Chat and Video-LLaVA. We find that the attention computation over visual tokens is extremely inefficient in the deep layers of popular LVLMs, suggesting a need for a sparser approach than for textual data. To this end, we introduce FastV, a versatile plug-and-play method designed to optimize computational efficiency by learning adaptive attention patterns in early layers and pruning visual tokens in subsequent ones. Our evaluations demonstrate FastV's ability to dramatically reduce computational costs (e.g., a 45% reduction in FLOPs for LLaVA-1.5-13B) without sacrificing performance in a wide range of image and video understanding tasks. The computational efficiency and performance trade-off of FastV is highly customizable and Pareto-efficient. It can compress the FLOPs of a 13B-parameter model to achieve a lower budget than that of a 7B-parameter model while still maintaining superior performance. We believe FastV has practical value for the deployment of LVLMs on edge devices and in commercial models. Code is released at https://github.com/pkunlp-icler/FastV., Comment: Accepted to ECCV 2024 (Oral), code is released at https://github.com/pkunlp-icler/FastV
- Published
- 2024
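The core idea of the FastV abstract above, ranking visual tokens by received attention in a deep layer and dropping the rest, can be sketched with a toy numpy function. This is an illustrative sketch, not the authors' implementation; the function name, shapes, and the keep-ratio convention are assumptions made here:

```python
import numpy as np

def fastv_prune(hidden, attn, visual_span, keep_ratio=0.5, layer_idx=0, prune_after=2):
    """Toy visual-token pruning: in layers deeper than `prune_after`, rank the
    visual tokens by the mean attention they receive and keep the top fraction.
    `hidden` is (seq, dim), `attn` is (heads, seq, seq); `visual_span` gives
    the [start, end) range of visual tokens in the sequence."""
    if layer_idx < prune_after:                    # early layers keep all tokens
        return hidden
    start, end = visual_span
    # mean attention received by each visual token, over heads and query positions
    scores = attn[:, :, start:end].mean(axis=(0, 1))
    n_keep = max(1, int((end - start) * keep_ratio))
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])   # top tokens, original order
    return np.concatenate([hidden[:start], hidden[start:end][keep], hidden[end:]], axis=0)
```

With 6 visual tokens and a keep ratio of 0.5, a 10-token sequence shrinks to 7 tokens after the pruning layer, which is where the FLOP savings come from.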
42. Consecutive Model Editing with Batch alongside HooK Layers
- Author
-
Li, Shuaiyi, Deng, Yang, Cai, Deng, Lu, Hongyuan, Chen, Liang, and Lam, Wai
- Subjects
Computer Science - Computation and Language - Abstract
As the typical retraining paradigm is unacceptably time- and resource-consuming, researchers are turning to model editing to seek an effective, consecutive, and batch-supportive way to edit model behavior directly. Despite all these practical expectations, existing model editing methods fail to realize all of them. Furthermore, the memory demands of such succession-supportive model editing approaches tend to be prohibitive, frequently necessitating an external memory that grows incrementally over time. To cope with these challenges, we propose COMEBA-HK, a model editing method that is both consecutive and batch-supportive. COMEBA-HK is memory-friendly, as it only requires a small amount of memory to store several hook layers with updated weights. Experimental results demonstrate the superiority of our method over other batch-supportive model editing methods under both single-round and consecutive batch editing scenarios. Extensive analyses of COMEBA-HK have been conducted to verify the stability of our method over 1) the number of consecutive steps and 2) the number of editing instances., Comment: Under review
- Published
- 2024
43. PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization
- Author
-
Meng, Xiangdi, Dai, Damai, Luo, Weiyao, Yang, Zhe, Wu, Shaoxiang, Wang, Xiaochen, Wang, Peiyi, Dong, Qingxiu, Chen, Liang, and Sui, Zhifang
- Subjects
Computer Science - Computation and Language - Abstract
Supervised fine-tuning is the most common method to adapt large language models (LLMs) to downstream tasks, but fully fine-tuning LLMs requires massive computational resources. Recently, parameter-efficient fine-tuning (PEFT) methods have been widely studied due to their cost-effectiveness. LoRA is one of the most widely used methods, and assumes that the optimization process is essentially low-dimensional. Although LoRA fine-tuning is effective, there is still a performance gap compared to full fine-tuning, since its weight update is limited to low-rank matrices. To break the low-rank bottleneck in LoRA optimization, we propose PeriodicLoRA (PLoRA), which accumulates low-rank update matrices multiple times to achieve a higher update rank. PLoRA has multiple training stages. During each stage, we still update only the LoRA weights. However, at the end of each stage, we unload the LoRA weights into the backbone parameters and then reinitialize the LoRA states. Experimental results show that PLoRA has stronger learning ability, up to approximately 1.8 times that of LoRA, without increasing memory usage. Further, we introduce a momentum-based unloading strategy for PLoRA to mitigate training instability.
- Published
- 2024
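The stage-boundary operation described in the PLoRA abstract above (unload the low-rank update into the backbone, then reinitialize the LoRA factors) can be sketched as follows. This is a minimal sketch under assumptions made here: the function name is hypothetical, and the reinitialization scale mirrors standard LoRA practice (small random A, zero B) rather than the paper's exact recipe:

```python
import numpy as np

def plora_stage_end(W, A, B, alpha=1.0, rng=None):
    """End of one PLoRA stage: fold the low-rank update B @ A into the backbone
    weight W, then return freshly initialized LoRA factors for the next stage.
    A is (r, d_in), B is (d_out, r), W is (d_out, d_in)."""
    rng = rng or np.random.default_rng()
    W_new = W + alpha * (B @ A)                    # unload LoRA into the backbone
    r = A.shape[0]
    A_new = 0.01 * rng.standard_normal((r, W.shape[1]))   # small random A
    B_new = np.zeros((W.shape[0], r))              # zero B: next stage starts at W_new
    return W_new, A_new, B_new
```

After k stages the accumulated update is a sum of k rank-r matrices, so its total rank can reach up to k*r, which is how repeated unloading escapes the single-stage low-rank bottleneck while the per-stage memory footprint stays that of plain LoRA.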
44. PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
- Author
-
Chen, Liang, Zhang, Yichi, Ren, Shuhuai, Zhao, Haozhe, Cai, Zefan, Wang, Yuchi, Wang, Peiyi, Meng, Xiangdi, Liu, Tianyu, and Chang, Baobao
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition - Abstract
We present PCA-Bench, a multimodal decision-making benchmark for evaluating the integrated capabilities of Multimodal Large Language Models (MLLMs). Departing from previous benchmarks that focus on simplistic tasks and individual model capabilities, PCA-Bench introduces three complex scenarios: autonomous driving, domestic robotics, and open-world games. Given task instructions and diverse contexts, the model is required to seamlessly integrate multiple capabilities of Perception, Cognition, and Action in a reasoning chain to make accurate decisions. Moreover, PCA-Bench features error localization capabilities, scrutinizing model inaccuracies in areas such as perception, knowledge, or reasoning. This enhances the reliability of deploying MLLMs. To balance accuracy and efficiency in evaluation, we propose PCA-Eval, an automatic evaluation protocol, and assess 10 prevalent MLLMs. The results reveal significant performance disparities between open-source models and powerful proprietary models like GPT-4 Vision. To address this, we introduce Embodied-Instruction-Evolution (EIE), an automatic framework for synthesizing instruction tuning examples in multimodal embodied environments. EIE generates 7,510 training examples in PCA-Bench and enhances the performance of open-source MLLMs, occasionally surpassing GPT-4 Vision (+3% in decision accuracy), thereby validating the effectiveness of EIE. Our findings suggest that robust MLLMs like GPT-4 Vision show promise for decision-making in embodied agents, opening new avenues for MLLM research., Comment: Code and Data released at https://github.com/pkunlp-icler/PCA-EVAL. Leaderboard at: https://docs.qq.com/sheet/DVUd4WUpGRHRqUnNV. This article supersedes its workshop version arXiv:2310.02071. arXiv admin note: text overlap with arXiv:2310.02071
- Published
- 2024
45. Multiwavelength Polarization Observations of Mrk 501
- Author
-
Hu, Xin-Ke, Yu, Yu-Wei, Zhang, Jin, Wang, Xiang-Gao, Patra, Kishore C., Brink, Thomas G., Zheng, Wei-Kang, Wang, Qi, Kong, De-Feng, Chen, Liang-Jun, Zhou, Ji-Wang, Cao, Jia-Xin, Lu, Ming-Xuan, Zhou, Zi-Min, Wei, Yi-Ning, Huang, Xin-Bo, Li, Xing-Lin, Lou, Hao, Mao, Ji-Rong, Liang, En-Wei, and Filippenko, Alexei V.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
Mrk 501 is a prototypical high-synchrotron-peaked blazar (HBL) and serves as one of the primary targets for the Imaging X-ray Polarimetry Explorer (IXPE). In this study, we report X-ray polarization measurements of Mrk 501 based on six IXPE observations. The detection of X-ray polarization at a confidence level exceeding 99% is achieved in four of the six observations conducted across the entire energy range (2--8 keV) of IXPE. The maximum polarization degree ($\Pi_{\rm X}$) is measured to be $15.8\%\pm2.8\%$, accompanied by a polarization angle ($\psi_{\rm X}$) of $98.0^{\circ}\pm5.1^{\circ}$ at a confidence level of $5.6\sigma$. During the remaining two observations, only an upper limit of $\Pi_{\rm X}<12\%$ could be derived at the 99% confidence level. No temporal variability in polarization is observed throughout the six IXPE observations of Mrk 501. A discernible trend of energy-dependent variation in the polarization degree is detected in optical spectropolarimetry; however, no analogous indication is observed in $\Pi_{\rm X}$. The chromatic behavior of $\Pi$ and the consistent values of $\psi$ across frequencies from X-rays to radio, along with the agreement between $\psi$ and the jet position angle, strongly support the interpretation of an energy-stratified model with shock-accelerated particles in the jet of Mrk 501. Additionally, the possibility of a global helical magnetic field in the jet of Mrk 501 is discussed., Comment: 20 pages, 8 figures, 4 tables, accepted for publication in ApJL
- Published
- 2024
46. Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm
- Author
-
Xie, Yuanzhen, Jin, Xinzhou, Xie, Tao, Lin, MingXiong, Chen, Liang, Yu, Chenyun, Cheng, Lei, Zhuo, ChengXiang, Hu, Bo, and Li, Zang
- Subjects
Computer Science - Computation and Language - Abstract
In-context learning with large language models (LLMs) has achieved remarkable success in natural language processing, while extensive case studies reveal that the single-step chain-of-thought prompting approach faces challenges such as attention diffusion and inadequate performance in complex tasks like text-to-SQL. To improve the contextual learning capabilities of LLMs in text-to-SQL, we propose a workflow paradigm method that aims to enhance the attention and problem-solving scope of LLMs through decomposition. Specifically, the information determination module for eliminating redundant information and the brand-new prompt structure based on problem classification greatly enhance the model's attention. Additionally, the inclusion of self-correction and active learning modules greatly expands the problem-solving scope of LLMs, hence raising the upper limit of LLM-based approaches. Extensive experiments conducted on three datasets demonstrate that our approach outperforms other methods by a significant margin, with improvements of about 2-3 percentage points over the existing baseline on the Spider Dev, Spider-Realistic, and Bird Dev datasets and new SOTA results on the Spider Test dataset. Our code is available on GitHub: https://github.com/FlyingFeather/DEA-SQL.
- Published
- 2024
47. Evolution of magnetic field of the Quasar 1604+159 at pc scale
- Author
-
Hu, Xu-Zhi, Hong, Xiaoyu, Zhao, Wei, Chen, Liang, Wang, Wei-Yang, and Wu, Linhui
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
We have analyzed the total intensity, spectral index, linear polarization, and RM distributions at pc scale for the quasar 1604+159. The source was observed in 2002 and 2020 with the VLBA. Combining the MOJAVE results, we studied the evolution of the magnetic field. We detected a core-jet structure. The jet extends to a distance of ~25 mas. The jet shape varies slightly with time. We divided the source structure into the central region and the jet region. In the jet region, we find the polarized emission varies with time. The flatter spectral index values and EVPA direction indicate the possible existence of shocks, contributing to the variation. In the central region, the derived core shift index k_r values indicate that the core in 2002 is close to the equipartition case while deviating from it in 2020. The measured magnetic field strength in 2020 is two orders of magnitude lower than that in 2002. We detected transverse RM gradients, evidence of a helical magnetic field, in the core. At 15 GHz, in the place close to the jet base, the polarization direction changes significantly with time from perpendicular to parallel to the jet direction. The evolution of RM and magnetic field structure are potential reasons for the observed polarization change. The core |RM| in 2020 increases with frequency following a power law with index a = 2.7, suggesting a fast electron density fall-off in the medium with distance from the jet base., Comment: 24 pages, 14 figures, accepted for publication in ApJ
- Published
- 2024
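The frequency dependence reported in the abstract above, |RM| increasing with frequency as a power law with index a = 2.7, is typically estimated by linear regression in log-log space. A minimal sketch (the function name and the synthetic test frequencies are assumptions made here, not values from the paper):

```python
import numpy as np

def fit_rm_powerlaw(freq_ghz, rm_abs):
    """Fit |RM| = C * nu^a by least-squares regression in log-log space.
    Returns the power-law index a and the normalization C."""
    a, logC = np.polyfit(np.log10(freq_ghz), np.log10(rm_abs), 1)
    return a, 10.0**logC
```

On noiseless synthetic data generated with a = 2.7 the fit recovers the index exactly; with real multi-frequency RM measurements the residuals would set the uncertainty on a.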
48. Treatment-Aware Hyperbolic Representation Learning for Causal Effect Estimation with Social Networks
- Author
-
Cui, Ziqiang, Tang, Xing, Qiao, Yang, He, Bowei, Chen, Liang, He, Xiuqiang, and Ma, Chen
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Social and Information Networks ,Statistics - Methodology - Abstract
Estimating the individual treatment effect (ITE) from observational data is a crucial research topic that holds significant value across multiple domains. How to identify hidden confounders poses a key challenge in ITE estimation. Recent studies have incorporated the structural information of social networks to tackle this challenge, achieving notable advancements. However, these methods use graph neural networks to learn the representation of hidden confounders in Euclidean space, disregarding two critical issues: (1) social networks often exhibit a scale-free structure, while Euclidean embeddings suffer from high distortion when used to embed such graphs, and (2) each ego-centric network within a social network manifests a treatment-related characteristic, implying significant patterns of hidden confounders. To address these issues, we propose a novel method called Treatment-Aware Hyperbolic Representation Learning (TAHyper). First, TAHyper employs hyperbolic space to encode the social networks, thereby effectively reducing the distortion of confounder representations caused by Euclidean embeddings. Second, we design a treatment-aware relationship identification module that enhances the representation of hidden confounders by identifying whether an individual and her neighbors receive the same treatment. Extensive experiments on two benchmark datasets demonstrate the superiority of our method., Comment: Accepted by SIAM SDM'24
- Published
- 2024
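The hyperbolic geometry invoked in the TAHyper abstract above is usually realized via the Poincare ball, whose distance function is what gives scale-free (tree-like) graphs low-distortion embeddings. A minimal sketch of that metric (illustrative only; TAHyper's actual encoder and curvature handling are not reproduced here):

```python
import numpy as np

def poincare_distance(u, v, eps=1e-12):
    """Geodesic distance between two points inside the unit Poincare ball.
    Distances blow up near the boundary, giving exponentially growing 'room'
    that matches the branching of scale-free graphs."""
    sq = np.sum((u - v) ** 2)
    den = (1.0 - np.sum(u**2)) * (1.0 - np.sum(v**2))
    return np.arccosh(1.0 + 2.0 * sq / (den + eps))
```

A useful sanity check: the distance from the origin to a point x reduces to 2*artanh(|x|), so a point at radius 0.5 sits at hyperbolic distance ln(3) from the center even though its Euclidean distance is only 0.5.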
49. SPFormer: Enhancing Vision Transformer with Superpixel Representation
- Author
-
Mei, Jieru, Chen, Liang-Chieh, Yuille, Alan, and Xie, Cihang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In this work, we introduce SPFormer, a novel Vision Transformer enhanced by superpixel representation. Addressing the limitations of traditional Vision Transformers' fixed-size, non-adaptive patch partitioning, SPFormer employs superpixels that adapt to the image's content. This approach divides the image into irregular, semantically coherent regions, effectively capturing intricate details, and is applicable at both initial and intermediate feature levels. SPFormer, trainable end-to-end, exhibits superior performance across various benchmarks. Notably, it achieves significant improvements on the challenging ImageNet benchmark: a 1.4% increase over DeiT-T and a 1.1% increase over DeiT-S. A standout feature of SPFormer is its inherent explainability. The superpixel structure offers a window into the model's internal processes, providing valuable insights that enhance the model's interpretability. This clarity significantly improves SPFormer's robustness, particularly in challenging scenarios such as image rotations and occlusions, demonstrating its adaptability and resilience.
- Published
- 2024
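The core idea, replacing fixed square patches with irregular, content-adaptive regions and pooling features per region, can be sketched as a toy k-means-style superpixel clustering followed by per-region pooling. This is only an illustration of the concept; it is not SPFormer's actual superpixel or attention mechanism:

```python
import numpy as np

def superpixel_labels(img, k=4, iters=10, seed=0):
    """Toy superpixel assignment: k-means over (row, col, intensity).

    Each pixel joins one of k irregular, content-adaptive regions,
    in contrast to a ViT's fixed-size square patches.
    """
    h, w = img.shape
    rows, cols = np.mgrid[0:h, 0:w]
    feats = np.stack([rows.ravel(), cols.ravel(), img.ravel()], axis=1).astype(float)
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(0)
    return labels.reshape(h, w)

def pool_by_superpixel(feat_map, labels):
    """Average a (H, W, C) feature map within each superpixel region,
    yielding one token per region instead of one per fixed patch."""
    return np.stack([feat_map[labels == j].mean(0) for j in np.unique(labels)])
```

A model built on such tokens inherits the explainability the abstract describes: each token corresponds to a visible, semantically coherent region of the image.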
50. Expected Transaction Value Optimization for Precise Marketing in FinTech Platforms
- Author
-
Weng, Yunpeng, Tang, Xing, Chen, Liang, Liu, Dugang, and He, Xiuqiang
- Subjects
Computer Science - Information Retrieval - Abstract
FinTech platforms facilitated by digital payments are growing rapidly, enabling the distribution of mutual funds personalized to individual investors via mobile apps. As important intermediaries for financial product investment, these platforms distribute thousands of mutual funds whose impressions are governed by a guaranteed delivery (GD) strategy required by fund companies. Driven by the profit from users' fund purchases, the platform aims to maximize each customer's transaction amount by promoting mutual funds to the investors most likely to be interested in them. Unlike conversions in traditional advertising or e-commerce recommendation, the investment amount varies greatly across purchases even for the same financial product, which poses a significant challenge for mutual fund promotion. In addition to predicting the click-through rate (CTR) or the conversion rate (CVR) as in traditional recommendation, it is essential for FinTech platforms to estimate each customer's purchase amount for every delivered fund and to allocate impressions based on the predicted results so as to optimize the total expected transaction value (ETV). In this paper, we propose an ETV-optimized customer allocation framework (EOCA) that aims to maximize the total ETV of recommended funds under the GD constraints negotiated with fund companies. To the best of our knowledge, this is the first attempt to solve the GD problem for financial product promotion based on customer purchase amount prediction. We conduct extensive experiments on large-scale real-world datasets and online tests on LiCaiTong, Tencent's wealth management platform, to demonstrate the effectiveness of our proposed EOCA framework., Comment: Accepted by Workshop on Deep Learning Practice for High-Dimensional Sparse Data in RecSys'23 (DLP@RecSys), Singapore, 2023
- Published
- 2024
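The allocation problem the abstract describes, maximizing total predicted ETV while every fund still receives its guaranteed minimum impressions, can be illustrated with a simple greedy-repair heuristic: assign each user to their highest-ETV fund, then move the cheapest-to-move users onto under-served funds. This is a stand-in sketch for intuition; the abstract does not specify EOCA's actual optimization procedure:

```python
import numpy as np

def allocate_with_gd(etv, min_impr):
    """Greedy user-to-fund allocation under guaranteed-delivery minimums.

    etv      : (n_users, n_funds) predicted expected transaction values
    min_impr : per-fund minimum impression counts (sum <= n_users)
    """
    n_users, n_funds = etv.shape
    assign = etv.argmax(1)                      # unconstrained best fund per user
    for j in range(n_funds):                    # repair any GD shortfall for fund j
        while (assign == j).sum() < min_impr[j]:
            movable = np.flatnonzero(assign != j)
            # only take donors whose current fund stays above its own minimum
            ok = [u for u in movable
                  if (assign == assign[u]).sum() > min_impr[assign[u]]]
            # move the user whose reassignment sacrifices the least ETV
            u = min(ok, key=lambda u: etv[u, assign[u]] - etv[u, j])
            assign[u] = j
    return assign
```

Because the loop always moves the user with the smallest ETV loss, the repaired allocation satisfies every fund's minimum while giving up as little predicted transaction value as this greedy order allows.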