99,354 results on '"LI, Ping"'
Search Results
2. FASP: Fast and Accurate Structured Pruning of Large Language Models
- Author
Hu, Hanyu, Zhao, Pengxiang, Li, Ping, Zheng, Yi, Wang, Zhefeng, and Yuan, Xiaoming
- Subjects
Computer Science - Machine Learning
- Abstract
The rapid increase in the size of large language models (LLMs) has significantly escalated their computational and memory demands, posing challenges for efficient deployment, especially on resource-constrained devices. Structured pruning has emerged as an effective model compression method that can reduce these demands while preserving performance. In this paper, we introduce FASP (Fast and Accurate Structured Pruning), a novel structured pruning framework for LLMs that emphasizes both speed and accuracy. FASP employs a distinctive pruning structure that interlinks sequential layers, allowing for the removal of columns in one layer while simultaneously eliminating corresponding rows in the preceding layer without incurring additional performance loss. The pruning metric, inspired by Wanda, is computationally efficient and effectively selects components to prune. Additionally, we propose a restoration mechanism that enhances model fidelity by adjusting the remaining weights post-pruning. We evaluate FASP on the OPT and LLaMA model families, demonstrating superior performance in terms of perplexity and accuracy on downstream tasks compared to state-of-the-art methods. Our approach achieves significant speed-ups, pruning models such as OPT-125M in 17 seconds and LLaMA-30B in 15 minutes on a single NVIDIA RTX 4090 GPU, making it a highly practical solution for optimizing LLMs.
- Published
- 2025
3. Chern numbers on positive vector bundles and combinatorics
- Author
Li, Ping
- Subjects
Mathematics - Differential Geometry, Mathematics - Algebraic Geometry, Mathematics - Combinatorics, 32Q55, 57R20, 06A07, 32M10
- Abstract
Combinatorial ideas are developed in this article to study Chern numbers on ample and numerically effective vector bundles. An effective lower bound for Chern numbers of ample vector bundles is established, which makes some progress towards a long-standing question. Along this line we prove that Chern numbers on nef vector bundles obey reverse dominance ordering, which improves upon some classical and recent results. We propose a simultaneous positivity question on (signed) Chern numbers of compact complex or Kähler manifolds whose (co)tangent bundles are semipositive in various senses, and show that it holds true for compact homogeneous complex manifolds.
- Comment
18 pages
- Published
- 2025
4. Large Language Model is Secretly a Protein Sequence Optimizer
- Author
Wang, Yinkai, He, Jiaxing, Du, Yuanqi, Chen, Xiaohui, Li, Jianan Canal, Liu, Li-Ping, Xu, Xiaolin, and Hassoun, Soha
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Quantitative Methods
- Abstract
We consider the protein sequence engineering problem, which aims to find protein sequences with high fitness levels, starting from a given wild-type sequence. Directed evolution has been a dominant paradigm in this field, using an iterative process to generate variants and select among them via experimental feedback. We demonstrate that large language models (LLMs), despite being trained on massive texts, are secretly protein sequence optimizers. With a directed evolutionary method, an LLM can perform protein engineering through Pareto and experiment-budget constrained optimization, demonstrating success on both synthetic and experimental fitness landscapes.
- Comment
Preprint
- Published
- 2025
5. Intelligent Reflecting Surfaces Aided Wireless Network: Deployment Architectures and Solutions
- Author
Wu, Qingqing, Chen, Guangji, Peng, Qiaoyan, Chen, Wen, Yuan, Yifei, Cheng, Zhenqiao, Dou, Jianwu, Zhao, Zhiyong, and Li, Ping
- Subjects
Electrical Engineering and Systems Science - Signal Processing
- Abstract
Intelligent reflecting surfaces (IRSs) have emerged as a transformative technology for wireless networks by improving coverage, capacity, and energy efficiency through intelligent manipulation of wireless propagation environments. This paper provides a comprehensive study on the deployment and coordination of IRSs for wireless networks. By addressing both single- and multi-reflection IRS architectures, we examine their deployment strategies across diverse scenarios, including point-to-point, point-to-multipoint, and point-to-area setups. For the single-reflection case, we highlight the trade-offs between passive and active IRS architectures in terms of beamforming gain, coverage extension, and spatial multiplexing. For the multi-reflection case, we discuss practical strategies to optimize IRS deployment and element allocation, balancing cooperative beamforming gains and path loss. The paper further discusses practical challenges in IRS implementation, including environmental conditions, system compatibility, and hardware limitations. Numerical results and field tests validate the effectiveness of IRS-aided wireless networks and demonstrate their capacity and coverage improvements. Lastly, promising research directions, including movable IRSs, near-field deployments, and network-level optimization, are outlined to guide future investigations.
- Published
- 2025
6. Learning from Ambiguous Data with Hard Labels
- Author
Xie, Zeke, He, Zheng, Lu, Nan, Bai, Lichen, Li, Bao, Yang, Shuo, Sun, Mingming, and Li, Ping
- Subjects
Computer Science - Machine Learning
- Abstract
Real-world data often contains intrinsic ambiguity that the common single-hard-label annotation paradigm ignores. Standard training using ambiguous data with these hard labels may produce overly confident models and thus lead to poor generalization. In this paper, we propose a novel framework called Quantized Label Learning (QLL) to alleviate this issue. First, we formulate QLL as learning from (very) ambiguous data with hard labels: ideally, each ambiguous instance should be associated with a ground-truth soft-label distribution describing its corresponding probabilistic weight in each class; however, this is usually not accessible. In practice, we can only observe a quantized label of each instance, i.e., a hard label sampled (quantized) from the corresponding ground-truth soft-label distribution, which can be seen as a biased approximation of the ground-truth soft label. Second, we propose a Class-wise Positive-Unlabeled (CPU) risk estimator that allows us to train accurate classifiers from only ambiguous data with quantized labels. Third, to simulate ambiguous datasets with quantized labels in the real world, we design a mixing-based ambiguous data generation procedure for empirical evaluation. Experiments demonstrate that our CPU method can significantly improve model generalization performance and outperform the baselines.
- Comment
9 pages, 4 figures, accepted by ICASSP 2025
- Published
- 2025
7. Graph Generative Pre-trained Transformer
- Author
Chen, Xiaohui, Wang, Yinkai, He, Jiaxing, Du, Yuanqi, Hassoun, Soha, Xu, Xiaolin, and Liu, Li-Ping
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Graph generation is a critical task in numerous domains, including molecular design and social network analysis, due to its ability to model complex relationships and structured data. While most modern graph generative models utilize adjacency matrix representations, this work revisits an alternative approach that represents graphs as sequences of node set and edge set. We advocate for this approach due to its efficient encoding of graphs and propose a novel representation. Based on this representation, we introduce the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model that learns graph structures via next-token prediction. To further exploit G2PT's capabilities as a general-purpose foundation model, we explore fine-tuning strategies for two downstream applications: goal-oriented generation and graph property prediction. We conduct extensive experiments across multiple datasets. Results indicate that G2PT achieves superior generative performance on both generic graph and molecule datasets. Furthermore, G2PT exhibits strong adaptability and versatility in downstream tasks from molecular design to property prediction.
- Comment
preprint
- Published
- 2025
8. $B_c$, $B_s$ and $D_s$ production at lepton-hadron colliders
- Author
Hu, Qiwei, Qiao, Cong-Feng, and Sun, Li-Ping
- Subjects
High Energy Physics - Phenomenology
- Abstract
Within the framework of nonrelativistic quantum chromodynamics, this study examines the electroproduction processes $e+p\to e+B_c+\overline{c}+b$, $e+p\to e+B_s+\overline{s}+b$, and $e+p\to e+D_s+\overline{c}+s$ at lepton-hadron colliders. The differential cross sections in $\cos\theta$ and $p_T^2$ at HERA are presented. The results indicate that the production of $B_c$ is feasible at the EIC, whereas it is not at HERA. The cross sections for $B_s$ and $D_s$ are notably significant at HERA, yet they exhibit sensitivity to variations in quark mass and renormalization scale.
- Comment
15 pages, 4 figures
- Published
- 2024
9. Superoutbursts and Positive Superhumps Occurred During the Standstill of a Z Cam-type Dwarf Nova
- Author
Sun, Qi-Bin, Qian, Sheng-Bang, Zhu, Li-Ying, Li, Qin-Mei, Li, Fu-Xing, Li, Min-Yu, and Li, Ping
- Subjects
Astrophysics - Solar and Stellar Astrophysics
- Abstract
Dwarf novae are semi-detached binaries, where a white dwarf accretes material from a cool main-sequence companion via an accretion disk, and are known for their intermittent outbursts, making them key systems for studying accretion physics. The accumulation of large survey datasets has challenged traditional models, which assumed that the disk remains hot and cannot produce superoutbursts during the standstill of Z Cam-type dwarf novae and that superoutbursts require a mass ratio of q = M_2/M_1 < 0.25 - 0.33. Here we report the detection of superoutbursts and positive superhumps (PSHs) during a standstill in the Z Cam-type star AT Cnc with a mass ratio larger than 0.33. Notably, the PSHs evolve gradually before the superoutburst begins, suggesting that an eccentric, precessing disk forms first, with the superoutburst occurring as the disk radius continues to expand. These findings provide the first detailed observational evidence of superoutbursts and PSHs occurring during standstill, offering important new insights into the classification of dwarf novae and the underlying mechanisms of outbursts.
- Comment
17 pages, 8 figures and 2 tables
- Published
- 2024
10. Building Gradient Bridges: Label Leakage from Restricted Gradient Sharing in Federated Learning
- Author
Zhang, Rui, Chow, Ka-Ho, and Li, Ping
- Subjects
Computer Science - Machine Learning, Computer Science - Cryptography and Security
- Abstract
The growing concern over data privacy, the benefits of utilizing data from diverse sources for model training, and the proliferation of networked devices with enhanced computational capabilities have all contributed to the rise of federated learning (FL). The clients in FL collaborate to train a global model by uploading gradients computed on their private datasets without collecting raw data. However, a new attack surface has emerged from gradient sharing, where adversaries can restore the label distribution of a victim's private data by analyzing the obtained gradients. To mitigate this privacy leakage, existing lightweight defenses restrict the sharing of gradients, such as encrypting the final-layer gradients or locally updating the parameters within. In this paper, we introduce a novel attack called Gradient Bridge (GDBR) that recovers the label distribution of training data from the limited gradient information shared in FL. GDBR explores the relationship between the layer-wise gradients, tracks the flow of gradients, and analytically derives the batch training labels. Extensive experiments show that GDBR can accurately recover more than 80% of labels in various FL settings. GDBR highlights the inadequacy of restricted gradient sharing-based defenses and calls for the design of effective defense schemes in FL.
- Published
- 2024
11. Phase diagram of Rydberg atoms in a two-leg rectangular ladder
- Author
Liao, Shu-Ao, Zhang, Jin, and Yang, Li-Ping
- Subjects
Condensed Matter - Quantum Gases, High Energy Physics - Lattice, Physics - Atomic Physics, Quantum Physics
- Abstract
Using the density matrix renormalization group algorithm, we map the ground-state phase diagram of a two-leg Rydberg ladder array with lattice spacings $a_x=2a_y$. We identify various density wave phases that spontaneously break the translational symmetry or the top-bottom reflection symmetry within the ladder. By increasing the laser detuning from zero, where the system is in a disordered phase that preserves all symmetries, we observe density wave orders with spontaneous breaking of the translational $\mathbb{Z}_p$ symmetries at intermediate detuning values, while the reflection symmetry is preserved. These orders exhibit nonzero bond orders with positive expectation values on every $p$th rung, thus labeled as $\mathbb{Z}_p^+$ phases. At larger detuning values, another spontaneous breaking of the reflection symmetry, which disrupts the bond orders on the rungs, occurs via an Ising phase transition. In these phases, either the top or the bottom site is occupied in a staggered way on every $p$th rung, breaking the translational $\mathbb{Z}_{2p}$ symmetry, thus labeled as $\mathbb{Z}_{2p}$ phases. We locate and characterize the 3-state Potts point and Ashkin-Teller point along the commensurate lines, as well as the direct chiral phase transitions between the disordered phase and the $\mathbb{Z}_p^+$ ($p = 3, 4$) phases. Critical exponents $\nu$ and $z$ are calculated for both conformal and chiral phase transition points. We finally identify two types of floating phases in the phase diagram: one characterized by a quasi-long-range incommensurate bond-order wave, and the other by a quasi-long-range incommensurate wave of density differences in the rungs. Our work motivates further applications of Rydberg atom arrays in quantum simulation.
- Comment
15 pages, 13 figures
- Published
- 2024
12. Simplex tensor network renormalization group for boundary theory of 3+1D symTFT
- Author
Ji, Kaixin, Chen, Lin, Yang, Li-Ping, and Hung, Ling-Yan
- Subjects
Condensed Matter - Strongly Correlated Electrons, High Energy Physics - Theory
- Abstract
Following the construction in arXiv:2210.12127, we develop a symmetry-preserving renormalization group (RG) flow for 3D symmetric theories. These theories are expressed as boundary conditions of a symTFT, which in our case is a 3+1D Dijkgraaf-Witten topological theory in the bulk. The boundary is geometrically organized into tetrahedra and represented as a tensor network, which we refer to as the "simplex tensor network" state. Each simplex tensor is assigned indices corresponding to its vertices, edges, and faces. We propose a numerical algorithm to implement RG flows for these boundary conditions, and explicitly demonstrate its application to a $\mathbb{Z}_2$ symmetric theory. By linearly interpolating between three topological fixed-point boundaries, we map the phase transitions characterized by local and non-local order parameters, which respectively detect the breaking of a 0-form and a 2-form symmetry. This formalism is readily extendable to other discrete symmetry groups and, in principle, can be generalized to describe 3D symmetric topological orders.
- Published
- 2024
13. MONOPOLY: Learning to Price Public Facilities for Revaluing Private Properties with Large-Scale Urban Data
- Author
Fan, Miao, Huang, Jizhou, Zhuo, An, Li, Ying, Li, Ping, and Wang, Haifeng
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Social and Information Networks
- Abstract
The value assessment of private properties is an attractive but challenging task that concerns a large number of people around the world. A perennial question is "how much is my house worth?". To answer this question, most experienced agencies price a property given its attributes as well as the demographics and the public facilities around it. However, no one knows the exact prices of these factors, especially the values of public facilities which may help assess private properties. In this paper, we introduce our newly launched project "Monopoly" (named after a classic board game) in which we propose a distributed approach for revaluing private properties by learning to price public facilities (such as hospitals) with the large-scale urban data we have accumulated via Baidu Maps. To be specific, our method organizes many points of interest (POIs) into an undirected weighted graph and formulates multiple factors, including the virtual prices of surrounding public facilities, as adaptive variables to estimate in parallel the housing prices we know. Then the prices of both public facilities and private properties can be iteratively updated according to the loss of prediction until convergence. We have conducted extensive experiments with the large-scale urban data of several metropolises in China. Results show that our approach outperforms several mainstream methods by significant margins. Further insights from more in-depth discussions demonstrate that "Monopoly" is an innovative application in the interdisciplinary field of business intelligence and urban computing, and it will be beneficial to tens of millions of our users for investments and to the governments for urban planning as well as taxation.
- Comment
CIKM'19
- Published
- 2024
14. Effects of Muscle Synergy during Overhead Work with a Passive Shoulder Exoskeleton: A Case Study
- Author
Tian, Jin, Wei, Baichun, Yang, Chifu, Luo, Suo, Feng, Jiadong, Li, Ping, Chen, Changbing, Liu, Yingjie, Zhu, Haiqi, and Yi, Chunzhi
- Subjects
Physics - Medical Physics, Computer Science - Robotics
- Abstract
Objective: Shoulder exoskeletons can effectively assist with overhead work. However, their impacts on muscle synergy remain unclear. The objective is to systematically investigate the effects of the shoulder exoskeleton on muscle synergies during overhead work. Methods: Eight male participants were recruited to perform a screwing task both with (Intervention) and without (Normal) the exoskeleton. Eight muscles were monitored and muscle synergies were extracted using non-negative matrix factorization and electromyographic topographic maps. Results: The number of synergies extracted was the same (n = 2) in both conditions. Specifically, the first synergies in both conditions were identical, with the highest weight of AD and MD; while the second synergies were different between conditions, with highest weight of PM and MD, respectively. As for the first synergy in the Intervention condition, the activation profile significantly decreased, and the average recruitment level and activation duration were significantly lower (p<0.05). The regression analysis for the muscle synergies across conditions shows the changes of muscle synergies did not influence the sparseness of muscle synergies (p=0.7341). In the topographic maps, the mean value exhibited a significant decrease (p<0.001) and the entropy significantly increased (p<0.01). Conclusion: The exoskeleton does not alter the number of synergies and existing major synergies but may induce new synergies. It can also significantly decrease neural activation and may influence the heterogeneity of the distribution of monitored muscle activations. Significance: This study provides insights into the potential mechanisms of exoskeleton-assisted overhead work and guidance on improving the performance of exoskeletons.
- Published
- 2024
15. CZ Aqr: an oscillating eclipsing Algol-type system composed of a $\delta$ Sct primary star and a subgiant star in a quadruple system
- Author
Zeng, Qi-Huan, Liao, Wen-Ping, Qian, Sheng-Bang, Li, Lin-Jia, Li, Ping, and Deng, Zhao-Long
- Subjects
Astrophysics - Solar and Stellar Astrophysics
- Abstract
Eclipsing Algol-type systems containing a $\delta$ Scuti (hereafter $\delta$ Sct) star enable precise determination of physical parameters and the investigation of stellar internal structure and evolution. We present the absolute parameters of CZ Aquarius (hereafter CZ Aqr) based on TESS data. CZ Aqr has an orbital period of 0.86275209 d, a mass ratio of 0.489 (6), and the secondary component nearly fills its Roche lobe. $O-C$ analysis reveals a downward parabolic trend and a cyclical variation with a period of 88.2 yr. The downward parabola suggests a long-term decrease in the orbital period with $\dot{P} = -3.09\times10^{-8}$ d yr$^{-1}$. The mass loss rate is estimated to be $4.54\times10^{-9}$ M$_{\odot}$ yr$^{-1}$, which is possibly due to a magnetic stellar wind or a hot spot. The cyclical variation might be caused by the light travel time effect via the presence of a third body with a minimum mass of $M_{3,\mathrm{min}}$ = 0.312 (21) M$_{\odot}$. Additionally, there are two possible celestial bodies in a 2:7 resonance orbit around CZ Aqr. The asymmetric light curve is explained by adding a hot spot on the surface of the primary star. After removing the binary model, 26 frequencies were extracted from TESS data. Two radial modes were newly identified among three possible independent frequencies. Our results show that the eclipsing Algol-type system is composed of a $\delta$ Sct primary star and a subgiant star in a quadruple system.
- Published
- 2024
16. Multifield tunable valley splitting and anomalous valley Hall effect in two-dimensional antiferromagnetic MnBr
- Author
Wang, Yiding, Sun, Hanbo, Wu, Chao, Zhang, Weixi, Guo, San-Dong, She, Yanchao, and Li, Ping
- Subjects
Condensed Matter - Materials Science, Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Compared to the ferromagnetic materials that realize the anomalous valley Hall effect by breaking time-reversal symmetry and spin-orbit coupling, antiferromagnetic materials with the joint spatial inversion and time-reversal (PT) symmetry that achieve the anomalous valley Hall effect are rarely reported. Here, we predict that the antiferromagnetic monolayer MnBr possesses spontaneous valley polarization. The valley splitting of the valence band maximum is 21.55 meV at the K and K' points, which, according to an analysis of the effective Hamiltonian, originates from the Mn $d_{x^2-y^2}$ orbital. Importantly, monolayer MnBr has zero Berry curvature in the entire momentum space but non-zero spin-layer locked Berry curvature, which offers the condition for the anomalous valley Hall effect. In addition, the magnitude of valley splitting can be significantly tuned by the onsite correlation, strain, magnetization rotation, electric field, and built-in electric field. The electric field and built-in electric field induce spin splitting due to breaking the P symmetry. Therefore, the spin-layer locked anomalous valley Hall effect can be observed in MnBr. More remarkably, the ferroelectric substrate Sc$_2$CO$_2$ can tune monolayer MnBr to realize the transition from metal to valley-polarized semiconductor. Our findings not only extend the implementation of the anomalous valley Hall effect, but also provide a platform for designing low-power and non-volatile valleytronics devices.
- Comment
9 pages, 10 figures
- Published
- 2024
17. Proposal of a general scheme: valley polarization in antiferromagnetic bilayer systems
- Author
Guo, San-Dong, Li, Ping, and Wang, Guangzhao
- Subjects
Condensed Matter - Materials Science
- Abstract
Superior to ferromagnetic (FM) valleytronics, its antiferromagnetic (AFM) counterpart exhibits ultradense and ultrafast potential due to the intrinsic advantages of antiferromagnets: zero stray field, terahertz dynamics, and compensated moment. However, the physics of spontaneous valley polarization is mainly rooted in FM hexagonal lattices and is rarely used to explore the simultaneous spin and valley polarizations in AFM materials. Here, we propose a general stacking scheme to achieve valley polarization in AFM bilayer systems. The hexagonal ferrovalley material is used as the basic building unit, and then the space-inversion centrosymmetric bilayer system with interlayer AFM ordering is constructed by horizontal mirror and 2-fold rotational operations, which can exhibit spontaneous valley polarization. In this construction process, the rarely explored layer-locked hidden valley polarization, hidden Berry curvature and layer Hall effect are involved, and an out-of-plane electric field can be used to detect hidden valley polarization and to realize the layer-locked anomalous valley Hall effect. We use three examples to illustrate our proposal. Firstly, the Janus GdBrI is used to prove concepts and effects involved in our design process. Secondly, $\mathrm{RuBr_2}$ is used to demonstrate other phenomena, including valley polarization transition and a near-ideal quantum spin Hall insulator. Finally, we use our design principles to understand the valley polarization of experimentally synthesized MnSe from a new perspective. Our work establishes a robust general scheme to achieve valley polarization in AFM bilayer systems, thereby opening up new avenues for AFM valleytronics.
- Comment
7 pages, 4 figures
- Published
- 2024
18. Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning
- Author
Li, Ping, Wang, Tao, Zhao, Xinkui, Xu, Xianghua, and Song, Mingli
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video captioning generates a sentence that describes the video content. Existing methods always require a number of captions (e.g., 10 or 20) per video to train the model, which is quite costly. In this work, we explore the possibility of using only one or very few ground-truth sentences, and introduce a new task named few-supervised video captioning. Specifically, we propose a few-supervised video captioning framework that consists of a lexically constrained pseudo-labeling module and a keyword-refined captioning module. Unlike the random sampling in natural language processing that may cause invalid modifications (i.e., edit words), the former module guides the model to edit words using some actions (e.g., copy, replace, insert, and delete) by a pretrained token-level classifier, and then fine-tunes candidate sentences by a pretrained language model. Meanwhile, the former employs repetition-penalized sampling to encourage the model to yield concise pseudo-labeled sentences with less repetition, and selects the most relevant sentences upon a pretrained video-text model. Moreover, to keep semantic consistency between pseudo-labeled sentences and video content, we develop the transformer-based keyword refiner with the video-keyword gated fusion strategy to emphasize more on relevant words. Extensive experiments on several benchmarks demonstrate the advantages of the proposed approach in both few-supervised and fully-supervised scenarios. The code implementation is available at https://github.com/mlvccn/PKG_VidCap.
- Comment
12 figures, Accepted in Pattern Recognition
- Published
- 2024
19. Gradient Descent Finds Over-Parameterized Neural Networks with Sharp Generalization for Nonparametric Regression
- Author
Yang, Yingzhen and Li, Ping
- Subjects
Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
- Abstract
We study nonparametric regression by an over-parameterized two-layer neural network trained by gradient descent (GD) in this paper. We show that, if the neural network is trained by GD with early stopping, then the trained network renders a sharp rate of the nonparametric regression risk of $\mathcal{O}(\varepsilon_n^2)$, which is the same rate as that for the classical kernel regression trained by GD with early stopping, where $\varepsilon_n$ is the critical population rate of the Neural Tangent Kernel (NTK) associated with the network and $n$ is the size of the training data. It is remarked that our result does not require distributional assumptions about the covariate as long as the covariate is bounded, in strong contrast with many existing results which rely on specific distributions of the covariates such as the spherical uniform data distribution or distributions satisfying certain restrictive conditions. The rate $\mathcal{O}(\varepsilon_n^2)$ is known to be minimax optimal for specific cases, such as the case that the NTK has a polynomial eigenvalue decay rate which happens under certain distributional assumptions on the covariates. Our result formally fills the gap between training a classical kernel regression model and training an over-parameterized but finite-width neural network by GD for nonparametric regression without distributional assumptions on the bounded covariate. We also provide confirmative answers to certain open questions or address particular concerns in the literature of training over-parameterized neural networks by GD with early stopping for nonparametric regression, including the characterization of the stopping time, the lower bound for the network width, and the constant learning rate used in GD.
- Comment
This article draws results with revisions from the first author's other work in arXiv:2407.11353. arXiv admin note: text overlap with arXiv:2407.11353
- Published
- 2024
20. Graph-based Confidence Calibration for Large Language Models
- Author
Li, Yukun, Wang, Sijia, Huang, Lifu, and Liu, Li-Ping
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, Computer Science - Machine Learning
- Abstract
One important approach to improving the reliability of large language models (LLMs) is to provide accurate confidence estimations regarding the correctness of their answers. However, developing a well-calibrated confidence estimation model is challenging, as mistakes made by LLMs can be difficult to detect. We propose a novel method combining the LLM's self-consistency with labeled data and training an auxiliary model to estimate the correctness of its responses to questions. This auxiliary model predicts the correctness of responses based solely on their consistent information. To set up the learning problem, we use a weighted graph to represent the consistency among the LLM's multiple responses to a question. Correctness labels are assigned to these responses based on their similarity to the correct answer. We then train a graph neural network to estimate the probability of correct responses. Experiments demonstrate that the proposed approach substantially outperforms several of the most recent methods in confidence calibration across multiple widely adopted benchmark datasets. Furthermore, the proposed approach significantly improves the generalization capability of confidence calibration on out-of-domain (OOD) data.
- Published
- 2024
21. MassSpecGym: A benchmark for the discovery and identification of molecules
- Author
Bushuiev, Roman, Bushuiev, Anton, de Jonge, Niek F., Young, Adamo, Kretschmer, Fleming, Samusevich, Raman, Heirman, Janne, Wang, Fei, Zhang, Luke, Dührkop, Kai, Ludwig, Marcus, Haupt, Nils A., Kalia, Apurva, Brungs, Corinna, Schmid, Robin, Greiner, Russell, Wang, Bo, Wishart, David S., Liu, Li-Ping, Rousu, Juho, Bittremieux, Wout, Rost, Hannes, Mak, Tytus D., Hassoun, Soha, Huber, Florian, van der Hooft, Justin J. J., Stravs, Michael A., Böcker, Sebastian, Sivic, Josef, and Pluskal, Tomáš
- Subjects
Quantitative Biology - Quantitative Methods, Computer Science - Machine Learning
- Abstract
The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts. As a result, the vast majority of acquired MS/MS spectra remain uninterpreted, thereby limiting our understanding of the underlying (bio)chemical processes. Despite decades of progress in machine learning applications for predicting molecular structures from MS/MS spectra, the development of new methods is severely hindered by the lack of standard datasets and evaluation protocols. To address this problem, we propose MassSpecGym -- the first comprehensive benchmark for the discovery and identification of molecules from MS/MS data. Our benchmark comprises the largest publicly available collection of high-quality labeled MS/MS spectra and defines three MS/MS annotation challenges: de novo molecular structure generation, molecule retrieval, and spectrum simulation. It includes new evaluation metrics and a generalization-demanding data split, therefore standardizing the MS/MS annotation tasks and rendering the problem accessible to the broad machine learning community. MassSpecGym is publicly available at https://github.com/pluskal-lab/MassSpecGym.
- Published
- 2024
22. Beware of Calibration Data for Pruning Large Language Models
- Author
-
Ji, Yixin, Xiang, Yang, Li, Juntao, Xia, Qingrong, Li, Ping, Duan, Xinyu, Wang, Zhefeng, and Zhang, Min
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
As large language models (LLMs) are widely applied across various fields, model compression has become increasingly crucial for reducing costs and improving inference efficiency. Post-training pruning is a promising method that does not require resource-intensive iterative training and only needs a small amount of calibration data to assess the importance of parameters. Previous research has primarily focused on designing advanced pruning methods, while the impact of different calibration data on pruning performance still lacks systematic exploration. We fill this gap and, surprisingly, observe that the choice of calibration data matters even more than designing advanced pruning strategies, especially at high sparsity. Our preliminary exploration also discloses that using calibration data similar to the training data can yield better performance. As pre-training data is usually inaccessible for advanced LLMs, we further provide a self-generating calibration data synthesis strategy to construct feasible calibration data. We conduct experiments on recent strong open-source LLMs (e.g., DCLM and LLaMA-3), and the results show that the proposed method outperforms commonly used calibration data and can effectively enhance strong pruning methods (e.g., Wanda, OWL)., Comment: under review
- Published
- 2024
23. Toward Generalizing Visual Brain Decoding to Unseen Subjects
- Author
-
Kong, Xiangtao, Huang, Kexin, Li, Ping, and Zhang, Lei
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Visual brain decoding aims to decode visual information from human brain activities. Despite the great progress, one critical limitation of current brain decoding research lies in the lack of generalization capability to unseen subjects. Prior works typically focus on decoding brain activity of individuals based on the observation that different subjects exhibit different brain activities, while it remains unclear whether brain decoding can be generalized to unseen subjects. This study aims to answer this question. We first consolidate an image-fMRI dataset consisting of stimulus-image and fMRI-response pairs, involving 177 subjects in the movie-viewing task of the Human Connectome Project (HCP). This dataset allows us to investigate the brain decoding performance as the number of participants increases. We then present a learning paradigm that applies uniform processing across all subjects, instead of employing different network heads or tokenizers for individuals as in previous methods, which can accommodate a large number of subjects to explore the generalization capability across different subjects. A series of experiments are conducted and we have the following findings. First, the network exhibits clear generalization capabilities as the number of training subjects increases. Second, the generalization capability is common to popular network architectures (MLP, CNN and Transformer). Third, the generalization performance is affected by the similarity between subjects. Our findings reveal the inherent similarities in brain activities across individuals. With the emergence of larger and more comprehensive datasets, it is possible to train a brain decoding foundation model in the future. Codes and models can be found at https://github.com/Xiangtaokong/TGBD.
- Published
- 2024
24. Task Adaptive Feature Distribution Based Network for Few-shot Fine-grained Target Classification
- Author
-
Li, Ping, Wang, Hongbo, and Lu, Lei
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Metric-based few-shot fine-grained classification has shown promise due to its simplicity and efficiency. However, existing methods often overlook task-level special cases and struggle with accurate category description and irrelevant sample information. To tackle these, we propose TAFD-Net: a task adaptive feature distribution network. It features a task-adaptive component for embedding to capture task-level nuances, an asymmetric metric for calculating feature distribution similarities between query samples and support categories, and a contrastive measure strategy to boost performance. Extensive experiments have been conducted on three datasets and the experimental results show that our proposed algorithm outperforms recent incremental learning algorithms., Comment: The presentation logic of the algorithm section in the paper is unclear, and there are errors in the experimental part that need to be corrected, along with additional experiments to be conducted
- Published
- 2024
25. KM UMa: An active short-period detached eclipsing binary in a hierarchical quadruple system
- Author
-
Meng, Fangbin, Zhu, Liying, Liu, Nianping, Li, Ping, Zhang, Jia, Li, Linjia, and Matekov, Azizbek
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
The first detailed photometric and spectroscopic analysis of the G-type eclipsing binary KM UMa is presented, which indicates that the system is a short-period detached eclipsing binary. The radial velocity curves were calculated using the cross-correlation function method based on Large Sky Area Multi-Object Fiber Spectroscopic Telescope, Sloan Digital Sky Survey, and our observations, which determined the mass ratio as $q=0.45\ (\pm0.04)$. Based on the light curves from the Transiting Exoplanet Survey Satellite, other survey data, and our multiband observations, the positive and negative O'Connell effects have been detected evolving gradually and alternately over the last 20 yr, which can be explained by the presence of spots on the primary component. A superflare event was detected in the SuperWASP data on 2007 February 28, further indicating that KM UMa is a very active system. We calculated its energy to be $5\times10^{34}$ erg by assuming it occurred on the primary star. Utilizing hundreds of medium-resolution spectra and one low-resolution spectrum, the equivalent width variations of the $H_{\alpha}$ line were calculated, indicating the presence of a 5.21 ($\pm0.67$) yr magnetic activity cycle. The orbital period variations were analyzed using the O-C method, detecting a long-term decrease superimposed with a periodic variation. The amplitude of the cyclic variation is $0.01124\ (\pm0.00004)$ day, with a period of $33.66\ (\pm 0.0012)$ yr, which exceeds the 5.21 yr activity cycle, suggesting that this is more likely attributable to the light travel time effect of a third body. Simultaneously, a visual companion has been detected based on the Gaia astrometric data, indicating that KM UMa is actually in a 2+1+1 hierarchical quadruple system.
- Published
- 2024
26. Computer-aided Colorization State-of-the-science: A Survey
- Author
-
Cao, Yu, Duan, Xin, Meng, Xiangqiao, Mok, P. Y., Li, Ping, and Lee, Tong-Yee
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
This paper reviews published research in the field of computer-aided colorization technology. We argue that the colorization task originates from computer graphics, prospers by introducing computer vision, and tends to the fusion of vision and graphics, so we put forward our taxonomy and organize the whole paper chronologically. We extend the existing reconstruction-based colorization evaluation techniques, considering that aesthetic assessment of colored images should be introduced to ensure that colorization satisfies human visual-related requirements and emotions more closely. We perform the colorization aesthetic assessment on seven representative unconditional colorization models and discuss the difference between our assessment and the existing reconstruction-based metrics. Finally, this paper identifies unresolved issues and proposes fruitful areas for future research and development. Access to the project associated with this survey can be obtained at https://github.com/DanielCho-HK/Colorization.
- Published
- 2024
27. Ferroelectricity-Driven Metallicity and Magnetic Skyrmions in van der Waals Cr2Ge2Te6/Hf2Ge2Te6 Multiferroic Heterostructure
- Author
-
Chen, Zheng, Hu, Hongliang, Zhang, Wenjun, Wu, Xiaoping, Li, Ping, and Song, Changsheng
- Subjects
Condensed Matter - Materials Science ,Physics - Computational Physics - Abstract
Two-dimensional (2D) multiferroic heterostructures present a promising platform for advanced spin devices by leveraging the coexisting ferromagnetic (FM) and ferroelectric (FE) orders. Through first-principles calculations and micromagnetic simulations, we reveal non-volatile control of metallicity and topological spin textures in the Cr2Ge2Te6/Hf2Ge2Te6 (CGT/HGT) heterostructure. Notably, manipulating ferroelectric polarization in HGT significantly modulates the magnetic anisotropy energy (MAE) and Dzyaloshinskii-Moriya interaction (DMI) of CGT/HGT, reversing the easy magnetization axis from in-plane to out-of-plane. By analyzing the atomic-resolved SOC energy ($\Delta E_{\mathrm{soc}}$), it is found that the change arises from the Fert-Levy mechanism. Additionally, this polarization control enables the creation and annihilation of bimerons and skyrmions, with interlayer sliding further altering magnetic ordering. Our findings offer valuable insights into magnetoelectric coupling and spin texture manipulation in 2D magnets, highlighting their potential for next-generation spintronic and memory devices., Comment: 9 pages, 5 figures
- Published
- 2024
28. Laboratorial radiative shocks with multiple parameters and first quantifying verifications to core-collapse supernovae
- Author
-
Zhang, Lu, Zheng, Jianhua, Yang, Zhenghua, Song, Tianming, Zhang, Shuai, Liu, Tong, Wei, Yunfeng, Kuang, Longyu, Jing, Longfei, Lin, Zhiwei, Li, Liling, Li, Hang, Zheng, Jinhua, Yang, Pin, Zhang, Yuxue, Zhang, Zhiyu, Zhao, Yang, He, Zhibing, Li, Ping, Yang, Dong, Yang, Jiamin, Zhao, Zongqing, and Ding, Yongkun
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,High Energy Physics - Experiment ,Physics - Plasma Physics - Abstract
We present experiments to reproduce the characteristics of core-collapse supernovae with different stellar masses and initial explosion energies in the laboratory. In the experiments, shocks are driven in 1.2 atm and 1.9 atm xenon gas by a laser with energy from 1600 J to 2800 J on the SGIII prototype laser facility. The average shock velocities and shocked densities are obtained from experiments. Experimental results reveal that higher laser energy and lower Xe gas density lead to higher shock velocity, and that lower initial Xe gas density yields higher compression. Modeling of the experiments using the 2D radiation hydrodynamic code Icefire shows excellent agreement with the experimental results and gives the temperature. These results will contribute to time-domain astrophysical systems, such as gravitational supernovae, where a strong radiative shock propagates outward from the center of the star after the core collapses., Comment: 7 pages, 2 figures, 1 supplement (8 pages, 3 figures, 2 tables), accepted for publication in Science Bulletin
- Published
- 2024
29. Bridging the Gap: GRB 230812B -- A Three-Second Supernova-Associated Burst Detected by the GRID Mission
- Author
-
Wang, Chen-Yu, Yin, Yi-Han Iris, Zhang, Bin-Bin, Feng, Hua, Zeng, Ming, Xiong, Shao-Lin, Pan, Xiao-Fan, Yang, Jun, Zhang, Yan-Qiu, Li, Chen, Yan, Zhen-Yu, Wang, Chen-Wei, Zheng, Xu-Tao, Liu, Jia-Cong, Wang, Qi-Dong, Yang, Zi-Rui, Li, Long-Hao, Liu, Qi-Ze, Zhao, Zheng-Yang, Hu, Bo, Liu, Yi-Qi, Lu, Si-Yuan, Luo, Zi-You, Cang, Ji-Rong, Cao, De-Zhi, Han, Wen-Tao, Jia, Li-Ping, Pan, Xing-Yu, Tian, Yang, Xu, Ben-Da, Yang, Xiao, and Zeng, Zhi
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
GRB 230812B, detected by the Gamma-Ray Integrated Detectors (GRID) constellation mission, is an exceptionally bright gamma-ray burst (GRB) with a duration of only 3 seconds. Sitting near the traditional boundary ($\sim$ 2 s) between long and short GRBs, GRB 230812B is notably associated with a supernova (SN), indicating a massive star progenitor. This makes it a rare example of a short-duration GRB resulting from stellar collapse. Our analysis, using a time-evolving synchrotron model, suggests that the burst has an emission radius of approximately $10^{14.5}$~cm. We propose that the short duration of GRB 230812B is due to the combined effects of the central engine's activity time and the time required for the jet to break through the stellar envelope. Our findings provide another case that challenges the conventional view that short-duration GRBs originate exclusively from compact object mergers, demonstrating that a broader range of durations exists for GRBs arising from the collapse of massive stars., Comment: 10 pages, 3 tables, 11 figures
- Published
- 2024
30. Conflict-free chromatic index of trees
- Author
-
Guo, Shanshan, Li, Ethan Y. H., Li, Luyi, and Li, Ping
- Subjects
Computer Science - Discrete Mathematics ,Mathematics - Combinatorics - Abstract
A graph $G$ is conflict-free $k$-edge-colorable if there exists an assignment of $k$ colors to $E(G)$ such that for every edge $e\in E(G)$, there is a color that is assigned to exactly one edge among the closed neighborhood of $e$. The smallest $k$ such that $G$ is conflict-free $k$-edge-colorable is called the conflict-free chromatic index of $G$, denoted $\chi'_{CF}(G)$. Dębski and Przybyło showed that $2\le\chi'_{CF}(T)\le 3$ for every tree $T$ of size at least two. In this paper, we present an algorithm to determine the conflict-free chromatic index of a tree without 2-degree vertices, in time $O(|V(T)|)$. This partially answers a question raised by Kamyczura, Meszka and Przybyło.
- Published
- 2024
31. Phaseless uniqueness for determining internal source in photo-thermal effect
- Author
-
Deng, Li-Ping, Liu, Hongyu, Miao, Zhi-Qiang, and Zheng, Guang-Hui
- Subjects
Mathematics - Analysis of PDEs - Abstract
The paper investigates an inverse problem of recovering the internal source from external temperature measurements in photo-thermal effect. The photo-thermal effect actually involves two physical processes: electromagnetic scattering and heat transfer, described by a nonlinear coupled system of Maxwell's equation and the heat transfer equation. The nonlinear coupling term in the system is represented by the square of the modulus of the electromagnetic (missing the phase information of the electromagnetic field), and the absence of this phase information poses a significant challenge to the reconstruction of the internal source. In addition, the interaction and mutual influence of multiple physical fields, including electric field, magnetic field and temperature field, add to the complexity involved in the inversion of the internal source. Based on the potential theory and asymptotic analysis, we prove that the internal source can be uniquely determined up to sign by the external temperature field. This provides a solid theoretical basis for designing the internal source inversion algorithm and further exploring the theoretical aspects of photo-thermal effect.
- Published
- 2024
32. Dynamical Sampling in Shift-Invariant Spaces Associated with multi-dimensional Special Affine Fourier Transform
- Author
-
Ning, Meng, Wu, Li-Ping, Zhang, Qing-yue, and Liu, Bei
- Subjects
Mathematics - Functional Analysis ,94A20, 94A12, 42C15 - Abstract
The Special Affine Fourier Transformation (SAFT), which generalizes several well-known unitary transformations, has been demonstrated as a valuable tool in signal processing and optics. In this paper, we explore the multivariate dynamical sampling problem in shift-invariant spaces associated with the multi-dimensional SAFT. Specifically, we derive a sufficient and necessary condition under which a function in a shift-invariant space can be stably recovered from its dynamical sampling measurements associated with the multi-dimensional SAFT. We also present a straightforward example to elucidate our main result., Comment: 22 pages, 11 figures
- Published
- 2024
33. Rotating black holes in de Rham-Gabadadze-Tolley massive gravity: Analytic calculation procedure
- Author
-
Li, Ping and Yang, Jiang-he
- Subjects
General Relativity and Quantum Cosmology - Abstract
In this paper, we explore the solutions of rotating black holes within the framework of de Rham-Gabadadze-Tolley (dRGT) massive gravity. We provide a detailed, step-by-step analytical derivation of these solutions. Our solutions are characterized by several parameters: mass $M$, electric charge $Q_{*}$, angular momentum $a$, and a graviton mass $m$. This graviton mass term incorporates both a cosmological constant $\Lambda$ and a Stückelberg charge $S_{*}$ into the black hole parameters. These solutions may serve as potential candidates for astrophysical black holes.
- Published
- 2024
34. Ferroelectric tuning of the valley polarized metal-semiconductor transition in Mn2P2S3Se3/Sc2CO2 van der Waals heterostructures and application to nonlinear Hall effect devices
- Author
-
Sun, Hanbo, Ren, Yewei, Wu, Chao, Dong, Pengqiang, Zhang, Weixi, Wu, Yin-Zhong, and Li, Ping
- Subjects
Condensed Matter - Materials Science ,Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
In order to promote the development of the next generation of nano-spintronic devices, it is of great significance to tune the valley degree of freedom in two-dimensional (2D) materials. Here, we propose a mechanism for manipulating the valley and nonlinear Hall effect by the 2D ferroelectric substrate. The monolayer Mn2P2S3Se3 is a robust antiferromagnetic valley polarized semiconductor. Importantly, the valley polarized metal-semiconductor phase transition of Mn2P2S3Se3 can be effectively tuned by switching the ferroelectric polarization of Sc2CO2. We reveal the microscopic mechanism of the phase transition, which originates from the charge transfer and band alignment. Additionally, we find that switching the polarization direction of Sc2CO2 flexibly manipulates the Berry curvature dipole. Based on this discovery, we propose detecting the valley polarized metal-semiconductor transition with nonlinear Hall effect devices. These findings not only offer a scheme to tune the valley degree of freedom, but also provide a promising platform to design nonlinear Hall effect devices., Comment: 8 pages, 6 figures
- Published
- 2024
35. Explicit formulas for the Hattori-Stong theorem and applications
- Author
-
Li, Ping and Lin, Wangyang
- Subjects
Mathematics - Differential Geometry ,Mathematics - Combinatorics ,Mathematics - Geometric Topology ,57R20, 05E05, 19L64, 32Q60 - Abstract
We employ combinatorial techniques to present an explicit formula for the coefficients in front of Chern classes involved in the Hattori-Stong integrability conditions. We also give an evenness condition for the signature of stably almost-complex manifolds in terms of Chern numbers. As an application, it can be shown that the signature of a $2n$-dimensional stably almost-complex manifold whose only possibly nonzero Chern numbers are $c_n$ and $c_ic_{n-i}$ is even, which in particular rules out the existence of such a structure on rational projective planes. Some other related results and remarks are also discussed in this article., Comment: 15 pages
- Published
- 2024
36. Large Margin Prototypical Network for Few-shot Relation Classification with Fine-grained Features
- Author
-
Fan, Miao, Bai, Yeqi, Sun, Mingming, and Li, Ping
- Subjects
Computer Science - Computation and Language - Abstract
Relation classification (RC) plays a pivotal role in both natural language understanding and knowledge graph completion. It is generally formulated as a task to recognize the relationship between two entities of interest appearing in a free-text sentence. Conventional approaches to RC, whether based on feature engineering or deep learning, can obtain promising performance on categorizing common types of relation, leaving a large proportion of long-tail relations unrecognizable due to insufficient labeled instances for training. In this paper, we consider few-shot learning to be of great practical significance to RC and thus improve a modern framework of metric learning for few-shot RC. Specifically, we adopt the large-margin ProtoNet with fine-grained features, expecting that they can generalize well on long-tail relations. Extensive experiments were conducted on FewRel, a large-scale supervised few-shot RC dataset, to evaluate our framework: LM-ProtoNet (FGF). The results demonstrate that it can achieve substantial improvements over many baseline approaches., Comment: Accepted by CIKM'19
- Published
- 2024
37. MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search
- Author
-
Fan, Miao, Guo, Jiacheng, Zhu, Shuai, Miao, Shuo, Sun, Mingming, and Li, Ping
- Subjects
Computer Science - Information Retrieval - Abstract
Baidu runs the largest commercial web search engine in China, serving hundreds of millions of online users every day in response to a great variety of queries. In order to build a high-efficiency sponsored search engine, we used to adopt a three-layer funnel-shaped structure to screen and sort hundreds of ads from billions of ad candidates subject to the requirement of low response latency and the restraints of computing resources. Given a user query, the top matching layer is responsible for providing semantically relevant ad candidates to the next layer, while the ranking layer at the bottom is more concerned with business indicators (e.g., CPM, ROI, etc.) of those ads. The clear separation between the matching and ranking objectives results in a lower commercial return. The Mobius project has been established to address this serious issue. It is our first attempt to train the matching layer to consider CPM as an additional optimization objective besides the query-ad relevance, via directly predicting CTR (click-through rate) from billions of query-ad pairs. Specifically, this paper will elaborate on how we adopt active learning to overcome the insufficiency of click history at the matching layer when training our neural click networks offline, and how we use the SOTA ANN search technique for retrieving ads more efficiently (here "ANN" stands for approximate nearest neighbor search). We contribute the solutions to Mobius-V1 as the first version of our next generation query-ad matching system., Comment: Accepted by KDD'19
- Published
- 2024
38. A New IW And-Type Star: Karachurin 12 with Tilted Disks and Diverse cycles
- Author
-
Sun, Qi-Bin, Qian, Sheng-Bang, Zhu, Li-Ying, Li, Qin-Mei, Li, Fu-Xing, Li, Min-Yu, and Li, Ping
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
The IW And-type phenomenon in cataclysmic variables presents a significant challenge to the accretion disk instability model. Using photometric data from the All-Sky Automated Survey for Supernovae, the Zwicky Transient Facility, and the Transiting Exoplanet Survey Satellite, we identify Karachurin 12 as a new non-eclipsing IW And-type object with a cycle period of 35.69(3) days. We also report for the first time that Karachurin 12 is a negative superhump (NSH) system featuring a precessing tilted disk, with precession, orbital, and NSH periods of 4.9588(2) days, 0.3168895(13) days, and 0.2979861(8) days, respectively. Our analysis, using dips as an index and NSHs as a probe, reveals diverse cycle patterns in Karachurin 12, with NSH amplitude varying throughout the cycle. These findings offer new insights for studying tilted disks and the IW And-type phenomenon. The mass-transfer burst model has difficulty explaining the observed variations in NSH amplitude, especially given the uncertainty surrounding the origin of the mass-transfer burst. Meanwhile, the tilted thermally unstable disk model indicates a possible connection to the IW And-type phenomenon, but it also struggles to account for the detailed variations in Karachurin 12. Therefore, a wider range of factors must be considered to fully understand the complex changes in Karachurin 12., Comment: 20 pages, 9 figures and 3 tables, accepted for publication in The Astrophysical Journal
- Published
- 2024
39. Modeling the development of lexicon with a growing self-organizing map
- Author
-
Farkas, Igor and Li, Ping
- Subjects
Computer Science: Language ,Linguistics: Semantics ,Psychology: Psycholinguistics ,Language ,Semantics ,Psycholinguistics - Abstract
We present a self-organizing neural network model that can acquire an incremental lexicon. The model allows the acquisition of new words without disrupting learned structure. The model consists of three major components. First, the word co-occurrence detector computes word transition probabilities and represents word meanings in terms of context vectors. Second, word representations are projected to a lower, constant dimension. Third, the growing lexical map (GLM) self-organizes on the dimension-reduced word representations. The model is initialized with a subset of units in GLM and a subset of the lexicon, which enables it to capture the regularities of the input space and decrease chances of catastrophic interference. During growth, new nodes are inserted in order to reduce the map quantization error, and the insertion occurs only to yet unoccupied grid positions, thus preserving the 2D map topology. We have tested GLM on a portion of parental speech extracted from the CHILDES database, with an initial 200 words scattered among 800 nodes. The model demonstrates the ability to highly preserve learned lexical structure when 100 new words are gradually added. Implications of the model are discussed with respect to language acquisition by children.
- Published
- 2002
40. A self-organizing neural network model of the acquisition of word meaning
- Author
-
Farkas, Igor and Li, Ping
- Subjects
Computer Science: Language ,Computer Science: Neural Nets ,Linguistics: Semantics ,Psychology: Psycholinguistics ,Language ,Neural Nets ,Semantics ,Psycholinguistics - Abstract
In this paper we present a self-organizing connectionist model of the acquisition of word meaning. Our model consists of two neural networks and builds on the basic concepts of Hebbian learning and self-organization. One network learns to approximate word transition probabilities, which are used for lexical representation, and the other network, a self-organizing map, is trained on these representations, projecting them onto a 2D space. The model relies on lexical co-occurrence information to represent word meanings in the lexicon. The results show that our model is able to acquire semantic representations from both artificial data and real corpus of language use. In addition, the model demonstrates the ability to develop rather accurate word representations even with a sparse training set.
- Published
- 2001
41. Exploring Factors Influencing Students' Willingness to Use Translation Technology
- Author
-
Yu-xi Wang, Li-ping Chen, and Jia-yin Han
- Abstract
The study of students' willingness to use translation technology can motivate students to use the translation technology and improve their translation efficiency. This paper reports on a survey which collected 716 valid questionnaires from students in the Master-level Program in Translation and Interpreting. Structural equation modeling (SEM) was used to assess the influence of six potential variables on Master of Translation and interpreting (MTI) usage intention. The results show that perceived usefulness (PU), perceived ease of use (PEOU), subjective norm (SN) and translation technology self-efficacy (TTSE) have a significant positive influence on usage intention (UI), and perceived ease of use and perceived usefulness have significant mediating effects. Additionally, self-efficacy has a marginal or no influence at all on perceived usefulness, and flow experience (FE) has no effect on usage intention. Suggestions for future studies in the area of translation technology teaching are proposed based on the results and limitations of this study. These findings contribute to the promotion of the use of translation technology by master of Translation and interpreting at universities, which is expected to provide a beneficial research foundation for the teaching of translation technology.
- Published
- 2024
42. Highly Efficient and Stable Perovskite Solar Cells via MultiFunctional Curcumin Modified Buried Interface
- Author
-
Wu, Xianhu, Bi, Jieyu, Cu, Guanglei, Liu, Nian, Xia, Gaojie, Sun, Jilong, Jiang, Jiaxin, Lu, Ning, Li, Ping, Zhao, Chunyi, Zuo, Zewen, and Gu, Min
- Subjects
Condensed Matter - Materials Science ,Physics - Applied Physics - Abstract
The buried interface between the electron transport layer and the perovskite layer suffers from severe interface defects and imperfect energy level alignment. To address this issue, this study employs a multifunctional organic molecule, curcumin, to modify the interface between SnO2 and the perovskite layer. The functional groups on curcumin effectively passivate the defects on both sides of the interface, reducing -OH and oxygen vacancy defects on the SnO2 surface and passivating uncoordinated Pb2+ in the perovskite layer. This results in a more compatible energy level alignment and lower defect density at the interface, enhancing carrier transport across it. Consequently, the devices based on curcumin achieve an impressive champion power conversion efficiency (PCE) of 24.46%, compared to 22.03% for control devices. This work demonstrates a simple, green, hydrophobic, and efficient molecular modification method for the buried interface, laying the foundation for the development of high-performance and stable perovskite solar cells.
- Published
- 2024
43. EMP: Enhance Memory in Data Pruning
- Author
-
Xiao, Jinying, Li, Ping, Nie, Jie, and Tang, Zhe
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Recently, large language and vision models have shown strong performance, but due to high pre-training and fine-tuning costs, research has shifted towards faster training via dataset pruning. Previous methods used sample loss as an evaluation criterion, aiming to select the most "difficult" samples for training. However, when the pruning rate increases, the number of times each sample is trained becomes more evenly distributed, which causes many critical or general samples to not be effectively fitted. We refer to this as Low-Frequency Learning (LFL). In other words, LFL prevents the model from remembering most samples. In our work, we decompose the scoring function of LFL, provide a theoretical explanation for the inefficiency of LFL, and propose adding a memory term to the scoring function to enhance the model's memory capability, along with an approximation of this memory term. Similarly, we explore memory in Self-Supervised Learning (SSL), marking the first discussion on SSL memory. Using contrastive learning, we derive the memory term both theoretically and experimentally. Finally, we propose Enhance Memory Pruning (EMP), which addresses the issue of insufficient memory under high pruning rates by enhancing the model's memory of data, thereby improving its performance. We evaluated the performance of EMP in tasks such as image classification, natural language understanding, and model pre-training. The results show that EMP can improve model performance under extreme pruning rates. For example, in the CIFAR100-ResNet50 pre-training task, with 70% pruning, EMP outperforms current methods by 2.2%.
- Published
- 2024
44. Timeline and Boundary Guided Diffusion Network for Video Shadow Detection
- Author
-
Zhou, Haipeng, Wang, Honqiu, Ye, Tian, Xing, Zhaohu, Ma, Jun, Li, Ping, Wang, Qiong, and Zhu, Lei
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Video Shadow Detection (VSD) aims to detect shadow masks across a frame sequence. Existing works suffer from inefficient temporal learning. Moreover, few works address the VSD problem by considering the characteristic (i.e., boundary) of shadows. Motivated by this, we propose a Timeline and Boundary Guided Diffusion (TBGDiff) network for VSD that jointly takes into account past-future temporal guidance and boundary information. In detail, we design a Dual Scale Aggregation (DSA) module for better temporal understanding by rethinking the affinity of the long-term and short-term frames for the clipped video. Next, we introduce Shadow Boundary Aware Attention (SBAA) to utilize the edge contexts for capturing the characteristics of shadows. Moreover, we are the first to introduce the Diffusion model for VSD, in which we explore a Space-Time Encoded Embedding (STEE) to inject the temporal guidance for Diffusion to conduct shadow detection. Benefiting from these designs, our model can capture not only the temporal information but also the shadow property. Extensive experiments show that our approach outperforms the state-of-the-art methods, verifying the effectiveness of our components. We release the code, weights, and results at \url{https://github.com/haipengzhou856/TBGDiff}., Comment: ACM MM2024
- Published
- 2024
45. A Comprehensive Analysis of Text-Book-Version Afterglow Light curves of Gamma-Ray Bursts and Implication for Universal Radiation Physics of Baryonic Jets
- Author
-
Zhang, Lu-Lu, Zhong, Shu-Qing, Xin, Li-Ping, and Liang, En-Wei
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The standard external shock model in the thin-shell scenario predicts an onset bump in the early optical afterglow light curves of gamma-ray bursts (GRBs). We collect such a textbook-version light curve sample of $30$ GRBs, and derive the jet properties from our joint fit to their X-ray and optical afterglow light curves. It is found that the distributions of the isotropic initial Lorentz factors ($\Gamma_0$), the deceleration radii ($R_{\rm dec}$), and the magnetic field strength ($B_0$) are log-normal, but the distributions of the isotropic kinetic energy ($E_{\rm k, iso}$), medium density ($n_{0}$), and the magnetization parameter ($\sigma_{B}\equiv\epsilon_B/\epsilon_e$) are tentatively bimodal. A tight $R_{\rm dec}\mbox{-}B_{0}\mbox{-}\sigma_{B}$ relation is found. It implies a universal $\epsilon_e E_{\rm k,iso}$ among bursts, plausibly supporting the previous argument of a universal GRB radiation energy among GRBs. A jet break is required for modeling the light curves of $26$ GRBs. The distributions of the jet opening angles and the jet-corrected kinetic energies log-normally center at $\log \theta_{\rm j,c}/{\rm rad}=-1.51$ (standard deviation $\sigma=0.27$) and $\log (E_{\rm k, j,c}/{\rm erg})=51.78$ ($\sigma=0.54$), respectively. Those GRBs ($19$ GRBs) whose prompt gamma-ray emission is well estimated with broad energy-band observations satisfy the previously discovered $L_{\rm \gamma, p, iso}-E_{\rm p,z}-\Gamma_{0}$ relation, and their gamma-ray radiation efficiencies are log-normally distributed in the range from $0.04\%$ to $10\%$ with a central value of $0.42\%$. Such a low efficiency favors the baryonic fireball model, and the distribution of their baryon mass loading in the GRB ejecta log-normally centers at $\log (M_{\rm fb,c}/M_{\rm sun})=-5$ ($\sigma=0.75$)., Comment: 12 pages, 4 tables, 9 figures. Publication in the Astrophysical Journal (in press)
- Published
- 2024
46. A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models
- Author
-
Zhao, Pengxiang, Hu, Hanyu, Li, Ping, Zheng, Yi, Wang, Zhefeng, and Yuan, Xiaoming
- Subjects
Computer Science - Machine Learning ,Mathematics - Optimization and Control - Abstract
Pruning is a critical strategy for compressing trained large language models (LLMs), aiming at substantial memory conservation and computational acceleration without compromising performance. However, existing pruning methods often necessitate inefficient retraining for billion-scale LLMs or rely on heuristic methods such as the optimal brain surgeon framework, which degrade performance. In this paper, we introduce FISTAPruner, the first post-training pruner based on convex optimization models and algorithms. Specifically, we propose a convex optimization model incorporating $\ell_1$ norm to induce sparsity and utilize the FISTA solver for optimization. FISTAPruner incorporates an intra-layer cumulative error correction mechanism and supports parallel pruning. We comprehensively evaluate FISTAPruner on models such as OPT, LLaMA, LLaMA-2, and LLaMA-3 with 125M to 70B parameters under unstructured and 2:4 semi-structured sparsity, demonstrating superior performance over existing state-of-the-art methods across various language benchmarks.
- Published
- 2024
47. Unveiling the Multifaceted GRB 200613A: Prompt Emission Dynamics, Afterglow Evolution, and the Host Galaxy's Properties
- Author
-
Fu, Shao-Yu, Xu, Dong, Lei, Wei-Hua, Postigo, Antonio de Ugarte, Kann, D. Alexander, Thöne, Christina C., Fernández, José Feliciano Agüí, Shuang-Xi, Yi, Xie, Wei, Zou, Yuan-Chuan, Liu, Xing, Jiang, Shuai-Qing, Lu, Tian-Hua, An, Jie, Zhu, Zi-Pei, Zheng, Jie, Tang, Qing-Wen, Zhao, Peng-Wei, Xin, Li-Ping, and Wei, Jian-Yan
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
We present our optical observations and multi-wavelength analysis of GRB\,200613A, detected by the \texttt{Fermi} satellite. Time-resolved spectral analysis of the prompt $\gamma$-ray emission was conducted utilizing the Bayesian block method to determine statistically optimal time bins. Based on the Bayesian Information Criterion (BIC), the data generally favor the Band+Blackbody (BB for short) model. We speculate that the main Band component comes from the Blandford-Znajek mechanism, while the additional BB component comes from the neutrino annihilation process. The BB component becomes significant for a low-spin, high-accretion rate black hole central engine, as evidenced by our model comparison with the data. The afterglow light curve exhibits typical power-law decay, and its behavior can be explained by the collision between the ejecta and a constant-density interstellar medium (ISM). Model fitting yields the following parameters: $E_{K,iso} = (2.04^{+11.8}_{-1.50})\times 10^{53}$ erg, $\Gamma_0=354^{+578}_{-217}$, $p=2.09^{+0.02}_{-0.03}$, $n_{18}=(2.04^{+9.71}_{-1.87})\times 10^{2}$ cm$^{-3}$, $\theta_j=24.0^{+6.50}_{-5.54}$ degrees, $\epsilon_e=(1.66^{+4.09}_{-1.39})\times 10^{-1}$ and $\epsilon_B=(7.76^{+48.5}_{-5.9})\times 10^{-6}$. In addition, we employed the public Python package \texttt{Prospector} to perform spectral energy distribution (SED) modeling of the host galaxy. The results suggest that the host galaxy is a massive galaxy ($\log(M_\ast / M_\odot)=11.75^{+0.10}_{-0.09}$) with a moderate star formation rate ($\mbox{SFR}=22.58^{+13.63}_{-7.22} M_{\odot}$/yr). This SFR is consistent with the SFR of $\sim 34.2 M_{\odot}$ yr$^{-1}$ derived from the [OII] emission line in the observed spectrum., Comment: 30 pages, 16 figures, accepted by ApJ
- Published
- 2024
48. Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality
- Author
-
Hoang, Duy C., Le, Hung T. Q., Chu, Rui, Li, Ping, Zhao, Weijie, Lao, Yingjie, and Doan, Khoa D.
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
With the widespread adoption of Large Language Models (LLMs), concerns about potential misuse have emerged. To this end, watermarking has been adapted to LLMs, enabling a simple and effective way to detect and monitor generated text. However, while the existing methods can differentiate between watermarked and unwatermarked text with high accuracy, they often face a trade-off between the quality of the generated text and the effectiveness of the watermarking process. In this work, we present a novel type of LLM watermark, Sparse Watermark, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text. The key strategy involves anchoring watermarked tokens to words that have specific Part-of-Speech (POS) tags. Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous LLM watermarking methods in quality across various tasks.
- Published
- 2024
49. Quantum Beam Splitter as a Quantum Coherence Controller
- Author
-
Yang, Li-Ping and Chang, Yue
- Subjects
Quantum Physics ,Physics - Optics - Abstract
We propose a quantum beam splitter (QBS) with tunable reflection and transmission coefficients. More importantly, our device based on a Hermitian parity-time ($\mathcal{PT}$) symmetric system enables the generation and manipulation of asymmetric quantum coherence of the output photons. For the interference of two weak coherent-state inputs, our QBS can produce anti-bunched photons from one output port and bunched photons from the other, showcasing high parity asymmetry and strong coherence control capabilities. Beyond the Hong-Ou-Mandel effect, perfect photon blockade with vanishing $g^{(2)}(0)$ is achievable in two-photon interference. These striking effects of the QBS fundamentally arise from the parity-symmetry-breaking interaction and the quantum interference between the photon scattering channels. Our results could inspire novel applications and the development of innovative photonic devices for the manipulation of weak quantum light., Comment: The document consists of 6 figures and spans 22 pages, including detailed appendices
- Published
- 2024
50. The white-light superflares from cool stars in GWAC triggers
- Author
-
Li, Guang-Wei, Wang, Liang, Yuan, Hai-Long, Xin, Li-Ping, Wang, Jing, Wu, Chao, Li, Hua-Li, Haerken, Hasitieer, Wang, Wei-Hua, Cai, Hong-Bo, Han, Xu-Hui, Xu, Yang, Huang, Lei, Lu, Xiao-Meng, Bai, Jian-Ying, Wang, Xiang-Yu, Dai, Zi-Gao, Liang, En-Wei, and Wei, Jian-Yan
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
M-type stars are the ones that flare most frequently, but how large their maximum flare energy can be is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg. The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temperature ($T_{\rm eff}$), but both $\triangle G$ and equivalent duration $\log_{10}(ED)$ seem to be independent of $T_{\rm eff}$. Combining periods detected from light curves of TESS and K2, spectra from LAMOST, SDSS and the 2.16 m Telescope, and the Gaia DR3 data, we found that these GWAC flare stars are young. For the stars that have spectra, we found that these stars are in or very near to the saturation region, and $\log_{10}(L_{\rm H\alpha}/L_{\rm bol})$ is lower for M7-L1 stars than for M2-M6 stars. We also studied the relation between GWAC flare bolometric energy $E_{\rm bol}$ and stellar hemispherical area $S$, and found that $\log_{10}E_{\rm bol}$ (in erg) increases with increasing $S$ (in cm$^2$), and the maximum flare energy $\log_{10}E_{\rm bol, max} \geqslant \log_{10}S + 14.25$. For M7-L1 stars, there seem to be other factors limiting their maximum flare energies in addition to stellar hemispherical area., Comment: 18 pages, 11 figures, 4 tables
- Published
- 2024