55,271 results for "WANG, JIE"
Search Results
2. A refined lower bound theorem for $d$-polytopes with at most $2d$ vertices
- Author
- Pineda-Villavicencio, Guillermo, Wang, Jie, and Yost, David
- Subjects
- Mathematics - Combinatorics
- Abstract
In 1967, Grünbaum conjectured that the function $$ \phi_k(d+s,d):=\binom{d+1}{k+1}+\binom{d}{k+1}-\binom{d+1-s}{k+1},\; \text{for $2\le s\le d$} $$ provides the minimum number of $k$-faces for a $d$-dimensional polytope (abbreviated as a $d$-polytope) with $d+s$ vertices. In 2021, Xue proved this conjecture for each $k\in[1\ldots d-2]$ and characterised the unique minimisers, each having $d+2$ facets. In this paper, we refine Xue's theorem by considering $d$-polytopes with $d+s$ vertices ($2\le s\le d$) and at least $d+3$ facets. If $s=2$, then there is precisely one minimiser for many values of $k$. For other values of $s$, the number of $k$-faces is at least $\phi_k(d+s,d)+\binom{d-1}{k}-\binom{d+1-s}{k}$, which is met by precisely two polytopes in many cases, and up to five polytopes for certain values of $s$ and $k$. We also characterise the minimising polytopes., Comment: 31 pages, 2 figures
- Published
- 2025
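As a quick aid for parsing the bound quoted in the abstract above, here is a minimal sketch of Grünbaum's function in Python (the function name `phi` and the spot-check values are ours, purely illustrative, not from the paper):

```python
from math import comb

def phi(k: int, d: int, s: int) -> int:
    """Grünbaum's conjectured minimum number of k-faces of a
    d-polytope with d + s vertices, for 2 <= s <= d."""
    return comb(d + 1, k + 1) + comb(d, k + 1) - comb(d + 1 - s, k + 1)

# Spot check: d = 4, s = 2, k = 1 gives C(5,2) + C(4,2) - C(3,2) = 13.
print(phi(1, 4, 2))  # → 13
```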
3. Large Language Model driven Policy Exploration for Recommender Systems
- Author
- Wang, Jie, Karatzoglou, Alexandros, Arapakis, Ioannis, and Jose, Joemon M.
- Subjects
- Computer Science - Information Retrieval
- Abstract
Recent advancements in Recommender Systems (RS) have incorporated Reinforcement Learning (RL), framing the recommendation as a Markov Decision Process (MDP). However, offline RL policies trained on static user data are vulnerable to distribution shift when deployed in dynamic online environments. Additionally, excessive focus on exploiting short-term relevant items can hinder exploration, leading to suboptimal recommendations and negatively impacting long-term user gains. Online RL-based RS also face challenges in production deployment due to the risks of exposing users to untrained or unstable policies. Large Language Models (LLMs) offer a promising solution to mimic user objectives and preferences for pre-training policies offline to enhance the initial recommendations in online settings. Effectively managing distribution shift and balancing exploration are crucial for improving RL-based RS, especially when leveraging LLM-based pre-training. To address these challenges, we propose an Interaction-Augmented Learned Policy (iALP) that utilizes user preferences distilled from an LLM. Our approach involves prompting the LLM with user states to extract item preferences, learning rewards based on feedback, and updating the RL policy using an actor-critic framework. Furthermore, to deploy iALP in an online scenario, we introduce an adaptive variant, A-iALP, that implements a simple fine-tuning strategy (A-iALP$_{ft}$) and an adaptive approach (A-iALP$_{ap}$) designed to mitigate issues with compromised policies and limited exploration. Experiments across three simulated environments demonstrate that A-iALP introduces substantial performance improvements.
- Published
- 2025
4. Geometrical Responses of Generalized Landau Levels: Structure Factor and the Quantized Hall Viscosity
- Author
- Paiva, Carolina, Wang, Jie, Ozawa, Tomoki, and Mera, Bruno
- Subjects
- Condensed Matter - Mesoscale and Nanoscale Physics; Condensed Matter - Quantum Gases; Condensed Matter - Strongly Correlated Electrons; High Energy Physics - Theory; Mathematical Physics
- Abstract
We present a new geometric characterization of generalized Landau levels (GLLs). The GLLs are a generalization of Landau levels to non-uniform Berry curvature, and are mathematically defined in terms of a holomorphic curve -- an ideal Kähler band -- and its associated unitary Frenet-Serret moving frame. Here, we find that GLLs are harmonic maps from the Brillouin zone to the complex projective space and they are critical points of the Dirichlet energy functional, as well as the static structure factor up to fourth order. We also find that filled GLLs exhibit quantized Hall viscosity, similar to the ordinary Landau levels. These results establish GLLs as a versatile generalization of Landau levels., Comment: 5 pages
- Published
- 2025
5. Exact Parent Hamiltonians for All Landau Level States in a Half-flux Lattice
- Author
- Shen, Xin, Ji, Guangyue, Zhang, Jinjie, Palomino, David E., Mera, Bruno, Ozawa, Tomoki, and Wang, Jie
- Subjects
- Condensed Matter - Mesoscale and Nanoscale Physics; Condensed Matter - Quantum Gases; Mathematical Physics; Physics - Atomic Physics; Quantum Physics
- Abstract
Realizing topological flat bands with tailored single-particle Hilbert spaces is a critical step toward exploring many-body phases, such as those featuring anyonic excitations. One prominent example is the Kapit-Mueller model, a variant of the Harper-Hofstadter model that stabilizes lattice analogs of the lowest Landau level states. The Kapit-Mueller model is constructed based on the Poisson summation rule, an exact lattice sum rule for coherent states. In this work, we consider higher Landau-level generalizations of the Poisson summation rule, from which we derive families of parent Hamiltonians on a half-flux lattice which have exact flat bands whose flat-band wavefunctions are lattice versions of higher Landau level states. Focusing on generic Bravais lattices with only translation and inversion symmetries, we discuss how these symmetries enforce gaplessness and singular points for the odd Landau level series, and how to achieve fully gapped parent Hamiltonians by mixing the even and odd series. Our model points to a large class of tight-binding models with suitable energetic and quantum geometries that are potentially useful for realizing non-Abelian fractionalized states when interactions are included. The model exhibits fast-decaying hopping amplitudes, making it potentially realizable with neutral atoms in optical lattices.
- Published
- 2025
6. Positivstellensätze for polynomial matrices with universal quantifiers
- Author
- Guo, Feng and Wang, Jie
- Subjects
- Mathematics - Optimization and Control; 90C23, 15A54, 13J30, 14P10, 11E25, 12D15
- Abstract
This paper studies Positivstellensätze for a polynomial matrix subject to polynomial matrix inequality constraints with universal quantifiers. We first present a Scherer-Hol-type Positivstellensatz under the Archimedean condition. When the objective is a scalar polynomial, we further provide a sparse Scherer-Hol-type Positivstellensatz in the presence of correlative sparsity. Next, without assuming the Archimedean condition, we derive Putinar-Vasilescu-type, Pólya-type, and Lasserre-Netzer-type Positivstellensätze under the same setting. These results can be viewed as common generalizations of corresponding Positivstellensätze in the cases of polynomials, polynomials with universal quantifiers, and polynomial matrices. For the proofs, techniques from *-algebra, real algebraic geometry, operator theory, and convex optimization are employed. Applications of the established Positivstellensätze to robust polynomial matrix optimization are also discussed., Comment: 31 pages, 2 tables
- Published
- 2025
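For context, the scalar Putinar Positivstellensatz that such matrix-valued results generalize can be stated as follows (a standard background fact, not a statement taken from the paper):

```latex
% If the quadratic module generated by g_1,\dots,g_m is Archimedean and
% f > 0 on K = \{x \in \mathbb{R}^n : g_1(x) \ge 0,\dots,g_m(x) \ge 0\}, then
f \;=\; \sigma_0 + \sum_{i=1}^{m} \sigma_i \, g_i,
\qquad \sigma_0,\dots,\sigma_m \in \Sigma[x],
```

where $\Sigma[x]$ denotes the cone of sums of squares of polynomials. The matrix versions replace $f$ and the $g_i$ by symmetric polynomial matrices and the $\sigma_i$ by sum-of-squares matrix polynomials.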
7. Interacting topological magnons in the Kitaev-Heisenberg honeycomb ferromagnet with the Dzyaloshinskii-Moriya interaction
- Author
- Wang, Jie, Li, Jin Wen, Chen, Pei, and Tang, Bing
- Subjects
- Condensed Matter - Strongly Correlated Electrons; Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
The study of the Heisenberg-Kitaev honeycomb ferromagnets has recently drawn attention because of their rich topological properties. Topological phase transitions may arise when there exist two or more distinct topological phases, and they are often revealed by a gap-closing phenomenon. In this work, we investigate the magnonic properties of honeycomb ferromagnets with Kitaev and Dzyaloshinskii-Moriya (DMI) interactions, in the presence of Heisenberg exchange and magnetocrystalline anisotropy, under an applied magnetic field. We employ the Self-Consistent Renormalization (SCR) spin wave theory to investigate the effects of magnon-magnon interactions (MMIs) and thermal fluctuations on the properties of magnons. Our findings demonstrate that the magnon system undergoes topological phase transitions driven by temperature and magnetic fields, which are attributed to MMIs. Specifically, as the temperature rises, the magnon band gap at the Dirac points closes and reopens at the critical temperature Tc, which is below the Curie temperature. By showing that the Chern numbers of the magnonic bands are distinct above and below Tc, we confirm that the gap-closing phenomenon is indeed a signature of the topological phase transitions. Furthermore, our analysis indicates that the thermal Hall conductivity in the magnonic system exhibits a sign reversal at Tc, which can serve as an experimental probe of its topological nature.
- Published
- 2025
8. Exploring Temporal Event Cues for Dense Video Captioning in Cyclic Co-learning
- Author
- Xie, Zhuyang, Yang, Yan, Yu, Yankai, Wang, Jie, Jiang, Yongquan, and Wu, Xiao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Dense video captioning aims to detect and describe all events in untrimmed videos. This paper presents a dense video captioning network called Multi-Concept Cyclic Learning (MCCL), which aims to: (1) detect multiple concepts at the frame level, using these concepts to enhance video features and provide temporal event cues; and (2) design cyclic co-learning between the generator and the localizer within the captioning network to promote semantic perception and event localization. Specifically, we perform weakly supervised concept detection for each frame, and the detected concept embeddings are integrated into the video features to provide event cues. Additionally, video-level concept contrastive learning is introduced to obtain more discriminative concept embeddings. In the captioning network, we establish a cyclic co-learning strategy where the generator guides the localizer for event localization through semantic matching, while the localizer enhances the generator's event semantic perception through location matching, making semantic perception and event localization mutually beneficial. MCCL achieves state-of-the-art performance on the ActivityNet Captions and YouCook2 datasets. Extensive experiments demonstrate its effectiveness and interpretability., Comment: Accepted at AAAI 2025
- Published
- 2024
9. VTD: Visual and Tactile Database for Driver State and Behavior Perception
- Author
- Wang, Jie, Cai, Mobing, Zhu, Zhongpan, Ding, Hongjun, Yi, Jiwei, and Du, Aimin
- Subjects
- Computer Science - Robotics; Computer Science - Artificial Intelligence
- Abstract
In the domain of autonomous vehicles, the human-vehicle co-pilot system has garnered significant research attention. To address the subjective uncertainties in driver state and interaction behaviors, which are pivotal to the safety of Human-in-the-loop co-driving systems, we introduce a novel visual-tactile perception method. Utilizing a driving simulation platform, a comprehensive dataset has been developed that encompasses multi-modal data under fatigue and distraction conditions. The experimental setup integrates driving simulation with signal acquisition, yielding 600 minutes of fatigue detection data from 15 subjects and 102 takeover experiments with 17 drivers. The dataset, synchronized across modalities, serves as a robust resource for advancing cross-modal driver behavior perception algorithms.
- Published
- 2024
10. AntLM: Bridging Causal and Masked Language Models
- Author
- Yu, Xinru, Guo, Bin, Luo, Shiwei, Wang, Jie, Ji, Tao, and Wu, Yuanbin
- Subjects
- Computer Science - Computation and Language
- Abstract
Causal Language Modeling (CLM) and Masked Language Modeling (MLM) are two mainstream learning paradigms based on Transformer networks, specifically the Decoder-only and Encoder-only architectures. The strengths of each paradigm in downstream tasks have shown a mix of advantages and disadvantages. In the previous BabyLM Challenge (2023), although the MLM paradigm achieved the best average performance, the CLM paradigm demonstrated significantly faster convergence rates. For the BabyLM Challenge 2024, we propose a novel language modeling paradigm named $\textbf{AntLM}$, which integrates both CLM and MLM to leverage the advantages of these two classic paradigms. We chose the strict-small track and conducted experiments on two foundation models: BabyLlama, representing CLM, and LTG-BERT, representing MLM. During the training process for specific foundation models, we alternate between applying CLM or MLM training objectives and causal or bidirectional attention masks. Experimental results show that combining the two pretraining objectives leverages their strengths, enhancing overall training performance. With the same number of training epochs, $AntLM_{BabyLlama}$ improves Macro-average by 1%, and $AntLM_{LTG-BERT}$ achieves a 2.2% increase over the baselines., Comment: CoNLL Shared Task BabyLM Challenge
- Published
- 2024
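The alternation described in the AntLM abstract above can be sketched in a toy form (this is our illustration of the idea, not the authors' training code; `make_targets`, `mask_rate`, and `mask_id` are hypothetical names):

```python
import random

# Toy sketch: alternate between a causal objective (predict the next token)
# and a masked objective (recover masked tokens) from one epoch to the next.
def make_targets(tokens, epoch, mask_rate=0.15, mask_id=-1):
    if epoch % 2 == 0:                 # CLM phase: next-token prediction pairs
        return tokens[:-1], tokens[1:]
    corrupted, targets = [], []        # MLM phase: mask-and-recover
    rng = random.Random(epoch)         # deterministic masking per epoch
    for tok in tokens:
        if rng.random() < mask_rate:
            corrupted.append(mask_id)
            targets.append(tok)        # scored position
        else:
            corrupted.append(tok)
            targets.append(None)       # not scored
    return corrupted, targets

inputs, targets = make_targets([5, 9, 2, 7, 4], epoch=0)
print(inputs, targets)  # CLM epoch: ([5, 9, 2, 7], [9, 2, 7, 4])
```

In a real setup the attention mask would switch alongside the objective (causal for CLM, bidirectional for MLM), as the abstract notes.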
11. SentiXRL: An advanced large language Model Framework for Multilingual Fine-Grained Emotion Classification in Complex Text Environment
- Author
- Wang, Jie, Wang, Yichen, Zhang, Zhilin, Zeng, Jianhao, Wang, Kaidi, and Chen, Zhiyang
- Subjects
- Computer Science - Computation and Language
- Abstract
With strong expressive capabilities in Large Language Models (LLMs), generative models effectively capture sentiment structures and deep semantics; however, challenges remain in fine-grained sentiment classification across multilingual and complex contexts. To address this, we propose the Sentiment Cross-Lingual Recognition and Logic Framework (SentiXRL), which incorporates two modules: an emotion retrieval enhancement module, which improves sentiment classification accuracy in complex contexts through historical dialogue and logical reasoning, and a self-circulating analysis negotiation mechanism (SANM), which facilitates autonomous decision-making within a single model for classification tasks. We have validated SentiXRL's superiority on multiple standard datasets, outperforming existing models on CPED and CH-SIMS, and achieving overall better performance on MELD, EmoryNLP, and IEMOCAP. Notably, we unified labels across several fine-grained sentiment annotation datasets and conducted category confusion experiments, revealing the challenges and impacts of class imbalance in standard datasets.
- Published
- 2024
12. Perturbation Ontology based Graph Attention Networks
- Author
- Wang, Yichen, Wang, Jie, Wang, Fulin, Li, Xiang, Yin, Hao, and Raj, Bhiksha
- Subjects
- Computer Science - Machine Learning
- Abstract
In recent years, graph representation learning has undergone a paradigm shift, driven by the emergence and proliferation of graph neural networks (GNNs) and their heterogeneous counterparts. Heterogeneous GNNs have shown remarkable success in extracting low-dimensional embeddings from complex graphs that encompass diverse entity types and relationships. While meta-path-based techniques have long been recognized for their ability to capture semantic affinities among nodes, their dependence on manual specification poses a significant limitation. In contrast, matrix-focused methods accelerate processing by utilizing structural cues but often overlook contextual richness. In this paper, we challenge the current paradigm by introducing ontology as a fundamental semantic primitive within complex graphs. Our goal is to integrate the strengths of both matrix-centric and meta-path-based approaches into a unified framework. We propose Perturbation Ontology-based Graph Attention Networks (POGAT), a novel methodology that combines ontology subgraphs with an advanced self-supervised learning paradigm to achieve a deep contextual understanding. The core innovation of POGAT lies in our enhanced homogeneous perturbing scheme designed to generate rigorous negative samples, encouraging the model to explore minimal contextual features more thoroughly. Through extensive empirical evaluations, we demonstrate that POGAT significantly outperforms state-of-the-art baselines, achieving a groundbreaking improvement of up to 10.78\% in F1-score for link prediction and 12.01\% in Micro-F1 for node classification.
- Published
- 2024
13. Sparse Polynomial Matrix Optimization
- Author
- Miller, Jared, Wang, Jie, and Guo, Feng
- Subjects
- Mathematics - Optimization and Control; 90C23, 90C17, 90C22, 90C26
- Abstract
A polynomial matrix inequality is a statement that a symmetric polynomial matrix is positive semidefinite over a given constraint set. Polynomial matrix optimization concerns minimizing the smallest eigenvalue of a symmetric polynomial matrix subject to a tuple of polynomial matrix inequalities. This work explores the use of sparsity methods in reducing the complexity of sum-of-squares based methods in verifying polynomial matrix inequalities or solving polynomial matrix optimization. In the unconstrained setting, Newton polytopes can be employed to sparsify the monomial basis, resulting in smaller semidefinite programs. In the general setting, we show how to exploit different types of sparsity (term sparsity, correlative sparsity, matrix sparsity) encoded in polynomial matrices to derive sparse semidefinite programming relaxations for polynomial matrix optimization. For term sparsity, one intriguing phenomenon is that the related block structures do not necessarily converge to the one determined by sign symmetries, which is significantly distinguished from the scalar case. For correlative sparsity, unlike the scalar case, we provide a counterexample showing that asymptotic convergence does not hold under the Archimedean condition and the running intersection property. By employing the theory of matrix-valued measures, we establish several results on detecting global optimality and retrieving optimal solutions under correlative sparsity. The effectiveness of sparsity methods on reducing computational complexity is demonstrated on various examples of polynomial matrix optimization., Comment: 30 pages, 8 tables, 3 figures
- Published
- 2024
14. HiFAST: An HI Data Calibration and Imaging Pipeline for FAST III. Standing Wave Removal
- Author
- Xu, Chen, Wang, Jie, Jing, Yingjie, Li, Fujia, Gan, Hengqian, Liu, Ziming, Liang, Tiantian, Chen, Qingze, Liu, Zerui, Hou, Zhipeng, Hu, Hao, Hu, Huijie, Huang, Shijie, Jiang, Peng, Zhang, Chuan-Peng, and Zhu, Yan
- Subjects
- Astrophysics - Instrumentation and Methods for Astrophysics; Astrophysics - Cosmology and Nongalactic Astrophysics; Astrophysics - Astrophysics of Galaxies
- Abstract
The standing waves present in radio telescope data arise primarily from reflections among the instruments, and they significantly impact the spectrum quality of the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Eliminating these standing waves for FAST is challenging given the constant changes in their phases and amplitudes. Over a ten-second period, the phases shift by 18$^{\circ}$ while the amplitudes fluctuate by 6 mK. Thus, we developed the fast Fourier transform (FFT) filter method to eliminate these standing waves for every individual spectrum. The FFT filter can decrease the root mean square (RMS) from 3.2 to 1.15 times the theoretical estimate. Compared to other methods such as sine fitting and running median, the FFT filter achieves a median RMS of approximately 1.2 times the theoretical expectation and the smallest scatter at 12%. Additionally, the FFT filter method avoids the flux loss issue encountered with some other methods. The FFT filter is also efficient in detecting harmonic radio frequency interference (RFI). In the FAST data, we identified three distinct types of harmonic RFI, each with amplitudes exceeding 100 mK and intrinsic frequency periods of 8.1, 0.5, and 0.37 MHz, respectively. The FFT filter, proven as the most effective method, is integrated into the HI data calibration and imaging pipeline for FAST (HiFAST, https://hifast.readthedocs.io)., Comment: 16 pages, 12 figures; accepted by RAA
- Published
- 2024
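The core of an FFT filter like the one described above can be sketched in a few lines (a minimal illustration, not the HiFAST implementation; the band, ripple period, and delay window below are made-up numbers):

```python
import numpy as np

# Sketch: remove a sinusoidal "standing wave" ripple from a spectrum by
# zeroing the corresponding Fourier modes, then inverse-transforming.
rng = np.random.default_rng(0)
n_chan = 4096
freq = np.linspace(1300.0, 1400.0, n_chan)        # MHz, hypothetical band
ripple = 0.005 * np.sin(2 * np.pi * freq / 1.0)   # 1 MHz-period standing wave
spectrum = ripple + 0.001 * rng.standard_normal(n_chan)   # ripple + noise (K)

modes = np.fft.rfft(spectrum)
delay = np.fft.rfftfreq(n_chan, d=freq[1] - freq[0])  # cycles per MHz
mask = (delay > 0.8) & (delay < 1.2)              # window around 1 cycle/MHz
modes[mask] = 0.0                                 # the "FFT filter" step
cleaned = np.fft.irfft(modes, n=n_chan)

print(spectrum.std(), cleaned.std())  # RMS drops once the ripple is removed
```

A production pipeline would additionally track the slowly drifting phase and amplitude of the ripple per spectrum, which is the hard part the abstract alludes to.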
15. The HI Mass Function of the Local Universe: Combining Measurements from HIPASS, ALFALFA and FASHI
- Author
- Ma, Wenlin, Guo, Hong, Xu, Haojie, Jones, Michael G., Zhang, Chuan-Peng, Zhu, Ming, Wang, Jing, Wang, Jie, and Jiang, Peng
- Subjects
- Astrophysics - Astrophysics of Galaxies
- Abstract
We present the first HI mass function (HIMF) measurement for the recent FAST All Sky HI (FASHI) survey and the most complete measurements of the HIMF in the local universe so far by combining the HI catalogues from the HI Parkes All Sky Survey (HIPASS), Arecibo Legacy Fast ALFA (ALFALFA) and FASHI surveys at redshift 0 < z < 0.05, covering 76% of the entire sky. We adopt the same methods to estimate distances, calculate sample completeness, and determine the HIMF for all three surveys. The best-fitting Schechter function for the total HIMF has a low-mass slope parameter alpha = -1.30, a knee mass log(Ms) = 9.86, and a normalization phi_s = 0.00658. This gives the cosmic HI abundance omega_HI = 0.000454. We find that a double Schechter function with the same slope alpha better describes our HIMF, and the two different knee masses are log(Ms1) = 9.96 and log(Ms2) = 9.65. We verify that the measured HIMF is marginally affected by the choice of distance estimates. The effect of cosmic variance is significantly suppressed by combining the three surveys and it provides a unique opportunity to obtain an unbiased estimate of the HIMF in the local universe., Comment: 10 pages, 7 figures, submitted to A&A
- Published
- 2024
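The Schechter form quoted in the abstract above can be written out directly; here is a small sketch using the quoted best-fitting parameters and the standard HIMF convention in dex^-1 Mpc^-3 (the function name and evaluation grid are ours):

```python
import numpy as np

# Schechter function phi(M) d(log M), with the best-fitting single-Schechter
# parameters from the abstract: alpha = -1.30, log Ms = 9.86, phi_s = 0.00658.
def schechter(log_m, alpha=-1.30, log_ms=9.86, phi_s=0.00658):
    x = 10.0 ** (np.asarray(log_m, dtype=float) - log_ms)   # M_HI / M*
    return np.log(10) * phi_s * x ** (alpha + 1) * np.exp(-x)

log_m = np.linspace(7.0, 11.0, 5)
print(schechter(log_m))  # number density per dex per Mpc^3
```

Below the knee the function falls off as a shallow power law; above it the exponential cutoff dominates, so the density at log M = 10.5 is orders of magnitude below that at log M = 8.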
16. Nonperfused Retinal Capillaries -- A New Method Developed on OCT and OCTA
- Author
- Gao, Min, Guo, Yukun, Hormel, Tristan T., Wang, Jie, White, Elizabeth, Park, Dong-Wouk, Hwang, Thomas S., Bailey, Steven T., and Jia, Yali
- Subjects
- Quantitative Biology - Quantitative Methods
- Abstract
We developed a new method to quantify nonperfused retinal capillaries (NPCs) by using co-registered optical coherence tomography (OCT) and OCT angiography (OCTA), and evaluated NPCs in eyes with age-related macular degeneration (AMD) and diabetic retinopathy (DR). Multiple consecutive 3x3-mm OCT/OCTA scans were obtained using a commercial device (Solix; Visionix/Optovue, Inc., California, USA). We averaged multiple registered OCT/OCTA scans to create high-definition volumes. The deep capillary plexus slab was defined and segmented. A novel deep learning denoising algorithm removed tissue background noise from capillaries in the en face OCT/OCTA. The algorithm segmented NPCs by identifying capillaries from OCT without corresponding flow signals in the OCTA. We then investigated the relationships between NPCs and known features in AMD and DR. The denoised en face OCT/OCTA revealed the structure and flow of the capillaries. The automatically segmented NPC achieved an accuracy of 88.2% compared to manual grading of DR. Compared to healthy controls, both the mean number and total length (mm) of NPCs were significantly increased in eyes with AMD and eyes with DR (P < 0.001). Compared to early and intermediate AMD, the number and total length of NPCs were significantly higher in advanced AMD (number: P < 0.001, P < 0.001; total length: P = 0.002, P = 0.003). Geographic atrophy, macular neovascularization, drusen volume, and extrafoveal avascular area (EAA) significantly correlated with increased NPCs (P < 0.05). In eyes with DR, NPCs correlated with the number of microaneurysms and EAA (P < 0.05). The presence of fluid did not significantly correlate with NPCs in AMD and DR. In conclusion, a deep learning-based algorithm can segment and quantify retinal capillaries that lack flow using colocalized OCT/OCTA. This novel biomarker may be useful in AMD and DR.
- Published
- 2024
17. Half a Million Binary Stars from the low resolution spectra of LAMOST
- Author
- Jing, Yingjie, Mao, Tian-Xiang, Wang, Jie, Liu, Chao, and Chen, Xiaodian
- Subjects
- Astrophysics - Solar and Stellar Astrophysics; Astrophysics - Earth and Planetary Astrophysics; Astrophysics - Astrophysics of Galaxies; Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
Binary stars are prevalent yet challenging to detect. We present a novel approach using convolutional neural networks (CNNs) to identify binary stars from low-resolution spectra obtained by the LAMOST survey. The CNN is trained on a dataset that distinguishes binaries from single main sequence stars based on their positions on the Hertzsprung-Russell diagram. Specifically, the training data labels stars with mass ratios between approximately 0.71 and 0.93 as intermediate mass ratio binaries, while excluding those beyond this range. The network achieves high accuracy with an area under the receiver operating characteristic curve of 0.949 on the test set. Its performance is further validated against known eclipsing binaries (97% detection rate) and binary stars identified by radial velocity variations (92% detection rate). Applying the trained CNN to a sample of one million main sequence stars from LAMOST DR10 and Gaia DR3 yields a catalog of 468,634 binary stars, which are mainly intermediate mass ratio binaries given the training data. This catalog includes 115 binary stars located beyond 10 kpc from the Sun and 128 cross-matched with known exoplanet hosts from the NASA Exoplanet Archive. This new catalog provides a valuable resource for future research on the properties, formation, and evolution of binary systems, particularly for statistically characterizing large populations., Comment: Accepted by ApJS; 11 pages
- Published
- 2024
18. Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
- Author
- Yang, Rui, Wang, Jie, Wu, Guoping, and Li, Bin
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Real-world offline datasets are often subject to data corruptions (such as noise or adversarial attacks) due to sensor failures or malicious attacks. Despite advances in robust offline reinforcement learning (RL), existing methods struggle to learn robust agents under high uncertainty caused by the diverse corrupted data (i.e., corrupted states, actions, rewards, and dynamics), leading to performance degradation in clean environments. To tackle this problem, we propose a novel robust variational Bayesian inference for offline RL (TRACER). It introduces Bayesian inference for the first time to capture the uncertainty via offline data for robustness against all types of data corruptions. Specifically, TRACER first models all corruptions as the uncertainty in the action-value function. Then, to capture such uncertainty, it uses all offline data as the observations to approximate the posterior distribution of the action-value function under a Bayesian inference framework. An appealing feature of TRACER is that it can distinguish corrupted data from clean data using an entropy-based uncertainty measure, since corrupted data often induces higher uncertainty and entropy. Based on the aforementioned measure, TRACER can regulate the loss associated with corrupted data to reduce its influence, thereby enhancing robustness and performance in clean environments. Experiments demonstrate that TRACER significantly outperforms several state-of-the-art approaches across both individual and simultaneous data corruptions., Comment: Accepted to NeurIPS 2024
- Published
- 2024
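The entropy-based uncertainty idea in the TRACER abstract above can be illustrated with a toy snippet (our hedged sketch of the general principle, not TRACER's actual estimator; `entropy` and `weight` are hypothetical names):

```python
import numpy as np

# Sketch: higher predictive entropy flags likely-corrupted samples, whose
# loss contribution is then downweighted.
def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

clean = np.array([[0.90, 0.05, 0.05]])    # confident prediction, low entropy
corrupt = np.array([[0.34, 0.33, 0.33]])  # near-uniform, high entropy

def weight(p):
    return 1.0 / (1.0 + entropy(p))       # downweight high-entropy data

print(weight(clean), weight(corrupt))     # clean data keeps more loss weight
```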
19. Target-Guided Adversarial Point Cloud Transformer Towards Recognition Against Real-world Corruptions
- Author
- Wang, Jie, Xu, Tingfa, Ding, Lihe, and Li, Jianan
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Achieving robust 3D perception in the face of corrupted data presents a challenging hurdle within 3D vision research. Contemporary transformer-based point cloud recognition models, albeit advanced, tend to overfit to specific patterns, consequently undermining their robustness against corruption. In this work, we introduce the Target-Guided Adversarial Point Cloud Transformer, termed APCT, a novel architecture designed to augment global structure capture through an adversarial feature erasing mechanism predicated on patterns discerned at each step during training. Specifically, APCT integrates an Adversarial Significance Identifier and a Target-guided Promptor. The Adversarial Significance Identifier is tasked with discerning token significance by integrating global contextual analysis, utilizing a structural salience index algorithm alongside an auxiliary supervisory mechanism. The Target-guided Promptor is responsible for accentuating the propensity for token discard within the self-attention mechanism, utilizing the value derived above, consequently directing the model's attention towards alternative segments in subsequent stages. By iteratively applying this strategy in multiple steps during training, the network progressively identifies and integrates an expanded array of object-associated patterns. Extensive experiments demonstrate that our method achieves state-of-the-art results on multiple corruption benchmarks., Comment: Accepted by NeurIPS 2024; code: https://github.com/Roywangj/APCT
- Published
- 2024
20. The irreducible components of the primal cohomology of the theta divisor of an abelian fivefold
- Author
- Izadi, Elham and Wang, Jie
- Published
- 2020
21. Phyllosticta paracitricarpa is synonymous with the EU quarantine fungus P. citricarpa based on phylogenomic analyses
- Author
- van Ingen-Buijs, Valerie A, van Westerhoven, Anouk C, Skiadas, Petros, Zuijdgeest, Xander CL, Haridas, Sajeet, Daum, Christopher, Duffy, Kecia, Guo, Jie, Hundley, Hope, LaButti, Kurt, Lipzen, Anna, Pangilinan, Jasmyn, Riley, Robert, Wang, Jie, Yan, Mi, Martin, Francis, Barry, Kerrie, Grigoriev, Igor V, Groenewald, Johannes Z, Crous, Pedro W, and Seidl, Michael F
- Subjects
- Biological Sciences; Genetics; Human Genome; Phylogeny; Ascomycota; Plant Diseases; Citrus; Genome, Fungal; Genetic Variation; Genomics; Citrus black spot; Comparative genomics; Fungal taxonomy; Phyllosticta citricarpa; Phyllosticta paracitricarpa; Quarantine plant pathogen; Microbiology; Plant Biology
- Abstract
Phyllosticta citricarpa is an important citrus pathogen and a quarantine organism in the European Union. Its recently described relative, P. paracitricarpa, is very closely related and not listed as a quarantine organism. P. paracitricarpa is very difficult to distinguish from P. citricarpa, since its morphological features overlap and the barcoding gene sequences that were originally used to delimit them as distinct species have a low number of species-specific polymorphisms that have subsequently been shown to overlap between the two clades. Therefore, we performed extensive genomic analyses to determine whether the genetic variation between P. citricarpa and P. paracitricarpa strains should be considered to represent infraspecific variation within P. citricarpa, or whether it is indicative of distinct species. Using a phylogenomic analysis with 3,000 single copy ortholog genes and whole-genome comparisons, we determined that the variation between P. citricarpa and P. paracitricarpa can be considered infraspecific variation within P. citricarpa. We also determined the level of variation in mitochondrial assemblies of several Phyllosticta species and concluded there are only minimal differences between the assemblies of P. citricarpa and P. paracitricarpa. Thus, using several orthogonal approaches, we here demonstrate that variation within the nuclear and mitochondrial genomes of other Phyllosticta species is larger than variation between genomes obtained from P. citricarpa and P. paracitricarpa strains. Thus, P. citricarpa and P. paracitricarpa should be considered conspecific.
- Published
- 2024
22. MILP-StuDio: MILP Instance Generation via Block Structure Decomposition
- Author
-
Liu, Haoyang, Wang, Jie, Zhang, Wanbo, Geng, Zijie, Kuang, Yufei, Li, Xijun, Li, Bin, Zhang, Yongdong, and Wu, Feng
- Subjects
Computer Science - Machine Learning ,Computer Science - Discrete Mathematics - Abstract
Mixed-integer linear programming (MILP) is one of the most popular mathematical formulations, with numerous applications. In practice, improving the performance of MILP solvers often requires a large amount of high-quality data, which can be challenging to collect. Researchers thus turn to generation techniques to produce additional MILP instances. However, existing approaches do not take into account the specific block structures -- which are closely related to the problem formulations -- in the constraint coefficient matrices (CCMs) of MILPs. Consequently, they are prone to generating computationally trivial or infeasible instances, because they disrupt the block structures and thus the problem formulations. To address this challenge, we propose a novel MILP generation framework, called Block Structure Decomposition (MILP-StuDio), which generates high-quality instances by preserving block structures. Specifically, MILP-StuDio begins by identifying the blocks in CCMs and decomposing the instances into block units, which serve as the building blocks of MILP instances. We then design three operators that construct new instances by removing, substituting, and appending block units in the original instances, enabling us to generate instances of flexible sizes. An appealing feature of MILP-StuDio is its strong ability to preserve the feasibility and computational hardness of the generated instances. Experiments on commonly used benchmarks demonstrate that instances generated by MILP-StuDio reduce the solving time of learning-based solvers by over 10%., Comment: Published in the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
- Published
- 2024
23. Efficient Bilinear Attention-based Fusion for Medical Visual Question Answering
- Author
-
Zhang, Zhilin, Wang, Jie, Zhu, Ruiqi, and Gong, Xiaoliang
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Medical Visual Question Answering (MedVQA) has gained increasing attention at the intersection of computer vision and natural language processing. Its capability to interpret radiological images and deliver precise answers to clinical inquiries positions MedVQA as a valuable tool for supporting diagnostic decision-making for physicians and alleviating the workload on radiologists. While recent approaches focus on using unified pre-trained large models for multi-modal fusion like cross-modal Transformers, research on more efficient fusion methods remains relatively scarce within this discipline. In this paper, we introduce a novel fusion model that integrates Orthogonality loss, Multi-head attention and Bilinear Attention Network (OMniBAN) to achieve high computational efficiency and strong performance without the need for pre-training. We conduct comprehensive experiments and clarify aspects of how to enhance bilinear attention fusion to achieve performance comparable to that of large models. Experimental results show that OMniBAN outperforms traditional models on key MedVQA benchmarks while maintaining a lower computational cost, which indicates its potential for efficient clinical application in radiology and pathology image question answering.
- Published
- 2024
24. Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System through Distributed Database and Multimodal Perception: Demonstrated in Crossroads
- Author
-
Zhu, Xinwen, Li, Zihao, Jiang, Yuxuan, Xu, Jiazhen, Wang, Jie, and Bai, Xuyang
- Subjects
Computer Science - Robotics ,Computer Science - Artificial Intelligence ,Electrical Engineering and Systems Science - Systems and Control - Abstract
The autonomous driving industry is rapidly advancing, with Vehicle-to-Vehicle (V2V) communication systems emerging as a key component of enhanced road safety and traffic efficiency. This paper introduces a novel Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System (VVCCS), designed to revolutionize macro-scope traffic planning and collision avoidance in autonomous driving. Implemented on the Quanser Car (QCar) hardware platform, our system integrates distributed databases into individual autonomous vehicles and an optional central server. We also developed a comprehensive multi-modal perception system with multi-object tracking and radar sensing. Through a demonstration in a physical crossroad environment, our system showcases its potential for application in congested and complex urban environments., Comment: ICICT 2024, 18 pages
- Published
- 2024
25. Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
- Author
-
Lv, Qitan, Wang, Jie, Chen, Hanzhu, Li, Bin, Zhang, Yongdong, and Wu, Feng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language models (RALMs) -- which enhance models with up-to-date knowledge -- have emerged as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when retrieving lengthy contexts. To address this challenge, we propose COFT, a novel \textbf{CO}arse-to-\textbf{F}ine highligh\textbf{T}ing method that focuses on key texts at different granularity levels, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: \textit{recaller}, \textit{scorer}, and \textit{selector}. First, \textit{recaller} applies a knowledge graph to extract potential key entities in a given context. Second, \textit{scorer} measures the importance of each entity by calculating its contextual weight. Finally, \textit{selector} selects high contextual weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, yielding an improvement of over $30\%$ in the F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering.
- Published
- 2024
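The scorer/selector stage of the COFT pipeline described above can be sketched compactly. The following is an illustrative stand-in only: the quantile-style dynamic threshold and the function name are our assumptions, not the paper's exact algorithm.

```python
def select_key_entities(entity_weights, keep_ratio=0.3):
    """Simplified stand-in for COFT's scorer/selector: rank entities by
    contextual weight and keep those at or above a dynamic threshold
    (here: the weight of the entity at the keep_ratio cut of the ranking).
    The thresholding rule is illustrative, not the paper's exact method."""
    ranked = sorted(entity_weights.items(), key=lambda kv: kv[1], reverse=True)
    k = max(1, int(len(ranked) * keep_ratio))  # how many entities to keep
    threshold = ranked[k - 1][1]               # dynamic threshold from the ranking
    return [entity for entity, weight in ranked if weight >= threshold]
```

The selected entities would then anchor the coarse-to-fine highlighting of the paragraphs, sentences, or words that contain them.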
26. Detecting AI-Generated Texts in Cross-Domains
- Author
-
Zhou, You and Wang, Jie
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,I.2.7 - Abstract
Existing tools to detect text generated by a large language model (LLM) have met with some success, but their performance can drop when dealing with texts from new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model, using a dataset we constructed that includes a wide variety of texts written by humans and generated by various LLMs. We then present a method to fine-tune RoBERTa-Ranker that requires only a small amount of labeled data in a new domain. Experiments show that this fine-tuned domain-aware model outperforms the popular DetectGPT and GPTZero on both in-domain and cross-domain texts, where AI-generated texts may be from a different domain or generated by an LLM not used to generate the training datasets. This approach makes it feasible and economical to build a single system to detect AI-generated texts across various domains.
- Published
- 2024
27. Sparse Degree Optimization for BATS Codes
- Author
-
Yin, Hoover H. F. and Wang, Jie
- Subjects
Computer Science - Information Theory - Abstract
Batched sparse (BATS) codes are a class of batched network codes that can achieve a close-to-optimal rate when an optimal degree distribution is provided. We observed that most probability masses in this optimal distribution are very small, i.e., the distribution "looks" sparse. In this paper, we investigate sparsity optimization of the degree distribution for BATS codes, which produces sparse degree distributions. There are many advantages to using a sparse degree distribution; for example, it is robust to precision errors when sampling the degree distribution during encoding and decoding in practice. We discuss a few heuristics as well as a way to obtain an exact sparsity solution. These approaches offer a trade-off between computational time and achievable rate, giving us the flexibility to adopt BATS codes in various scenarios, e.g., devices with limited computational power or stable channel conditions., Comment: Full version of the conference version in ITW'24
- Published
- 2024
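One simple heuristic in the spirit of the sparsity discussion above is to drop tiny probability masses and renormalize. This is a hedged sketch under our own assumptions (the fixed threshold rule is ours, not one of the paper's methods):

```python
def sparsify_distribution(probs, eps=1e-3):
    """Illustrative heuristic: zero out probability masses below eps
    and renormalize so the result is still a distribution. Assumes at
    least one mass survives the threshold."""
    kept = [p if p >= eps else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        raise ValueError("eps too large: no probability mass survives")
    return [p / total for p in kept]
```

A sparse distribution like this is cheaper to sample and less sensitive to precision errors during encoding and decoding, at a possible cost in achievable rate.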
28. Constructing Cloze Questions Generatively
- Author
-
Sun, Yicheng and Wang, Jie
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,I.2.7 - Abstract
We present a generative method called CQG for constructing cloze questions from a given article using neural networks and WordNet, with an emphasis on generating multigram distractors. Built on sense disambiguation, text-to-text transformation, and WordNet's synset taxonomies and lexical labels, CQG selects an answer key for a given sentence, segments it into a sequence of instances, and generates instance-level distractor candidates (IDCs) using a transformer and sibling synsets. It then removes inappropriate IDCs, ranks the remaining IDCs based on contextual embedding similarities as well as synset and lexical relatedness, forms distractor candidates by combinatorially replacing instances with the corresponding top-ranked IDCs, and checks whether they are legitimate phrases. Finally, it selects top-ranked distractor candidates based on contextual semantic similarities to the answer key. Experiments show that this method significantly outperforms SOTA results. Human judges also confirm the high quality of the generated distractors., Comment: 8 pages, 5 figures, 5 tables, 2023 International Joint Conference on Neural Networks (IJCNN)
- Published
- 2024
29. Intrinsic Evaluation of RAG Systems for Deep-Logic Questions
- Author
-
Hu, Junyi, Zhou, You, and Wang, Jie
- Subjects
Computer Science - Artificial Intelligence ,I.2.7 - Abstract
We introduce the Overall Performance Index (OPI), an intrinsic metric to evaluate retrieval-augmented generation (RAG) mechanisms for applications involving deep-logic queries. OPI is computed as the harmonic mean of two key metrics: the Logical-Relation Correctness Ratio and the average of BERT embedding similarity scores between ground-truth and generated answers. We apply OPI to assess the performance of LangChain, a popular RAG tool, using a logical relations classifier fine-tuned from GPT-4o on the RAG-Dataset-12000 from Hugging Face. Our findings show a strong correlation between BERT embedding similarity scores and extrinsic evaluation scores. Among the commonly used retrievers, the cosine similarity retriever using BERT-based embeddings outperforms others, while the Euclidean distance-based retriever exhibits the weakest performance. Furthermore, we demonstrate that combining multiple retrievers, either algorithmically or by merging retrieved sentences, yields superior performance compared to using any single retriever alone.
- Published
- 2024
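The OPI metric defined above is the harmonic mean of two scores: the Logical-Relation Correctness Ratio and the average BERT embedding similarity. A minimal sketch (function names are ours, not from the paper):

```python
def harmonic_mean(a, b):
    """Harmonic mean of two non-negative scores; 0 if either is 0."""
    return 2 * a * b / (a + b) if (a + b) > 0 else 0.0

def opi(logical_correctness_ratio, bert_similarities):
    """Overall Performance Index: harmonic mean of the Logical-Relation
    Correctness Ratio and the mean BERT embedding similarity between
    ground-truth and generated answers."""
    avg_sim = sum(bert_similarities) / len(bert_similarities)
    return harmonic_mean(logical_correctness_ratio, avg_sim)
```

Because the harmonic mean is dominated by the smaller of its arguments, a RAG pipeline must do well on both logical-relation correctness and answer similarity to score highly.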
30. CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation
- Author
-
Huang, Fuxian, Zhang, Qi, Zhai, Shaopeng, Wang, Jie, Zhang, Tianyi, Zhang, Haoran, Zhou, Ming, Liu, Yu, and Qiao, Yu
- Subjects
Computer Science - Artificial Intelligence - Abstract
With the rapid development of artificial intelligence, multimodal learning has become an important research area. For intelligent agents, the state is a crucial modality to convey precise information alongside common modalities like images, videos, and language. This becomes especially clear with the broad adoption of reinforcement learning and multimodal large language models. Nevertheless, the representation of state modality still lags in development. To this end, we propose a High-Fidelity Contrastive Language-State Pre-training (CLSP) method, which can accurately encode state information into general representations for both reinforcement learning and multimodal large language models. Specifically, we first design a pre-training task based on the classification to train an encoder with coarse-grained information. Next, we construct data pairs of states and language descriptions, utilizing the pre-trained encoder to initialize the CLSP encoder. Then, we deploy contrastive learning to train the CLSP encoder to effectively represent precise state information. Additionally, we enhance the representation of numerical information using the Random Fourier Features (RFF) method for high-fidelity mapping. Extensive experiments demonstrate the superior precision and generalization capabilities of our representation, achieving outstanding results in text-state retrieval, reinforcement learning navigation tasks, and multimodal large language model understanding.
- Published
- 2024
31. SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graphs
- Author
-
Chen, Hanzhu, Shen, Xu, Lv, Qitan, Wang, Jie, Ni, Xiaoqi, and Ye, Jieping
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Knowledge graphs (KGs) play a pivotal role in knowledge-intensive tasks across specialized domains, where the acquisition of precise and dependable knowledge is crucial. However, existing KG construction methods heavily rely on human intervention to attain qualified KGs, which severely hinders their practical applicability in real-world scenarios. To address this challenge, we propose a general KG construction framework, named SAC-KG, that exploits large language models (LLMs) as Skilled Automatic Constructors for domain Knowledge Graphs. SAC-KG effectively involves LLMs as domain experts to generate specialized and precise multi-level KGs. Specifically, SAC-KG consists of three components: Generator, Verifier, and Pruner. For a given entity, Generator produces its relations and tails from raw domain corpora to construct a specialized single-level KG. Verifier and Pruner then work together to ensure precision by correcting generation errors and determining whether newly produced tails require further iteration for the next-level KG. Experiments demonstrate that SAC-KG automatically constructs a domain KG at the scale of over one million nodes and achieves a precision of 89.32%, a superior performance with an over 20% increase in precision compared to existing state-of-the-art methods for the KG construction task., Comment: ACL 2024 Main
- Published
- 2024
32. Sobolev inequalities involving 2-tensor fields in manifolds with nonnegative sectional curvature
- Author
-
Wang, Jie
- Subjects
Mathematics - Differential Geometry - Abstract
By applying the ABP method, we establish both a Log-Sobolev-type inequality and a Michael-Simon-type Sobolev inequality for smooth, symmetric, uniformly positive definite (0,2)-tensor fields in manifolds with nonnegative sectional curvature., Comment: 14 pages
- Published
- 2024
33. Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization
- Author
-
Zhao, Minyi, Wang, Jie, Li, Zhaoyang, Zhang, Jiyuan, Sun, Zhenbang, and Zhou, Shuigeng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent studies have shown that Vision Language Large Models (VLLMs) may output content not relevant to the input images. This problem, called the hallucination phenomenon, undoubtedly degrades VLLM performance. Therefore, various anti-hallucination techniques have been proposed to make model output more reasonable and accurate. Despite their successes, through extensive tests we found that augmenting the prompt (e.g., word appending, rewriting, and spelling errors) may change the model output and make it hallucinate again. To address this drawback, we propose a new instruct-tuning framework called Prompt Augmentation and Caption Utilization (PACU) to boost VLLMs' generation ability under augmented-prompt scenarios. Concretely, on the one hand, PACU exploits existing LLMs to augment and evaluate diverse prompts automatically. The resulting high-quality prompts are utilized to enhance the VLLM's ability to process different prompts. On the other hand, PACU exploits image captions to work jointly with image features and prompts for response generation. When the visual feature is inaccurate, the LLM can capture useful information from the image captions for response generation. Extensive experiments on hallucination evaluation and prompt-augmented datasets demonstrate that PACU can work well with existing schemes to effectively boost VLLM performance. Code is available at https://github.com/zhaominyiz/PACU.
- Published
- 2024
34. Bootstrapping the Quantum Hall problem
- Author
-
Gao, Qiang, Lanzetta, Ryan A., Ledwith, Patrick, Wang, Jie, and Khalaf, Eslam
- Subjects
Condensed Matter - Strongly Correlated Electrons - Abstract
The bootstrap method aims to solve problems by imposing constraints on the space of physical observables, which often follow from physical assumptions such as positivity and symmetry. Here, we employ a bootstrap approach to study interacting electrons in the lowest Landau level by minimizing the energy as a function of the static structure factor subject to a set of constraints, bypassing the need to construct the full many-body wavefunction. This approach rigorously lower bounds the ground state energy, making it complementary to conventional variational upper bounds. We show that the lower bound we obtain is relatively tight, within at most 5\% from the ground state energy computed with exact diagonalization (ED) at small system sizes, and generally gets tighter as we include more constraints. In addition to energetics, our results reproduce the correct power law dependence of the pair correlation function at short distances and the existence of a large entanglement gap in the two-particle entanglement spectra for the Laughlin states at $\nu = 1/3$. We further identify signatures of the composite Fermi liquid state close to half-filling. This shows that the bootstrap approach is capable, in principle, of describing non-trivial gapped topologically ordered, as well as gapless, phases. At the end, we will discuss possible extensions and limitations of this approach. Our work establishes numerical bootstrap as a promising method to study many-body phases in topological bands, paving the way to its application in moir\'e platforms where the energetic competition between fractional quantum anomalous Hall, symmetry broken, and gapless states remains poorly understood., Comment: Total 24 pages. Main text: 16 pages, 7 figures
- Published
- 2024
35. Nonreciprocal tripartite entanglement and asymmetric Einstein-Podolsky-Rosen steering via directional quantum squeezing
- Author
-
Jiao, Ya-Feng, Wang, Jie, Wang, Dong-Yang, Tang, Lei, Wang, Yan, Zuo, Yun-Lan, Bao, Wan-Su, Kuang, Le-Man, and Jing, Hui
- Subjects
Quantum Physics - Abstract
The generation and manipulation of multipartite entanglement and EPR steering in macroscopic systems not only play a fundamental role in exploring the nature of quantum mechanics, but are also at the core of current developments of various nascent quantum technologies. Here we report a theoretical method using directional injection of quantum squeezing to produce nonreciprocal multipartite entanglement and EPR steering in a three-mode optomechanical system with closed-loop coupling. We show that by directionally applying a two-photon parametric driving field with a phase-matched squeezed vacuum reservoir to an optomechanical resonator, a squeezed optical mode can be introduced for one of its input directions, thereby yielding an asymmetric enhancement of the optomechanical interaction and breaking the time-reversal symmetry of the system. Based on this feature, it is found that bipartite and tripartite entanglement and the associated EPR steering of the subsystems can only be generated when the coherent driving field is input from the squeezing-injection direction, thus achieving nonreciprocity in such quantum correlations. More excitingly, it is also found that by properly adjusting the squeezing parameter, the overall asymmetry of EPR steering can be stepwise driven from the no-way regime, through the one-way regime, to the two-way regime. These findings, holding promise for preparing rich types of entangled quantum resources with nonreciprocal correlations, may have potential applications in the area of quantum information processing, such as quantum secure direct communication and one-way quantum computing., Comment: 15 pages, 3 figures
- Published
- 2024
36. Physics-informed neural networks incorporating energy dissipation for the phase-field model of ferroelectric microstructure evolution
- Author
-
Shang, Lan, Zheng, Sizheng, Wang, Jin, and Wang, Jie
- Subjects
Condensed Matter - Materials Science - Abstract
Physics-informed neural networks (PINNs) are an emerging technique for solving partial differential equations (PDEs). In this work, we propose a simple but effective PINN approach for the phase-field model of ferroelectric microstructure evolution. This model is a time-dependent, nonlinear, high-order, multi-physics PDE system that is challenging to solve with a baseline PINN. Considering that obtaining steady microstructures is one of the primary goals in simulations of ferroelectric microstructure evolution, we simplify the time-dependent PDE system to a static problem. This static problem, however, is ill-posed. To overcome this issue, a term originating from the law of energy dissipation is embedded into the loss function as an extra constraint for the PINN. With this modification, the PINN successfully predicts the steady ferroelectric microstructure without tracking the evolution process. In addition, although the proposed PINN approach cannot tackle the dynamic problem directly, it benefits PINN predictions of the evolution process by providing labeled data. These data are crucial because they help the PINN avoid propagation failure, a common failure mode of PINNs when predicting dynamic behaviors. The above-mentioned advantages of the proposed PINN approach are demonstrated through a number of examples.
- Published
- 2024
37. On the Cosmic Variance of the Merger Rate Density of Binary Neutron Stars
- Author
-
Chen, Zhiwei, Lu, Youjun, Wang, Jie, Jiang, Zhen, Chu, Qingbo, and Ma, Xianghao
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,Astrophysics - Astrophysics of Galaxies - Abstract
The cosmic variance of the star formation history may bias the merger rate density estimation of binary neutron star (BNS) mergers obtained from compact binary population synthesis. In this paper, we take advantage of the large box size of the Millennium Simulation, combined with the semi-analytic galaxy formation model GABE and the parameterized population binary star evolution (BSE) model, to examine how much the cosmic variance affects the estimated merger rate density of BNS mergers. We find that for sub-box sizes of $100\rm\,Mpc$ and $200\rm\,Mpc$, the variance of the merger rate density $\sigma_{\rm R}/\rm R$ at different redshifts is about $23\%-35\%$ and $13\%-20\%$, respectively. On one hand, the variance of the detection rate of BNS mergers with the current LIGO-Virgo-KAGRA (LVK) detector network is very small ($\lesssim 10\%$), which indicates that ignoring the cosmic variance is reasonable when estimating the merger rate density from current LVK observations. On the other hand, with next-generation gravitational wave detectors, it is possible to localize BNS mergers within sub-boxes with a side length of $40\rm\,Mpc$ for source redshifts $z_{s}<0.2$. In such a small box, the cosmic variance of the merger rate density is significant, i.e., $\sigma_{\rm R}/\rm R \sim 55\%$. This suggests that estimating the merger rate density of BNSs in different sky areas may provide useful information on the cosmic variance., Comment: 7 pages, 5 figures, Accepted for Publication in ApJ
- Published
- 2024
38. Layer skyrmions for ideal Chern bands and twisted bilayer graphene
- Author
-
Guerci, Daniele, Wang, Jie, and Mora, Christophe
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Ideal $C=1$ Chern bands exhibit a Landau level correspondence: they factorize as a lowest Landau level and a spinor wavefunction that spans the layer index. We demonstrate that, in single Dirac moir\'e models, the spinor generally develops a Skyrme texture in real space with an associated Berry phase that exactly compensates the magnetic phase of the Landau level. For ideal bands with higher Chern numbers $C>1$, we find that $C$ color Landau levels are carried by $C$ spinors with Skyrme textures. We identify a SU(C) gauge symmetry in the color space of spinors and an emergent non-Abelian connection in real space intimately linked to the Pontryagin winding index of the layer skyrmions. They result in a total real-space Chern number of $-1$, screening the magnetic phase, irrespective of $C$ and of the number of layers. The topologically robust Skyrme texture remains remarkably intact in twisted bilayer graphene, even far from the chiral limit and for realistic values of corrugation, making it an experimentally testable feature. We verify our predictions at the first magic angle of twisted bilayer, trilayer, and monolayer-bilayer graphene., Comment: 8+13 pages, 6 figures
- Published
- 2024
39. Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method
- Author
-
Liu, Ze, Zhang, Jin, Feng, Chao, Lian, Defu, Wang, Jie, and Chen, Enhong
- Subjects
Computer Science - Information Retrieval - Abstract
Although advancements in deep learning have significantly enhanced the recommendation accuracy of deep recommendation models, these methods still suffer from low recommendation efficiency. Recently proposed tree-based deep recommendation models alleviate the problem by directly learning tree structure and representations under the guidance of recommendation objectives. To guarantee the effectiveness of beam search for recommendation accuracy, these models strive to ensure that the tree adheres to the max-heap assumption, where a parent node's preference should be the maximum among its children's preferences. However, they employ a one-versus-all strategy, framing the training task as a series of independent binary classification objectives for each node, which limits their ability to fully satisfy the max-heap assumption. To this end, we propose a Deep Tree-based Retriever (DTR for short) for efficient recommendation. DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level, enabling explicit horizontal competition and more discriminative top-k selection among them, which mimics the beam search behavior during training. To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function, which further aligns with the max-heap assumption in expectation. As the number of tree nodes grows exponentially with the levels, we employ sampled softmax to approximate optimization and thereby enhance efficiency. Furthermore, we propose a tree-based sampling method to reduce the bias inherent in sampled softmax. Theoretical results reveal DTR's generalization capability, and both the rectification method and tree-based sampling contribute to improved generalization. The experiments are conducted on four real-world datasets, validating the effectiveness of the proposed method.
- Published
- 2024
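The contrast drawn above between one-versus-all training and DTR's level-wise softmax can be illustrated with a standard cross-entropy over all nodes at one tree level. This is a simplified sketch of the objective as described in the abstract; the paper's loss rectification and sampled-softmax approximation are not reproduced here.

```python
import math

def level_softmax_loss(scores, target_index):
    """Cross-entropy loss over the scores of all nodes at one tree level,
    so that nodes compete horizontally, in contrast to independent
    per-node binary classification. Uses the max-shift trick for
    numerical stability."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target_index]
```

Minimizing this loss pushes the target node's score above its siblings', which mimics the top-k selection behavior of beam search during training.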
40. Multi-level Monte-Carlo Gradient Methods for Stochastic Optimization with Biased Oracles
- Author
-
Hu, Yifan, Wang, Jie, Chen, Xin, and He, Niao
- Subjects
Mathematics - Optimization and Control ,Computer Science - Machine Learning - Abstract
We consider stochastic optimization when one only has access to biased stochastic oracles of the objective and the gradient, and obtaining stochastic gradients with low biases comes at high costs. This setting captures various optimization paradigms, such as conditional stochastic optimization, distributionally robust optimization, shortfall risk optimization, and machine learning paradigms, such as contrastive learning. We examine a family of multi-level Monte Carlo (MLMC) gradient methods that exploit a delicate tradeoff among bias, variance, and oracle cost. We systematically study their total sample and computational complexities for strongly convex, convex, and nonconvex objectives and demonstrate their superiority over the widely used biased stochastic gradient method. When combined with the variance reduction techniques like SPIDER, these MLMC gradient methods can further reduce the complexity in the nonconvex regime. Our results imply that a series of stochastic optimization problems with biased oracles, previously considered to be more challenging, is fundamentally no harder than the classical stochastic optimization with unbiased oracles. We also delineate the boundary conditions under which these problems become more difficult. Moreover, MLMC gradient methods significantly improve the best-known complexities in the literature for conditional stochastic optimization and shortfall risk optimization. Our extensive numerical experiments on distributionally robust optimization, pricing and staffing scheduling problems, and contrastive learning demonstrate the superior performance of MLMC gradient methods., Comment: A preliminary version of this manuscript has appeared in a conference proceeding. Please refer to Yifan Hu, Xin Chen, and Niao He. On the bias-variance-cost tradeoff of stochastic optimization. Advances in Neural Information Processing Systems, 2021
- Published
- 2024
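The bias-variance-cost trade-off discussed above rests on the standard MLMC telescoping identity $\mathbb{E}[g_L] = \mathbb{E}[g_0] + \sum_{l=1}^{L}\mathbb{E}[g_l - g_{l-1}]$, where higher levels are less biased but more expensive. A minimal randomized-level sketch (the single-sample level-sampling scheme is a common MLMC variant, offered as our illustration rather than the paper's exact estimator):

```python
import random

def mlmc_estimate(oracle, max_level, probs, rng=random):
    """Single-sample MLMC estimator: oracle(level) returns one biased
    sample, with bias shrinking (and cost growing) in the level. Evaluates
    the base level plus one randomly chosen correction term, reweighted by
    its sampling probability, so the estimate matches the highest level's
    expectation on average. probs[l-1] is the probability of level l."""
    est = oracle(0)                                            # cheap base level
    l = rng.choices(range(1, max_level + 1), weights=probs)[0] # pick one level
    est += (oracle(l) - oracle(l - 1)) / probs[l - 1]          # reweighted correction
    return est
```

Averaging many such estimates recovers the highest-level expectation while spending most oracle calls at cheap levels, which is the source of the complexity gains over plain biased stochastic gradients.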
41. Multi-agent Multi-armed Bandits with Stochastic Sharable Arm Capacities
- Author
-
Xie, Hong, Mo, Jinyu, Lian, Defu, Wang, Jie, and Chen, Enhong
- Subjects
Computer Science - Artificial Intelligence - Abstract
Motivated by distributed selection problems, we formulate a new variant of multi-player multi-armed bandit (MAB) model, which captures stochastic arrival of requests to each arm, as well as the policy of allocating requests to players. The challenge is how to design a distributed learning algorithm such that players select arms according to the optimal arm pulling profile (an arm pulling profile prescribes the number of players at each arm) without communicating to each other. We first design a greedy algorithm, which locates one of the optimal arm pulling profiles with a polynomial computational complexity. We also design an iterative distributed algorithm for players to commit to an optimal arm pulling profile with a constant number of rounds in expectation. We apply the explore then commit (ETC) framework to address the online setting when model parameters are unknown. We design an exploration strategy for players to estimate the optimal arm pulling profile. Since such estimates can be different across different players, it is challenging for players to commit. We then design an iterative distributed algorithm, which guarantees that players can arrive at a consensus on the optimal arm pulling profile in only M rounds. We conduct experiments to validate our algorithm., Comment: 28 pages
- Published
- 2024
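The explore-then-commit (ETC) framework mentioned in the abstract above can be sketched in its simplest single-player form (hypothetical Bernoulli arms; the paper's distributed, multi-player consensus protocol is considerably richer):

```python
import random

def etc_bandit(means, horizon, explore_per_arm, seed=0):
    """Explore-then-commit on Bernoulli arms: pull every arm a fixed number of
    times, then commit to the empirically best arm for the remaining rounds."""
    rng = random.Random(seed)
    K = len(means)
    pulls, rewards = [0] * K, [0.0] * K
    total, t = 0.0, 0
    for a in range(K):                        # exploration phase
        for _ in range(explore_per_arm):
            r = 1.0 if rng.random() < means[a] else 0.0
            pulls[a] += 1; rewards[a] += r; total += r; t += 1
    best = max(range(K), key=lambda a: rewards[a] / pulls[a])
    for _ in range(horizon - t):              # commit phase
        total += 1.0 if rng.random() < means[best] else 0.0
    return best, total
```

With a long enough exploration phase the committed arm is the true best arm with high probability; the paper's harder problem is making all players agree on a whole arm pulling profile under the same uncertainty.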
42. Regularization for Adversarial Robust Learning
- Author
-
Wang, Jie, Gao, Rui, and Xie, Yao
- Subjects
Computer Science - Machine Learning ,Mathematics - Optimization and Control ,Statistics - Machine Learning - Abstract
Despite the growing prevalence of artificial neural networks in real-world applications, their vulnerability to adversarial attacks remains a significant concern, which motivates us to investigate the robustness of machine learning models. While various heuristics aim to optimize the distributionally robust risk using the $\infty$-Wasserstein metric, such a notion of robustness frequently encounters computational intractability. To tackle the computational challenge, we develop a novel approach to adversarial training that integrates $\phi$-divergence regularization into the distributionally robust risk function. This regularization brings a notable improvement in computation compared with the original formulation. We develop stochastic gradient methods with biased oracles to solve this problem efficiently, achieving near-optimal sample complexity. Moreover, we establish its regularization effects and demonstrate its asymptotic equivalence to a regularized empirical risk minimization framework, by considering various scaling regimes of the regularization parameter and robustness level. These regimes yield gradient norm regularization, variance regularization, or a smoothed gradient norm regularization that interpolates between these extremes. We numerically validate our proposed method in supervised learning, reinforcement learning, and contextual learning and showcase its state-of-the-art performance against various adversarial attacks., Comment: 51 pages, 5 figures
- Published
- 2024
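The computational benefit of divergence regularization described in the abstract above comes from smoothing the inner maximization. A minimal sketch of the KL-divergence case, where the worst-case loss over candidate perturbations softens into a log-sum-exp (an illustrative dual form, not the paper's exact $\phi$-divergence objective):

```python
import numpy as np

def kl_smoothed_worst_case(losses, eps):
    """KL-regularized surrogate for max(losses): eps * log(mean(exp(loss/eps))).
    Recovers the hard maximum as eps -> 0 and the plain average as eps -> inf,
    so eps interpolates between adversarial and empirical risk."""
    losses = np.asarray(losses, dtype=float)
    m = losses.max()                       # shift for numerical stability
    return m + eps * np.log(np.mean(np.exp((losses - m) / eps)))
```

Unlike the hard maximum, this surrogate is smooth in the losses, which is what makes stochastic gradient methods applicable.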
43. Chinese Metaphor Recognition Using a Multi-stage Prompting Large Language Model
- Author
-
Wang, Jie, Wang, Jin, and Zhang, Xuejie
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Metaphors are common in everyday language, and models that identify and understand metaphors facilitate a better understanding of text. In existing research, metaphors are mainly identified and generated by pre-trained models, but cases where tenors or vehicles are not explicitly included in the metaphor cannot be handled. This problem can be effectively solved with Large Language Models (LLMs), but significant room for exploration remains in this early-stage research area. This study proposes a multi-stage generative heuristic-enhanced prompt framework to improve the ability of LLMs to recognize tenors, vehicles, and grounds in Chinese metaphors. In the first stage, a small model is trained to obtain the confidence scores required for answer candidate generation. In the second stage, questions are clustered and sampled according to specific rules. Finally, the required heuristic-enhanced prompt is formed by combining the generated answer candidates and demonstrations. The proposed model achieved 3rd place in Track 1 of Subtask 1, 1st place in Track 2 of Subtask 1, and 1st place in both tracks of Subtask 2 at the NLPCC-2024 Shared Task 9.
- Published
- 2024
44. Mechanistic Modeling of Lipid Nanoparticle Formation for the Delivery of Nucleic Acid Therapeutics
- Author
-
Inguva, Pavan K., Mukherjee, Saikat, Walker, Pierre J., Kanso, Mona A., Wang, Jie, Wu, Yanchen, Tenberg, Vico, Santra, Srimanta, Singh, Shalini, Kim, Shin Hyuk, Trout, Bernhardt L., Bazant, Martin Z., Myerson, Allan S., and Braatz, Richard D.
- Subjects
Condensed Matter - Soft Condensed Matter ,Computer Science - Computational Engineering, Finance, and Science ,Physics - Biological Physics ,Physics - Chemical Physics - Abstract
Nucleic acids such as mRNA have emerged as a promising therapeutic modality with the capability of addressing a wide range of diseases. Lipid nanoparticles (LNPs) as a delivery platform for nucleic acids were used in the COVID-19 vaccines and have received much attention. While modern manufacturing processes, which involve rapidly mixing an organic stream containing the lipids with an aqueous stream containing the nucleic acids, are conceptually straightforward, detailed understanding of LNP formation and structure is still limited and scale-up can be challenging. Mathematical and computational methods are a promising avenue for deepening scientific understanding of the LNP formation process and facilitating improved process development and control. This article describes strategies for the mechanistic modeling of LNP formation, starting with strategies to estimate and predict important physicochemical properties of the various species, such as diffusivities and solubilities. Subsequently, a framework is outlined for constructing mechanistic models of reactor- and particle-scale processes. Insights gained from the various models are mapped back to product quality attributes and process insights. Lastly, the use of the models to guide development of advanced process control and optimization strategies is discussed., Comment: 67 pages, 10 figures
- Published
- 2024
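As a concrete instance of the property-estimation step described in the abstract above, species diffusivities are commonly first approximated with the Stokes-Einstein relation. This is a generic sketch with a water-like viscosity as an assumed default, not a formula taken from the article:

```python
import math

def stokes_einstein_diffusivity(radius_m, temp_K=298.15, viscosity_Pa_s=8.9e-4):
    """Stokes-Einstein diffusivity D = k_B * T / (6 * pi * eta * r) for a
    spherical species of hydrodynamic radius r in a solvent of viscosity eta."""
    k_B = 1.380649e-23  # Boltzmann constant, J/K
    return k_B * temp_K / (6.0 * math.pi * viscosity_Pa_s * radius_m)
```

For a 1 nm solute in water at 25 degrees C this gives on the order of 2.5e-10 m^2/s, the expected magnitude for small molecules in aqueous solution.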
45. Assembly History and Internal Structure of Cluster Cold Dark Matter Haloes
- Author
-
Chen, Qingxiang, Liao, Shihong, Wang, Jie, and Gao, Liang
- Subjects
Astrophysics - Astrophysics of Galaxies ,Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
We use the Phoenix simulations to study the mass assembly history and internal structures of cluster dark matter haloes ($M_{200} \gtrsim 5\times 10^{14} h^{-1}{\rm M}_\odot$). We confirm that cluster haloes grow inside-out, similar to galactic haloes. Major merger events dominate the growth of the internal region and minor mergers/diffuse accretion shape the outskirts. However, compared to galactic haloes, cluster haloes tend to have a younger and more actively evolving inner region. On average, the majority of mass (> 80%) in the inner region ($R< 0.1 r_{200}$) of Phoenix haloes is accreted after $z = 3$, while for galactic haloes, most mass in the central region has already been accreted before $z=6$. The density profiles of cluster haloes are less stable than those of galactic haloes over different radii. The enclosed mass within $50$ or $150$ kpc of all Phoenix haloes evolves substantially in the past ${\sim} 7$ Gyr, while galactic haloes remained stable during the same period. We suggest that the relatively younger and more active state explains the various observations of cluster haloes, especially in central regions., Comment: 12 pages, 11 figures, accepted for publication in MNRAS
- Published
- 2024
46. Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction
- Author
-
Liu, Tianyu, Lv, Qitan, Wang, Jie, Yang, Shuling, and Chen, Hanzhu
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Inductive relation prediction (IRP) -- where entities can be different during training and inference -- has shown great power for completing evolving knowledge graphs. Existing works mainly focus on using graph neural networks (GNNs) to learn the representation of the subgraph induced from the target link, which can be seen as an implicit rule-mining process to measure the plausibility of the target link. However, these methods cannot differentiate the target link from other links during message passing, hence the final subgraph representation will contain rule information irrelevant to the target link, which reduces the reasoning performance and severely hinders application to real-world scenarios. To tackle this problem, we propose a novel \textit{single-source edge-wise} GNN model to learn the \textbf{R}ule-induc\textbf{E}d \textbf{S}ubgraph represen\textbf{T}ations (\textbf{REST}), which encodes relevant rules and eliminates irrelevant rules within the subgraph. Specifically, we propose a \textit{single-source} initialization approach that initializes edge features only for the target link, which guarantees the relevance of the mined rules to the target link. Then we propose several RNN-based functions for \textit{edge-wise} message passing to model the sequential property of mined rules. REST is a simple and effective approach with theoretical support to learn the \textit{rule-induced subgraph representation}. Moreover, REST does not need node labeling, which significantly accelerates the subgraph preprocessing time by up to \textbf{11.66$\times$}. Experiments on inductive relation prediction benchmarks demonstrate the effectiveness of our REST. Our code is available at https://github.com/smart-lty/REST.
- Published
- 2024
47. SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation
- Author
-
Yu, Jieming, Wang, An, Dong, Wenzhen, Xu, Mengya, Islam, Mobarakol, Wang, Jie, Bai, Long, and Ren, Hongliang
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Robotics ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
The recent Segment Anything Model (SAM) 2 has demonstrated remarkable foundational competence in semantic segmentation, with its memory mechanism and mask decoder further addressing challenges in video tracking and object occlusion, thereby achieving superior results in interactive segmentation for both images and videos. Building upon our previous empirical studies, we further explore the zero-shot segmentation performance of SAM 2 in robot-assisted surgery based on prompts, alongside its robustness against real-world corruption. For static images, we employ two forms of prompts: 1-point and bounding box, while for video sequences, the 1-point prompt is applied to the initial frame. Through extensive experimentation on the MICCAI EndoVis 2017 and EndoVis 2018 benchmarks, SAM 2, when utilizing bounding box prompts, outperforms state-of-the-art (SOTA) methods in comparative evaluations. The results with point prompts also exhibit a substantial enhancement over SAM's capabilities, nearing or even surpassing existing unprompted SOTA methodologies. In addition, SAM 2 demonstrates improved inference speed and less performance degradation under various image corruptions. Although slightly unsatisfactory results remain in specific edges or regions, SAM 2's robust adaptability to 1-point prompts underscores its potential for downstream surgical tasks with limited prompt requirements., Comment: Empirical study. Previous work "SAM Meets Robotic Surgery" is accessible at: arXiv:2308.07156
- Published
- 2024
48. Esketamine vs. placebo combined with erector spinae plane block vs. intercostal nerve block on quality of recovery following thoracoscopic lung resection: A randomized controlled factorial trial.
- Author
-
Hu, Jing-Hui, Zhong, Zhang-Zhen, Shi, Hai-Jing, Wang, Jie, Chen, Shaomu, Shan, Xi-Sheng, Liu, Hua-Yue, Liu, Hong, Meng, Lingzhong, Ji, Fu-Hai, and Peng, Ke
- Subjects
Clinical Sciences ,Surgery ,Clinical sciences - Abstract
Multimodal analgesic strategy is pivotal for enhanced recovery after surgery. The objective of this trial was to assess the effect of subanesthetic esketamine vs. placebo combined with erector spinae plane block (ESPB) vs. intercostal nerve block (ICNB) on postoperative recovery following thoracoscopic lung resection. This randomized, controlled, 2×2 factorial trial was conducted at a university hospital in Suzhou, China. One hundred adult patients undergoing thoracoscopic lung surgery were randomized to one of four groups (esketamine-ESPB, esketamine-ICNB, placebo-ESPB, and placebo-ICNB) to receive i.v. esketamine 0.3 mg/kg or normal saline placebo combined with ESPB or ICNB using 0.375% ropivacaine 20 mL. All patients received flurbiprofen axetil and patient-controlled fentanyl. The primary outcome was quality of recovery (QoR) at 24 h postoperatively, assessed using the QoR-15 scale, with a minimal clinically important difference of 6.0. The median age was 57 years and 52% were female. No significant interaction effect was found between esketamine and regional blocks on QoR (P=0.215). The QoR-15 score at 24 h was 111.5±5.8 in the esketamine group vs. 105.4±4.5 in the placebo group (difference=6.1, 95% CI, 4.0-8.1; P
- Published
- 2024
49. The Llama 3 Herd of Models
- Author
-
Grattafiori, Aaron, Dubey, Abhimanyu, Jauhri, Abhinav, Pandey, Abhinav, Kadian, Abhishek, Al-Dahle, Ahmad, Letman, Aiesha, Mathur, Akhil, Schelten, Alan, Vaughan, Alex, Yang, Amy, Fan, Angela, Goyal, Anirudh, Hartshorn, Anthony, Yang, Aobo, Mitra, Archi, Sravankumar, Archie, Korenev, Artem, Hinsvark, Arthur, Rao, Arun, Zhang, Aston, Rodriguez, Aurelien, Gregerson, Austen, Spataru, Ava, Roziere, Baptiste, Biron, Bethany, Tang, Binh, Chern, Bobbie, Caucheteux, Charlotte, Nayak, Chaya, Bi, Chloe, Marra, Chris, McConnell, Chris, Keller, Christian, Touret, Christophe, Wu, Chunyang, Wong, Corinne, Ferrer, Cristian Canton, Nikolaidis, Cyrus, Allonsius, Damien, Song, Daniel, Pintz, Danielle, Livshits, Danny, Wyatt, Danny, Esiobu, David, Choudhary, Dhruv, Mahajan, Dhruv, Garcia-Olano, Diego, Perino, Diego, Hupkes, Dieuwke, Lakomkin, Egor, AlBadawy, Ehab, Lobanova, Elina, Dinan, Emily, Smith, Eric Michael, Radenovic, Filip, Guzmán, Francisco, Zhang, Frank, Synnaeve, Gabriel, Lee, Gabrielle, Anderson, Georgia Lewis, Thattai, Govind, Nail, Graeme, Mialon, Gregoire, Pang, Guan, Cucurell, Guillem, Nguyen, Hailey, Korevaar, Hannah, Xu, Hu, Touvron, Hugo, Zarov, Iliyan, Ibarra, Imanol Arrieta, Kloumann, Isabel, Misra, Ishan, Evtimov, Ivan, Zhang, Jack, Copet, Jade, Lee, Jaewon, Geffert, Jan, Vranes, Jana, Park, Jason, Mahadeokar, Jay, Shah, Jeet, van der Linde, Jelmer, Billock, Jennifer, Hong, Jenny, Lee, Jenya, Fu, Jeremy, Chi, Jianfeng, Huang, Jianyu, Liu, Jiawen, Wang, Jie, Yu, Jiecao, Bitton, Joanna, Spisak, Joe, Park, Jongsoo, Rocca, Joseph, Johnstun, Joshua, Saxe, Joshua, Jia, Junteng, Alwala, Kalyan Vasuden, Prasad, Karthik, Upasani, Kartikeya, Plawiak, Kate, Li, Ke, Heafield, Kenneth, Stone, Kevin, El-Arini, Khalid, Iyer, Krithika, Malik, Kshitiz, Chiu, Kuenley, Bhalla, Kunal, Lakhotia, Kushal, Rantala-Yeary, Lauren, van der Maaten, Laurens, Chen, Lawrence, Tan, Liang, Jenkins, Liz, Martin, Louis, Madaan, Lovish, Malo, Lubo, Blecher, Lukas, Landzaat, Lukas, de Oliveira, 
Luke, Muzzi, Madeline, Pasupuleti, Mahesh, Singh, Mannat, Paluri, Manohar, Kardas, Marcin, Tsimpoukelli, Maria, Oldham, Mathew, Rita, Mathieu, Pavlova, Maya, Kambadur, Melanie, Lewis, Mike, Si, Min, Singh, Mitesh Kumar, Hassan, Mona, Goyal, Naman, Torabi, Narjes, Bashlykov, Nikolay, Bogoychev, Nikolay, Chatterji, Niladri, Zhang, Ning, Duchenne, Olivier, Çelebi, Onur, Alrassy, Patrick, Zhang, Pengchuan, Li, Pengwei, Vasic, Petar, Weng, Peter, Bhargava, Prajjwal, Dubal, Pratik, Krishnan, Praveen, Koura, Punit Singh, Xu, Puxin, He, Qing, Dong, Qingxiao, Srinivasan, Ragavan, Ganapathy, Raj, Calderer, Ramon, Cabral, Ricardo Silveira, Stojnic, Robert, Raileanu, Roberta, Maheswari, Rohan, Girdhar, Rohit, Patel, Rohit, Sauvestre, Romain, Polidoro, Ronnie, Sumbaly, Roshan, Taylor, Ross, Silva, Ruan, Hou, Rui, Wang, Rui, Hosseini, Saghar, Chennabasappa, Sahana, Singh, Sanjay, Bell, Sean, Kim, Seohyun Sonia, Edunov, Sergey, Nie, Shaoliang, Narang, Sharan, Raparthy, Sharath, Shen, Sheng, Wan, Shengye, Bhosale, Shruti, Zhang, Shun, Vandenhende, Simon, Batra, Soumya, Whitman, Spencer, Sootla, Sten, Collot, Stephane, Gururangan, Suchin, Borodinsky, Sydney, Herman, Tamar, Fowler, Tara, Sheasha, Tarek, Georgiou, Thomas, Scialom, Thomas, Speckbacher, Tobias, Mihaylov, Todor, Xiao, Tong, Karn, Ujjwal, Goswami, Vedanuj, Gupta, Vibhor, Ramanathan, Vignesh, Kerkez, Viktor, Gonguet, Vincent, Do, Virginie, Vogeti, Vish, Albiero, Vítor, Petrovic, Vladan, Chu, Weiwei, Xiong, Wenhan, Fu, Wenyin, Meers, Whitney, Martinet, Xavier, Wang, Xiaodong, Wang, Xiaofang, Tan, Xiaoqing Ellen, Xia, Xide, Xie, Xinfeng, Jia, Xuchao, Wang, Xuewei, Goldschlag, Yaelle, Gaur, Yashesh, Babaei, Yasmine, Wen, Yi, Song, Yiwen, Zhang, Yuchen, Li, Yue, Mao, Yuning, Coudert, Zacharie Delpierre, Yan, Zheng, Chen, Zhengxing, Papakipos, Zoe, Singh, Aaditya, Srivastava, Aayushi, Jain, Abha, Kelsey, Adam, Shajnfeld, Adam, Gangidi, Adithya, Victoria, Adolfo, Goldstand, Ahuva, Menon, Ajay, Sharma, Ajay, Boesenberg, Alex, 
Baevski, Alexei, Feinstein, Allie, Kallet, Amanda, Sangani, Amit, Teo, Amos, Yunus, Anam, Lupu, Andrei, Alvarado, Andres, Caples, Andrew, Gu, Andrew, Ho, Andrew, Poulton, Andrew, Ryan, Andrew, Ramchandani, Ankit, Dong, Annie, Franco, Annie, Goyal, Anuj, Saraf, Aparajita, Chowdhury, Arkabandhu, Gabriel, Ashley, Bharambe, Ashwin, Eisenman, Assaf, Yazdan, Azadeh, James, Beau, Maurer, Ben, Leonhardi, Benjamin, Huang, Bernie, Loyd, Beth, De Paola, Beto, Paranjape, Bhargavi, Liu, Bing, Wu, Bo, Ni, Boyu, Hancock, Braden, Wasti, Bram, Spence, Brandon, Stojkovic, Brani, Gamido, Brian, Montalvo, Britt, Parker, Carl, Burton, Carly, Mejia, Catalina, Liu, Ce, Wang, Changhan, Kim, Changkyu, Zhou, Chao, Hu, Chester, Chu, Ching-Hsiang, Cai, Chris, Tindal, Chris, Feichtenhofer, Christoph, Gao, Cynthia, Civin, Damon, Beaty, Dana, Kreymer, Daniel, Li, Daniel, Adkins, David, Xu, David, Testuggine, Davide, David, Delia, Parikh, Devi, Liskovich, Diana, Foss, Didem, Wang, Dingkang, Le, Duc, Holland, Dustin, Dowling, Edward, Jamil, Eissa, Montgomery, Elaine, Presani, Eleonora, Hahn, Emily, Wood, Emily, Le, Eric-Tuan, Brinkman, Erik, Arcaute, Esteban, Dunbar, Evan, Smothers, Evan, Sun, Fei, Kreuk, Felix, Tian, Feng, Kokkinos, Filippos, Ozgenel, Firat, Caggioni, Francesco, Kanayet, Frank, Seide, Frank, Florez, Gabriela Medina, Schwarz, Gabriella, Badeer, Gada, Swee, Georgia, Halpern, Gil, Herman, Grant, Sizov, Grigory, Guangyi, Zhang, Lakshminarayanan, Guna, Inan, Hakan, Shojanazeri, Hamid, Zou, Han, Wang, Hannah, Zha, Hanwen, Habeeb, Haroun, Rudolph, Harrison, Suk, Helen, Aspegren, Henry, Goldman, Hunter, Zhan, Hongyuan, Damlaj, Ibrahim, Molybog, Igor, Tufanov, Igor, Leontiadis, Ilias, Veliche, Irina-Elena, Gat, Itai, Weissman, Jake, Geboski, James, Kohli, James, Lam, Janice, Asher, Japhet, Gaya, Jean-Baptiste, Marcus, Jeff, Tang, Jeff, Chan, Jennifer, Zhen, Jenny, Reizenstein, Jeremy, Teboul, Jeremy, Zhong, Jessica, Jin, Jian, Yang, Jingyi, Cummings, Joe, Carvill, Jon, Shepard, Jon, 
McPhie, Jonathan, Torres, Jonathan, Ginsburg, Josh, Wang, Junjie, Wu, Kai, U, Kam Hou, Saxena, Karan, Khandelwal, Kartikay, Zand, Katayoun, Matosich, Kathy, Veeraraghavan, Kaushik, Michelena, Kelly, Li, Keqian, Jagadeesh, Kiran, Huang, Kun, Chawla, Kunal, Huang, Kyle, Chen, Lailin, Garg, Lakshya, A, Lavender, Silva, Leandro, Bell, Lee, Zhang, Lei, Guo, Liangpeng, Yu, Licheng, Moshkovich, Liron, Wehrstedt, Luca, Khabsa, Madian, Avalani, Manav, Bhatt, Manish, Mankus, Martynas, Hasson, Matan, Lennie, Matthew, Reso, Matthias, Groshev, Maxim, Naumov, Maxim, Lathi, Maya, Keneally, Meghan, Liu, Miao, Seltzer, Michael L., Valko, Michal, Restrepo, Michelle, Patel, Mihir, Vyatskov, Mik, Samvelyan, Mikayel, Clark, Mike, Macey, Mike, Wang, Mike, Hermoso, Miquel Jubert, Metanat, Mo, Rastegari, Mohammad, Bansal, Munish, Santhanam, Nandhini, Parks, Natascha, White, Natasha, Bawa, Navyata, Singhal, Nayan, Egebo, Nick, Usunier, Nicolas, Mehta, Nikhil, Laptev, Nikolay Pavlovich, Dong, Ning, Cheng, Norman, Chernoguz, Oleg, Hart, Olivia, Salpekar, Omkar, Kalinli, Ozlem, Kent, Parkin, Parekh, Parth, Saab, Paul, Balaji, Pavan, Rittner, Pedro, Bontrager, Philip, Roux, Pierre, Dollar, Piotr, Zvyagina, Polina, Ratanchandani, Prashant, Yuvraj, Pritish, Liang, Qian, Alao, Rachad, Rodriguez, Rachel, Ayub, Rafi, Murthy, Raghotham, Nayani, Raghu, Mitra, Rahul, Parthasarathy, Rangaprabhu, Li, Raymond, Hogan, Rebekkah, Battey, Robin, Wang, Rocky, Howes, Russ, Rinott, Ruty, Mehta, Sachin, Siby, Sachin, Bondu, Sai Jayesh, Datta, Samyak, Chugh, Sara, Hunt, Sara, Dhillon, Sargun, Sidorov, Sasha, Pan, Satadru, Mahajan, Saurabh, Verma, Saurabh, Yamamoto, Seiji, Ramaswamy, Sharadh, Lindsay, Shaun, Feng, Sheng, Lin, Shenghao, Zha, Shengxin Cindy, Patil, Shishir, Shankar, Shiva, Zhang, Shuqiang, Wang, Sinong, Agarwal, Sneha, Sajuyigbe, Soji, Chintala, Soumith, Max, Stephanie, Chen, Stephen, Kehoe, Steve, Satterfield, Steve, Govindaprasad, Sudarshan, Gupta, Sumit, Deng, Summer, Cho, Sungmin, Virk, Sunny, 
Subramanian, Suraj, Choudhury, Sy, Goldman, Sydney, Remez, Tal, Glaser, Tamar, Best, Tamara, Koehler, Thilo, Robinson, Thomas, Li, Tianhe, Zhang, Tianjun, Matthews, Tim, Chou, Timothy, Shaked, Tzook, Vontimitta, Varun, Ajayi, Victoria, Montanez, Victoria, Mohan, Vijai, Kumar, Vinay Satish, Mangla, Vishal, Ionescu, Vlad, Poenaru, Vlad, Mihailescu, Vlad Tiberiu, Ivanov, Vladimir, Li, Wei, Wang, Wenchen, Jiang, Wenwen, Bouaziz, Wes, Constable, Will, Tang, Xiaocheng, Wu, Xiaojian, Wang, Xiaolan, Wu, Xilun, Gao, Xinbo, Kleinman, Yaniv, Chen, Yanjun, Hu, Ye, Jia, Ye, Qi, Ye, Li, Yenda, Zhang, Yilin, Zhang, Ying, Adi, Yossi, Nam, Youngjin, Yu, Wang, Zhao, Yu, Hao, Yuchen, Qian, Yundi, Li, Yunlu, He, Yuzi, Rait, Zach, DeVito, Zachary, Rosnbrick, Zef, Wen, Zhaoduo, Yang, Zhenyu, Zhao, Zhiwei, and Ma, Zhiyu
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
- Published
- 2024
50. A Fan-type condition for cycles in $1$-tough and $k$-connected $(P_2\cup kP_1)$-free graphs
- Author
-
Hu, Zhiquan, Wang, Jie, and Shen, Changlong
- Subjects
Mathematics - Combinatorics ,05C38, 05C45 ,G.2.2 - Abstract
For a graph $G$, let $\mu_k(G):=\min~\{\max_{x\in S}d_G(x):~S\in \mathcal{S}_k\}$, where $\mathcal{S}_k$ is the set consisting of all independent sets $\{u_1,\ldots,u_k\}$ of $G$ such that some vertex, say $u_i$ ($1\leq i\leq k$), is at distance two from every other vertex in it. A graph $G$ is $1$-tough if for each cut set $S\subseteq V(G)$, $G-S$ has at most $|S|$ components. Recently, Shi and Shan \cite{Shi} conjectured that for each integer $k\geq 4$, being $2k$-connected is sufficient for $1$-tough $(P_2\cup kP_1)$-free graphs to be hamiltonian, which was confirmed by Xu et al. \cite{Xu} and Ota and Sanka \cite{Ota2}, respectively. In this article, we generalize the above results through the following Fan-type theorem: Let $k$ be an integer with $k\geq 2$ and let $G$ be a $1$-tough and $k$-connected $(P_2\cup kP_1)$-free graph with $\mu_{k+1}(G)\geq\frac{7k-6}{5}$, then $G$ is hamiltonian or the Petersen graph., Comment: 19 pages, 4 figures
- Published
- 2024
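The 1-toughness condition defined in the abstract above is straightforward to verify by brute force on small graphs. A minimal sketch (exponential in the number of vertices, so for illustration only):

```python
from itertools import combinations

def is_one_tough(adj):
    """Brute-force 1-toughness check: for every nonempty vertex set S, the graph
    minus S must have at most |S| connected components. adj is a list of
    neighbor sets indexed by vertex."""
    n = len(adj)

    def num_components(removed):
        seen, comps = set(removed), 0
        for v in range(n):
            if v not in seen:
                comps += 1          # start a depth-first search from a new root
                stack = [v]
                seen.add(v)
                while stack:
                    u = stack.pop()
                    for w in adj[u]:
                        if w not in seen:
                            seen.add(w)
                            stack.append(w)
        return comps

    return all(num_components(S) <= len(S)
               for k in range(1, n)
               for S in combinations(range(n), k))
```

For example, every cycle is 1-tough, while a path is not: deleting an internal vertex of a path leaves two components.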