2,011,071 results on '"WANG, P."'
Search Results
2. Scaffolding Middle-School Mathematics Curricula with Large Language Models. EdWorkingPaper No. 24-1028
- Author
Annenberg Institute for School Reform at Brown University, Rizwaan Malik, Dorna Abdi, Rose Wang, and Dorottya Demszky
- Abstract
Despite well-designed curriculum materials, teachers often face challenges in their implementation due to diverse classroom needs. This paper investigates whether Large Language Models (LLMs) can support middle-school math teachers by helping create high-quality curriculum scaffolds, which we define as the adaptations and supplements teachers employ to ensure all students can access and engage with the curriculum. Through Cognitive Task Analysis with expert teachers, we identify a three-stage process for curriculum scaffolding: observation, strategy formulation, and implementation. We incorporate these insights into three LLM approaches to create warmup tasks that activate background knowledge. The best-performing approach, which provides the model with the original curriculum materials and an expert-informed prompt, generates warmups that are rated significantly higher than warmups created by expert teachers in terms of alignment to learning objectives, accessibility to students working below grade level, and teacher preference. This research demonstrates the potential of LLMs to support teachers in creating effective scaffolds and provides a methodology for developing AI-driven educational tools. [This report was funded by Stanford's Center for Human-Centered AI and the Stanford Accelerator for Learning.]
- Published
- 2024
3. Universal Prekindergarten Expansion in California: Progress and Opportunities
- Author
Learning Policy Institute, Victoria Wang, Melanie Leung-Gagné, Hanna Melnick, and Marjorie E. Wechsler
- Abstract
In 2021, California committed to providing universal prekindergarten (UPK) for all 4-year-olds and expanding access for income-eligible 3-year-olds by 2025-2026. California UPK includes several early learning programs, including transitional kindergarten (TK), the California State Preschool Program (CSPP), Head Start, and locally funded early learning programs. To support UPK expansion, California's legislature and administration established the Universal Prekindergarten Planning and Implementation Grant in 2021, which allocated $200 million to all local education agencies (LEAs) serving kindergarteners, which include school districts, charter schools, and county offices of education. The California Department of Education surveyed all grant recipients in August 2023 about their UPK programs. This report provides an update on UPK implementation across the state through an analysis of survey responses from 1,384 LEAs, which represent almost all (95%) public school districts and two thirds (65%) of charter schools that serve elementary grades. Findings provide insights into LEAs' progress in UPK implementation related to service delivery models, facilities and transportation, instruction and assessment, strategies to support student needs, workforce development, implementation challenges, and technical assistance needs. In addition to statewide insights, the survey revealed promising practices and wide access with UPK expansion in California's four largest districts during their first year of implementation. The findings in this report may help policymakers and practitioners identify areas for additional investments and supports.
- Published
- 2024
4. Report on Indicators of School Crime and Safety: 2023. NCES 2024-145/NCJ 309126
- Author
National Center for Education Statistics (NCES) (ED/IES), US Department of Justice, Bureau of Justice Statistics, American Institutes for Research (AIR), Véronique Irwin, Ke Wang, Jiashan Cui, and Alexandra Thompson
- Abstract
This report provides the most recent national indicators on school crime and safety. The information presented in this report serves as a reference for policymakers and practitioners so that they can develop effective programs and policies aimed at violence and school crime prevention. Accurate information about the nature, extent, and scope of the problem being addressed is essential for developing effective programs and policies. The report is organized into five sections: elementary and secondary student and teacher victimization; school environment; fights and weapons; safety, security, and mental health practices; and postsecondary campus safety and security. Each section begins with a set of key findings. In this report, where available, data on victimization that occurred away from school are offered as a point of comparison for data on victimization that occurred at school. Indicators of crime and safety are compared across different population subgroups and over time. All data reflect the most current data available at the time the report was produced. Data throughout this report represent the 50 states and the District of Columbia. Findings described with comparative language (e.g., higher, lower, increase, and decrease) are statistically significant at the 0.05 level.
- Published
- 2024
5. Evaluating the Impact of Cloud e-Learning in Higher Education: An Empirical Investigation
- Author
Lillian-Yee-Kiaw Wang
- Abstract
The motivation for conducting this study is to investigate the potential of Cloud e-learning to address the high-cost and high-complexity challenges of conventional learning methods for the upgraded learning processes in higher education. The overall direction of this research is driven towards how the actual usage of Cloud e-learning module affects students' perceptions and academic performance. A Cloud e-learning module is designed and developed to promote optimised resource utilisation in the e-learning processes in higher education. A pretest-posttest method was adopted to study the impact of Cloud e-learning usage among students and whether the diffusion of Cloud e-learning has caused a change in students' perceptions. The pretest-posttest results and students' academic performance were then analysed to examine the impact from the actual usage of Cloud e-learning module. The findings reveal that the change of students' perceptions is time variant, indicating students' mixed perceptions on the usage of Cloud e-learning module. Analysis evidently reveals that the use of Cloud e-learning improved students' learning performance in theoretical subjects. This research is useful to educators and ICT practitioners in making informed decisions in adopting the right ICT infrastructures to support e-learning in higher education.
- Published
- 2024
6. Rethinking How People Learn: A Holistic Framework for Effective Learning Design
- Author
Minhong Wang
- Abstract
Learning is an integral part of being human. How people learn has long been discussed, revealed in many learning theories, investigated in numerous studies, and demonstrated in extensive practices. The goal of this article is to rethink how people learn from four fundamental perspectives, that is, learning by interaction with content (C), learning by interaction with other people (O), learning by interaction with self (S), and learning by interaction with tasks or practices (T), so-called COST model. This framework offers a high-level view of human learning and the role of technology in human learning. Moreover, it serves as a guide for effective design of learning experiences, learning environments, and learning approaches, where technology has become a crucial component.
- Published
- 2024
7. Quest for Equitable Education in Phases: Insights from an NGO in China
- Author
Shirley Pan and Bo Wang
- Abstract
Among the East Asian nations, a recurring predicament faced by educational institutions is that of providing inclusive but high-quality education. Active involvement of non-governmental organizations (NGOs) in education is valuable in China. Adream was such an NGO on education in China, established in 2008 with a singular and noble objective: promotion of equitable access to quality education within the disadvantaged regions of China. The trajectory of Adream's endeavor to secure equitable access to quality education in rural China stands as a compelling exemplar of the transformative potential that NGOs wield within the realm of education.
- Published
- 2024
8. Loss of Schooling from Tropical Cyclones: Evidence from 13 Low- and Middle-Income Countries. EdWorkingPaper No. 24-980
- Author
Annenberg Institute for School Reform at Brown University, Renzhi Jing, Sam Heft-Neal, Zetianyu Wang, Jie Chen, Minghao Qiu, Isaac M. Opper, Zachary Wagner, and Eran Bendavid
- Abstract
Increasing educational attainment is one of the most important and effective tools for health and economic improvements. The extent to which extreme climate events disrupt education, resulting in fewer years of schooling and reduced educational attainment, remains under-studied. Children in low- and middle-income countries may be uniquely vulnerable to loss of schooling after such disasters due to the poor physical condition of schools and the lack of resources to rebuild and mitigate unexpected household shocks. Our analysis assesses this overlooked social cost of tropical cyclones on schooling attainment. We study the education records of nearly 5.1 million people living in 13 low- and middle-income countries that were exposed to tropical cyclones between 1954-2010. We find that exposure to tropical cyclones during preschool age is associated with a 2.7 percentage point decrease in primary school enrollment on average (14.2% decrease), with larger effects from more intense storms (up to 28% decrease for the most intense storms). These effects are more pronounced among school-age girls compared to boys and are greater in areas less accustomed to experiencing tropical cyclones. We estimate that, across all LMICs, tropical cyclone exposure has resulted in more than 410,000 children not attending primary school in the last 20 years, leading to a reduction of more than 4.1 million total years of schooling. These impacts, identified among some of the world's poorest populations, may grow in importance as exposure to severe tropical cyclones is projected to increase with climate change.
- Published
- 2024
9. DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving
- Author
Liao, Bencheng, Chen, Shaoyu, Yin, Haoran, Jiang, Bo, Wang, Cheng, Yan, Sixu, Zhang, Xinbang, Li, Xiangyu, Zhang, Ying, Zhang, Qian, and Wang, Xinggang
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
- Abstract
Recently, the diffusion model has emerged as a powerful generative technique for robotic policy learning, capable of modeling multi-mode action distributions. Leveraging its capability for end-to-end autonomous driving is a promising direction. However, the numerous denoising steps in the robotic diffusion policy and the more dynamic, open-world nature of traffic scenes pose substantial challenges for generating diverse driving actions at a real-time speed. To address these challenges, we propose a novel truncated diffusion policy that incorporates prior multi-mode anchors and truncates the diffusion schedule, enabling the model to learn denoising from anchored Gaussian distribution to the multi-mode driving action distribution. Additionally, we design an efficient cascade diffusion decoder for enhanced interaction with conditional scene context. The proposed model, DiffusionDrive, demonstrates 10$\times$ reduction in denoising steps compared to vanilla diffusion policy, delivering superior diversity and quality in just 2 steps. On the planning-oriented NAVSIM dataset, with the aligned ResNet-34 backbone, DiffusionDrive achieves 88.1 PDMS without bells and whistles, setting a new record, while running at a real-time speed of 45 FPS on an NVIDIA 4090. Qualitative results on challenging scenarios further confirm that DiffusionDrive can robustly generate diverse plausible driving actions. Code and model will be available at https://github.com/hustvl/DiffusionDrive.
Comment: Work in progress. Code & demo & model will be available at https://github.com/hustvl/DiffusionDrive
- Published
- 2024
10. Material Anything: Generating Materials for Any 3D Object via Diffusion
- Author
Huang, Xin, Wang, Tengfei, Liu, Ziwei, and Wang, Qing
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
- Abstract
We present Material Anything, a fully-automated, unified diffusion framework designed to generate physically-based materials for 3D objects. Unlike existing methods that rely on complex pipelines or case-specific optimizations, Material Anything offers a robust, end-to-end solution adaptable to objects under diverse lighting conditions. Our approach leverages a pre-trained image diffusion model, enhanced with a triple-head architecture and rendering loss to improve stability and material quality. Additionally, we introduce confidence masks as a dynamic switcher within the diffusion model, enabling it to effectively handle both textured and texture-less objects across varying lighting conditions. By employing a progressive material generation strategy guided by these confidence masks, along with a UV-space material refiner, our method ensures consistent, UV-ready material outputs. Extensive experiments demonstrate our approach outperforms existing methods across a wide range of object categories and lighting conditions.
Comment: Project page: https://xhuangcv.github.io/MaterialAnything/
- Published
- 2024
11. PDS 70b Shows Stellar-like Carbon-to-Oxygen Ratio
- Author
Hsu, Chih-Chun, Wang, Jason J., Blake, Geoffrey A., Xuan, Jerry W., Zhang, Yapeng, Ruffio, Jean-Baptiste, Horstman, Katelyn, Cronin, Julianne, Sappey, Ben, Xin, Yinzi, Finnerty, Luke, Echeverri, Daniel, Mawet, Dimitri, Jovanovic, Nemanja, Ó, Clarissa R. Do, Baker, Ashley, Bartos, Randall, Calvin, Benjamin, Cetre, Sylvain, Delorme, Jacques-Robert, Doppmann, Gregory W., Fitzgerald, Michael P., Liberman, Joshua, López, Ronald A., Morris, Evan, Pezzato-Rovner, Jacklyn, Schofield, Tobias, Skemer, Andrew, Wallace, J. Kent, and Wang, Ji
- Subjects
Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Solar and Stellar Astrophysics
- Abstract
The $\sim$5 Myr PDS 70 is the only known system with protoplanets residing in the cavity of the circumstellar disk from which they formed, ideal for studying exoplanet formation and evolution within its natal environment. Here we report the first spin constraint and C/O measurement of PDS 70b from Keck/KPIC high-resolution spectroscopy. We detected CO (3.8 $\sigma$) and H$_2$O (3.5 $\sigma$) molecules in the PDS 70b atmosphere via cross-correlation, with a combined CO and H$_2$O template detection significance of 4.2 $\sigma$. Our forward model fits, using BT-Settl model grids, provide an upper limit for the spin-rate of PDS 70b ($<$29 km s$^{-1}$). The atmospheric retrievals constrain the PDS 70b C/O ratio to ${0.28}^{+0.20}_{-0.12}$ ($<$0.63 under 95$\%$ confidence level) and a metallicity [C/H] of ${-0.2}^{+0.8}_{-0.5}$ dex, consistent with that of its host star. The following scenarios can explain our measured C/O of PDS 70b in contrast with that of the gas-rich outer disk (for which C/O $\gtrsim$ 1). First, the bulk composition of PDS 70b might be dominated by dust+ice aggregates rather than disk gas. Another possible explanation is that the disk became carbon-enriched $\textit{after}$ PDS 70b was formed, as predicted in models of disk chemical evolution and as observed in both very low mass star and older disk systems with $\textit{JWST}$/MIRI. Because PDS 70b continues to accrete and its chemical evolution is not yet complete, more sophisticated modeling of the planet and the disk, and higher quality observations of PDS 70b (and possibly PDS 70c), are necessary to validate these scenarios.
Comment: Accepted to ApJ Letters; 15 pages, 3 figures
- Published
- 2024
12. Ultra-High-Efficiency Dual-Band Thin-Film Lithium Niobate Modulator Incorporating Low-k Underfill with 220 GHz Extrapolated Bandwidth for 390 Gbit/s PAM8 Transmission
- Author
Liu, Hao, He, Yutong, Xiong, Bing, Sun, Changzheng, Hao, Zhibiao, Wang, Lai, Wang, Jian, Han, Yanjun, Li, Hongtao, Gan, Lin, and Luo, Yi
- Subjects
Physics - Optics, Physics - Applied Physics
- Abstract
High-performance electro-optic modulators play a critical role in modern telecommunication networks and intra-datacenter interconnects. Low driving voltage, large electro-optic bandwidth, compact device size, and multi-band operation ability are essential for various application scenarios, especially energy-efficient high-speed data transmission. However, it is challenging to meet all these requirements simultaneously. Here, we demonstrate a high-performance dual-band thin-film lithium niobate electro-optic modulator with low-k underfill to achieve overall performance improvement. The low-k material helps reduce the RF loss of the modulator and achieve perfect velocity matching with narrow electrode gap to overcome the voltage-bandwidth limitation, extending electro-optic bandwidth and enhancing modulation efficiency simultaneously. The fabricated 7-mm-long modulator exhibits a low half-wave voltage of 1.9 V at C-band and 1.54 V at O-band, featuring a low half-wave voltage-length product of 1.33 V*cm and 1.08 V*cm, respectively. Meanwhile, the novel design yields an ultra-wide extrapolated 3 dB bandwidth of 220 GHz (218 GHz) in the C-band (O-band). High-speed data transmission in both C- and O-bands using the same device has been demonstrated for the first time by PAM8 with data rates up to 390 Gbit/s, corresponding to a record-low energy consumption of 0.69 fJ/bit for next-generation cost-effective ultra-high-speed optical communications.
- Published
- 2024
13. Benchmarking the Robustness of Optical Flow Estimation to Corruptions
- Author
Yi, Zhonghua, Shi, Hao, Jiang, Qi, Gao, Yao, Wang, Ze, Zhang, Yufan, Yang, Kailun, and Wang, Kaiwei
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
- Abstract
Optical flow estimation is extensively used in autonomous driving and video editing. While existing models demonstrate state-of-the-art performance across various benchmarks, the robustness of these methods has been infrequently investigated. Despite some research focusing on the robustness of optical flow models against adversarial attacks, there has been a lack of studies investigating their robustness to common corruptions. Taking into account the unique temporal characteristics of optical flow, we introduce 7 temporal corruptions specifically designed for benchmarking the robustness of optical flow models, in addition to 17 classical single-image corruptions, in which an advanced PSF Blur simulation method is performed. Two robustness benchmarks, KITTI-FC and GoPro-FC, are subsequently established as the first corruption robustness benchmarks for optical flow estimation, with Out-Of-Domain (OOD) and In-Domain (ID) settings to facilitate comprehensive studies. Robustness metrics, Corruption Robustness Error (CRE), Corruption Robustness Error ratio (CREr), and Relative Corruption Robustness Error (RCRE) are further introduced to quantify the optical flow estimation robustness. 29 model variants from 15 optical flow methods are evaluated, yielding 10 intriguing observations, such as 1) the absolute robustness of the model is heavily dependent on the estimation performance; 2) the corruptions that diminish local information are more serious than those that reduce visual effects. We also give suggestions for the design and application of optical flow models. We anticipate that our benchmark will serve as a foundational resource for advancing research in robust optical flow estimation. The benchmarks and source code will be released at https://github.com/ZhonghuaYi/optical_flow_robustness_benchmark.
Comment: The benchmarks and source code will be released at https://github.com/ZhonghuaYi/optical_flow_robustness_benchmark
- Published
- 2024
14. Continual SFT Matches Multimodal RLHF with Negative Supervision
- Author
Zhu, Ke, Wang, Yu, Sun, Yanpeng, Chen, Qiang, Liu, Jiangjiang, Zhang, Gang, and Wang, Jingdong
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Multimodal RLHF usually happens after the supervised finetuning (SFT) stage to continually improve vision-language models' (VLMs) comprehension. Conventional wisdom holds its superiority over continual SFT during this preference alignment stage. In this paper, we observe that the inherent value of multimodal RLHF lies in its negative supervision, the logit of the rejected responses. We thus propose a novel negative supervised finetuning (nSFT) approach that fully exploits this information. Our nSFT disentangles this negative supervision in the RLHF paradigm, and continually aligns VLMs with a simple SFT loss. This is more memory efficient than multimodal RLHF, where 2 (e.g., DPO) or 4 (e.g., PPO) large VLMs are strictly required. The effectiveness of nSFT is rigorously proved by comparing it with various multimodal RLHF approaches, across different dataset sources, base VLMs and evaluation metrics. Besides, extensive ablations are provided to support our hypothesis. We hope this paper will stimulate further research to properly align large vision language models.
- Published
- 2024
15. TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior
- Author
Yang, Sen, Jiang, Minyue, Fan, Ziwei, Xie, Xiaolu, Tan, Xiao, Li, Yingying, Ding, Errui, Wang, Liang, and Wang, Jingdong
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
- Abstract
Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.
Comment: 17 pages, 7 figures, and 7 tables
- Published
- 2024
16. Point Cloud Understanding via Attention-Driven Contrastive Learning
- Author
Wang, Yi, Wang, Jiaze, Guo, Ziyu, Zhang, Renrui, Zhou, Donghao, Chen, Guangyong, Liu, Anfeng, and Heng, Pheng-Ann
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Recently Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms, however, these methods often overlook latent information in less prominent regions, leading to increased sensitivity to perturbations and limited global comprehension. To solve this issue, we introduce PointACL, an attention-driven contrastive learning framework designed to address these limitations. Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions, enhancing the understanding of global structures within the point cloud. Then we combine the original pre-training loss with a contrastive learning loss, improving feature discrimination and generalization. Extensive experiments validate the effectiveness of PointACL, as it achieves state-of-the-art performance across a variety of 3D understanding tasks, including object classification, part segmentation, and few-shot learning. Specifically, when integrated with different Transformer backbones like Point-MAE and PointGPT, PointACL demonstrates improved performance on datasets such as ScanObjectNN, ModelNet40, and ShapeNetPart. This highlights its superior capability in capturing both global and local features, as well as its enhanced robustness against perturbations and incomplete data.
- Published
- 2024
17. Robust Mutual Fund Selection with False Discovery Rate Control
- Author
Wang, Hongfei, Feng, Long, Zhao, Ping, and Wang, Zhaojun
- Subjects
Statistics - Methodology
- Abstract
In this article, we address the challenge of identifying skilled mutual funds among a large pool of candidates, utilizing the linear factor pricing model. Assuming observable factors with a weak correlation structure for the idiosyncratic error, we propose a spatial-sign based multiple testing procedure (SS-BH). When latent factors are present, we first extract them using the elliptical principal component method (He et al. 2022) and then propose a factor-adjusted spatial-sign based multiple testing procedure (FSS-BH). Simulation studies demonstrate that our proposed FSS-BH procedure performs exceptionally well across various applications and exhibits robustness to variations in the covariance structure and the distribution of the error term. Additionally, real data application further highlights the superiority of the FSS-BH procedure.
- Published
- 2024
18. How do imperfections cause asymmetry in elastic snap-through?
- Author
Giudici, Andrea, Huang, Weicheng, Wang, Qiong, Wang, Yuzhe, Liu, Mingchao, Tawfick, Sameh, and Vella, Dominic
- Subjects
Condensed Matter - Soft Condensed Matter
- Abstract
A symmetrically-buckled arch whose boundaries are clamped at an angle has two stable equilibria: an inverted and a natural state. When the distance between the clamps is increased (i.e. the confinement is decreased) the system snaps from the inverted to the natural state. Depending on the rate at which the confinement is decreased ('unloading'), the symmetry of the system during snap-through may change: slow unloading results in snap-through occurring asymmetrically, while fast unloading results in a symmetric snap-through. It has recently been shown [Wang et al., Phys. Rev. Lett. 132, 267201 (2024)] that the transient asymmetry at slow unloading rates is the result of the amplification of small asymmetric precursor oscillations (shape perturbations) introduced dynamically to the system, even when the system itself is perfectly symmetric. In reality, however, imperfections, such as small asymmetries in the boundary conditions, are present too. Using numerical simulations and a simple toy model, we discuss the relative importance of intrinsic imperfections and initial asymmetric shape perturbations in determining the transient asymmetry observed. We show that, for small initial perturbations, the magnitude of the asymmetry grows in proportion to the size of the intrinsic imperfection but that, when initial shape perturbations are large, intrinsic imperfections are unimportant - the asymmetry of the system is dominated by the transient amplification of the initial asymmetric shape perturbations. We also show that the dominant origin of asymmetry changes the way that asymmetry grows dynamically. Our results may guide engineering and design of snapping beams used to control insect-sized jumping robots.
- Published
- 2024
19. Separable Mixture of Low-Rank Adaptation for Continual Visual Instruction Tuning
- Author
Wang, Ziqi, Che, Chang, Wang, Qi, Li, Yangyang, Shi, Zenglin, and Wang, Meng
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Visual instruction tuning (VIT) enables multimodal large language models (MLLMs) to effectively handle a wide range of vision tasks by framing them as language-based instructions. Building on this, continual visual instruction tuning (CVIT) extends the capability of MLLMs to incrementally learn new tasks, accommodating evolving functionalities. While prior work has advanced CVIT through the development of new benchmarks and approaches to mitigate catastrophic forgetting, these efforts largely follow traditional continual learning paradigms, neglecting the unique challenges specific to CVIT. We identify a dual form of catastrophic forgetting in CVIT, where MLLMs not only forget previously learned visual understanding but also experience a decline in instruction following abilities as they acquire new tasks. To address this, we introduce the Separable Mixture of Low-Rank Adaptation (SMoLoRA) framework, which employs separable routing through two distinct modules - one for visual understanding and another for instruction following. This dual-routing design enables specialized adaptation in both domains, preventing forgetting while improving performance. Furthermore, we propose a novel CVIT benchmark that goes beyond existing benchmarks by additionally evaluating a model's ability to generalize to unseen tasks and handle diverse instructions across various tasks. Extensive experiments demonstrate that SMoLoRA outperforms existing methods in mitigating dual forgetting, improving generalization to unseen tasks, and ensuring robustness in following diverse instructions.
- Published
- 2024
20. Robust Data-Driven Predictive Control for Mixed Platoons under Noise and Attacks
- Author
Li, Shuai, Chen, Chaoyi, Zheng, Haotian, Wang, Jiawei, Xu, Qing, Wang, Jianqiang, and Li, Keqiang
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
Controlling mixed platoons, which consist of both connected and automated vehicles (CAVs) and human-driven vehicles (HDVs), poses significant challenges due to the uncertain and unknown human driving behaviors. Data-driven control methods offer promising solutions by leveraging available trajectory data, but their performance can be compromised by process noise and adversarial attacks. To address this issue, this paper proposes a Robust Data-EnablEd Predictive Leading Cruise Control (RDeeP-LCC) framework based on data-driven reachability analysis. The framework over-approximates system dynamics under noise and attack using a matrix zonotope set derived from data, and develops a stabilizing feedback control law. By decoupling the mixed platoon system into nominal and error components, we employ data-driven reachability sets to recursively compute error reachable sets that account for noise and attacks, and obtain tightened safety constraints of the nominal system. This leads to a robust data-driven predictive control framework, solved in a tube-based control manner. Numerical simulations and human-in-the-loop experiments validate that the RDeeP-LCC method significantly enhances the robustness of mixed platoons, improving mixed traffic stability and safety against practical noise and attacks.
Comment: 16 pages, 7 figures
- Published
- 2024
21. ALKPU: an active learning method for the DeePMD model with Kalman filter
- Author
-
Li, Haibo, Wu, Xingxing, Liu, Liping, Wang, Lin-Wang, Wang, Long, Tan, Guangming, and Jia, Weile
- Subjects
Physics - Computational Physics
- Abstract
Neural network force field models such as DeePMD have enabled highly efficient large-scale molecular dynamics simulations with ab initio accuracy. However, building such models depends heavily on training data obtained by costly electronic structure calculations, so it is crucial to carefully select and label the most representative configurations during model training to improve both extrapolation capability and training efficiency. To address this challenge, based on Kalman filter theory we propose the Kalman Prediction Uncertainty (KPU) to quantify the uncertainty of the model's prediction. With KPU we design the Active Learning by KPU (ALKPU) method, which can efficiently select representative configurations that should be labelled during model training. We prove that ALKPU locally leads to the fastest reduction of the model's uncertainty, which reveals its rationality as a general active learning method. We test the ALKPU method using various physical system simulations and demonstrate that it can efficiently cover the system's configuration space. Our work demonstrates the benefits of ALKPU as a novel active learning method, enhancing training efficiency and reducing computational resource demands.
- Published
- 2024
22. HotSpot: Screened Poisson Equation for Signed Distance Function Optimization
- Author
-
Wang, Zimo, Wang, Cheng, Yoshino, Taiki, Tao, Sirui, Fu, Ziyang, and Li, Tzu-Mao
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
We propose a method, HotSpot, for optimizing neural signed distance functions, based on a relation between the solution of a screened Poisson equation and the distance function. Existing losses such as the eikonal loss cannot guarantee the recovered implicit function to be a distance function, even when the implicit function satisfies the eikonal equation almost everywhere. Furthermore, the eikonal loss suffers from stability issues in optimization, and the remedies that introduce area or divergence minimization can lead to oversmoothing. We address these challenges by designing a loss function that, when minimized, can converge to the true distance function, is stable, and naturally penalizes large surface area. We provide theoretical analysis and experiments on both challenging 2D and 3D datasets and show that our method provides better surface reconstruction and more accurate distance approximation.
- Published
- 2024
23. Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
- Author
-
Yang, Jiange, Zhu, Haoyi, Wang, Yating, Wu, Gangshan, He, Tong, and Wang, Limin
- Subjects
Computer Science - Robotics
- Abstract
Learning from multiple domains is a primary factor that influences the generalization of a single unified robot system. In this paper, we aim to learn the trajectory prediction model by using broad out-of-domain data to improve its performance and generalization ability. The trajectory model is designed to predict any-point trajectories in the current frame given an instruction and can provide detailed control guidance for robotic policy learning. To handle the diverse out-of-domain data distribution, we propose a sparsely-gated MoE (Top-1 gating strategy) architecture for the trajectory model, coined as Tra-MoE. The sparse activation design enables a good balance between parameter cooperation and specialization, effectively benefiting from large-scale out-of-domain data while maintaining constant FLOPs per token. In addition, we further introduce an adaptive policy conditioning technique by learning 2D mask representations for predicted trajectories, which is explicitly aligned with image observations to guide action prediction more flexibly. We perform extensive experiments on both simulation and real-world scenarios to verify the effectiveness of Tra-MoE and the adaptive policy conditioning technique. We also conduct a comprehensive empirical study to train Tra-MoE, demonstrating that our Tra-MoE consistently exhibits superior performance compared to the dense baseline model, even when the latter is scaled to match Tra-MoE's parameter count. Comment: 15 pages, 5 figures
- Published
- 2024
24. Global Challenge for Safe and Secure LLMs Track 1
- Author
-
Jia, Xiaojun, Huang, Yihao, Liu, Yang, Tan, Peng Yan, Yau, Weng Kuan, Mak, Mun-Thye, Sim, Xin Ming, Ng, Wee Siong, Ng, See Kiong, Liu, Hanqing, Zhou, Lifeng, Yan, Huanqian, Sun, Xiaobing, Liu, Wei, Wang, Long, Qian, Yiming, Liu, Yong, Yang, Junxiao, Zhang, Zhexin, Lei, Leqi, Chen, Renmiao, Lu, Yida, Cui, Shiyao, Wang, Zizhou, Li, Shaohua, Wang, Yan, Goh, Rick Siow Mong, Zhen, Liangli, Zhang, Yingjie, and Zhao, Zhe
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks. With the increasing integration of LLMs in critical sectors such as healthcare, finance, and public administration, ensuring these models are resilient to adversarial attacks is vital for preventing misuse and upholding ethical standards. This competition focused on two distinct tracks designed to evaluate and enhance the robustness of LLM security frameworks. Track 1 tasked participants with developing automated methods to probe LLM vulnerabilities by eliciting undesirable responses, effectively testing the limits of existing safety protocols within LLMs. Participants were challenged to devise techniques that could bypass content safeguards across a diverse array of scenarios, from offensive language to misinformation and illegal activities. Through this process, Track 1 aimed to deepen the understanding of LLM vulnerabilities and provide insights for creating more resilient models.
- Published
- 2024
25. Night-to-Day Translation via Illumination Degradation Disentanglement
- Author
-
Lan, Guanzhou, Yang, Yuqi, Wang, Zhigang, Wang, Dong, Zhao, Bin, and Li, Xuelong
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Night-to-Day translation (Night2Day) aims to achieve day-like vision for nighttime scenes. However, processing night images with complex degradations remains a significant challenge under unpaired conditions. Previous methods that uniformly mitigate these degradations have proven inadequate in simultaneously restoring daytime domain information and preserving underlying semantics. In this paper, we propose N2D3 (Night-to-Day via Degradation Disentanglement) to identify different degradation patterns in nighttime images. Specifically, our method comprises a degradation disentanglement module and a degradation-aware contrastive learning module. Firstly, we extract physical priors from a photometric model based on Kubelka-Munk theory. Then, guided by these physical priors, we design a disentanglement module to discriminate among different illumination degradation regions. Finally, we introduce the degradation-aware contrastive learning strategy to preserve semantic consistency across distinct degradation regions. Our method is evaluated on two public datasets, demonstrating a significant improvement in visual quality and considerable potential for benefiting downstream tasks. Comment: 8 pages
- Published
- 2024
26. Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
- Author
-
Zhao, Yu, Yin, Huifeng, Zeng, Bo, Wang, Hao, Shi, Tianqi, Lyu, Chenyang, Wang, Longyue, Luo, Weihua, and Zhang, Kaifu
- Subjects
Computer Science - Computation and Language
- Abstract
Currently OpenAI o1 has sparked a surge of interest in the study of large reasoning models (LRM). Building on this momentum, Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding -- which are well-suited for reinforcement learning (RL) -- but also places greater emphasis on open-ended resolutions. We aim to address the question: "Can the o1 model effectively generalize to broader domains where clear standards are absent and rewards are challenging to quantify?" Marco-o1 is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies -- optimized for complex real-world problem-solving tasks.
- Published
- 2024
27. CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields
- Author
-
Liu, Xin-Yang, Parikh, Meet Hemant, Fan, Xiantao, Du, Pan, Wang, Qing, Chen, Yi-Fan, and Wang, Jian-Xun
- Subjects
Physics - Fluid Dynamics, Computer Science - Machine Learning
- Abstract
Eddy-resolving turbulence simulations require stochastic inflow conditions that accurately replicate the complex, multi-scale structures of turbulence. Traditional recycling-based methods rely on computationally expensive precursor simulations, while existing synthetic inflow generators often fail to reproduce realistic coherent structures of turbulence. Recent advances in deep learning (DL) have opened new possibilities for inflow turbulence generation, yet many DL-based methods rely on deterministic, autoregressive frameworks prone to error accumulation, resulting in poor robustness for long-term predictions. In this work, we present CoNFiLD-inlet, a novel DL-based inflow turbulence generator that integrates diffusion models with a conditional neural field (CNF)-encoded latent space to produce realistic, stochastic inflow turbulence. By parameterizing inflow conditions using Reynolds numbers, CoNFiLD-inlet generalizes effectively across a wide range of Reynolds numbers ($Re_\tau$ between $10^3$ and $10^4$) without requiring retraining or parameter tuning. Comprehensive validation through a priori and a posteriori tests in Direct Numerical Simulation (DNS) and Wall-Modeled Large Eddy Simulation (WMLES) demonstrates its high fidelity, robustness, and scalability, positioning it as an efficient and versatile solution for inflow turbulence synthesis. Comment: 27 pages, 10 figures
- Published
- 2024
28. Measurement of two-neutrino double electron capture half-life of $^{124}$Xe with PandaX-4T
- Author
-
PandaX Collaboration, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Tao, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zeng, Xinning, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
Nuclear Experiment
- Abstract
Detailed studies of two-neutrino double electron capture (2$\nu$DEC) are a crucial step towards searching for the neutrino-less mode to explore the Majorana nature of neutrinos. We have precisely measured the half-life of the 2$\nu$DEC process in $^{124}$Xe, utilizing a total exposure of 1.73 tonne$\cdot$year from the commissioning run and the first science run of the PandaX-4T experiment. A time-dependent background model in the $\mathcal{O}(10\,\mathrm{keV})$ energy range is constructed for the first time in PandaX-4T data. With an unbinned maximum likelihood fit, we determine the half-life of the 2$\nu$DEC process to be $(1.03\pm0.15_{\rm stat}\pm0.06_{\rm sys})\times 10^{22}\,$yr. Furthermore, we have evaluated the branching ratio for both electrons captured from the $K$ shell ($KK$) to be $(65\pm5)\%$, which aligns with the $^{124}$Xe nuclear model calculations within $1.5\,\sigma$. Comment: 18 pages, 5 figures, 3 tables
- Published
- 2024
29. Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body
- Author
-
Wang, Zeqing, Ma, Qingyang, Wan, Wentao, Li, Haojie, Wang, Keze, and Tian, Yonghong
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Recent improvements in visual synthesis have significantly enhanced the depiction of generated human photos, which are pivotal due to their wide applicability and demand. Nonetheless, the existing text-to-image or text-to-video models often generate low-quality human photos that might differ considerably from real-world body structures, referred to as "abnormal human bodies". Such abnormalities, typically deemed unacceptable, pose considerable challenges for their detection and repair within human photos. These challenges require precise abnormality recognition capabilities, which entail pinpointing both the location and the abnormality type. Intuitively, Visual Language Models (VLMs) that have obtained remarkable performance on various visual tasks are quite suitable for this task. However, their performance on abnormality detection in human photos is quite poor. Hence, it is quite important to highlight this task for the research community. In this paper, we first introduce a simple yet challenging task, i.e., Fine-grained Human-body Abnormality Detection (FHAD), and construct two high-quality datasets for evaluation. Then, we propose a meticulous framework, named HumanCalibrator, which identifies and repairs abnormalities in human body structures while preserving the other content. Experiments indicate that our HumanCalibrator achieves high accuracy in abnormality detection and accomplishes an increase in visual comparisons while preserving the other visual content. Comment: 16 pages, 14 figures
- Published
- 2024
30. CompetitorFormer: Competitor Transformer for 3D Instance Segmentation
- Author
-
Wang, Duanchu, Liu, Jing, Gong, Haoran, Quan, Yinghui, and Wang, Di
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Transformer-based methods have become the dominant approach for 3D instance segmentation. These methods predict instance masks via instance queries, ranking them by classification confidence and IoU scores to select the top prediction as the final outcome. However, it has been observed that the current models employ a fixed and higher number of queries than the instances present within a scene. In such instances, multiple queries predict the same instance, yet only a single query is ultimately optimized. The close scores of queries in the lower-level decoders make it challenging for the dominant query to distinguish itself rapidly, which ultimately impairs the model's accuracy and convergence efficiency. This phenomenon is referred to as inter-query competition. To address this challenge, we put forth a series of plug-and-play competition-oriented designs, collectively designated as the CompetitorFormer, with the aim of reducing competition and facilitating a dominant query. Experiments showed that integrating our designs with state-of-the-art frameworks consistently resulted in significant performance improvements in 3D instance segmentation across a range of datasets.
- Published
- 2024
31. Determination of cosmic curvature independent of the sound horizon and $H_0$ using BOSS/eBOSS and DESI DR1 BAO observations
- Author
-
Liu, Tonghua, Wang, Shengjia, Wu, Hengyu, and Wang, Jieci
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics
- Abstract
We present an improved model-independent method for determining the cosmic curvature using the observations of Baryon Acoustic Oscillations (BAOs) and the Hubble parameter. The purpose of this work is to provide insights into late-universe curvature measurements using available observational data and techniques. Thus, we use two sources of BAO data sets, BOSS/eBOSS and the latest DESI DR1, and two reconstruction methods, Gaussian process (GP) and artificial neural network (ANN). It is important to highlight that our method circumvents the influence of the sound horizon in BAO observations and of the Hubble constant. Combining BAO data from BOSS/eBOSS plus DESI DR1, we find that the constraint on the cosmic curvature results in $\Omega_K=-0.040^{+0.142}_{-0.145}$ with an observational uncertainty of $1\sigma$ in the framework of the GP method. This result changes to $\Omega_K=-0.010^{+0.405}_{-0.424}$ when the ANN method is applied. In a further comparative analysis of samples from the two BAO data sources, we find that there is almost no difference between the two samples. Although the curvature values obtained from the data samples using DESI DR1 are slightly positive and those from the samples using BOSS/eBOSS are slightly negative, both results indicate that our universe has a flat spatial curvature within uncertainties, and the precision of constraining the curvature with the two BAO samples is almost equal. Comment: 8 pages, 3 figures, comments are welcome
- Published
- 2024
32. Compact Visual Data Representation for Green Multimedia -- A Human Visual System Perspective
- Author
-
Chen, Peilin, Fang, Xiaohan, Wang, Meng, Wang, Shiqi, and Ma, Siwei
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Multimedia
- Abstract
The Human Visual System (HVS), with its intricate sophistication, is capable of achieving ultra-compact information compression for visual signals. This remarkable ability is coupled with high generalization capability and energy efficiency. By contrast, the state-of-the-art Versatile Video Coding (VVC) standard achieves a compression ratio of around 1,000 times for raw visual data. This notable disparity motivates the research community to draw inspiration to effectively handle the immense volume of visual data in a green way. Therefore, this paper provides a survey of how visual data can be efficiently represented for green multimedia, in particular when the ultimate task is knowledge extraction instead of visual signal reconstruction. We introduce recent research efforts that promote green, sustainable, and efficient multimedia in this field. Moreover, we discuss how the deep understanding of the HVS can benefit the research community, and envision the development of future green multimedia technologies.
- Published
- 2024
33. Learning from 'Silly' Questions Improves Large Language Models, But Only Slightly
- Author
-
Zhu, Tingyuan, Liu, Shudong, Wang, Yidong, Wong, Derek F., Yu, Han, Shinozaki, Takahiro, and Wang, Jindong
- Subjects
Computer Science - Computation and Language
- Abstract
Constructing high-quality Supervised Fine-Tuning (SFT) datasets is critical for the training of large language models (LLMs). Recent studies have shown that using data from a specific source, Ruozhiba, a Chinese website where users ask "silly" questions to better understand certain topics, can lead to better fine-tuning performance. This paper aims to explore some hidden factors: the potential interpretations of its success and a large-scale evaluation of the performance. First, we leverage GPT-4 to analyze the successful cases of Ruozhiba questions from the perspective of education, psychology, and cognitive science, deriving a set of explanatory rules. Then, we construct fine-tuning datasets by applying these rules to the MMLU training set. Surprisingly, our results indicate that rules can significantly improve model performance in certain tasks, while potentially diminishing performance on others. For example, SFT data generated following the "Counterintuitive Thinking" rule can achieve approximately a 5% improvement on the "Global Facts" task, whereas the "Blurring the Conceptual Boundaries" rule leads to a performance drop of 6.14% on the "Econometrics" task. In addition, for specific tasks, different rules tend to have a consistent impact on model performance. This suggests that the differences between the extracted rules are not as significant, and the effectiveness of the rules is relatively consistent across tasks. Our research highlights the importance of considering task diversity and rule applicability when constructing SFT datasets to achieve more comprehensive performance improvements. Comment: 27 pages, 14 figures
- Published
- 2024
34. Interpretable QSPR Modeling using Recursive Feature Machines and Multi-scale Fingerprints
- Author
-
Shen, Jiaxuan, Zhang, Haitao, Wang, Yunjie, Wang, Yilong, Tao, Song, Qiu, Bo, and Shyh-Chang, Ng
- Subjects
Quantitative Biology - Biomolecules
- Abstract
This study pioneers the application of Recursive Feature Machines (RFM) in QSPR modeling, introducing a tailored feature importance analysis approach to enhance interpretability. By leveraging deep feature learning through AGOP, RFM achieves state-of-the-art (SOTA) results in predicting molecular properties, as demonstrated through solubility prediction across nine benchmark datasets. To capture a wide array of structural information, we employ diverse molecular representations, including MACCS keys, Morgan fingerprints, and a custom multi-scale hybrid fingerprint (HF) derived from global descriptors and SMILES local fragmentation techniques. Notably, the HF offers significant advantages over MACCS and Morgan fingerprints in revealing structural determinants of molecular properties. The feature importance analysis in RFM provides robust local and global explanations, effectively identifying structural features that drive molecular behavior and offering valuable insights for drug development. Additionally, RFM demonstrates strong redundancy-filtering abilities, as model performance remains stable even after removing redundant features within custom fingerprints. Importantly, RFM introduces the deep feature learning capabilities of the average gradient outer product (AGOP) matrix into ultra-fast kernel machine learning, to imbue kernel machines with interpretable deep feature learning capabilities. We extend this approach beyond the Laplace Kernel to the Matern, Rational Quadratic, and Gaussian kernels, to find that the Matern and Laplace kernels deliver the best performance, thus reinforcing the flexibility and effectiveness of AGOP in RFM. Experimental results show that RFM-HF surpasses both traditional machine learning models and advanced graph neural networks.
- Published
- 2024
35. Rabi oscillation and fractional population via the bound states in the continuum in a giant atom waveguide QED setup
- Author
-
Yu, Hongwei, Zhang, Xiaojun, Wang, Zhihai, and Wang, Jin
- Subjects
Quantum Physics
- Abstract
We study the dynamics of two giant atoms interacting with a coupled resonator waveguide (CRW) beyond the Markovian approximation. The distinct atomic configurations determine the number of bound states in the continuum (BIC), leading to different dynamical behaviors. Our results show that when the system supports two BICs, Rabi oscillations dominate the dynamics, whereas fractional population dynamics emerge in the presence of a single BIC. The connection between these dynamics and the existence of BICs is further verified by analyzing the photonic distribution in the CRW during time evolution. These findings challenge the conventional notion that the environment always induces dissipation and decoherence. Instead, the bound states in the CRW-emitters coupled system can suppress complete dissipation of the emitters. This work offers an effective approach for controlling dissipative dynamics in open quantum systems. Comment: 10 pages, 4 figures, all comments are welcome
- Published
- 2024
36. Measurement of the inclusive branching fractions for $B_s^0$ decays into $D$ mesons via hadronic tagging
- Author
-
Belle and Belle II Collaborations, Adachi, I., Aggarwal, L., Ahmed, H., Aihara, H., Akopov, N., Aloisio, A., Said, S. Al, Althubiti, N., Ky, N. Anh, Asner, D. M., Atmacan, H., Aushev, T., Aushev, V., Aversano, M., Ayad, R., Babu, V., Bae, H., Baghel, N. K., Bahinipati, S., Bambade, P., Banerjee, Sw., Bansal, S., Barrett, M., Bartl, M., Baudot, J., Baur, A., Beaubien, A., Becherer, F., Becker, J., Belous, K., Bennett, J. V., Bernlochner, F. U., Bertacchi, V., Bertemes, M., Bertholet, E., Bessner, M., Bettarini, S., Bhardwaj, V., Bhuyan, B., Bianchi, F., Bierwirth, L., Bilka, T., Biswas, D., Bobrov, A., Bodrov, D., Bolz, A., Bondar, A., Borah, J., Boschetti, A., Bozek, A., Bračko, M., Branchini, P., Briere, R. A., Browder, T. E., Budano, A., Bussino, S., Campagna, Q., Campajola, M., Cao, L., Casarosa, G., Cecchi, C., Cerasoli, J., Chang, M. -C., Chang, P., Cheaib, R., Cheema, P., Cheon, B. G., Chilikin, K., Chirapatpimol, K., Cho, H. -E., Cho, K., Cho, S. -J., Choi, S. -K., Choudhury, S., Cochran, J., Corona, L., Cui, J. X., Dattola, F., De La Cruz-Burelo, E., De La Motte, S. A., De Nardo, G., De Nuccio, M., De Pietro, G., de Sangro, R., Destefanis, M., Dey, S., Dhamija, R., Di Canto, A., Di Capua, F., Dingfelder, J., Doležal, Z., Jiménez, I. Domínguez, Dong, T. V., Dorner, D., Dort, K., Dossett, D., Dreyer, S., Dubey, S., Dugic, K., Dujany, G., Ecker, P., Eliachevitch, M., Epifanov, D., Feichtinger, P., Ferber, T., Fillinger, T., Finck, C., Finocchiaro, G., Fodor, A., Forti, F., Frey, A., Fulsom, B. G., Gabrielli, A., Ganiev, E., Garcia-Hernandez, M., Garg, R., Gaudino, G., Gaur, V., Gellrich, A., Ghevondyan, G., Ghosh, D., Ghumaryan, H., Giakoustidis, G., Giordano, R., Giri, A., Gironell, P. Gironella, Glazov, A., Gobbo, B., Godang, R., Goldenzweig, P., Graziani, E., Greenwald, D., Gruberová, Z., Gu, T., Guan, Y., Gudkova, K., Haide, I., Halder, S., Han, Y., Hara, T., Harris, C., Hayasaka, K., Hayashii, H., Hazra, S., Hedges, M. T., Heidelbach, A., de la Cruz, I. Heredia, Villanueva, M. Hernández, Higuchi, T., Hoek, M., Hohmann, M., Hoppe, R., Horak, P., Hsu, C. -L., Humair, T., Iijima, T., Inami, K., Ipsita, N., Ishikawa, A., Itoh, R., Iwasaki, M., Jackson, P., Jacobs, W. W., Jang, E. -J., Ji, Q. P., Jia, S., Jin, Y., Johnson, A., Joo, K. K., Junkerkalefeld, H., Kaleta, M., Kalita, D., Kaliyar, A. B., Kandra, J., Kang, K. H., Kang, S., Karyan, G., Kawasaki, T., Keil, F., Ketter, C., Kiesling, C., Kim, C. -H., Kim, D. Y., Kim, J. -Y., Kim, K. -H., Kim, Y. -K., Kim, Y. J., Kindo, H., Kinoshita, K., Kodyš, P., Koga, T., Kohani, S., Kojima, K., Korobov, A., Korpar, S., Kovalenko, E., Križan, P., Krokovny, P., Kuhr, T., Kulii, Y., Kumar, D., Kumar, J., Kumar, M., Kumar, R., Kumara, K., Kunigo, T., Kuzmin, A., Kwon, Y. -J., Lacaprara, S., Lalwani, K., Lam, T., Lanceri, L., Lange, J. S., Lau, T. S., Laurenza, M., Lautenbach, K., Leboucher, R., Diberder, F. R. Le, Lee, M. J., Lemettais, C., Leo, P., Levit, D., Lewis, P. M., Li, L. K., Li, Q. M., Li, S. X., Li, W. Z., Li, Y., Li, Y. B., Liao, Y. P., Libby, J., Lin, J., Liptak, Z., Liu, M. H., Liu, Q. Y., Liu, Y., Liu, Z. Q., Liventsev, D., Longo, S., Lueck, T., Lyu, C., Ma, Y., Madaan, C., Maggiora, M., Maharana, S. P., Maiti, R., Maity, S., Mancinelli, G., Manfredi, R., Manoni, E., Mantovano, M., Marcantonio, D., Marcello, S., Marinas, C., Martellini, C., Martens, A., Martini, A., Martinov, T., Massaccesi, L., Masuda, M., Matvienko, D., Maurya, S. K., Maushart, M., McKenna, J. A., Meier, F., Merola, M., Metzner, F., Miller, C., Mirra, M., Mitra, S., Miyabayashi, K., Mizuk, R., Mohanty, G. B., Mondal, S., Moneta, S., Moser, H. -G., Mrvar, M., Mussa, R., Nakamura, I., Nakao, M., Nakazawa, Y., Naruki, M., Natkaniec, Z., Natochii, A., Nayak, M., Nazaryan, G., Neu, M., Niebuhr, C., Niiyama, M., Nishida, S., Ogawa, S., Onishchuk, Y., Ono, H., Onuki, Y., Otani, F., Pakhlov, P., Pakhlova, G., Paoloni, E., Pardi, S., Parham, K., Park, H., Park, J., Park, K., Park, S. -H., Paschen, B., Passeri, A., Patra, S., Paul, S., Pedlar, T. K., Peschke, R., Pestotnik, R., Piccolo, M., Piilonen, L. E., Angioni, G. Pinna, Podesta-Lerma, P. L. M., Podobnik, T., Pokharel, S., Praz, C., Prell, S., Prencipe, E., Prim, M. T., Prudiiev, I., Purwar, H., Rados, P., Raeuber, G., Raiz, S., Rauls, N., Ravindran, K., Rehman, J. U., Reif, M., Reiter, S., Remnev, M., Reuter, L., Herrmann, D. Ricalde, Ripp-Baudot, I., Rizzo, G., Roehrken, M., Roney, J. M., Rostomyan, A., Rout, N., Sanders, D. A., Sandilya, S., Santelj, L., Sato, Y., Savinov, V., Scavino, B., Schmitt, C., Schneider, S., Schnell, G., Schnepf, M., Schwanda, C., Schwartz, A. J., Seino, Y., Selce, A., Senyo, K., Serrano, J., Sevior, M. E., Sfienti, C., Shan, W., Sharma, C., Shen, C. P., Shi, X. D., Shillington, T., Shimasaki, T., Shiu, J. -G., Shtol, D., Sibidanov, A., Simon, F., Singh, J. B., Skorupa, J., Sobotzik, M., Soffer, A., Sokolov, A., Solovieva, E., Song, W., Spataro, S., Spruck, B., Starič, M., Stavroulakis, P., Stefkova, S., Stroili, R., Strube, J., Sue, Y., Sumihama, M., Sumisawa, K., Sutcliffe, W., Suwonjandee, N., Svidras, H., Takahashi, M., Takizawa, M., Tamponi, U., Tanaka, S., Tanida, K., Tenchini, F., Thaller, A., Tittel, O., Tiwary, R., Torassa, E., Trabelsi, K., Tsaklidis, I., Ueda, I., Uglov, T., Unger, K., Unno, Y., Uno, K., Uno, S., Urquijo, P., Ushiroda, Y., Vahsen, S. E., van Tonder, R., Varvell, K. E., Veronesi, M., Vinokurova, A., Vismaya, V. S., Vitale, L., Vobbilisetti, V., Volpe, R., Vossen, A., Wach, B., Wakai, M., Wallner, S., Wang, B., Wang, E., Wang, M. -Z., Wang, X. L., Wang, Z., Warburton, A., Watanabe, M., Watanuki, S., Wessel, C., Wiechczynski, J., Won, E., Xu, X. P., Yabsley, B. D., Yamada, S., Yang, S. B., Yasaveev, M., Yelton, J., Yin, J. H., Yook, Y. M., Yoshihara, K., Yuan, C. Z., Yuan, J., Yusa, Y., Zani, L., Zeng, F., Zhang, B., Zhilich, V., Zhou, J. S., Zhou, Q. D., Zhukova, V. I., and Žlebčík, R.
- Subjects
High Energy Physics - Experiment
- Abstract
We report measurements of the absolute branching fractions $\mathcal{B}(B_s^0 \to D_s^{\pm} X)$, $\mathcal{B}(B_s^0 \to D^0/\bar{D}^0 X)$, and $\mathcal{B}(B_s^0 \to D^{\pm} X)$, where the latter is measured for the first time. The results are based on a 121.4\,fb$^{-1}$ data sample collected at the $\Upsilon(10860)$ resonance by the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. We reconstruct one $B_s^0$ meson in $e^+e^- \to \Upsilon(10860) \to B_s^{*} \bar{B}_s^{*}$ events and measure yields of $D_s^+$, $D^0$, and $D^+$ mesons in the rest of the event. We obtain $\mathcal{B}(B_s^0 \to D_s^{\pm} X) = (68.6 \pm 7.2 \pm 4.0)\%$, $\mathcal{B}(B_s^0 \to D^0/\bar{D}^0 X) = (21.5 \pm 6.1 \pm 1.8)\%$, and $\mathcal{B}(B_s^0 \to D^{\pm} X) = (12.6 \pm 4.6 \pm 1.3)\%$, where the first uncertainty is statistical and the second is systematic. Averaging with previous Belle measurements gives $\mathcal{B}(B_s^0 \to D_s^{\pm} X) = (63.4 \pm 4.5 \pm 2.2)\%$ and $\mathcal{B}(B_s^0 \to D^0/\bar{D}^0 X) = (23.9 \pm 4.1 \pm 1.8)\%$. For the $B_s^0$ production fraction at the $\Upsilon(10860)$, we find $f_s = (21.4^{+1.5}_{-1.7})\%$. Comment: 23 pages, 9 figures, submitted to JHEP
- Published
- 2024
37. Non-Bloch self-energy of dissipative interacting fermions
- Author
-
Wang, He-Ran, Wang, Zijian, and Wang, Zhong
- Subjects
Quantum Physics, Condensed Matter - Quantum Gases, Condensed Matter - Strongly Correlated Electrons, Physics - Optics
- Abstract
The non-Hermitian skin effect describes the phenomenon of exponential localization of single-particle eigenstates near the boundary of the system. We explore its generalization to the many-body regime by investigating interacting fermions in open quantum systems. Therein, the elementary excitations from the ``vacuum'' (steady state) are given by two types of dissipative quasi-particles composed of single-fermion operators. We perturbatively calculate the self-energy of these quasi-particles in the presence of interactions, and utilize the non-Bloch band theory to develop an exact integral formula, which is further simplified by imposing complex momentum conservation. The formula allows calculating the Liouvillian gap modified by interactions with high precision, as demonstrated by comparison to numerical results. Furthermore, our results show that interactions can even enhance the non-reciprocity of fermion hoppings, contrary to the conventional viewpoint from the Pauli exclusion principle. Our formulation provides a quantitative tool for investigating dissipative interacting fermions with non-Hermitian skin effect, and generalizes the Fermi liquid theory to open quantum systems in the context of diagrammatic perturbation theory. Comment: 7+5 pages, 3+1 figures
- Published
- 2024
38. Robust SG-NeRF: Robust Scene Graph Aided Neural Surface Reconstruction
- Author
-
Gu, Yi, Ye, Dongjun, Wang, Zhaorui, Wang, Jiaxu, Cao, Jiahang, and Xu, Renjing
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Neural surface reconstruction relies heavily on accurate camera poses as input. Despite utilizing advanced pose estimators like COLMAP or ARKit, camera poses can still be noisy. Existing pose-NeRF joint optimization methods handle poses with small noise (inliers) effectively but struggle with large noise (outliers), such as mirrored poses. In this work, we focus on mitigating the impact of outlier poses. Our method integrates an inlier-outlier confidence estimation scheme, leveraging scene graph information gathered during the data preparation phase. Unlike previous works directly using rendering metrics as the reference, we employ a detached color network that omits the viewing direction as input to minimize the impact caused by shape-radiance ambiguities. This enhanced confidence updating strategy effectively differentiates between inlier and outlier poses, allowing us to sample more rays from inlier poses to construct more reliable radiance fields. Additionally, we introduce a re-projection loss based on the current Signed Distance Function (SDF) and pose estimations, strengthening the constraints between matching image pairs. For outlier poses, we adopt a Monte Carlo re-localization method to find better solutions. We also devise a scene graph updating strategy to provide more accurate information throughout the training process. We validate our approach on the SG-NeRF and DTU datasets. Experimental results on various datasets demonstrate that our methods can consistently improve reconstruction quality and pose accuracy. Comment: https://rsg-nerf.github.io/RSG-NeRF/
- Published
- 2024
39. Disentangling Memory and Reasoning Ability in Large Language Models
- Author
-
Jin, Mingyu, Luo, Weidi, Cheng, Sitao, Wang, Xinyi, Hua, Wenyue, Tang, Ruixiang, Wang, William Yang, and Zhang, Yongfeng
- Subjects
Computer Science - Computation and Language
- Abstract
Large Language Models (LLMs) have demonstrated strong performance in handling complex tasks requiring both extensive knowledge and reasoning abilities. However, the existing LLM inference pipeline operates as an opaque process without explicit separation between knowledge retrieval and reasoning steps, making the model's decision-making process unclear and disorganized. This ambiguity can lead to issues such as hallucinations and knowledge forgetting, which significantly impact the reliability of LLMs in high-stakes domains. In this paper, we propose a new inference paradigm that decomposes the complex inference process into two distinct and clear actions: (1) memory recall: which retrieves relevant knowledge, and (2) reasoning: which performs logical steps based on the recalled knowledge. To facilitate this decomposition, we introduce two special tokens memory and reason, guiding the model to distinguish between steps that require knowledge retrieval and those that involve reasoning. Our experiment results show that this decomposition not only improves model performance but also enhances the interpretability of the inference process, enabling users to identify sources of error and refine model responses effectively. The code is available at https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning.
- Published
- 2024
40. VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
- Author
-
Huang, Ziqi, Zhang, Fan, Xu, Xiaojie, He, Yinan, Yu, Jiashuo, Dong, Ziyue, Ma, Qianli, Chanpaisit, Nattapol, Si, Chenyang, Jiang, Yuming, Wang, Yaohui, Chen, Xinyuan, Chen, Ying-Cong, Wang, Limin, Lin, Dahua, Qiao, Yu, and Liu, Ziwei
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video generation has witnessed significant advancements, yet evaluating these models remains a challenge. A comprehensive evaluation benchmark for video generation is indispensable for two reasons: 1) Existing metrics do not fully align with human perceptions; 2) An ideal evaluation system should provide insights to inform future developments of video generation. To this end, we present VBench, a comprehensive benchmark suite that dissects "video generation quality" into specific, hierarchical, and disentangled dimensions, each with tailored prompts and evaluation methods. VBench has several appealing properties: 1) Comprehensive Dimensions: VBench comprises 16 dimensions in video generation (e.g., subject identity inconsistency, motion smoothness, temporal flickering, and spatial relationship). The evaluation metrics with fine-grained levels reveal individual models' strengths and weaknesses. 2) Human Alignment: We also provide a dataset of human preference annotations to validate our benchmarks' alignment with human perception, for each evaluation dimension respectively. 3) Valuable Insights: We look into current models' ability across various evaluation dimensions, and various content types. We also investigate the gaps between video and image generation models. 4) Versatile Benchmarking: VBench++ supports evaluating text-to-video and image-to-video. We introduce a high-quality Image Suite with an adaptive aspect ratio to enable fair evaluations across different image-to-video generation settings. Beyond assessing technical quality, VBench++ evaluates the trustworthiness of video generative models, providing a more holistic view of model performance. 5) Full Open-Sourcing: We fully open-source VBench++ and continually add new video generation models to our leaderboard to drive forward the field of video generation. Comment: Leaderboard: https://huggingface.co/spaces/Vchitect/VBench_Leaderboard Code: https://github.com/Vchitect/VBench Project page: https://vchitect.github.io/VBench-project/ extension of arXiv:2311.17982. arXiv admin note: substantial text overlap with arXiv:2311.17982
- Published
- 2024
41. Evidence of anisotropic three-dimensional weak-localization in TiSe$_{2}$ nanoflakes
- Author
-
Wang, Xiaocui, Yang, Yang, Li, Yongkai, Liu, Guangtong, Duan, Junxi, Wang, Zhiwei, Lu, Li, and Yang, Fan
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
TiSe$_2$ is a typical transition-metal dichalcogenide known for its charge-density wave order. In this study, we report the observation of an unusual anisotropic negative magnetoresistance in exfoliated TiSe$_2$ nanoflakes at low temperatures. Unlike the negative magnetoresistance reported in most other transition-metal dichalcogenides, our results cannot be explained by either the conventional two-dimensional weak localization effect or the Kondo effect. A comprehensive analysis of the data suggests that the observed anisotropic negative magnetoresistance in TiSe$_2$ flakes is most likely caused by the three-dimensional weak localization effect. Our findings contribute to a deeper understanding of the phase-coherent transport processes in TiSe$_2$.
- Published
- 2024
42. WaterPark: A Robustness Assessment of Language Model Watermarking
- Author
-
Liang, Jiacheng, Wang, Zian, Hong, Lauren, Ji, Shouling, and Wang, Ting
- Subjects
Computer Science - Cryptography and Security, Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
To mitigate the misuse of large language models (LLMs), such as disinformation, automated phishing, and academic cheating, there is a pressing need for the capability of identifying LLM-generated texts. Watermarking emerges as one promising solution: it plants statistical signals into LLMs' generative processes and subsequently verifies whether LLMs produce given texts. Various watermarking methods (``watermarkers'') have been proposed; yet, due to the lack of unified evaluation platforms, many critical questions remain under-explored: i) What are the strengths/limitations of various watermarkers, especially their attack robustness? ii) How do various design choices impact their robustness? iii) How to optimally operate watermarkers in adversarial environments? To fill this gap, we systematize existing LLM watermarkers and watermark removal attacks, mapping out their design spaces. We then develop WaterPark, a unified platform that integrates 10 state-of-the-art watermarkers and 12 representative attacks. More importantly, leveraging WaterPark, we conduct a comprehensive assessment of existing watermarkers, unveiling the impact of various design choices on their attack robustness. For instance, a watermarker's resilience to increasingly intensive attacks hinges on its context dependency. We further explore the best practices to operate watermarkers in adversarial environments. For instance, using a generic detector alongside a watermark-specific detector improves the security of vulnerable watermarkers. We believe our study sheds light on current LLM watermarking techniques while WaterPark serves as a valuable testbed to facilitate future research. Comment: 22 pages
- Published
- 2024
43. Reanalyzing the ringdown signal of GW150914 using the F-statistic method
- Author
-
Wang, Hai-Tian, Wang, Ziming, Dong, Yiming, Yim, Garvin, and Shao, Lijing
- Subjects
General Relativity and Quantum Cosmology, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
The ringdown phase of a gravitational wave (GW) signal from a binary black hole merger provides valuable insights into the properties of the final black hole and serves as a critical test of general relativity in the strong-field regime. A key aspect of this investigation is to determine whether the first overtone mode exists in real GW data, as its presence would offer significant implications for our understanding of general relativity under extreme conditions. To address this, we conducted a reanalysis of the ringdown signal from GW150914, using the newly proposed F-statistic method to search for the first overtone mode. Our results are consistent with those obtained through classical time-domain Bayesian inference, indicating that there is no evidence of the first overtone mode in the ringdown signal of GW150914. However, our results show the potential of the F-statistic methodology to unearth nuanced features within GW signals, thereby contributing novel insights into black hole properties. Comment: 7 pages, 3 figures
- Published
- 2024
44. Versatile photonic frequency synthetic dimensions using a single Mach-Zehnder-interferometer-assisted device on thin-film lithium niobate
- Author
-
Wang, Zhao-An, Zeng, Xiao-Dong, Wang, Yi-Tao, Ren, Jia-Ming, Ao, Chun, Li, Zhi-Peng, Liu, Wei, Guo, Nai-Jie, Xie, Lin-Ke, Liu, Jun-You, Ma, Yu-Hang, Wu, Ya-Qi, Wang, Shuang, Tang, Jian-Shun, Li, Chuan-Feng, and Guo, Guang-Can
- Subjects
Physics - Optics, Quantum Physics
- Abstract
Investigating physical models with photonic synthetic dimensions has been generating great interest across many fields of science. The rapidly developing thin-film lithium niobate (TFLN) platform, with its numerous advantages including a high electro-optic coefficient and scalability, is well suited to the realization of synthetic dimensions in the frequency and spatial domains. While coupling resonators with fixed beam splitters is a common experimental approach, it often lacks tunability and limits coupling between adjacent lattices to sites occupying the same frequency domain positions. Here, by contrast, we connect the resonator arrays with electro-optically tunable Mach-Zehnder interferometers instead of fixed beam splitters. By applying bias voltage and RF modulation on the interferometers, our design extends such coupling to the long-range scenario and allows for continuous tuning of each coupling strength and of the synthetic effective magnetic flux. Therefore, our design enriches the controllable coupling types that are essential for building programmable lattice networks and significantly increases versatility. As an example, we experimentally fabricate a two-resonator prototype on the TFLN platform, and on this single chip we realize well-known models including tight-binding lattices, the topological Hall ladder, and the Creutz ladder. We directly observe the band structures in the quasi-momentum space and important phenomena such as spin-momentum locking and the Aharonov-Bohm cage effect. These results demonstrate the potential for convenient simulations of more complex models in our configuration.
- Published
- 2024
45. Scaling Laws for Online Advertisement Retrieval
- Author
-
Wang, Yunli, Yang, Zixuan, Zhang, Zhen, Wang, Zhiqiang, Yang, Jian, Wen, Shiyang, Jiang, Peng, and Gai, Kun
- Subjects
Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
The scaling law is a notable property of neural network models and has significantly propelled the development of large language models. Scaling laws hold great promise in guiding model design and resource allocation. Recent research increasingly shows that scaling laws are not limited to NLP tasks or Transformer architectures; they also apply to domains such as recommendation. However, there is still a lack of literature on scaling law research in online advertisement retrieval systems. This may be because 1) identifying the scaling law for resource cost and online revenue is often expensive in both time and training resources for large-scale industrial applications, and 2) varying settings for different systems prevent the scaling law from being applied across various scenarios. To address these issues, we propose a lightweight paradigm to identify the scaling law of online revenue and machine cost for a certain online advertisement retrieval scenario with a low experimental cost. Specifically, we focus on a sole factor (FLOPs) and propose an offline metric named R/R* that exhibits a high linear correlation with online revenue for retrieval models. We estimate the machine cost offline via a simulation algorithm. Thus, we can transform most online experiments into low-cost offline experiments. We conduct comprehensive experiments to verify the effectiveness of our proposed metric R/R* and to identify the scaling law in the online advertisement retrieval system of Kuaishou. With the scaling law, we demonstrate practical applications for ROI-constrained model design and multi-scenario resource allocation in the Kuaishou advertising system. To the best of our knowledge, this is the first work to study the scaling laws for online advertisement retrieval of real-world systems, showing great potential for scaling laws in advertising system optimization. Comment: 10 pages, 8 figures
- Published
- 2024
46. Mutual Information-oriented ISAC Beamforming Design under Statistical CSI
- Author
-
Xu, Shanfeng, Cheng, Yanshuo, Wang, Siqiang, Wang, Xinyi, Zheng, Zhong, and Fei, Zesong
- Subjects
Electrical Engineering and Systems Science - Signal Processing
- Abstract
Existing integrated sensing and communication (ISAC) beamforming schemes were mostly designed under perfect instantaneous channel state information (CSI), limiting their use in practical dynamic environments. In this paper, we study the beamforming design for multiple-input multiple-output (MIMO) ISAC systems based on statistical CSI, with the weighted mutual information (MI) comprising sensing and communication perspectives adopted as the performance metric. In particular, the operator-valued free probability theory is utilized to derive the closed-form expression for the weighted MI under statistical CSI. Subsequently, an efficient projected gradient ascent (PGA) algorithm is proposed to optimize the transmit beamforming matrix with the aim of maximizing the weighted MI. Numerical results validate that the derived closed-form expression matches well with the Monte Carlo simulation results and that the proposed optimization algorithm is able to improve the weighted MI significantly. We also illustrate the trade-off between sensing and communication MI. Comment: 14 pages, 5 figures, submitted to IEEE journal for possible publication
- Published
- 2024
47. EEG Signal Denoising Using pix2pix GAN: Enhancing Neurological Data Analysis
- Author
-
Wang, Haoyi, Chen, Xufang, Yang, Yue, Zhou, Kewei, Lv, Meining, Wang, Dongrui, and Zhang, Wenjie
- Subjects
Electrical Engineering and Systems Science - Signal Processing, I.4.9
- Abstract
Electroencephalography (EEG) is essential in neuroscience and clinical practice, yet it suffers from physiological artifacts, particularly electromyography (EMG), which distort signals. We propose a deep learning model using pix2pixGAN to remove such noise and generate reliable EEG signals. Leveraging the EEGdenoiseNet dataset, we created synthetic datasets with controlled EMG noise levels for model training and testing across signal-to-noise ratios (SNR) from -7 to 2. Our evaluation metrics included RRMSE and Pearson's CC, assessing both time and frequency domains, and we compared our model with others. The pix2pixGAN model excelled, especially under high noise conditions, showing significant improvements, with lower RRMSE and higher CC values. This demonstrates the model's superior accuracy and stability in purifying EEG signals, offering a robust solution for EEG analysis challenges and advancing clinical and neuroscience applications. Comment: 17 pages, 6 figures
- Published
- 2024
48. Paying more attention to local contrast: improving infrared small target detection performance via prior knowledge
- Author
-
Wang, Peichao, Wang, Jiabao, Chen, Yao, Zhang, Rui, Li, Yang, and Miao, Zhuang
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
The data-driven method for infrared small target detection (IRSTD) has achieved promising results. However, due to the small scale of infrared small target datasets and the limited number of pixels occupied by the targets themselves, it is a challenging task for deep learning methods to directly learn from these samples. Utilizing human expert knowledge to assist deep learning methods in better learning is worthy of exploration. To effectively guide the model to focus on targets' spatial features, this paper proposes the Local Contrast Attention Enhanced infrared small target detection Network (LCAE-Net), combining prior knowledge with data-driven deep learning methods. LCAE-Net is a U-shaped neural network model which consists of two developed modules: a Local Contrast Enhancement (LCE) module and a Channel Attention Enhancement (CAE) module. The LCE module takes advantage of prior knowledge, leveraging a handcrafted convolution operator to acquire Local Contrast Attention (LCA), which suppresses the background while enhancing the potential target region, thus guiding the neural network to pay more attention to the location information of potential infrared small targets. To effectively utilize the response information throughout the downsampling process, the CAE module is proposed to achieve information fusion among the different channels of the feature maps. Experimental results indicate that our LCAE-Net outperforms existing state-of-the-art methods on the three public datasets NUDT-SIRST, NUAA-SIRST, and IRSTD-1K, and its detection speed could reach up to 70 fps. Meanwhile, our model has a parameter count and Floating-Point Operations (FLOPs) of 1.945M and 4.862G respectively, which is suitable for deployment on edge devices. Comment: 16 pages, 8 figures
- Published
- 2024
49. XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
- Author
-
Wang, Ziyi, Wang, Yanbo, Yu, Xumin, Zhou, Jie, and Lu, Jiwen
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Existing methodologies in open vocabulary 3D semantic segmentation primarily concentrate on establishing a unified feature space encompassing 3D, 2D, and textual modalities. Nevertheless, traditional techniques such as global feature alignment or vision-language model distillation tend to impose only approximate correspondence, struggling notably with delineating fine-grained segmentation boundaries. To address this gap, we propose a more meticulous mask-level alignment between 3D features and the 2D-text embedding space through a cross-modal mask reasoning framework, XMask3D. In our approach, we developed a mask generator based on the denoising UNet from a pre-trained diffusion model, leveraging its capability for precise textual control over dense pixel representations and enhancing the open-world adaptability of the generated masks. We further integrate 3D global features as implicit conditions into the pre-trained 2D denoising UNet, enabling the generation of segmentation masks with additional 3D geometry awareness. Subsequently, the generated 2D masks are employed to align mask-level 3D representations with the vision-language feature space, thereby augmenting the open vocabulary capability of 3D geometry embeddings. Finally, we fuse complementary 2D and 3D mask features, resulting in competitive performance across multiple benchmarks for 3D open vocabulary semantic segmentation. Code is available at https://github.com/wangzy22/XMask3D. Comment: Accepted to NeurIPS 2024
- Published
- 2024
50. How interfacial tension enhances drag in turbulent Taylor-Couette flow with neutrally buoyant and equally viscous droplets
- Author
-
Su, Jinghong, Zhang, Yi-bao, Wang, Cheng, Yi, Lei, Xu, Fan, Fan, Yaning, Wang, Junwu, and Sun, Chao
- Subjects
Physics - Fluid Dynamics
- Abstract
The presence of dispersed-phase droplets can result in a notable increase in the system's drag. However, our understanding of the mechanism underlying this phenomenon remains limited. In this study, we use three-dimensional direct numerical simulations with a modified multi-marker volume-of-fluid method to investigate liquid-liquid two-phase turbulence in a Taylor-Couette geometry. The dispersed phase has the same density and viscosity as the continuous phase. The Reynolds number $Re\equiv r_i\omega_i d/\nu$ is fixed at 5200, the volume fraction of the dispersed phase is up to $40\%$, and the Weber number $We\equiv \rho u^2_\tau d/\sigma$ is around 8. It is found that the increase in the system's drag originates from the contribution of interfacial tension. Specifically, droplets experience significant deformation and stretching in the streamwise direction due to shear near the inner cylinder. Consequently, the rear end of the droplets lags behind the fore head. This causes opposing interfacial tension effects on the fore head and rear end of the droplets. For the fore head of the droplets, the effect of interfacial tension appears to act against the flow direction. For the rear end, the effect appears to act in the flow direction. The increase in the system's drag is primarily attributed to the effect of interfacial tension on the fore head of the droplets which leads to the hindering effect of the droplets on the surrounding continuous phase. This hindering effect disrupts the formation of high-speed streaks, favoring the formation of low-speed ones, which are generally associated with higher viscous stress and drag of the system. This study provides new insights into the mechanism of drag enhancement reported in our previous experiments.
- Published
- 2024