865,063 results on '"A Chu"'
Search Results
102. Unknottedness of free boundary minimal surfaces and self-shrinkers
- Author
-
Chu, Sabine and Franz, Giada
- Subjects
Mathematics - Differential Geometry
- Abstract
We study unknottedness for free boundary minimal surfaces in a three-dimensional Riemannian manifold with nonnegative Ricci curvature and strictly convex boundary, and for self-shrinkers in three-dimensional Euclidean space. To do so, we introduce the concepts of boundary graph for free boundary minimal surfaces and of graph at infinity for self-shrinkers. We prove that these surfaces are unknotted in the sense that any two such surfaces with isomorphic boundary graph or graph at infinity are smoothly isotopic.
Comment: 10 pages, 2 figures
- Published
- 2024
103. Unpacking Political Bias in Large Language Models: Insights Across Topic Polarization
- Author
-
Yang, Kaiqi, Li, Hang, Chu, Yucheng, Lin, Yuping, Peng, Tai-Quan, and Liu, Hui
- Subjects
Computer Science - Computers and Society, Computer Science - Artificial Intelligence
- Abstract
Large Language Models (LLMs) have been widely used to generate responses on social topics due to their world knowledge and generative capabilities. Beyond reasoning and generation performance, political bias is an essential issue that warrants attention. Political bias, as a universal phenomenon in human society, may be transferred to LLMs and distort LLMs' behaviors of information acquisition and dissemination with humans, leading to unequal access among different groups of people. To prevent LLMs from reproducing and reinforcing political biases, and to encourage fairer LLM-human interactions, comprehensively examining political bias in popular LLMs becomes urgent and crucial. In this study, we systematically measure the political biases in a wide range of LLMs, using a curated set of questions addressing political bias in various contexts. Our findings reveal distinct patterns in how LLMs respond to political topics. For highly polarized topics, most LLMs exhibit a pronounced left-leaning bias. Conversely, less polarized topics elicit greater consensus, with similar response patterns across different LLMs. Additionally, we analyze how LLM characteristics, including release date, model scale, and region of origin affect political bias. The results indicate political biases evolve with model scale and release date, and are also influenced by regional factors of LLMs.
- Published
- 2024
104. A District-level Ensemble Model to Enhance Dengue Prediction and Control for the Mekong Delta Region of Vietnam
- Author
-
Areed, Wala Draidi, Nguyen, Thi Thanh Thao, Do, Kien Quoc, Nguyen, Thinh, Bui, Vinh, Nelson, Elisabeth, Warren, Joshua L., Doan, Quang-Van, Sinh, Nam Vu, Osborne, Nicholas, Richards, Russell, Tran, Nu Quy Linh, Le, Hong, Pham, Tuan, Hung, Trinh Manh, Nghiem, Son, Phung, Hai, Chu, Cordia, Dubrow, Robert, Weinberger, Daniel M., and Phung, Dung
- Subjects
Statistics - Applications
- Abstract
The Mekong Delta Region of Vietnam faces increasing dengue risks driven by urbanization, globalization, and climate change. This study introduces a probabilistic forecasting model for predicting dengue incidence and outbreaks with one- to three-month lead times, integrating meteorological, sociodemographic, preventive, and epidemiological data. Seventy-two models were evaluated, and an ensemble combining top-performing spatiotemporal, supervised PCA, and semi-mechanistic hhh4 frameworks was developed. Using data from 2004-2022 for training, validation, and evaluation, the ensemble model demonstrated 69% accuracy at a 3-month horizon, outperforming a baseline model. While effective, its performance declined in years with atypical seasonality, such as 2019 and 2022. The model provides critical lead time for targeted dengue prevention and control measures, addressing a growing public health need in the region.
Comment: 34 pages, 6 figures
- Published
- 2024
105. JailPO: A Novel Black-box Jailbreak Framework via Preference Optimization against Aligned LLMs
- Author
-
Li, Hongyi, Ye, Jiawei, Wu, Jie, Yan, Tianjie, Wang, Chu, and Li, Zhixin
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
- Abstract
Large Language Models (LLMs) aligned with human feedback have recently garnered significant attention. However, they remain vulnerable to jailbreak attacks, where adversaries manipulate prompts to induce harmful outputs. Exploring jailbreak attacks enables us to investigate the vulnerabilities of LLMs and further guides us in enhancing their security. Unfortunately, existing techniques mainly rely on handcrafted templates or generated-based optimization, posing challenges in scalability, efficiency and universality. To address these issues, we present JailPO, a novel black-box jailbreak framework to examine LLM alignment. For scalability and universality, JailPO meticulously trains attack models to automatically generate covert jailbreak prompts. Furthermore, we introduce a preference optimization-based attack method to enhance the jailbreak effectiveness, thereby improving efficiency. To analyze model vulnerabilities, we provide three flexible jailbreak patterns. Extensive experiments demonstrate that JailPO not only automates the attack process while maintaining effectiveness but also exhibits superior performance in efficiency, universality, and robustness against defenses compared to baselines. Additionally, our analysis of the three JailPO patterns reveals that attacks based on complex templates exhibit higher attack strength, whereas covert question transformations elicit riskier responses and are more likely to bypass defense mechanisms.
Comment: Accepted by AAAI 2025
- Published
- 2024
106. Rare multi-nucleon decays with the full data sets of the Majorana Demonstrator
- Author
-
Arnquist, I. J., Avignone III, F. T., Barabash, A. S., Blalock, E., Bos, B., Busch, M., Chan, Y. -D., Chapman, J. R., Christofferson, C. D., Chu, P. -H., Cuesta, C., Detwiler, J. A., Efremenko, Yu., Ejiri, H., Elliott, S. R., Fuad, N., Giovanetti, G. K., Green, M. P., Gruszko, J., Guinn, I. S., Guiseppe, V. E., Henning, R., Hoppe, E. W., Kouzes, R. T., Li, A., Massarczyk, R., Meijer, S. J., Paudel, L. S., Pettus, W., Poon, A. W. P., Radford, D. C., Reine, A. L., Rielage, K., Schaper, D. C., Schleich, S. J., Tedeschi, D., Varner, R. L., Vasilyev, S., Watkins, S. L., Wilkerson, J. F., Wiseman, C., and Yu, C. -H.
- Subjects
Nuclear Experiment
- Abstract
The Majorana Demonstrator was an ultra-low-background experiment designed for neutrinoless double-beta decay ($0\nu\beta\beta$) investigation in $^{76}$Ge. Located at the Sanford Underground Research Facility in Lead, South Dakota, the Demonstrator utilized modular high-purity Ge detector arrays within shielded vacuum cryostats, operating deep underground. The arrays, with a capacity of up to 40.4 kg (27.2 kg enriched to $\sim 88\%$ in $^{76}$Ge), have accumulated the full data set, totaling 64.5 kg yr of enriched active exposure and 27.4 kg yr of exposure for natural detectors. Our updated search improves previously explored three-nucleon decay modes in Ge isotopes, setting new half-life limits of $1.27\times10^{26}$ years (90\% confidence level) for $^{76}$Ge($ppp$) $\rightarrow$ $^{73}$Cu e$^+\pi^+\pi^+$ and $^{76}$Ge($ppn$) $\rightarrow$ $^{73}$Zn e$^+\pi^+$. The half-life limit for the invisible tri-proton decay mode of $^{76}$Ge is found to be $1.4\times10^{25}$ yr. Furthermore, we have updated limits for corresponding multi-nucleon decays.
- Published
- 2024
107. Safe Spaces or Toxic Places? Content Moderation and Social Dynamics of Online Eating Disorder Communities
- Author
-
Lerman, Kristina, Chu, Minh Duc, Bickham, Charles, Luceri, Luca, and Ferrara, Emilio
- Subjects
Computer Science - Social and Information Networks, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction
- Abstract
Social media platforms have become critical spaces for discussing mental health concerns, including eating disorders. While these platforms can provide valuable support networks, they may also amplify harmful content that glorifies disordered cognition and self-destructive behaviors. While social media platforms have implemented various content moderation strategies, from stringent to laissez-faire approaches, we lack a comprehensive understanding of how these different moderation practices interact with user engagement in online communities around these sensitive mental health topics. This study addresses this knowledge gap through a comparative analysis of eating disorder discussions across Twitter/X, Reddit, and TikTok. Our findings reveal that while users across all platforms engage similarly in expressing concerns and seeking support, platforms with weaker moderation (like Twitter/X) enable the formation of toxic echo chambers that amplify pro-anorexia rhetoric. These results demonstrate how moderation strategies significantly influence the development and impact of online communities, particularly in contexts involving mental health and self-harm.
Comment: arXiv admin note: text overlap with arXiv:2401.09647
- Published
- 2024
108. Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework
- Author
-
Xu, Zhenjie, Chen, Wenqing, Tang, Yi, Li, Xuanying, Hu, Cheng, Chu, Zhixuan, Ren, Kui, Zheng, Zibin, and Lu, Zhichao
- Subjects
Computer Science - Computation and Language
- Abstract
Natural language processing (NLP) has seen remarkable advancements with the development of large language models (LLMs). Despite these advancements, LLMs often produce socially biased outputs. Recent studies have mainly addressed this problem by prompting LLMs to behave ethically, but this approach results in unacceptable performance degradation. In this paper, we propose a multi-objective approach within a multi-agent framework (MOMA) to mitigate social bias in LLMs without significantly compromising their performance. The key idea of MOMA involves deploying multiple agents to perform causal interventions on bias-related contents of the input questions, breaking the shortcut connection between these contents and the corresponding answers. Unlike traditional debiasing techniques leading to performance degradation, MOMA substantially reduces bias while maintaining accuracy in downstream tasks. Our experiments conducted on two datasets and two models demonstrate that MOMA reduces bias scores by up to 87.7%, with only a marginal performance degradation of up to 6.8% in the BBQ dataset. Additionally, it significantly enhances the multi-objective metric icat in the StereoSet dataset by up to 58.1%. Code will be made available at https://github.com/Cortantse/MOMA.
Comment: This work has been accepted at The 39th Annual AAAI Conference on Artificial Intelligence (AAAI-2025)
- Published
- 2024
109. Spin evolution and mass distribution of the Galactic Binary Neutron Stars
- Author
-
Chu, Qingbo, Lu, Youjun, and Yu, Shenghua
- Subjects
Astrophysics - High Energy Astrophysical Phenomena, Astrophysics - Astrophysics of Galaxies, Astrophysics - Solar and Stellar Astrophysics
- Abstract
Binary neutron stars (BNSs) detected in the Milky Way have total masses distributed narrowly around $\sim2.6-2.7M_\odot$, while the BNS merger GW190425 detected via gravitational waves has a significantly larger mass ($\sim3.4M_\odot$). This difference is not yet well understood. In this paper, we investigate the BNS spin evolution via an improved binary star evolution model and its effects on the BNS observability, with implementation of various relevant astrophysical processes. We find that the first-born neutron star component in low-mass BNSs can be spun up to a millisecond pulsar by the accretion of Roche-lobe overflow from its companion, and its radio lifetime can be comparable to the Hubble time. However, most high-mass BNSs have substantially shorter radio lifetimes than the low-mass BNSs, and thus a smaller probability of being detected via radio emission. Adopting the star formation and metal enrichment history of the Milky Way given by observations, we obtain the surviving Galactic BNSs with pulsar components from our population synthesis model and find that their distributions on the diagrams of spin period versus spin-period-time-derivative ($P-\dot{P}$) and orbital period versus eccentricity ($P_{\rm orb}-e$) closely match those of the observed Galactic BNSs. The total mass distribution of the observed Galactic BNSs can also be matched by the model. A significant fraction ($\sim19\%-22\%$) of merging BNSs at redshift $z\sim0$ have masses $\gtrsim3M_\odot$, which seems compatible with the GW observations. Future radio observations may detect many more Galactic BNSs, which will put strong constraints on the spin evolution of BNSs during their formation processes.
Comment: 19 pages, 11 figures, accepted for publication in The Astrophysical Journal
- Published
- 2024
110. EPN: An Ego Vehicle Planning-Informed Network for Target Trajectory Prediction
- Author
-
Peng, Saiqian, Chu, Duanfeng, Li, Guanjie, Lu, Liping, and Wang, Jinxiang
- Subjects
Computer Science - Robotics
- Abstract
Trajectory prediction plays a crucial role in improving the safety and reliability of autonomous vehicles, serving as an intermediate link between perception and planning. However, due to the highly dynamic and multimodal nature of the task, accurately predicting the future trajectory of a target vehicle remains a significant challenge. To address these challenges, we propose an Ego vehicle Planning-informed Network (EPN) for multimodal trajectory prediction. Current trajectory prediction methods typically use the historical trajectory and vehicle attributes as inputs, focusing primarily on how historical information influences the future trajectory of the target vehicle. In real-world driving scenarios, however, the future trajectory of a vehicle is influenced not only by its own historical data but also by the behavior of other vehicles on the road. To address this, we incorporate the future planned trajectory of the ego vehicle as an additional input to simulate the mutual influence between the ego vehicle's planned trajectory and the predicted trajectory of the target vehicle. Furthermore, to tackle the challenges of intention ambiguity and large prediction errors often encountered in methods based on driving intentions, we propose a target's endpoint prediction module. This module first predicts the possible endpoints of the target vehicle, then refines these predictions through a correction mechanism, and finally generates a complete multimodal predicted trajectory based on the corrected endpoints. Experimental results demonstrate that, compared to other trajectory prediction methods, EPN achieves an average reduction of 34.9%, 30.7%, and 30.4% in RMSE, ADE, and FDE evaluation metrics on the NGSIM dataset, and an average reduction of 64.6%, 64.5%, and 64.3% in RMSE, ADE, and FDE on the HighD dataset. These results highlight the strong performance of EPN in trajectory prediction.
- Published
- 2024
111. Decoupling of carbonate-organic carbon isotope during the Carnian Pluvial Episode
- Author
-
Jia, Enhao, Wu, Kui, Du, Yong, Wu, Yuyang, Wang, Fengyu, Dai, Xu, Song, Huyue, Chu, Daoliang, Zhong, Lei, Yuan, Zhiwei, Chen, Xiangmin, Li, Zhe, and Song, Haijun
- Subjects
Physics - Atmospheric and Oceanic Physics
- Abstract
The Carnian Pluvial Episode (CPE) was a major global climate change event in the early Late Triassic that significantly affected marine ecosystems and carbon cycles. One of the most prominent features of the CPE is the coupled multiple negative carbonate-organic carbon isotope excursions. However, at Erguan and Xiashulao from eastern Tethys, a decoupling between carbonate and organic carbon isotopes during the CPE was observed. At the end of the early Carnian (Julian), the carbonate carbon isotope showed a negative excursion of 2-3 per-mille, while the organic carbon isotope exhibited a positive excursion of about 3-4 per-mille. In addition, increased terrestrial inputs are indicated by the rising C/N (3 to 10) and decreasing Y/Ho (42 to 27) that coexist with this decoupling. The coupling of carbon isotope negative excursions is from the shallow shelves and the deep slopes, whereas the decoupling occurs from the deep shelf to the shallow slope. In the deep shelf to the shallow slope, sedimentary organic matter is mainly sourced from pelagic sources before the CPE, as evidenced by low C/N (3) and high Y/Ho (36-42). During the CPE, the increased fresh water flux (Sr/Ba <1) enhanced terrestrial input in organic matter, which may cause positive excursions in the carbon isotope record with elevated TOC content. As a result, the carbonate-organic carbon isotopes decoupled. In contrast, organic matter in sediments from the shallow shelf and deep slope is mainly from terrestrial and pelagic sources, respectively. This study reveals the significant impact of terrestrial inputs on marine carbon cycling during the Carnian Pluvial Episode, highlighting the crucial role of climate events in modifying the carbon isotope record.
Comment: 49 pages, 10 figures
- Published
- 2024
112. Attention-aware convolutional neural networks for identification of magnetic islands in the tearing mode on EAST tokamak
- Author
-
Long, Feifei, Zhao, Yian, Zhang, Yunjiao, Wan, Chenguang, Zhou, Yinan, Qiang, Ziwei, Yang, Kangning, Li, Jiuying, Shi, Tonghui, Guo, Bihao, Zhang, Yang, Zhao, Hailing, Ti, Ang, Liu, Adi, Zhou, Chu, Xie, Jinlin, Liu, Zixi, Zhuang, Ge, and Team, EAST
- Subjects
Physics - Plasma Physics
- Abstract
The tearing mode, a large-scale MHD instability in tokamaks, typically disrupts the equilibrium magnetic surfaces, leads to the formation of magnetic islands, and reduces core electron temperature and density, resulting in significant energy losses and possibly even discharge termination. This process is unacceptable for ITER. Therefore, the accurate identification of a magnetic island in real time is crucial for the effective control of the tearing mode in ITER in the future. In this study, based on the characteristics induced by tearing modes, an attention-aware convolutional neural network (AM-CNN) is proposed to identify the presence of magnetic islands in tearing mode discharges using data from ECE diagnostics in the EAST tokamak. A total of 11 ECE channels covering the core region are used in the tearing mode dataset, which includes $2.5\times10^{9}$ data points collected from 68 shots between 2016 and 2021. We split the dataset into training, validation, and test sets (66.5%, 5.7%, and 27.8%, respectively). An attention mechanism is designed to couple with the convolutional neural networks to improve the feature extraction capability for the signals. During the model training process, we utilized adaptive learning rate adjustment and early stopping mechanisms to optimize the performance of AM-CNN. The model results show that a classification accuracy of 91.96% is achieved in tearing mode identification. Compared to a CNN without AM, the attention-aware convolutional neural network demonstrates better performance across accuracy, recall, and F1 score. By leveraging the deep learning model, which incorporates a physical understanding of the tearing process to identify tearing mode behaviors, the combination of physical mechanisms and deep learning is emphasized, laying an important foundation for the future intelligent control of tearing mode dynamics.
- Published
- 2024
113. Spectral comparison results for the $N$-Bakry-Emery Ricci tensor
- Author
-
Chu, Jianchun and Hao, Zihang
- Subjects
Mathematics - Differential Geometry
- Abstract
We establish the diameter and global weighted volume comparison when the $N$-Bakry-Emery Ricci tensor has a positive lower bound in the spectrum sense.
Comment: 15 pages
- Published
- 2024
114. A versatile method for nano-fabrication on diamond film: flexible diamond metasurfaces as a demonstration
- Author
-
Wang, Yicheng, Jing, Jixiang, Luo, Yumeng, Ma, Linjie, Wang, Zhongqiang, Wang, Qi, Li, Kwai Hei, and Chu, Zhiqin
- Subjects
Physics - Optics
- Abstract
Diamond exhibits superb performance across a wide range of applications due to its many outstanding properties in electronic, photonic and quantum fields. Yet heterogeneous integration of diamond for on-chip functionalities, as with 2D materials, remains challenging due to the difficulty of obtaining scalable, transferable and ultrathin diamond samples. Recently, edge-exposed exfoliation has been demonstrated as an effective way to produce wafer-scale, freestanding and ultrathin diamond films. However, the incompatibility of the newly developed diamond film with conventional nano-fabrication methods makes it difficult to fabricate diamond film into practical devices. Herein, we demonstrate sugar-based mask transfer as a versatile method for pattern definition on diamond films, which shows excellent geometrical resolution and accuracy compared to conventional approaches. Additionally, based on this method, flexible all-diamond metasurfaces functioning as structural colors have been achieved, which indicates its huge potential for fabricating more diamond-related devices.
- Published
- 2024
115. Observation of the charmonium decay $\eta_c\to\gamma\gamma$
- Author
-
BESIII Collaboration, Ablikim, M., Achasov, M. N., Adlarson, P., Ai, X. C., Aliberti, R., Amoroso, A., An, Q., Bai, Y., Bakina, O., Ban, Y., Bao, H. -R., Batozskaya, V., Begzsuren, K., Berger, N., Berlowski, M., Bertani, M., Bettoni, D., Bianchi, F., Bianco, E., Bortone, A., Boyko, I., Briere, R. A., Brueggemann, A., Cai, H., Cai, M. H., Cai, X., Calcaterra, A., Cao, G. F., Cao, N., Cetin, S. A., Chai, X. Y., Chang, J. F., Che, G. R., Che, Y. Z., Chelkov, G., Chen, C., Chen, C. H., Chen, Chao, Chen, G., Chen, H. S., Chen, H. Y., Chen, M. L., Chen, S. J., Chen, S. L., Chen, S. M., Chen, T., Chen, X. R., Chen, X. T., Chen, Y. B., Chen, Y. Q., Chen, Z. J., Choi, S. K., Chu, X., Cibinetto, G., Cossio, F., Cui, J. J., Dai, H. L., Dai, J. P., Dbeyssi, A., de Boer, R. E., Dedovich, D., Deng, C. Q., Deng, Z. Y., Denig, A., Denysenko, I., Destefanis, M., De Mori, F., Ding, B., Ding, X. X., Ding, Y., Ding, Y. X., Dong, J., Dong, L. Y., Dong, M. Y., Dong, X., Du, M. C., Du, S. X., Duan, Y. Y., Duan, Z. H., Egorov, P., Fan, G. F., Fan, J. J., Fan, Y. H., Fang, J., Fang, S. S., Fang, W. X., Fang, Y. Q., Farinelli, R., Fava, L., Feldbauer, F., Felici, G., Feng, C. Q., Feng, J. H., Feng, Y. T., Fritsch, M., Fu, C. D., Fu, J. L., Fu, Y. W., Gao, H., Gao, X. B., Gao, Y. N., Gao, Y. Y., Gao, Yang, Garbolino, S., Garzia, I., Ge, P. T., Ge, Z. W., Geng, C., Gersabeck, E. M., Gilman, A., Goetzen, K., Gong, L., Gong, W. X., Gradl, W., Gramigna, S., Greco, M., Gu, M. H., Gu, Y. T., Guan, C. Y., Guo, A. Q., Guo, L. B., Guo, M. J., Guo, R. P., Guo, Y. P., Guskov, A., Gutierrez, J., Han, K. L., Han, T. T., Hanisch, F., Hao, X. Q., Harris, F. A., He, K. K., He, K. L., Heinsius, F. H., Heinz, C. H., Heng, Y. K., Herold, C., Holtmann, T., Hong, P. C., Hou, G. Y., Hou, X. T., Hou, Y. R., Hou, Z. L., Hu, B. Y., Hu, H. M., Hu, J. F., Hu, Q. P., Hu, S. L., Hu, T., Hu, Y., Huang, G. S., Huang, K. X., Huang, L. Q., Huang, P., Huang, X. T., Huang, Y. P., Huang, Y. 
S., Hussain, T., Hüsken, N., der Wiesche, N. in, Jackson, J., Janchiv, S., Ji, Q., Ji, Q. P., Ji, W., Ji, X. B., Ji, X. L., Ji, Y. Y., Jia, Z. K., Jiang, D., Jiang, H. B., Jiang, P. C., Jiang, S. J., Jiang, T. J., Jiang, X. S., Jiang, Y., Jiao, J. B., Jiao, J. K., Jiao, Z., Jin, S., Jin, Y., Jing, M. Q., Jing, X. M., Johansson, T., Kabana, S., Kalantar-Nayestanaki, N., Kang, X. L., Kang, X. S., Kavatsyuk, M., Ke, B. C., Khachatryan, V., Khoukaz, A., Kiuchi, R., Kolcu, O. B., Kopf, B., Kuessner, M., Kui, X., Kumar, N., Kupsc, A., Kühn, W., Lan, Q., Lan, W. N., Lei, T. T., Lei, Z. H., Lellmann, M., Lenz, T., Li, C., Li, C. H., Li, C. K., Li, Cheng, Li, D. M., Li, F., Li, G., Li, H. B., Li, H. J., Li, H. N., Li, Hui, Li, J. R., Li, J. S., Li, K., Li, K. L., Li, L. J., Li, Lei, Li, M. H., Li, M. R., Li, P. L., Li, P. R., Li, Q. M., Li, Q. X., Li, R., Li, T., Li, T. Y., Li, W. D., Li, W. G., Li, X., Li, X. H., Li, X. L., Li, X. Y., Li, X. Z., Li, Y., Li, Y. G., Li, Z. J., Li, Z. Y., Liang, C., Liang, H., Liang, Y. F., Liang, Y. T., Liao, G. R., Liao, Y. P., Libby, J., Limphirat, A., Lin, C. C., Lin, C. X., Lin, D. X., Lin, L. Q., Lin, T., Liu, B. J., Liu, B. X., Liu, C., Liu, C. X., Liu, F., Liu, F. H., Liu, Feng, Liu, G. M., Liu, H., Liu, H. B., Liu, H. H., Liu, H. M., Liu, Huihui, Liu, J. B., Liu, J. J., Liu, K., Liu, K. Y., Liu, Ke, Liu, L., Liu, L. C., Liu, Lu, Liu, M. H., Liu, P. L., Liu, Q., Liu, S. B., Liu, T., Liu, W. K., Liu, W. M., Liu, W. T., Liu, X., Liu, X. Y., Liu, Y., Liu, Y. B., Liu, Z. A., Liu, Z. D., Liu, Z. Q., Lou, X. C., Lu, F. X., Lu, H. J., Lu, J. G., Lu, Y., Lu, Y. H., Lu, Y. P., Lu, Z. H., Luo, C. L., Luo, J. R., Luo, J. S., Luo, M. X., Luo, T., Luo, X. L., Lyu, X. R., Lyu, Y. F., Lyu, Y. H., Ma, F. C., Ma, H., Ma, H. L., Ma, J. L., Ma, L. L., Ma, L. R., Ma, Q. M., Ma, R. Q., Ma, R. Y., Ma, T., Ma, X. T., Ma, X. Y., Ma, Y. M., Maas, F. E., MacKay, I., Maggiora, M., Malde, S., Mao, Y. J., Mao, Z. P., Marcello, S., Meng, Y. H., Meng, Z. 
X., Messchendorp, J. G., Mezzadri, G., Miao, H., Min, T. J., Mitchell, R. E., Mo, X. H., Moses, B., Muchnoi, N. Yu., Muskalla, J., Nefedov, Y., Nerling, F., Nie, L. S., Nikolaev, I. B., Ning, Z., Nisar, S., Niu, Q. L., Olsen, S. L., Ouyang, Q., Pacetti, S., Pan, X., Pan, Y., Pathak, A., Pei, Y. P., Pelizaeus, M., Peng, H. P., Peng, Y. Y., Peters, K., Ping, J. L., Ping, R. G., Plura, S., Prasad, V., Qi, F. Z., Qi, H. R., Qi, M., Qian, S., Qian, W. B., Qiao, C. F., Qiao, J. H., Qin, J. J., Qin, L. Q., Qin, L. Y., Qin, P. B., Qin, X. P., Qin, X. S., Qin, Z. H., Qiu, J. F., Qu, Z. H., Redmer, C. F., Rivetti, A., Rolo, M., Rong, G., Rong, S. S., Rosner, Ch., Ruan, M. Q., Ruan, S. N., Salone, N., Sarantsev, A., Schelhaas, Y., Schoenning, K., Scodeggio, M., Shan, K. Y., Shan, W., Shan, X. Y., Shang, Z. J., Shangguan, J. F., Shao, L. G., Shao, M., Shen, C. P., Shen, H. F., Shen, W. H., Shen, X. Y., Shi, B. A., Shi, H., Shi, J. L., Shi, J. Y., Shi, S. Y., Shi, X., Song, J. J., Song, T. Z., Song, W. M., Song, Y. J., Song, Y. X., Sosio, S., Spataro, S., Stieler, F., Su, S. S, Su, Y. J., Sun, G. B., Sun, G. X., Sun, H., Sun, H. K., Sun, J. F., Sun, K., Sun, L., Sun, S. S., Sun, T., Sun, Y. C., Sun, Y. H., Sun, Y. J., Sun, Y. Z., Sun, Z. Q., Sun, Z. T., Tang, C. J., Tang, G. Y., Tang, J., Tang, L. F., Tang, M., Tang, Y. A., Tao, L. Y., Tat, M., Teng, J. X., Thoren, V., Tian, W. H., Tian, Y., Tian, Z. F., Uman, I., Wang, B., Wang, Bo, Wang, C., Wang, D. Y., Wang, H. J., Wang, J. J., Wang, K., Wang, L. L., Wang, L. W., Wang, M., Wang, N. Y., Wang, S., Wang, T., Wang, T. J., Wang, W., Wang, W. P., Wang, X., Wang, X. F., Wang, X. J., Wang, X. L., Wang, X. N., Wang, Y., Wang, Y. D., Wang, Y. F., Wang, Y. H., Wang, Y. L., Wang, Y. N., Wang, Y. Q., Wang, Yaqian, Wang, Yi, Wang, Z., Wang, Z. L., Wang, Z. Y., Wei, D. H., Weidner, F., Wen, S. P., Wen, Y. R., Wiedner, U., Wilkinson, G., Wolke, M., Wu, C., Wu, J. F., Wu, L. H., Wu, L. J., Wu, Lianjie, Wu, S. G., Wu, S. M., Wu, X., Wu, X. 
H., Wu, Y. J., Wu, Z., Xia, L., Xian, X. M., Xiang, B. H., Xiang, T., Xiao, D., Xiao, G. Y., Xiao, H., Xiao, Y. L., Xiao, Z. J., Xie, C., Xie, K. J., Xie, X. H., Xie, Y., Xie, Y. G., Xie, Y. H., Xie, Z. P., Xing, T. Y., Xu, C. F., Xu, C. J., Xu, G. F., Xu, M., Xu, Q. J., Xu, Q. N., Xu, W. L., Xu, X. P., Xu, Y., Xu, Y. C., Xu, Z. S., Yan, F., Yan, H. Y., Yan, L., Yan, W. B., Yan, W. C., Yan, W. P., Yan, X. Q., Yang, H. J., Yang, H. L., Yang, H. X., Yang, J. H., Yang, R. J., Yang, T., Yang, Y., Yang, Y. F., Yang, Y. Q., Yang, Y. X., Yang, Y. Z., Ye, M., Ye, M. H., Yin, Junhao, You, Z. Y., Yu, B. X., Yu, C. X., Yu, G., Yu, J. S., Yu, M. C., Yu, T., Yu, X. D., Yu, Y. C., Yuan, C. Z., Yuan, H., Yuan, J., Yuan, L., Yuan, S. C., Yuan, Y., Yuan, Z. Y., Yue, C. X., Yue, Ying, Zafar, A. A., Zeng, S. H., Zeng, X., Zeng, Y., Zeng, Y. J., Zhai, X. Y., Zhan, Y. H., Zhang, A. Q., Zhang, B. L., Zhang, B. X., Zhang, D. H., Zhang, G. Y., Zhang, H., Zhang, H. C., Zhang, H. H., Zhang, H. Q., Zhang, H. R., Zhang, H. Y., Zhang, J., Zhang, J. J., Zhang, J. L., Zhang, J. Q., Zhang, J. S., Zhang, J. W., Zhang, J. X., Zhang, J. Y., Zhang, J. Z., Zhang, Jianyu, Zhang, L. M., Zhang, Lei, Zhang, N., Zhang, P., Zhang, Q., Zhang, Q. Y., Zhang, R. Y., Zhang, S. H., Zhang, Shulei, Zhang, X. M., Zhang, X. Y, Zhang, X. Y., Zhang, Y., Zhang, Y. T., Zhang, Y. H., Zhang, Y. M., Zhang, Yan, Zhang, Z. D., Zhang, Z. H., Zhang, Z. L., Zhang, Z. X., Zhang, Z. Y., Zhang, Z. Z., Zhang, Zh. Zh., Zhao, G., Zhao, J. Y., Zhao, J. Z., Zhao, L., Zhao, Lei, Zhao, M. G., Zhao, N., Zhao, R. P., Zhao, S. J., Zhao, Y. B., Zhao, Y. X., Zhao, Z. G., Zhemchugov, A., Zheng, B., Zheng, B. M., Zheng, J. P., Zheng, W. J., Zheng, X. R., Zheng, Y. H., Zhong, B., Zhong, X., Zhou, H., Zhou, J. Y., Zhou, S., Zhou, X., Zhou, X. K., Zhou, X. R., Zhou, X. Y., Zhou, Y. Z., Zhou, Z. C., Zhu, A. N., Zhu, J., Zhu, K., Zhu, K. J., Zhu, K. S., Zhu, L., Zhu, L. X., Zhu, S. H., Zhu, T. J., Zhu, W. D., Zhu, W. J., Zhu, W. Z., Zhu, Y. 
C., Zhu, Z. A., Zhuang, X. Y., Zou, J. H., and Zu, J.
- Subjects
High Energy Physics - Experiment
- Abstract
Using $(2712.4\pm14.3)\times10^{6}$ $\psi(3686)$ events collected with the BESIII detector at the BEPCII collider, the decay $\eta_c\to\gamma\gamma$ in $J/\psi\to\gamma\eta_c$ is observed for the first time. We determine the product branching fraction $\mathcal{B}(J/\psi\to\gamma\eta_c)\times\mathcal{B}(\eta_c\to\gamma\gamma)=(5.23\pm0.26_{\rm{stat.}}\pm0.30_{\rm{syst.}})\times10^{-6}$. This result is consistent with the LQCD calculation $(5.34\pm0.16)\times10^{-6}$ from HPQCD in 2023. By using the world-average values of $\mathcal{B}(J/\psi\to\gamma\eta_c)$ and the total decay width of $\eta_c$, the partial decay width $\Gamma(\eta_c\to\gamma\gamma)$ is determined to be $(11.30\pm0.56_{\rm{stat.}}\pm0.66_{\rm{syst.}}\pm1.14_{\rm{ref.}})~\rm{keV}$, which deviates from the corresponding world-average value by $3.4\sigma$.
Comment: 10 pages, 4 figures
- Published
- 2024
116. The IBEX Imaging Knowledge-Base: A Community Resource Enabling Adoption and Development of Immunofluoresence Imaging Methods
- Author
-
Yaniv, Ziv, Anidi, Ifeanyichukwu U., Arakkal, Leanne, Arroyo-Mejías, Armando J., Beuschel, Rebecca T., Börner, Katy, Chu, Colin J., Clark, Beatrice, Clatworthy, Menna R., Colautti, Jake, Coscia, Fabian, Croteau, Joshua, Denha, Saven, Dever, Rose, Dutra, Walderez O., Fritzsche, Sonja, Fullam, Spencer, Gerner, Michael Y., Gola, Anita, Gollob, Kenneth J., Hernandez, Jonathan M., Hor, Jyh Liang, Ichise, Hiroshi, Jing, Zhixin, Jonigk, Danny, Kandov, Evelyn, Kastenmüller, Wolfgang, Koenig, Joshua F. E., Kothurkar, Aanandita, Kortekaas, Rosa K., Kreins, Alexandra Y., Lamborn, Ian T., Lin, Yuri, Morais, Katia Luciano Pereira, Lunich, Aleksandra, Luz, Jean C. S., MacDonald, Ryan B., Makranz, Chen, Maltez, Vivien I., McDonough, John E., Moriarty, Ryan V., Ocampo-Godinez, Juan M., Olyntho, Vitoria M., Oxenius, Annette, Padhan, Kartika, Remmert, Kirsten, Richoz, Nathan, Schrom, Edward C., Shang, Wanjing, Shi, Lihong, Shih, Rochelle M., Speranza, Emily, Stierli, Salome, Teichmann, Sarah A., Veres, Tibor Z., Vierhout, Megan, Wachter, Brianna T., Williams, Margaret, Zangger, Nathan, Germain, Ronald N., and Radtke, Andrea J.
- Subjects
Quantitative Biology - Tissues and Organs, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
The iterative bleaching extends multiplexity (IBEX) Knowledge-Base is a central portal for researchers adopting IBEX and related 2D and 3D immunofluorescence imaging methods. The design of the Knowledge-Base is modeled after efforts in the open-source software community and includes three facets: a development platform (GitHub), static website, and service for data archiving. The Knowledge-Base facilitates the practice of open science throughout the research life cycle by providing validation data for recommended and non-recommended reagents, e.g., primary and secondary antibodies. In addition to reporting negative data, the Knowledge-Base empowers method adoption and evolution by providing a venue for sharing protocols, videos, datasets, software, and publications. A dedicated discussion forum fosters a sense of community among researchers while addressing questions not covered in published manuscripts. Together, scientists from around the world are advancing scientific discovery at a faster pace, reducing wasted time and effort, and instilling greater confidence in the resulting data.
- Published
- 2024
117. RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion
- Author
-
Chu, Xiaomeng, Deng, Jiajun, You, Guoliang, Duan, Yifan, Li, Houqiang, and Zhang, Yanyong
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
We propose Radar-Camera fusion transformer (RaCFormer) to boost the accuracy of 3D object detection based on the following insight. The Radar-Camera fusion in outdoor 3D scene perception is capped by the image-to-BEV transformation--if the depth of pixels is not accurately estimated, the naive combination of BEV features actually integrates unaligned visual content. To avoid this problem, we propose a query-based framework that enables adaptive sampling of instance-relevant features from both the BEV and the original image view. Furthermore, we enhance system performance by two key designs: optimizing query initialization and strengthening the representational capacity of BEV. For the former, we introduce an adaptive circular distribution in polar coordinates to refine the initialization of object queries, allowing for a distance-based adjustment of query density. For the latter, we initially incorporate a radar-guided depth head to refine the transformation from image view to BEV. Subsequently, we focus on leveraging the Doppler effect of radar and introduce an implicit dynamic catcher to capture the temporal elements within the BEV. Extensive experiments on nuScenes and View-of-Delft (VoD) datasets validate the merits of our design. Remarkably, our method achieves superior results of 64.9% mAP and 70.2% NDS on nuScenes, even outperforming several LiDAR-based detectors. RaCFormer also secures the 1st ranking on the VoD dataset. The code will be released.
- Published
- 2024
118. if-ZKP: Intel FPGA-Based Acceleration of Zero Knowledge Proofs
- Author
-
Butt, Shahzad Ahmad, Reynolds, Benjamin, Ramamurthy, Veeraraghavan, Xiao, Xiao, Chu, Pohrong, Sharifian, Setareh, Gribok, Sergey, and Pasca, Bogdan
- Subjects
Computer Science - Hardware Architecture ,Computer Science - Cryptography and Security - Abstract
Zero-Knowledge Proofs (ZKPs) have emerged as an important cryptographic technique allowing one party (prover) to prove the correctness of a statement to some other party (verifier) and nothing else. ZKPs enable user privacy in many applications such as blockchains, digital voting, and machine learning. Traditionally, ZKPs suffered from poor scalability, but recently a sub-class of ZKPs known as Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zk-SNARKs) has addressed this challenge. They are getting significant attention and are being implemented by many public libraries. In this paper, we present a novel scalable architecture that is suitable for accelerating the zk-SNARK prover compute on FPGAs. We focus on the multi-scalar multiplication (MSM) that accounts for the majority of computation time spent in zk-SNARK systems. The MSM calculations rely extensively on modular arithmetic, so highly optimized Intel IP Libraries for modular arithmetic are used. The proposed architecture exploits the parallelism inherent to MSM and is implemented using the Intel OneAPI framework for FPGAs. Our implementation runs 110x-150x faster than a reference software library, uses a generic curve form in Jacobian coordinates, and is the first to report FPGA hardware acceleration results for the BLS12-381 and BN128 families of elliptic curves.
- Published
- 2024
119. Field-Resilient Supercurrent Diode in a Multiferroic Josephson Junction
- Author
-
Yang, Hung-Yu, Cuozzo, Joseph J., Bokka, Anand Johnson, Qiu, Gang, Eckberg, Christopher, Lyu, Yanfeng, Huyan, Shuyuan, Chu, Ching-Wu, Watanabe, Kenji, Taniguchi, Takashi, and Wang, Kang L.
- Subjects
Condensed Matter - Superconductivity ,Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
The research on supercurrent diodes has surged rapidly due to their potential applications in electronic circuits at cryogenic temperatures. To unlock this functionality, it is essential to find supercurrent diodes that can work consistently at zero magnetic field and under ubiquitous stray fields generated in electronic circuits. However, a supercurrent diode with robust field tolerance is currently lacking. Here, we demonstrate a field-resilient supercurrent diode by incorporating a multiferroic material into a Josephson junction. We first observed a pronounced supercurrent diode effect at zero magnetic field. More importantly, the supercurrent rectification persists over a wide and bipolar magnetic field range beyond industrial standards for field tolerance. By theoretically modeling a multiferroic Josephson junction, we unveil that the interplay between spin-orbit coupling and multiferroicity underlies the unusual field resilience of the observed diode effect. This work introduces multiferroic Josephson junctions as a new field-resilient superconducting device for cryogenic electronics., Comment: Preprint, 33 pages, 4 main figures, 10 extended data figures
- Published
- 2024
120. Region-Based Optimization in Continual Learning for Audio Deepfake Detection
- Author
-
Chen, Yujie, Yi, Jiangyan, Fan, Cunhang, Tao, Jianhua, Ren, Yong, Zeng, Siding, Zhang, Chu Yuan, Yan, Xinrui, Gu, Hao, Xue, Jun, Wang, Chenglong, Lv, Zhao, and Zhang, Xiaohui
- Subjects
Computer Science - Sound ,Computer Science - Artificial Intelligence ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Rapid advancements in speech synthesis and voice conversion bring convenience but also new security risks, creating an urgent need for effective audio deepfake detection. Although current models perform well, their effectiveness diminishes when confronted with the diverse and evolving nature of real-world deepfakes. To address this issue, we propose a continual learning method named Region-Based Optimization (RegO) for audio deepfake detection. Specifically, we use the Fisher information matrix to measure important neuron regions for real and fake audio detection, dividing them into four regions. First, we directly fine-tune the less important regions to quickly adapt to new tasks. Next, we apply gradient optimization in parallel for regions important only to real audio detection, and in orthogonal directions for regions important only to fake audio detection. For regions that are important to both, we use sample proportion-based adaptive gradient optimization. This region-adaptive optimization ensures an appropriate trade-off between memory stability and learning plasticity. Additionally, to address the increase of redundant neurons from old tasks, we further introduce the Ebbinghaus forgetting mechanism to release them, thereby promoting the capability of the model to learn more generalized discriminative features. Experimental results show our method achieves a 21.3% improvement in EER over the state-of-the-art continual learning approach RWM for audio deepfake detection. Moreover, the effectiveness of RegO extends beyond the audio deepfake detection domain, showing potential significance in other tasks, such as image recognition. The code is available at https://github.com/cyjie429/RegO, Comment: Accepted by AAAI 2025
- Published
- 2024
121. Probabilistic GOSPA: A Metric for Performance Evaluation of Multi-Object Filters with Uncertainties
- Author
-
Xia, Yuxuan, García-Fernández, Ángel F., Karlsson, Johan, Yuan, Ting, Chang, Kuo-Chu, and Svensson, Lennart
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
This correspondence presents a probabilistic generalization of the Generalized Optimal Sub-Pattern Assignment (GOSPA) metric, termed P-GOSPA. The GOSPA metric is widely used to evaluate the distance between finite sets, particularly in multi-object estimation applications. The P-GOSPA extends GOSPA into the space of multi-Bernoulli densities, incorporating inherent uncertainty in probabilistic multi-object representations. Additionally, P-GOSPA retains the interpretability of GOSPA, such as its decomposition into localization, missed detection, and false detection errors in a sound and meaningful manner. Examples and simulations are provided to demonstrate the efficacy of the proposed P-GOSPA metric.
- Published
- 2024
122. UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
- Author
-
Han, Yuning, Zhao, Bingyin, Chu, Rui, Luo, Feng, Sikdar, Biplab, and Lao, Yingjie
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Recent studies show that diffusion models (DMs) are vulnerable to backdoor attacks. Existing backdoor attacks impose unconcealed triggers (e.g., a gray box and eyeglasses) that contain evident patterns, which yields strong attack effects but makes the triggers easy to detect by human inspection and defensive algorithms. While it is possible to improve stealthiness by reducing the strength of the backdoor, doing so can significantly compromise its generality and effectiveness. In this paper, we propose UIBDiffusion, the universal imperceptible backdoor attack for diffusion models, which allows us to achieve superior attack and generation performance while evading state-of-the-art defenses. We propose a novel trigger generation approach based on universal adversarial perturbations (UAPs) and reveal that such perturbations, which were initially devised for fooling pre-trained discriminative models, can be adapted as potent imperceptible backdoor triggers for DMs. We evaluate UIBDiffusion on multiple types of DMs with different kinds of samplers across various datasets and targets. Experimental results demonstrate that UIBDiffusion brings three advantages: 1) Universality, the imperceptible trigger is universal (i.e., image and model agnostic), where a single trigger is effective on any image and on all diffusion models with different samplers; 2) Utility, it achieves comparable generation quality (e.g., FID) and even better attack success rate (i.e., ASR) at low poison rates compared to prior work; and 3) Undetectability, UIBDiffusion is plausible to human perception and can bypass Elijah and TERD, the SOTA defenses against backdoors for DMs. We will release our backdoor triggers and code.
- Published
- 2024
123. RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models
- Author
-
Wang, Yujin, Liu, Quanfeng, Fan, Jiaqi, Hong, Jinlong, Chu, Hongqing, Tian, Mengjian, Gao, Bingzhao, and Chen, Hong
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Understanding and addressing corner cases is essential for ensuring the safety and reliability of autonomous driving systems. Vision-Language Models (VLMs) play a crucial role in enhancing scenario comprehension, yet they face significant challenges, such as hallucination and insufficient real-world grounding, which compromise their performance in critical driving scenarios. In this work, we propose RAC3, a novel framework designed to improve VLMs' ability to handle corner cases effectively. The framework integrates Retrieval-Augmented Generation (RAG) to mitigate hallucination by dynamically incorporating context-specific external knowledge. A cornerstone of RAC3 is its cross-modal alignment fine-tuning, which utilizes contrastive learning to embed image-text pairs into a unified semantic space, enabling robust retrieval of similar scenarios. We evaluate RAC3 through extensive experiments using a curated dataset of corner case scenarios, demonstrating its ability to enhance semantic alignment, improve hallucination mitigation, and achieve superior performance metrics, such as Cosine Similarity and ROUGE-L scores. For example, for the LLaVA-v1.6-34B VLM, the cosine similarity between the generated text and the reference text has increased by 5.22%. The F1-score in ROUGE-L has increased by 39.91%, the Precision has increased by 55.80%, and the Recall has increased by 13.74%. This work underscores the potential of retrieval-augmented VLMs to advance the robustness and safety of autonomous driving in complex environments., Comment: 12 pages, 7 figures
- Published
- 2024
124. StyleDiT: A Unified Framework for Diverse Child and Partner Faces Synthesis with Style Latent Diffusion Transformer
- Author
-
Chiu, Pin-Yen, Wu, Dai-Jie, Chu, Po-Hsun, Hsu, Chia-Hsuan, Chiu, Hsiang-Chen, Wang, Chih-Yu, and Chen, Jun-Cheng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Kinship face synthesis is a challenging problem due to the scarcity and low quality of the available kinship data. Existing methods often struggle to generate descendants with both high diversity and fidelity while precisely controlling facial attributes such as age and gender. To address these issues, we propose the Style Latent Diffusion Transformer (StyleDiT), a novel framework that integrates the strengths of StyleGAN with the diffusion model to generate high-quality and diverse kinship faces. In this framework, the rich facial priors of StyleGAN enable fine-grained attribute control, while our conditional diffusion model is used to sample a StyleGAN latent aligned with the kinship relationship of conditioning images by utilizing the advantage of modeling complex kinship relationship distribution. StyleGAN then handles latent decoding for final face generation. Additionally, we introduce the Relational Trait Guidance (RTG) mechanism, enabling independent control of influencing conditions, such as each parent's facial image. RTG also enables a fine-grained adjustment between the diversity and fidelity in synthesized faces. Furthermore, we extend the application to an unexplored domain: predicting a partner's facial images using a child's image and one parent's image within the same framework. Extensive experiments demonstrate that our StyleDiT outperforms existing methods by striking an excellent balance between generating diverse and high-fidelity kinship faces.
- Published
- 2024
125. EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling
- Author
-
Xia, Songpengcheng, Zhang, Yu, Su, Zhuo, Zheng, Xiaozheng, Lv, Zheng, Wang, Guidong, Zhang, Yongjie, Wu, Qi, Chu, Lei, and Pei, Ling
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Estimating full-body motion using the tracking signals of head and hands from VR devices holds great potential for various applications. However, the sparsity and unique distribution of observations present a significant challenge, resulting in an ill-posed problem with multiple feasible solutions (i.e., hypotheses). This amplifies uncertainty and ambiguity in full-body motion estimation, especially for the lower-body joints. Therefore, we propose a new method, EnvPoser, that employs a two-stage framework to perform full-body motion estimation using sparse tracking signals and pre-scanned environment from VR devices. EnvPoser models the multi-hypothesis nature of human motion through an uncertainty-aware estimation module in the first stage. In the second stage, we refine these multi-hypothesis estimates by integrating semantic and geometric environmental constraints, ensuring that the final motion estimation aligns realistically with both the environmental context and physical interactions. Qualitative and quantitative experiments on two public datasets demonstrate that our method achieves state-of-the-art performance, highlighting significant improvements in human motion estimation within motion-environment interaction scenarios.
- Published
- 2024
126. Deterministic steady-state subradiance within a single-excitation basis
- Author
-
Chu, Meng-Jia, Ren, Jun, and Wang, Z. D.
- Subjects
Quantum Physics - Abstract
Subradiance shows promising applications in quantum information, yet its realization remains more challenging than superradiance due to the need to suppress various decay channels. This study introduces a state space within a single-excitation basis with perfect subradiance and genuine multipartite quantum entanglement resources for the all-to-all case. Utilizing the quantum jump operator method, we also provide an analytical derivation of the system's steady final state for any single-excitation initial state. Additionally, we determine the approximate final state in the quasi-all-to-all coupling scenario. As an illustrative example, we evaluate the coupling and dynamical properties of emitters in a photonic crystal slab possessing an ultra-high quality bound state in the continuum, thereby validating the efficacy of our theoretical approach. This theoretical framework facilitates the analytical prediction of dynamics for long-lived multipartite entanglement while elucidating a pathway toward realizing autonomous subradiance in atomic systems.
- Published
- 2024
127. Optimized Coordination Strategy for Multi-Aerospace Systems in Pick-and-Place Tasks By Deep Neural Network
- Author
-
Zhang, Ye, Chu, Linyue, Xu, Letian, Mo, Kangtong, Kang, Zhengjian, and Zhang, Xingyu
- Subjects
Computer Science - Robotics - Abstract
In this paper, we present an advanced strategy for the coordinated control of a multi-agent aerospace system, utilizing Deep Neural Networks (DNNs) within a reinforcement learning framework. Our approach centers on optimizing autonomous task assignment to enhance the system's operational efficiency in object relocation tasks, framed as an aerospace-oriented pick-and-place scenario. By modeling this coordination challenge within a MuJoCo environment, we employ a deep reinforcement learning algorithm to train a DNN-based policy to maximize task completion rates across the multi-agent system. The objective function is explicitly designed to maximize effective object transfer rates, leveraging neural network capabilities to handle complex state and action spaces in high-dimensional aerospace environments. Through extensive simulation, we benchmark the proposed method against a heuristic combinatorial approach rooted in game-theoretic principles, demonstrating a marked performance improvement, with the trained policy achieving up to 16% higher task efficiency. Experimental validation is conducted on a multi-agent hardware setup to substantiate the efficacy of our approach in a real-world aerospace scenario.
- Published
- 2024
128. Synthetic multi-dimensional Aharonov-Bohm cages in Fock state lattices
- Author
-
Zhang, Jiajian, Huang, Wenhui, Chu, Ji, Qiu, Jiawei, Sun, Xuandong, Tao, Ziyu, Zhang, Jiawei, Zhang, Libo, Zhou, Yuxuan, Chen, Yuanzhen, Liu, Yang, Liu, Song, Zhong, Youpeng, Miao, Jian-Jian, Niu, Jingjing, and Yu, Dapeng
- Subjects
Quantum Physics - Abstract
Fock-state lattices (FSLs), composed of photon number states with infinite Hilbert space, have emerged as a promising platform for simulating high-dimensional physics due to their potential to extend into arbitrarily high dimensions. Here, we demonstrate the construction of multi-dimensional FSLs using superconducting quantum circuits. By controlling artificial gauge fields within their internal structures, we investigate flux-induced extreme localization dynamics, such as Aharonov-Bohm caging, extending from 2D to 3D. We also explore the coherent interference of quantum superposition states, achieving extreme localization within specific subspaces assisted by quantum entanglement. Our findings pave the way for manipulating the behavior of a broad class of quantum states in higher-dimensional systems., Comment: 6+23 pages; 4+18 figures
- Published
- 2024
129. FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
- Author
-
Qiu, Haonan, Zhang, Shiwei, Wei, Yujie, Chu, Ruihang, Yuan, Hangjie, Wang, Xiang, Zhang, Yingya, and Liu, Ziwei
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Visual diffusion models have achieved remarkable progress, yet they are typically trained at limited resolutions due to the lack of high-resolution data and constrained computation resources, hampering their ability to generate high-fidelity images or videos at higher resolutions. Recent efforts have explored tuning-free strategies to exploit the untapped potential of pre-trained models for higher-resolution visual generation. However, these methods are still prone to producing low-quality visual content with repetitive patterns. The key obstacle lies in the inevitable increase in high-frequency information when the model generates visual content exceeding its training resolution, leading to undesirable repetitive patterns deriving from the accumulated errors. To tackle this challenge, we propose FreeScale, a tuning-free inference paradigm to enable higher-resolution visual generation via scale fusion. Specifically, FreeScale processes information from different receptive scales and then fuses it by extracting desired frequency components. Extensive experiments validate the superiority of our paradigm in extending the capabilities of higher-resolution visual generation for both image and video models. Notably, compared with the previous best-performing method, FreeScale unlocks the generation of 8k-resolution images for the first time., Comment: Project Page: http://haonanqiu.com/projects/FreeScale.html
- Published
- 2024
130. Emergent facilitation by random constraints in a facilitated random walk model of glass
- Author
-
Lam, Leo S. I., Deng, Hai-Yao, Zhang, Wei-Bing, Nwankwo, Udoka, Xiao, Chu, Yip, Cho-Tung, Lee, Chun-Shing, Ruan, Haihui, and Lam, Chi-Hang
- Subjects
Condensed Matter - Statistical Mechanics - Abstract
The physics of glass has been a significant topic of interest for decades. Dynamical facilitation is widely believed to be an important characteristic of glassy dynamics, but the precise mechanism is still under debate. We propose a lattice model of glass called the facilitated random walk (FRW). Each particle performs a continuous-time random walk in the presence of its own random local kinetic constraints. The particles do not interact energetically. Instead, they interact kinetically with a hopping rate resampling rule under which motions of a particle can randomly perturb the local kinetic constraints of other particles. This dynamic interaction is reversible, following a rate restoration rule. A step-by-step reversal of the particle motions exactly restores the previous constraints, modeling randomness quenched in the configuration space of glass. The model exhibits stretched exponential relaxation and dynamical heterogeneity typical of glasses. Despite the lack of an explicit facilitation rule, the FRW shows facilitation behaviors closely analogous to those of the kinetically constrained models (KCM). The FRW is a coarse-grained version of the distinguishable particle lattice model (DPLM), and this exemplifies that compatible defect and atomistic models can complement each other in the study of glass.
- Published
- 2024
131. Extending Structures for Rota-Baxter family Hom-associative Algebras
- Author
-
Wang, Junwen, Zhang, Yuanyuan, and Chu, Yanjun
- Subjects
Mathematics - Rings and Algebras ,Mathematics - Representation Theory ,16W99 - Abstract
In this paper, we first define extending datums and unified products of Rota-Baxter family Hom-associative algebras, and theoretically solve the extending structure problem. Moreover, we consider flag datums as an application, and give an example of the extending structure problem. Second, we introduce matched pairs of Rota-Baxter family Hom-associative algebras, and theoretically solve the factorization problem. Finally, we define deformation maps on a Rota-Baxter family Hom extending structure, and theoretically solve the classifying complements problem., Comment: 24 pages, comments are welcome! arXiv admin note: substantial text overlap with arXiv:2406.10992
- Published
- 2024
132. Digging into Intrinsic Contextual Information for High-fidelity 3D Point Cloud Completion
- Author
-
Chu, Jisheng, Li, Wenrui, Wang, Xingtao, Ning, Kanglin, Lu, Yidan, and Fan, Xiaopeng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The common occurrence of occlusion-induced incompleteness in point clouds has made point cloud completion (PCC) a highly-concerned task in the field of geometric processing. Existing PCC methods typically produce complete point clouds from partial point clouds in a coarse-to-fine paradigm, with the coarse stage generating entire shapes and the fine stage improving texture details. Though diffusion models have demonstrated effectiveness in the coarse stage, the fine stage still faces challenges in producing high-fidelity results due to the ill-posed nature of PCC. The intrinsic contextual information for texture details in partial point clouds is the key to solving the challenge. In this paper, we propose a high-fidelity PCC method that digs into both short- and long-range contextual information from the partial point cloud in the fine stage. Specifically, after generating the coarse point cloud via a diffusion-based coarse generator, a mixed sampling module introduces short-range contextual information from partial point clouds into the fine stage. A surface freezing module safeguards points from noise-free partial point clouds against disruption. As for the long-range contextual information, we design a similarity modeling module to derive similarity with rigid transformation invariance between points, conducting effective matching of geometric manifold features globally. In this way, the high-quality components present in the partial point cloud serve as valuable references for refining the coarse point cloud with high fidelity. Extensive experiments have demonstrated the superiority of the proposed method over SOTA competitors. Our code is available at https://github.com/JS-CHU/ContextualCompletion., Comment: Accepted to AAAI2025
- Published
- 2024
133. Accretion onto WD 2226$-$210, the central star of the Helix Nebula
- Author
-
Estrada-Dorado, S., Guerrero, M. A., Toalá, J. A., Maldonado, R. F., Lora, V., Vasquez-Torres, D. A., and Chu, Y. -H.
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
The central star of the Helix Nebula, WD 2226$-$210, presents enigmatic hard X-ray emission and mid-IR excess. The latter has been attributed to a dusty disk or a cloud-like structure around WD 2226$-$210 formed from material of Kuiper Belt-like or comet-like objects in highly eccentric orbits. We present here a detailed analysis of multi-epoch Chandra and XMM-Newton X-ray observations of WD 2226$-$210, comparing these to previous Einstein and ROSAT data. The luminosity of the hard X-ray component of WD 2226$-$210 has remained basically constant in the decade from 1992 to 2002, with very subtle evidence for variability on timescales of hours. Under the assumption that the X-ray emission from WD 2226$-$210 is due to accretion of material, an accretion rate of $\dot{M}\approx10^{-10}$ M$_\odot$ yr$^{-1}$ is estimated. The origin of the material accreted by WD 2226$-$210 is uncertain, and can be attributed to the disk-like structure around it or to a sub-stellar donor companion. The accretion rate proposed for the continuous replenishment by bombardment of the mid-IR-emitting structure around WD 2226$-$210 cannot match that required by the X-ray emission., Comment: 8 pages, 3 figures, 3 tables; accepted to MNRAS
- Published
- 2024
134. OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
- Author
-
Ouyang, Linke, Qu, Yuan, Zhou, Hongbin, Zhu, Jiawei, Zhang, Rui, Lin, Qunshu, Wang, Bin, Zhao, Zhiyuan, Jiang, Man, Zhao, Xiaomeng, Shi, Jin, Wu, Fan, Chu, Pei, Liu, Minghao, Li, Zhenxiang, Xu, Chao, Zhang, Bo, Shi, Botian, Tu, Zhongying, and He, Conghui
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Information Retrieval - Abstract
Document content extraction is crucial in computer vision, especially for meeting the high-quality data needs of large language models (LLMs) and retrieval-augmented generation (RAG) technologies. However, current document parsing methods suffer from significant limitations in terms of diversity and comprehensive evaluation. To address these challenges, we introduce OmniDocBench, a novel multi-source benchmark designed to advance automated document content extraction. OmniDocBench includes a meticulously curated and annotated high-quality evaluation dataset comprising nine diverse document types, such as academic papers, textbooks, and slides, among others. Our benchmark provides a flexible and comprehensive evaluation framework with 19 layout category labels and 14 attribute labels, enabling multi-level assessments across entire datasets, individual modules, or specific data types. Using OmniDocBench, we perform an exhaustive comparative analysis of existing modular pipelines and multimodal end-to-end methods, highlighting their limitations in handling document diversity and ensuring fair evaluation. OmniDocBench establishes a robust, diverse, and fair evaluation standard for the document content extraction field, offering crucial insights for future advancements and fostering the development of document parsing technologies. The code and dataset are available at https://github.com/opendatalab/OmniDocBench.
- Published
- 2024
135. Hallucination Elimination and Semantic Enhancement Framework for Vision-Language Models in Traffic Scenarios
- Author
-
Fan, Jiaqi, Wu, Jianhua, Chu, Hongqing, Ge, Quanbo, and Gao, Bingzhao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Large vision-language models (LVLMs) have demonstrated remarkable capabilities in multimodal understanding and generation tasks. However, these models occasionally generate hallucinatory texts, resulting in descriptions that seem reasonable but do not correspond to the image. This phenomenon can lead to wrong driving decisions by the autonomous driving system. To address this challenge, this paper proposes HCOENet, a plug-and-play chain-of-thought correction method designed to eliminate object hallucinations and generate enhanced descriptions for critical objects overlooked in the initial response. Specifically, HCOENet employs a cross-checking mechanism to filter entities and directly extracts critical objects from the given image, enriching the descriptive text. Experimental results on the POPE benchmark demonstrate that HCOENet improves the F1-score of the Mini-InternVL-4B and mPLUG-Owl3 models by 12.58% and 4.28%, respectively. Additionally, qualitative results using images collected in an open campus scene further highlight the practical applicability of the proposed method. Compared with the GPT-4o model, HCOENet achieves comparable descriptive performance while significantly reducing costs. Finally, two novel semantic understanding datasets, CODA_desc and nuScenes_desc, are created for traffic scenarios to support future research. The code and datasets are publicly available at https://github.com/fjq-tongji/HCOENet.
- Published
- 2024
136. Next generation Co-Packaged Optics Technology to Train & Run Generative AI Models in Data Centers and Other Computing Applications
- Author
-
Knickerbocker, John, Heroux, Jean Benoit, Bonilla, Griselda, Hsu, Hsiang, Liu, Neng, Ramos, Adrian Paz, Arguin, Francois, Tribodeau, Yan, Terjani, Badr, Schultz, Mark, Ganti, Raghu Kiran, Chu, Linsong, Marushima, Chinami, Taira, Yoichi, Kohara, Sayuri, Horibe, Akihiro, Mori, Hiroyuki, and Numata, Hidetoshi
- Subjects
Physics - Optics ,Condensed Matter - Materials Science - Abstract
We report on the successful design and fabrication of optical modules using a 50 micron pitch polymer waveguide interface, integrated for low loss, high density optical data transfer with very low space requirements on a Si photonics die. This prototype module meets JEDEC reliability standards and promises to increase the number of optical fibers that can be connected at the edge of a chip, a measure known as beachfront density, by six times compared to state of the art technology. Scalability of the polymer waveguide to less than 20 micron pitch stands to improve the bandwidth density upwards of 10 Tbps/mm.
- Published
- 2024
137. FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
- Author
-
Chu, Beilin, Xu, Xuan, Wang, Xin, Zhang, Yufei, You, Weike, and Zhou, Linna
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The rapid advancement of diffusion models has significantly improved high-quality image generation, making generated content increasingly challenging to distinguish from real images and raising concerns about potential misuse. In this paper, we observe that diffusion models struggle to accurately reconstruct mid-band frequency information in real images, suggesting the limitation could serve as a cue for detecting diffusion model generated images. Motivated by this observation, we propose a novel method called Frequency-guided Reconstruction Error (FIRE), which, to the best of our knowledge, is the first to investigate the influence of frequency decomposition on reconstruction error. FIRE assesses the variation in reconstruction error before and after the frequency decomposition, offering a robust method for identifying diffusion model generated images. Extensive experiments show that FIRE generalizes effectively to unseen diffusion models and maintains robustness against diverse perturbations., Comment: 14 pages, 14 figures
- Published
- 2024
138. Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models
- Author
-
Teng, Ma, Xiaojun, Jia, Ranjie, Duan, Xinfeng, Li, Yihao, Huang, Zhixuan, Chu, Yang, Liu, and Wenqi, Ren
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence - Abstract
With the rapid advancement of multimodal large language models (MLLMs), concerns regarding their security have increasingly captured the attention of both academia and industry. Although MLLMs are vulnerable to jailbreak attacks, designing effective multimodal jailbreak attacks poses unique challenges, especially given the distinct protective measures implemented across various modalities in commercial models. Previous works concentrate risks into a single modality, resulting in limited jailbreak performance. In this paper, we propose a heuristic-induced multimodal risk distribution jailbreak attack method, called HIMRD, which consists of two elements: multimodal risk distribution strategy and heuristic-induced search strategy. The multimodal risk distribution strategy is used to segment harmful instructions across multiple modalities to effectively circumvent MLLMs' security protection. The heuristic-induced search strategy identifies two types of prompts: the understanding-enhancing prompt, which helps the MLLM reconstruct the malicious prompt, and the inducing prompt, which increases the likelihood of affirmative outputs over refusals, enabling a successful jailbreak attack. Extensive experiments demonstrate that this approach effectively uncovers vulnerabilities in MLLMs, achieving an average attack success rate of 90% across seven popular open-source MLLMs and an average attack success rate of around 68% in three popular closed-source MLLMs. Our code will be released soon. Warning: This paper contains offensive and harmful examples; reader discretion is advised.
- Published
- 2024
139. f-P vs P-f based Grid-forming Control under RoCoF Event Considering Power and Energy Limits
- Author
-
Sun, Chu
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
The grid-forming (GFM) converter is regarded as one enabler of high penetration of renewable energy resources in power systems. However, as this letter points out, conventional power-to-frequency (P-f) GFM control faces a dilemma between keeping the power limit and maintaining grid synchronization when the energy resource of the converter reaches its limit. To address this challenge, an f-P and Q-V hybrid control is proposed, which exhibits similar GFM performance, particularly under weak-grid conditions, but is superior in power limiting and grid synchronization, as demonstrated by comparative studies.
- Published
- 2024
140. Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
- Author
-
Chen, Zhe, Wang, Weiyun, Cao, Yue, Liu, Yangzhou, Gao, Zhangwei, Cui, Erfei, Zhu, Jinguo, Ye, Shenglong, Tian, Hao, Liu, Zhaoyang, Gu, Lixin, Wang, Xuehui, Li, Qingyun, Ren, Yimin, Chen, Zixuan, Luo, Jiapeng, Wang, Jiahao, Jiang, Tan, Wang, Bo, He, Conghui, Shi, Botian, Zhang, Xingcheng, Lv, Han, Wang, Yi, Shao, Wenqi, Chu, Pei, Tu, Zhongying, He, Tong, Wu, Zhiyong, Deng, Huipeng, Ge, Jiaye, Chen, Kai, Zhang, Kaipeng, Wang, Limin, Dou, Min, Lu, Lewei, Zhu, Xizhou, Lu, Tong, Lin, Dahua, Qiao, Yu, Dai, Jifeng, and Wang, Wenhai
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
We introduce InternVL 2.5, an advanced multimodal large language model (MLLM) series that builds upon InternVL 2.0, maintaining its core model architecture while introducing significant enhancements in training and testing strategies as well as data quality. In this work, we delve into the relationship between model scaling and performance, systematically exploring the performance trends in vision encoders, language models, dataset sizes, and test-time configurations. Through extensive evaluations on a wide range of benchmarks, including multi-discipline reasoning, document understanding, multi-image / video understanding, real-world comprehension, multimodal hallucination detection, visual grounding, multilingual capabilities, and pure language processing, InternVL 2.5 exhibits competitive performance, rivaling leading commercial models such as GPT-4o and Claude-3.5-Sonnet. Notably, our model is the first open-source MLLM to surpass 70% on the MMMU benchmark, achieving a 3.7-point improvement through Chain-of-Thought (CoT) reasoning and showcasing strong potential for test-time scaling. We hope this model contributes to the open-source community by setting new standards for developing and applying multimodal AI systems. A HuggingFace demo is available at https://huggingface.co/spaces/OpenGVLab/InternVL, Comment: Technical Report
- Published
- 2024
141. Infinite Grassmann time-evolving matrix product operators for non-equilibrium quantum impurity problems
- Author
-
Sun, Zhijie, Chen, Ruofan, Li, Zhenyu, and Guo, Chu
- Subjects
Condensed Matter - Strongly Correlated Electrons ,Quantum Physics - Abstract
An emergent numerical approach to solving quantum impurity problems is to encode the impurity path integral as a matrix product state. For time-dependent problems, the cost of this approach generally scales with the evolution time. Here we consider a common non-equilibrium scenario where an impurity, initially in equilibrium with a thermal bath, is driven out of equilibrium by a time-dependent force term. Although the problem has no time-translational invariance, we show that we can still make full use of the infinite matrix product state technique, resulting in a method whose cost is essentially independent of the evolution time. We demonstrate the effectiveness of this method in the integrable case against exact diagonalization, and against existing calculations on the L-shaped Kadanoff-Baym contour in the general case. Our method could be very competitive for studying long-time non-equilibrium quantum dynamics, and could potentially be used as an efficient impurity solver in non-equilibrium dynamical mean field theory., Comment: 10 pages, 8 figures
- Published
- 2024
142. Measuring the Hubble constant through the galaxy pairwise peculiar velocity
- Author
-
Zhang, Wangzheng, Chu, Ming-chung, Liao, Shihong, Yeung, Shek, and Hu, Hui-Jie
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
The Hubble constant $H_0$, the current expansion rate of the universe, is one of the most important parameters in cosmology. The cosmic expansion regulates the mutually approaching motion of a pair of celestial objects due to their gravity. Therefore, the mean pairwise peculiar velocity of celestial objects, which quantifies their relative motion, is sensitive to both $H_0$ and the dimensionless total matter density $\Omega_m$. Based on this, using the Cosmicflows-4 data, we measured $H_0$ for the first time via the galaxy pairwise velocity in the nonlinear and quasi-linear range. Our results yield $H_0=75.5\pm1.4$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m=0.311^{+0.029}_{-0.028}$. The uncertainties of $H_0$ and $\Omega_m$ can be improved to around 0.6% and 2%, respectively, if the statistical errors become negligible in the future., Comment: 10 pages, 3 main + 2 appendix figures, accepted for publication in ApJL
- Published
- 2024
143. CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance
- Author
-
Thwal, Chu Myaet, Tun, Ye Lin, Nguyen, Minh N. H., Huh, Eui-Nam, and Hong, Choong Seon
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Beyond the success of Contrastive Language-Image Pre-training (CLIP), recent trends mark a shift toward exploring the applicability of lightweight vision-language models for resource-constrained scenarios. These models often deliver suboptimal performance when relying solely on a single image-text contrastive learning objective, spotlighting the need for more effective training mechanisms that guarantee robust cross-modal feature alignment. In this work, we propose CLIP-PING: Contrastive Language-Image Pre-training with Proximus Intrinsic Neighbors Guidance, a simple and efficient training paradigm designed to boost the performance of lightweight vision-language models with minimal computational overhead and lower data demands. CLIP-PING bootstraps unimodal features extracted from arbitrary pre-trained encoders to obtain intrinsic guidance of proximus neighbor samples, i.e., nearest-neighbor (NN) and cross nearest-neighbor (XNN). We find that extra contrastive supervision from these neighbors substantially boosts cross-modal alignment, enabling lightweight models to learn more generic features with rich semantic diversity. Extensive experiments reveal that CLIP-PING notably surpasses its peers in zero-shot generalization and cross-modal retrieval tasks. Specifically, it achieves a 5.5% gain on zero-shot ImageNet1K and gains of 10.7% (I2T) and 5.7% (T2I) on Flickr30K over the original CLIP when using a ViT-XS image encoder trained on 3 million (image, text) pairs. Moreover, CLIP-PING showcases strong transferability under the linear evaluation protocol across several downstream tasks., Comment: 15 pages, 4 figures, 20 tables
- Published
- 2024
144. Quantum Theory of X-ray Photon Correlation Spectroscopy
- Author
-
Siriviboon, Phum, Fu, Chu-Liang, Landry, Michael, Okabe, Ryotaro, Carrizales, Denisse Córdova, Wang, Yao, and Li, Mingda
- Subjects
Condensed Matter - Materials Science - Abstract
Characterizing quantum materials is essential for understanding their microscopic interactions and advancing quantum technology. X-ray photon correlation spectroscopy (XPCS) with coherent X-ray sources offers access to higher-order correlations, but its theoretical basis, the Siegert relation, is derived from dynamical light scattering with independent classical scatterers, and its validity for XPCS remains unexamined. Here we present a microscopic quantum theory of XPCS derived from electron-photon interaction Hamiltonians, introducing four configurations tied to distinct fourth-order electron-density correlation functions. We examine the validity of the Siegert relation and derive a generalized Siegert relation. Notably, the Siegert relation breaks down even in a non-interacting Fermi gas due to exchange interactions. Furthermore, density matrix renormalization group calculations on a 1D Kitaev chain reveal oscillatory signatures that can distinguish topologically trivial phases from topological phases with Majorana zero modes. Our work provides a robust theoretical foundation for XPCS and highlights the value of higher-order correlations in advanced X-ray and neutron sources for probing quantum materials., Comment: 16 pages, 5 figures
- Published
- 2024
145. Diverse methods and practical aspects in controlling single semiconductor qubits: a review
- Author
-
Peng, Jia-Ao, Qiu, Chu-Dan, Ma, Wen-Long, and Luo, Jun-Wei
- Subjects
Quantum Physics - Abstract
Quantum control allows a wide range of quantum operations employed in molecular physics, nuclear magnetic resonance and quantum information processing. Thanks to the existing microelectronics industry, semiconducting qubits, where quantum information is encoded in the spin or charge degrees of freedom of electrons or nuclei in semiconductor quantum dots, constitute a highly competitive candidate for scalable solid-state quantum technologies. In quantum information processing, advanced control techniques are needed to realize quantum manipulations with both high precision and noise resilience. In this review, we first introduce the basics of various widely-used control methods, including resonant excitation, adiabatic passage, shortcuts to adiabaticity, composite pulses, and quantum optimal control. Then we review the practical aspects of applying these methods to realize accurate and robust quantum gates for single semiconductor qubits, such as the Loss-DiVincenzo spin qubit, singlet-triplet qubit, exchange-only qubit and charge qubit.
- Published
- 2024
146. Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis
- Author
-
Jin, Siyoon, Nam, Jisu, Kim, Jiyoung, Chung, Dahyun, Kim, Yeong-Seok, Park, Joonhyung, Chu, Heonjeong, and Kim, Seungryong
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Exemplar-based semantic image synthesis aims to generate images aligned with given semantic content while preserving the appearance of an exemplar image. Conventional structure-guidance models, such as ControlNet, are limited in that they cannot directly utilize exemplar images as input, relying instead solely on text prompts to control appearance. Recent tuning-free approaches address this limitation by transferring local appearance from the exemplar image to the synthesized image through implicit cross-image matching in the augmented self-attention mechanism of pre-trained diffusion models. However, these methods face challenges when applied to content-rich scenes with significant geometric deformations, such as driving scenes. In this paper, we propose the Appearance Matching Adapter (AM-Adapter), a learnable framework that enhances cross-image matching within augmented self-attention by incorporating semantic information from segmentation maps. To effectively disentangle generation and matching processes, we adopt a stage-wise training approach. Initially, we train the structure-guidance and generation networks, followed by training the AM-Adapter while keeping the other networks frozen. During inference, we introduce an automated exemplar retrieval method to efficiently select exemplar image-segmentation pairs. Despite utilizing a limited number of learnable parameters, our method achieves state-of-the-art performance, excelling in both semantic alignment preservation and local appearance fidelity. Extensive ablation studies further validate our design choices. Code and pre-trained weights will be publicly available at https://cvlab-kaist.github.io/AM-Adapter/
- Published
- 2024
147. Fixed-Term Decompositions Using Even-Indexed Fibonacci Numbers
- Author
-
Chu, Hung Viet, Kanji, Aney Manish, and Vasseur, Zachary Louis
- Subjects
Mathematics - General Mathematics ,11B39 - Abstract
As a variant of Zeckendorf's theorem, Chung and Graham proved that every positive integer can be uniquely decomposed into a sum of even-indexed Fibonacci numbers, whose coefficients are either $0, 1$, or $2$ so that between two coefficients $2$, there must be a coefficient $0$. This paper characterizes all positive integers that do not have $F_{2k}$ ($k\ge 1$) in their decompositions. This continues the work of Kimberling, Carlitz et al., Dekking, and Griffiths, to name a few, who studied such a characterization for Zeckendorf decomposition., Comment: 13 pages
- Published
- 2024
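The Chung-Graham decomposition described in record 147 can be illustrated with a small brute-force sketch. It is not taken from the paper: the function names (`even_fibs`, `is_valid`, `decompositions`) are hypothetical, and the exhaustive search over coefficient tuples is chosen only because it is easy to check by hand, assuming the convention $F_1 = F_2 = 1$.

```python
from itertools import product


def even_fibs(k):
    """Return [F_2, F_4, ..., F_{2k}], assuming F_1 = F_2 = 1."""
    fibs = [1, 1]  # F_1, F_2
    while len(fibs) < 2 * k:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[1::2]  # every second entry, starting at F_2


def is_valid(coeffs):
    """True if a coefficient 0 lies between any two coefficients equal to 2."""
    two_since_last_zero = False
    for c in coeffs:
        if c == 2:
            if two_since_last_zero:
                return False  # two 2s with no 0 between them
            two_since_last_zero = True
        elif c == 0:
            two_since_last_zero = False
    return True


def decompositions(n, k):
    """All valid coefficient tuples (for F_2, ..., F_{2k}) that sum to n."""
    values = even_fibs(k)
    return [
        coeffs
        for coeffs in product((0, 1, 2), repeat=k)
        if is_valid(coeffs)
        and sum(c * v for c, v in zip(coeffs, values)) == n
    ]


if __name__ == "__main__":
    # 18 = 2*F_2 + 0*F_4 + 2*F_6 = 2 + 0 + 16
    print(decompositions(18, 3))
```

With the three terms $F_2 = 1$, $F_4 = 3$, $F_6 = 8$, every integer from 0 to 20 has exactly one valid coefficient tuple, which matches the uniqueness claim in the abstract for this small range. A tuple with coefficient 0 on a given $F_{2k}$ is the kind of integer the paper characterizes.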
148. HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset
- Author
-
Chu, Zedong, Xiong, Feng, Liu, Meiduo, Zhang, Jinzhi, Shao, Mingqi, Sun, Zhaoxu, Wang, Di, and Xu, Mu
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
With the rapid evolution of 3D generation algorithms, the cost of producing 3D humanoid character models has plummeted, yet the field is impeded by the lack of a comprehensive dataset for automatic rigging, which is a pivotal step in character animation. Addressing this gap, we present HumanRig, the first large-scale dataset specifically designed for 3D humanoid character rigging, encompassing 11,434 meticulously curated T-posed meshes adhering to a uniform skeleton topology. Capitalizing on this dataset, we introduce an innovative, data-driven automatic rigging framework, which overcomes the limitations of GNN-based methods in handling complex AI-generated meshes. Our approach integrates a Prior-Guided Skeleton Estimator (PGSE) module, which uses 2D skeleton joints to provide a preliminary 3D skeleton, and a Mesh-Skeleton Mutual Attention Network (MSMAN) that fuses skeleton features with 3D mesh features extracted by a U-shaped point transformer. This enables a coarse-to-fine 3D skeleton joint regression and a robust skinning estimation, surpassing previous methods in quality and versatility. This work not only remedies the dataset deficiency in rigging research but also propels the animation industry towards more efficient and automated character rigging pipelines., Comment: Website: https://github.com/c8241998/HumanRig
- Published
- 2024
149. X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models
- Author
-
Sun, Zeyi, Chu, Ziyang, Zhang, Pan, Wu, Tong, Dong, Xiaoyi, Zang, Yuhang, Xiong, Yuanjun, Lin, Dahua, and Wang, Jiaqi
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Computer Science - Multimedia - Abstract
In-context generation is a key component of large language models' (LLMs) open-task generalization capability. By leveraging a few examples as context, LLMs can perform both in-domain and out-of-domain tasks. Recent advancements in auto-regressive vision-language models (VLMs) built upon LLMs have showcased impressive performance in text-to-image generation. However, the potential of in-context learning for general image generation tasks remains largely unexplored. To address this, we introduce X-Prompt, a purely auto-regressive large-vision language model designed to deliver competitive performance across a wide range of both seen and unseen image generation tasks, all within a unified in-context learning framework. X-Prompt incorporates a specialized design that efficiently compresses valuable features from in-context examples, supporting longer in-context token sequences and improving its ability to generalize to unseen tasks. A unified training task for both text and image prediction enables X-Prompt to handle general image generation with enhanced task awareness from in-context examples. Extensive experiments validate the model's performance across diverse seen image generation tasks and its capacity to generalize to previously unseen tasks., Comment: code: https://github.com/SunzeY/X-Prompt
- Published
- 2024
150. SEAL: Semantic Attention Learning for Long Video Representation
- Author
-
Wang, Lan, Chen, Yujia, Tran, Du, Boddeti, Vishnu Naresh, and Chu, Wen-Sheng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Long video understanding presents challenges due to the inherent high computational complexity and redundant temporal information. An effective representation for long videos must process such redundancy efficiently while preserving essential contents for downstream tasks. This paper introduces SEmantic Attention Learning (SEAL), a novel unified representation for long videos. To reduce computational complexity, long videos are decomposed into three distinct types of semantic entities: scenes, objects, and actions, allowing models to operate on a handful of entities rather than a large number of frames or pixels. To further address redundancy, we propose an attention learning module that balances token relevance with diversity formulated as a subset selection optimization problem. Our representation is versatile, enabling applications across various long video understanding tasks. Extensive experiments show that SEAL significantly outperforms state-of-the-art methods in video question answering and temporal grounding tasks and benchmarks including LVBench, MovieChat-1K, and Ego4D.
- Published
- 2024
Discovery Service for Jio Institute Digital Library