7,210 results on '"Gou, P."'
Search Results
2. Influence of the Deformation of Coronal Mass Ejections on Their In-Situ Fitting with Circular-Cross-Section Flux Rope Models
- Author
-
Zhuang, Bin, Lugaz, Noé, Al-Haddad, Nada, Farrugia, Charles J., Amerstorfer, Ute, Davies, Emma E., Temmer, Manuela, Rüdisser, Hannah T., Yu, Wenyuan, Gou, Tingyu, and Winslow, Réka M.
- Subjects
Physics - Space Physics, Astrophysics - Solar and Stellar Astrophysics
- Abstract
Understanding the properties, especially the magnetohydrodynamic (MHD) invariants, of coronal mass ejections (CMEs) measured in-situ is key to bridging the CME properties from the Sun to interplanetary space. In order to investigate CMEs from the in-situ measurements that provide a one-dimensional (1-D) cut of the CME parameters over the spacecraft trajectory, various magnetic flux rope (MFR) models have been developed, among which the models with a circular cross-section are the most popular and widely used. CMEs are found to be deformed during their propagation in interplanetary space, in which the cross-section may be flattened in the direction of propagation, i.e., to develop an elliptical or even pancake-like shape. We use numerical MHD simulations in 2.5-D to investigate the influence of the CME deformation on the in-situ fitting using two linear force-free MFR models with a circular cross-section, and we focus on the axial and poloidal magnetic fluxes, which are conserved in the ideal MHD frame and simulations. We quantitatively compare the fitted axial and poloidal fluxes with those in simulations. We find that both models underestimate the axial flux compared to that in simulations, and such underestimation depends on the CME deformation. However, the fitting of the poloidal flux is independent of the deformation. We discuss the reasons for the axial flux underestimation and the implication of the CME deformation for the CME in-situ fitting., Comment: Accepted by Solar Physics
- Published
- 2025
3. ProAPO: Progressively Automatic Prompt Optimization for Visual Classification
- Author
-
Qu, Xiangyan, Gou, Gaopeng, Zhuang, Jiamin, Yu, Jing, Song, Kun, Wang, Qihao, Li, Yili, and Xiong, Gang
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Vision-language models (VLMs) have made significant progress in image classification by training with large-scale paired image-text data. Their performances largely depend on the prompt quality. While recent methods show that visual descriptions generated by large language models (LLMs) enhance the generalization of VLMs, class-specific prompts may be inaccurate or lack discrimination due to the hallucination in LLMs. In this paper, we aim to find visually discriminative prompts for fine-grained categories with minimal supervision and no human-in-the-loop. An evolution-based algorithm is proposed to progressively optimize language prompts from task-specific templates to class-specific descriptions. Unlike optimizing templates, the search space shows an explosion in class-specific candidate prompts. This increases prompt generation costs, iterative times, and the overfitting problem. To this end, we first introduce several simple yet effective edit-based and evolution-based operations to generate diverse candidate prompts by one-time query of LLMs. Then, two sampling strategies are proposed to find a better initial search point and reduce traversed categories, saving iteration costs. Moreover, we apply a novel fitness score with entropy constraints to mitigate overfitting. In a challenging one-shot image classification setting, our method outperforms existing textual prompt-based methods and improves LLM-generated description methods across 13 datasets. Meanwhile, we demonstrate that our optimal prompts improve adapter-based methods and transfer effectively across different backbones., Comment: Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
- Published
- 2025
4. Nonlinear bound states with prescribed angular momentum in the mass supercritical regime
- Author
-
Gou, Tianxiang and Shen, Xiaoan
- Subjects
Mathematics - Analysis of PDEs, 35A15, 35Q41, 35R35
- Abstract
In this paper, we consider the existence, orbital stability/instability and regularity of bound state solutions to nonlinear Schrödinger equations with super-quadratic confinement in two and three spatial dimensions for the mass supercritical case. Such solutions, which are given by time-dependent rotations of a non-radially symmetric spatial profile, correspond to critical points of the underlying energy function restricted to the double constraint consisting of the mass and the angular momentum. The study exhibits new pictures for rotating Bose-Einstein condensates within the framework of Gross-Pitaevskii theory. It is proved that there exist two non-radially symmetric solutions, one of which is a local minimizer and the other a mountain pass type critical point of the underlying energy function restricted to the constraints. Moreover, we derive conditions that guarantee that local minimizers are regular, that the set of local minimizers is orbitally stable, and that mountain pass type solutions are strongly unstable. The results extend and complement the recent ones in \cite{NSS}, where the consideration is undertaken in the mass subcritical case., Comment: 17 pages
- Published
- 2025
5. Ultra-high-energy $\gamma$-ray emission associated with the tail of a bow-shock pulsar wind nebula
- Author
-
Cao, Zhen, Aharonian, F., Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Bian, W., Bukevich, A. V., Cai, C. M., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, H. X., Chen, Liang, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, X. B., Chen, X. J., Chen, Y., Cheng, N., Cheng, Y. D., Chu, M. C., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Diao, Y. X., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, J. H., Fang, K., Feng, C. F., Feng, H., Feng, L., Feng, S. H., Feng, X. T., Feng, Y., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Ge, T. T., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, J., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., Hannuksela, O. A., Hasan, M., He, H. H., He, H. N., He, J. Y., He, X. Y., He, Y., Hernández-Cadena, S., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, S. C., Huang, C., Huang, D. H., Huang, J. J., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Y. Y., Ji, X. L., Jia, H. Y., Jia, K., Jiang, H. B., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kaci, S., Kang, M. M., Karpikov, I., Khangulyan, D., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, Jian, Li, Jie, Li, K., Li, L., Li, R. L., Li, S. D., Li, T. Y., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, D. B., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. R., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, X., Liu, Y., Liu, Y. N., Lou, Y. Q., Luo, Q., Luo, Y., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mou, G. B., Mu, H. J., Nan, Y. C., Neronov, A., Ng, K. C. Y., Ni, M. Y., Nie, L., Ou, L. J., Pattarakijwanich, P., Pei, Z. Y., Qi, J. C., Qi, M. Y., Qin, J. J., Raza, A., Ren, C. Y., Ruffolo, D., Sáiz, A., Saeed, M., Semikoz, D., Shao, L., Shchegolev, O., Shen, Y. Z., Sheng, X. D., Shi, Z. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, D. X., Sun, Q. N., Sun, X. N., Sun, Z. B., Takata, J., Tam, P. H. T., Tan, H. B., Tang, Q. W., Tang, R., Tang, Z. B., Tian, W. W., Tong, C. N., Wan, L. H., Wang, C., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, Kai, Wang, L. P., Wang, L. Y., Wang, R., Wang, W., Wang, X. G., Wang, X. J., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Z. H., Wang, Z. X., Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Weng, S. S., Wu, C. Y., Wu, H. R., Wu, Q. W., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, Y. L., Xing, Y., Xiong, D. R., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, C. Y., Yang, F. F., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, W. X., Yao, Y. H., Yao, Z. G., Ye, X. A., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, B. T., Zhang, F., Zhang, H., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. R., Zhang, S. S., Zhang, W. Y., Zhang, X., Zhang, X. P., Zhang, Yi, Zhang, Yong, Zhang, Z. P., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zhao, X. H., Zhao, Z. H., Zheng, F., Zhong, W. J., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, B. Y., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena, High Energy Physics - Phenomenology
- Abstract
In this study, we present a comprehensive analysis of an unidentified point-like ultra-high-energy (UHE) $\gamma$-ray source, designated as 1LHAASO J1740+0948u, situated in the vicinity of the middle-aged pulsar PSR J1740+1000. The detection significance reached 17.1$\sigma$ (9.4$\sigma$) above 25$\,$TeV (100$\,$TeV). The source energy spectrum extended up to 300$\,$TeV and was well fitted by a log-parabola function with $N_0 = (1.93\pm0.23) \times 10^{-16} \rm{TeV^{-1}\,cm^{-2}\,s^{-1}}$, $\alpha = 2.14\pm0.27$, and $\beta = 1.20\pm0.41$ at $E_0 = 30\,$TeV. The associated pulsar, PSR J1740+1000, resides at a high galactic latitude and powers a bow-shock pulsar wind nebula (BSPWN) with an extended X-ray tail. The best-fit position of the $\gamma$-ray source appeared to be shifted by $0.2^{\circ}$ with respect to the pulsar position. As (i) the currently identified pulsar halos do not demonstrate such offsets, and (ii) the centroid of the $\gamma$-ray emission is approximately located at the extension of the X-ray tail, we speculate that the UHE $\gamma$-ray emission may originate from re-accelerated electron/positron pairs that are advected away in the bow-shock tail., Comment: Corrected spelling errors in several author names
- Published
- 2025
6. Corrupted but Not Broken: Rethinking the Impact of Corrupted Data in Visual Instruction Tuning
- Author
-
Gou, Yunhao, Yang, Hansi, Liu, Zhili, Chen, Kai, Zeng, Yihan, Hong, Lanqing, Li, Zhenguo, Liu, Qun, Kwok, James T., and Zhang, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Visual Instruction Tuning (VIT) enhances Multimodal Large Language Models (MLLMs) but is hindered by corrupted datasets containing hallucinated content, incorrect responses, and poor OCR quality. While prior works focus on dataset refinement through high-quality data collection or rule-based filtering, these approaches are costly or limited to specific types of corruption. To understand in depth how corrupted data affects MLLMs, we systematically investigate this issue and find that, while corrupted data degrades the performance of MLLMs, its effects are largely superficial: the performance of MLLMs can be largely restored by either disabling a small subset of parameters or post-training with a small amount of clean data. Additionally, corrupted MLLMs exhibit improved ability to distinguish clean samples from corrupted ones, enabling dataset cleaning without external help. Based on these insights, we propose a corruption-robust training paradigm combining self-validation and post-training, which significantly outperforms existing corruption mitigation strategies.
- Published
- 2025
7. Local Flaw Detection with Adaptive Pyramid Image Fusion Across Spatial Sampling Resolution for SWRs
- Author
-
You, Siyu, Gou, Huayi, Yang, Leilei, Liu, Zhiliang, and Zuo, Mingjian
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Electrical Engineering and Systems Science - Signal Processing
- Abstract
The inspection of local flaws (LFs) in Steel Wire Ropes (SWRs) is crucial for ensuring safety and reliability in various industries. Magnetic Flux Leakage (MFL) imaging is commonly used for non-destructive testing, but its effectiveness is often hindered by the combined effects of inspection speed and sampling rate. To address this issue, the impacts of inspection speed and sampling rate on image quality are studied, as variations in these factors can cause stripe noise, axial compression of defect features, and increased interference, complicating accurate detection. We define the relationship between inspection speed and sampling rate as spatial sampling resolution (SSR) and propose an adaptive SSR target-feature-oriented (AS-TFO) method. This method incorporates adaptive adjustment and pyramid image fusion techniques to enhance defect detection under different SSR scenarios. Experimental results show that under high SSR scenarios, the method achieves a precision of 92.54% and recall of 98.41%. It remains robust under low SSR scenarios with a precision of 94.87% and recall of 97.37%. The overall results show that the proposed method outperforms conventional approaches, achieving state-of-the-art performance. This improvement in detection accuracy and robustness is particularly valuable for handling complex inspection conditions, where inspection speed and sampling rate can vary significantly, making detection more robust and reliable in industrial settings., Comment: Submitted to IEEE Sensors Journal for possible publication
- Published
- 2025
8. Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
- Author
-
Pahuja, Vardaan, Lu, Yadong, Rosset, Corby, Gou, Boyu, Mitra, Arindam, Whitehead, Spencer, Su, Yu, and Awadallah, Ahmed
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
- Abstract
Recent success in large multimodal models (LMMs) has sparked promising applications of agents capable of autonomously completing complex web tasks. While open-source LMM agents have made significant advances in offline evaluation benchmarks, their performance still falls substantially short of human-level capabilities in more realistic online settings. A key bottleneck is the lack of diverse and large-scale trajectory-level datasets across various domains, which are expensive to collect. In this paper, we address this challenge by developing a scalable recipe to synthesize the largest and most diverse trajectory-level dataset to date, containing over 94K successful multimodal web trajectories, spanning 49K unique URLs, 720K screenshots, and 33M web elements. In particular, we leverage extensive web exploration and refinement to obtain diverse task intents. The average cost is 28 cents per successful trajectory, making it affordable to a wide range of users in the community. Leveraging this dataset, we train Explorer, a multimodal web agent, and demonstrate strong performance on both offline and online web agent benchmarks such as Mind2Web-Live, Multimodal-Mind2Web, and MiniWob++. Additionally, our experiments highlight data scaling as a key driver for improving web agent capabilities. We hope this study makes state-of-the-art LMM-based agent research at a larger scale more accessible., Comment: 24 pages, 7 figures
- Published
- 2025
9. Progress of the TianQin project
- Author
-
Luo, Jun, Bai, Shaojun, Bai, Yan-Zheng, Cai, Lin, Dang, Hao, Dong, Qijia, Duan, Hui-Zong, Du, Yuanbo, Fan, Lei, Fu, Xinju, Gao, Yong, Gou, Xingyu, Guo, Changlei, Hong, Wei, Hu, Bin, Hu, Heran, Hu, Ming, Hu, Yi-Ming, Huang, Fa Peng, Gu, Defeng, Ji, Xin, Jiang, Yuan-Ze, Li, En-Kun, Li, Hongyin, Li, Ming, Li, Yong, Li, Zhu, Li, Zizheng, Lian, JunXiang, Liang, Yu-Rong, Lin, Xudong, Liu, Jianping, Liu, Lin-Xia, Liu, Kui, Liu, Li, Liu, Minghe, Liu, Qi, Liu, Yan-Chong, Liu, Yue, Luo, Peng-Shun, Luo, Yingxin, Ma, Yi-Qiu, Ma, Yun, Meng, Yunhe, Milyukov, Vadim, Peng, Jian-Guo, Postnov, Konstantin, Qu, Shao-Bo, Shan, Tilei, Shao, Cheng-Gang, Shi, Changfu, Song, Pei-Yi, Song, Yunfei, Su, Wei, Tan, Ding Yin, Tan, Shuping, Tan, Yu-Jie, Tan, Wenhai, Tu, Liangcheng, Wang, Cheng-Rui, Wang, Guoyong, Wang, Lijiao, Wang, Pan-Pan, Wang, Shun, Wang, Xiaoyong, Wang, Xudong, Wang, Yan, Wei, Ran, Wu, Shu-Chao, Xu, Jie, Xu, Zhi-Lin, Xue, Chao, Yan, Hao, Yan, Yong, Yang, Changpeng, Yang, Shanqing, Yeh, Hsien-Chi, Yin, Hang, Tong, Yelong, Yu, Jian-Bo, Yuan, Wen-Hao, Zhang, Bu-Tian, Zhang, Dexuan, Zhang, Jian-dong, Zhang, Jie, Zhang, Lihua, Zhang, Xuefeng, Zhao, Guoying, Zhao, Liqian, Zhao, Xin, Zhou, An-Nan, Zhou, Hao, Zhou, Peng, Zhou, Yupeng, Zhou, Ze-Bing, Zhu, Fan, Zhu, Liang-Gui, Zhu, Lin, Zou, Kui, and Mei, Jianwei
- Subjects
General Relativity and Quantum Cosmology, Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
TianQin is a future space-based gravitational wave observatory targeting the frequency window of $10^{-4}$ Hz $\sim 1$ Hz. A large variety of gravitational wave sources are expected in this frequency band, including the merger of massive black hole binaries, the inspiral of extreme/intermediate mass ratio systems, stellar-mass black hole binaries, Galactic compact binaries, and so on. TianQin will consist of three Earth-orbiting satellites on nearly identical orbits with orbital radii of about $10^5$ km. The satellites will form a normal triangle constellation whose plane is nearly perpendicular to the ecliptic plane. The TianQin project has been progressing smoothly following the "0123" technology roadmap. In step "0", the TianQin laser ranging station has been constructed and it has successfully ranged to all five retro-reflectors on the Moon. In step "1", the drag-free control technology has been tested and demonstrated using the TianQin-1 satellite. In step "2", the inter-satellite laser interferometry technology will be tested using the pair of TianQin-2 satellites. The TianQin-2 mission has been officially approved and the satellites will be launched around 2026. In step "3", i.e., the TianQin-3 mission, three identical satellites will be launched around 2035 to form the space-based gravitational wave detector, TianQin, and to start gravitational wave detection in space., Comment: 45 pages, 3 figures
- Published
- 2025
10. Ideas and Requirements for the Global Cosmic-Ray Observatory (GCOS)
- Author
-
Ahlers, Markus, Allekotte, Ingo, Alvarez-Muniz, Jaime, Anastasi, Gioacchino Alex, Anchordoqui, Luis, Anjos, Rita de Cassia Dos, Balakrishnan, Hari Haran, Batista, Rafael Alves, Bellido, Jose, Bertaina, Mario, Bhatnagar, Sonali, Billoir, Pierre, Bismark, Kathrin, Bister, Teresa, Bohacova, Martina, Bonifazi, Carla, Bradfield, Fraser, Castellina, Antonella, Cazon, Lorenzo, Cheminant, Kevin Almeida, Coleman, Alan, Convenga, Fabio, Veberič, Darko, Dasgupta, Paramita, Daumiller, Kai, Dawson, Bruce, Deval, Luca, Engel, Ralph, Eser, Johannes, Fang, Ke, Farrar, Glennys R., Fedynitch, Anatoli, Fenu, Francesco, Fitoussi, Thomas, Flaggs, Benjamin, Fodran, Tomas, Fujii, Toshihiro, Fujita, Keitaro, Garzelli, Maria Vittoria, Globus, Noemie, Goksu, Hazal, Gou, Quanbu, Hahn, Steffen, Hariharan, Balakrishnan, Haungs, Andreas, Higuchi, Ryo, Hnatyk, Bohdan, Hörandel, Jörg, Huege, Tim, Ikeda, Daisuke, Ikkatai, Yuko, Mariş, Ioana, Isar, Gina, James, Robin, Carvalho Jr, Washington, Kaderi, Yunos El, Kadler, Matthias, Kampert, Karl-Heinz, Kang, Donghwa, Khakurdikar, Abha, Kido, Eiji, Kleifges, Matthias, Koirala, Ramesh, Kong, Chuizheng, Koyama, C., Krizmanic, John, Kulshrestha, Shivam, Kungel, Viktoria, Leszczyńska, Agnieszka, Liu, Ruoyu, Luce, Quentin, Marchenko, Volodymyr, Mariazzi, Analisa, di Matteo, Armando, Matthews, John N., Mayotte, Eric, Mazur, Peter, Meli, Athina, Menjo, Hiroaki, Montanet, François, Müller, Ana Laura, Murase, Kohta, Muzio, Marco, Nellen, Lukas, Niechciol, Marcus, Nitz, David, Nonaka, Toshiyuki, Ogio, Shoichi, Ohira, Yutaka, Oikonomou, Foteini, Olinto, Angela V, Oshima, Hitoshi, Oueslati, Rami, Paudel, Ek Narayan, Paul, Thomas, Pawlowsky, Jannis, Payeras, Allan Machado, Pelgrims, Vincent, Perrone, Lorenzo, Pont, Bjarni, Porcelli, Alessio, Rautenberg, Julian, Riehn, Felix, Risse, Markus, Roth, Markus, Saftoiu, Alexandra, Sako, Takashi, Sakurai, Shunsuke, Salamida, Francesco, Sánchez, Juan Antonio Aguilar, Santangelo, Andrea, Santos, Eva, Sarazin, Fred, Schäfer, 
Christoph, Scherini, Viviana, Schieler, Harald, Schmidt, David, Schoorlemmer, Harm, Schroeder, Frank, Sergijenko, Olga, Shin, H. S., Soldin, Dennis, Suarez-Duran, Mauricio, Takahashi, Kaoru, Takeda, Masahiro, Tameda, Yuichiro, Tkachenko, Olena, Tomida, Takayuki, Travnicek, Petr, Unger, Michael, Urban, Federico, Venters, Tonia, Verzi, Valerio, Vicha, Jakub, van Vliet, Arjen, Watson, Alan A., Yushkov, Alexey, Zapparrata, Orazio, and Zhang, Pengfei
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
After a successful kick-off meeting in 2021, two workshops in 2022 and 2023 on the future Global Cosmic-Ray Observatory (GCOS) focused mainly on a straw-man design of the detector and science possibilities for astro- and particle physics. About 100 participants gathered for in-person and hybrid panel discussions. In this report, we summarize these discussions, present a preliminary straw-man design for GCOS and collect short write-ups of the flash talks given during the focus sessions., Comment: 48 pages, 27 figures
- Published
- 2025
11. Broadband $\gamma$-ray spectrum of supernova remnant Cassiopeia A
- Author
-
Cao, Zhen, Aharonian, F., Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Bian, W., Bukevich, A. V., Cai, C. M., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, H. X., Chen, Liang, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, X. B., Chen, X. J., Chen, Y., Cheng, N., Cheng, Y. D., Chu, M. C., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Diao, Y. X., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, J. H., Fang, K., Feng, C. F., Feng, H., Feng, L., Feng, S. H., Feng, X. T., Feng, Y., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Ge, T. T., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, J., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., Hannuksela, O. A., Hasan, M., He, H. H., He, H. N., He, J. Y., He, X. Y., He, Y., Hernández-Cadena, S., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, S. C., Huang, C., Huang, D. H., Huang, J. J., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Y. Y., Ji, X. L., Jia, H. Y., Jia, K., Jiang, H. B., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kaci, S., Kang, M. M., Karpikov, I., Khangulyan, D., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, Jian, Li, Jie, Li, K., Li, L., Li, R. L., Li, S. D., Li, T. Y., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, D. B., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. R., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, X., Liu, Y., Liu, Y. N., Lou, Y. Q., Luo, Q., Luo, Y., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mou, G. B., Mu, H. J., Nan, Y. C., Neronov, A., Ng, K. C. Y., Ni, M. Y., Nie, L., Ou, L. J., Pattarakijwanich, P., Pei, Z. 
Y., Qi, J. C., Qi, M. Y., Qin, J. J., Raza, A., Ren, C. Y., Ruffolo, D., Sáiz, A., Saeed, M., Semikoz, D., Shao, L., Shchegolev, O., Shen, Y. Z., Sheng, X. D., Shi, Z. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, D. X., Sun, H., Sun, Q. N., Sun, X. N., Sun, Z. B., Tabasam, N. H., Takata, J., Tam, P. H. T., Tan, H. B., Tang, Q. W., Tang, R., Tang, Z. B., Tian, W. W., Tong, C. N., Wan, L. H., Wang, C., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, Kai, Wang, L. P., Wang, L. Y., Wang, R., Wang, W., Wang, X. G., Wang, X. J., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Z. H., Wang, Z. X., Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Weng, S. S., Wu, C. Y., Wu, H. R., Wu, Q. W., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, Y. L., Xing, Y., Xiong, D. R., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, C. Y., Yang, F. F., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, W. X., Yao, Y. H., Yao, Z. G., Ye, X. A., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, B. T., Zhang, F., Zhang, H., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. R., Zhang, S. S., Zhang, W. Y., Zhang, X., Zhang, X. P., Zhang, Yi, Zhang, Yong, Zhang, Z. P., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zhao, X. H., Zhao, Z. H., Zheng, F., Zhong, W. J., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, B. Y., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena
- Abstract
The core-collapse supernova remnant (SNR) Cassiopeia A (Cas A) is one of the brightest galactic radio sources with an angular radius of $\sim$ 2.5 $\arcmin$. Although no extension of this source has been detected in the $\gamma$-ray band, using more than 1000 days of LHAASO data above $\sim 0.8$ TeV, we find that its spectrum is significantly softer than those obtained with Imaging Air Cherenkov Telescopes (IACTs) and its flux near $\sim 1$ TeV is about two times higher. In combination with analyses of more than 16 years of \textit{Fermi}-LAT data covering $0.1 \, \mathrm{GeV} - 1 \, \mathrm{TeV}$, we find that the spectrum above 30 GeV deviates significantly from a single power-law, and is best described by a smoothly broken power-law with a spectral index of $1.90 \pm 0.15_\mathrm{stat}$ ($3.41 \pm 0.19_\mathrm{stat}$) below (above) a break energy of $0.63 \pm 0.21_\mathrm{stat} \, \mathrm{TeV}$. Given differences in the angular resolution of LHAASO-WCDA and IACTs, TeV $\gamma$-ray emission detected with LHAASO may have a significant contribution from regions surrounding the SNR illuminated by particles accelerated earlier, which, however, are treated as background by IACTs. Detailed modelling can be used to constrain acceleration processes of TeV particles in the early stage of SNR evolution.
- Published
- 2025
12. New phase space of hardness materials and synergic enhancement of hardness and toughness in superconducting Ti2Co and Ti4Co2X (X = B, C, N, O)
- Author
-
Shi, Lifen, Ma, Keyuan, Hou, Jingyu, Ying, Pan, Wang, Ningning, Xiang, Xiaojun, Yang, Pengtao, Yu, Xiaohui, Gou, Huiyang, Sun, Jianping, Uwatoko, Yoshiya, von Rohr, Fabian O., Zhou, Xiang-Feng, Wang, Bosen, and Cheng, Jinguang
- Subjects
Condensed Matter - Materials Science
- Abstract
Compared to traditional superhard materials with high electron density and strong covalent bonds, alloy materials mainly composed of metallic bonding structures typically have great toughness and lower hardness. Breaking through the limits of alloy materials is a frontier and long-term topic, which is of great significance and value for improving the comprehensive mechanical properties of alloy materials. Here, we report on the discovery of a cubic alloy semiconducting material Ti2Co with large Vickers hardness $H_v^{\mathrm{exp}}$ = 6.7 GPa and low fracture toughness $K_{IC}^{\mathrm{exp}}$ = 1.51 MPa m$^{0.5}$. Unexpectedly, the former value is nearly triple the $H_v^{\mathrm{cal}}$ = 2.66 GPa predicted by density functional theory (DFT) calculations, and the latter value is about one or two orders of magnitude smaller than that of ordinary titanium alloy materials ($K_{IC}^{\mathrm{exp}}$ = 30-120 MPa m$^{0.5}$). These specifications place Ti2Co far from the phase space of the known alloy materials, but close to medium hardness materials such as MgO or TiO2. Upon incorporation of oxygen into structural void positions, both values were simultaneously improved for Ti4Co2O to 9.7 GPa and 2.19 MPa m$^{0.5}$, respectively. Further DFT calculations on the electron localization function of Ti4Co2X (X = B, C, N, O) vs. the interstitial elements indicate that these simultaneous improvements originate from the coexistence of Ti-Co metallic bonds, the emergence of newly oriented Ti-X covalent bonds, and the increase of electron concentration. Moreover, the large difference between $H_v^{\mathrm{exp}}$ and $H_v^{\mathrm{cal}}$ of Ti2Co suggests an underlying mechanism concerning the absence of the O(16d) or Ti2-O bonds in the O-(Ti2)6 octahedron. Our discovery expands the phase space of alloy materials and illuminates the path of exploring superconducting materials with excellent mechanical performance., Comment: 17 pages, 4 figures
- Published
- 2025
13. Scalable Safe Multi-Agent Reinforcement Learning for Multi-Agent System
- Author
-
Du, Haikuo, Gou, Fandi, and Cai, Yunze
- Subjects
Computer Science - Multiagent Systems, Computer Science - Artificial Intelligence
- Abstract
Safety and scalability are two critical challenges faced by practical Multi-Agent Systems (MAS). However, existing Multi-Agent Reinforcement Learning (MARL) algorithms that rely solely on reward shaping are ineffective in ensuring safety, and their scalability is rather limited due to the fixed-size network output. To address these issues, we propose a novel framework, Scalable Safe MARL (SS-MARL), to enhance the safety and scalability of MARL methods. Leveraging the inherent graph structure of MAS, we design a multi-layer message passing network to aggregate local observations and communications of varying sizes. Furthermore, we develop a constrained joint policy optimization method in the setting of local observation to improve safety. Simulation experiments demonstrate that SS-MARL achieves a better trade-off between optimality and safety compared to baselines, and its scalability significantly outperforms the latest methods in scenarios with a large number of agents. The feasibility of our method is also verified by hardware implementation with Mecanum-wheeled vehicles.
- Published
- 2025
14. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- Author
-
DeepSeek-AI, Guo, Daya, Yang, Dejian, Zhang, Haowei, Song, Junxiao, Zhang, Ruoyu, Xu, Runxin, Zhu, Qihao, Ma, Shirong, Wang, Peiyi, Bi, Xiao, Zhang, Xiaokang, Yu, Xingkai, Wu, Yu, Wu, Z. F., Gou, Zhibin, Shao, Zhihong, Li, Zhuoshu, Gao, Ziyi, Liu, Aixin, Xue, Bing, Wang, Bingxuan, Wu, Bochao, Feng, Bei, Lu, Chengda, Zhao, Chenggang, Deng, Chengqi, Zhang, Chenyu, Ruan, Chong, Dai, Damai, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Dai, Fucong, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Bao, Han, Xu, Hanwei, Wang, Haocheng, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Qu, Hui, Li, Hui, Guo, Jianzhong, Li, Jiashi, Wang, Jiawei, Chen, Jingchang, Yuan, Jingyang, Qiu, Junjie, Li, Junlong, Cai, J. L., Ni, Jiaqi, Liang, Jian, Chen, Jin, Dong, Kai, Hu, Kai, Gao, Kaige, Guan, Kang, Huang, Kexin, Yu, Kuai, Wang, Lean, Zhang, Lecong, Zhao, Liang, Wang, Litong, Zhang, Liyue, Xu, Lei, Xia, Leyi, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Meng, Wang, Miaojun, Li, Mingming, Tian, Ning, Huang, Panpan, Zhang, Peng, Wang, Qiancheng, Chen, Qinyu, Du, Qiushi, Ge, Ruiqi, Zhang, Ruisong, Pan, Ruizhe, Wang, Runji, Chen, R. J., Jin, R. L., Chen, Ruyi, Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Ye, Shengfeng, Wang, Shiyu, Yu, Shuiping, Zhou, Shunfeng, Pan, Shuting, Li, S. S., Zhou, Shuang, Wu, Shaoqing, Yun, Tao, Pei, Tian, Sun, Tianyu, Wang, T., Zeng, Wangding, Zhao, Wanjia, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Yu, Wenqin, Zhang, Wentao, Xiao, W. L., An, Wei, Liu, Xiaodong, Wang, Xiaohan, Chen, Xiaokang, Nie, Xiaotao, Cheng, Xin, Liu, Xin, Xie, Xin, Liu, Xingchao, Yang, Xinyu, Li, Xinyuan, Su, Xuecheng, Lin, Xuheng, Li, X. Q., Jin, Xiangyue, Shen, Xiaojin, Chen, Xiaosha, Sun, Xiaowen, Wang, Xiaoxiang, Song, Xinnan, Zhou, Xinyi, Wang, Xianzu, Shan, Xinxia, Li, Y. K., Wang, Y. Q., Wei, Y. 
X., Zhang, Yang, Xu, Yanhong, Li, Yao, Zhao, Yao, Sun, Yaofeng, Wang, Yaohui, Yu, Yi, Zhang, Yichao, Shi, Yifan, Xiong, Yiliang, He, Ying, Piao, Yishi, Wang, Yisong, Tan, Yixuan, Ma, Yiyang, Liu, Yiyuan, Guo, Yongqiang, Ou, Yuan, Wang, Yuduan, Gong, Yue, Zou, Yuheng, He, Yujia, Xiong, Yunfan, Luo, Yuxiang, You, Yuxiang, Liu, Yuxuan, Zhou, Yuyang, Zhu, Y. X., Huang, Yanping, Li, Yaohui, Zheng, Yi, Zhu, Yuchen, Ma, Yunxian, Tang, Ying, Zha, Yukun, Yan, Yuting, Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Xu, Zhean, Xie, Zhenda, Zhang, Zhengyan, Hao, Zhewen, Ma, Zhicheng, Yan, Zhigang, Wu, Zhiyu, Gu, Zihui, Zhu, Zijia, Liu, Zijun, Li, Zilin, Xie, Ziwei, Song, Ziyang, Pan, Zizheng, Huang, Zhen, Xu, Zhipeng, Zhang, Zhongyu, and Zhang, Zhen
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.
- Published
- 2025
15. Science objectives of the Einstein Probe mission
- Author
-
Yuan, Weimin, Dai, Lixin, Feng, Hua, Jin, Chichuan, Jonker, Peter, Kuulkers, Erik, Liu, Yuan, Nandra, Kirpal, O'Brien, Paul, Piro, Luigi, Rau, Arne, Rea, Nanda, Sanders, Jeremy, Tao, Lian, Wang, Junfeng, Wu, Xuefeng, Zhang, Bing, Zhang, Shuangnan, Ai, Shunke, Buchner, Johannes, Bulbul, Esra, Chen, Hechao, Chen, Minghua, Chen, Yong, Chen, Yu-Peng, Coleiro, Alexis, Zelati, Francesco Coti, Dai, Zigao, Fan, Xilong, Fan, Zhou, Friedrich, Susanne, Gao, He, Ge, Chong, Ge, Mingyu, Geng, Jinjun, Ghirlanda, Giancarlo, Gianfagna, Giulia, Gou, Lijun, Guillot, Sébastien, Hou, Xian, Hu, Jingwei, Huang, Yongfeng, Ji, Long, Jia, Shumei, Komossa, S., Kong, Albert K. H., Lan, Lin, Li, An, Li, Ang, Li, Chengkui, Li, Dongyue, Li, Jian, Li, Zhaosheng, Ling, Zhixing, Liu, Ang, Liu, Jinzhong, Liu, Liangduan, Liu, Zhu, Luo, Jiawei, Ma, Ruican, Maggi, Pierre, Maitra, Chandreyee, Marino, Alessio, Ng, Stephen Chi-Yung, Pan, Haiwu, Rukdee, Surangkhana, Soria, Roberto, Sun, Hui, Tam, Pak-Hin Thomas, Thakur, Aishwarya Linesh, Tian, Hui, Troja, Eleonora, Wang, Wei, Wang, Xiangyu, Wang, Yanan, Wei, Junjie, Wen, Sixiang, Wu, Jianfeng, Wu, Ting, Xiao, Di, Xu, Dong, Xu, Renxin, Xu, Yanjun, Xu, Yu, Yang, Haonan, You, Bei, Yu, Heng, Yu, Yunwei, Zhang, Binbin, Zhang, Chen, Zhang, Guobao, Zhang, Liang, Zhang, Wenda, Zhang, Yu, Zhou, Ping, and Zou, Zecheng
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The Einstein Probe (EP) is an interdisciplinary mission of time-domain and X-ray astronomy. Equipped with a wide-field lobster-eye X-ray focusing imager, EP will discover cosmic X-ray transients and monitor the X-ray variability of known sources in 0.5-4 keV, at a combination of detecting sensitivity and cadence that is not accessible to the previous and current wide-field monitoring missions. EP can perform quick characterisation of transients or outbursts with a Wolter-I X-ray telescope onboard. In this paper, the science objectives of the Einstein Probe mission are presented. EP is expected to enlarge the sample of previously known or predicted but rare types of transients with a wide range of timescales. Among them, fast extragalactic transients will be surveyed systematically in soft X-rays, which include $\gamma$-ray bursts and their variants, supernova shock breakouts, and the predicted X-ray transients associated with binary neutron star mergers. EP will detect X-ray tidal disruption events and outbursts from active galactic nuclei, possibly at an early phase of the flares for some. EP will monitor the variability and outbursts of X-rays from white dwarfs, neutron stars and black holes in our and neighbouring galaxies at flux levels fainter than those detectable by the current instruments, and is expected to discover new objects. A large sample of stellar X-ray flares will also be detected and characterised. In the era of multi-messenger astronomy, EP has the potential of detecting the possible X-ray counterparts of gravitational wave events, neutrino sources, and ultra-high energy $\gamma$-ray and cosmic ray sources. EP is expected to help advance the studies of extreme objects/phenomena and their underlying physical processes revealed in the dynamic X-ray universe, as well as studies in other areas of X-ray astronomy., Comment: 67 pages, 24 figures, accepted for publication in SCIENCE CHINA Physics, Mechanics & Astronomy
- Published
- 2025
16. Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion
- Author
-
Ma, Yongjia, Chen, Junlin, Di, Donglin, Xie, Qi, Fan, Lei, Chen, Wei, Gou, Xiaofei, Zhao, Na, and Yang, Xun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Creating high-fidelity, coherent long videos is a sought-after aspiration. While recent video diffusion models have shown promising potential, they still grapple with spatiotemporal inconsistencies and high computational resource demands. We propose GLC-Diffusion, a tuning-free method for long video generation. It models the long video denoising process by establishing denoising trajectories through Global-Local Collaborative Denoising to ensure overall content consistency and temporal coherence between frames. Additionally, we introduce a Noise Reinitialization strategy which combines local noise shuffling with frequency fusion to improve global content consistency and visual diversity. Further, we propose a Video Motion Consistency Refinement (VMCR) module that computes the gradient of pixel-wise and frequency-wise losses to enhance visual consistency and temporal smoothness. Extensive experiments, including quantitative and qualitative evaluations on videos of varying lengths (e.g., $3\times$ and $6\times$ longer), demonstrate that our method effectively integrates with existing video diffusion models, producing coherent, high-fidelity long videos superior to previous approaches.
- Published
- 2025
17. FedKD-hybrid: Federated Hybrid Knowledge Distillation for Lithography Hotspot Detection
- Author
-
Li, Yuqi, Lin, Xingyou, Zhang, Kai, Yang, Chuanguang, Guo, Zhongliang, Gou, Jianping, and Li, Yanli
- Subjects
Computer Science - Machine Learning ,Computer Science - Hardware Architecture - Abstract
Federated Learning (FL) provides novel solutions for machine learning (ML)-based lithography hotspot detection (LHD) under distributed privacy-preserving settings. Currently, two research pipelines have been investigated to aggregate local models and achieve global consensus: parameter-based and non-parameter-based (the latter also known as knowledge distillation, KD). While these two kinds of methods show effectiveness in specific scenarios, we note they have not fully utilized and transferred the information learned, leaving the potential of FL-based LHD unexplored. Thus, we propose FedKD-hybrid in this study to mitigate the research gap. Specifically, FedKD-hybrid clients agree on several identical layers across all participants and a public dataset for achieving global consensus. During training, the trained local model will be evaluated on the public dataset, and the generated logits will be uploaded along with the identical layer parameters. The aggregated information is consequently used to update local models via the public dataset as a medium. We compare our proposed FedKD-hybrid with several state-of-the-art (SOTA) FL methods under ICCAD-2012 and FAB (real-world collected) datasets with different settings; the experimental results demonstrate the superior performance of the FedKD-hybrid algorithm. Our code is available at https://github.com/itsnotacie/NN-FedKD-hybrid
- Published
- 2025
18. How to determine nucleon polarization at existing collider experiments?
- Author
-
Liang, Yu-Tie, Lv, Xiao-Rong, Kupsc, Andrzej, Gou, Boxing, and Li, Hai-Bo
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
We propose a novel approach to measure spin polarization of nucleons produced in electron--positron collisions. Using existing tracking devices and supporting structure material, general-purpose spectrometers can be utilized as a large-acceptance polarimeter without hardware upgrade. With the proposed approach, the spin polarization of nucleons can be revealed, providing a complementary and accurate description of the final-state particles. This could have far-reaching implications, such as enabling the complete determination of the time-like electromagnetic form factors of nucleons., Comment: 5 pages, 5 figures
- Published
- 2025
19. MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration
- Author
-
Li, Boyun, Zhao, Haiyu, Wang, Wenxin, Hu, Peng, Gou, Yuanbiao, and Peng, Xi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent advancements in Mamba have shown promising results in image restoration. These methods typically flatten 2D images into multiple distinct 1D sequences along rows and columns, process each sequence independently using selective scan operation, and recombine them to form the outputs. However, such a paradigm overlooks two vital aspects: i) the local relationships and spatial continuity inherent in natural images, and ii) the discrepancies among sequences unfolded through totally different ways. To overcome the drawbacks, we explore two problems in Mamba-based restoration methods: i) how to design a scanning strategy preserving both locality and continuity while facilitating restoration, and ii) how to aggregate the distinct sequences unfolded in totally different ways. To address these problems, we propose a novel Mamba-based Image Restoration model (MaIR), which consists of Nested S-shaped Scanning strategy (NSS) and Sequence Shuffle Attention block (SSA). Specifically, NSS preserves locality and continuity of the input images through the stripe-based scanning region and the S-shaped scanning path, respectively. SSA aggregates sequences through calculating attention weights within the corresponding channels of different sequences. Thanks to NSS and SSA, MaIR surpasses 40 baselines across 14 challenging datasets, achieving state-of-the-art performance on the tasks of image super-resolution, denoising, deblurring and dehazing. Our codes will be available after acceptance.
- Published
- 2024
20. DeepSeek-V3 Technical Report
- Author
-
DeepSeek-AI, Liu, Aixin, Feng, Bei, Xue, Bing, Wang, Bingxuan, Wu, Bochao, Lu, Chengda, Zhao, Chenggang, Deng, Chengqi, Zhang, Chenyu, Ruan, Chong, Dai, Damai, Guo, Daya, Yang, Dejian, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Dai, Fucong, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Bao, Han, Xu, Hanwei, Wang, Haocheng, Zhang, Haowei, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Li, Hui, Qu, Hui, Cai, J. L., Liang, Jian, Guo, Jianzhong, Ni, Jiaqi, Li, Jiashi, Wang, Jiawei, Chen, Jin, Chen, Jingchang, Yuan, Jingyang, Qiu, Junjie, Li, Junlong, Song, Junxiao, Dong, Kai, Hu, Kai, Gao, Kaige, Guan, Kang, Huang, Kexin, Yu, Kuai, Wang, Lean, Zhang, Lecong, Xu, Lei, Xia, Leyi, Zhao, Liang, Wang, Litong, Zhang, Liyue, Li, Meng, Wang, Miaojun, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Mingming, Tian, Ning, Huang, Panpan, Wang, Peiyi, Zhang, Peng, Wang, Qiancheng, Zhu, Qihao, Chen, Qinyu, Du, Qiushi, Chen, R. J., Jin, R. L., Ge, Ruiqi, Zhang, Ruisong, Pan, Ruizhe, Wang, Runji, Xu, Runxin, Zhang, Ruoyu, Chen, Ruyi, Li, S. S., Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Wu, Shaoqing, Ye, Shengfeng, Ma, Shirong, Wang, Shiyu, Zhou, Shuang, Yu, Shuiping, Zhou, Shunfeng, Pan, Shuting, Wang, T., Yun, Tao, Pei, Tian, Sun, Tianyu, Xiao, W. L., Zeng, Wangding, Zhao, Wanjia, An, Wei, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Yu, Wenqin, Zhang, Wentao, Li, X. Q., Jin, Xiangyue, Wang, Xianzu, Bi, Xiao, Liu, Xiaodong, Wang, Xiaohan, Shen, Xiaojin, Chen, Xiaokang, Zhang, Xiaokang, Chen, Xiaosha, Nie, Xiaotao, Sun, Xiaowen, Wang, Xiaoxiang, Cheng, Xin, Liu, Xin, Xie, Xin, Liu, Xingchao, Yu, Xingkai, Song, Xinnan, Shan, Xinxia, Zhou, Xinyi, Yang, Xinyu, Li, Xinyuan, Su, Xuecheng, Lin, Xuheng, Li, Y. K., Wang, Y. Q., Wei, Y. X., Zhu, Y. 
X., Zhang, Yang, Xu, Yanhong, Huang, Yanping, Li, Yao, Zhao, Yao, Sun, Yaofeng, Li, Yaohui, Wang, Yaohui, Yu, Yi, Zheng, Yi, Zhang, Yichao, Shi, Yifan, Xiong, Yiliang, He, Ying, Tang, Ying, Piao, Yishi, Wang, Yisong, Tan, Yixuan, Ma, Yiyang, Liu, Yiyuan, Guo, Yongqiang, Wu, Yu, Ou, Yuan, Zhu, Yuchen, Wang, Yuduan, Gong, Yue, Zou, Yuheng, He, Yujia, Zha, Yukun, Xiong, Yunfan, Ma, Yunxian, Yan, Yuting, Luo, Yuxiang, You, Yuxiang, Liu, Yuxuan, Zhou, Yuyang, Wu, Z. F., Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Xu, Zhean, Huang, Zhen, Zhang, Zhen, Xie, Zhenda, Zhang, Zhengyan, Hao, Zhewen, Gou, Zhibin, Ma, Zhicheng, Yan, Zhigang, Shao, Zhihong, Xu, Zhipeng, Wu, Zhiyu, Zhang, Zhongyu, Li, Zhuoshu, Gu, Zihui, Zhu, Zijia, Liu, Zijun, Li, Zilin, Xie, Ziwei, Song, Ziyang, Gao, Ziyi, and Pan, Zizheng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. The model checkpoints are available at https://github.com/deepseek-ai/DeepSeek-V3.
- Published
- 2024
21. Social Optima in Linear Quadratic Graphon Field Control: Analysis via Infinite Dimensional Approach
- Author
-
Xu, De-xuan, Gou, Zhun, and Huang, Nan-jing
- Subjects
Mathematics - Optimization and Control - Abstract
This paper is concerned with linear quadratic graphon field social control problem where the noises of individual agents are correlated. Compared with the well-studied mean field system, the graphon field system consists of a large number of agents coupled weakly via a weighted undirected graph where each node represents an individual agent. Another notable feature of this paper is that the dynamics of states of agents are driven by Brownian motions with a correlation matrix. The infinite dimensional approach is adopted to design the centralized and decentralized controls for our large population system. By graphon theory, we prove that the linear quadratic (LQ) social optimum control problem under the centralized information pattern is equivalent to an LQ optimal control problem concerned with a stochastic evolution equation, and the feedback-type optimal centralized control is obtained. Then, by designing an auxiliary infinite dimensional optimal control problem through agent number $N\rightarrow\infty$, a set of decentralized strategies are constructed, which are further shown to be asymptotically social optimal.
- Published
- 2024
22. Linear-quadratic Stochastic Stackelberg Differential Games with Affine Constraints
- Author
-
Gou, Zhun, Huang, Nan-Jing, Long, Xian-Jun, and Kang, Jian-Hao
- Subjects
Mathematics - Optimization and Control - Abstract
This paper investigates the non-zero-sum linear-quadratic stochastic Stackelberg differential games with affine constraints, which depend on both the follower's response and the leader's strategy. With the help of the stochastic Riccati equations and the Lagrangian duality theory, the feedback expressions of optimal strategies of the follower and the leader are obtained and the dual problem of the leader's problem is established. Under the Slater condition, the equivalence is proved between the solutions to the dual problem and the leader's problem, and the KKT condition is also provided for solving the dual problem. Then, the feedback Stackelberg equilibrium is provided for the linear-quadratic stochastic Stackelberg differential games with affine constraints, and a new positive definite condition is proposed for ensuring the uniqueness of solutions to the dual problem. Finally, two non-degenerate examples with indefinite coefficients are provided to illustrate and to support our main results.
- Published
- 2024
23. Equilibrium reinsurance and investment strategies for insurers with random risk aversion under Heston's SV model
- Author
-
Kang, Jian-hao, Gou, Zhun, and Huang, Nan-jing
- Subjects
Mathematics - Optimization and Control - Abstract
This study employs expected certainty equivalents to explore the reinsurance and investment issue pertaining to an insurer that aims to maximize the expected utility while being subject to random risk aversion. The insurer's surplus process is modeled approximately by a drifted Brownian motion, and the financial market is comprised of a risk-free asset and a risky asset with its price depicted by Heston's stochastic volatility (SV) model. Within a game theory framework, a strict verification theorem is formulated to delineate the equilibrium reinsurance and investment strategies as well as the corresponding value function. Furthermore, through solving the pseudo Hamilton-Jacobi-Bellman (HJB) system, semi-analytical formulations for the equilibrium reinsurance and investment strategies and the associated value function are obtained under the exponential utility. Additionally, several numerical experiments are carried out to demonstrate the characteristics of the equilibrium reinsurance and investment strategies.
- Published
- 2024
24. Uncertainties of Satellite-based Essential Climate Variables from Deep Learning
- Author
-
Gou, Junyang, Salberg, Arnt-Børre, Shahvandi, Mostafa Kiani, Tourian, Mohammad J., Meyer, Ulrich, Boergens, Eva, Waldeland, Anders U., Velicogna, Isabella, Dahl, Fredrik, Jäggi, Adrian, Schindler, Konrad, and Soja, Benedikt
- Subjects
Physics - Geophysics ,Computer Science - Machine Learning - Abstract
Accurate uncertainty information associated with essential climate variables (ECVs) is crucial for reliable climate modeling and understanding the spatiotemporal evolution of the Earth system. In recent years, geoscience and climate scientists have benefited from rapid progress in deep learning to advance the estimation of ECV products with improved accuracy. However, the quantification of uncertainties associated with the output of such deep learning models has yet to be thoroughly adopted. This survey explores the types of uncertainties associated with ECVs estimated from deep learning and the techniques to quantify them. The focus is on highlighting the importance of quantifying uncertainties inherent in ECV estimates, considering the dynamic and multifaceted nature of climate data. The survey starts by clarifying the definition of aleatoric and epistemic uncertainties and their roles in a typical satellite observation processing workflow, followed by bridging the gap between conventional statistical and deep learning views on uncertainties. Then, we comprehensively review the existing techniques for quantifying uncertainties associated with deep learning algorithms, focusing on their application in ECV studies. The specific need for modification to fit the requirements from both the Earth observation side and the deep learning side in such interdisciplinary tasks is discussed. Finally, we demonstrate our findings with two ECV examples, snow cover and terrestrial water storage, and provide our perspectives for future research.
- Published
- 2024
25. Improved Forecasts of Global Extreme Marine Heatwaves Through a Physics-guided Data-driven Approach
- Author
-
Shu, Ruiqi, Wu, Hao, Gao, Yuan, Xu, Fanghua, Gou, Ruijian, and Huang, Xiaomeng
- Subjects
Physics - Atmospheric and Oceanic Physics ,Computer Science - Artificial Intelligence - Abstract
The unusually warm sea surface temperature events known as marine heatwaves (MHWs) have a profound impact on marine ecosystems. Accurate prediction of extreme MHWs has significant scientific and financial worth. However, existing methods still have certain limitations, especially in the most extreme MHWs. In this study, to address these issues, based on the physical nature of MHWs, we created a novel deep learning neural network that is capable of accurate 10-day MHW forecasting. Our framework significantly improves the forecast ability of extreme MHWs through two specially designed modules inspired by numerical models: a coupler and a probabilistic data augmentation. The coupler simulates the driving effect of atmosphere on MHWs while the probabilistic data augmentation approaches significantly boost the forecast ability of extreme MHWs based on the idea of ensemble forecast. Compared with traditional numerical prediction, our framework has significantly higher accuracy and requires fewer computational resources. What's more, explainable AI methods show that wind forcing is the primary driver of MHW evolution and reveal its relation with air-sea heat exchange. Overall, our model provides a framework for understanding MHWs' driving processes and operational forecasts in the future.
- Published
- 2024
26. Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
- Author
-
Tang, Yuanmin, Qin, Xiaoting, Zhang, Jue, Yu, Jing, Gou, Gaopeng, Xiong, Gang, Ling, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, and Wu, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Composed Image Retrieval (CIR) aims to retrieve target images that closely resemble a reference image while integrating user-specified textual modifications, thereby capturing user intent more precisely. Existing training-free zero-shot CIR (ZS-CIR) methods often employ a two-stage process: they first generate a caption for the reference image and then use Large Language Models for reasoning to obtain a target description. However, these methods suffer from missing critical visual details and limited reasoning capabilities, leading to suboptimal retrieval performance. To address these challenges, we propose a novel, training-free one-stage method, One-Stage Reflective Chain-of-Thought Reasoning for ZS-CIR (OSrCIR), which employs Multimodal Large Language Models to retain essential visual information in a single-stage reasoning process, eliminating the information loss seen in two-stage methods. Our Reflective Chain-of-Thought framework further improves interpretative accuracy by aligning manipulation intent with contextual cues from reference images. OSrCIR achieves performance gains of 1.80% to 6.44% over existing training-free methods across multiple tasks, setting new state-of-the-art results in ZS-CIR and enhancing its utility in vision-language applications. Our code will be available at https://github.com/Pter61/osrcir2024/.
- Published
- 2024
27. Beyond Quantile Methods: Improved Top-K Threshold Estimation for Traditional and Learned Sparse Indexes
- Author
-
Gou, Jinrui, Liu, Yifan, Shao, Minghao, and Suel, Torsten
- Subjects
Computer Science - Information Retrieval - Abstract
Top-k threshold estimation is the problem of estimating the score of the k-th highest ranking result of a search query. A good estimate can be used to speed up many common top-k query processing algorithms, and thus a number of researchers have recently studied the problem. Among the various approaches that have been proposed, quantile methods appear to give the best estimates overall at modest computational costs, followed by sampling-based methods in certain cases. In this paper, we make two main contributions. First, we study how to get even better estimates than the state of the art. Starting from quantile-based methods, we propose a series of enhancements that give improved estimates in terms of the commonly used mean under-prediction fraction (MUF). Second, we study the threshold estimation problem on recently proposed learned sparse index structures, showing that our methods also work well for these cases. Our best methods substantially narrow the gap between the state of the art and the ideal MUF of 1.0, at some additional cost in time and space.
- Published
- 2024
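The top-k threshold and the mean under-prediction fraction (MUF) mentioned in the abstract above can be sketched in a few lines. This is only an illustration, not the authors' method: the function names are hypothetical, and the handling of over-estimates (they are simply skipped here) is an assumption, since the abstract does not say how they are treated.

```python
import heapq


def topk_threshold(scores, k):
    """Return the score of the k-th highest-ranking result."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # nlargest returns the k best scores in descending order;
    # the last of them is the threshold.
    return heapq.nlargest(k, scores)[-1]


def mean_underprediction_fraction(estimates, truths):
    """Average of estimate/true threshold over queries.

    Assumption: only under-predictions (estimate <= true threshold)
    contribute; over-estimates are skipped in this sketch.
    """
    fractions = [e / t for e, t in zip(estimates, truths) if t > 0 and e <= t]
    return sum(fractions) / len(fractions) if fractions else 0.0


if __name__ == "__main__":
    scores = [3.0, 9.0, 1.0, 7.0, 5.0]
    print(topk_threshold(scores, 2))
    print(mean_underprediction_fraction([3.5, 6.0], [7.0, 8.0]))
```

Under this definition an estimator that always returns the exact threshold scores 1.0, which matches the ideal MUF quoted in the abstract.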
28. 3 mm Spectroscopic Observations of Massive Star-Forming Regions with IRAM 30-m
- Author
-
Xu, Xuefang, Wang, Junzhi, Gou, Qian, Li, Juan, Quan, Donghui, Li, Di, Li, Fei, Duan, Chunguo, and Lei, Juncheng
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Broadband spectroscopic observations with high sensitivity provide an unbiased way to detect emissions of molecules in space. We present deep observations from ~ 105.8 GHz to 113.6 GHz toward 50 Galactic massive star-forming regions using the IRAM 30-m millimeter telescope, with noise levels ranging from 6 to 29 at frequency channel spacing of 195 kHz, which corresponds to ~ 0.54 km/s at 110 GHz. In total, 27 molecular species have been identified, of which 16 are complex organic molecules. The related parameters, such as peak temperature, integrated intensity, and line width of the identified molecular lines, were obtained. The line widths of the chemically related molecules show strong positive correlations, suggesting they likely originate from similar gases within star-forming regions. This work highlights the fundamental properties of the detected molecular lines and offers a valuable dataset for further studies on the astrochemical evolution of molecules in massive star-forming cores., Comment: 22 pages, 3 figures, 14 tables, accepted for publication in PASJ
- Published
- 2024
29. Observational studies on S-bearing molecules in massive star forming regions
- Author
-
Luo, R., Wang, J. Z., Zhang, X., Quan, D. H., Jiang, X. J., Li, J., Gou, Q., Li, Y. Q., Xu, Y. N., Zheng, S. Q., Ou, C., and Liu, Y. J.
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
Aims. We present observational results of H$_{2}$S 1$_{10}$-1$_{01}$, H$_{2}$$^{34}$S 1$_{10}$-1$_{01}$, H$_{2}$CS 5$_{14}$-4$_{14}$, HCS$^{+}$ 4-3, SiO 4-3, HC$_{3}$N 19-18 and C$^{18}$O 1-0 toward a sample of 51 late-stage massive star-forming regions, to study relationships among H$_{2}$S, H$_{2}$CS, HCS$^{+}$ and SiO in hot cores. Chemical connections of these S-bearing molecules are discussed based on the relations between relative abundances in sources. Results. H$_{2}$S 1$_{10}$-1$_{01}$, H$_{2}$$^{34}$S 1$_{10}$-1$_{01}$, H$_{2}$CS 5$_{14}$-4$_{14}$, HCS$^{+}$ 4-3 and HC$_{3}$N 19-18 were detected in 50 of the 51 sources, while SiO 4-3 was detected in 46 sources. C$^{18}$O 1-0 was detected in all sources. The Pearson correlation coefficients between H$_{2}$CS and HCS$^+$ normalized by H$_{2}$ and H$_{2}$S are 0.94 and 0.87, respectively, and a tight linear relationship is found between them with slope of 1.00 and 1.09, while they are 0.77 and 0.98 between H$_2$S and H$_2$CS, respectively, and 0.76 and 0.97 between H$_2$S and HCS$^+$. The values of full width at half maxima (FWHM) of them in each source are similar to each other, which indicate that they can trace similar regions. Comparing the observed abundance with model results, there is one possible time (2-3$\times$10$^{5}$ yr) for each source in the model. The abundances of these molecules increase with the increment of SiO abundance in these sources, which implies that shock chemistry may be important for them. Conclusions. Close abundance relation of H$_2$S, H$_2$CS and HCS$^+$ molecules and similar line widths in observational results indicate that these three molecules could be chemically linked, with HCS$^+$ and H$_2$CS the most correlated. The comparison of the observational results with chemical models shows that the abundances can be reproduced for almost all the sources at a specific time. 
The observational results, including abundances in these sources need to be considered in further modeling H$_{2}$S, H$_{2}$CS and HCS$^{+}$ in hot cores with shock chemistry., Comment: 15 pages, 7 figures
- Published
- 2024
30. Orbital torque switching of room temperature two-dimensional van der Waals ferromagnet Fe3GaTe2
- Author
-
Zhang, Delin, Wei, Heshuang, Duan, Jinyu, Chen, Jiali, Yue, Dongdong, Yang, Yuhe, Gou, Jinlong, Yan, Junxin, Zhai, Kun, Wang, Ping, Hu, Shuai, Jia, Zhiyan, Jiang, Wei, Wang, Wenhong, Li, Yue, and Jiang, Yong
- Subjects
Condensed Matter - Materials Science - Abstract
Efficiently manipulating the magnetization of van der Waals ferromagnets has attracted considerable interest in developing room-temperature two-dimensional material-based memory and logic devices. Here, taking advantage of the unique properties of the van der Waals ferromagnet as well as promising characteristics of the orbital Hall effect, we demonstrate the room-temperature magnetization switching of van der Waals ferromagnet Fe3GaTe2 through the orbital torque generated by the orbital Hall material, Titanium (Ti). The switching current density is estimated to be around 1.6 x 10^6 A/cm^2, comparable to that achieved in Fe3GaTe2 using spin-orbit torque from spin Hall materials. The efficient magnetization switching arises from the combined effects of the large orbital Hall conductivity of Ti and the strong spin-orbit correlation of the Fe3GaTe2, as confirmed through theoretical calculations. Our findings advance the understanding of orbital torque switching and pave the way for exploring material-based orbitronic devices., Comment: 26 pages,4 figures, submitted
- Published
- 2024
31. MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification
- Author
-
Sengupta, Saptarshi, Vashistha, Harsh, Curtis, Kristal, Mallipeddi, Akshay, Mathur, Abhinav, Ross, Joseph, and Gou, Liang
- Subjects
Computer Science - Computation and Language - Abstract
Extending the capabilities of Large Language Models (LLMs) with functions or tools for environment interaction has led to the emergence of the agent paradigm. In industry, training an LLM is not always feasible because of the scarcity of domain data, legal holds on proprietary customer data, rapidly changing business requirements, and the need to prototype new assistants. Agents provide an elegant solution to the above by relying on the zero-shot reasoning abilities of the underlying LLM and utilizing tools to explore and reason over customer data and respond to user requests. However, there are two concerns here: (I) acquiring large scale customer queries for agent testing is time-consuming, and (II) high reliance on the tool call sequence (or trajectory) followed by the agent to respond to user queries may lead to unexpected or incorrect behavior. To address this, we propose MAG-V, a multi-agent framework to first generate a dataset of questions that mimic customer queries; and second, reverse-engineer alternate questions from the responses for trajectory verification. Initial results indicate that our synthetic data can improve agent performance on actual customer queries. Furthermore, our trajectory verification methodology, inspired by distant supervision and using traditional machine learning (ML) models, outperforms a GPT-4o judge baseline by 11% accuracy and matches the performance of a GPT-4 judge on our constructed dataset. Overall, our approach is a step towards unifying diverse task agents into a cohesive framework for achieving an aligned objective.
- Published
- 2024
32. Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach
- Author
-
Zhang, Yijia, Gou, Zhihong, Cao, Shijie, Feng, Weigang, Zhang, Sicheng, Dai, Guohao, and Xu, Ningyi
- Subjects
Computer Science - Performance ,Computer Science - Machine Learning - Abstract
Deep Neural Networks (DNNs) have revolutionized various fields, but their deployment on GPUs often leads to significant energy consumption. Unlike existing methods for reducing GPU energy consumption, which are either hardware-inflexible or limited by workload constraints, this paper addresses the problem at the GPU kernel level. We propose a novel search-based compilation method to generate energy-efficient GPU kernels by incorporating energy efficiency into the search process. To accelerate the energy evaluation process, we develop an accurate energy cost model based on high-level kernel features. Furthermore, we introduce a dynamic updating strategy for the energy cost model, reducing the need for on-device energy measurements and accelerating the search process. Our evaluation demonstrates that the proposed approach can generate GPU kernels with up to 21.69% reduced energy consumption while maintaining low latency.
- Published
- 2024
33. Orientation Determination of Cryo-EM Images Using Block Stochastic Riemannian Subgradient Methods
- Author
-
Zhang, Wanyu, Gou, Ruili, Liu, Huikang, Wang, Zhiguo, and Ye, Yinyu
- Subjects
Mathematics - Optimization and Control - Abstract
The determination of molecular orientations is crucial for the three-dimensional reconstruction of Cryo-EM images. Traditionally addressed using the common-line method, this challenge is reformulated as a self-consistency error minimization problem constrained to rotation groups. In this paper, we consider the least unsquared deviation (LUD) formulation and employ a Riemannian subgradient method to effectively solve the orientation determination problem. To enhance computational efficiency, a block stochastic version of the method is proposed, and its convergence properties are rigorously established. Extensive numerical evaluations reveal that our method not only achieves accuracy comparable to that of state-of-the-art methods but also delivers an average 20-fold speedup. Additionally, we implement a modified formulation and algorithm specifically designed to address scenarios characterized by very low SNR.
- Published
- 2024
34. Evaluating and Advancing Multimodal Large Language Models in Ability Lens
- Author
-
Chen, Feng, Gou, Chenhui, Liu, Jing, Yang, Yang, Li, Zhaoyang, Zhang, Jiyuan, Sun, Zhenbang, Zhuang, Bohan, and Wu, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
As multimodal large language models (MLLMs) advance rapidly, rigorous evaluation has become essential, providing further guidance for their development. In this work, we focus on a unified and robust evaluation of vision perception abilities, the foundational skill of MLLMs. We find that existing perception benchmarks, each focusing on different question types, domains, and evaluation metrics, introduce significant evaluation variance, complicating comprehensive assessments of perception abilities when relying on any single benchmark. To address this, we introduce AbilityLens, a unified benchmark designed to evaluate MLLMs across six key perception abilities, focusing on both accuracy and stability, with each ability encompassing diverse question types, domains, and metrics. With the assistance of AbilityLens, we: (1) identify the strengths and weaknesses of current models, highlighting stability patterns and revealing a notable performance gap between open-source and closed-source models; (2) introduce an online evaluation mode, which uncovers interesting ability conflict and early convergence phenomena during MLLM training; and (3) design a simple ability-specific model merging method that combines the best ability checkpoint from early training stages, effectively mitigating performance decline due to ability conflict. The benchmark and online leaderboard will be released soon.
- Published
- 2024
35. Conditional Distribution Learning on Graphs
- Author
-
Chen, Jie, Mao, Hua, Gou, Yuanbiao, Wang, Zhu, and Peng, Xi
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Leveraging the diversity and quantity of data provided by various graph-structured data augmentations while preserving intrinsic semantic information is challenging. Additionally, successive layers in graph neural networks (GNNs) tend to produce more similar node embeddings, while graph contrastive learning aims to increase the dissimilarity between negative pairs of node embeddings. This inevitably results in a conflict between the message-passing mechanism (MPM) of GNNs and the contrastive learning (CL) of negative pairs via intraviews. In this paper, we propose a conditional distribution learning (CDL) method that learns graph representations from graph-structured data for semisupervised graph classification. Specifically, we present an end-to-end graph representation learning model to align the conditional distributions of weakly and strongly augmented features over the original features. This alignment enables the CDL model to effectively preserve intrinsic semantic information when both weak and strong augmentations are applied to graph-structured data. To avoid the conflict between the MPM and the CL of negative pairs, positive pairs of node representations are retained for measuring the similarity between the original features and the corresponding weakly augmented features. Extensive experiments with several benchmark graph datasets demonstrate the effectiveness of the proposed CDL method., Comment: 9 pages
- Published
- 2024
36. SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search
- Author
-
Gou, Yutong, Gao, Jianyang, Xu, Yuexuan, and Long, Cheng
- Subjects
Computer Science - Databases ,Computer Science - Information Retrieval - Abstract
Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space has a broad range of applications. Among existing ANN algorithms, graph-based methods have shown superior performance in terms of the time-accuracy trade-off. However, they face performance bottlenecks due to the random memory accesses caused by the searching process on the graph indices and the costs of computing exact distances to guide the searching process. To relieve the bottlenecks, a recent method named NGT-QG makes an attempt by integrating quantization and graph. It (1) replicates and stores the quantization codes of a vertex's neighbors compactly so that they can be accessed sequentially, and (2) uses a SIMD-based implementation named FastScan to efficiently estimate distances based on the quantization codes in batch for guiding the searching process. While NGT-QG achieves promising improvements over the vanilla graph-based methods, it has not fully unleashed the potential of integrating quantization and graph. For instance, it entails a re-ranking step to compute exact distances at the end, which introduces extra random memory accesses; its graph structure is not jointly designed considering the in-batch nature of FastScan, which wastes computation during searching. In this work, following NGT-QG, we present a new method named SymphonyQG, which achieves more symphonious integration of quantization and graph (e.g., it avoids the explicit re-ranking step and refines the graph structure to be more aligned with FastScan). Based on extensive experiments on real-world datasets, SymphonyQG establishes the new state-of-the-art in terms of the time-accuracy trade-off., Comment: The paper has been accepted by SIGMOD 2025
- Published
- 2024
37. ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving
- Author
-
Yu, Botao, Baker, Frazier N., Chen, Ziru, Herb, Garrett, Gou, Boyu, Adu-Ampratwum, Daniel, Ning, Xia, and Sun, Huan
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computational Engineering, Finance, and Science - Abstract
To enhance large language models (LLMs) for chemistry problem solving, several LLM-based agents augmented with tools have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemToolAgent, an enhanced chemistry agent over ChemCrow, and conduct a comprehensive evaluation of its performance on both specialized chemistry tasks and general chemistry questions. Surprisingly, ChemToolAgent does not consistently outperform its base LLMs without tools. Our error analysis with a chemistry expert suggests that: For specialized chemistry tasks, such as synthesis prediction, we should augment agents with specialized tools; however, for general chemistry questions like those in exams, agents' ability to reason correctly with chemistry knowledge matters more, and tool augmentation does not always help., Comment: Accepted to NAACL 2025 Findings. Previous title: Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving. Based on the camera ready version, this version adds more experimental results
- Published
- 2024
38. Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
- Author
-
Gu, Yu, Zheng, Boyuan, Gou, Boyu, Zhang, Kai, Chang, Cheng, Srivastava, Sanjari, Xie, Yanan, Qi, Peng, Sun, Huan, and Su, Yu
- Subjects
Computer Science - Artificial Intelligence - Abstract
Language agents have demonstrated promising capabilities in automating web-based tasks, though their current reactive approaches still substantially underperform humans. While incorporating advanced planning algorithms, particularly tree search methods, could enhance these agents' performance, implementing tree search directly on live websites poses significant safety risks and practical constraints due to irreversible actions such as confirming a purchase. In this paper, we introduce a novel paradigm that augments language agents with model-based planning, pioneering the innovative use of large language models (LLMs) as world models in complex web environments. Our method, WebDreamer, builds on the key insight that LLMs inherently encode comprehensive knowledge about website structures and functionalities. Specifically, WebDreamer uses LLMs to simulate outcomes for each candidate action (e.g., "what would happen if I click this button?") using natural language descriptions, and then evaluates these imagined outcomes to determine the optimal action at each step. Empirical results on two representative web agent benchmarks with online interaction -- VisualWebArena and Mind2Web-live -- demonstrate that WebDreamer achieves substantial improvements over reactive baselines. By establishing the viability of LLMs as world models in web environments, this work lays the groundwork for a paradigm shift in automated web interaction. More broadly, our findings open exciting new avenues for future research into 1) optimizing LLMs specifically for world modeling in complex, dynamic environments, and 2) model-based speculative planning for language agents., Comment: 18 pages, 6 figures, 4 tables
- Published
- 2024
39. Detection of two TeV gamma-ray outbursts from NGC 1275 by LHAASO
- Author
-
Cao, Zhen, Aharonian, F., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, della Volpe, D., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Heller, M., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. 
W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The Water Cherenkov Detector Array (WCDA) is one of the components of the Large High Altitude Air Shower Observatory (LHAASO) and can monitor any source over two-thirds of the sky for up to 7 hours per day with a >98% duty cycle. In this work, we report the detection of two outbursts of the Fanaroff-Riley I radio galaxy NGC 1275 that were detected by LHAASO-WCDA between November 2022 and January 2023 with statistical significances of $5.2\sigma$ and $8.3\sigma$. The observed spectral energy distribution in the range from 500 GeV to 3 TeV is fitted by a power law with best-fit spectral indices of $\alpha=-3.37\pm0.52$ and $-3.35\pm0.29$, respectively. The outburst fluxes above 0.5 TeV were $(4.55\pm 4.21)\times10^{-11}~\mathrm{cm^{-2}~s^{-1}}$ and $(3.45\pm 1.78)\times10^{-11}~\mathrm{cm^{-2}~s^{-1}}$, corresponding to 60% and 45% of the Crab Nebula flux, respectively. Variation analysis reveals a variability time-scale of days in the TeV energy band. A simple test with a one-zone synchrotron self-Compton model reproduces the data in the gamma-ray band well., Comment: 11 pages, 8 figures, 3 tables
- Published
- 2024
40. Spintwistronics: Photonic bilayer topological lattices tuning extreme spin-orbit interactions
- Author
-
Shi, Peng, Gou, Xinxin, Zhang, Qiang, Wei, Weiyu, Wu, Haijun, Li, Songze, Zhu, Zhihan, Shen, Yijie, and Yuan, Xiaocong
- Subjects
Physics - Optics - Abstract
Twistronics, the manipulation of Moiré superlattices via the twisting of two layers of two-dimensional (2D) materials to control diverse and nontrivial properties, has recently revolutionized condensed matter and materials physics. Here, we introduce the principles of twistronics to spin photonics, coining this emerging field spintwistronics. In spintwistronics, instead of 2D materials, the two layers consist of photonic topological spin lattices on a surface plasmonic polariton (SPP) platform. Each 2D SPP wave supports the construction of topological lattices formed by photonic spins with stable skyrmion topology governed by rotational symmetry. By introducing spintwistronics into plasmonics, we demonstrate theoretically and experimentally that two layers of photonic spin lattices can produce Moiré spin superlattices at specific magic angles. These superlattices, modulated periodically by the quantum number of total angular momentum, exhibit novel properties (including new quasiparticle topologies, multiple fractal patterns, extremely slow-light control, and more) that cannot be achieved in conventional plasmonic systems. As a result, they open up multiple degrees of freedom for practical applications in quantum information, optical data storage and chiral light-matter interactions., Comment: 4 figures
- Published
- 2024
41. Prove Your Point!: Bringing Proof-Enhancement Principles to Argumentative Essay Generation
- Author
-
Xiao, Ruiyu, Wu, Lei, Gou, Yuhang, Zhang, Weinan, and Liu, Ting
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Argumentative essay generation (AEG) aims to generate complete texts on specific controversial topics or debates. Although current AEG methods can generate individual opinions, they often overlook the high-level connections between these opinions. This often leads to the generated results being mired in logical confusion, unable to prove their own arguments effectively. The generated essay may present evidence that contradicts the claims, or it may fail to assemble the claims into a logical flow. In this paper, we present a unified two-stage framework: Proof-Enhancement and Self-Annotation (PESA) for AEG with a focus on logical enhancement. Specifically, we first construct pseudo-labels for logical information, claims and grounds, using a large language model. We then propose a tree planning approach that introduces proof principles and ensures logical consistency. Extensive experimental results show that, benefiting from proof principle guidance, PESA generates argumentative essays with better logical validity and persuasiveness than strong baseline models., Comment: EMNLP 2024
- Published
- 2024
42. On the second moment of twisted $L$-functions
- Author
-
Gou, Haozhe and Li, Liangxun
- Subjects
Mathematics - Number Theory - Abstract
Under various suitable assumptions, we describe a general method to obtain a log-saving upper bound for the second moment of standard twisted $L$-functions in the $q$-aspect. Specifically, for a standard $L$-function $L(s, F)$ of degree $d\geq3$, the bound \[ \sideset{}{^*}\sum_{\chi \pmod{q}}\Big|L\big(\frac{1}{2}, F\times \chi \big)\Big|^2\ll_{F,\eta} \frac{q^{\frac{d}{2}}}{\log^{\eta}q} \] holds for some small $\eta>0$., Comment: 24 pages, comments welcome!
- Published
- 2024
43. Denoise-I2W: Mapping Images to Denoising Words for Accurate Zero-Shot Composed Image Retrieval
- Author
-
Tang, Yuanmin, Yu, Jing, Gai, Keke, Zhuang, Jiamin, Gou, Gaopeng, Xiong, Gang, and Wu, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Zero-Shot Composed Image Retrieval (ZS-CIR) supports diverse tasks with a broad range of visual content manipulation intentions that can be related to domain, scene, object, and attribute. A key challenge for ZS-CIR is to accurately map image representation to a pseudo-word token that captures the manipulation intention relevant image information for generalized CIR. However, existing methods between the retrieval and pre-training stages lead to significant redundancy in the pseudo-word tokens. In this paper, we propose a novel denoising image-to-word mapping approach, named Denoise-I2W, for mapping images into denoising pseudo-word tokens that, without intention-irrelevant visual information, enhance accurate ZS-CIR. Specifically, a pseudo triplet construction module first automatically constructs pseudo triplets (i.e., a pseudo-reference image, a pseudo-manipulation text, and a target image) for pre-training the denoising mapping network. Then, a pseudo-composed mapping module maps the pseudo-reference image to a pseudo-word token and combines it with the pseudo-manipulation text with manipulation intention. This combination aligns with the target image, facilitating denoising intention-irrelevant visual information for mapping. Our proposed Denoise-I2W is a model-agnostic and annotation-free approach. It demonstrates strong generalization capabilities across three state-of-the-art ZS-CIR models on four benchmark datasets. By integrating Denoise-I2W with existing best models, we obtain consistent and significant performance boosts ranging from 1.45% to 4.17% over the best methods without increasing inference costs, and achieve new state-of-the-art results on ZS-CIR. Our code is available at https://github.com/Pter61/denoise-i2w-tmm., Comment: This work was submitted to IJCAI 2024, with a score of weak accept and borderline accept
- Published
- 2024
44. CPE-Pro: A Structure-Sensitive Deep Learning Method for Protein Representation and Origin Evaluation
- Author
-
Gou, Wenrui, Ge, Wenhui, Tan, Yang, Li, Mingchen, Fan, Guisheng, and Yu, Huiqun
- Subjects
Quantitative Biology - Biomolecules ,Computer Science - Computation and Language ,Computer Science - Machine Learning ,Quantitative Biology - Quantitative Methods - Abstract
Protein structures are important for understanding their functions and interactions. Currently, many protein structure prediction methods are enriching the structure database. Discriminating the origin of structures is crucial for distinguishing between experimentally resolved and computationally predicted structures, evaluating the reliability of prediction methods, and guiding downstream biological studies. Building on work in structure prediction, we developed a structure-sensitive supervised deep learning model, Crystal vs Predicted Evaluator for Protein Structure (CPE-Pro), to represent and discriminate the origin of protein structures. CPE-Pro learns the structural information of proteins and captures inter-structural differences to achieve accurate traceability on four data classes, and is expected to be extended to more. Simultaneously, we utilized Foldseek to encode protein structures into "structure-sequences" and trained a protein Structural Sequence Language Model, SSLM. Preliminary experiments demonstrated that, compared to large-scale protein language models pre-trained on vast amounts of amino acid sequences, the "structure-sequence" enables the language model to learn more informative protein features, enhancing and optimizing structural representations. We have provided the code, model weights, and all related materials at https://github.com/GouWenrui/CPE-Pro-main.git.
- Published
- 2024
45. The Milky Way atlas for linear filaments II. clump rotation versus filament orientation
- Author
-
Xu, Xuefang, Wang, Ke, Gou, Qian, Baug, Tapas, Li, Di, Duan, Chunguo, and Lei, Juncheng
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
Dense clumps distributed along filaments are the immediate medium for star formation. Kinematic properties of the clumps, such as velocity gradient and angular momentum, combined with filament orientation, provide important clues to the formation mechanism of filament-clump configurations and the role of filaments in star formation. By cross-matching the Milky Way atlas for linear filaments and the Structure, Excitation and Dynamics of the Inner Galactic Interstellar Medium (SEDIGISM) $^{13}$CO (2-1) data, we aim to derive the velocity gradient and its direction, the specific angular momentum ($J/M$), and the ratio ($\beta$) between the rotational energy and gravitational energy of clumps, as well as to investigate the alignment between clump rotation and filament orientation. We found a monotonic increase in $J/M$ as a function of clump size ($R$), following a power-law relation $J/M \propto R^{1.5\pm0.2}$. The ratio $\beta$ ranges from $1.1\times10^{-5}$ to 0.1, with a median value of $1.0\times10^{-3}$, suggesting that clump rotation provides insignificant support against gravitational collapse. The distribution of the angle between clump rotation and natal filament orientation is random, indicating that the clumps' rotational axes have no discernible correlation with the orientation of their hosting filaments. Counting only the most massive clump in each filament also finds no alignment between clump rotation and filament orientation., Comment: Accepted by MNRAS. 9 pages, 5 figures, 2 tables
- Published
- 2024
46. SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models
- Author
-
Li, Yuqi, Lu, Yao, Dong, Zeyu, Yang, Chuanguang, Chen, Yihao, and Gou, Jianping
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
The deployment of Deep Neural Network (DNN)-based networks on resource-constrained devices remains a significant challenge due to their high computational and parameter requirements. To solve this problem, layer pruning has emerged as a potent approach to reduce network size and improve computational efficiency. However, existing layer pruning methods mostly overlook the intrinsic connections and inter-dependencies between different layers within complicated deep neural networks. This oversight can result in pruned models that do not preserve the essential characteristics of the pre-trained network as effectively as desired. To address this limitation, we propose a Similarity Guided fast Layer Partition pruning for compressing large deep models (SGLP), which focuses on pruning layers from network segments partitioned via representation similarity. Specifically, our presented method first leverages Centered Kernel Alignment (CKA) to indicate the internal representations among the layers of the pre-trained network, which provides us with a potent basis for layer pruning. Based on the similarity matrix derived from CKA, we employ Fisher Optimal Segmentation to partition the network into multiple segments, which provides a basis for removing the layers in a segment-wise manner. In addition, our method innovatively adopts GradNorm for segment-wise layer importance evaluation, eliminating the need for extensive fine-tuning, and finally prunes the unimportant layers to obtain a compact network. Experimental results in image classification and for large language models (LLMs) demonstrate that our proposed SGLP outperforms the state-of-the-art methods in both accuracy and computational efficiency, presenting a more effective solution for deploying DNNs on resource-limited platforms. Our code is available at https://github.com/itsnotacie/information-fusion-SGLP., Comment: 20 pages
- Published
- 2024
47. Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification
- Author
-
Gou, Jiaxiang, Ji, Luping, Liu, Pei, and Ye, Mao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Whole Slide Image (WSI) classification has very significant applications in clinical pathology, e.g., tumor identification and cancer diagnosis. Currently, most research attention is focused on Multiple Instance Learning (MIL) using static datasets. One of the most obvious weaknesses of these methods is that they cannot efficiently preserve and utilize previously learned knowledge. With any new data arriving, classification models are required to be re-trained on both previous and current new data. To overcome this shortcoming and break through traditional vision modality, this paper proposes the first Vision-Language-based framework with Queryable Prototype Multiple Instance Learning (QPMIL-VL) specially designed for incremental WSI classification. This framework mainly consists of two information processing branches: one is for generating bag-level features by prototype-guided aggregation of instance features, while the other is for enhancing class features through a combination of class ensemble, tunable vector and class similarity loss. The experiments on four public WSI datasets demonstrate that our QPMIL-VL framework is effective for incremental WSI classification and often significantly outperforms other compared methods, achieving state-of-the-art (SOTA) performance. Our source code is publicly available at https://github.com/can-can-ya/QPMIL-VL., Comment: Accepted by AAAI 2025
- Published
- 2024
48. Toeplitz Operators with Positive Measures on Harmonic Fock Spaces
- Author
-
Gou, Xue, Hu, Xin, and Huang, Sui
- Subjects
Mathematics - Functional Analysis ,47B35, 47B10 - Abstract
In this paper, we study the basic properties of Toeplitz operators with positive measures $\mu$ on harmonic Fock spaces. We prove equivalent conditions for the boundedness, compactness, and Schatten class $S_{p}$ membership of $T_{\mu}$ using the Berezin transform of operators., Comment: 12 pages
- Published
- 2024
49. Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
- Author
-
Gou, Boyu, Wang, Ruohan, Zheng, Boyuan, Xie, Yanan, Chang, Cheng, Shu, Yiheng, Sun, Huan, and Su, Yu
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Multimodal large language models (MLLMs) are transforming the capabilities of graphical user interface (GUI) agents, facilitating their transition from controlled simulations to complex, real-world applications across various platforms. However, the effectiveness of these agents hinges on the robustness of their grounding capability. Current GUI agents predominantly utilize text-based representations such as HTML or accessibility trees, which, despite their utility, often introduce noise, incompleteness, and increased computational overhead. In this paper, we advocate a human-like embodiment for GUI agents that perceive the environment entirely visually and directly perform pixel-level operations on the GUI. The key is visual grounding models that can accurately map diverse referring expressions of GUI elements to their coordinates on the GUI across different platforms. We show that a simple recipe, which includes web-based synthetic data and slight adaptation of the LLaVA architecture, is surprisingly effective for training such visual grounding models. We collect the largest dataset for GUI visual grounding so far, containing 10M GUI elements and their referring expressions over 1.3M screenshots, and use it to train UGround, a strong universal visual grounding model for GUI agents. Empirical results on six benchmarks spanning three categories (grounding, offline agent, and online agent) show that 1) UGround substantially outperforms existing visual grounding models for GUI agents, by up to 20% absolute, and 2) agents with UGround outperform state-of-the-art agents, despite the fact that existing agents use additional text-based input while ours only uses visual perception. These results provide strong support for the feasibility and promise of GUI agents that navigate the digital world as humans do., Comment: Accepted to ICLR 2025 (Oral)
- Published
- 2024
50. LHAASO detection of very-high-energy gamma-ray emission surrounding PSR J0248+6021
- Author
-
Cao, Zhen, Aharonian, F., An, Q., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zheng, J. H., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
We report the detection of an extended very-high-energy (VHE) gamma-ray source coincident with the location of the middle-aged (62.4 kyr) pulsar PSR J0248+6021, using 796 days of live-time LHAASO-WCDA data and 1216 days of live-time LHAASO-KM2A data. A significant excess of gamma-ray-induced showers is observed by both WCDA in the 1-25 TeV energy band and KM2A in the >25 TeV energy band, at 7.3σ and 13.5σ, respectively. The best-fit position derived from the WCDA data is R.A. = 42.06° ± 0.12° and Dec. = 60.24° ± 0.13° with an extension of 0.69° ± 0.15°, and that from the KM2A data is R.A. = 42.29° ± 0.13° and Dec. = 60.38° ± 0.07° with an extension of 0.37° ± 0.07°. No clear extended multiwavelength counterpart of this LHAASO source has been found from the radio band to the GeV band. The most plausible explanation of the VHE gamma-ray emission is the inverse Compton process of highly relativistic electrons and positrons injected by the pulsar. These electrons/positrons are hypothesized to be either confined within the pulsar wind nebula or to have already escaped into the interstellar medium, forming a pulsar halo., Comment: 12 pages, 10 figures, Accepted by Sci. China-Phys. Mech. Astron
- Published
- 2024