1,441,799 results on '"Tan AT"'
Search Results
2. BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
- Author
Lu, Xudong, Chen, Yinghao, Chen, Cheng, Tan, Hui, Chen, Boheng, Xie, Yina, Hu, Rui, Tan, Guanxin, Wu, Renshou, Hu, Yan, Zeng, Yi, Wu, Lei, Bian, Liuyang, Wang, Zhaoxiong, Liu, Long, Yang, Yanzhou, Xiao, Han, Zhou, Aojun, Wen, Yafei, Chen, Xiaoxin, Ren, Shuai, and Li, Hongsheng
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
- Abstract
The emergence and growing popularity of multimodal large language models (MLLMs) have significant potential to enhance various aspects of daily life, from improving communication to facilitating learning and problem-solving. Mobile phones, as essential daily companions, represent the most effective and accessible deployment platform for MLLMs, enabling seamless integration into everyday tasks. However, deploying MLLMs on mobile phones presents challenges due to limitations in memory size and computational capability, making it difficult to achieve smooth and real-time processing without extensive optimization. In this paper, we present BlueLM-V-3B, an algorithm and system co-design approach specifically tailored for the efficient deployment of MLLMs on mobile platforms. To be specific, we redesign the dynamic resolution scheme adopted by mainstream MLLMs and implement system optimization for hardware-aware deployment to optimize model inference on mobile phones. BlueLM-V-3B boasts the following key highlights: (1) Small Size: BlueLM-V-3B features a language model with 2.7B parameters and a vision encoder with 400M parameters. (2) Fast Speed: BlueLM-V-3B achieves a generation speed of 24.4 token/s on the MediaTek Dimensity 9300 processor with 4-bit LLM weight quantization. (3) Strong Performance: BlueLM-V-3B has attained the highest average score of 66.1 on the OpenCompass benchmark among models with $\leq$ 4B parameters and surpassed a series of models with much larger parameter sizes (e.g., MiniCPM-V-2.6, InternVL2-8B). Comment: 21 pages
- Published
- 2024
3. Evaluating the Generation of Spatial Relations in Text and Image Generative Models
- Author
Sim, Shang Hong, Lee, Clarence, Tan, Alvin, and Tan, Cheston
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Understanding spatial relations is a crucial cognitive ability for both humans and AI. While current research has predominantly focused on the benchmarking of text-to-image (T2I) models, we propose a more comprehensive evaluation that includes both T2I models and Large Language Models (LLMs). As spatial relations are naturally understood in a visuo-spatial manner, we develop an approach to convert LLM outputs into an image, thereby allowing us to evaluate both T2I models and LLMs visually. We examined the spatial relation understanding of 8 prominent generative models (3 T2I models and 5 LLMs) on a set of 10 common prepositions, and assessed the feasibility of automatic evaluation methods. Surprisingly, we found that T2I models only achieve subpar performance despite their impressive general image-generation abilities. Even more surprisingly, our results show that LLMs are significantly more accurate than T2I models in generating spatial relations, despite being primarily trained on textual data. We examined reasons for model failures and highlight gaps that can be filled to enable more spatially faithful generations.
- Published
- 2024
4. Personalize to generalize: Towards a universal medical multi-modality generalization through personalization
- Author
Tan, Zhaorui, Yang, Xi, Pan, Tan, Liu, Tianyi, Jiang, Chen, Guo, Xin, Wang, Qiufeng, Nguyen, Anh, Qi, Yuan, Huang, Kaizhu, and Cheng, Yuan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
The differences among medical imaging modalities, driven by distinct underlying principles, pose significant challenges for generalization in multi-modal medical tasks. Beyond modality gaps, individual variations, such as differences in organ size and metabolic rate, further impede a model's ability to generalize effectively across both modalities and diverse populations. Despite the importance of personalization, existing approaches to multi-modal generalization often neglect individual differences, focusing solely on common anatomical features. This limitation may result in weakened generalization in various medical tasks. In this paper, we unveil that personalization is critical for multi-modal generalization. Specifically, we propose an approach to achieve personalized generalization through approximating the underlying personalized invariant representation ${X}_h$ across various modalities by leveraging individual-level constraints and a learnable biological prior. We validate the feasibility and benefits of learning a personalized ${X}_h$, showing that this representation is highly generalizable and transferable across various multi-modal medical tasks. Extensive experimental results consistently show that the additionally incorporated personalization significantly improves performance and generalization across diverse scenarios, confirming its effectiveness.
- Published
- 2024
5. PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption
- Author
Tan, Yifan, Tan, Cheng, Mi, Zeyu, and Chen, Haibo
- Subjects
Computer Science - Cryptography and Security, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Confidential computing on GPUs, like NVIDIA H100, mitigates the security risks of outsourced Large Language Models (LLMs) by implementing strong isolation and data encryption. Nonetheless, this encryption incurs a significant performance overhead, reaching up to 52.8 percent and 88.2 percent throughput drop when serving OPT-30B and OPT-66B, respectively. To address this challenge, we introduce PipeLLM, a user-transparent runtime system. PipeLLM removes the overhead by overlapping the encryption and GPU computation through pipelining - an idea inspired by the CPU instruction pipelining - thereby effectively concealing the latency increase caused by encryption. The primary technical challenge is that, unlike CPUs, the encryption module lacks prior knowledge of the specific data needing encryption until it is requested by the GPUs. To this end, we propose speculative pipelined encryption to predict the data requiring encryption by analyzing the serving patterns of LLMs. Further, we have developed an efficient, low-cost pipeline relinquishing approach for instances of incorrect predictions. Our experiments on NVIDIA H100 GPU show that compared with vanilla systems without confidential computing (e.g., vLLM, PEFT, and FlexGen), PipeLLM incurs modest overhead (less than 19.6 percent in throughput) across various LLM sizes, from 13B to 175B. Comment: To appear in ASPLOS 2025
- Published
- 2024
6. Can Personalized Medicine Coexist with Health Equity? Examining the Cost Barrier and Ethical Implications
- Author
Francisco, Kishi Kobe Yee, Apuhin, Andrane Estelle Carnicer, Tan, Myles Joshua Toledo, Byers, Mickael Cavanaugh, Maravilla, Nicholle Mae Amor Tan, Karim, Hezerul Abdul, and AlDahoul, Nouar
- Subjects
Computer Science - Computers and Society
- Abstract
Personalized medicine (PM) promises to transform healthcare by providing treatments tailored to individual genetic, environmental, and lifestyle factors. However, its high costs and infrastructure demands raise concerns about exacerbating health disparities, especially between high-income countries (HICs) and low- and middle-income countries (LMICs). While HICs benefit from advanced PM applications through AI and genomics, LMICs often lack the resources necessary to adopt these innovations, leading to a widening healthcare divide. This paper explores the financial and ethical challenges of PM implementation, with a focus on ensuring equitable access. It proposes strategies for global collaboration, infrastructure development, and ethical frameworks to support LMICs in adopting PM, aiming to prevent further disparities in healthcare accessibility and outcomes. Comment: 30 pages, 1 figure
- Published
- 2024
7. ChatGPT versus a Customized AI Chatbot (Anatbuddy) for Anatomy Education: A Comparative Pilot Study
- Author
Gautham Arun, Vivek Perumal, Francis Paul John Bato Urias, Yan En Ler, Bryan Wen Tao Tan, Ranganath Vallabhajosyula, Emmanuel Tan, Olivia Ng, Kian Bee Ng, and Sreenivasulu Reddy Mogali
- Abstract
Large Language Models (LLMs) have the potential to improve education by personalizing learning. However, ChatGPT-generated content has been criticized for sometimes producing false, biased, and/or hallucinatory information. To evaluate AI's ability to return clear and accurate anatomy information, this study generated a custom interactive and intelligent chatbot (Anatbuddy) through an Open AI Application Programming Interface (API) that enables seamless AI-driven interactions within a secured cloud infrastructure. Anatbuddy was programmed through a Retrieval Augmented Generation (RAG) method to provide context-aware responses to user queries based on a predetermined knowledge base. To compare their outputs, various queries (i.e., prompts) on thoracic anatomy (n = 18) were fed into Anatbuddy and ChatGPT 3.5. A panel comprising three experienced anatomists evaluated both tools' responses for factual accuracy, relevance, completeness, coherence, and fluency on a 5-point Likert scale. These ratings were reviewed by a third party blinded to the study, who revised and finalized scores as needed. Anatbuddy's factual accuracy (mean ± SD = 4.78/5.00 ± 0.43; median = 5.00) was rated significantly higher (U = 84, p = 0.01) than ChatGPT's accuracy (4.11 ± 0.83; median = 4.00). No statistically significant differences were detected between the chatbots for the other variables. Given ChatGPT's current content knowledge limitations, we strongly recommend the anatomy profession develop a custom AI chatbot for anatomy education utilizing a carefully curated knowledge base to ensure accuracy. Further research is needed to determine students' acceptance of custom chatbots for anatomy education and their influence on learning experiences and outcomes.
- Published
- 2024
8. COMET: Benchmark for Comprehensive Biological Multi-omics Evaluation Tasks and Language Models
- Author
Ren, Yuchen, Han, Wenwei, Zhang, Qianyuan, Tang, Yining, Bai, Weiqiang, Cai, Yuchen, Qiao, Lifeng, Jiang, Hao, Yuan, Dong, Chen, Tao, Sun, Siqi, Tan, Pan, Ouyang, Wanli, Dong, Nanqing, Ma, Xinzhu, and Ye, Peng
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
As key elements within the central dogma, DNA, RNA, and proteins play crucial roles in maintaining life by guaranteeing accurate genetic expression and implementation. Although research on these molecules has profoundly impacted fields like medicine, agriculture, and industry, the diversity of machine learning approaches, from traditional statistical methods to deep learning models and large language models, poses challenges for researchers in choosing the most suitable models for specific tasks, especially for cross-omics and multi-omics tasks due to the lack of comprehensive benchmarks. To address this, we introduce the first comprehensive multi-omics benchmark COMET (Benchmark for Biological COmprehensive Multi-omics Evaluation Tasks and Language Models), designed to evaluate models across single-omics, cross-omics, and multi-omics tasks. First, we curate and develop a diverse collection of downstream tasks and datasets covering key structural and functional aspects in DNA, RNA, and proteins, including tasks that span multiple omics levels. Then, we evaluate existing foundational language models for DNA, RNA, and proteins, as well as the newly proposed multi-omics method, offering valuable insights into their performance in integrating and analyzing data from different biological modalities. This benchmark aims to define critical issues in multi-omics research and guide future directions, ultimately promoting advancements in understanding biological processes through integrated and different omics data analysis.
- Published
- 2024
9. Quantum chromatic numbers of some graphs in Hamming schemes
- Author
Cao, Xiwang, Feng, Keqin, and Tan, Ying-Ying
- Subjects
Mathematics - Combinatorics, 05C15, 05E30, 94B25, 97K30
- Abstract
The study of quantum chromatic numbers of graphs has been a hot research topic in recent years. However, infinite families of graphs with known quantum chromatic numbers are rare; as far as we know, the only known such graphs (except for complete graphs, cycles, bipartite graphs and some trivial cases) are the Hadamard graphs $H_n$ with $2^n$ vertices and $n$ being a multiple of $4$. In this paper, we define a class of graphs named generalized Hadamard graphs and determine the quantum chromatic numbers of one class of such graphs. Notably, this is the second known family of graphs whose quantum chromatic numbers are explicitly determined, apart from the cases mentioned above. We also provide some bounds for the quantum chromatic numbers of some other generalized Hadamard graphs. Consequently, we can obtain the quantum chromatic numbers of products of some graphs. Comment: 17 pages
- Published
- 2024
10. CO-CHANGES II: spatially resolved IRAM 30M CO line observations of 23 nearby edge-on spiral galaxies
- Author
Jiang, Yan, Li, Jiang-Tao, Tan, Qing-Hua, Ji, Li, Bregman, Joel N., Wang, Q. Daniel, Wang, Jian-Fa, Lu, Li-Yuan, and Jiang, Xue-Jian
- Subjects
Astrophysics - Astrophysics of Galaxies
- Abstract
Molecular gas, as the fuel for star formation, and its relationship with atomic gas are crucial for understanding how galaxies regulate their star forming (SF) activities. We conducted IRAM 30m observations of 23 nearby spiral galaxies from the CHANG-ES project to investigate the distribution of molecular gas and the Kennicutt-Schmidt law. Combining these results with atomic gas masses from previous studies, we aim to investigate the scaling relations that connect the molecular and atomic gas masses with stellar masses and the baryonic Tully-Fisher relation. Based on spatially resolved observations of the three CO lines, we calculated the total molecular gas masses, the ratios between different CO lines, and derived physical parameters such as temperature and optical depth. The median line ratios for nuclear/disk regions are 8.6/6.1 ($^{12}\mathrm{CO}/^{13}\mathrm{CO}\ J=1{-}0$) and 0.53/0.39 ($^{12}\mathrm{CO}\ J=2{-}1/J=1{-}0$). Molecular gas mass derived from $^{13}\mathrm{CO}$ is correlated with but systematically lower than that from $^{12}\mathrm{CO}$. Most galaxies follow the spatially resolved SF scaling relation with a median gas depletion timescale of approximately 1 Gyr, while a few exhibit shorter timescales of approximately 0.1 Gyr. The molecular-to-atomic gas mass ratio correlates strongly with stellar mass, consistent with previous studies. Galaxies with lower stellar masses show an excess of atomic gas, indicating less efficient conversion to molecular gas. Most galaxies tightly follow the baryonic Tully-Fisher relation, but NGC 2992 and NGC 4594 deviate from the relation due to different physical factors. We find that the ratio of the cold gas (comprising molecular and atomic gas) to the total baryon mass decreases with the gravitational potential of the galaxy, as traced by rotation velocity, which could be due to gas consumption in SF or being heated to the hot phase. Comment: 15 pages, 11 figures (Figure 3 shows only a single galaxy example from the sample; all other galaxies in the sample are available online only.), 4 tables (full version of Table 2 and Table 3 is only available online). Accepted for publication in A&A
- Published
- 2024
11. ScaleOT: Privacy-utility-scalable Offsite-tuning with Dynamic LayerReplace and Selective Rank Compression
- Author
Yao, Kai, Tan, Zhaorui, Ye, Tiandi, Li, Lichun, Zhao, Yuan, Liu, Wenyan, Wang, Wei, and Zhu, Jianke
- Subjects
Computer Science - Computation and Language, Computer Science - Cryptography and Security
- Abstract
Offsite-tuning is a privacy-preserving method for tuning large language models (LLMs) by sharing a lossy compressed emulator from the LLM owners with data owners for downstream task tuning. This approach protects the privacy of both the model and data owners. However, current offsite tuning methods often suffer from adaptation degradation, high computational costs, and limited protection strength due to uniformly dropping LLM layers or relying on expensive knowledge distillation. To address these issues, we propose ScaleOT, a novel privacy-utility-scalable offsite-tuning framework that effectively balances privacy and utility. ScaleOT introduces a novel layerwise lossy compression algorithm that uses reinforcement learning to obtain the importance of each layer. It employs lightweight networks, termed harmonizers, to replace the raw LLM layers. By combining important original LLM layers and harmonizers in different ratios, ScaleOT generates emulators tailored for optimal performance with various model scales for enhanced privacy protection. Additionally, we present a rank reduction method to further compress the original LLM layers, significantly enhancing privacy with negligible impact on utility. Comprehensive experiments show that ScaleOT can achieve nearly lossless offsite tuning performance compared with full fine-tuning while obtaining better model privacy. Comment: accepted by AAAI2025
- Published
- 2024
12. A high optical access cryogenic system for Rydberg atom arrays with a 3000-second trap lifetime
- Author
Zhang, Zhenpu, Hsu, Ting-Wei, Tan, Ting You, Slichter, Daniel H., Kaufman, Adam M., Marinelli, Matteo, and Regal, Cindy A.
- Subjects
Physics - Atomic Physics, Condensed Matter - Quantum Gases, Quantum Physics
- Abstract
We present an optical tweezer array of $^{87}$Rb atoms housed in a cryogenic environment that successfully combines a 4 K cryopumping surface, a <50 K cold box surrounding the atoms, and a room-temperature high-numerical-aperture objective lens. We demonstrate a 3000 s atom trap lifetime, which enables us to optimize and measure losses at the $10^{-4}$ level that arise during imaging and cooling, which are important to array rearrangement. We perform both ground-state qubit manipulation with an integrated microwave antenna and two-photon coherent Rydberg control, with the local electric field tuned to zero via integrated electrodes. We anticipate that the reduced blackbody radiation at the atoms from the cryogenic environment, combined with future electrical shielding, should decrease the rate of undesired transitions to nearby strongly-interacting Rydberg states, which cause many-body loss and impede Rydberg gates. This low-vibration, high-optical-access cryogenic platform can be used with a wide range of optically trapped atomic or molecular species for applications in quantum computing, simulation, and metrology. Comment: 19 pages, 10 figures
- Published
- 2024
13. Constraints on Pre-Big-Bang Cosmology from Advanced LIGO and Advanced Virgo's First Three Observing Runs
- Author
Tan, Qin, Chen, Zu Cheng, Wu, You, and Liu, Lang
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, General Relativity and Quantum Cosmology
- Abstract
We search for the stochastic gravitational-wave background (SGWB) predicted by pre-big-bang (PBB) cosmology using data from the first three observing runs of Advanced LIGO and Advanced Virgo. PBB cosmology proposes an alternative to cosmic inflation where the Universe evolves from a weak-coupling, low-curvature state to the hot Big Bang through a high-curvature bounce phase, predicting a distinctive SGWB spectrum. We perform a Bayesian analysis of the cross-correlation data to constrain the model parameters characterizing the PBB spectrum. We find no evidence for a PBB-induced SGWB, with a Bayes factor of $0.03$ between the PBB and noise-only model, strongly favoring the noise-only hypothesis. Our analysis establishes a lower bound $\beta \gtrsim -0.19$ at $95\%$ confidence level, which is compatible with the theoretical requirement $\beta \geq 0$ for a smooth bounce transition. While we do not detect a signal, our constraints remain consistent with the basic theoretical framework of PBB cosmology, demonstrating the potential of gravitational-wave observations to test early Universe theories. Comment: 13 pages, 1 figure
- Published
- 2024
14. Search for $D^0$ meson decays to $\pi^+ \pi^- e^+ e^-$ and $K^+ K^- e^+ e^-$ final states
- Author
LHCb collaboration, Aaij, R., Abdelmotteleb, A. S. W., Beteta, C. Abellan, Abudinén, F., Ackernley, T., Adefisoye, A. A., Adeva, B., Adinolfi, M., Adlarson, P., Agapopoulou, C., Aidala, C. A., Ajaltouni, Z., Akar, S., Akiba, K., Albicocco, P., Albrecht, J., Alessio, F., Alexander, M., Aliouche, Z., Cartelle, P. Alvarez, Amalric, R., Amato, S., Amey, J. L., Amhis, Y., An, L., Anderlini, L., Andersson, M., Andreianov, A., Andreola, P., Andreotti, M., Andreou, D., Anelli, A., Ao, D., Archilli, F., Argenton, M., Cuendis, S. Arguedas, Artamonov, A., Artuso, M., Aslanides, E., Da Silva, R. Ataíde, Atzeni, M., Audurier, B., Bacher, D., Perea, I. Bachiller, Bachmann, S., Bachmayer, M., Back, J. J., Rodriguez, P. Baladron, Balagura, V., Balboni, A., Baldini, W., Balzani, L., Bao, H., Leite, J. Baptista de Souza, Pretel, C. Barbero, Barbetti, M., Barbosa, I. R., Barlow, R. J., Barnyakov, M., Barsuk, S., Barter, W., Bartz, J., Basels, J. M., Bashir, S., Bassi, G., Batsukh, B., Battista, P. B., Bay, A., Beck, A., Becker, M., Bedeschi, F., Bediaga, I. B., Behling, N. A., Belin, S., Belous, K., Belov, I., Belyaev, I., Benane, G., Bencivenni, G., Ben-Haim, E., Berezhnoy, A., Bernet, R., Andres, S. Bernet, Bertolin, A., Betancourt, C., Betti, F., Bex, J., Bezshyiko, Ia., Bhom, J., Bieker, M. S., Biesuz, N. V., Billoir, P., Biolchini, A., Birch, M., Bishop, F. C. R., Bitadze, A., Bizzeti, A., Blake, T., Blanc, F., Blank, J. E., Blusk, S., Bocharnikov, V., Boelhauve, J. A., Garcia, O. Boente, Boettcher, T., Bohare, A., Boldyrev, A., Bolognani, C. S., Bolzonella, R., Bonacci, R. B., Bondar, N., Bordelius, A., Borgato, F., Borghi, S., Borsato, M., Borsuk, J. T., Bottalico, E., Bouchiba, S. A., Bovill, M., Bowcock, T. J. V., Boyer, A., Bozzi, C., Brandenburg, J. D., Rodriguez, A. Brea, Breer, N., Brodzicka, J., Gonzalo, A. Brossa, Brown, J., Brundu, D., Buchanan, E., Buonincontri, L., Marcos, M. Burgos, Burke, A. T., Burr, C., Butter, J. 
S., Buytaert, J., Byczynski, W., Cadeddu, S., Cai, H., Caillet, A. C., Calabrese, R., Ramirez, S. Calderon, Calefice, L., Cali, S., Calvi, M., Gomez, M. Calvo, Magalhaes, P. Camargo, Bouzas, J. I. Cambon, Campana, P., Perez, D. H. Campora, Quezada, A. F. Campoverde, Capelli, S., Capriotti, L., Caravaca-Mora, R., Carbone, A., Salgado, L. Carcedo, Cardinale, R., Cardini, A., Carniti, P., Carus, L., Vidal, A. Casais, Caspary, R., Casse, G., Cattaneo, M., Cavallero, G., Cavallini, V., Celani, S., Cesare, S., Chadwick, A. J., Chahrour, I., Charles, M., Charpentier, Ph., Chatzianagnostou, E., Chefdeville, M., Chen, C., Chen, S., Chen, Z., Chernov, A., Chernyshenko, S., Chiotopoulos, X., Chobanova, V., Chrzaszcz, M., Chubykin, A., Chulikov, V., Ciambrone, P., Vidal, X. Cid, Ciezarek, G., Cifra, P., Clarke, P. E. L., Clemencic, M., Cliff, H. V., Closier, J., Toapaxi, C. Cocha, Coco, V., Cogan, J., Cogneras, E., Cojocariu, L., Collaviti, S., Collins, P., Colombo, T., Colonna, M., Comerma-Montells, A., Congedo, L., Contu, A., Cooke, N., Corredoira, I., Correia, A., Corti, G., Meldrum, J. J. Cottee, Couturier, B., Craik, D. C., Torres, M. Cruz, Rivera, E. Curras, Currie, R., Da Silva, C. L., Dadabaev, S., Dai, L., Dai, X., Dall'Occo, E., Dalseno, J., D'Ambrosio, C., Daniel, J., Danilina, A., d'Argent, P., Darze, G., Davidson, A., Davies, J. E., Francisco, O. De Aguiar, De Angelis, C., De Benedetti, F., de Boer, J., De Bruyn, K., De Capua, S., De Cian, M., Da Graca, U. De Freitas Carneiro, De Lucia, E., De Miranda, J. M., De Paula, L., De Serio, M., De Simone, P., De Vellis, F., de Vries, J. A., Debernardis, F., Decamp, D., Dedu, V., Dekkers, S., Del Buono, L., Delaney, B., Dembinski, H. -P., Deng, J., Denysenko, V., Deschamps, O., Dettori, F., Dey, B., Di Nezza, P., Diachkov, I., Didenko, S., Ding, S., Dittmann, L., Dobishuk, V., Docheva, A. D., Dong, C., Donohoe, A. M., Dordei, F., Reis, A. C. dos, Dowling, A. D., Duan, W., Duda, P., Dudek, M. 
W., Dufour, L., Duk, V., Durante, P., Duras, M. M., Durham, J. M., Durmus, O. D., Dziurda, A., Dzyuba, A., Easo, S., Eckstein, E., Egede, U., Egorychev, A., Egorychev, V., Eisenhardt, S., Ejopu, E., Eklund, L., Elashri, M., Ellbracht, J., Ely, S., Ene, A., Eschle, J., Esen, S., Evans, T., Fabiano, F., Falcao, L. N., Fan, Y., Fang, B., Fantini, L., Faria, M., Farmer, K., Fazzini, D., Felkowski, L., Feng, M., Feo, M., Casani, A. Fernandez, Gomez, M. Fernandez, Fernez, A. D., Ferrari, F., Rodrigues, F. Ferreira, Ferrillo, M., Ferro-Luzzi, M., Filippov, S., Fini, R. A., Fiorini, M., Firlej, M., Fischer, K. L., Fitzgerald, D. S., Fitzpatrick, C., Fiutowski, T., Fleuret, F., Fontana, M., Foreman, L. F., Forty, R., Foulds-Holt, D., Lima, V. Franco, Sevilla, M. Franco, Frank, M., Franzoso, E., Frau, G., Frei, C., Friday, D. A., Fu, J., Führing, Q., Fujii, Y., Fulghesu, T., Gabriel, E., Galati, G., Galati, M. D., Torreira, A. Gallas, Galli, D., Gambetta, S., Gandelman, M., Gandini, P., Ganie, B., Gao, H., Gao, R., Gao, T. Q., Gao, Y., Martin, L. M. Garcia, Moreno, P. Garcia, Pardiñas, J. García, Gardner, P., Garg, K. G., Garrido, L., Gaspar, C., Gerken, L. L., Gersabeck, E., Gersabeck, M., Gershon, T., Ghizzo, S., Ghorbanimoghaddam, Z., Giambastiani, L., Giasemis, F. I., Gibson, V., Giemza, H. K., Gilman, A. L., Giovannetti, M., Gioventù, A., Girardey, L., Giugliano, C., Giza, M. A., Gkougkousis, E. L., Glaser, F. C., Gligorov, V. V., Göbel, C., Golobardes, E., Golubkov, D., Golutvin, A., Fernandez, S. Gomez, Gomulka, W., Abrantes, F. Goncalves, Goncerz, M., Gong, G., Gooding, J. A., Gorelov, I. V., Gotti, C., Govorkova, E., Grabowski, J. P., Cardoso, L. A. Granado, Graugés, E., Graverini, E., Grazette, L., Graziani, G., Grecu, A. T., Greeven, L. M., Grieser, N. A., Grillo, L., Gromov, S., Gu, C., Guarise, M., Guerry, L., Guliaeva, V., Günther, P. A., Guseinov, A. 
-K., Gushchin, E., Guz, Y., Gys, T., Habermann, K., Hadavizadeh, T., Hadjivasiliou, C., Haefeli, G., Haen, C., Hallett, G., Halvorsen, M. M., Hamilton, P. M., Hammerich, J., Han, Q., Han, X., Hansmann-Menzemer, S., Hao, L., Harnew, N., Harris, T. H., Hartmann, M., Hashmi, S., He, J., Hemmer, F., Henderson, C., Henderson, R. D. L., Hennequin, A. M., Hennessy, K., Henry, L., Herd, J., Gascon, P. Herrero, Heuel, J., Hicheur, A., Mendizabal, G. Hijano, Horswill, J., Hou, R., Hou, Y., Howarth, N., Hu, J., Hu, W., Hu, X., Huang, W., Hulsbergen, W., Hunter, R. J., Hushchyn, M., Hutchcroft, D., Idzik, M., Ilin, D., Ilten, P., Inglessi, A., Iniukhin, A., Ishteev, A., Ivshin, K., Jacobsson, R., Jage, H., Elles, S. J. Jaimes, Jakobsen, S., Jans, E., Jashal, B. K., Jawahery, A., Jevtic, V., Jiang, E., Jiang, X., Jiang, Y., Jiang, Y. J., John, M., Rajan, A. John Rubesh, Johnson, D., Jones, C. R., Jones, T. P., Joshi, S., Jost, B., Castella, J. Juan, Jurik, N., Juszczak, I., Kaminaris, D., Kandybei, S., Kane, M., Kang, Y., Kar, C., Karacson, M., Karpenkov, D., Kauniskangas, A., Kautz, J. W., Kazanecki, M. K., Keizer, F., Kenzie, M., Ketel, T., Khanji, B., Kharisova, A., Kholodenko, S., Khreich, G., Kirn, T., Kirsebom, V. S., Kitouni, O., Klaver, S., Kleijne, N., Klimaszewski, K., Kmiec, M. R., Koliiev, S., Kolk, L., Konoplyannikov, A., Kopciewicz, P., Koppenburg, P., Korolev, M., Kostiuk, I., Kot, O., Kotriakhova, S., Kozachuk, A., Kravchenko, P., Kravchuk, L., Kreps, M., Krokovny, P., Krupa, W., Krzemien, W., Kshyvanskyi, O., Kubis, S., Kucharczyk, M., Kudryavtsev, V., Kulikova, E., Kupsc, A., Kutsenko, B. K., Lacarrere, D., Gonzalez, P. Laguarta, Lai, A., Lampis, A., Lancierini, D., Gomez, C. Landesa, Lane, J. J., Lane, R., Lanfranchi, G., Langenbruch, C., Langer, J., Lantwin, O., Latham, T., Lazzari, F., Lazzeroni, C., Gac, R. Le, Lee, H., Lefèvre, R., Leflat, A., Legotin, S., Lehuraux, M., Cid, E. Lemos, Leroy, O., Lesiak, T., Lesser, E. 
D., Leverington, B., Li, A., Li, C., Li, H., Li, K., Li, L., Li, M., Li, P., Li, P. -R., Li, Q., Li, S., Li, T., Li, Y., Lian, Z., Liang, X., Libralon, S., Lin, C., Lin, T., Lindner, R., Linton, H., Lisovskyi, V., Litvinov, R., Liu, F. L., Liu, G., Liu, K., Liu, S., Liu, W., Liu, Y., Liu, Y. L., Ordonez, G. Loachamin, Salvia, A. Lobo, Loi, A., Long, T., Lopes, J. H., Huertas, A. Lopez, Soliño, S. López, Lu, Q., Lucarelli, C., Lucchesi, D., Martinez, M. Lucio, Lukashenko, V., Luo, Y., Lupato, A., Luppi, E., Lynch, K., Lyu, X. -R., Ma, G. M., Maccolini, S., Machefert, F., Maciuc, F., Mack, B., Mackay, I., Mackey, L. M., Mohan, L. R. Madhan, Madurai, M. J., Maevskiy, A., Magdalinski, D., Maisuzenko, D., Malczewski, J. J., Malde, S., Malentacca, L., Malinin, A., Maltsev, T., Manca, G., Mancinelli, G., Mancuso, C., Escalero, R. Manera, Manganella, F. M., Manuzzi, D., Marangotto, D., Marchand, J. F., Marchevski, R., Marconi, U., Mariani, E., Mariani, S., Benito, C. Marin, Marks, J., Marshall, A. M., Martel, L., Martelli, G., Martellotti, G., Martinazzoli, L., Martinelli, M., Gomez, D. Martinez, Santos, D. Martinez, Vidal, F. Martinez, Granollers, A. Martorell i, Massafferri, A., Matev, R., Mathad, A., Matiunin, V., Matteuzzi, C., Mattioli, K. R., Mauri, A., Maurice, E., Mauricio, J., Mayencourt, P., de Cos, J. Mazorra, Mazurek, M., McCann, M., McGrath, T. H., McHugh, N. T., McNab, A., McNulty, R., Meadows, B., Meier, G., Melnychuk, D., Meng, F. M., Merk, M., Merli, A., Garcia, L. Meyer, Miao, D., Miao, H., Mikhasenko, M., Milanes, D. A., Minotti, A., Minucci, E., Miralles, T., Mitreska, B., Mitzel, D. S., Modak, A., Moeser, L., Mohammed, R. A., Moise, R. D., Mokhnenko, S., Cardenas, E. F. Molina, Mombächer, T., Monk, M., Monteil, S., Gomez, A. Morcillo, Morello, G., Morello, M. J., Morgenthaler, M. P., Moron, J., Morren, W., Morris, A. B., Morris, A. G., Mountain, R., Mu, H., Mu, Z. 
M., Muhammad, E., Muheim, F., Mulder, M., Müller, K., Muñoz-Rojas, F., Murta, R., Naik, P., Nakada, T., Nandakumar, R., Nanut, T., Nasteva, I., Needham, M., Neri, N., Neubert, S., Neufeld, N., Neustroev, P., Nicolini, J., Nicotra, D., Niel, E. M., Nikitin, N., Niu, Q., Nogarolli, P., Nogga, P., Normand, C., Fernandez, J. Novoa, Nowak, G., Nunez, C., Nur, H. N., Oblakowska-Mucha, A., Obraztsov, V., Oeser, T., Okamura, S., Okhotnikov, A., Okhrimenko, O., Oldeman, R., Oliva, F., Olocco, M., Onderwater, C. J. G., O'Neil, R. H., Osthues, D., Goicochea, J. M. Otalora, Owen, P., Oyanguren, A., Ozcelik, O., Paciolla, F., Padee, A., Padeken, K. O., Pagare, B., Pais, P. R., Pajero, T., Palano, A., Palutan, M., Pan, X., Panshin, G., Paolucci, L., Papanestis, A., Pappagallo, M., Pappalardo, L. L., Pappenheimer, C., Parkes, C., Parmar, D., Passalacqua, B., Passaleva, G., Passaro, D., Pastore, A., Patel, M., Patoc, J., Patrignani, C., Paul, A., Pawley, C. J., Pellegrino, A., Peng, J., Altarelli, M. Pepe, Perazzini, S., Pereima, D., Da Costa, H. Pereira, Castro, A. Pereiro, Perret, P., Perrevoort, A., Perro, A., Peters, M. J., Petridis, K., Petrolini, A., Pfaller, J. P., Pham, H., Pica, L., Piccini, M., Piccolo, L., Pietrzyk, B., Pietrzyk, G., Pilato, R. N., Pinci, D., Pisani, F., Pizzichemi, M., Placinta, V., Casasus, M. Plo, Poeschl, T., Polci, F., Lener, M. Poli, Poluektov, A., Polukhina, N., Polyakov, I., Polycarpo, E., Ponce, S., Popov, D., Poslavskii, S., Prasanth, K., Prouve, C., Provenzano, D., Pugatch, V., Punzi, G., Qasim, S., Qian, Q. Q., Qian, W., Qin, N., Qu, S., Quagliani, R., Trejo, R. I. Rabadan, Rademacker, J. H., Rama, M., García, M. Ramírez, De Oliveira, V. Ramos, Pernas, M. Ramos, Rangel, M. S., Ratnikov, F., Raven, G., De Miguel, M. Rebollo, Redi, F., Reich, J., Reiss, F., Ren, Z., Resmi, P. K., Ribatti, R., Ricart, G. R., Riccardi, D., Ricciardi, S., Richardson, K., Richardson-Slipper, M., Rinnert, K., Robbe, P., Robertson, G., Rodrigues, E., Alvarez, A. 
Rodriguez, Fernandez, E. Rodriguez, Lopez, J. A. Rodriguez, Rodriguez, E. Rodriguez, Roensch, J., Rogachev, A., Rogovskiy, A., Rolf, D. L., Roloff, P., Romanovskiy, V., Vidal, A. Romero, Romolini, G., Ronchetti, F., Rong, T., Rotondo, M., Roy, S. R., Rudolph, M. S., Diaz, M. Ruiz, Fernandez, R. A. Ruiz, Vidal, J. Ruiz, Ryzhikov, A., Ryzka, J., Saavedra-Arias, J. J., Silva, J. J. Saborido, Sadek, R., Sagidova, N., Sahoo, D., Sahoo, N., Saitta, B., Salomoni, M., Sanderswood, I., Santacesaria, R., Rios, C. Santamarina, Santimaria, M., Santoro, L., Santovetti, E., Saputi, A., Saranin, D., Sarnatskiy, A., Sarpis, G., Sarpis, M., Satriano, C., Satta, A., Saur, M., Savrina, D., Sazak, H., Sborzacchi, F., Smead, L. G. Scantlebury, Scarabotto, A., Schael, S., Scherl, S., Schiller, M., Schindler, H., Schmelling, M., Schmidt, B., Schmitt, S., Schmitz, H., Schneider, O., Schopper, A., Schulte, N., Schulte, S., Schune, M. H., Schwemmer, R., Schwering, G., Sciascia, B., Sciuccati, A., Segal, I., Sellam, S., Semennikov, A., Senger, T., Soares, M. Senghi, Sergi, A., Serra, N., Sestini, L., Seuthe, A., Shang, Y., Shangase, D. M., Shapkin, M., Sharma, R. S., Shchemerov, I., Shchutska, L., Shears, T., Shekhtman, L., Shen, Z., Sheng, S., Shevchenko, V., Shi, B., Shi, Q., Shimizu, Y., Shmanin, E., Shorkin, R., Shupperd, J. D., Coutinho, R. Silva, Simi, G., Simone, S., Skidmore, N., Skwarnicki, T., Slater, M. W., Smallwood, J. C., Smith, E., Smith, K., Smith, M., Snoch, A., Lavra, L. Soares, Sokoloff, M. D., Soler, F. J. P., Solomin, A., Solovev, A., Solovyev, I., Sommerfeld, N. S., Song, R., Song, Y., Song, Y. S., De Almeida, F. L. Souza, De Paula, B. Souza, Norella, E. Spadaro, Spedicato, E., Speer, J. G., Spiridenkov, E., Spradlin, P., Sriskaran, V., Stagni, F., Stahl, M., Stahl, S., Stanislaus, S., Stefaniak, M., Stein, E. N., Steinkamp, O., Stenyakin, O., Stevens, H., Strekalina, D., Su, Y., Suljik, F., Sun, J., Sun, L., Sundfeld, D., Sutcliffe, W., Swallow, P. 
N., Swientek, K., Swystun, F., Szabelski, A., Szumlak, T., Tan, Y., Tang, Y., Tat, M. D., Terentev, A., Terzuoli, F., Teubert, F., Thomas, E., Thompson, D. J. D., Tilquin, H., Tisserand, V., T'Jampens, S., Tobin, M., Tomassetti, L., Tonani, G., Tong, X., Tork, T., Machado, D. Torres, Toscano, L., Tou, D. Y., Trippl, C., Tuci, G., Tuning, N., Uecker, L. H., Ukleja, A., Unverzagt, D. J., Urbach, B., Usachov, A., Ustyuzhanin, A., Uwer, U., Vagnoni, V., Cadenas, V. Valcarce, Valenti, G., Canudas, N. Valls, van Eldik, J., Van Hecke, H., van Herwijnen, E., Van Hulse, C. B., Van Laak, R., van Veghel, M., Vasquez, G., Gomez, R. Vazquez, Regueiro, P. Vazquez, Sierra, C. Vázquez, Vecchi, S., Velthuis, J. J., Veltri, M., Venkateswaran, A., Verdoglia, M., Vesterinen, M., Benet, D. Vico, Villalba, P. Vidrier, Diaz, M. Vieites, Vilasis-Cardona, X., Figueras, E. Vilella, Villa, A., Vincent, P., Volle, F. C., Bruch, D. vom, Voropaev, N., Vos, K., Vrahas, C., Wagner, J., Walsh, J., Walton, E. J., Wan, G., Wang, C., Wang, G., Wang, H., Wang, J., Wang, M., Wang, N. W., Wang, R., Wang, X., Wang, X. W., Wang, Y., Wang, Y. W., Wang, Z., Ward, J. A., Waterlaat, M., Watson, N. K., Websdale, D., Wei, Y., Wendel, J., Westhenry, B. D. C., White, C., Whitehead, M., Whiter, E., Wiederhold, A. R., Wiedner, D., Wilkinson, G., Wilkinson, M. K., Williams, M., Williams, M. J., Williams, M. R. J., Williams, R., Williams, Z., Wilson, F. F., Winn, M., Wislicki, W., Witek, M., Witola, L., Wormser, G., Wotton, S. A., Wu, H., Wu, J., Wu, X., Wu, Y., Wu, Z., Wyllie, K., Xian, S., Xiang, Z., Xie, Y., Xing, T. X., Xu, A., Xu, L., Xu, M., Xu, Z., Yang, K., Yang, S., Yang, X., Yang, Y., Yang, Z., Yeroshenko, V., Yeung, H., Yin, H., Yin, X., Yu, C. Y., Yu, J., Yuan, X., Yuan, Y, Zaffaroni, E., Zavertyaev, M., Zdybal, M., Zenesini, F., Zeng, C., Zeng, M., Zhang, C., Zhang, D., Zhang, J., Zhang, L., Zhang, S., Zhang, Y., Zhang, Y. Z., Zhang, Z., Zhao, Y., Zhelezov, A., Zheng, S. Z., Zheng, X. 
Z., Zheng, Y., Zhou, T., Zhou, X., Zhou, Y., Zhovkovska, V., Zhu, L. Z., Zhu, X., Zhukov, V., Zhuo, J., Zou, Q., Zuliani, D., and Zunica, G.
- Subjects
High Energy Physics - Experiment - Abstract
A search for $D^0$ meson decays to the $\pi^+\pi^-e^+e^-$ and $K^+K^-e^+e^-$ final states is reported using a sample of proton-proton collisions collected by the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 6 fb$^{-1}$. The decay $D^0 \rightarrow \pi^+\pi^-e^+e^-$ is observed for the first time when requiring that the two electrons are consistent with coming from the decay of a $\phi$ or $\rho^0/\omega$ meson. The corresponding branching fractions are measured relative to the $D^0 \rightarrow K^-\pi^+[e^+e^-]_{\rho^0/\omega}$ decay, where the two electrons are consistent with coming from the decay of a $\rho^0$ or $\omega$ meson. No evidence is found for the $D^0 \rightarrow K^+K^-e^+e^-$ decay and world-best limits are set on its branching fraction. The results are compared to, and found to be consistent with, the branching fractions of the $D^0 \rightarrow \pi^+\pi^-\mu^+\mu^-$ and $D^0 \rightarrow K^+K^-\mu^+\mu^-$ decays recently measured by LHCb and confirm lepton universality at the current precision., Comment: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/1611/ (LHCb public pages)
- Published
- 2024
15. Beware of Metacognitive Laziness: Effects of Generative Artificial Intelligence on Learning Motivation, Processes, and Performance
- Author
-
Fan, Yizhou, Tang, Luzhen, Le, Huixiao, Shen, Kejie, Tan, Shufang, Zhao, Yueying, Shen, Yuan, Li, Xinyu, and Gašević, Dragan
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
With the continuous development of technological and educational innovation, learners nowadays can obtain a variety of support from agents such as teachers, peers, education technologies, and recently, generative artificial intelligence such as ChatGPT. The concept of hybrid intelligence is still at a nascent stage, and how learners can benefit from a symbiotic relationship with various agents such as AI, human experts and intelligent learning systems is still unknown. The emerging concept of hybrid intelligence also lacks deep insights and understanding of the mechanisms and consequences of hybrid human-AI learning based on strong empirical research. In order to address this gap, we conducted a randomised experimental study and compared learners' motivations, self-regulated learning processes and learning performances on a writing task among different groups who had support from different agents (ChatGPT, human expert, writing analytics tools, and no extra tool). A total of 117 university students were recruited, and their multi-channel learning, performance and motivation data were collected and analysed. The results revealed that: learners who received different learning support showed no difference in post-task intrinsic motivation; there were significant differences in the frequency and sequences of the self-regulated learning processes among groups; ChatGPT group outperformed in the essay score improvement but their knowledge gain and transfer were not significantly different. Our research found that in the absence of differences in motivation, learners with different supports still exhibited different self-regulated learning processes, ultimately leading to differentiated performance. What is particularly noteworthy is that AI technologies such as ChatGPT may promote learners' dependence on technology and potentially trigger metacognitive laziness.
- Published
- 2024
16. Crystal Symmetry Selected Pure Spin Photocurrent in Altermagnetic Insulators
- Author
-
Dong, Ruizhi, Cao, Ranquan, Tan, Dian, and Fei, Ruixiang
- Subjects
Condensed Matter - Materials Science ,Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
The generation of time-reversal-odd spin-current in metallic altermagnets has attracted considerable interest in spintronics. However, producing pure spin-current in insulating materials remains both challenging and desirable, as insulating states are frequently found in antiferromagnets. Nonlinear photogalvanic effects offer a promising method for generating spin-current in insulators. Here, we reveal that spin and charge photocurrents in altermagnets are protected by spin point group symmetry. Unlike the photocurrents in parity-time symmetric materials, where spin-orbit coupling (SOC) induces a significant charge current, the spin-current in altermagnets can exist as a pure spin current along specific crystal directions regardless of SOC. We applied our predictions using first-principles calculations to several distinct materials, including wurtzite MnTe and multiferroic BiFeO3. Additionally, we elucidated the previously overlooked linear-inject-current mechanism in BiFeO3 induced by SOC, which may account for the enhanced bulk photovoltaic effect in multiferroics.
- Published
- 2024
17. PhishIntel: Toward Practical Deployment of Reference-based Phishing Detection
- Author
-
Li, Yuexin, Tan, Hiok Kuek, Meng, Qiaoran, Lock, Mei Lin, Cao, Tri, Deng, Shumin, Oo, Nay, Lim, Hoon Wei, and Hooi, Bryan
- Subjects
Computer Science - Cryptography and Security - Abstract
Phishing is a critical cyber threat, exploiting deceptive tactics to compromise victims and cause significant financial losses. While reference-based phishing detectors (RBPDs) achieve high precision by analyzing brand-domain consistency, their real-world deployment is hindered by challenges such as high latency and inefficiency in URL analysis. To address these limitations, we present PhishIntel, an end-to-end phishing detection system for real-world deployment. PhishIntel intelligently determines whether a URL can be processed immediately or not, segmenting the detection process into two distinct tasks: a fast task that checks against local blacklists and result cache, and a slow task that conducts online blacklist verification, URL crawling, and webpage analysis using an RBPD. This fast-slow task system architecture ensures low response latency while retaining the robust detection capabilities of RBPDs for zero-day phishing threats. Furthermore, we develop two downstream applications based on PhishIntel: a phishing intelligence platform and a phishing email detection plugin for Microsoft Outlook, demonstrating its practical efficacy and utility.
- Published
- 2024
18. Improvement in Sign Language Translation Using Text CTC Alignment
- Author
-
Tan, Sihan, Miyazaki, Taro, Khan, Nabeela, and Nakadai, Kazuhiro
- Subjects
Computer Science - Computation and Language - Abstract
Current sign language translation (SLT) approaches often rely on gloss-based supervision with Connectionist Temporal Classification (CTC), limiting their ability to handle non-monotonic alignments between sign language video and spoken text. In this work, we propose a novel method combining joint CTC/Attention and transfer learning. The joint CTC/Attention introduces hierarchical encoding and integrates CTC with the attention mechanism during decoding, effectively managing both monotonic and non-monotonic alignments. Meanwhile, transfer learning helps bridge the modality gap between vision and language in SLT. Experimental results on two widely adopted benchmarks, RWTH-PHOENIX-Weather 2014 T and CSL-Daily, show that our method achieves results comparable to state-of-the-art and outperforms the pure-attention baseline. Additionally, this work opens a new door for future research into gloss-free SLT using text-based CTC alignment.
- Published
- 2024
19. Photo-Induced Quenching of the 229Th Isomer in a Solid-State Host
- Author
-
Terhune, J. E. S., Elwell, R., Tan, H. B. Tran, Perera, U. C., Morgan, H. W. T., Alexandrova, A. N., Derevianko, Andrei, and Hudson, Eric R.
- Subjects
Physics - Atomic Physics ,Nuclear Experiment - Abstract
The population dynamics of the 229Th isomeric state is studied in a solid-state host under laser illumination. A photoquenching process is observed, where off-resonant vacuum-ultraviolet (VUV) radiation leads to relaxation of the isomeric state. The cross-section for this photoquenching process is measured and a model for the decay process, where photoexcitation of electronic states within the material bandgap opens an internal conversion decay channel, is presented and appears to reproduce the measured cross-section., Comment: 7 pages, 6 figures
- Published
- 2024
20. Askey-Wilson version of Second Main Theorem for holomorphic curves in projective space
- Author
-
Tan, Chengliang and Korhonen, Risto
- Subjects
Mathematics - Complex Variables ,32H30, 30D35, 39A13 - Abstract
In this paper, an Askey-Wilson version of the Wronskian-Casorati determinant $\mathcal{W}(f_{0}, \dots, f_{n})(x)$ for meromorphic functions $f_{0}, \dots, f_{n}$ is introduced to establish an Askey-Wilson version of the general form of the Second Main Theorem in projective space. This improves upon the original Second Main Theorem for the Askey-Wilson operator due to Chiang and Feng. In addition, by taking into account the number of irreducible components of hypersurfaces, an Askey-Wilson version of the Truncated Second Main Theorem for holomorphic curves into projective space with hypersurfaces located in $l$-subgeneral position is obtained., Comment: 50 pages
- Published
- 2024
21. Multi-GraspLLM: A Multimodal LLM for Multi-Hand Semantic Guided Grasp Generation
- Author
-
Li, Haosheng, Mao, Weixin, Deng, Weipeng, Meng, Chenyu, Fan, Haoqiang, Wang, Tiancai, Tan, Ping, Wang, Hongan, and Deng, Xiaoming
- Subjects
Computer Science - Robotics ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Multi-hand semantic grasp generation aims to generate feasible and semantically appropriate grasp poses for different robotic hands based on natural language instructions. Although the task is highly valuable, it remains a long-standing difficulty because of the lack of multi-hand grasp datasets with fine-grained contact descriptions between robotic hands and objects. In this paper, we present Multi-GraspSet, the first large-scale multi-hand grasp dataset with automatic contact annotations. Based on Multi-GraspSet, we propose Multi-GraspLLM, a unified language-guided grasp generation framework. It leverages large language models (LLMs) to handle variable-length sequences, generating grasp poses for diverse robotic hands in a single unified architecture. Multi-GraspLLM first aligns the encoded point cloud features and text features into a unified semantic space. It then generates grasp bin tokens, which are subsequently converted into a grasp pose for each robotic hand via hand-aware linear mapping. The experimental results demonstrate that our approach significantly outperforms existing methods on Multi-GraspSet. More information can be found on our project page https://multi-graspllm.github.io., Comment: 11 pages, 6 figures
- Published
- 2024
22. Adversarial Purification by Consistency-aware Latent Space Optimization on Data Manifolds
- Author
-
Zhang, Shuhai, Yang, Jiahao, Luo, Hui, Chen, Jie, Wang, Li, Liu, Feng, Han, Bo, and Tan, Mingkui
- Subjects
Computer Science - Machine Learning - Abstract
Deep neural networks (DNNs) are vulnerable to adversarial samples crafted by adding imperceptible perturbations to clean data, potentially leading to incorrect and dangerous predictions. Adversarial purification has been an effective means to improve DNN robustness by removing these perturbations before feeding the data into the model. However, it faces significant challenges in preserving key structural and semantic information of data, as the imperceptible nature of adversarial perturbations makes it hard to avoid over-correcting, which can destroy important information and degrade model performance. In this paper, we break away from traditional adversarial purification methods by focusing on the clean data manifold. To this end, we reveal that samples generated by a well-trained generative model are close to clean ones but far from adversarial ones. Leveraging this insight, we propose Consistency Model-based Adversarial Purification (CMAP), which optimizes vectors within the latent space of a pre-trained consistency model to generate samples for restoring clean data. Specifically, 1) we propose a \textit{Perceptual consistency restoration} mechanism by minimizing the discrepancy between generated samples and input samples in both pixel and perceptual spaces. 2) To maintain the optimized latent vectors within the valid data manifold, we introduce a \textit{Latent distribution consistency constraint} strategy to align generated samples with the clean data distribution. 3) We also apply a \textit{Latent vector consistency prediction} scheme via an ensemble approach to enhance prediction reliability. CMAP fundamentally addresses adversarial perturbations at their source, providing robust purification. Extensive experiments on CIFAR-10 and ImageNet-100 show that our CMAP significantly enhances robustness against strong adversarial attacks while preserving high natural accuracy., Comment: 17 pages, 8 figures
- Published
- 2024
23. Fast Beam Placement for Ultra-Dense LEO Networks
- Author
-
Van Chien, Trinh, Quan, Nguyen Minh, Do, Tri Nhu, Le, Cuong, Nguyen, Tan N., and Chatzinotas, Symeon
- Subjects
Computer Science - Information Theory - Abstract
Low Earth orbit (LEO) satellites have brought about significant improvements in wireless communications, characterized by low latency and reduced transmission loss compared to geostationary orbit (GSO) satellites. Ultra-dense LEO satellites can serve many users by generating active beams effective to their locations. The beam placement problem is challenging but important for efficiently allocating resources with a large number of users. This paper formulates and solves a fast beam placement optimization problem for ultra-dense satellite systems to enhance the link budget with a minimum number of active beams (NABs). To achieve this goal and balance load among beams within polynomial time, we propose two algorithms for large user groups that exploit modified K-means clustering and graph theory. Numerical results illustrate the effectiveness of the proposals in terms of the statistical channel gain-to-noise ratio and computation time over state-of-the-art benchmarks., Comment: 5 pages, 3 figures. Accepted by IEEE WCL
- Published
- 2024
24. Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning
- Author
-
Li, Zhaoying, Dangi, Pranav, Yin, Chenyang, Bandara, Thilini Kaushalya, Juneja, Rohan, Tan, Cheng, Bai, Zhenyu, and Mitra, Tulika
- Subjects
Computer Science - Hardware Architecture - Abstract
Coarse-grained Reconfigurable Arrays (CGRAs) are domain-agnostic accelerators that enhance the energy efficiency of resource-constrained edge devices. The CGRA landscape is diverse, exhibiting trade-offs between performance, efficiency, and architectural specialization. However, CGRAs often overprovision communication resources relative to their modest computing capabilities. This occurs because the theoretically provisioned programmability for CGRAs often proves superfluous in practical implementations. In this paper, we propose Plaid, a novel CGRA architecture and compiler that aligns compute and communication capabilities, thereby significantly improving energy and area efficiency while preserving its generality and performance. We demonstrate that the dataflow graph, representing the target application, can be decomposed into smaller, recurring communication patterns called motifs. The primary contribution is the identification of these structural motifs within the dataflow graphs and the development of an efficient collective execution and routing strategy tailored to these motifs. The Plaid architecture employs a novel collective processing unit that can execute multiple operations of a motif and route related data dependencies together. The Plaid compiler can hierarchically map the dataflow graph and judiciously schedule the motifs. Our design achieves a 43% reduction in power consumption and 46% area savings compared to the baseline high-performance spatio-temporal CGRA, all while preserving its generality and performance levels. In comparison to the baseline energy-efficient spatial CGRA, Plaid offers a 1.4x performance improvement and a 48% area savings, with almost the same power., Comment: Accepted by ASPLOS '25
- Published
- 2024
25. Resolved mass assembly and star formation in Milky Way Progenitors since $z = 5$ from JWST/CANUCS: From clumps and mergers to well-ordered disks
- Author
-
Tan, Vivian Yun Yan, Muzzin, Adam, Sarrouh, Ghassan T. E., Antwi-Danso, Jacqueline, Sok, Visal, Jagga, Naadiyah, Abraham, Roberto, Asada, Yoshihisa, Desprez, Guillaume, Iyer, Kartheik, Martis, Nicholas S., Mérida, Rosa M., Mowla, Lamiya A., Noirot, Gaël, Omori, Kiyoaki Christopher, Sawicki, Marcin, Tripodi, Roberta, and Willott, Chris J.
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
We present a resolved study of $>900$ progenitors of Milky Way Analogs (MWAs) at $0.3
- Published
- 2024
26. SAT: Spatial Aptitude Training for Multimodal Language Models
- Author
-
Ray, Arijit, Duan, Jiafei, Tan, Reuben, Bashkirova, Dina, Hendrix, Rose, Ehsani, Kiana, Kembhavi, Aniruddha, Plummer, Bryan A., Krishna, Ranjay, Zeng, Kuo-Hao, and Saenko, Kate
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Graphics ,Computer Science - Robotics - Abstract
Spatial perception is a fundamental component of intelligence. While many studies highlight that large multimodal language models (MLMs) struggle to reason about space, they only test for static spatial reasoning, such as categorizing the relative positions of objects. Meanwhile, real-world deployment requires dynamic capabilities like perspective-taking and egocentric action recognition. As a roadmap to improving spatial intelligence, we introduce SAT, Spatial Aptitude Training, which goes beyond static relative object position questions to the more dynamic tasks. SAT contains 218K question-answer pairs for 22K synthetic scenes across a training and testing set. Generated using a photo-realistic physics engine, our dataset can be arbitrarily scaled and easily extended to new actions, scenes, and 3D assets. We find that even MLMs that perform relatively well on static questions struggle to accurately answer dynamic spatial questions. Further, we show that SAT instruction-tuning data improves not only dynamic spatial reasoning on SAT, but also zero-shot performance on existing real-image spatial benchmarks: $23\%$ on CVBench, $8\%$ on the harder BLINK benchmark, and $18\%$ on VSR. When instruction-tuned on SAT, our 13B model matches larger proprietary MLMs like GPT4-V and Gemini-3-1.0 in spatial reasoning. Our data/code is available at http://arijitray1993.github.io/SAT/ ., Comment: Project webpage: http://arijitray1993.github.io/SAT/
- Published
- 2024
27. On Motion Blur and Deblurring in Visual Place Recognition
- Author
-
Ismagilov, Timur, Ferrarini, Bruno, Milford, Michael, Nguyen, Tan Viet Tuyen, Ramchurn, SD, and Ehsan, Shoaib
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Visual Place Recognition (VPR) in mobile robotics enables robots to localize themselves by recognizing previously visited locations using visual data. While the reliability of VPR methods has been extensively studied under conditions such as changes in illumination, season, weather and viewpoint, the impact of motion blur is relatively unexplored despite its relevance not only in rapid motion scenarios but also in low-light conditions where longer exposure times are necessary. Similarly, the role of image deblurring in enhancing VPR performance under motion blur has received limited attention so far. This paper bridges these gaps by introducing a new benchmark designed to evaluate VPR performance under the influence of motion blur and image deblurring. The benchmark includes three datasets that encompass a wide range of motion blur intensities, providing a comprehensive platform for analysis. Experimental results with several well-established VPR and image deblurring methods provide new insights into the effects of motion blur and the potential improvements achieved through deblurring. Building on these findings, the paper proposes adaptive deblurring strategies for VPR, designed to effectively manage motion blur in dynamic, real-world scenarios.
- Published
- 2024
28. Automatic Database Configuration Debugging using Retrieval-Augmented Language Models
- Author
-
Chen, Sibei, Fan, Ju, Wu, Bin, Tang, Nan, Deng, Chao, Wang, Pengyi, Li, Ye, Tan, Jian, Li, Feifei, Zhou, Jingren, and Du, Xiaoyong
- Subjects
Computer Science - Databases - Abstract
Database management system (DBMS) configuration debugging, e.g., diagnosing poorly configured DBMS knobs and generating troubleshooting recommendations, is crucial in optimizing DBMS performance. However, the configuration debugging process is tedious and sometimes challenging, even for seasoned database administrators (DBAs) with sufficient experience in DBMS configurations and a good understanding of the DBMS internals (e.g., MySQL or Oracle). To address this difficulty, we propose Andromeda, a framework that utilizes large language models (LLMs) to enable automatic DBMS configuration debugging. Andromeda serves as a natural surrogate of DBAs to answer a wide range of natural language (NL) questions on DBMS configuration issues, and to generate diagnostic suggestions to fix these issues. Nevertheless, directly prompting LLMs with these professional questions may result in overly generic and often unsatisfying answers. To this end, we propose a retrieval-augmented generation (RAG) strategy that effectively provides matched domain-specific contexts for the question from multiple sources. These contexts come from related historical questions, troubleshooting manuals and DBMS telemetries, and significantly improve the performance of configuration debugging. To support the RAG strategy, we develop a document retrieval mechanism addressing heterogeneous documents and design an effective method for telemetry analysis. Extensive experiments on real-world DBMS configuration debugging datasets show that Andromeda significantly outperforms existing solutions.
- Published
- 2024
29. Dynamic Ensemble Reasoning for LLM Experts
- Author
-
Hu, Jinwu, Wang, Yufeng, Zhang, Shuhai, Zhou, Kai, Chen, Guohao, Hu, Yu, Xiao, Bin, and Tan, Mingkui
- Subjects
Computer Science - Artificial Intelligence - Abstract
Ensemble reasoning for the strengths of different LLM experts is critical to achieving consistent and satisfactory performance on diverse inputs across a wide range of tasks. However, existing LLM ensemble methods are either computationally intensive or incapable of leveraging complementary knowledge among LLM experts for various inputs. In this paper, we propose a Dynamic Ensemble Reasoning paradigm, called DER, to integrate the strengths of multiple LLM experts conditioned on dynamic inputs. Specifically, we model the LLM ensemble reasoning problem as a Markov Decision Process (MDP), wherein an agent sequentially takes inputs to request knowledge from an LLM candidate and passes the output to a subsequent LLM candidate. Moreover, we devise a reward function to train a DER-Agent to dynamically select an optimal answering route given the input questions, aiming to achieve the highest performance with as few computational resources as possible. Last, to fully transfer the expert knowledge from the prior LLMs, we develop a Knowledge Transfer Prompt (KTP) that enables the subsequent LLM candidates to transfer complementary knowledge effectively. Experiments demonstrate that our method uses fewer computational resources to achieve better performance compared to state-of-the-art baselines., Comment: 18 pages
- Published
- 2024
30. Backdoor Attacks against No-Reference Image Quality Assessment Models via A Scalable Trigger
- Author
-
Yu, Yi, Xia, Song, Lin, Xun, Yang, Wenhan, Lu, Shijian, Tan, Yap-peng, and Kot, Alex
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Cryptography and Security - Abstract
No-Reference Image Quality Assessment (NR-IQA), responsible for assessing the quality of a single input image without using any reference, plays a critical role in evaluating and optimizing computer vision systems, e.g., low-light enhancement. Recent research indicates that NR-IQA models are susceptible to adversarial attacks, which can significantly alter predicted scores with visually imperceptible perturbations. Despite revealing vulnerabilities, these attack methods have limitations, including high computational demands, untargeted manipulation, limited practical utility in white-box scenarios, and reduced effectiveness in black-box scenarios. To address these challenges, we shift our focus to another significant threat and present a novel poisoning-based backdoor attack against NR-IQA (BAIQA), allowing the attacker to manipulate the IQA model's output to any desired target value by simply adjusting a scaling coefficient $\alpha$ for the trigger. We propose to inject the trigger in the discrete cosine transform (DCT) domain to improve the local invariance of the trigger for countering trigger diminishment in NR-IQA models due to widely adopted data augmentations. Furthermore, the universal adversarial perturbations (UAP) in the DCT space are designed as the trigger, to increase IQA model susceptibility to manipulation and improve attack effectiveness. In addition to the heuristic method for poison-label BAIQA (P-BAIQA), we explore the design of clean-label BAIQA (C-BAIQA), focusing on $\alpha$ sampling and image data refinement, driven by theoretical insights we reveal. Extensive experiments on diverse datasets and various NR-IQA models demonstrate the effectiveness of our attacks. Code will be released at https://github.com/yuyi-sd/BAIQA., Comment: Accept by AAAI 2025
- Published
- 2024
31. Detection with Uncertainty in Target Direction for Dual Functional Radar and Communication Systems
- Author
-
Ashraf, Mateen, Gaydamaka, Anna, Moltchanov, Dmitri, Thompson, John, Valkama, Mikko, and Tan, Bo
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
Dual functional radar and communication (DFRC) systems are a viable approach to extend the services of future communication systems. Most studies designing DFRC systems assume that the target direction is known. In our paper, we address a critical scenario where this information is not exactly known. For such a system, a signal-to-clutter-plus-noise ratio (SCNR) maximization problem is formulated. Quality-of-service constraints for communication users (CUs) are also incorporated as constraints on their received signal-to-interference-plus-noise ratios (SINRs). To tackle the nonconvexity, an iterative alternating optimization approach is developed where, at each iteration, the optimization is alternately performed with respect to transmit and receive beamformers. Specifically, a penalty-based approach is used to obtain an efficient sub-optimal solution for the resulting subproblem with regard to transmit beamformers. Next, a globally optimal solution is obtained for receive beamformers with the help of the Dinkelbach approach. The convergence of the proposed algorithm is also proved by showing that the objective function is nondecreasing over iterations. The numerical results illustrate the effectiveness of the proposed approach. Specifically, it is observed that the proposed algorithm converges within almost 3 iterations, and the SCNR performance is almost unchanged with the number of possible target directions.
- Published
- 2024
32. A Survey of Open-Source Power System Dynamic Simulators with Grid-Forming Inverter for Machine Learning Applications
- Author
-
Su, Tong, Peng, Jiangkai, Selim, Alaa, Zhao, Junbo, and Tan, Jin
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
The emergence of grid-forming (GFM) inverter technology and the increasing role of machine learning in power systems highlight the need for evaluating the latest dynamic simulators. Open-source simulators offer distinct advantages in this field, being both free and highly customizable, which makes them well-suited for scientific research and validation of the latest models and methods. This paper provides a comprehensive survey and comparison of the latest open-source simulators that support GFM, with a focus on their capabilities and performance in machine-learning applications.
- Published
- 2024
33. On the Bargmann invariants for quantum imaginarity
- Author
-
Li, Mao-Sheng and Tan, Yi-Xi
- Subjects
Quantum Physics - Abstract
The imaginary in quantum theory plays a crucial role in describing quantum coherence and is widely applied in quantum information tasks such as state discrimination, pseudorandomness generation, and quantum metrology. A recent paper by Fernandes et al. [C. Fernandes, R. Wagner, L. Novo, and E. F. Galvão, Phys. Rev. Lett. 133, 190201 (2024)] showed how to use the Bargmann invariant to witness the imaginarity of a set of quantum states. In this work, we delve into the structure of Bargmann invariants and their quantum realization in qubit systems. First, we present a characterization of special sets of Bargmann invariants (also studied by Fernandes et al. for a set of four states) for a general set of $n$ quantum states. Then, we study the properties of the relevant Bargmann invariant set $\mathcal{B}_n$ and its quantum realization in qubit systems. Our results provide new insights into the structure of Bargmann invariants, contributing to the advancement of quantum information techniques, particularly within qubit systems., Comment: 11 pages, 4 figures
- Published
- 2024
34. The first exploration of the correlations between \textit{WISE} 12 \micron\ and CO emission in early-type galaxies
- Author
-
Gao, Yang, Wang, Enci, Tan, Qing-Hua, Davis, Timothy A., Liang, Fu-Heng, Jiang, Xue-Jian, Gai, Ning, Jiao, Qian, Shi, DongDong, Feng, Shuai, Tang, Yanke, Li, Shijie, and Wang, Yi-Fan
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
We present the analysis of a comprehensive sample of 352 early-type galaxies using public data, to investigate the correlations between CO luminosities and mid-infrared luminosities observed by \textit{Wide-field Infrared Survey Explorer} (\textit{WISE}). We find strong correlations between both CO (1-0) and CO (2-1) luminosities and 12 \micron\ luminosity, boasting a correlation coefficient greater than 0.9 and an intrinsic scatter smaller than 0.1 dex. The consistent slopes observed for the relationships of CO (1-0) and CO (2-1) suggest that the line ratio R21 lacks correlation with mid-infrared emission in early-type galaxies, which is significantly different from star-forming galaxies. Moreover, the slopes of $L_{\rm CO (1-0)}$--$L_{\mbox{12\micron}}$ and $L_{\rm CO (2-1)}$--$L_{\mbox{12\micron}}$ relations in early-type galaxies are steeper than those observed in star-forming galaxies. Given the absence of correlation with color, morphology or sSFR, the correlation between deviations and the molecular gas mass surface density could be eliminated by correcting the possible 12 \micron\ emission from old stars or adopting a systematically different $\alpha_{\rm CO}$. The latter, on average, is equivalent to adding a constant CO brightness density, specifically $2.8_{-0.6}^{+0.8}~[\mathrm{K~km~s^{-1}}]$ and $4.4_{-1.4}^{+2.2}~[\mathrm{K~km~s^{-1}}]$ for CO (1-0) and (2-1) respectively. These explorations will serve as useful tools for estimating the molecular gas content in gas-poor galaxies and understanding associated quenching processes., Comment: 20 pages, 6 figures, accepted for publication in ApJ
- Published
- 2024
35. World-Consistent Data Generation for Vision-and-Language Navigation
- Author
-
Zhong, Yu, Zhang, Rui, Zhang, Zihao, Wang, Shuo, Fang, Chuan, Zhang, Xishan, Guo, Jiaming, Peng, Shaohui, Huang, Di, Yan, Yanyang, Hu, Xing, Tan, Ping, and Guo, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent to navigate through photorealistic environments following natural-language instructions. One main obstacle in VLN is data scarcity, leading to poor generalization performance over unseen environments. Though data augmentation is a promising way of scaling up the dataset, how to generate VLN data that is both diverse and world-consistent remains problematic. To cope with this issue, we propose world-consistent data generation (WCGEN), an efficacious data-augmentation framework satisfying both diversity and world-consistency, aimed at enhancing the generalization of agents to novel environments. Roughly, our framework consists of two stages: the trajectory stage, which leverages a point-cloud based technique to ensure spatial coherency among viewpoints, and the viewpoint stage, which adopts a novel angle synthesis method to guarantee spatial and wraparound consistency within the entire observation. By accurately predicting viewpoint changes with 3D knowledge, our approach maintains world-consistency during the generation procedure. Experiments on a wide range of datasets verify the effectiveness of our method, demonstrating that our data augmentation strategy enables agents to achieve new state-of-the-art results on all navigation tasks, and is capable of enhancing the VLN agents' generalization ability to unseen environments.
- Published
- 2024
36. Assessing the Impact of Conspiracy Theories Using Large Language Models
- Author
-
Jiang, Bohan, Li, Dawei, Tan, Zhen, Zhou, Xinyi, Rao, Ashwin, Lerman, Kristina, Bernard, H. Russell, and Liu, Huan
- Subjects
Computer Science - Computation and Language ,Computer Science - Computers and Society - Abstract
Measuring the relative impact of conspiracy theories (CTs) is important for prioritizing responses and allocating resources effectively, especially during crises. However, assessing the actual impact of CTs on the public poses unique challenges. It requires not only the collection of CT-specific knowledge but also diverse information from social, psychological, and cultural dimensions. Recent advancements in large language models (LLMs) suggest their potential utility in this context, not only due to their extensive knowledge from large training corpora but also because they can be harnessed for complex reasoning. In this work, we develop datasets of popular CTs with human-annotated impacts. Borrowing insights from human impact assessment processes, we then design tailored strategies to leverage LLMs for performing human-like CT impact assessments. Through rigorous experiments, we discover that an impact assessment mode using multi-step reasoning to analyze more CT-related evidence critically produces accurate results; and most LLMs demonstrate strong bias, such as assigning higher impacts to CTs presented earlier in the prompt, while generating less accurate impact assessments for emotionally charged and verbose CTs.
- Published
- 2024
37. TAE: A Model-Constrained Tikhonov Autoencoder Approach for Forward and Inverse Problems
- Author
-
Nguyen, Hai V. and Bui-Thanh, Tan
- Subjects
Computer Science - Machine Learning ,Physics - Computational Physics - Abstract
Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time. Nevertheless, these models typically demand extensive training datasets to achieve robust generalization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting issues when trained with insufficient data. We propose a novel Tikhonov autoencoder model-constrained framework, called TAE, capable of learning both forward and inverse surrogate models using a single arbitrary observation sample. We develop comprehensive theoretical foundations including forward and inverse inference error bounds for the proposed approach for linear cases. For comparative analysis, we derive equivalent formulations for pure data-driven and model-constrained approach counterparts. At the heart of our approach is a data randomization strategy, which functions as a generative mechanism for exploring the training data space, enabling effective training of both forward and inverse surrogate models from a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for time-dependent 2D Navier-Stokes equations. Results demonstrate that TAE achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for both inverse and forward problems, respectively, while delivering orders of magnitude computational speedups.
- Published
- 2024
38. Photonic real-time signal processing
- Author
-
Ai, Qihang, Feng, Hanxiao, Yang, Xinyu, Tan, Mengxi, Xu, Xingyuan, Morandotti, Roberto, Su, Donglin, and Moss, David J.
- Subjects
Physics - Optics ,Physics - Applied Physics - Abstract
The simultaneous progress of integrated optical frequency comb (OFC) and radio frequency (RF) photonic signal processing techniques has promoted the rapid development of real-time signal processing. Integrated optical frequency combs offer multiple wavelengths as a powerful source for RF photonic signal transversal filters. Here, we review the development of real-time signal processing systems consisting of an integrated OFC and an RF photonic signal transversal filter in chronological order, and focus on applications of such systems, including differentiators, integrators, Hilbert transformers, and image processors. We also discuss and present our outlook on more parallel functions and further integration of real-time signal processing systems., Comment: 15 pages, 7 figures
- Published
- 2024
39. A Scalable Decentralized Reinforcement Learning Framework for UAV Target Localization Using Recurrent PPO
- Author
-
Fernando, Leon, Lau, Billy Pik Lik, Yuen, Chau, and Tan, U-Xuan
- Subjects
Computer Science - Robotics ,Computer Science - Machine Learning ,I.2.9 - Abstract
The rapid advancements in unmanned aerial vehicles (UAVs) have unlocked numerous applications, including environmental monitoring, disaster response, and agricultural surveying. Enhancing the collective behavior of multiple decentralized UAVs can significantly improve these applications through more efficient and coordinated operations. In this study, we explore a Recurrent PPO model for target localization in perceptually degraded environments like places without GNSS/GPS signals. We first developed a single-drone approach for target identification, followed by a decentralized two-drone model. Our approach can utilize two types of sensors on the UAVs, a detection sensor and a target signal sensor. The single-drone model achieved an accuracy of 93%, while the two-drone model achieved an accuracy of 86%, with the latter requiring fewer average steps to locate the target. This demonstrates the potential of our method in UAV swarms, offering efficient and effective localization of radiant targets in complex environmental conditions., Comment: Submitted to TENCON 2024
- Published
- 2024
40. MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
- Author
-
Shi, Shuwei, Gong, Biao, Chen, Xi, Zheng, Dandan, Tan, Shuai, Yang, Zizheng, Li, Yuyuan, He, Jingwen, Zheng, Kecheng, Chen, Jingdong, Yang, Ming, and Zheng, Yinqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Image-to-video (I2V) generation is conditioned on a static image, and has recently been enhanced by using motion intensity as an additional control signal. These motion-aware models are appealing for generating diverse motion patterns, yet a reliable motion estimator for training such models on large-scale in-the-wild video sets is lacking. Traditional metrics, e.g., SSIM or optical flow, are hard to generalize to arbitrary videos, and it is also very difficult for human annotators to label abstract motion intensity. Furthermore, motion intensity should reveal both local object motion and global camera movement, which has not been studied before. This paper addresses the challenge with a new motion estimator, capable of measuring the decoupled motion intensities of objects and cameras in video. We leverage contrastive learning on randomly paired videos and distinguish the video with greater motion intensity. Such a paradigm is friendly for annotation and easy to scale up to achieve stable performance on motion estimation. We then present a new I2V model, named MotionStone, developed with the decoupled motion estimator. Experimental results demonstrate the stability of the proposed motion estimator and the state-of-the-art performance of MotionStone on I2V generation. These advantages allow the decoupled motion estimator to serve as a general plug-in enhancer for both data processing and video generation training.
- Published
- 2024
41. Towards Long Video Understanding via Fine-detailed Video Story Generation
- Author
-
You, Zeng, Wen, Zhiquan, Chen, Yaofo, Li, Xin, Zeng, Runhao, Wang, Yaowei, and Tan, Mingkui
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Long video understanding has become a critical task in computer vision, driving advancements across numerous applications from surveillance to content retrieval. Existing video understanding methods suffer from two challenges when dealing with long video understanding: intricate long-context relationship modeling and interference from redundancy. To tackle these challenges, we introduce Fine-Detailed Video Story generation (FDVS), which interprets long videos into detailed textual representations. Specifically, to achieve fine-grained modeling of long-temporal content, we propose a Bottom-up Video Interpretation Mechanism that progressively interprets video content from clips to video. To avoid interference from redundant information in videos, we introduce a Semantic Redundancy Reduction mechanism that removes redundancy at both the visual and textual levels. Our method transforms long videos into hierarchical textual representations that contain multi-granularity information of the video. With these representations, FDVS is applicable to various tasks without any fine-tuning. We evaluate the proposed method across eight datasets spanning three tasks. The performance demonstrates the effectiveness and versatility of our method.
- Published
- 2024
42. Exploring lattice thermal conductivity models via interpretable deep learning to accelerate the discovery of novel materials
- Author
-
Zeng, Yuxuan, Cao, Wei, Zuo, Yijing, Peng, Tan, Hou, Yue, Miao, Ling, Wang, Ziyu, and Shi, Jing
- Subjects
Condensed Matter - Materials Science ,Physics - Applied Physics - Abstract
Lattice thermal conductivity, being integral to thermal transport properties, is indispensable to advancements in areas such as thermoelectric materials and thermal management. Traditional methods, such as Density Functional Theory and Molecular Dynamics, require significant computational resources, posing challenges to the high-throughput prediction of lattice thermal conductivity. Although AI-driven material science has achieved fruitful progress, the trade-off between accuracy and interpretability in machine learning continues to hinder further advancements. This study utilizes interpretable deep learning techniques to construct a rapid prediction framework that enables both qualitative assessments and quantitative predictions, accurately forecasting the thermal transport properties of three novel materials. Furthermore, interpretable deep learning offers analytically grounded physical models while integrating with sensitivity analysis to uncover deeper theoretical insights.
- Published
- 2024
43. WATER-GS: Toward Copyright Protection for 3D Gaussian Splatting via Universal Watermarking
- Author
-
Tan, Yuqi, Liu, Xiang, Xie, Shuzhao, Chen, Bin, Xia, Shu-Tao, and Wang, Zhi
- Subjects
Computer Science - Cryptography and Security - Abstract
3D Gaussian Splatting (3DGS) has emerged as a pivotal technique for 3D scene representation, providing rapid rendering speeds and high fidelity. As 3DGS gains prominence, safeguarding its intellectual property becomes increasingly crucial since 3DGS could be used to imitate unauthorized scene creations and raise copyright issues. Existing watermarking methods for implicit NeRFs cannot be directly applied to 3DGS due to its explicit representation and real-time rendering process, leaving watermarking for 3DGS largely unexplored. In response, we propose WATER-GS, a novel method designed to protect 3DGS copyrights through a universal watermarking strategy. First, we introduce a pre-trained watermark decoder, treating raw 3DGS generative modules as potential watermark encoders to ensure imperceptibility. Additionally, we implement novel 3D distortion layers to enhance the robustness of the embedded watermark against common real-world distortions of point cloud data. Comprehensive experiments and ablation studies demonstrate that WATER-GS effectively embeds imperceptible and robust watermarks into 3DGS without compromising rendering efficiency and quality. Our experiments indicate that the 3D distortion layers can yield up to a 20% improvement in accuracy rate. Notably, our method is adaptable to different 3DGS variants, including 3DGS compression frameworks and 2D Gaussian splatting.
- Published
- 2024
44. Predicting Organic-Inorganic Halide Perovskite Photovoltaic Performance from Optical Properties of Constituent Films through Machine Learning
- Author
-
Zhang, Ruiqi, Motes, Brandon, Tan, Shaun, Lu, Yongli, Shih, Meng-Chen, Hao, Yilun, Yang, Karen, Srinivasan, Shreyas, Bawendi, Moungi G., and Bulovic, Vladimir
- Subjects
Condensed Matter - Materials Science ,Computer Science - Machine Learning ,Physics - Computational Physics - Abstract
We demonstrate a machine learning (ML) approach that accurately predicts the current-voltage behavior of 3D/2D-structured (FAMA)Pb(IBr)3/OABr hybrid organic-inorganic halide perovskite (HOIP) solar cells under AM1.5 illumination. Our neural network algorithm is trained on measured responses from several hundred HOIP solar cells, using three simple optical measurements of constituent HOIP films as input: optical transmission spectrum, spectrally-resolved photoluminescence, and time-resolved photoluminescence, from which we predict the open-circuit voltage (Voc), short-circuit current (Jsc), and fill factor (FF) values of solar cells that contain the HOIP active layers. Determined average prediction accuracies for 95% of the predicted Voc, Jsc, and FF values are 91%, 94% and 89%, respectively, with R2 coefficients of determination of 0.47, 0.77, and 0.58, respectively. Quantifying the connection between ML predictions and physical parameters extracted from the measured optical properties of the HOIP films allows us to identify the most significant parameters influencing the prediction results. With separate ML-classifying algorithms, we identify degraded solar cells using the same optical input data, achieving over 90% classification accuracy through support vector machine, cross entropy loss, and artificial neural network algorithms. To our knowledge, the demonstrated regression and classification work is the first to use ML to predict device photovoltaic properties solely from the optical properties of constituent materials., Comment: 36 pages, 6 figures
- Published
- 2024
45. TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
- Author
-
Ma, Zixian, Zhang, Jianguo, Liu, Zhiwei, Zhang, Jieyu, Tan, Juntao, Shu, Manli, Niebles, Juan Carlos, Heinecke, Shelby, Wang, Huan, Xiong, Caiming, Krishna, Ranjay, and Savarese, Silvio
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
While open-source multi-modal language models perform well on simple question answering tasks, they often fail on complex questions that require multiple capabilities, such as fine-grained recognition, visual grounding, and reasoning, and that demand multi-step solutions. We present TACO, a family of multi-modal large action models designed to improve performance on such complex, multi-step, and multi-modal tasks. During inference, TACO produces chains-of-thought-and-action (CoTA), executes intermediate steps by invoking external tools such as OCR, depth estimation and calculator, then integrates both the thoughts and action outputs to produce coherent responses. To train TACO, we create a large dataset of over 1M synthetic CoTA traces generated with GPT-4o and Python programs. We then experiment with various data filtering and mixing techniques and obtain a final subset of 293K high-quality CoTA examples. This dataset enables TACO to learn complex reasoning and action paths, surpassing existing models trained on instruction tuning data with only direct answers. Our model TACO outperforms the instruction-tuned baseline across 8 benchmarks, achieving a 3.6% improvement on average, with gains of up to 15% in MMVet tasks involving OCR, mathematical reasoning, and spatial reasoning. Training on high-quality CoTA traces sets a new standard for complex multi-modal reasoning, highlighting the need for structured, multi-step instruction tuning in advancing open-source multi-modal models' capabilities.
- Published
- 2024
46. UniScene: Unified Occupancy-centric Driving Scene Generation
- Author
-
Li, Bohan, Guo, Jiazhe, Liu, Hongsi, Zou, Yingshuang, Ding, Yikang, Chen, Xiwu, Zhu, Hu, Tan, Feiyang, Zhang, Chi, Wang, Tiancai, Zhou, Shuchang, Zhang, Li, Qi, Xiaojuan, Zhao, Hao, Yang, Mu, Zeng, Wenjun, and Jin, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Generating high-fidelity, controllable, and annotated training data is critical for autonomous driving. Existing methods typically generate a single data form directly from a coarse scene layout, which not only fails to output rich data forms required for diverse downstream tasks but also struggles to model the direct layout-to-data distribution. In this paper, we introduce UniScene, the first unified framework for generating three key data forms - semantic occupancy, video, and LiDAR - in driving scenes. UniScene employs a progressive generation process that decomposes the complex task of scene generation into two hierarchical steps: (a) first generating semantic occupancy from a customized scene layout as a meta scene representation rich in both semantic and geometric information, and then (b) conditioned on occupancy, generating video and LiDAR data, respectively, with two novel transfer strategies of Gaussian-based Joint Rendering and Prior-guided Sparse Modeling. This occupancy-centric approach reduces the generation burden, especially for intricate scenes, while providing detailed intermediate representations for the subsequent generation stages. Extensive experiments demonstrate that UniScene outperforms previous SOTAs in the occupancy, video, and LiDAR generation, which also indeed benefits downstream driving tasks.
- Published
- 2024
47. Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
- Author
-
Chen, Zhe, Wang, Weiyun, Cao, Yue, Liu, Yangzhou, Gao, Zhangwei, Cui, Erfei, Zhu, Jinguo, Ye, Shenglong, Tian, Hao, Liu, Zhaoyang, Gu, Lixin, Wang, Xuehui, Li, Qingyun, Ren, Yimin, Chen, Zixuan, Luo, Jiapeng, Wang, Jiahao, Jiang, Tan, Wang, Bo, He, Conghui, Shi, Botian, Zhang, Xingcheng, Lv, Han, Wang, Yi, Shao, Wenqi, Chu, Pei, Tu, Zhongying, He, Tong, Wu, Zhiyong, Deng, Huipeng, Ge, Jiaye, Chen, Kai, Dou, Min, Lu, Lewei, Zhu, Xizhou, Lu, Tong, Lin, Dahua, Qiao, Yu, Dai, Jifeng, and Wang, Wenhai
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
We introduce InternVL 2.5, an advanced multimodal large language model (MLLM) series that builds upon InternVL 2.0, maintaining its core model architecture while introducing significant enhancements in training and testing strategies as well as data quality. In this work, we delve into the relationship between model scaling and performance, systematically exploring the performance trends in vision encoders, language models, dataset sizes, and test-time configurations. Through extensive evaluations on a wide range of benchmarks, including multi-discipline reasoning, document understanding, multi-image / video understanding, real-world comprehension, multimodal hallucination detection, visual grounding, multilingual capabilities, and pure language processing, InternVL 2.5 exhibits competitive performance, rivaling leading commercial models such as GPT-4o and Claude-3.5-Sonnet. Notably, our model is the first open-source MLLM to surpass 70% on the MMMU benchmark, achieving a 3.7-point improvement through Chain-of-Thought (CoT) reasoning and showcasing strong potential for test-time scaling. We hope this model contributes to the open-source community by setting new standards for developing and applying multimodal AI systems. For a HuggingFace demo, see https://huggingface.co/spaces/OpenGVLab/InternVL, Comment: Technical Report
- Published
- 2024
48. The neutron veto of the XENONnT experiment: Results with demineralized water
- Author
-
XENON Collaboration, Aprile, E., Aalbers, J., Abe, K., Maouloud, S. Ahmed, Althueser, L., Andrieu, B., Angelino, E., Martin, D. Antón, Arneodo, F., Baudis, L., Bazyk, M., Bellagamba, L., Biondi, R., Bismark, A., Boese, K., Brown, A., Bruno, G., Budnik, R., Cai, C., Capelli, C., Cardoso, J. M. R., Chávez, A. P. Cimental, Colijn, A. P., Conrad, J., Cuenca-García, J. J., D'Andrea, V., Garcia, L. C. Daniel, Decowski, M. P., Deisting, A., Di Donato, C., Di Gangi, P., Diglio, S., Eitel, K., Morabit, S. el, Elykov, A., Ferella, A. D., Ferrari, C., Fischer, H., Flehmke, T., Flierman, M., Fulgione, W., Fuselli, C., Gaemers, P., Gaior, R., Galloway, M., Gao, F., Ghosh, S., Giacomobono, R., Glade-Beucke, R., Grandi, L., Grigat, J., Guan, H., Guida, M., Gyorgy, P., Hammann, R., Higuera, A., Hils, C., Hoetzsch, L., Hood, N. F., Iacovacci, M., Itow, Y., Jakob, J., Joerg, F., Kaminaga, Y., Kara, M., Kavrigin, P., Kazama, S., Kobayashi, M., Koke, D., Kopec, A., Landsman, H., Lang, R. F., Levinson, L., Li, I., Li, S., Liang, S., Lin, Y. -T., Lindemann, S., Lindner, M., Liu, K., Liu, M., Loizeau, J., Lombardi, F., Long, J., Lopes, J. A. M., Luce, T., Ma, Y., Macolino, C., Mahlstedt, J., Mancuso, A., Manenti, L., Marignetti, F., Undagoitia, T. Marrodán, Martens, K., Masbou, J., Masson, E., Mastroianni, S., Melchiorre, A., Merz, J., Messina, M., Michael, A., Miuchi, K., Molinario, A., Moriyama, S., Morá, K., Mosbacher, Y., Murra, M., Müller, J., Ni, K., Oberlack, U., Paetsch, B., Pan, Y., Pellegrini, Q., Peres, R., Peters, C., Pienaar, J., Pierre, M., Plante, G., Pollmann, T. R., Principe, L., Qi, J., Qin, J., García, D. Ramírez, Rajado, M., Singh, R., Sanchez, L., Santos, J. M. F. dos, Sarnoff, I., Sartorelli, G., Schreiner, J., Schulte, P., Eißing, H. Schulze, Schumann, M., Lavina, L. Scotto, Selvi, M., Semeria, F., Shagin, P., Shi, S., Shi, J., Silva, M., Simgen, H., Szyszka, C., Takeda, A., Takeuchi, Y., Tan, P. -L., Thers, D., Toschi, F., Trinchero, G., Tunnell, C. D., Tönnies, F., Valerius, K., Vecchi, S., Vetter, S., Solar, F. I. Villazon, Volta, G., Weinheimer, C., Weiss, M., Wenz, D., Wittweg, C., Wu, V. H. S., Xing, Y., Xu, D., Xu, Z., Yamashita, M., Yang, L., Ye, J., Yuan, L., Zavattini, G., and Zhong, M.
- Subjects
Physics - Instrumentation and Detectors ,Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Instrumentation and Methods for Astrophysics ,High Energy Physics - Experiment - Abstract
Radiogenic neutrons emitted by detector materials are one of the most challenging backgrounds for the direct search of dark matter in the form of weakly interacting massive particles (WIMPs). To mitigate this background, the XENONnT experiment is equipped with a novel gadolinium-doped water Cherenkov detector, which encloses the xenon dual-phase time projection chamber (TPC). The neutron veto (NV) tags neutrons via their capture on gadolinium or hydrogen, which release $\gamma$-rays that are subsequently detected as Cherenkov light. In this work, we present the key features and the first results of the XENONnT NV when operated with demineralized water in the initial phase of the experiment. Its efficiency for detecting neutrons is $(82\pm 1)\,\%$, the highest neutron detection efficiency achieved in a water Cherenkov detector. This enables a high efficiency of $(53\pm 3)\,\%$ for the tagging of WIMP-like neutron signals, inside a tagging time window of $250\,\mathrm{\mu s}$ between TPC and NV, leading to a livetime loss of $1.6\,\%$ during the first science run of XENONnT.
- Published
- 2024
49. LinVT: Empower Your Image-level Large Language Model to Understand Videos
- Author
-
Gao, Lishuai, Zhong, Yujie, Zeng, Yingsen, Tan, Haoxian, Li, Dengjie, and Zhao, Zheng
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning ,Computer Science - Multimedia - Abstract
Large Language Models (LLMs) have been widely used in various tasks, motivating us to develop an LLM-based assistant for videos. Instead of training from scratch, we propose a module to transform arbitrary well-trained image-based LLMs into video-LLMs (after being trained on video data). To better adapt image-LLMs for processing videos, we introduce two design principles: linear transformation to preserve the original visual-language alignment and representative information condensation from redundant video content. Guided by these principles, we propose a plug-and-play Linear Video Tokenizer (LinVT), which enables existing image-LLMs to understand videos. We benchmark LinVT with six recent visual LLMs: Aquila, Blip-3, InternVL2, Mipha, Molmo and Qwen2-VL, showcasing the high compatibility of LinVT. LinVT-based LLMs achieve state-of-the-art performance across various video benchmarks, illustrating the effectiveness of LinVT in multi-modal video understanding.
- Published
- 2024
50. SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
- Author
-
Tan, Xiaofeng, Wang, Hongsong, Geng, Xin, and Zhou, Pan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Text-to-motion generation is essential for advancing the creative industry but often presents challenges in producing consistent, realistic motions. To address this, we focus on fine-tuning text-to-motion models to consistently favor high-quality, human-preferred motions, a critical yet largely unexplored problem. In this work, we theoretically investigate DPO under both online and offline settings, and reveal their respective limitations: overfitting in offline DPO, and biased sampling in online DPO. Building on our theoretical insights, we introduce Semi-online Preference Optimization (SoPo), a DPO-based method for training text-to-motion models using "semi-online" data pairs, each consisting of an unpreferred motion from the online distribution and a preferred motion from offline datasets. This method leverages both online and offline DPO, allowing each to compensate for the other's limitations. Extensive experiments demonstrate that SoPo outperforms other preference alignment methods, with an MM-Dist of 3.25% (vs e.g. 0.76% of MoDiPO) on the MLD model and 2.91% (vs e.g. 0.66% of MoDiPO) on the MDM model. Additionally, the MLD model fine-tuned by our SoPo surpasses the SoTA model in terms of R-precision and MM Dist. Visualization results also show the efficacy of our SoPo in preference alignment. Our project page is https://sopo-motion.github.io.
- Published
- 2024