7,097,628 results on '"Lee, On On"'
Search Results
152. Giant coercivity and enhanced intrinsic anomalous Hall effect at vanishing magnetization in a compensated kagome ferrimagnet
- Author
DeStefano, Jonathan M., Rosenberg, Elliott, Ren, Guodong, Lee, Yongbin, Ning, Zhenhua, Peek, Olivia, Harrison, Kamal, Khondaker, Saiful I., Ke, Liqin, Mazin, Igor I., Idrobo, Juan Carlos, and Chu, Jiun-Haw
- Subjects
Condensed Matter - Strongly Correlated Electrons
- Abstract
Ferrimagnets that can be driven to magnetic compensation show promise for use in spintronics as they exhibit a finite anomalous Hall effect at zero magnetic field without having a significant magnetic moment. Compensated ferrimagnet spintronic devices with both a large anomalous Hall effect and a high coercivity would be simultaneously easy to read and difficult to erase. The kagome ferrimagnet TbMn$_6$Sn$_6$ has been reported to host a large intrinsic anomalous Hall effect. Here, we demonstrate that doping the Mn sites with Cr drives the system towards magnetic compensation. For nearly compensated compositions at low temperatures, giant coercive fields exceeding 14 T are observed. Additionally, Cr doping significantly enhances the intrinsic anomalous Hall effect, which can be attributed to a shift in the Fermi level. Our results extend the range of unique magnetic states observed in kagome materials, demonstrating that chemical doping is an effective strategy to tune and realize these states.
- Published
- 2025
153. Enhanced chemical vapour deposition of monolayer MoS2 films via a clean promoter
- Author
Wang, Lulin, Sun, Yue, Kannan, Kaushik, Gannon, Lee, Guo, Xuyun, Rafferty, Aran, Gaff, Karl, Mullani, Navaj B., Weng, Haizhong, Zhou, Yangbo, Nicolosi, Valeria, Guinness, Cormac Mc, and Zhang, Hongzhou
- Subjects
Condensed Matter - Materials Science, Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Two-dimensional (2D) transition metal dichalcogenides (TMDCs), exemplified by molybdenum disulfide (MoS2), have shown exceptional potential for data-centred, energy-efficient electronic applications due to their unique electrical, optoelectronic, and mechanical properties. However, challenges such as the controllable synthesis of high-quality, large-area 2D MoS2 films and the mitigation of contamination during growth remain significant barriers to their integration into advanced technologies. Here, we developed a novel contamination-free growth promoter, enabling the clean and scalable synthesis of high quality 2D MoS2 with desirable grain structures via chemical vapour deposition (CVD). By optimising the reactant concentration and S/Mo ratio, we achieved promoter-dominated enhanced growth with enhanced quality, as evidenced by the increased MoS2 flake size and coverage, alongside a strong PL A exciton peak at 1.84 eV, matching that of the mechanically exfoliated sample. This approach facilitates the clean and site-specific growth of high-quality 2D MoS2, establishing a robust pathway for the practical implementation of 2D MoS2 in next-generation electronic devices.
- Comment
27 pages, 4 figures
- Published
- 2025
154. Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
- Author
Feng, Shangbin, Wang, Zifeng, Goyal, Palash, Wang, Yike, Shi, Weijia, Xia, Huang, Palangi, Hamid, Zettlemoyer, Luke, Tsvetkov, Yulia, Lee, Chen-Yu, and Pfister, Tomas
- Subjects
Computer Science - Computation and Language
- Abstract
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. We represent multi-LLM systems as directed acyclic graphs (DAGs) of LLMs with topological message passing for collaborative generation. Given a pool of LLM experts and a utility function, Heterogeneous Swarms employs two iterative steps: role-step and weight-step. For role-step, we interpret model roles as learning a DAG that specifies the flow of inputs and outputs between LLMs. Starting from a swarm of random continuous adjacency matrices, we decode them into discrete DAGs, call the LLMs in topological order, evaluate on the utility function (e.g. accuracy on a task), and optimize the adjacency matrices with particle swarm optimization based on the utility score. For weight-step, we assess the contribution of individual LLMs in the multi-LLM systems and optimize model weights with swarm intelligence. We propose JFK-score to quantify the individual contribution of each LLM in the best-found DAG of the role-step, then optimize model weights with particle swarm optimization based on the JFK-score. Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks. Further analysis reveals that Heterogeneous Swarms discovers multi-LLM systems with heterogeneous model roles and substantial collaborative gains, and benefits from the diversity of language models.
- Published
- 2025
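The role-step in the abstract of record 154 decodes continuous adjacency matrices into discrete DAGs and then calls the models in topological order. The following is a minimal sketch of that idea, not the authors' procedure: the decoding rule (keep edge i -> j only when i < j and the score exceeds a threshold) is an assumption chosen because it guarantees acyclicity, and `decode_dag`, `run_in_topological_order`, and the `agents` callables standing in for LLM calls are all hypothetical names.

```python
def decode_dag(scores, threshold=0.5):
    """Turn a continuous score matrix into a DAG (assumed decoding rule).

    Only edges i -> j with i < j are considered, so the result cannot
    contain a cycle. Returns a dict mapping each node to its parents.
    """
    n = len(scores)
    return {j: [i for i in range(j) if scores[i][j] > threshold]
            for j in range(n)}


def run_in_topological_order(parents, agents, task_input):
    """Call each agent after all of its parents have produced output.

    `agents[k]` is a hypothetical callable (task_input, upstream_outputs)
    standing in for an LLM call. Because every edge goes from a lower to
    a higher index, ascending index order is a valid topological order.
    """
    outputs = {}
    for node in sorted(parents):
        upstream = [outputs[p] for p in parents[node]]
        outputs[node] = agents[node](task_input, upstream)
    return outputs
```

In the abstract, the utility score of the resulting outputs is what drives the particle swarm update of the matrices; that optimization loop is left out of this sketch.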
155. When One LLM Drools, Multi-LLM Collaboration Rules
- Author
Feng, Shangbin, Ding, Wenxuan, Liu, Alisa, Wang, Zifeng, Shi, Weijia, Wang, Yike, Shen, Zejiang, Han, Xiaochuang, Lang, Hunter, Lee, Chen-Yu, Pfister, Tomas, Choi, Yejin, and Tsvetkov, Yulia
- Subjects
Computer Science - Computation and Language
- Abstract
This position paper argues that in many realistic (i.e., complex, contextualized, subjective) scenarios, one LLM is not enough to produce a reliable output. We challenge the status quo of relying solely on a single general-purpose LLM and argue for multi-LLM collaboration to better represent the extensive diversity of data, skills, and people. We first posit that a single LLM underrepresents real-world data distributions, heterogeneous skills, and pluralistic populations, and that such representation gaps cannot be trivially patched by further training a single LLM. We then organize existing multi-LLM collaboration methods into a hierarchy, based on the level of access and information exchange, ranging from API-level, text-level, logit-level, to weight-level collaboration. Based on these methods, we highlight how multi-LLM collaboration addresses challenges that a single LLM struggles with, such as reliability, democratization, and pluralism. Finally, we identify the limitations of existing multi-LLM methods and motivate future work. We envision multi-LLM collaboration as an essential path toward compositional intelligence and collaborative AI development.
- Published
- 2025
156. Gig2Gether: Data-sharing to Empower, Unify and Demystify Gig Work
- Author
Hsieh, Jane, Zhang, Angie, Surati, Sajel, Xie, Sijia, Ayala, Yeshua, Sathiya, Nithila, Kuo, Tzu-Sheng, Lee, Min Kyung, and Zhu, Haiyi
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
The wide adoption of platformized work has generated remarkable advancements in the labor patterns and mobility of modern society. Underpinning such progress, gig workers are exposed to unprecedented challenges and accountabilities: lack of data transparency, social and physical isolation, as well as insufficient infrastructural safeguards. Gig2Gether presents a space designed for workers to engage in an initial experience of voluntarily contributing anecdotal and statistical data to affect policy and build solidarity across platforms by exchanging unifying and diverse experiences. Our 7-day field study with 16 active workers from three distinct platforms and work domains showed existing affordances of data-sharing: facilitating mutual support across platforms, as well as enabling financial reflection and planning. Additionally, workers envisioned future use cases of data-sharing for collectivism (e.g., collaborative examinations of algorithmic speculations) and informing policy (e.g., around safety and pay), which motivated (latent) worker desiderata of additional capabilities and data metrics. Based on these findings, we discuss remaining challenges to address and how data-sharing tools can complement existing structures to maximize worker empowerment and policy impact.
- Published
- 2025
157. Near-Optimal Sample Complexity for MDPs via Anchoring
- Author
Lee, Jongmin, Bravo, Mario, and Cominetti, Roberto
- Subjects
Mathematics - Optimization and Control, Computer Science - Data Structures and Algorithms
- Abstract
We study a new model-free algorithm to compute $\varepsilon$-optimal policies for average reward Markov decision processes, in the weakly communicating case. Given a generative model, our procedure combines a recursive sampling technique with Halpern's anchored iteration, and computes an $\varepsilon$-optimal policy with sample and time complexity $\widetilde{O}(|\mathcal{S}||\mathcal{A}|\|h^*\|_{\text{sp}}^{2}/\varepsilon^{2})$ both in high probability and in expectation. To our knowledge, this is the best complexity among model-free algorithms, matching the known lower bound up to a factor $\|h^*\|_{\text{sp}}$. Although the complexity bound involves the span seminorm $\|h^*\|_{\text{sp}}$ of the unknown bias vector, the algorithm requires no prior knowledge and implements a stopping rule which guarantees with probability 1 that the procedure terminates in finite time. We also analyze how these techniques can be adapted for discounted MDPs.
- Published
- 2025
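The abstract of record 157 builds on Halpern's anchored iteration, which pulls every iterate back toward the starting point: $x_{k+1} = \beta_k x_0 + (1-\beta_k)\,T(x_k)$. The sketch below illustrates only that anchoring step, under assumptions that are not from the paper: the operator is a 90-degree rotation in the plane (nonexpansive, fixed point at the origin), the weights are $\beta_k = 1/(k+2)$, and no sampling is involved. The function names are hypothetical.

```python
import math


def rotate90(p):
    """A nonexpansive map with a single fixed point at the origin."""
    x, y = p
    return (-y, x)


def residual(op, p):
    """Fixed-point residual ||p - op(p)||."""
    q = op(p)
    return math.hypot(p[0] - q[0], p[1] - q[1])


def plain_iteration(op, x0, steps):
    """x_{k+1} = T(x_k); for a rotation this only cycles."""
    x = x0
    for _ in range(steps):
        x = op(x)
    return x


def halpern_iteration(op, x0, steps):
    """x_{k+1} = beta_k * x0 + (1 - beta_k) * T(x_k), beta_k = 1/(k+2)."""
    x = x0
    for k in range(steps):
        beta = 1.0 / (k + 2)
        tx = op(x)
        x = (beta * x0[0] + (1.0 - beta) * tx[0],
             beta * x0[1] + (1.0 - beta) * tx[1])
    return x
```

Starting from `(1.0, 0.0)`, the plain iteration keeps a constant residual while the anchored one drives it toward zero, which is the property the abstract relies on for operators that are nonexpansive rather than contractive.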
158. Planet Masses, Radii, and Orbits from NASA's K2 Mission
- Author
Howard, Andrew W., Sinukoff, Evan, Blunt, Sarah, Petigura, Erik A., Crossfield, Ian J. M., Isaacson, Howard, Kosiarek, Molly, Rubenzahl, Ryan A., Brewer, John M., Fulton, Benjamin J., Dressing, Courtney D., Hirsch, Lea A., Knutson, Heather, Livingston, John H., Mills, Sean M., Roy, Arpita, Weiss, Lauren M., Benneke, Bjorn, Ciardi, David R., Christiansen, Jessie L., Cochran, William D., Crepp, Justin R., Gonzales, Erica, Hansen, Brad M. S., Hardegree-Ullman, Kevin, Howell, Steve B., Lépine, Sébastien, Martinez, Arturo O., Rogers, Leslie A., Schlieder, Joshua E., Werner, Michael, Polanski, Alex S., Angelo, Isabel, Beard, Corey, Behmard, Aida, Bouma, Luke G., Brinkman, Casey L., Chontos, Ashley, Dai, Fei, Dalba, Paul A., Giacalone, Steven, Grunblatt, Samuel K., Hill, Michelle L., Kane, Stephen R., Lubin, Jack, Mayo, Andrew W., Mocnik, Teo, Murphy, Joseph M. Akana, Rice, Malena, Rosenthal, Lee J., Tyler, Dakotah, Van Zandt, Judah, and Yee, Samuel W.
- Subjects
Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Solar and Stellar Astrophysics
- Abstract
We report the masses, sizes, and orbital properties of 86 planets orbiting 55 stars observed by NASA's K2 Mission with follow-up Doppler measurements by the HIRES spectrometer at the W. M. Keck Observatory and the Automated Planet Finder at Lick Observatory. Eighty-one of the planets were discovered from their transits in the K2 photometry, while five were found based on subsequent Doppler measurements of transiting planet host stars. The sizes of the transiting planets range from Earth-size to larger than Jupiter (1-3 REarth is typical), while the orbital periods range from less than a day to a few months. For 32 of the planets, the Doppler signal was detected with significance greater than 5-sigma (51 were detected with >3-sigma significance). An important characteristic of this catalog is the use of uniform analysis procedures to determine stellar and planetary properties. This includes the transit search and fitting procedures applied to the K2 photometry, the Doppler fitting techniques applied to the radial velocities, and the spectral modeling to determine bulk stellar parameters. Such a uniform treatment will make the catalog useful for statistical studies of the masses, densities, and system architectures of exoplanetary systems. This work also serves as a data release for all previously unpublished RVs and associated stellar activity indicators obtained by our team for these systems, along with derived stellar and planet parameters.
- Comment
156 pages, 86 planets, 55 stars, 104 figures, 48 tables. Accepted to ApJS
- Published
- 2025
159. Saflo: eBPF-Based MPTCP Scheduler for Mitigating Traffic Analysis Attacks in Cellular Networks
- Author
Lee, Sangwoo, Jin, Liuyi, and Stoleru, Radu
- Subjects
Computer Science - Networking and Internet Architecture, Computer Science - Cryptography and Security
- Abstract
This paper presents the $\underline{\textbf{saf}}$e sub$\underline{\textbf{flo}}$w (Saflo) eBPF-based multipath TCP (MPTCP) scheduler, designed to mitigate traffic analysis attacks in cellular networks. Traffic analysis attacks, which exploit vulnerabilities in Downlink Control Information (DCI) messages, remain a significant security threat in LTE/5G networks. To counter such threats, the Saflo scheduler employs multipath communication combined with additional security-related tasks. Specifically, it utilizes eBPF tools to operate in both kernel and user spaces. In the kernel space, the eBPF scheduler performs multipath scheduling while excluding paths disabled by the user-space programs. The user-space programs conduct security-related computations and machine learning-based attack detection, determining whether each path should be enabled or disabled. This approach offloads computationally intensive tasks to user-space programs, enabling timely multipath scheduling in kernel space. The Saflo scheduler was evaluated in a private LTE/5G testbed. The results demonstrated that it significantly reduces the accuracy of video identification and user identification attacks in cellular networks while maintaining reasonable network performance for users.
- Published
- 2025
160. Efficient Distributed Optimization under Heavy-Tailed Noise
- Author
Lee, Su Hyeong, Zaheer, Manzil, and Li, Tian
- Subjects
Computer Science - Machine Learning
- Abstract
Distributed optimization has become the default training paradigm in modern machine learning due to the growing scale of models and datasets. To mitigate communication overhead, local updates are often applied before global aggregation, resulting in a nested optimization approach with inner and outer steps. However, heavy-tailed stochastic gradient noise remains a significant challenge, particularly in attention-based models, hindering effective training. In this work, we propose TailOPT, an efficient framework designed to address heavy-tailed noise by leveraging adaptive optimization or clipping techniques. We establish convergence guarantees for the TailOPT framework under heavy-tailed noise with potentially unbounded gradient variance and local updates. Among its variants, we highlight a memory and communication efficient instantiation which we call $Bi^2Clip$, which performs coordinate-wise clipping at both the inner and outer optimizers, achieving adaptive-like performance (e.g., Adam) without the cost of maintaining or transmitting additional gradient statistics. Empirically, TailOPT, including $Bi^2Clip$, demonstrates superior performance on several language tasks and models, outperforming state-of-the-art methods.
- Published
- 2025
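The abstract of record 160 describes coordinate-wise clipping applied at both the inner (local) and outer (global) optimizers. The sketch below shows that nested structure in its simplest form and should not be read as the authors' $Bi^2Clip$: plain SGD steps, a single shared threshold, simple averaging of client deltas, and all function names are assumptions made for illustration.

```python
def clip_coords(vec, threshold):
    """Clip every coordinate independently to [-threshold, threshold]."""
    return [max(-threshold, min(threshold, v)) for v in vec]


def local_update(weights, grads_per_step, lr, threshold):
    """Inner loop: SGD steps on one client using clipped gradients."""
    w = list(weights)
    for grad in grads_per_step:
        clipped = clip_coords(grad, threshold)
        w = [wi - lr * gi for wi, gi in zip(w, clipped)]
    return w


def outer_update(weights, client_weights, outer_lr, threshold):
    """Outer step: average the client deltas, clip them, then apply."""
    n = len(client_weights)
    avg_delta = [sum(cw[i] - weights[i] for cw in client_weights) / n
                 for i in range(len(weights))]
    clipped = clip_coords(avg_delta, threshold)
    return [wi + outer_lr * di for wi, di in zip(weights, clipped)]
```

Clipping needs no running gradient statistics, so nothing beyond the model delta has to be stored or transmitted, which matches the memory and communication argument made in the abstract.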
161. TQ-DiT: Efficient Time-Aware Quantization for Diffusion Transformers
- Author
Hwang, Younghye, Lee, Hyojin, and Kang, Joonhyuk
- Subjects
Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
- Abstract
Diffusion transformers (DiTs) combine transformer architectures with diffusion models. However, their computational complexity imposes significant limitations on real-time applications and sustainability of AI systems. In this study, we aim to enhance the computational efficiency through model quantization, which represents the weights and activation values with lower precision. Multi-region quantization (MRQ) is introduced to address the asymmetric distribution of network values in DiT blocks by allocating two scaling parameters to sub-regions. Additionally, time-grouping quantization (TGQ) is proposed to reduce quantization error caused by temporal variation in activations. The experimental results show that the proposed algorithm achieves performance comparable to the original full-precision model with only a 0.29 increase in FID at W8A8. Furthermore, it outperforms other baselines at W6A6, thereby confirming its suitability for low-bit quantization. These results highlight the potential of our method to enable efficient real-time generative models.
- Comment
8 pages
- Published
- 2025
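The abstract of record 161 motivates using two scaling parameters when values are distributed asymmetrically. The sketch below compares a single-scale uniform quantizer with a two-scale variant. It is an illustration under stated assumptions, not the paper's MRQ: splitting the regions at zero, the symmetric integer grid, and every function name are hypothetical choices.

```python
def fake_quantize(value, scale, levels):
    """Round to the integer grid, clip to [-levels, levels], rescale."""
    if scale == 0.0:
        return 0.0
    q = round(value / scale)
    q = max(-levels, min(levels, q))
    return q * scale


def single_scale(values, bits):
    """One scale for the whole tensor, set by the largest magnitude."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels
    return [fake_quantize(v, scale, levels) for v in values]


def two_region(values, bits):
    """Separate scales for negative and non-negative values (assumed split)."""
    levels = 2 ** (bits - 1) - 1
    neg_scale = max((-v for v in values if v < 0), default=0.0) / levels
    pos_scale = max((v for v in values if v >= 0), default=0.0) / levels
    return [fake_quantize(v, neg_scale if v < 0 else pos_scale, levels)
            for v in values]


def total_abs_error(values, approx):
    """Sum of absolute reconstruction errors."""
    return sum(abs(a - b) for a, b in zip(values, approx))
```

When the negative side has a much smaller range than the positive side, a single scale rounds the small negative values to zero, whereas the per-region scale keeps them distinguishable at the same bit width.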
162. Search for resonance-enhanced $CP$ and angular asymmetries in the $\Lambda^+_{c}\to p\mu^+\mu^-$ decay at LHCb
- Author
LHCb collaboration, Aaij, R., Abdelmotteleb, A. S. W., Beteta, C. Abellan, Abudinén, F., Ackernley, T., Adefisoye, A. A., Adeva, B., Adinolfi, M., Adlarson, P., Agapopoulou, C., Aidala, C. A., Ajaltouni, Z., Akar, S., Akiba, K., Albicocco, P., Albrecht, J., Alessio, F., Alexander, M., Aliouche, Z., Cartelle, P. Alvarez, Amalric, R., Amato, S., Amey, J. L., Amhis, Y., An, L., Anderlini, L., Andersson, M., Andreianov, A., Andreola, P., Andreotti, M., Andreou, D., Anelli, A., Ao, D., Archilli, F., Argenton, M., Cuendis, S. Arguedas, Artamonov, A., Artuso, M., Aslanides, E., Da Silva, R. Ataíde, Atzeni, M., Audurier, B., Bacher, D., Perea, I. Bachiller, Bachmann, S., Bachmayer, M., Back, J. J., Rodriguez, P. Baladron, Balagura, V., Balboni, A., Baldini, W., Balzani, L., Bao, H., Leite, J. Baptista de Souza, Pretel, C. Barbero, Barbetti, M., Barbosa, I. R., Barlow, R. J., Barnyakov, M., Barsuk, S., Barter, W., Bartz, J., Basels, J. M., Bashir, S., Bassi, G., Batsukh, B., Battista, P. B., Bay, A., Beck, A., Becker, M., Bedeschi, F., Bediaga, I. B., Behling, N. A., Belin, S., Belous, K., Belov, I., Belyaev, I., Benane, G., Bencivenni, G., Ben-Haim, E., Berezhnoy, A., Bernet, R., Andres, S. Bernet, Bertolin, A., Betancourt, C., Betti, F., Bex, J., Bezshyiko, Ia., Bhom, J., Bieker, M. S., Biesuz, N. V., Billoir, P., Biolchini, A., Birch, M., Bishop, F. C. R., Bitadze, A., Bizzeti, A., Blake, T., Blanc, F., Blank, J. E., Blusk, S., Bocharnikov, V., Boelhauve, J. A., Garcia, O. Boente, Boettcher, T., Bohare, A., Boldyrev, A., Bolognani, C. S., Bolzonella, R., Bonacci, R. B., Bondar, N., Bordelius, A., Borgato, F., Borghi, S., Borsato, M., Borsuk, J. T., Bottalico, E., Bouchiba, S. A., Bovill, M., Bowcock, T. J. V., Boyer, A., Bozzi, C., Brandenburg, J. D., Rodriguez, A. Brea, Breer, N., Brodzicka, J., Gonzalo, A. Brossa, Brown, J., Brundu, D., Buchanan, E., Buonincontri, L., Marcos, M. Burgos, Burke, A. T., Burr, C., Butter, J. 
S., Buytaert, J., Byczynski, W., Cadeddu, S., Cai, H., Caillet, A., Calabrese, R., Ramirez, S. Calderon, Calefice, L., Cali, S., Calvi, M., Gomez, M. Calvo, Magalhaes, P. Camargo, Bouzas, J. I. Cambon, Campana, P., Perez, D. H. Campora, Quezada, A. F. Campoverde, Capelli, S., Capriotti, L., Caravaca-Mora, R., Carbone, A., Salgado, L. Carcedo, Cardinale, R., Cardini, A., Carniti, P., Carus, L., Vidal, A. Casais, Caspary, R., Casse, G., Cattaneo, M., Cavallero, G., Cavallini, V., Celani, S., Cesare, S., Chadwick, A. J., Chahrour, I., Chang, H., Charles, M., Charpentier, Ph., Chatzianagnostou, E., Chefdeville, M., Chen, C., Chen, S., Chen, Z., Chernov, A., Chernyshenko, S., Chiotopoulos, X., Chobanova, V., Chrzaszcz, M., Chubykin, A., Chulikov, V., Ciambrone, P., Vidal, X. Cid, Ciezarek, G., Cifra, P., Clarke, P. E. L., Clemencic, M., Cliff, H. V., Closier, J., Toapaxi, C. Cocha, Coco, V., Cogan, J., Cogneras, E., Cojocariu, L., Collaviti, S., Collins, P., Colombo, T., Colonna, M., Comerma-Montells, A., Congedo, L., Contu, A., Cooke, N., Corredoira, I., Correia, A., Corti, G., Meldrum, J. Cottee, Couturier, B., Craik, D. C., Torres, M. Cruz, Rivera, E. Curras, Currie, R., Da Silva, C. L., Dadabaev, S., Dai, L., Dai, X., Dall'Occo, E., Dalseno, J., D'Ambrosio, C., Daniel, J., Danilina, A., d'Argent, P., Darze, G., Davidson, A., Davies, J. E., Francisco, O. De Aguiar, De Angelis, C., De Benedetti, F., de Boer, J., De Bruyn, K., De Capua, S., De Cian, M., Da Graca, U. De Freitas Carneiro, De Lucia, E., De Miranda, J. M., De Paula, L., De Serio, M., De Simone, P., De Vellis, F., de Vries, J. A., Debernardis, F., Decamp, D., Dedu, V., Dekkers, S., Del Buono, L., Delaney, B., Dembinski, H. -P., Deng, J., Denysenko, V., Deschamps, O., Dettori, F., Dey, B., Di Nezza, P., Diachkov, I., Didenko, S., Ding, S., Dittmann, L., Dobishuk, V., Docheva, A. D., Dong, C., Donohoe, A. M., Dordei, F., Reis, A. C. dos, Dowling, A. D., Duan, W., Duda, P., Dudek, M. 
W., Dufour, L., Duk, V., Durante, P., Duras, M. M., Durham, J. M., Durmus, O. D., Dziurda, A., Dzyuba, A., Easo, S., Eckstein, E., Egede, U., Egorychev, A., Egorychev, V., Eisenhardt, S., Ejopu, E., Eklund, L., Elashri, M., Ellbracht, J., Ely, S., Ene, A., Eschle, J., Esen, S., Evans, T., Fabiano, F., Falcao, L. N., Fan, Y., Fang, B., Fantini, L., Faria, M., Farmer, K., Fazzini, D., Felkowski, L., Feng, M., Feo, M., Casani, A. Fernandez, Gomez, M. Fernandez, Fernez, A. D., Ferrari, F., Rodrigues, F. Ferreira, Ferrillo, M., Ferro-Luzzi, M., Filippov, S., Fini, R. A., Fiorini, M., Firlej, M., Fischer, K. L., Fitzgerald, D. S., Fitzpatrick, C., Fiutowski, T., Fleuret, F., Fontana, M., Foreman, L. F., Forty, R., Foulds-Holt, D., Lima, V. Franco, Sevilla, M. Franco, Frank, M., Franzoso, E., Frau, G., Frei, C., Friday, D. A., Fu, J., Führing, Q., Fujii, Y., Fulghesu, T., Gabriel, E., Galati, G., Galati, M. D., Torreira, A. Gallas, Galli, D., Gambetta, S., Gandelman, M., Gandini, P., Ganie, B., Gao, H., Gao, R., Gao, T. Q., Gao, Y., Martin, L. M. Garcia, Moreno, P. Garcia, Pardiñas, J. García, Gardner, P., Garg, K. G., Garrido, L., Gaspar, C., Gavrikov, A., Gerken, L. L., Gersabeck, E., Gersabeck, M., Gershon, T., Ghizzo, S., Ghorbanimoghaddam, Z., Giambastiani, L., Giasemis, F. I., Gibson, V., Giemza, H. K., Gilman, A. L., Giovannetti, M., Gioventù, A., Girardey, L., Giugliano, C., Giza, M. A., Glaser, F. C., Gligorov, V. V., Göbel, C., Golinka-Bezshyyko, L., Golobardes, E., Golubkov, D., Golutvin, A., Fernandez, S. Gomez, Gomulka, W., Abrantes, F. Goncalves, Goncerz, M., Gong, G., Gooding, J. A., Gorelov, I. V., Gotti, C., Govorkova, E., Grabowski, J. P., Cardoso, L. A. Granado, Graugés, E., Graverini, E., Grazette, L., Graziani, G., Grecu, A. T., Greeven, L. M., Grieser, N. A., Grillo, L., Gromov, S., Gu, C., Guarise, M., Guerry, L., Guliaeva, V., Günther, P. A., Guseinov, A. 
-K., Gushchin, E., Guz, Y., Gys, T., Habermann, K., Hadavizadeh, T., Hadjivasiliou, C., Haefeli, G., Haen, C., Hallett, G., Halvorsen, M. M., Hamilton, P. M., Hammerich, J., Han, Q., Han, X., Hansmann-Menzemer, S., Hao, L., Harnew, N., Harris, T. H., Hartmann, M., Hashmi, S., He, J., Hemmer, F., Henderson, C., Henderson, R. D. L., Hennequin, A. M., Hennessy, K., Henry, L., Herd, J., Gascon, P. Herrero, Heuel, J., Hicheur, A., Mendizabal, G. Hijano, Horswill, J., Hou, R., Hou, Y., Howarth, N., Hu, J., Hu, W., Hu, X., Huang, W., Hulsbergen, W., Hunter, R. J., Hushchyn, M., Hutchcroft, D., Idzik, M., Ilin, D., Ilten, P., Inglessi, A., Iniukhin, A., Ishteev, A., Ivshin, K., Jacobsson, R., Jage, H., Elles, S. J. Jaimes, Jakobsen, S., Jans, E., Jashal, B. K., Jawahery, A., Jevtic, V., Jiang, E., Jiang, X., Jiang, Y., Jiang, Y. J., John, M., Rajan, A. John Rubesh, Johnson, D., Jones, C. R., Jones, T. P., Joshi, S., Jost, B., Castella, J. Juan, Jurik, N., Juszczak, I., Kaminaris, D., Kandybei, S., Kane, M., Kang, Y., Kar, C., Karacson, M., Karpenkov, D., Kauniskangas, A., Kautz, J. W., Kazanecki, M. K., Keizer, F., Kenzie, M., Ketel, T., Khanji, B., Kharisova, A., Kholodenko, S., Khreich, G., Kirn, T., Kirsebom, V. S., Kitouni, O., Klaver, S., Kleijne, N., Klimaszewski, K., Kmiec, M. R., Koliiev, S., Kolk, L., Konoplyannikov, A., Kopciewicz, P., Koppenburg, P., Korolev, M., Kostiuk, I., Kot, O., Kotriakhova, S., Kozachuk, A., Kravchenko, P., Kravchuk, L., Kreps, M., Krokovny, P., Krupa, W., Krzemien, W., Kshyvanskyi, O., Kubis, S., Kucharczyk, M., Kudryavtsev, V., Kulikova, E., Kupsc, A., Kutsenko, B. K., Lacarrere, D., Gonzalez, P. Laguarta, Lai, A., Lampis, A., Lancierini, D., Gomez, C. Landesa, Lane, J. J., Lane, R., Lanfranchi, G., Langenbruch, C., Langer, J., Lantwin, O., Latham, T., Lazzari, F., Lazzeroni, C., Gac, R. Le, Lee, H., Lefèvre, R., Leflat, A., Legotin, S., Lehuraux, M., Cid, E. Lemos, Leroy, O., Lesiak, T., Lesser, E. 
D., Leverington, B., Li, A., Li, C., Li, H., Li, K., Li, L., Li, M., Li, P., Li, P. -R., Li, Q., Li, S., Li, T., Li, Y., Lian, Z., Liang, X., Libralon, S., Lin, C., Lin, T., Lindner, R., Linton, H., Lisovskyi, V., Litvinov, R., Liu, F. L., Liu, G., Liu, K., Liu, S., Liu, W., Liu, Y., Liu, Y. L., Ordonez, G. Loachamin, Salvia, A. Lobo, Loi, A., Long, T., Lopes, J. H., Huertas, A. Lopez, Soliño, S. López, Lu, Q., Lucarelli, C., Lucchesi, D., Martinez, M. Lucio, Lukashenko, V., Luo, Y., Lupato, A., Luppi, E., Lynch, K., Lyu, X. -R., Ma, G. M., Maccolini, S., Machefert, F., Maciuc, F., Mack, B., Mackay, I., Mackey, L. M., Mohan, L. R. Madhan, Madurai, M. J., Maevskiy, A., Magdalinski, D., Maisuzenko, D., Malczewski, J. J., Malde, S., Malentacca, L., Malinin, A., Maltsev, T., Manca, G., Mancinelli, G., Mancuso, C., Escalero, R. Manera, Manganella, F. M., Manuzzi, D., Marangotto, D., Marchand, J. F., Marchevski, R., Marconi, U., Mariani, E., Mariani, S., Benito, C. Marin, Marks, J., Marshall, A. M., Martel, L., Martelli, G., Martellotti, G., Martinazzoli, L., Martinelli, M., Gomez, D. Martinez, Santos, D. Martinez, Vidal, F. Martinez, Granollers, A. Martorell i, Massafferri, A., Matev, R., Mathad, A., Matiunin, V., Matteuzzi, C., Mattioli, K. R., Mauri, A., Maurice, E., Mauricio, J., Mayencourt, P., de Cos, J. Mazorra, Mazurek, M., McCann, M., McGrath, T. H., McHugh, N. T., McNab, A., McNulty, R., Meadows, B., Meier, G., Melnychuk, D., Meng, F. M., Merk, M., Merli, A., Garcia, L. Meyer, Miao, D., Miao, H., Mikhasenko, M., Milanes, D. A., Minotti, A., Minucci, E., Miralles, T., Mitreska, B., Mitzel, D. S., Modak, A., Moeser, L., Mohammed, R. A., Moise, R. D., Mokhnenko, S., Cardenas, E. F. Molina, Mombächer, T., Monk, M., Monteil, S., Gomez, A. Morcillo, Morello, G., Morello, M. J., Morgenthaler, M. P., Moron, J., Morren, W., Morris, A. B., Morris, A. G., Mountain, R., Mu, H., Mu, Z. 
M., Muhammad, E., Muheim, F., Mulder, M., Müller, K., Muñoz-Rojas, F., Murta, R., Naik, P., Nakada, T., Nandakumar, R., Nanut, T., Nasteva, I., Needham, M., Neri, N., Neubert, S., Neufeld, N., Neustroev, P., Nicolini, J., Nicotra, D., Niel, E. M., Nikitin, N., Niu, Q., Nogarolli, P., Nogga, P., Normand, C., Fernandez, J. Novoa, Nowak, G., Nunez, C., Nur, H. N., Oblakowska-Mucha, A., Obraztsov, V., Oeser, T., Okamura, S., Okhotnikov, A., Okhrimenko, O., Oldeman, R., Oliva, F., Olocco, M., Onderwater, C. J. G., O'Neil, R. H., Osthues, D., Goicochea, J. M. Otalora, Owen, P., Oyanguren, A., Ozcelik, O., Paciolla, F., Padee, A., Padeken, K. O., Pagare, B., Pais, P. R., Pajero, T., Palano, A., Palutan, M., Pan, X., Panshin, G., Paolucci, L., Papanestis, A., Pappagallo, M., Pappalardo, L. L., Pappenheimer, C., Parkes, C., Parmar, D., Passalacqua, B., Passaleva, G., Passaro, D., Pastore, A., Patel, M., Patoc, J., Patrignani, C., Paul, A., Pawley, C. J., Pellegrino, A., Peng, J., Altarelli, M. Pepe, Perazzini, S., Pereima, D., Da Costa, H. Pereira, Castro, A. Pereiro, Perret, P., Perrevoort, A., Perro, A., Peters, M. J., Petridis, K., Petrolini, A., Pfaller, J. P., Pham, H., Pica, L., Piccini, M., Piccolo, L., Pietrzyk, B., Pietrzyk, G., Pilato, R. N., Pinci, D., Pisani, F., Pizzichemi, M., Placinta, V., Casasus, M. Plo, Poeschl, T., Polci, F., Lener, M. Poli, Poluektov, A., Polukhina, N., Polyakov, I., Polycarpo, E., Ponce, S., Popov, D., Poslavskii, S., Prasanth, K., Prouve, C., Provenzano, D., Pugatch, V., Punzi, G., Qasim, S., Qian, Q. Q., Qian, W., Qin, N., Qu, S., Quagliani, R., Trejo, R. I. Rabadan, Rademacker, J. H., Rama, M., García, M. Ramírez, De Oliveira, V. Ramos, Pernas, M. Ramos, Rangel, M. S., Ratnikov, F., Raven, G., De Miguel, M. Rebollo, Redi, F., Reich, J., Reiss, F., Ren, Z., Resmi, P. K., Galvez, M. 
Ribalda, Ribatti, R., Ricart, G., Riccardi, D., Ricciardi, S., Richardson, K., Richardson-Slipper, M., Rinnert, K., Robbe, P., Robertson, G., Rodrigues, E., Alvarez, A. Rodriguez, Fernandez, E. Rodriguez, Lopez, J. A. Rodriguez, Rodriguez, E. Rodriguez, Roensch, J., Rogachev, A., Rogovskiy, A., Rolf, D. L., Roloff, P., Romanovskiy, V., Vidal, A. Romero, Romolini, G., Ronchetti, F., Rong, T., Rotondo, M., Roy, S. R., Rudolph, M. S., Diaz, M. Ruiz, Fernandez, R. A. Ruiz, Vidal, J. Ruiz, Ryzka, J., Saavedra-Arias, J. J., Silva, J. J. Saborido, Sadek, R., Sagidova, N., Sahoo, D., Sahoo, N., Saitta, B., Salomoni, M., Sanderswood, I., Santacesaria, R., Rios, C. Santamarina, Santimaria, M., Santoro, L., Santovetti, E., Saputi, A., Saranin, D., Sarnatskiy, A., Sarpis, G., Sarpis, M., Satriano, C., Satta, A., Saur, M., Savrina, D., Sazak, H., Sborzacchi, F., Smead, L. G. Scantlebury, Scarabotto, A., Schael, S., Scherl, S., Schiller, M., Schindler, H., Schmelling, M., Schmidt, B., Schmitt, S., Schmitz, H., Schneider, O., Schopper, A., Schulte, N., Schulte, S., Schune, M. H., Schwemmer, R., Schwering, G., Sciascia, B., Sciuccati, A., Segal, I., Sellam, S., Semennikov, A., Senger, T., Soares, M. Senghi, Sergi, A., Serra, N., Sestini, L., Seuthe, A., Shang, Y., Shangase, D. M., Shapkin, M., Sharma, R. S., Shchemerov, I., Shchutska, L., Shears, T., Shekhtman, L., Shen, Z., Sheng, S., Shevchenko, V., Shi, B., Shi, Q., Shimizu, Y., Shmanin, E., Shorkin, R., Shupperd, J. D., Coutinho, R. Silva, Simi, G., Simone, S., Skidmore, N., Skwarnicki, T., Slater, M. W., Smallwood, J. C., Smith, E., Smith, K., Smith, M., Snoch, A., Lavra, L. Soares, Sokoloff, M. D., Soler, F. J. P., Solomin, A., Solovev, A., Solovyev, I., Sommerfeld, N. S., Song, R., Song, Y., Song, Y. S., De Almeida, F. L. Souza, De Paula, B. Souza, Norella, E. Spadaro, Spedicato, E., Speer, J. G., Spiridenkov, E., Spradlin, P., Sriskaran, V., Stagni, F., Stahl, M., Stahl, S., Stanislaus, S., Stefaniak, M., Stein, E. 
N., Steinkamp, O., Stenyakin, O., Stevens, H., Strekalina, D., Su, Y., Suljik, F., Sun, J., Sun, L., Sundfeld, D., Sutcliffe, W., Swientek, K., Swystun, F., Szabelski, A., Szumlak, T., Tan, Y., Tang, Y., Tat, M. D., Terentev, A., Terzuoli, F., Teubert, F., Thomas, E., Thompson, D. J. D., Tilquin, H., Tisserand, V., T'Jampens, S., Tobin, M., Tomassetti, L., Tonani, G., Tong, X., Tork, T., Machado, D. Torres, Toscano, L., Tou, D. Y., Trippl, C., Tuci, G., Tuning, N., Uecker, L. H., Ukleja, A., Unverzagt, D. J., Urbach, B., Usachov, A., Ustyuzhanin, A., Uwer, U., Vagnoni, V., Cadenas, V. Valcarce, Valenti, G., Canudas, N. Valls, van Eldik, J., Van Hecke, H., van Herwijnen, E., Van Hulse, C. B., Van Laak, R., van Veghel, M., Vasquez, G., Gomez, R. Vazquez, Regueiro, P. Vazquez, Sierra, C. Vázquez, Vecchi, S., Velthuis, J. J., Veltri, M., Venkateswaran, A., Verdoglia, M., Vesterinen, M., Benet, D. Vico, Villalba, P. Vidrier, Diaz, M. Vieites, Vilasis-Cardona, X., Figueras, E. Vilella, Villa, A., Vincent, P., Volle, F. C., Bruch, D. vom, Voropaev, N., Vos, K., Vrahas, C., Wagner, J., Walsh, J., Walton, E. J., Wan, G., Wang, C., Wang, G., Wang, H., Wang, J., Wang, M., Wang, N. W., Wang, R., Wang, X., Wang, X. W., Wang, Y., Wang, Y. W., Wang, Z., Ward, J. A., Waterlaat, M., Watson, N. K., Websdale, D., Wei, Y., Wendel, J., Westhenry, B. D. C., White, C., Whitehead, M., Whiter, E., Wiederhold, A. R., Wiedner, D., Wilkinson, G., Wilkinson, M. K., Williams, M., Williams, M. J., Williams, M. R. J., Williams, R., Williams, Z., Wilson, F. F., Winn, M., Wislicki, W., Witek, M., Witola, L., Wormser, G., Wotton, S. A., Wu, H., Wu, J., Wu, X., Wu, Y., Wu, Z., Wyllie, K., Xian, S., Xiang, Z., Xie, Y., Xing, T. X., Xu, A., Xu, L., Xu, M., Xu, Z., Yang, K., Yang, S., Yang, X., Yang, Y., Yang, Z., Yeroshenko, V., Yeung, H., Yin, H., Yin, X., Yu, C. 
Y., Yu, J., Yuan, X., Yuan, Y, Zaffaroni, E., Zavertyaev, M., Zdybal, M., Zenesini, F., Zeng, C., Zeng, M., Zhang, C., Zhang, D., Zhang, J., Zhang, L., Zhang, S., Zhang, Y., Zhang, Y. Z., Zhang, Z., Zhao, Y., Zhelezov, A., Zheng, S. Z., Zheng, X. Z., Zheng, Y., Zhou, T., Zhou, X., Zhou, Y., Zhovkovska, V., Zhu, L. Z., Zhu, X., Zhukov, V., Zhuo, J., Zou, Q., Zuliani, D., and Zunica, G.
- Subjects
High Energy Physics - Experiment - Abstract
The first measurement of the $CP$ asymmetry of the decay rate ($A_{CP}$) and the $CP$ average ($\Sigma A_{\text{FB}}$) and $CP$ asymmetry ($\Delta A_{\text{FB}}$) of the forward-backward asymmetry in the muon system of $\mathit{\Lambda}^+_c\to p\mu^+\mu^-$ decays is reported. The measurement is performed using a data sample of proton-proton collisions, recorded by the LHCb experiment from 2016 to 2018 at a center-of-mass energy of 13$\text{ TeV}$, which corresponds to an integrated luminosity of 5.4$\text{ fb}^{-1}$. The asymmetries are measured in two regions of dimuon mass near the $\phi$-meson mass peak. The dimuon-mass integrated results are \begin{align*} A_{CP} &= (-1.1 \pm 4.0 \pm 0.5)\%,\\ \Sigma A_{\text{FB}} &= (\phantom{-}3.9 \pm 4.0 \pm 0.6)\%,\\ \Delta A_{\text{FB}} &= (\phantom{-}3.1 \pm 4.0 \pm 0.4)\%, \end{align*} where the first uncertainty is statistical and the second systematic. The results are consistent with the conservation of $CP$ symmetry and the Standard Model expectations., Comment: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3473/ (LHCb public pages)
- Published
- 2025
163. PGB: One-Shot Pruning for BERT via Weight Grouping and Permutation
- Author
-
Lim, Hyemin, Lee, Jaeyeon, and Choi, Dong-Wan
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large pretrained language models such as BERT suffer from slow inference and high memory usage, due to their huge size. Recent approaches to compressing BERT rely on iterative pruning and knowledge distillation, which, however, are often too complicated and computationally intensive. This paper proposes a novel semi-structured one-shot pruning method for BERT, called $\textit{Permutation and Grouping for BERT}$ (PGB), which achieves high compression efficiency and sparsity while preserving accuracy. To this end, PGB identifies important groups of individual weights by permutation and prunes all other weights as a structure in both multi-head attention and feed-forward layers. Furthermore, if no important group is formed in a particular layer, PGB drops the entire layer to produce an even more compact model. Our experimental results on BERT$_{\text{BASE}}$ demonstrate that PGB outperforms the state-of-the-art structured pruning methods in terms of computational cost and accuracy preservation.
- Published
- 2025
164. Interacting dark energy constraints from the full-shape analyses of BOSS DR12 and DES Year 3 measurements
- Author
-
Tsedrik, M., Lee, S., Markovic, K., Carrilho, P., Pourtsidou, A., Moretti, C., Bose, B., Huff, E., Robertson, A., Taylor, P. L., and Zuntz, J.
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
Dark Scattering (DS) is an interacting dark energy model characterised by pure momentum exchange between dark energy and dark matter. It is phenomenologically interesting because it is unconstrained by CMB data and can alleviate the $S_8$ tension. We derive constraints on cosmological and DS parameters using three two-point correlation functions (3$\times$2pt) from the Dark Energy Survey third year data release (DES Y3). We then add information from the multipoles of the galaxy power spectrum combined with Baryonic Acoustic Oscillation (BAO) measurements using the twelfth data release of the Baryon Oscillation Spectroscopic Survey (BOSS DR12) and external BAO measurements. We compare results from the direct combination of the probes with the joint posterior distribution calculated with a normalising flow approach. Additionally, we run a CMB analysis with the Planck Public Release 4 (PR4) for comparison of the cosmological constraints. Overall, we find that the combination of probes allows minimising the projection effects and improves constraints without the need to include CMB information. It brings the marginalised posterior maxima closer to the corresponding best-fit values and weakens the sensitivity to the priors of the spectroscopic modelling nuisance parameters. These findings are highly relevant in light of forthcoming data of surveys like DESI, Euclid, and Rubin.
- Published
- 2025
165. Reversible Switching of the Environment-Protected Quantum Spin Hall Insulator Bismuthene at the Graphene/SiC Interface
- Author
-
Tilgner, Niclas, Wolff, Susanne, Soubatch, Serguei, Lee, Tien-Lin, Unigarro, Andres David Peña, Gemming, Sibylle, Tautz, F. Stefan, Kumpf, Christian, Seyller, Thomas, Göhler, Fabian, and Schädlich, Philip
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
Quantum Spin Hall Insulators (QSHI) have been extensively studied both theoretically and experimentally because they exhibit robust helical edge states driven by spin-orbit coupling and offer the potential for applications in spintronics through dissipationless spin transport. However, to realize devices, it is indispensable to gain control over the interaction of the active layer with the substrate, and to protect it from environmental influences. Here we show that a single layer of elemental Bi, formed by intercalation of an epitaxial graphene buffer layer on SiC(0001), is a promising candidate for a QSHI. This layer can be reversibly switched between an electronically inactive precursor state and a ``bismuthene state'', the latter exhibiting the predicted band structure of a true two-dimensional bismuthene layer. Switching is accomplished by hydrogenation (dehydrogenation) of the sample, i.e., a partial passivation (activation) of dangling bonds of the SiC substrate, causing a lateral shift of Bi atoms involving a change of the adsorption site. In the bismuthene state, the Bi honeycomb layer is a prospective QSHI, inherently protected by the graphene sheet above and the H-passivated substrate below. Thus, our results represent an important step towards protected QSHI systems beyond graphene., Comment: 12 pages, 3 figures, supplementary information
- Published
- 2025
166. FedP$^2$EFT: Federated Learning to Personalize Parameter Efficient Fine-Tuning for Multilingual LLMs
- Author
-
Lee, Royson, Kim, Minyoung, Rezk, Fady, Li, Rui, Venieris, Stylianos I., and Hospedales, Timothy
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Federated learning (FL) has enabled the training of multilingual large language models (LLMs) on diverse and decentralized multilingual data, especially on low-resource languages. To improve client-specific performance, personalization via the use of parameter-efficient fine-tuning (PEFT) modules such as LoRA is common. This involves a personalization strategy (PS), such as the design of the PEFT adapter structures (e.g., in which layers to add LoRAs and what ranks) and choice of hyperparameters (e.g., learning rates) for fine-tuning. Instead of manual PS configuration, we propose FedP$^2$EFT, a federated learning-to-personalize method for multilingual LLMs in cross-device FL settings. Unlike most existing PEFT structure selection methods, which are prone to overfitting low-data regimes, FedP$^2$EFT collaboratively learns the optimal personalized PEFT structure for each client via Bayesian sparse rank selection. Evaluations on both simulated and real-world multilingual FL benchmarks demonstrate that FedP$^2$EFT largely outperforms existing personalized fine-tuning methods, while complementing a range of existing FL methods., Comment: Preprint
- Published
- 2025
167. Contrastive Token-level Explanations for Graph-based Rumour Detection
- Author
-
Chin, Daniel Wai Kit and Lee, Roy Ka-Wei
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
The widespread use of social media has accelerated the dissemination of information, but it has also facilitated the spread of harmful rumours, which can disrupt economies, influence political outcomes, and exacerbate public health crises, such as the COVID-19 pandemic. While Graph Neural Network (GNN)-based approaches have shown significant promise in automated rumour detection, they often lack transparency, making their predictions difficult to interpret. Existing graph explainability techniques fall short in addressing the unique challenges posed by the dependencies among feature dimensions in high-dimensional text embeddings used in GNN-based models. In this paper, we introduce Contrastive Token Layerwise Relevance Propagation (CT-LRP), a novel framework designed to enhance the explainability of GNN-based rumour detection. CT-LRP extends current graph explainability methods by providing token-level explanations that offer greater granularity and interpretability. We evaluate the effectiveness of CT-LRP across multiple GNN models trained on three publicly available rumour detection datasets, demonstrating that it consistently produces high-fidelity, meaningful explanations, paving the way for more robust and trustworthy rumour detection systems., Comment: This work has been submitted to the IEEE for possible publication
- Published
- 2025
168. A JWST Project on 47 Tucanae: Kinematics, energy equipartition and anisotropy of multiple populations
- Author
-
Ziliotto, T., Milone, A. P., Cordoni, G., Aros, F. I., Vesperini, E., Lee, J. W., Bellini, A., Libralato, M., Dondoglio, E., Tailo, M., Livernois, A., Legnardi, M. V., Mastrobuono-Battisti, A., Lagioia, E., Bortolan, E., Muratore, F., Marino, A. F., Alves-Brito, A., and Renzini, A.
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Recent work with JWST has demonstrated its capability to identify and chemically characterize multiple populations in globular clusters down to the H-burning limit. In this study, we explore the kinematics of multiple populations in the globular cluster 47 Tucanae by combining data from JWST, HST, and Gaia. We analyzed velocity dispersion and anisotropy profiles from the cluster center out to $\sim$10$R_h$. Our findings indicate that while 1G stars are isotropic, 2G stars are significantly radially anisotropic. These results align with the predictions of simulations of the dynamical evolution of clusters where 2G stars are initially more centrally concentrated than 1G stars. Furthermore, we subdivided the 2G population into two subpopulations: $2G_A$ and $2G_B$, with the latter being more chemically extreme. We compared their dynamical profiles and found no significant differences. For the first time, we measured the degree of energy equipartition among the multiple populations of 47 Tucanae. Overall, within the analyzed radial range ($\sim$2-4$R_h$), both populations exhibit a low degree of energy equipartition. The most significant differences between 1G and 2G stars are observed in the tangential velocity component, where 2G stars are characterized by a stronger degree of energy equipartition than 1G stars. In the radial component, the behavior of 1G and 2G stars is more variable, with differences largely dependent on radius. Finally, our analysis reveals that the ratio of rotational velocity to velocity dispersion is larger for the 2G population, while 1G stars exhibit higher skewness in their tangential proper motions, providing further evidence of differences in the kinematic properties of the 1G and 2G populations.
- Published
- 2025
169. Higgs boson precision analysis of two Higgs doublet models: Full LHC Run 1 and Run 2 data
- Author
-
Heo, Yongtae, Lee, Jae Sik, and Park, Chan Beom
- Subjects
High Energy Physics - Phenomenology - Abstract
We present the results obtained by performing global fits of two-Higgs-doublet models (2HDMs) using the full Run 1 and Run 2 Higgs datasets collected at the LHC. Avoiding unwanted tree-level flavor-changing neutral currents and including the wrong-sign cases, we consider 12 scenarios across six types of 2HDMs: Inert, type I, type II, type III, type IV, and Aligned 2HDMs. Our main results are presented in Table 3 and Fig. 1. We find that the type-I 2HDM provides the best fit, while the wrong-sign scenarios of the type-II and type-IV 2HDMs, where the normalized Yukawa coupling to down-type quarks is opposite in sign to the Standard Model (SM), are disfavored. We also observe that the Aligned 2HDM gives the second-best fit when the Yukawa couplings to down-type quarks take the same sign as in the SM, regardless of the sign of the Yukawa couplings to the charged leptons., Comment: 18 pages, 16 figures, 4 tables
- Published
- 2025
170. Maximizing the Position Embedding for Vision Transformers with Global Average Pooling
- Author
-
Lee, Wonjun, Ham, Bumsub, and Kim, Suhyun
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
In vision transformers, position embedding (PE) plays a crucial role in capturing the order of tokens. However, in vision transformer structures, there is a limitation in the expressiveness of PE due to the structure where position embedding is simply added to the token embedding. A layer-wise method that delivers PE to each layer and applies independent Layer Normalizations for token embedding and PE has been adopted to overcome this limitation. In this paper, we identify the conflicting result that occurs in a layer-wise structure when using the global average pooling (GAP) method instead of the class token. To overcome this problem, we propose MPVG, which maximizes the effectiveness of PE in a layer-wise structure with GAP. Specifically, we identify that PE counterbalances token embedding values at each layer in a layer-wise structure. Furthermore, we recognize that the counterbalancing role of PE is insufficient in the layer-wise structure, and we address this by maximizing the effectiveness of PE through MPVG. Through experiments, we demonstrate that PE performs a counterbalancing role and that maintaining this counterbalancing directionality significantly impacts vision transformers. As a result, the experimental results show that MPVG outperforms existing methods across vision transformers on various tasks., Comment: Accepted at AAAI 2025
- Published
- 2025
171. Improving the Direct Determination of $|V_{ts}|$ using Deep Learning
- Author
-
Heo, Jeewon, Jang, Woojin, Lee, Jason Sang Hun, Roh, Youn Jung, Watson, Ian James, and Yang, Seungjin
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
An $s$-jet tagging approach to determine the Cabibbo-Kobayashi-Maskawa matrix component $|V_{ts}|$ directly in the dileptonic final state events of the top pair production in proton-proton collisions has been previously studied by measuring the branching fraction of the decay of one of the top quarks by $t \to sW$. The main challenge is improving the discrimination performance between strange jets from top decays and other jets. This study proposes novel jet discriminators, called DISAJA, using a Transformer-based deep learning method. The first model, DISAJA-H, utilizes multi-domain inputs (jets, leptons, and missing transverse momentum). An additional model, DISAJA-L, further improves the setup by using lower-level jet constituent information, rather than the high-level clustered information. DISAJA-L is a novel model that combines low-level jet constituent analysis with event classification using multi-domain inputs. The model performance is evaluated via a CMS-like LHC Run 2 fast simulation by comparing various statistical test results to those from a model based on boosted decision trees. This study shows the deep learning model has a significant performance gain over the traditional machine learning method, and we show the potential of the measurement during Run 3 of the LHC and HL-LHC., Comment: 23 pages, 10 figures
- Published
- 2025
172. The Benefits of Prosociality towards AI Agents: Examining the Effects of Helping AI Agents on Human Well-Being
- Author
-
Zhu, Zicheng, Tan, Yugin, Yamashita, Naomi, Lee, Yi-Chieh, and Zhang, Renwen
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Prosocial behaviors, such as helping others, are well-known to enhance human well-being. While there is a growing trend of humans helping AI agents, it remains unclear whether the well-being benefits of helping others extend to interactions with non-human entities. To address this, we conducted an experiment (N = 295) to explore how helping AI agents impacts human well-being, especially when the agents fulfill human basic psychological needs--relatedness, competence, and autonomy--during the interaction. Our findings showed that helping AI agents reduced participants' feelings of loneliness. When AI met participants' needs for competence and autonomy during the helping process, there was a further decrease in loneliness and an increase in positive affect. However, when AI did not meet participants' need for relatedness, participants experienced an increase in positive affect. We discuss the implications of these findings for understanding how AI can support human well-being.
- Published
- 2025
173. ScholaWrite: A Dataset of End-to-End Scholarly Writing Process
- Author
-
Wang, Linghe, Lee, Minhwa, Volkov, Ross, Chau, Luan Tuyen, and Kang, Dongyeop
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Computation and Language ,Quantitative Biology - Neurons and Cognition - Abstract
Writing is a cognitively demanding task involving continuous decision-making, heavy use of working memory, and frequent switching between multiple activities. Scholarly writing is particularly complex as it requires authors to coordinate many pieces of multiform knowledge. To fully understand writers' cognitive thought process, one should fully decode the end-to-end writing data (from individual ideas to final manuscript) and understand their complex cognitive mechanisms in scholarly writing. We introduce ScholaWrite dataset, the first-of-its-kind keystroke logs of an end-to-end scholarly writing process for complete manuscripts, with thorough annotations of cognitive writing intentions behind each keystroke. Our dataset includes LaTeX-based keystroke data from five preprints with nearly 62K total text changes and annotations across 4 months of paper writing. ScholaWrite shows promising usability and applications (e.g., iterative self-writing) for the future development of AI writing assistants for academic research, which necessitate complex methods beyond LLM prompting. Our experiments clearly demonstrated the importance of collection of end-to-end writing data, rather than the final manuscript, for the development of future writing assistants to support the cognitive thinking process of scientists. Our de-identified dataset, demo, and code repository are available on our project page., Comment: Equal contribution: Linghe Wang, Minhwa Lee | project page: https://minnesotanlp.github.io/scholawrite/
- Published
- 2025
174. Sensitivity analysis for multivariable missing data using multiple imputation: a tutorial
- Author
-
Nguyen, Cattram D, Lee, Katherine J, White, Ian R, van Buuren, Stef, and Moreno-Betancur, Margarita
- Subjects
Statistics - Methodology - Abstract
Multiple imputation is a popular method for handling missing data, with fully conditional specification (FCS) being one of the predominant imputation approaches for multivariable missingness. Unbiased estimation with standard implementations of multiple imputation depends on assumptions concerning the missingness mechanism (e.g. that data are "missing at random"). The plausibility of these assumptions can only be assessed using subject-matter knowledge, and not data alone. It is therefore important to perform sensitivity analyses to explore the robustness of results to violations of these assumptions (e.g. if the data are in fact "missing not at random"). In this tutorial, we provide a roadmap for conducting sensitivity analysis using the Not at Random Fully Conditional Specification (NARFCS) procedure for multivariate imputation. Using a case study from the Longitudinal Study of Australian Children, we work through the steps involved, from assessing the need to perform the sensitivity analysis, and specifying the NARFCS models and sensitivity parameters, through to implementing NARFCS using FCS procedures in R and Stata., Comment: 24 pages, 3 figures, 2 tables
- Published
- 2025
175. PRISM: A Robust Framework for Skill-based Meta-Reinforcement Learning with Noisy Demonstrations
- Author
-
Lee, Sanghyeon, Bae, Sangjun, Park, Yisak, and Han, Seungyul
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Meta-reinforcement learning (Meta-RL) facilitates rapid adaptation to unseen tasks but faces challenges in long-horizon environments. Skill-based approaches tackle this by decomposing state-action sequences into reusable skills and employing hierarchical decision-making. However, these methods are highly susceptible to noisy offline demonstrations, resulting in unstable skill learning and degraded performance. To overcome this, we propose Prioritized Refinement for Skill-Based Meta-RL (PRISM), a robust framework that integrates exploration near noisy data to generate online trajectories and combines them with offline data. Through prioritization, PRISM extracts high-quality data to learn task-relevant skills effectively. By addressing the impact of noise, our method ensures stable skill learning and achieves superior performance in long-horizon tasks, even with noisy and sub-optimal data., Comment: 8 pages main, 19 pages appendix with reference. Submitted to ICML 2025
- Published
- 2025
176. Decoding Human Attentive States from Spatial-temporal EEG Patches Using Transformers
- Author
-
Ding, Yi, Lee, Joon Hei, Zhang, Shuailei, Luo, Tianze, and Guan, Cuntai
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
Learning the spatial topology of electroencephalogram (EEG) channels and their temporal dynamics is crucial for decoding attention states. This paper introduces EEG-PatchFormer, a transformer-based deep learning framework designed specifically for EEG attention classification in Brain-Computer Interface (BCI) applications. By integrating a Temporal CNN for frequency-based EEG feature extraction, a pointwise CNN for feature enhancement, and Spatial and Temporal Patching modules for organizing features into spatial-temporal patches, EEG-PatchFormer jointly learns spatial-temporal information from EEG data. Leveraging the global learning capabilities of the self-attention mechanism, it captures essential features across brain regions over time, thereby enhancing EEG data decoding performance. Demonstrating superior performance, EEG-PatchFormer surpasses existing benchmarks in accuracy, area under the ROC curve (AUC), and macro-F1 score on a public cognitive attention dataset. The code can be found via: https://github.com/yi-ding-cs/EEG-PatchFormer ., Comment: Implementation details are updated
- Published
- 2025
177. Energy & Force Regression on DFT Trajectories is Not Enough for Universal Machine Learning Interatomic Potentials
- Author
-
Miret, Santiago, Lee, Kin Long Kelvin, Gonzales, Carmelo, Mannan, Sajid, and Krishnan, N. M. Anoop
- Subjects
Condensed Matter - Materials Science ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Universal Machine Learning Interatomic Potentials (MLIPs) enable accelerated simulations for materials discovery. However, current research efforts fail to impactfully utilize MLIPs due to: 1. Overreliance on Density Functional Theory (DFT) for MLIP training data creation; 2. MLIPs' inability to reliably and accurately perform large-scale molecular dynamics (MD) simulations for diverse materials; 3. Limited understanding of MLIPs' underlying capabilities. To address these shortcomings, we argue that MLIP research efforts should prioritize: 1. Employing more accurate simulation methods for large-scale MLIP training data creation (e.g. Coupled Cluster Theory) that cover a wide range of materials design spaces; 2. Creating MLIP metrology tools that leverage large-scale benchmarking, visualization, and interpretability analyses to provide a deeper understanding of MLIPs' inner workings; 3. Developing computationally efficient MLIPs to execute MD simulations that accurately model a broad set of materials properties. Together, these interdisciplinary research directions can help further the real-world application of MLIPs to accurately model complex materials at device scale.
- Published
- 2025
178. SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
- Author
-
Levy, Daniel, Panigrahi, Siba Smarak, Kaba, Sékou-Oumar, Zhu, Qiang, Lee, Kin Long Kelvin, Galkin, Mikhail, Miret, Santiago, and Ravanbakhsh, Siamak
- Subjects
Condensed Matter - Materials Science ,Computer Science - Machine Learning - Abstract
Generating novel crystalline materials has potential to lead to advancements in fields such as electronics, energy storage, and catalysis. The defining characteristic of crystals is their symmetry, which plays a central role in determining their physical properties. However, existing crystal generation methods either fail to generate materials that display the symmetries of real-world crystals, or simply replicate the symmetry information from examples in a database. To address this limitation, we propose SymmCD, a novel diffusion-based generative model that explicitly incorporates crystallographic symmetry into the generative process. We decompose crystals into two components and learn their joint distribution through diffusion: 1) the asymmetric unit, the smallest subset of the crystal which can generate the whole crystal through symmetry transformations, and 2) the symmetry transformations needed to be applied to each atom in the asymmetric unit. We also use a novel and interpretable representation for these transformations, enabling generalization across different crystallographic symmetry groups. We showcase the competitive performance of SymmCD on a subset of the Materials Project, obtaining diverse and valid crystals with realistic symmetries and predicted properties.
- Published
- 2025
179. An Efficient Quasi-Newton Method with Tensor Product Implementation for Solving Quasi-Linear Elliptic Equations and Systems
- Author
-
Hao, Wenrui, Lee, Sun, and Zhang, Xiangxiong
- Subjects
Mathematics - Numerical Analysis - Abstract
In this paper, we introduce a quasi-Newton method optimized for efficiently solving quasi-linear elliptic equations and systems, with a specific focus on GPU-based computation. By approximating the Jacobian matrix with a combination of linear Laplacian and simplified nonlinear terms, our method reduces the computational overhead typical of traditional Newton methods while handling the large, sparse matrices generated from discretized PDEs. We also provide a convergence analysis demonstrating local convergence to the exact solution under optimal choices for the regularization parameter, ensuring stability and efficiency in each iteration. Numerical experiments in two- and three-dimensional domains validate the proposed method's robustness and computational gains with tensor-product implementation. This approach offers a promising pathway for accelerating quasi-linear elliptic equation and system solvers, expanding the feasibility of complex simulations in physics, engineering, and other fields leveraging advanced hardware capabilities.
- Published
- 2025
180. DC-VSR: Spatially and Temporally Consistent Video Super-Resolution with Video Diffusion Prior
- Author
-
Han, Janghyeok, Sim, Gyujin, Kim, Geonung, Lee, Hyunseung, Choi, Kyuha, Han, Youngseok, and Cho, Sunghyun
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Artificial Intelligence ,Computer Science - Graphics - Abstract
Video super-resolution (VSR) aims to reconstruct a high-resolution (HR) video from a low-resolution (LR) counterpart. Achieving successful VSR requires producing realistic HR details and ensuring both spatial and temporal consistency. To restore realistic details, diffusion-based VSR approaches have recently been proposed. However, the inherent randomness of diffusion, combined with their tile-based approach, often leads to spatio-temporal inconsistencies. In this paper, we propose DC-VSR, a novel VSR approach to produce spatially and temporally consistent VSR results with realistic textures. To achieve spatial and temporal consistency, DC-VSR adopts a novel Spatial Attention Propagation (SAP) scheme and a Temporal Attention Propagation (TAP) scheme that propagate information across spatio-temporal tiles based on the self-attention mechanism. To enhance high-frequency details, we also introduce Detail-Suppression Self-Attention Guidance (DSSAG), a novel diffusion guidance scheme. Comprehensive experiments demonstrate that DC-VSR achieves spatially and temporally consistent, high-quality VSR results, outperforming previous approaches., Comment: Equal contributions from first two authors
- Published
- 2025
181. Signal shape studies and rate dependence of HFO-based gas mixtures in RPC detectors
- Author
-
Quaglia, L., Abbrescia, M., Aielli, G., Aly, R., Arena, M. C., Barroso, M., Benussi, L., Bianco, S., Bordon, F., Boscherini, D., Bruni, A., Buontempo, S., Busato, M., Camarri, P., Cardarelli, R., Congedo, L., Damiao, D. De Jesus, Debernardis, F., De Serio, M., Di Ciaccio, A., Di Stante, L., Dupieux, P., Eysermans, J., Ferretti, A., Gagliardi, M., Galati, G., Garetti, S., Guida, R., Iaselli, G., Joly, B., Juks, S. A., Lee, K. S., Liberti, B., Ramirez, D. Lucero, Mandelli, B., Manen, S. P., Massa, L., Pastore, A., Pastori, E., Piccolo, D., Pizzimento, L., Polini, A., Proto, G., Pugliese, G., Ramos, D., Rigoletti, G., Rocchi, A., Romano, M., Salvini, P., Samalan, A., Santonico, R., Saviano, G., Sessa, M., Simone, S., Terlizzi, L., Tytgat, M., Vercellin, E., Verzeroli, M., and Zaganidis, N.
- Subjects
Physics - Instrumentation and Detectors ,Nuclear Experiment - Abstract
The RPCs employed at the LHC experiments are currently operated in avalanche mode, with a mixture containing a large fraction of C$_{2}$H$_{2}$F$_{4}$ ($\approx$90\% or more) with the addition of i-C$_{4}$H$_{10}$ and SF$_{6}$ in different concentrations. However, C$_{2}$H$_{2}$F$_{4}$ and SF$_{6}$ are fluorinated greenhouse gases (F-gases) with Global Warming Potential (GWP) of $\approx$1400 and $\approx$22800, respectively. EU regulations imposed a progressive phase-down of C$_{2}$H$_{2}$F$_{4}$ production and consumption, aiming at strongly reducing its emission. This is already resulting in an increase of its price and reduction in availability. The most desirable long-term solution to this problem is to find an alternative, F-gases-free gas mixture, able to maintain similar detector performance. To address this challenge, the RPC ECOgas@GIF++ collaboration (including RPC experts of ALICE, ATLAS, CMS, SHiP/LHCb, and the CERN EP-DT group) was created in 2019. The collaboration is currently studying a gas from the olefine family, the C$_{3}$H$_{2}$F$_{4}$ (or simply HFO, with GWP $\approx$6), to be used, in combination with CO$_{2}$, as a substitute for C$_{2}$H$_{2}$F$_{4}$. This contribution will focus on the signal shape studies that have been carried out by the collaboration during dedicated beam test periods. The methodology used in the data analysis will be presented, together with the results obtained with several HFO-based gas mixtures, and with the currently employed one. Furthermore, results on the counting-rate dependence of the RPC performance, obtained by combining the muon beam with the GIF++ $^{137}$Cs source with different attenuation factors, will also be presented.
- Published
- 2025
182. Prompt-based Depth Pruning of Large Language Models
- Author
-
Wee, Juyun, Park, Minjae, and Lee, Jaeho
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Depth pruning aims to reduce the inference cost of a large language model without any hardware-specific complications, by simply removing several less important transformer blocks. However, our empirical findings suggest that the importance of a transformer block may be highly task-dependent -- a block that is crucial for a task can be removed without degrading the accuracy on another task. Based on this observation, we develop a dynamic depth pruning algorithm, coined PuDDing (Prompt-routed Dynamic Depth Pruning), which determines which blocks to omit from the model based on the input prompt. PuDDing operates by training a lightweight router to predict the best omission set among a set of options, where this option set has also been constructed in a data-driven manner. Empirical results on commonsense reasoning benchmarks demonstrate that PuDDing effectively accelerates the inference of language models, and achieves better on-task performance than static depth pruning baselines., Comment: 13 pages, 5 figures
- Published
- 2025
183. RAPID: Robust and Agile Planner Using Inverse Reinforcement Learning for Vision-Based Drone Navigation
- Author
-
Kim, Minwoo, Bae, Geunsik, Lee, Jinwoo, Shin, Woojae, Kim, Changseung, Choi, Myong-Yol, Shin, Heejung, and Oh, Hyondong
- Subjects
Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
This paper introduces a learning-based visual planner for agile drone flight in cluttered environments. The proposed planner generates collision-free waypoints in milliseconds, enabling drones to perform agile maneuvers in complex environments without building separate perception, mapping, and planning modules. Learning-based methods, such as behavior cloning (BC) and reinforcement learning (RL), demonstrate promising performance in visual navigation but still face inherent limitations. BC is susceptible to compounding errors due to limited expert imitation, while RL struggles with reward function design and sample inefficiency. To address these limitations, this paper proposes an inverse reinforcement learning (IRL)-based framework for high-speed visual navigation. By leveraging IRL, it is possible to reduce the number of interactions with simulation environments and improve the capability to deal with high-dimensional spaces while preserving the robustness of RL policies. A motion primitive-based path planning algorithm collects an expert dataset with privileged map data from diverse environments, ensuring comprehensive scenario coverage. By leveraging both the acquired expert dataset and the learner dataset gathered from the agent's interactions with the simulation environments, a robust reward function and policy are learned across diverse states. While the proposed method is trained in a simulation environment only, it can be directly applied to real-world scenarios without additional training or tuning. The performance of the proposed method is validated in both simulation and real-world environments, including forests and various structures. The trained policy achieves an average speed of 7 m/s and a maximum speed of 8.8 m/s in real flight experiments. To the best of our knowledge, this is the first work to successfully apply an IRL framework for high-speed visual navigation of drones., Comment: 18 pages, 11 figures, 58 references, and appendix is included
- Published
- 2025
184. Impact of Higher-order Tidal Corrections on the Measurement Accuracy of Neutron Star Tidal Deformability
- Author
-
Park, Gyeongbin, Lee, Chang-Hwan, and Cho, Hee-Suk
- Subjects
General Relativity and Quantum Cosmology
- Abstract
Gravitational waves emitted by binary neutron stars (BNS) provide information about the internal structure of neutron stars (NSs), helping to verify dense matter equations of state. We investigate how the measurement accuracy of NS's tidal deformability can be improved by incorporating the higher-order post-Newtonian (pN) tidal corrections up to 7.5 pN. We assume an aligned-spin BNS system and adopt TaylorF2, which is the most commonly used pN waveform model. To calculate the measurement error, we use a semi-analytic method, Fisher Matrix, which is much faster than performing parameter estimation simulations. We employ Universal Relation to remove additional parameters that appear in higher-order corrections beyond 6 pN. We find that the effect of tidal corrections shows no sign of convergence with increasing pN orders. Assuming a fiducial binary NS system whose physical parameters are compatible with GW170817, we find that the measurement error of tidal deformability ($\tilde{\lambda}$) decreases linearly as the effective spin ($\chi_{\rm eff}$) increases and the tidal deformability can be better measured for stiffer equations of state., Comment: 7 pages, 3 figures
- Published
- 2025
185. Domain-Invariant Per-Frame Feature Extraction for Cross-Domain Imitation Learning with Visual Observations
- Author
-
Kim, Minung, Lee, Kawon, Kim, Jungmo, Choi, Sungho, and Han, Seungyul
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Imitation learning (IL) enables agents to mimic expert behavior without reward signals but faces challenges in cross-domain scenarios with high-dimensional, noisy, and incomplete visual observations. To address this, we propose Domain-Invariant Per-Frame Feature Extraction for Imitation Learning (DIFF-IL), a novel IL method that extracts domain-invariant features from individual frames and adapts them into sequences to isolate and replicate expert behaviors. We also introduce a frame-wise time labeling technique to segment expert behaviors by timesteps and assign rewards aligned with temporal contexts, enhancing task performance. Experiments across diverse visual environments demonstrate the effectiveness of DIFF-IL in addressing complex visual tasks., Comment: 8 pages main, 19 pages appendix with reference. Submitted to ICML 2025
- Published
- 2025
186. Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning
- Author
-
Lee, Sunwoo, Hwang, Jaebak, Jo, Yonghyeon, and Han, Seungyul
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Computer Science - Multiagent Systems
- Abstract
Traditional robust methods in multi-agent reinforcement learning (MARL) often struggle against coordinated adversarial attacks in cooperative scenarios. To address this limitation, we propose the Wolfpack Adversarial Attack framework, inspired by wolf hunting strategies, which targets an initial agent and its assisting agents to disrupt cooperation. Additionally, we introduce the Wolfpack-Adversarial Learning for MARL (WALL) framework, which trains robust MARL policies to defend against the proposed Wolfpack attack by fostering system-wide collaboration. Experimental results underscore the devastating impact of the Wolfpack attack and the significant robustness improvements achieved by WALL., Comment: 8 pages main, 21 pages appendix with reference. Submitted to ICML 2025
- Published
- 2025
187. Mol-LLM: Generalist Molecular LLM with Improved Graph Utilization
- Author
-
Lee, Chanhui, Song, Yuheon, Jeong, YongJun, Ko, Hanbum, Hormazabal, Rodrigo, Han, Sehui, Bae, Kyunghoon, Lim, Sungbin, and Kim, Sungwoong
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Physics - Chemical Physics, Quantitative Biology - Biomolecules
- Abstract
Recent advances in Large Language Models (LLMs) have motivated the development of general LLMs for molecular tasks. While several studies have demonstrated that fine-tuned LLMs can achieve impressive benchmark performances, they are far from genuine generalist molecular LLMs due to a lack of fundamental understanding of molecular structure. Specifically, when given molecular task instructions, LLMs trained with naive next-token prediction training assign similar likelihood scores to both original and negatively corrupted molecules, revealing their lack of molecular structure understanding that is crucial for reliable and general molecular LLMs. To overcome this limitation and obtain a true generalist molecular LLM, we introduce a novel multi-modal training method based on a thorough multi-modal instruction tuning as well as a molecular structure preference optimization between chosen and rejected graphs. On various molecular benchmarks, the proposed generalist molecular LLM, called Mol-LLM, achieves state-of-the-art performances among generalist LLMs on most tasks, while at the same time surpassing or matching state-of-the-art specialist LLMs. Moreover, Mol-LLM also shows superior generalization performances in reaction prediction tasks, demonstrating the benefit of molecular structure understanding for generalization.
- Published
- 2025
188. METAMON: Finding Inconsistencies between Program Documentation and Behavior using Metamorphic LLM Queries
- Author
-
Lee, Hyeonseok, An, Gabin, and Yoo, Shin
- Subjects
Computer Science - Software Engineering
- Abstract
Code documentation can, if written precisely, help developers better understand the code it accompanies. However, unlike code, code documentation cannot be automatically verified via execution, potentially leading to inconsistencies between documentation and the actual behavior. While such inconsistencies can be harmful for the developer's understanding of the code, checking and finding them remains a costly task due to the involvement of human engineers. This paper proposes METAMON, which uses an existing search-based test generation technique to capture the current program behavior in the form of test cases, and subsequently uses LLM-based code reasoning to identify the generated regression test oracles that are not consistent with the program specifications in the documentation. METAMON is supported in this task by metamorphic testing and self-consistency. An empirical evaluation against 9,482 pairs of code documentation and code snippets, generated using five open-source projects from Defects4J v2.0.1, shows that METAMON can classify the code-and-documentation inconsistencies with a precision of 0.72 and a recall of 0.48., Comment: 8 pages and 7 figures, accepted to LLM4Code 2025
- Published
- 2025
189. Peri-LN: Revisiting Layer Normalization in the Transformer Architecture
- Author
-
Kim, Jeonghoon, Lee, Byeongchan, Park, Cheonbok, Oh, Yeontaek, Kim, Beomjun, Yoo, Taehwan, Shin, Seongjin, Han, Dongyoon, Shin, Jinwoo, and Yoo, Kang Min
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Designing Transformer architectures with the optimal layer normalization (LN) strategy that ensures large-scale training stability and expedites convergence has remained elusive, even in this era of large language models (LLMs). To this end, we present a comprehensive analytical foundation for understanding how different LN strategies influence training dynamics in large-scale Transformer training. Until recently, Pre-LN and Post-LN have long dominated standard practices despite their limitations in large-scale training. However, several open-source large-scale models have recently begun silently adopting a third strategy without much explanation. This strategy places layer normalization (LN) peripherally around sublayers, a design we term Peri-LN. While Peri-LN has demonstrated promising empirical performance, its precise mechanisms and benefits remain almost unexplored. Our in-depth analysis shows that Peri-LN strikes an ideal balance in variance growth -- unlike Pre-LN and Post-LN, which are prone to vanishing gradients and ``massive activations.'' To validate our theoretical insight, we conduct large-scale experiments on Transformers up to 3.2B parameters, showing that Peri-LN consistently achieves more balanced variance growth, steadier gradient flow, and convergence stability. Our results suggest that Peri-LN warrants broader consideration for large-scale Transformer architectures, providing renewed insights into the optimal placement and application of LN., Comment: Preprint
- Published
- 2025
190. Critical Current, Lengthwise Fluctuations, and Flux Jumps in REBCO CC: A Torque Magnetometry Study up to 45 T
- Author
-
Jaroszynski, J., Constantinescu, A-M, Kolb-Bond, D., Francis, A., Xu, A., Ries, R., Bradford, G., Bang, J., Lee, J., and Larbalestier, D.
- Subjects
Condensed Matter - Superconductivity
- Abstract
REBCO (Rare Earth Barium Copper Oxide) coated conductors (CCs) have emerged for future high field magnets in fields and temperatures inaccessible for Nb-based superconductors. However, their exceptionally high current densities pose challenges for characterization at low temperatures. This paper presents the design and implementation of a simple torque magnetometer especially suitable for characterizing REBCO CC. It details the construction and underlying physics, with particular emphasis on its capability to assess angular critical currents Ic in high magnetic fields and low temperatures. The study includes characterizations of multiple REBCO samples from different manufacturers, performed under magnetic fields up to 45 T, demonstrating the exceptional capabilities of REBCO CCs in extreme fields. The results reveal significant lengthwise Ic variations, especially in tapes cut from the edges of 12 mm-wide production tapes compared to those cut from the center. These variations are most pronounced when the field is in the vicinity of the ab-plane. Importantly, flux jumps are observed in samples with thick REBCO layers and thin stabilizers, underscoring potential thermal instabilities. These findings provide valuable insights into REBCO tape performance under extreme magnetic fields, highlighting their relevance for high-field magnet and nuclear fusion applications., Comment: 22 pages, 10 figures
- Published
- 2025
191. Calibrated Multi-Preference Optimization for Aligning Diffusion Models
- Author
-
Lee, Kyungmin, Li, Xiaohang, Wang, Qifei, He, Junfeng, Ke, Junjie, Yang, Ming-Hsuan, Essa, Irfan, Shin, Jinwoo, Yang, Feng, and Li, Yinxiao
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Aligning text-to-image (T2I) diffusion models with preference optimization is valuable for human-annotated datasets, but the heavy cost of manual data collection limits scalability. Using reward models offers an alternative; however, current preference optimization methods fall short in exploiting the rich information, as they only consider pairwise preference distribution. Furthermore, they lack generalization to multi-preference scenarios and struggle to handle inconsistencies between rewards. To address this, we present Calibrated Preference Optimization (CaPO), a novel method to align T2I diffusion models by incorporating the general preference from multiple reward models without human-annotated data. The core of our approach involves a reward calibration method to approximate the general preference by computing the expected win-rate against the samples generated by the pretrained models. Additionally, we propose a frontier-based pair selection method that effectively manages the multi-preference distribution by selecting pairs from Pareto frontiers. Finally, we use regression loss to fine-tune diffusion models to match the difference between calibrated rewards of a selected pair. Experimental results show that CaPO consistently outperforms prior methods, such as Direct Preference Optimization (DPO), in both single and multi-reward settings validated by evaluation on T2I benchmarks, including GenEval and T2I-Compbench.
- Published
- 2025
192. New perspective on the multiple population phenomenon in Galactic globular clusters from a wide-field photometric survey
- Author
-
Jang, S., Milone, A. P., Marino, A. F., Tailo, M., Dondoglio, E., Legnardi, M. V., Cordoni, G., Ziliotto, T., Lagioia, E. P., Carlos, M., Mohandasan, A., Bortolan, E., and Lee, Y. -W.
- Subjects
Astrophysics - Astrophysics of Galaxies
- Abstract
Wide-field photometry of Galactic globular clusters (GCs) has been investigated to overcome limitations from the small field of view of the Hubble Space Telescope in the study of multiple populations. In particular, 'chromosome maps' (ChMs) built with ground-based photometry were constructed to identify the first and second generation stars (1G and 2G) over the wide field of view. The ChMs allow us to derive the fraction of distinct populations in an analyzed field of view. We present here the radial distribution of the 2G fraction in 29 GCs. The distributions show that all the GCs either have a flat distribution or more centrally concentrated 2G stars. Notably, we find that the fraction of 1G stars outside the half-light radius is clearly bifurcated across the entire mass range. This implies that a group of GCs with lower 1G fractions (hereafter Group II) have efficiently lost their 1G stars in the outermost cluster regions. In fact, in connection with the trends of the radial distribution, most GCs of Group II have spatially mixed populations, while only less massive GCs in Group I (a group with higher 1G fraction) show that feature. Lastly, we investigate links between these two groups and host cluster parameters. We find that most GCs of Group II are distributed along a broader range of galactocentric distances with smaller perigalactic distances < 3.5 kpc. Besides, by using the Gaia data, it is observed that Group II GCs have higher energy on the integrals of motion diagrams than Group I GCs., Comment: 17 pages, 11 figures
- Published
- 2025
193. Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
- Author
-
Lee, Junha, Park, Chunghyun, Choe, Jaesung, Wang, Yu-Chiang Frank, Kautz, Jan, Cho, Minsu, and Choy, Chris
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We tackle open-vocabulary 3D scene understanding by introducing a novel data generation pipeline and training framework. Our method addresses three critical requirements for effective training: precise 3D region segmentation, comprehensive textual descriptions, and sufficient dataset scale. By leveraging state-of-the-art open-vocabulary image segmentation models and region-aware Vision-Language Models, we develop an automatic pipeline that generates high-quality 3D mask-text pairs. Applying this pipeline to multiple 3D scene datasets, we create Mosaic3D-5.6M, a dataset of over 30K annotated scenes with 5.6M mask-text pairs, significantly larger than existing datasets. Building upon this data, we propose Mosaic3D, a foundation model combining a 3D encoder trained with contrastive learning and a lightweight mask decoder for open-vocabulary 3D semantic and instance segmentation. Our approach achieves state-of-the-art results on open-vocabulary 3D semantic and instance segmentation tasks including ScanNet200, Matterport3D, and ScanNet++, with ablation studies validating the effectiveness of our large-scale training data., Comment: project page: https://nvlabs.github.io/Mosaic3D/
- Published
- 2025
194. Hybrid Fingerprint-based Positioning in Cell-Free Massive MIMO Systems
- Author
-
Kumar, Manish, Chou, Tzu-Hsuan, Lee, Byunghyun, Michelusi, Nicolo, Love, David J., and Krogmeier, James V.
- Subjects
Electrical Engineering and Systems Science - Signal Processing
- Abstract
Recently, there has been an increasing interest in 6G technology for integrated sensing and communications, where positioning stands out as a key application. In the realm of 6G, cell-free massive multiple-input multiple-output (MIMO) systems, featuring distributed base stations equipped with a large number of antennas, present an abundant source of angle-of-arrival (AOA) information that could be exploited for positioning applications. In this paper we leverage this AOA information at the base stations using the multiple signal classification (MUSIC) algorithm, in conjunction with received signal strength (RSS) for positioning through Gaussian process regression (GPR). An AOA fingerprint database is constructed by capturing the angle data from multiple locations across the network area and is combined with RSS data from the same locations to form a hybrid fingerprint which is then used to train a GPR model employing a squared exponential kernel. The trained regression model is subsequently utilized to estimate the location of a user equipment. Simulations demonstrate that the GPR model with hybrid input achieves better positioning accuracy than traditional GPR models utilizing RSS-only and AOA-only inputs.
- Published
- 2025
195. Artificial Intelligence and Legal Analysis: Implications for Legal Education and the Profession
- Author
-
Peoples, Lee
- Subjects
Computer Science - Computers and Society, Computer Science - Artificial Intelligence
- Abstract
This article reports the results of a study examining the ability of legal and non-legal Large Language Models to perform legal analysis using the Issue-Rule-Application-Conclusion framework. LLMs were tested on legal reasoning tasks involving rule analysis and analogical reasoning. The results show that LLMs can conduct basic IRAC analysis, but are limited by brief responses lacking detail, an inability to commit to answers, false confidence, and hallucinations. The study compares legal and non-legal LLMs, identifies shortcomings, and explores traits that may hinder their ability to think like a lawyer. It also discusses the implications for legal education and practice, highlighting the need for critical thinking skills in future lawyers and the potential pitfalls of overreliance on artificial intelligence (AI), resulting in a loss of logic, reasoning, and critical thinking skills.
- Published
- 2025
196. Data-Efficient Model for Psychological Resilience Prediction based on Neurological Data
- Author
-
Zhang, Zhi, Liu, Yan, Gao, Mengxia, Yang, Yu, Cao, Jiannong, Hou, Wai Kai, Li, Shirley, Yau, Sonata, Wing, Yun Kwok, and Lee, Tatia M. C.
- Subjects
Computer Science - Computational Engineering, Finance, and Science, Computer Science - Artificial Intelligence
- Abstract
Psychological resilience, defined as the ability to rebound from adversity, is crucial for mental health. Compared with traditional resilience assessments through self-reported questionnaires, resilience assessments based on neurological data offer more objective results with biological markers, hence significantly enhancing credibility. This paper proposes a novel data-efficient model to address the scarcity of neurological data. We employ Neuro Kolmogorov-Arnold Networks as the structure of the prediction model. In the training stage, a new trait-informed multimodal representation algorithm with a smart chunk technique is proposed to learn the shared latent space with limited data. In the test stage, a new noise-informed inference algorithm is proposed to address the low signal-to-noise ratio of the neurological data. The proposed model not only shows impressive performance on both public datasets and self-constructed datasets but also provides some valuable psychological hypotheses for future research.
- Published
- 2025
197. OCTOPINF: Workload-Aware Inference Serving for Edge Video Analytics
- Author
-
Nguyen, Thanh-Tung, Liebe, Lucas, Tau, Nhat-Quang, Wu, Yuheng, Cheng, Jinghan, and Lee, Dongman
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Edge Video Analytics (EVA) has gained significant attention as a major application of pervasive computing, enabling real-time visual processing. EVA pipelines, composed of deep neural networks (DNNs), typically demand efficient inference serving under stringent latency requirements, which is challenging due to the dynamic Edge environments (e.g., workload variability and network instability). Moreover, EVA pipelines also face significant resource contention caused by resource (e.g., GPU) constraints at the Edge. In this paper, we introduce OCTOPINF, a novel resource-efficient and workload-aware inference serving system designed for real-time EVA. OCTOPINF tackles the unique challenges of dynamic edge environments through fine-grained resource allocation, adaptive batching, and workload balancing between edge devices and servers. Furthermore, we propose a spatiotemporal scheduling algorithm that optimizes the co-location of inference tasks on GPUs, improving performance and ensuring service-level objectives (SLOs) compliance. Extensive evaluations on a real-world testbed demonstrate the effectiveness of our approach. It achieves an effective throughput increase of up to 10x compared to the baselines and shows better robustness in challenging scenarios. OCTOPINF can be used for any DNN-based EVA inference task with minimal adaptation and is available at https://github.com/tungngreen/PipelineScheduler.
- Published
- 2025
198. ConditionNET: Learning Preconditions and Effects for Execution Monitoring
- Author
-
Sliwowski, Daniel and Lee, Dongheui
- Subjects
Computer Science - Robotics, Computer Science - Machine Learning
- Abstract
The introduction of robots into everyday scenarios necessitates algorithms capable of monitoring the execution of tasks. In this paper, we propose ConditionNET, an approach for learning the preconditions and effects of actions in a fully data-driven manner. We develop an efficient vision-language model and introduce additional optimization objectives during training to optimize for consistent feature representations. ConditionNET explicitly models the dependencies between actions, preconditions, and effects, leading to improved performance. We evaluate our model on two robotic datasets, one of which we collected for this paper, containing 406 successful and 138 failed teleoperated demonstrations of a Franka Emika Panda robot performing tasks like pouring and cleaning the counter. We show in our experiments that ConditionNET outperforms all baselines on both anomaly detection and phase prediction tasks. Furthermore, we implement an action monitoring system on a real robot to demonstrate the practical applicability of the learned preconditions and effects. Our results highlight the potential of ConditionNET for enhancing the reliability and adaptability of robots in real-world environments. The data is available on the project website: https://dsliwowski1.github.io/ConditionNET_page., Comment: 9 pages, 5 figures, 3 tables
- Published
- 2025
199. Bayesian Spatiotemporal Nonstationary Model Quantifies Robust Increases in Daily Extreme Rainfall Across the Western Gulf Coast
- Author
-
Lu, Yuchen, Lee, Ben Seiyon, and Doss-Gollin, James
- Subjects
Statistics - Applications
- Abstract
Precipitation exceedance probabilities are widely used in engineering design, risk assessment, and floodplain management. While common approaches like NOAA Atlas 14 assume that extreme precipitation characteristics are stationary over time, this assumption may underestimate current and future hazards due to anthropogenic climate change. However, the incorporation of nonstationarity in the statistical modeling of extreme precipitation has faced practical challenges that have restricted its applications. In particular, random sampling variability challenges the reliable estimation of trends and parameters, especially when observational records are limited. To address this methodological gap, we propose the Spatially Varying Covariates Model, a hierarchical Bayesian spatial framework that integrates nonstationarity and regionalization for robust frequency analysis of extreme precipitation. This model draws from extreme value theory, spatial statistics, and Bayesian statistics, and is validated through cross-validation and multiple performance metrics. Applying this framework to a case study of daily rainfall in the Western Gulf Coast, we identify robustly increasing trends in extreme precipitation intensity and variability throughout the study area, with notable spatial heterogeneity. This flexible model accommodates stations with varying observation records, yields smooth return level estimates, and can be straightforwardly adapted to the analysis of precipitation frequencies at different durations and for other regions.
- Published
- 2025
200. Licensing Open Government Data
- Author
-
Lee, Jyh-An
- Subjects
Computer Science - Computers and Society
- Abstract
This article focuses on the legal issues associated with open government data licenses. This study compares current open data licenses and argues that licensing terms reflect policy considerations, which are quite different from those contemplated in business transactions or shared in typical commons communities. This article investigates the ambiguous legal status of data together with the new wave of open government data, which concerns some fundamental intellectual property (IP) questions not covered by, or analyzed in depth in, the current literature. Moreover, this study suggests that governments should choose or adapt open data licenses according to their own IP regimes. In the end, this article argues that the design or choice of an open government data license forms an important element of information policy; governments, therefore, should make this decision in accordance with their policy goals and in compliance with their own jurisdictions' IP laws.
- Published
- 2025