2,415 results on '"Piot, P"'
Search Results
2. Preference Optimization as Probabilistic Inference
- Author
-
Abdolmaleki, Abbas, Piot, Bilal, Shahriari, Bobak, Springenberg, Jost Tobias, Hertweck, Tim, Joshi, Rishabh, Oh, Junhyuk, Bloesch, Michael, Lampe, Thomas, Heess, Nicolas, Buchli, Jonas, and Riedmiller, Martin
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) are available. In contrast, we propose a method that can leverage unpaired preferred or dis-preferred examples, and works even when only one type of feedback (positive or negative) is available. This flexibility allows us to apply it in scenarios with varying forms of feedback and models, including training generative language models based on human feedback as well as training policies for sequential decision-making problems, where learned (value) functions are available. Our approach builds upon the probabilistic framework introduced in (Dayan and Hinton, 1997), which proposes to use expectation-maximization (EM) to directly optimize the probability of preferred outcomes (as opposed to classic expected reward maximization). To obtain a practical algorithm, we identify and address a key limitation in current EM-based methods: when applied to preference optimization, they solely maximize the likelihood of preferred examples, while neglecting dis-preferred samples. We show how one can extend EM algorithms to explicitly incorporate dis-preferred outcomes, leading to a novel, theoretically grounded, preference optimization algorithm that offers an intuitive and versatile way to learn from both positive and negative feedback.
- Published
- 2024
3. Decoding Hate: Exploring Language Models' Reactions to Hate Speech
- Author
-
Piot, Paloma and Parapar, Javier
- Subjects
Computer Science - Computation and Language
- Abstract
Hate speech is a harmful form of online expression, often manifesting as derogatory posts. It is a significant risk in digital environments. With the rise of Large Language Models (LLMs), there is concern about their potential to replicate hate speech patterns, given their training on vast amounts of unmoderated internet data. Understanding how LLMs respond to hate speech is crucial for their responsible deployment. However, research on the behaviour of LLMs towards hate speech has been comparatively limited. This paper investigates the reactions of seven state-of-the-art LLMs (LLaMA 2, Vicuna, LLaMA 3, Mistral, GPT-3.5, GPT-4, and Gemini Pro) to hate speech. Through qualitative analysis, we aim to reveal the spectrum of responses these models produce, highlighting their capacity to handle hate speech inputs. We also discuss strategies to mitigate hate speech generation by LLMs, particularly through fine-tuning and guideline guardrailing. Finally, we explore the models' responses to hate speech framed in politically correct language.
- Published
- 2024
4. RRM: Robust Reward Model Training Mitigates Reward Hacking
- Author
-
Liu, Tianqi, Xiong, Wei, Ren, Jie, Chen, Lichang, Wu, Junru, Joshi, Rishabh, Gao, Yang, Shen, Jiaming, Qin, Zhen, Yu, Tianhe, Sohn, Daniel, Makarova, Anastasiia, Liu, Jeremiah, Liu, Yuan, Piot, Bilal, Ittycheriah, Abe, Kumar, Aviral, and Saleh, Mohammad
- Subjects
Computer Science - Computation and Language
- Abstract
Reward models (RMs) play a pivotal role in aligning large language models (LLMs) with human preferences. However, traditional RM training, which relies on response pairs tied to specific prompts, struggles to disentangle prompt-driven preferences from prompt-independent artifacts, such as response length and format. In this work, we expose a fundamental limitation of current RM training methods, where RMs fail to effectively distinguish between contextual signals and irrelevant artifacts when determining preferences. To address this, we introduce a causal framework that learns preferences independent of these artifacts and propose a novel data augmentation technique designed to eliminate them. Extensive experiments show that our approach successfully filters out undesirable artifacts, yielding a more robust reward model (RRM). Our RRM improves the performance of a pairwise reward model trained on Gemma-2-9b-it, on RewardBench, increasing accuracy from 80.61% to 84.15%. Additionally, we train two DPO policies using both the RM and RRM, demonstrating that the RRM significantly enhances DPO-aligned policies, improving MT-Bench scores from 7.27 to 8.31 and length-controlled win-rates in AlpacaEval-2 from 33.46% to 52.49%.
- Published
- 2024
5. Building Math Agents with Multi-Turn Iterative Preference Learning
- Author
-
Xiong, Wei, Shi, Chengshuai, Shen, Jiaming, Rosenberg, Aviv, Qin, Zhen, Calandriello, Daniele, Khalman, Misha, Joshi, Rishabh, Piot, Bilal, Saleh, Mohammad, Jin, Chi, Zhang, Tong, and Liu, Tianqi
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning. While current methods focus on synthetic data generation and Supervised Fine-Tuning (SFT), this paper studies the complementary direct preference learning approach to further improve model performance. However, existing direct preference learning algorithms are originally designed for the single-turn chat task, and do not fully address the complexities of multi-turn reasoning and external tool integration required for tool-integrated mathematical reasoning tasks. To fill in this gap, we introduce a multi-turn direct preference learning framework, tailored for this context, that leverages feedback from code interpreters and optimizes trajectory-level preferences. This framework includes multi-turn DPO and multi-turn KTO as specific implementations. The effectiveness of our framework is validated through training of various language models using an augmented prompt set from the GSM8K and MATH datasets. Our results demonstrate substantial improvements: a supervised fine-tuned Gemma-1.1-it-7B model's performance increased from 77.5% to 83.9% on GSM8K and from 46.1% to 51.2% on MATH. Similarly, a Gemma-2-it-9B model improved from 84.1% to 86.3% on GSM8K and from 51.0% to 54.5% on MATH.
Comment: A multi-turn direct preference learning framework for tool-integrated reasoning tasks
- Published
- 2024
6. Report on the Advanced Linear Collider Study Group (ALEGRO) Workshop 2024
- Author
-
Vieira, J., Cros, B., Muggli, P., Andriyash, I. A., Apsimon, O., Backhouse, M., Benedetti, C., Bulanov, S. S., Caldwell, A., Chen, Min, Cilento, V., Corde, S., D'Arcy, R., Diederichs, S., Ericson, E., Esarey, E., Farmer, J., Fedeli, L., Formenti, A., Foster, B., Garten, M., Geddes, C. G. R., Grismayer, T., Hogan, M. J., Hooker, S., Huebl, A., Jalas, S., Kirchen, M., Lehe, R., Leemans, W., Li, Boyuan, Lindström, C. A., Losito, R., Mitchell, C. E., Mori, W. B., Piot, P., Terzani, D., Thévenet, M., Turner, M., Vay, J. -L., Völker, D., Zhang, Jie, and Zhang, W.
- Subjects
Physics - Accelerator Physics
- Abstract
The workshop focused on the application of ANAs to particle physics keeping in mind the ultimate goal of a collider at the energy frontier (10 TeV, e⁺/e⁻, e⁻/e⁻, or γγ). The development of ANAs is conducted at universities and national laboratories worldwide. The community is thematically broad and diverse, in particular since lasers suitable for ANA research (multi-hundred-terawatt peak power, a few tens of femtosecond-long pulses) and acceleration of electrons to hundreds of mega electron volts to multi giga electron volts became commercially available. The community spans several continents (Europe, America, Asia), including more than 62 laboratories in more than 20 countries. It is among the missions of the ICFA-ANA panel to feature the amazing progress made with ANAs, to provide international coordination and to foster international collaborations towards a future HEP collider. The scope of this edition of the workshop was to discuss the recent progress and necessary steps towards realizing a linear collider for particle physics based on novel-accelerator technologies (laser or beam driven in plasma or structures). Updates on the relevant aspects of the European Strategy for Particle Physics (ESPP) Roadmap Process as well as of the P5 (in the US) were presented, and ample time was dedicated to discussions. The major outcome of the workshop is the decision for ALEGRO to coordinate efforts in Europe, in the US, and in Asia towards a pre-CDR for an ANA-based, 10 TeV CM collider. The goal of this coordination is to lead to a funding proposal to be submitted to both EU and EU/US funding agencies. This document presents a summary of the workshop, as seen by the co-chairs, as well as short 'one-pagers' written by the presenters at the workshop.
Comment: 72 pages
- Published
- 2024
7. Gemma 2: Improving Open Language Models at a Practical Size
- Author
-
Gemma Team, Riviere, Morgane, Pathak, Shreya, Sessa, Pier Giuseppe, Hardin, Cassidy, Bhupatiraju, Surya, Hussenot, Léonard, Mesnard, Thomas, Shahriari, Bobak, Ramé, Alexandre, Ferret, Johan, Liu, Peter, Tafti, Pouya, Friesen, Abe, Casbon, Michelle, Ramos, Sabela, Kumar, Ravin, Lan, Charline Le, Jerome, Sammy, Tsitsulin, Anton, Vieillard, Nino, Stanczyk, Piotr, Girgin, Sertan, Momchev, Nikola, Hoffman, Matt, Thakoor, Shantanu, Grill, Jean-Bastien, Neyshabur, Behnam, Bachem, Olivier, Walton, Alanna, Severyn, Aliaksei, Parrish, Alicia, Ahmad, Aliya, Hutchison, Allen, Abdagic, Alvin, Carl, Amanda, Shen, Amy, Brock, Andy, Coenen, Andy, Laforge, Anthony, Paterson, Antonia, Bastian, Ben, Piot, Bilal, Wu, Bo, Royal, Brandon, Chen, Charlie, Kumar, Chintu, Perry, Chris, Welty, Chris, Choquette-Choo, Christopher A., Sinopalnikov, Danila, Weinberger, David, Vijaykumar, Dimple, Rogozińska, Dominika, Herbison, Dustin, Bandy, Elisa, Wang, Emma, Noland, Eric, Moreira, Erica, Senter, Evan, Eltyshev, Evgenii, Visin, Francesco, Rasskin, Gabriel, Wei, Gary, Cameron, Glenn, Martins, Gus, Hashemi, Hadi, Klimczak-Plucińska, Hanna, Batra, Harleen, Dhand, Harsh, Nardini, Ivan, Mein, Jacinda, Zhou, Jack, Svensson, James, Stanway, Jeff, Chan, Jetha, Zhou, Jin Peng, Carrasqueira, Joana, Iljazi, Joana, Becker, Jocelyn, Fernandez, Joe, van Amersfoort, Joost, Gordon, Josh, Lipschultz, Josh, Newlan, Josh, Ji, Ju-yeong, Mohamed, Kareem, Badola, Kartikeya, Black, Kat, Millican, Katie, McDonell, Keelin, Nguyen, Kelvin, Sodhia, Kiranbir, Greene, Kish, Sjoesund, Lars Lowe, Usui, Lauren, Sifre, Laurent, Heuermann, Lena, Lago, Leticia, McNealus, Lilly, Soares, Livio Baldini, Kilpatrick, Logan, Dixon, Lucas, Martins, Luciano, Reid, Machel, Singh, Manvinder, Iverson, Mark, Görner, Martin, Velloso, Mat, Wirth, Mateo, Davidow, Matt, Miller, Matt, Rahtz, Matthew, Watson, Matthew, Risdal, Meg, Kazemi, Mehran, Moynihan, Michael, Zhang, Ming, Kahng, Minsuk, Park, Minwoo, Rahman, Mofi, Khatwani, Mohit, Dao, Natalie, Bardoliwalla, Nenshad, Devanathan, Nesh, Dumai, Neta, Chauhan, Nilay, Wahltinez, Oscar, Botarda, Pankil, Barnes, Parker, Barham, Paul, Michel, Paul, Jin, Pengchong, Georgiev, Petko, Culliton, Phil, Kuppala, Pradeep, Comanescu, Ramona, Merhej, Ramona, Jana, Reena, Rokni, Reza Ardeshir, Agarwal, Rishabh, Mullins, Ryan, Saadat, Samaneh, Carthy, Sara Mc, Cogan, Sarah, Perrin, Sarah, Arnold, Sébastien M. R., Krause, Sebastian, Dai, Shengyang, Garg, Shruti, Sheth, Shruti, Ronstrom, Sue, Chan, Susan, Jordan, Timothy, Yu, Ting, Eccles, Tom, Hennigan, Tom, Kocisky, Tomas, Doshi, Tulsee, Jain, Vihan, Yadav, Vikas, Meshram, Vilobh, Dharmadhikari, Vishal, Barkley, Warren, Wei, Wei, Ye, Wenming, Han, Woohyun, Kwon, Woosuk, Xu, Xiang, Shen, Zhe, Gong, Zhitao, Wei, Zichuan, Cotruta, Victor, Kirk, Phoebe, Rao, Anand, Giang, Minh, Peran, Ludovic, Warkentin, Tris, Collins, Eli, Barral, Joelle, Ghahramani, Zoubin, Hadsell, Raia, Sculley, D., Banks, Jeanine, Dragan, Anca, Petrov, Slav, Vinyals, Oriol, Dean, Jeff, Hassabis, Demis, Kavukcuoglu, Koray, Farabet, Clement, Buchatskaya, Elena, Borgeaud, Sebastian, Fiedel, Noah, Joulin, Armand, Kenealy, Kathleen, Dadashi, Robert, and Andreev, Alek
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.
- Published
- 2024
8. Offline Regularised Reinforcement Learning for Large Language Models Alignment
- Author
-
Richemond, Pierre Harvey, Tang, Yunhao, Guo, Daniel, Calandriello, Daniele, Azar, Mohammad Gheshlaghi, Rafailov, Rafael, Pires, Bernardo Avila, Tarassov, Eugene, Spangher, Lucas, Ellsworth, Will, Severyn, Aliaksei, Mallinson, Jonathan, Shani, Lior, Shamir, Gil, Joshi, Rishabh, Liu, Tianqi, Munos, Remi, and Piot, Bilal
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
The dominant framework for alignment of large language models (LLM), whether through reinforcement learning from human feedback or direct preference optimisation, is to learn from preference data. This involves building datasets where each element is a quadruplet composed of a prompt, two independent responses (completions of the prompt) and a human preference between the two independent responses, yielding a preferred and a dis-preferred response. Such data is typically scarce and expensive to collect. On the other hand, single-trajectory datasets, where each element is a triplet composed of a prompt, a response and human feedback, are naturally more abundant. The canonical element of such datasets is for instance an LLM's response to a user's prompt followed by a user's feedback such as a thumbs-up/down. Consequently, in this work, we propose DRO, or Direct Reward Optimisation, as a framework and associated algorithms that do not require pairwise preferences. DRO uses a simple mean-squared objective that can be implemented in various ways. We validate our findings empirically, using T5 encoder-decoder language models, and show DRO's performance over selected baselines such as Kahneman-Tversky Optimization (KTO). Thus, we confirm that DRO is a simple and empirically compelling method for single-trajectory policy optimisation.
- Published
- 2024
9. Probing Berry curvature in magnetic topological insulators through resonant infrared magnetic circular dichroism
- Author
-
Bac, Seul-Ki, Mardelé, Florian le, Wang, Jiashu, Ozerov, Mykhaylo, Yoshimura, Kota, Mohelský, Ivan, Sun, Xingdan, Piot, Benjamin, Wimmer, Stefan, Ney, Andreas, Orlova, Tatyana, Zhukovskyi, Maksym, Bauer, Günther, Springholz, Gunther, Liu, Xinyu, Orlita, Milan, Park, Kyungwha, Hsu, Yi-Ting, and Assaf, Badih A.
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
- Abstract
Probing the quantum geometry and topology in condensed matter systems has relied heavily on static electronic transport experiments in magnetic fields. Yet, contact-free optical measurements have rarely been explored. Magnetic circular dichroism (MCD), the nonreciprocal absorption of circularly polarized light, was theoretically linked to the quantized anomalous Hall effect in magnetic insulators and can identify the bands and momenta responsible for the underlying Berry Curvature (BC). Detecting BC through MCD faces two challenges: First, the relevant inter-band transitions usually generate MCD in the infrared (IR) range, requiring large samples with high quality. Second, while most magnetic materials are metallic, the relation between MCD and BC in metals remains unclear. Here, we report the observation of MCD in the IR range along with the anomalous Hall effect in thin film MnBi2Te4. Both phenomena emerge with a field-driven phase transition from an antiferromagnet to a canted ferromagnet. By theoretically relating the MCD to the anomalous Hall effect via BC in a metal, we show that this transition accompanies an abrupt onset of BC, signaling a topological phase transition from a topological insulator to a doped Chern insulator. Our density functional theory calculation suggests the MCD signal mainly originates from an optical transition at the Brillouin zone edge, hinting at a potential new source of BC away from the commonly considered Γ point. Our findings demonstrate a novel experimental approach for detecting BC and identifying the responsible bands and momenta, generally applicable to magnetic materials.
- Published
- 2024
10. Multi-turn Reinforcement Learning from Preference Human Feedback
- Author
-
Shani, Lior, Rosenberg, Aviv, Cassel, Asaf, Lang, Oran, Calandriello, Daniele, Zipori, Avital, Noga, Hila, Keller, Orgad, Piot, Bilal, Szpektor, Idan, Hassidim, Avinatan, Matias, Yossi, and Munos, Rémi
- Subjects
Computer Science - Machine Learning
- Abstract
Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the preferences at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to achieve a long-term goal. In this paper, we address this issue by developing novel methods for Reinforcement Learning (RL) from preference feedback between two full multi-turn conversations. In the tabular setting, we present a novel mirror-descent-based policy optimization algorithm for the general multi-turn preference-based RL problem, and prove its convergence to Nash equilibrium. To evaluate performance, we create a new environment, Education Dialogue, where a teacher agent guides a student in learning a random topic, and show that a deep RL variant of our algorithm outperforms RLHF baselines. Finally, we show that in an environment with explicit rewards, our algorithm recovers the same performance as a reward-based RL baseline, despite relying solely on a weaker preference signal.
- Published
- 2024
11. Human Alignment of Large Language Models through Online Preference Optimisation
- Author
-
Calandriello, Daniele, Guo, Daniel, Munos, Remi, Rowland, Mark, Tang, Yunhao, Pires, Bernardo Avila, Richemond, Pierre Harvey, Lan, Charline Le, Valko, Michal, Liu, Tianqi, Joshi, Rishabh, Zheng, Zeyu, and Piot, Bilal
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
- Abstract
Ensuring alignment of language models' outputs with human preferences is critical to guarantee a useful, safe, and pleasant user experience. Thus, human alignment has been extensively studied recently and several methods such as Reinforcement Learning from Human Feedback (RLHF), Direct Policy Optimisation (DPO) and Sequence Likelihood Calibration (SLiC) have emerged. In this paper, our contribution is two-fold. First, we show the equivalence between two recent alignment methods, namely Identity Policy Optimisation (IPO) and Nash Mirror Descent (Nash-MD). Second, we introduce a generalisation of IPO, named IPO-MD, that leverages the regularised sampling approach proposed by Nash-MD. This equivalence may seem surprising at first sight, since IPO is an offline method whereas Nash-MD is an online method using a preference model. However, this equivalence can be proven when we consider the online version of IPO, that is when both generations are sampled by the online policy and annotated by a trained preference model. Optimising the IPO loss with such a stream of data becomes then equivalent to finding the Nash equilibrium of the preference model through self-play. Building on this equivalence, we introduce the IPO-MD algorithm that generates data with a mixture policy (between the online and reference policy) similarly as the general Nash-MD algorithm. We compare online-IPO and IPO-MD to different online versions of existing losses on preference data such as DPO and SLiC on a summarisation task.
- Published
- 2024
12. Extreme fieldwork: flame-sealed capillaries versus frozen serums to estimate body composition of southern elephant seals (Mirounga leonina) using isotopic dilution
- Author
-
Charlanne, Laura M., Zahariev, Alexandre, Guinet, Christophe, Piot, Erwan, Badaut, Jérôme, Gilbert, Caroline, Ancel, André, and Bergouignan, Audrey
- Published
- 2024
13. Four-Dimensional Phase-Space Reconstruction of Flat and Magnetized Beams Using Neural Networks and Differentiable Simulations
- Author
-
Kim, Seongyeol, Gonzalez-Aguilera, Juan Pablo, Piot, Philippe, Chen, Gongxiaohui, Doran, Scott, Kim, Young-Kee, Liu, Wanming, Whiteford, Charles, Wisniewski, Eric, Edelen, Auralee, Roussel, Ryan, and Power, John
- Subjects
Physics - Accelerator Physics
- Abstract
Beams with cross-plane coupling or extreme asymmetries between the two transverse phase spaces are often encountered in particle accelerators. Flat beams with large transverse-emittance ratios are critical for future linear colliders. Similarly, magnetized beams with significant cross-plane coupling are expected to enhance the performance of electron cooling in hadron beams. Preparing these beams requires precise control and characterization of the four-dimensional transverse phase space. In this study, we employ generative phase space reconstruction (GPSR) techniques to rapidly characterize magnetized and flat-beam phase-space distributions using a conventional quadrupole-scan method. The reconstruction technique is experimentally demonstrated on an electron beam produced at the Argonne Wakefield Accelerator and successfully benchmarked against conventional diagnostics techniques. Specifically, we show that predicted beam parameters from the reconstructed phase-space distributions (e.g., magnetization and flat-beam emittances) are in excellent agreement with those measured from the conventional diagnostic methods.
- Published
- 2024
14. Generalized Preference Optimization: A Unified Approach to Offline Alignment
- Author
-
Tang, Yunhao, Guo, Zhaohan Daniel, Zheng, Zeyu, Calandriello, Daniele, Munos, Rémi, Rowland, Mark, Richemond, Pierre Harvey, Valko, Michal, Pires, Bernardo Ávila, and Piot, Bilal
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Offline preference optimization allows fine-tuning large models directly from offline data, and has proved effective in recent alignment practices. We propose generalized preference optimization (GPO), a family of offline losses parameterized by a general class of convex functions. GPO enables a unified view over preference optimization, encompassing existing algorithms such as DPO, IPO and SLiC as special cases, while naturally introducing new variants. The GPO framework also sheds light on how offline algorithms enforce regularization, through the design of the convex function that defines the loss. Our analysis and experiments reveal the connections and subtle differences between the offline regularization and the KL divergence regularization intended by the canonical RLHF formulation. In a controlled setting akin to Gao et al. (2023), we also show that different GPO variants achieve similar trade-offs between regularization and performance, though the optimal values of the hyper-parameters might differ, as predicted by theory. In all, our results present new algorithmic toolkits and empirical insights to alignment practitioners.
Comment: Accepted at ICML 2023 main conference
- Published
- 2024
15. Direct Language Model Alignment from Online AI Feedback
- Author
-
Guo, Shangmin, Zhang, Biao, Liu, Tianlin, Liu, Tianqi, Khalman, Misha, Llinares, Felipe, Rame, Alexandre, Mesnard, Thomas, Zhao, Yao, Piot, Bilal, Ferret, Johan, and Blondel, Mathieu
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Human-Computer Interaction
- Abstract
Direct alignment from preferences (DAP) methods, such as DPO, have recently emerged as efficient alternatives to reinforcement learning from human feedback (RLHF), that do not require a separate reward model. However, the preference datasets used in DAP methods are usually collected ahead of training and never updated, thus the feedback is purely offline. Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy. In this study, we posit that online feedback is key and improves DAP methods. Our method, online AI feedback (OAIF), uses an LLM as annotator: on each training iteration, we sample two responses from the current model and prompt the LLM annotator to choose which one is preferred, thus providing online feedback. Despite its simplicity, we demonstrate via human evaluation in several tasks that OAIF outperforms both offline DAP and RLHF methods. We further show that the feedback leveraged in OAIF is easily controllable, via instruction prompts to the LLM annotator.
Comment: 18 pages, 9 figures, 4 tables
- Published
- 2024
16. MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection
- Author
-
Piot, Paloma, Martín-Rodilla, Patricia, and Parapar, Javier
- Subjects
Computer Science - Computation and Language, Computer Science - Social and Information Networks
- Abstract
Hate speech represents a pervasive and detrimental form of online discourse, often manifested through an array of slurs, from hateful tweets to defamatory posts. As such speech proliferates, it connects people globally and poses significant social, psychological, and occasionally physical threats to targeted individuals and communities. Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training. To unify efforts, our study addresses the critical need for a comprehensive meta-collection, advocating for an extensive dataset to help counteract this problem effectively. We scrutinized over 60 datasets, selectively integrating the pertinent ones into MetaHate. This paper offers a detailed examination of existing collections, highlighting their strengths and limitations. Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models. These enhanced models are essential for effectively combating the dynamic and complex nature of hate speech in the digital realm.
- Published
- 2024
17. Factors Enabling Delocalized Charge-Carriers in Pnictogen-Based Solar Absorbers: In-depth Investigation into CuSbSe2
- Author
-
Fu, Yuchen, Lohan, Hugh, Righetto, Marcello, Huang, Yi-Teng, Kavanagh, Seán R., Cho, Chang-Woo, Zelewski, Szymon J., Woo, Young Won, Demetriou, Harry, McLachlan, Martyn A., Heutz, Sandrine, Piot, Benjamin A., Scanlon, David O., Rao, Akshay, Herz, Laura M., Walsh, Aron, and Hoye, Robert L. Z.
- Subjects
Condensed Matter - Materials Science
- Abstract
Inorganic semiconductors based on heavy pnictogen cations (Sb3+ and Bi3+) have gained significant attention as potential nontoxic and stable alternatives to lead-halide perovskites for solar cell applications. A limitation of these novel materials, which is being increasingly commonly found, is carrier localization, which substantially reduces mobilities and diffusion lengths. Herein, the layered příbramite CuSbSe2 is investigated and discovered to have delocalized free carriers, as shown through optical pump terahertz probe spectroscopy and temperature-dependent mobility measurements. Using a combination of theory and experiment, it is found that the underlying factors are: 1) weak coupling to acoustic phonons due to low deformation potentials, as lattice distortions are primarily accommodated through rigid inter-layer movement rather than straining inter-atomic bonds, and 2) weak coupling to optical phonons due to the ionic contributions to the dielectric constant being low compared to electronic contributions. This work provides important insights into how pnictogen-based semiconductors avoiding carrier localization could be identified.
Comment: 47 pages, 4 figures
- Published
- 2024
18. Electronic band structure of Sb2Te3
- Author
-
Mohelsky, I., Wyzula, J., Mardele, F. Le, Abadizaman, F., Caha, O., Dubroka, A., Sun, X. D., Cho, C. W., Piot, B. A., Tanzim, M. F., Aguilera, I., Bauer, G., Springholz, G., and Orlita, M.
- Subjects
Condensed Matter - Materials Science
- Abstract
Here we report on Landau level spectroscopy of an epitaxially grown thin film of the topological insulator Sb2Te3, complemented by ellipsometry and magneto-transport measurements. The observed response suggests that Sb2Te3 is a direct-gap semiconductor with the fundamental band gap located at the Γ point, or along the trigonal axis, and its width reaches Eg = 190 meV at low temperatures. Our data also indicate the presence of other low-energy extrema with a higher multiplicity in both the conduction and valence bands. The conclusions based on our experimental data are confronted with and to a great extent corroborated by the electronic band structure calculated using the GW method.
Comment: 11 pages, 8 figures, to be published in Phys. Rev. B
- Published
- 2023
19. Nash Learning from Human Feedback
- Author
-
Munos, Rémi, Valko, Michal, Calandriello, Daniele, Azar, Mohammad Gheshlaghi, Rowland, Mark, Guo, Zhaohan Daniel, Tang, Yunhao, Geist, Matthieu, Mesnard, Thomas, Michi, Andrea, Selvi, Marco, Girgin, Sertan, Momchev, Nikola, Bachem, Olivier, Mankowitz, Daniel J., Precup, Doina, and Piot, Bilal
- Subjects
Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning, Computer Science - Multiagent Systems
- Abstract
Reinforcement learning from human feedback (RLHF) has emerged as the main paradigm for aligning large language models (LLMs) with human preferences. Typically, RLHF involves the initial step of learning a reward model from human feedback, often expressed as preferences between pairs of text generations produced by a pre-trained LLM. Subsequently, the LLM's policy is fine-tuned by optimizing it to maximize the reward model through a reinforcement learning algorithm. However, an inherent limitation of current reward models is their inability to fully represent the richness of human preferences and their dependency on the sampling distribution. In this study, we introduce an alternative pipeline for the fine-tuning of LLMs using pairwise human feedback. Our approach entails the initial learning of a preference model, which is conditioned on two inputs given a prompt, followed by the pursuit of a policy that consistently generates responses preferred over those generated by any competing policy, thus defining the Nash equilibrium of this preference model. We term this approach Nash learning from human feedback (NLHF). In the context of a tabular policy representation, we present a novel algorithmic solution, Nash-MD, founded on the principles of mirror descent. This algorithm produces a sequence of policies, with the last iteration converging to the regularized Nash equilibrium. Additionally, we explore parametric representations of policies and introduce gradient descent algorithms for deep-learning architectures. To demonstrate the effectiveness of our approach, we present experimental results involving the fine-tuning of a LLM for a text summarization task. We believe NLHF offers a compelling avenue for preference learning and policy optimization with the potential of advancing the field of aligning LLMs with human preferences.
- Published
- 2023
20. A General Theoretical Paradigm to Understand Learning from Human Preferences
- Author
-
Azar, Mohammad Gheshlaghi, Rowland, Mark, Piot, Bilal, Guo, Daniel, Calandriello, Daniele, Valko, Michal, and Munos, Rémi
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
The prevalent deployment of learning from human preferences through reinforcement learning (RLHF) relies on two important approximations: the first assumes that pairwise preferences can be substituted with pointwise rewards. The second assumes that a reward model trained on these pointwise rewards can generalize from collected data to out-of-distribution data sampled by the policy. Recently, Direct Preference Optimisation (DPO) has been proposed as an approach that bypasses the second approximation and learns a policy directly from collected data without the reward modelling stage. However, this method still heavily relies on the first approximation. In this paper we try to gain a deeper theoretical understanding of these practical algorithms. In particular we derive a new general objective called $\Psi$PO for learning from human preferences that is expressed in terms of pairwise preferences and therefore bypasses both approximations. This new general objective allows us to perform an in-depth analysis of the behavior of RLHF and DPO (as special cases of $\Psi$PO) and to identify their potential pitfalls. We then consider another special case for $\Psi$PO by setting $\Psi$ simply to Identity, for which we can derive an efficient optimisation procedure, prove performance guarantees and demonstrate its empirical superiority to DPO on some illustrative examples.
- Published
- 2023
21. Magnon gap excitations in van der Waals antiferromagnet MnPSe$_3$
- Author
-
Jana, Dipankar, Vaclavkova, D., Mohelsky, I., Kapuscinski, P., Cho, C. W., Breslavetz, I., Białek, M., Ansermet, J. -Ph., Piot, B. A., Orlita, M., Faugeras, C., and Potemski, M.
- Subjects
Condensed Matter - Materials Science - Abstract
Magneto-spectroscopy methods have been employed to study the zero-wavevector magnon excitations in MnPSe$_3$. Experiments carried out as a function of temperature and the applied magnetic field show that two low-energy magnon branches of MnPSe$_3$ in its antiferromagnetic phase are gapped. The observation of two low-energy magnon gaps (at 14 and 0.7 cm$^{-1}$) implies that MnPSe$_3$ is a biaxial antiferromagnet. A relatively strong out-of-plane anisotropy imposes the spin alignment to be in-plane whereas the spin directionality within the plane is governed by a factor of 2.5 $\times$ 10$^{-3}$ weaker in-plane anisotropy. Comment: 9 pages, 3 figures
- Published
- 2023
22. Workshop on a future muon program at FNAL
- Author
-
Corrodi, S., Oksuzian, Y., Edmonds, A., Miller, J., Tran, H. N., Bonventre, R., Brown, D. N., Meot, F., Singh, V., Kolomensky, Y., Tripathy, S., Borrel, L., Bub, M., Echenard, B., Hitlin, D. G., Jafree, H., Middleton, S., Plestid, R., Porter, F. C., Zhu, R. Y., Bottura, L., Pinsard, E., Teixeira, A. M., Carelli, C., Ambrose, D., Badgley, K., Bautista, G. D., Bernstein, R. H., Boi, S., Crnkovic, J., Eldred, J., Gaponenko, A., Johnstone, C., Kiburg, B., Kutschke, R., Lynch, K., Mukherjee, A., Neuffer, D., Pellemoine, F., Pronskikh, V., Rakness, G., Tang, J., Tschirhart, R., Yucel, M., Zettlemoyer, J., Simons, B., Redigolo, D., Diociaiuti, E., Giovannella, S., Miscetti, S., Sarra, I., Muller, S. E., Ootani, W., Yucel, E. B., Kaplan, D. M., Phillips, T. J., Pasternak, J., Palo, D., Davydov, Y., Brown, D., Banerjee, S., Kawall, D., Hartwig, Z., Davidson, S., Abrams, R., Kampa, C., Mackenzie, M., Schmitt, M., Piot, P., Lee, Y. J., Morozov, V., Sato, A., Di Falco, S., Gioiosa, A., Morescalchi, L., Papa, A., Hedges, M. T., Renga, F., Lagrange, J. -B., Rogers, C., Wilcox, D., Petrov, A., Zhao, S., Dukes, E. C., Erlich, R., Group, C., Heeck, J., Pezzullo, G., Nguyen, T., and Popp, J. L.
- Subjects
High Energy Physics - Experiment ,Physics - Accelerator Physics - Abstract
The Snowmass report on rare processes and precision measurements recommended Mu2e-II and a next generation muon facility at Fermilab (Advanced Muon Facility) as priorities for the frontier. The Workshop on a future muon program at FNAL was held in March 2023 to discuss design studies for Mu2e-II and organizing efforts for the next generation muon facility, and to identify synergies with other efforts (e.g., muon collider). Topics included high-power targetry, status of R&D for Mu2e-II, development of compressor rings, FFA and concepts for muon experiments (conversion, decays, muonium and other opportunities) at AMF. This document summarizes the workshop discussions with a focus on future R&D tasks needed to realize these concepts. Comment: 68 pages, 36 figures
- Published
- 2023
23. Characterization of cancer-associated adipocytes by Raman spectroscopy and trajectory inference
- Author
-
Nicolas Goffin, Emilie Buache, Nathalie Lalun, Marion Fernandes, Ines Miguel, Catherine Muller, Charlotte Vaysse, Landry Blanc, Cyril Gobinet, and Olivier Piot
- Subjects
Cancer-associated adipocytes ,Raman spectroscopy ,Trajectory inference ,Breast cancer ,Applied optics. Photonics ,TA1501-1820 - Abstract
Cancer-associated adipocytes (CAAs) have emerged as pivotal players in various cancers, particularly breast cancer, significantly influencing their progression and therapy resistance. Understanding the adipocyte/cancer cell crosstalk is crucial for effective treatment strategies. Raman spectroscopy, a label-free optical technique, offers potential for characterizing biological samples by providing chemical-specific information. In this study, we used Raman spectroscopy and Trajectory Inference methods, specifically the Partition-based graph abstraction algorithm, to investigate the interactions between 3T3-L1 differentiated adipocytes and MDA-MB-231 breast cancer cells in a 2D co-culture model. We demonstrate the existence of subpopulations of adipocytes and the molecular changes associated with the CAA phenotype. This work contributes to understanding the role of CAAs in breast cancer progression and may guide the development of targeted therapies disrupting this interaction.
- Published
- 2024
24. Search for a neutron dark decay in $^6$He
- Author
-
Joubioux, M. Le, Savajols, H., Mittig, W., Fléchard, X., Hayen, L., Penionzhkevich, Yu. E., Ackermann, D., Borcea, C., Caceres, L., Delahaye, P., Didierjean, F., Franchoo, S., Grillet, A., Jacquot, B., Lebois, M., Ledoux, X., Lecesne, N., Liénard, E., Lukyanov, S., Naviliat-Cuncic, O., Piot, J., Singh, A., Smirnov, V., Stodel, C., Testov, D., Thomas, J. C., and Verney, D.
- Subjects
Nuclear Experiment - Abstract
Neutron dark decays have been suggested as a solution to the discrepancy between bottle and beam experiments, providing a dark matter candidate that can be searched for in halo nuclei. The free neutron in the final state following the decay of $^6$He into $^4$He $+$ $n$ $+$ $\chi$ provides an exceptionally clean detection signature when combined with a high efficiency neutron detector. Using a high-intensity $^6$He$^+$ beam at GANIL, a search for a coincident neutron signal resulted in an upper limit on a dark decay branching ratio of Br$_\chi \leq 4.0\times10^{-10}$ (95\% C.L.). Using the dark neutron decay model proposed originally by Fornal and Grinstein, we translate this into an upper bound on a dark neutron branching ratio of $\mathcal{O}(10^{-5})$, improving over global constraints by one to several orders of magnitude depending on $m_\chi$.
- Published
- 2023
25. Observation of Skewed Electromagnetic Wakefields in an Asymmetric Structure Driven by Flat Electron Bunches
- Author
-
Lynn, Walter, Xu, Tianzhe, Andonian, Gerard, Doran, Scott, Ha, Gwanghui, Majernik, Nathan, Piot, Philippe, Power, John, Rosenzweig, James, Whiteford, Charles, and Wisniewski, Eric
- Subjects
Physics - Accelerator Physics - Abstract
Relativistic charged-particle beams which generate intense longitudinal fields in accelerating structures also inherently couple to transverse modes. The effects of this coupling may lead to beam break-up instability, and thus must be countered to preserve beam quality in applications such as linear colliders. Beams with highly asymmetric transverse sizes (flat-beams) have been shown to suppress the initial instability in slab-symmetric structures. However, as the coupling to transverse modes remains, this solution serves only to delay instability. In order to understand the hazards of transverse coupling in such a case, we describe here an experiment characterizing the transverse effects on a flat-beam, traversing near a planar dielectric lined structure. The measurements reveal the emergence of a previously unobserved skew-quadrupole-like interaction when the beam is canted transversely, which is not present when the flat-beam travels parallel to the dielectric surface. We deploy a multipole field fitting algorithm to reconstruct the projected transverse wakefields from the data. We generate the effective kick vector map using a simple two-particle theoretical model, with particle-in-cell simulations used to provide further insight for realistic particle distributions. Comment: Six pages, seven figures. Submitted to Physical Review
- Published
- 2023
26. Receptivity of swept-aerofoil flows to small amplitude wall roughness using resolvent analysis based on wall displacement
- Author
-
Kitzinger, Euryale, Sipp, Denis, Marquet, Olivier, and Piot, Estelle
- Subjects
Physics - Fluid Dynamics - Abstract
The receptivity of a laminar boundary layer flow to small amplitude wall roughness is investigated on an ONERA-D swept aerofoil by introducing a dedicated resolvent operator based on linearised small amplitude wall displacements. The singular value decomposition of this operator for a given spanwise wavenumber provides optimal wall roughness and flow responses that maximise an input-output gain. At the most receptive spanwise wavenumber, the optimal response is a cross-flow mode associated with an optimal roughness located close to the attachment-line and presenting a wavy shape with a wavevector nearly orthogonal to the external streamlines. The method therefore allows direct identification of the location and structure (chordwise and spanwise wavenumbers) of the most receptive roughness. For various given wall roughness shapes and locations (periodic or compact in the chordwise and/or spanwise directions), an approximation of the response based on the dominant optimal response is shown to accurately match the total response downstream of the roughness. The method therefore allows a straightforward computation of the response of the flow to any given small amplitude roughness.
- Published
- 2023
27. Single-shot Transverse Wakefield Mapping with a Hollow Electron Beam
- Author
-
Halavanau, A., Piot, P., and Baturin, S. S.
- Subjects
Physics - Accelerator Physics - Abstract
Beam-driven wakefield accelerators are foreseen to enable compact accelerator-based light sources and play a critical role in future linear-collider concepts. This class of wakefield acceleration has been extensively studied over the last four decades with a focus on demonstrating its ability to support high-accelerating gradient and, most recently, enhanced transformer ratios. Yet, the associated detrimental transverse wakefields have not been examined in as much detail due to the limited diagnostics available. In this paper, we introduce a beam-based single-shot transverse-wakefield measurement technique. The approach employs a witness "hollow" electron beam to probe the wakefields generated by a drive bunch. We show how the transverse distortions of the hollow probe provide a direct measurement of the wakefield distribution within the area circumscribed by the probe. The ability to directly measure the full structure of the transverse wakefield could help to develop mitigation schemes and ultimately suppress the adverse beam-break-up instabilities. We discuss a practical implementation of the method and demonstrate its performance with the help of start-to-end simulations. Comment: 13 pages, 15 figures
- Published
- 2023
28. Numerical Modeling of a Proof-of-Principle Experiment on Optical Stochastic Cooling at the IOTA Electron Storage Ring
- Author
-
Dick, Austin, Borland, Michael, Jarvis, Jonathan, Lebedev, Valeri, Piot, Philippe, Romanov, Aleksandr, and Wallbank, Michael
- Subjects
Physics - Accelerator Physics - Abstract
Cooling of beams circulating in storage rings is critical for many applications including particle colliders and synchrotron light sources. A method enabling unprecedented beam-cooling rates, optical stochastic cooling (OSC), was recently demonstrated in the IOTA electron storage ring at Fermilab. This paper describes the numerical implementation of the OSC process in the particle-tracking program ELEGANT and discusses the validation of the developed model with available experimental data. The model is also employed to highlight some features associated with different modes of operation of OSC. The developed simulation tool should be valuable in guiding future configurations of optical stochastic cooling and, more broadly, modeling self-field-based beam manipulations. Comment: 15 pages, 10 figures
- Published
- 2023
29. Single-shot, transverse self-wakefield reconstruction from screen images
- Author
-
Majernik, N., Lynn, W., Andonian, G., Xu, T., Piot, P., and Rosenzweig, J. B.
- Subjects
Physics - Accelerator Physics - Abstract
A single-shot method to reconstruct the transverse self-wakefields acting on a beam, based only on screen images, is introduced. By employing numerical optimization with certain approximations, a relatively high-dimensional parameter space is efficiently explored to determine the multipole components of the transverse-wakefield topology up to desired order. The reconstruction technique complements simulations, which are able to directly describe the wakefield composition based on experimental conditions. The technique is applied to representative simulation results as a benchmark, and also to experimental data on wakefield observations driven in dielectric-lined structures. Comment: 10 pages, 8 figures
- Published
- 2023
30. Magnon gap excitations in van der Waals antiferromagnet MnPSe3
- Author
-
Jana, Dipankar, Vaclavkova, D., Mohelsky, I., Kapuscinski, P., Cho, C. W., Breslavetz, I., Białek, M., Ansermet, J.-Ph., Piot, B. A., Orlita, M., Faugeras, C., and Potemski, M.
- Published
- 2024
31. Pathways and identity: toward qualitative research careers in child and adolescent psychiatry
- Author
-
Martin, Andrés, DiGiovanni, Madeline, Acquaye, Amber, Ponticiello, Matthew, Chou, Débora Tseng, Neto, Emilio Abelama, Michel, Alexandre, Sibeoni, Jordan, Piot, Marie-Aude, Spodenkiewicz, Michel, and Benoit, Laelia
- Published
- 2024
32. Breaking the fast: first report of dives and ingestion events in molting southern elephant seals
- Author
-
Charlanne, Laura M., Chaise, Laureline, Sornette, Damien, Piot, Erwan, McCafferty, Dominic J., Ancel, André, and Gilbert, Caroline
- Published
- 2024
33. Distinct ontogenetic lineages dictate cDC2 heterogeneity
- Author
-
Minutti, Carlos M., Piot, Cécile, Pereira da Costa, Mariana, Chakravarty, Probir, Rogers, Neil, Huerga Encabo, Hector, Cardoso, Ana, Loong, Jane, Bessou, Gilles, Mionnet, Cyrille, Langhorne, Jean, Bonnet, Dominique, Dalod, Marc, Tomasello, Elena, and Reis e Sousa, Caetano
- Published
- 2024
34. Time-domain simulation of the acoustic nonlinear response of acoustic liners at high sound pressure level
- Author
-
Moufid, Ilyes, Roncen, Rémi, Matignon, Denis, and Piot, Estelle
- Published
- 2024
35. Unlocking the Power of Representations in Long-term Novelty-based Exploration
- Author
-
Saade, Alaa, Kapturowski, Steven, Calandriello, Daniele, Blundell, Charles, Sprechmann, Pablo, Sarra, Leopoldo, Groth, Oliver, Valko, Michal, and Piot, Bilal
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
We introduce Robust Exploration via Clustering-based Online Density Estimation (RECODE), a non-parametric method for novelty-based exploration that estimates visitation counts for clusters of states based on their similarity in a chosen embedding space. By adapting classical clustering to the nonstationary setting of Deep RL, RECODE can efficiently track state visitation counts over thousands of episodes. We further propose a novel generalization of the inverse dynamics loss, which leverages masked transformer architectures for multi-step prediction and which, in conjunction with RECODE, achieves a new state of the art in a suite of challenging 3D-exploration tasks in DM-Hard-8. RECODE also sets a new state of the art in hard exploration Atari games, and is the first agent to reach the end screen in "Pitfall!".
- Published
- 2023
36. Magneto-optical sensing of the pressure driven magnetic ground states in bulk CrSBr
- Author
-
Pawbake, A., Pelini, T., Mohelsky, I., Jana, D., Breslavetz, I., Cho, C. -W., Orlita, M., Potemski, M., Measson, M. -A., Wilson, N., Mosina, K., Soll, A., Sofer, Z., Piot, B. A., Zhitomirsky, M. E., and Faugeras, C.
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
Competition between exchange interactions and magnetocrystalline anisotropy may bring new magnetic states that are of great current interest. An applied hydrostatic pressure can further be used to tune their balance. In this work we investigate the magnetization process of a biaxial antiferromagnet in an external magnetic field applied along the easy axis. We find that the single metamagnetic transition of the Ising type observed in this material under ambient pressure transforms under hydrostatic pressure into two transitions, a first-order spin flop transition followed by a second order transition towards a polarized ferromagnetic state near saturation. This reversible tuning into a new magnetic phase is obtained in layered bulk CrSBr at low temperature by varying the interlayer distance using high hydrostatic pressure, which efficiently acts on the interlayer magnetic exchange, and is probed by magneto-optical spectroscopy. Comment: 3 figures, 6 pages. To appear in ACS NanoLetters
- Published
- 2023
37. Magnon gap excitations in van der Waals antiferromagnet MnPSe3
- Author
-
Dipankar Jana, D. Vaclavkova, I. Mohelsky, P. Kapuscinski, C. W. Cho, I. Breslavetz, M. Białek, J.-Ph. Ansermet, B. A. Piot, M. Orlita, C. Faugeras, and M. Potemski
- Subjects
Medicine ,Science - Abstract
Magneto-spectroscopy methods have been employed to study the zero-wavevector magnon excitations in MnPSe3. Experiments carried out as a function of temperature and the applied magnetic field show that two low-energy magnon branches of MnPSe3 in its antiferromagnetic phase are gapped. The observation of two low-energy magnon gaps (at 1.70 ± 0.05 meV and 0.09 ± 0.01 meV) implies that MnPSe3 is a biaxial antiferromagnet. A relatively strong out-of-plane anisotropy imposes the spin alignment to be in-plane whereas the spin directionality within the plane is governed by a factor of 2.5 × 10⁻³ weaker in-plane anisotropy.
- Published
- 2024
38. Cross-shell states in $^{15}$C: a test for p-sd interactions
- Author
-
Lois-Fuentes, J., Fernández-Domínguez, B., Pereira-López, X., Delaunay, F., Catford, W. N., Matta, A., Orr, N. A., Duguet, T., Otsuka, T., Somà, V., Sorlin, O., Suzuki, T., Achouri, N. L., Assié, M., Bailey, S., Bastin, B., Blumenfeld, Y., Borcea, R., Caamaño, M., Caceres, L., Clément, E., Corsi, A., Curtis, N., Deshayes, Q., Farget, F., Fisichella, M., de France, G., Franchoo, S., Freer, M., Gibelin, J., Gillibert, A., Grinyer, G. F., Hammache, F., Kamalou, O., Knapton, A., Kokalova, Tz., Lapoux, V., Crom, B. Le, Leblond, S., Marqués, F. M., Morfouace, P., Pancin, J., Perrot, L., Piot, J., Pollacco, E., Ramos, D., Regueira-Castro, D., Rodríguez-Tajes, C., Roger, T., Rotaru, F., Sénoville, M., de Séréville, N., Smith, R., Stanoiu, M., Stefan, I., Stodel, C., Suzuki, D., Thomas, J. C., Timofeyuk, N., Vandebrouck, M., Walshe, J., and Wheldon, C.
- Subjects
Nuclear Experiment ,Nuclear Theory - Abstract
The low-lying structure of $^{15}$C has been investigated via the neutron-removal $^{16}$C$(d,t)$ reaction. Along with bound neutron sd-shell hole states, unbound p-shell hole states have been firmly confirmed. The excitation energies and the deduced spectroscopic factors of the cross-shell states are an important measure of the $[(p)^{-1}(sd)^{2}]$ neutron configurations in $^{15}$C. Our results show a very good agreement with shell-model calculations using the SFO-tls interaction for $^{15}$C. However, a modification of the $p$-$sd$ and $sd$-$sd$ monopole terms was applied in order to reproduce the $N=9$ isotone $^{17}$O. In addition, the excitation energies and spectroscopic factors have been compared to the first calculations of $^{15}$C with the $ab~initio$ self-consistent Green's function method employing the NNLO$_{sat}$ interaction. The results show the sensitivity to the size of the $N=8$ shell gap and highlight the need to go beyond the current truncation scheme in the theory.
- Published
- 2023
39. The Edge of Orthogonality: A Simple View of What Makes BYOL Tick
- Author
-
Richemond, Pierre H., Tam, Allison, Tang, Yunhao, Strub, Florian, Piot, Bilal, and Hill, Felix
- Subjects
Computer Science - Machine Learning - Abstract
Self-predictive unsupervised learning methods such as BYOL or SimSiam have shown impressive results, and counter-intuitively, do not collapse to trivial representations. In this work, we aim at exploring the simplest possible mathematical arguments towards explaining the underlying mechanisms behind self-predictive unsupervised learning. We start with the observation that those methods crucially rely on the presence of a predictor network (and stop-gradient). With simple linear algebra, we show that when using a linear predictor, the optimal predictor is close to an orthogonal projection, and propose a general framework based on orthonormalization that makes it possible to interpret, and build intuition for, why BYOL works. In addition, this framework demonstrates the crucial role of the exponential moving average and stop-gradient operator in BYOL as an efficient orthonormalization mechanism. We use these insights to propose four new \emph{closed-form predictor} variants of BYOL to support our analysis. Our closed-form predictors outperform standard linear trainable predictor BYOL at $100$ and $300$ epochs (top-$1$ linear accuracy on ImageNet).
- Published
- 2023
40. High-resolution laser system for the S3-Low Energy Branch
- Author
-
Romans, Jekabs, Ajayakumar, Anjali, Authier, Martial, Boumard, Frederic, Caceres, Lucia, Cam, Jean-Francois, Claessens, Arno, Damoy, Samuel, Delahaye, Pierre, Desrues, Philippe, Dong, Wenling, Drouart, Antoine, Duchesne, Patricia, Ferrer, Rafael, Flechard, Xavier, Franchoo, Serge, Gangnant, Patrice, Geldhof, Sarina, de Groote, Ruben P., Lecesne, Nathalie, Leroy, Renan, Lory, Julien, Lutton, Franck, Manea, Vladimir, Merrer, Yvan, Moore, Iain, Ortiz-Cortes, Alejandro, Osmond, Benoit, Piot, Julien, Pochon, Olivier, Raeder, Sebastian, de Roubin, Antoine, Savajols, Herve, Sels, Simon, Studer, Dominik, Traykov, Emil, Uusitalo, Juha, Vandamme, Christophe, Vandebrouck, Marine, Bergh, Paul Van den, Van Duppen, Piet, and Wendt, Klaus
- Subjects
Nuclear Experiment ,Physics - Atomic Physics - Abstract
In this paper we present the first high-resolution laser spectroscopy results obtained at the GISELE laser laboratory of the GANIL-SPIRAL2 facility, in preparation for the first experiments with the S$^3$-Low Energy Branch. Studies of neutron-deficient radioactive isotopes of erbium and tin represent the first physics cases to be studied at S$^3$. The measured isotope-shift and hyperfine structure data are presented for stable isotopes of these elements. The erbium isotopes were studied using the $4f^{12}6s^2$ $^3H_6 \rightarrow 4f^{12}(^3 H)6s6p$ $J = 5$ atomic transition (415 nm) and the tin isotopes were studied by the $5s^25p^2 (^3P_0) \rightarrow 5s^25p6s (^3P_1)$ atomic transition (286.4 nm), and are used as a benchmark of the laser setup. Additionally, the tin isotopes were studied by the $5s^25p6s (^3P_1) \rightarrow 5s^25p6p (^3P_2)$ atomic transition (811.6 nm), for which new isotope-shift data was obtained and the corresponding field-shift $F_{812}$ and mass-shift $M_{812}$ factors are presented.
- Published
- 2022
41. Understanding Self-Predictive Learning for Reinforcement Learning
- Author
-
Tang, Yunhao, Guo, Zhaohan Daniel, Richemond, Pierre Harvey, Pires, Bernardo Ávila, Chandak, Yash, Munos, Rémi, Rowland, Mark, Azar, Mohammad Gheshlaghi, Lan, Charline Le, Lyle, Clare, György, András, Thakoor, Shantanu, Dabney, Will, Piot, Bilal, Calandriello, Daniele, and Valko, Michal
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite their recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful designs of the optimization dynamics are critical to learning meaningful representations. We identify that a faster paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing the representation collapse. Then in an idealized setup, we show that self-predictive learning dynamics carry out spectral decomposition on the state transition matrix, effectively capturing information of the transition dynamics. Building on the theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
- Published
- 2022
42. Mixing of surface and bulk electronic states at a graphite-hexagonal boron nitride interface
- Author
-
Mullan, Ciaran, Slizovskiy, Sergey, Yin, Jun, Wang, Ziwei, Yang, Qian, Xu, Shuigang, Yang, Yaping, Piot, Benjamin A., Hu, Sheng, Taniguchi, Takashi, Watanabe, Kenji, Novoselov, Kostya S., Geim, A. K., Fal'ko, Vladimir I., and Mishchenko, Artem
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Van der Waals assembly enables exquisite design of electronic states in two-dimensional (2D) materials, often by superimposing a long-wavelength periodic potential on a crystal lattice using moiré superlattices. Here we show that electronic states in three-dimensional (3D) crystals such as graphite can also be tuned by the superlattice potential arising at the interface with another crystal, namely, crystallographically aligned hexagonal boron nitride. Such alignment is found to result in a multitude of Lifshitz transitions and Brown-Zak oscillations for near-surface 2D states whereas, in high magnetic fields, fractal states of Hofstadter's butterfly extend deep into graphite's bulk. Our work shows an avenue to control 3D spectra by using the approach of 2D twistronics.
- Published
- 2022
43. Temperature dependence of the energy band gap in ZrTe$_5$: implications for the topological phase
- Author
-
Mohelsky, I., Wyzula, J., Piot, B. A., Gu, G. D., Li, Q., Akrap, A., and Orlita, M.
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
Using Landau level spectroscopy, we determine the temperature dependence of the energy band gap in zirconium pentatelluride (ZrTe$_5$). We find that the band gap reaches $E_g=(5 \pm 1)$ meV at low temperatures and increases monotonically when the temperature is raised. This implies that ZrTe$_5$ is a weak topological insulator, with non-inverted ordering of electronic bands in the center of the Brillouin zone. Our magneto-transport experiments performed in parallel show that the resistivity anomaly in ZrTe$_5$ is not connected with the temperature dependence of the band gap. Comment: 6 pages, 2 figures
- Published
- 2022
44. Microscopic parameters of the van der Waals CrSBr antiferromagnet from microwave absorption experiments
- Author
-
Cho, C. W., Pawbake, A., Aubergier, N., Barra, A. L., Mosina, K., Sofer, Z., Zhitomirsky, M. E., Faugeras, C., and Piot, B. A.
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
Microwave absorption experiments employing a phase-sensitive external resistive detection are performed for a topical van der Waals antiferromagnet CrSBr. The field dependence of two resonance modes is measured in an applied field parallel to the three principal crystallographic directions, revealing anisotropies and magnetic transitions in this material. To account for the observed results, we formulate a microscopic spin model with a bi-axial single-ion anisotropy and inter-plane exchange. Theoretical calculations give an excellent description of full magnon spectra enabling us to precisely determine microscopic interaction parameters for CrSBr. Comment: includes a supplementary information document
- Published
- 2022
45. Beam shaping using an ultra-high vacuum multileaf collimator and emittance exchange beamline
- Author
-
Majernik, N., Andonian, G., Lynn, W., Kim, S., Lorch, C., Roussel, R., Doran, S., Wisniewski, E., Whiteford, C., Piot, P., Power, J., and Rosenzweig, J. B.
- Subjects
Physics - Accelerator Physics - Abstract
We report the development of a multileaf collimator (MLC) for charged particle beams, based on independently actuated tungsten strips which can selectively scatter unwanted particles. The MLC is used in conjunction with an emittance exchange beamline to rapidly generate highly variable longitudinal bunch profiles. The developed MLC consists of 40 independent leaves that are 2 mm wide and can move up to 10 mm, and operates in an ultra high vacuum environment, enabled by novel features such as magnetically coupled actuation. An experiment at the Argonne Wakefield Accelerator, which previously used inflexible, laser-cut masks for beam shaping before an emittance exchange beamline, was conducted to test functionality. The experiment demonstrated myriad transverse mask silhouettes, as measured on a scintillator downstream of the MLC and the corresponding longitudinal profiles after emittance exchange, as measured using a transverse deflecting cavity. Rapidly changing between mask shapes enables expeditious execution of various experiments without the downtime associated with traditional methods. The many degrees of freedom of the MLC can enable optimization of experimental figures of merit using feed-forward control and advanced machine learning methods. Comment: 7 pages, 7 figures
- Published
- 2022
46. Psychiatric Clinical Training Across Borders: Developing Virtual Communities of Practice Through International Co-constructive Patient Simulation
- Author
-
Danieli, Polina Perlman, Hanson, Mark D., VanRiper, Lindy, van Hoof, Marie-José, Thomas, Isaiah, Sibeoni, Jordan, Raats, Pascal, Prins, Cecil, Porter, Sara, Piot, Marie-Aude, Nair, Bina, Mian, Irfan, Leung, Kitty, Hibbard, Kate, Billon, Gregoire, Benoit, Laelia, Baker, Jonathan D., Alleyne, Shirley, de Carvalho-Filho, Marco A., Amsalem, Doron, and Martin, Andrés
- Published
- 2024
47. Pathways and identity: toward qualitative research careers in child and adolescent psychiatry
- Author
-
Andrés Martin, Madeline DiGiovanni, Amber Acquaye, Matthew Ponticiello, Débora Tseng Chou, Emilio Abelama Neto, Alexandre Michel, Jordan Sibeoni, Marie-Aude Piot, Michel Spodenkiewicz, and Laelia Benoit
- Subjects
Pediatrics ,RJ1-570 ,Psychiatry ,RC435-571 - Abstract
Objective: Qualitative research methods are based on the analysis of words rather than numbers; they encourage self-reflection on the investigator’s part; they are attuned to social interaction and nuance; and they incorporate their subjects’ thoughts and feelings as primary sources. Despite appearing well suited for research in child and adolescent psychiatry (CAP), qualitative methods have had relatively minor uptake in the discipline. We conducted a qualitative study of CAPs involved in qualitative research to learn about these investigators’ lived experiences, and to identify modifiable factors to promote qualitative methods within the field of youth mental health.
Methods: We conducted individual, semi-structured, 1-hour-long interviews through Zoom. Using purposive sampling, we selected 23 participants drawn from the US (n = 12) and from France (n = 11), and equally divided in each country across seniority level. All participants were current or aspiring CAPs and had published at least one peer-reviewed qualitative article. Ten participants were women (44%). We recorded all interviews digitally and transcribed them for analysis. We coded the transcripts according to the principles of thematic analysis and approached data analysis, interpretation, and conceptualization informed by an interpersonal phenomenological analysis (IPA) framework.
Results: Through iterative thematic analysis we developed a conceptual model consisting of three domains: (1) Becoming a qualitativist: embracing a different way of knowing (in turn divided into the three themes of priming factors/personal fit; discovering qualitative research; and transitioning in); (2) Being a qualitativist: immersing oneself in a different kind of research (in turn divided into quality: doing qualitative research well; and community: mentors, mentees, and teams); and (3) Nurturing: toward a higher quality future in CAP (in turn divided into current state of qualitative methods in CAP; and advocating for qualitative methods in CAP). For each domain, we go on to propose specific strategies to enhance entry into qualitative careers and research in CAP: (1) Becoming: personalizing the investigator’s research focus; balancing inward and outward views; and leveraging practical advantages; (2) Being: seeking epistemological flexibility; moving beyond bibliometrics; and the potential and risks of mixing methods; and (3) Nurturing: invigorating a quality pipeline; and building communities.
Conclusions: We have identified factors that can support or impede entry into qualitative research among CAPs. Based on these modifiable findings, we propose possible solutions to enhance entry into qualitative methods in CAP (pathways), and to foster longer-term commitment to this type of research (identity).
- Published
- 2024
48. Experiments On A Conduction Cooled Superconducting Radio Frequency Cavity With Field Emission Cathode
- Author
-
Ji, Y., Dhuley, R. C., Edward, C., Thangaraj, J. C. T., Mihalcea, D., Piot, P., Mohsen, O., Salehinia, I., and Korampally, V.
- Subjects
Physics - Accelerator Physics - Abstract
To achieve Ampere-class electron beam accelerators, the pulse delivery rate needs to be much higher than the typical photoinjector repetition rate, which is of the order of a few kilohertz. We propose here an injector which can, in principle, generate electron bunches at the same rate as the operating RF frequency. A conduction-cooled superconducting radio frequency (SRF) cavity operating in the CW mode and housing a field emission element at its region of high axial electric field can be a viable method of generating high repetition-rate electron bunches. In this paper, we report the development of, and experiments on, a conduction-cooled Nb3Sn cavity with a niobium rod intended as a field emitter support. The initial experiments demonstrate a 0.4 MV/m average accelerating gradient, which is equivalent to a peak gradient of 3.2 MV/m. The measured RF cavity quality factor is $1.4 \times 10^{8}$, slightly above our goal. The achieved field gradient is limited by the relatively low input RF power and by the poor coupling between the external power supply and the RF cavity. With ideal coupling the field gradient can be as high as 0.6 MV/m, still below our goal of about 1 MV/m.
- Published
- 2022
49. Electron Sources for Accelerators
- Author
-
Filippetto, Daniele, Grames, Joe, Hernandez-Garcia, Carlos, Karkare, Siddharth, Piot, Philippe, Power, John, Sun, Yine, and Wang, Erdong
- Subjects
Physics - Accelerator Physics - Abstract
Electron sources are essential to an array of electron accelerators supporting research in high-energy physics and beyond. This report summarizes the "Snowmass 2021 Electron Source Workshop", which reviewed the current state-of-the-art research and identified some possible research directions.
- Published
- 2022
50. Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
- Author
-
Perolat, Julien, de Vylder, Bart, Hennes, Daniel, Tarassov, Eugene, Strub, Florian, de Boer, Vincent, Muller, Paul, Connor, Jerome T., Burch, Neil, Anthony, Thomas, McAleer, Stephen, Elie, Romuald, Cen, Sarah H., Wang, Zhe, Gruslys, Audrunas, Malysheva, Aleksandra, Khan, Mina, Ozair, Sherjil, Timbers, Finbarr, Pohlen, Toby, Eccles, Tom, Rowland, Mark, Lanctot, Marc, Lespiau, Jean-Baptiste, Piot, Bilal, Omidshafiei, Shayegan, Lockhart, Edward, Sifre, Laurent, Beauguerlange, Nathalie, Munos, Remi, Silver, David, Singh, Satinder, Hassabis, Demis, and Tuyls, Karl
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computer Science and Game Theory ,Computer Science - Multiagent Systems - Abstract
We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, with often hundreds of moves before a player wins, and situations in Stratego cannot easily be broken down into manageably sized sub-problems as in poker. For these reasons, Stratego has been a grand challenge for the field of AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego via self-play. The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.
- Published
- 2022
Discovery Service for Jio Institute Digital Library