131 results for "Mallinson, Jonathan"
Search Results
2. Offline Regularised Reinforcement Learning for Large Language Models Alignment
- Author
-
Richemond, Pierre Harvey, Tang, Yunhao, Guo, Daniel, Calandriello, Daniele, Azar, Mohammad Gheshlaghi, Rafailov, Rafael, Pires, Bernardo Avila, Tarassov, Eugene, Spangher, Lucas, Ellsworth, Will, Severyn, Aliaksei, Mallinson, Jonathan, Shani, Lior, Shamir, Gil, Joshi, Rishabh, Liu, Tianqi, Munos, Remi, and Piot, Bilal
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
The dominant framework for alignment of large language models (LLMs), whether through reinforcement learning from human feedback or direct preference optimisation, is to learn from preference data. This involves building datasets where each element is a quadruplet composed of a prompt, two independent responses (completions of the prompt) and a human preference between the two, yielding a preferred and a dis-preferred response. Such data is typically scarce and expensive to collect. On the other hand, single-trajectory datasets, where each element is a triplet composed of a prompt, a response and human feedback, are naturally more abundant. The canonical element of such datasets is, for instance, an LLM's response to a user's prompt followed by the user's feedback such as a thumbs-up/down. Consequently, in this work, we propose DRO, or Direct Reward Optimisation, as a framework and associated algorithms that do not require pairwise preferences. DRO uses a simple mean-squared objective that can be implemented in various ways. We validate our findings empirically, using T5 encoder-decoder language models, and show DRO's performance over selected baselines such as Kahneman-Tversky Optimization (KTO). Thus, we confirm that DRO is a simple and empirically compelling method for single-trajectory policy optimisation.
- Published
- 2024
3. West-of-N: Synthetic Preference Generation for Improved Reward Modeling
- Author
-
Pace, Alizée, Mallinson, Jonathan, Malmi, Eric, Krause, Sebastian, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
The success of reinforcement learning from human feedback (RLHF) in language model alignment is strongly dependent on the quality of the underlying reward model. In this paper, we present a novel approach to improve reward model quality by generating synthetic preference data, thereby augmenting the training dataset with on-policy, high-quality preference pairs. Motivated by the promising results of Best-of-N sampling strategies in language model training, we extend their application to reward model training. This results in a self-training strategy to generate preference pairs by selecting the best and worst candidates in a pool of responses to a given query. Empirically, we find that this approach improves the performance of any reward model, with an effect comparable to the addition of a similar quantity of human preference data. This work opens up new avenues of research for improving RLHF for language model alignment, by offering synthetic preference generation as a solution to reward modeling challenges.
- Published
- 2024
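The best-and-worst selection scheme summarised in the West-of-N abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `policy_sample` and `reward_model` are hypothetical stand-ins for the sampled language model and the reward model being trained, and the pool size `n` is an illustrative default.

```python
def west_of_n_pair(prompt, policy_sample, reward_model, n=8):
    # Sample a pool of n candidate responses from the policy.
    candidates = [policy_sample(prompt) for _ in range(n)]
    # Score each candidate with the current reward model.
    scored = sorted(candidates, key=lambda response: reward_model(prompt, response))
    # The best-scoring candidate becomes the "chosen" response,
    # the worst-scoring one the "rejected" response of the synthetic pair.
    return {"prompt": prompt, "chosen": scored[-1], "rejected": scored[0]}
```

With a deterministic sampler and a toy reward (response length), `west_of_n_pair("q", sampler, lambda p, r: len(r), n=3)` over the pool `["a", "ccc", "bb"]` returns `"ccc"` as chosen and `"a"` as rejected.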
4. Gemini: A Family of Highly Capable Multimodal Models
- Author
-
Gemini Team, Anil, Rohan, Borgeaud, Sebastian, Alayrac, Jean-Baptiste, Yu, Jiahui, Soricut, Radu, Schalkwyk, Johan, Dai, Andrew M., Hauth, Anja, Millican, Katie, Silver, David, Johnson, Melvin, Antonoglou, Ioannis, Schrittwieser, Julian, Glaese, Amelia, Chen, Jilin, Pitler, Emily, Lillicrap, Timothy, Lazaridou, Angeliki, Firat, Orhan, Molloy, James, Isard, Michael, Barham, Paul R., Hennigan, Tom, Lee, Benjamin, Viola, Fabio, Reynolds, Malcolm, Xu, Yuanzhong, Doherty, Ryan, Collins, Eli, Meyer, Clemens, Rutherford, Eliza, Moreira, Erica, Ayoub, Kareem, Goel, Megha, Krawczyk, Jack, Du, Cosmo, Chi, Ed, Cheng, Heng-Tze, Ni, Eric, Shah, Purvi, Kane, Patrick, Chan, Betty, Faruqui, Manaal, Severyn, Aliaksei, Lin, Hanzhao, Li, YaGuang, Cheng, Yong, Ittycheriah, Abe, Mahdieh, Mahdis, Chen, Mia, Sun, Pei, Tran, Dustin, Bagri, Sumit, Lakshminarayanan, Balaji, Liu, Jeremiah, Orban, Andras, Güra, Fabian, Zhou, Hao, Song, Xinying, Boffy, Aurelien, Ganapathy, Harish, Zheng, Steven, Choe, HyunJeong, Weisz, Ágoston, Zhu, Tao, Lu, Yifeng, Gopal, Siddharth, Kahn, Jarrod, Kula, Maciej, Pitman, Jeff, Shah, Rushin, Taropa, Emanuel, Merey, Majd Al, Baeuml, Martin, Chen, Zhifeng, Shafey, Laurent El, Zhang, Yujing, Sercinoglu, Olcan, Tucker, George, Piqueras, Enrique, Krikun, Maxim, Barr, Iain, Savinov, Nikolay, Danihelka, Ivo, Roelofs, Becca, White, Anaïs, Andreassen, Anders, von Glehn, Tamara, Yagati, Lakshman, Kazemi, Mehran, Gonzalez, Lucas, Khalman, Misha, Sygnowski, Jakub, Frechette, Alexandre, Smith, Charlotte, Culp, Laura, Proleev, Lev, Luan, Yi, Chen, Xi, Lottes, James, Schucher, Nathan, Lebron, Federico, Rrustemi, Alban, Clay, Natalie, Crone, Phil, Kocisky, Tomas, Zhao, Jeffrey, Perz, Bartek, Yu, Dian, Howard, Heidi, Bloniarz, Adam, Rae, Jack W., Lu, Han, Sifre, Laurent, Maggioni, Marcello, Alcober, Fred, Garrette, Dan, Barnes, Megan, Thakoor, Shantanu, Austin, Jacob, Barth-Maron, Gabriel, Wong, William, Joshi, Rishabh, Chaabouni, Rahma, Fatiha, Deeni, Ahuja, Arun, Tomar, Gaurav 
Singh, Senter, Evan, Chadwick, Martin, Kornakov, Ilya, Attaluri, Nithya, Iturrate, Iñaki, Liu, Ruibo, Li, Yunxuan, Cogan, Sarah, Chen, Jeremy, Jia, Chao, Gu, Chenjie, Zhang, Qiao, Grimstad, Jordan, Hartman, Ale Jakse, Garcia, Xavier, Pillai, Thanumalayan Sankaranarayana, Devlin, Jacob, Laskin, Michael, Casas, Diego de Las, Valter, Dasha, Tao, Connie, Blanco, Lorenzo, Badia, Adrià Puigdomènech, Reitter, David, Chen, Mianna, Brennan, Jenny, Rivera, Clara, Brin, Sergey, Iqbal, Shariq, Surita, Gabriela, Labanowski, Jane, Rao, Abhi, Winkler, Stephanie, Parisotto, Emilio, Gu, Yiming, Olszewska, Kate, Addanki, Ravi, Miech, Antoine, Louis, Annie, Teplyashin, Denis, Brown, Geoff, Catt, Elliot, Balaguer, Jan, Xiang, Jackie, Wang, Pidong, Ashwood, Zoe, Briukhov, Anton, Webson, Albert, Ganapathy, Sanjay, Sanghavi, Smit, Kannan, Ajay, Chang, Ming-Wei, Stjerngren, Axel, Djolonga, Josip, Sun, Yuting, Bapna, Ankur, Aitchison, Matthew, Pejman, Pedram, Michalewski, Henryk, Yu, Tianhe, Wang, Cindy, Love, Juliette, Ahn, Junwhan, Bloxwich, Dawn, Han, Kehang, Humphreys, Peter, Sellam, Thibault, Bradbury, James, Godbole, Varun, Samangooei, Sina, Damoc, Bogdan, Kaskasoli, Alex, Arnold, Sébastien M. 
R., Vasudevan, Vijay, Agrawal, Shubham, Riesa, Jason, Lepikhin, Dmitry, Tanburn, Richard, Srinivasan, Srivatsan, Lim, Hyeontaek, Hodkinson, Sarah, Shyam, Pranav, Ferret, Johan, Hand, Steven, Garg, Ankush, Paine, Tom Le, Li, Jian, Li, Yujia, Giang, Minh, Neitz, Alexander, Abbas, Zaheer, York, Sarah, Reid, Machel, Cole, Elizabeth, Chowdhery, Aakanksha, Das, Dipanjan, Rogozińska, Dominika, Nikolaev, Vitaliy, Sprechmann, Pablo, Nado, Zachary, Zilka, Lukas, Prost, Flavien, He, Luheng, Monteiro, Marianne, Mishra, Gaurav, Welty, Chris, Newlan, Josh, Jia, Dawei, Allamanis, Miltiadis, Hu, Clara Huiyi, de Liedekerke, Raoul, Gilmer, Justin, Saroufim, Carl, Rijhwani, Shruti, Hou, Shaobo, Shrivastava, Disha, Baddepudi, Anirudh, Goldin, Alex, Ozturel, Adnan, Cassirer, Albin, Xu, Yunhan, Sohn, Daniel, Sachan, Devendra, Amplayo, Reinald Kim, Swanson, Craig, Petrova, Dessie, Narayan, Shashi, Guez, Arthur, Brahma, Siddhartha, Landon, Jessica, Patel, Miteyan, Zhao, Ruizhe, Villela, Kevin, Wang, Luyu, Jia, Wenhao, Rahtz, Matthew, Giménez, Mai, Yeung, Legg, Keeling, James, Georgiev, Petko, Mincu, Diana, Wu, Boxi, Haykal, Salem, Saputro, Rachel, Vodrahalli, Kiran, Qin, James, Cankara, Zeynep, Sharma, Abhanshu, Fernando, Nick, Hawkins, Will, Neyshabur, Behnam, Kim, Solomon, Hutter, Adrian, Agrawal, Priyanka, Castro-Ros, Alex, Driessche, George van den, Wang, Tao, Yang, Fan, Chang, Shuo-yiin, Komarek, Paul, McIlroy, Ross, Lučić, Mario, Zhang, Guodong, Farhan, Wael, Sharman, Michael, Natsev, Paul, Michel, Paul, Bansal, Yamini, Qiao, Siyuan, Cao, Kris, Shakeri, Siamak, Butterfield, Christina, Chung, Justin, Rubenstein, Paul Kishan, Agrawal, Shivani, Mensch, Arthur, Soparkar, Kedar, Lenc, Karel, Chung, Timothy, Pope, Aedan, Maggiore, Loren, Kay, Jackie, Jhakra, Priya, Wang, Shibo, Maynez, Joshua, Phuong, Mary, Tobin, Taylor, Tacchetti, Andrea, Trebacz, Maja, Robinson, Kevin, Katariya, Yash, Riedel, Sebastian, Bailey, Paige, Xiao, Kefan, Ghelani, Nimesh, Aroyo, Lora, Slone, Ambrose, Houlsby, 
Neil, Xiong, Xuehan, Yang, Zhen, Gribovskaya, Elena, Adler, Jonas, Wirth, Mateo, Lee, Lisa, Li, Music, Kagohara, Thais, Pavagadhi, Jay, Bridgers, Sophie, Bortsova, Anna, Ghemawat, Sanjay, Ahmed, Zafarali, Liu, Tianqi, Powell, Richard, Bolina, Vijay, Iinuma, Mariko, Zablotskaia, Polina, Besley, James, Chung, Da-Woon, Dozat, Timothy, Comanescu, Ramona, Si, Xiance, Greer, Jeremy, Su, Guolong, Polacek, Martin, Kaufman, Raphaël Lopez, Tokumine, Simon, Hu, Hexiang, Buchatskaya, Elena, Miao, Yingjie, Elhawaty, Mohamed, Siddhant, Aditya, Tomasev, Nenad, Xing, Jinwei, Greer, Christina, Miller, Helen, Ashraf, Shereen, Roy, Aurko, Zhang, Zizhao, Ma, Ada, Filos, Angelos, Besta, Milos, Blevins, Rory, Klimenko, Ted, Yeh, Chih-Kuan, Changpinyo, Soravit, Mu, Jiaqi, Chang, Oscar, Pajarskas, Mantas, Muir, Carrie, Cohen, Vered, Lan, Charline Le, Haridasan, Krishna, Marathe, Amit, Hansen, Steven, Douglas, Sholto, Samuel, Rajkumar, Wang, Mingqiu, Austin, Sophia, Lan, Chang, Jiang, Jiepu, Chiu, Justin, Lorenzo, Jaime Alonso, Sjösund, Lars Lowe, Cevey, Sébastien, Gleicher, Zach, Avrahami, Thi, Boral, Anudhyan, Srinivasan, Hansa, Selo, Vittorio, May, Rhys, Aisopos, Konstantinos, Hussenot, Léonard, Soares, Livio Baldini, Baumli, Kate, Chang, Michael B., Recasens, Adrià, Caine, Ben, Pritzel, Alexander, Pavetic, Filip, Pardo, Fabio, Gergely, Anita, Frye, Justin, Ramasesh, Vinay, Horgan, Dan, Badola, Kartikeya, Kassner, Nora, Roy, Subhrajit, Dyer, Ethan, Campos, Víctor Campos, Tomala, Alex, Tang, Yunhao, Badawy, Dalia El, White, Elspeth, Mustafa, Basil, Lang, Oran, Jindal, Abhishek, Vikram, Sharad, Gong, Zhitao, Caelles, Sergi, Hemsley, Ross, Thornton, Gregory, Feng, Fangxiaoyu, Stokowiec, Wojciech, Zheng, Ce, Thacker, Phoebe, Ünlü, Çağlar, Zhang, Zhishuai, Saleh, Mohammad, Svensson, James, Bileschi, Max, Patil, Piyush, Anand, Ankesh, Ring, Roman, Tsihlas, Katerina, Vezer, Arpi, Selvi, Marco, Shevlane, Toby, Rodriguez, Mikel, Kwiatkowski, Tom, Daruki, Samira, Rong, Keran, Dafoe, Allan, 
FitzGerald, Nicholas, Gu-Lemberg, Keren, Khan, Mina, Hendricks, Lisa Anne, Pellat, Marie, Feinberg, Vladimir, Cobon-Kerr, James, Sainath, Tara, Rauh, Maribeth, Hashemi, Sayed Hadi, Ives, Richard, Hasson, Yana, Noland, Eric, Cao, Yuan, Byrd, Nathan, Hou, Le, Wang, Qingze, Sottiaux, Thibault, Paganini, Michela, Lespiau, Jean-Baptiste, Moufarek, Alexandre, Hassan, Samer, Shivakumar, Kaushik, van Amersfoort, Joost, Mandhane, Amol, Joshi, Pratik, Goyal, Anirudh, Tung, Matthew, Brock, Andrew, Sheahan, Hannah, Misra, Vedant, Li, Cheng, Rakićević, Nemanja, Dehghani, Mostafa, Liu, Fangyu, Mittal, Sid, Oh, Junhyuk, Noury, Seb, Sezener, Eren, Huot, Fantine, Lamm, Matthew, De Cao, Nicola, Chen, Charlie, Mudgal, Sidharth, Stella, Romina, Brooks, Kevin, Vasudevan, Gautam, Liu, Chenxi, Chain, Mainak, Melinkeri, Nivedita, Cohen, Aaron, Wang, Venus, Seymore, Kristie, Zubkov, Sergey, Goel, Rahul, Yue, Summer, Krishnakumaran, Sai, Albert, Brian, Hurley, Nate, Sano, Motoki, Mohananey, Anhad, Joughin, Jonah, Filonov, Egor, Kępa, Tomasz, Eldawy, Yomna, Lim, Jiawern, Rishi, Rahul, Badiezadegan, Shirin, Bos, Taylor, Chang, Jerry, Jain, Sanil, Padmanabhan, Sri Gayatri Sundara, Puttagunta, Subha, Krishna, Kalpesh, Baker, Leslie, Kalb, Norbert, Bedapudi, Vamsi, Kurzrok, Adam, Lei, Shuntong, Yu, Anthony, Litvin, Oren, Zhou, Xiang, Wu, Zhichun, Sobell, Sam, Siciliano, Andrea, Papir, Alan, Neale, Robby, Bragagnolo, Jonas, Toor, Tej, Chen, Tina, Anklin, Valentin, Wang, Feiran, Feng, Richie, Gholami, Milad, Ling, Kevin, Liu, Lijuan, Walter, Jules, Moghaddam, Hamid, Kishore, Arun, Adamek, Jakub, Mercado, Tyler, Mallinson, Jonathan, Wandekar, Siddhinita, Cagle, Stephen, Ofek, Eran, Garrido, Guillermo, Lombriser, Clemens, Mukha, Maksim, Sun, Botu, Mohammad, Hafeezul Rahman, Matak, Josip, Qian, Yadi, Peswani, Vikas, Janus, Pawel, Yuan, Quan, Schelin, Leif, David, Oana, Garg, Ankur, He, Yifan, Duzhyi, Oleksii, Älgmyr, Anton, Lottaz, Timothée, Li, Qi, Yadav, Vikas, Xu, Luyao, Chinien, Alex, Shivanna, 
Rakesh, Chuklin, Aleksandr, Li, Josie, Spadine, Carrie, Wolfe, Travis, Mohamed, Kareem, Das, Subhabrata, Dai, Zihang, He, Kyle, von Dincklage, Daniel, Upadhyay, Shyam, Maurya, Akanksha, Chi, Luyan, Krause, Sebastian, Salama, Khalid, Rabinovitch, Pam G, M, Pavan Kumar Reddy, Selvan, Aarush, Dektiarev, Mikhail, Ghiasi, Golnaz, Guven, Erdem, Gupta, Himanshu, Liu, Boyi, Sharma, Deepak, Shtacher, Idan Heimlich, Paul, Shachi, Akerlund, Oscar, Aubet, François-Xavier, Huang, Terry, Zhu, Chen, Zhu, Eric, Teixeira, Elico, Fritze, Matthew, Bertolini, Francesco, Marinescu, Liana-Eleonora, Bölle, Martin, Paulus, Dominik, Gupta, Khyatti, Latkar, Tejasi, Chang, Max, Sanders, Jason, Wilson, Roopa, Wu, Xuewei, Tan, Yi-Xuan, Thiet, Lam Nguyen, Doshi, Tulsee, Lall, Sid, Mishra, Swaroop, Chen, Wanming, Luong, Thang, Benjamin, Seth, Lee, Jasmine, Andrejczuk, Ewa, Rabiej, Dominik, Ranjan, Vipul, Styrc, Krzysztof, Yin, Pengcheng, Simon, Jon, Harriott, Malcolm Rose, Bansal, Mudit, Robsky, Alexei, Bacon, Geoff, Greene, David, Mirylenka, Daniil, Zhou, Chen, Sarvana, Obaid, Goyal, Abhimanyu, Andermatt, Samuel, Siegler, Patrick, Horn, Ben, Israel, Assaf, Pongetti, Francesco, Chen, Chih-Wei "Louis", Selvatici, Marco, Silva, Pedro, Wang, Kathie, Tolins, Jackson, Guu, Kelvin, Yogev, Roey, Cai, Xiaochen, Agostini, Alessandro, Shah, Maulik, Nguyen, Hung, Donnaile, Noah Ó, Pereira, Sébastien, Friso, Linda, Stambler, Adam, Kuang, Chenkai, Romanikhin, Yan, Geller, Mark, Yan, ZJ, Jang, Kane, Lee, Cheng-Chun, Fica, Wojciech, Malmi, Eric, Tan, Qijun, Banica, Dan, Balle, Daniel, Pham, Ryan, Huang, Yanping, Avram, Diana, Shi, Hongzhi, Singh, Jasjot, Hidey, Chris, Ahuja, Niharika, Saxena, Pranab, Dooley, Dan, Potharaju, Srividya Pranavi, O'Neill, Eileen, Gokulchandran, Anand, Foley, Ryan, Zhao, Kai, Dusenberry, Mike, Liu, Yuan, Mehta, Pulkit, Kotikalapudi, Ragha, Safranek-Shrader, Chalence, Goodman, Andrew, Kessinger, Joshua, Globen, Eran, Kolhar, Prateek, Gorgolewski, Chris, Ibrahim, Ali, Song, Yang, 
Eichenbaum, Ali, Brovelli, Thomas, Potluri, Sahitya, Lahoti, Preethi, Baetu, Cip, Ghorbani, Ali, Chen, Charles, Crawford, Andy, Pal, Shalini, Sridhar, Mukund, Gurita, Petru, Mujika, Asier, Petrovski, Igor, Cedoz, Pierre-Louis, Li, Chenmei, Chen, Shiyuan, Santo, Niccolò Dal, Goyal, Siddharth, Punjabi, Jitesh, Kappaganthu, Karthik, Kwak, Chester, LV, Pallavi, Velury, Sarmishta, Choudhury, Himadri, Hall, Jamie, Shah, Premal, Figueira, Ricardo, Thomas, Matt, Lu, Minjie, Zhou, Ting, Kumar, Chintu, Jurdi, Thomas, Chikkerur, Sharat, Ma, Yenai, Yu, Adams, Kwak, Soo, Ähdel, Victor, Rajayogam, Sujeevan, Choma, Travis, Liu, Fei, Barua, Aditya, Ji, Colin, Park, Ji Ho, Hellendoorn, Vincent, Bailey, Alex, Bilal, Taylan, Zhou, Huanjie, Khatir, Mehrdad, Sutton, Charles, Rzadkowski, Wojciech, Macintosh, Fiona, Shagin, Konstantin, Medina, Paul, Liang, Chen, Zhou, Jinjing, Shah, Pararth, Bi, Yingying, Dankovics, Attila, Banga, Shipra, Lehmann, Sabine, Bredesen, Marissa, Lin, Zifan, Hoffmann, John Eric, Lai, Jonathan, Chung, Raynald, Yang, Kai, Balani, Nihal, Bražinskas, Arthur, Sozanschi, Andrei, Hayes, Matthew, Alcalde, Héctor Fernández, Makarov, Peter, Chen, Will, Stella, Antonio, Snijders, Liselotte, Mandl, Michael, Kärrman, Ante, Nowak, Paweł, Wu, Xinyi, Dyck, Alex, Vaidyanathan, Krishnan, R, Raghavender, Mallet, Jessica, Rudominer, Mitch, Johnston, Eric, Mittal, Sushil, Udathu, Akhil, Christensen, Janara, Verma, Vishal, Irving, Zach, Santucci, Andreas, Elsayed, Gamaleldin, Davoodi, Elnaz, Georgiev, Marin, Tenney, Ian, Hua, Nan, Cideron, Geoffrey, Leurent, Edouard, Alnahlawi, Mahmoud, Georgescu, Ionut, Wei, Nan, Zheng, Ivy, Scandinaro, Dylan, Jiang, Heinrich, Snoek, Jasper, Sundararajan, Mukund, Wang, Xuezhi, Ontiveros, Zack, Karo, Itay, Cole, Jeremy, Rajashekhar, Vinu, Tumeh, Lara, Ben-David, Eyal, Jain, Rishub, Uesato, Jonathan, Datta, Romina, Bunyan, Oskar, Wu, Shimu, Zhang, John, Stanczyk, Piotr, Zhang, Ye, Steiner, David, Naskar, Subhajit, Azzam, Michael, Johnson, Matthew, 
Paszke, Adam, Chiu, Chung-Cheng, Elias, Jaume Sanchez, Mohiuddin, Afroz, Muhammad, Faizan, Miao, Jin, Lee, Andrew, Vieillard, Nino, Park, Jane, Zhang, Jiageng, Stanway, Jeff, Garmon, Drew, Karmarkar, Abhijit, Dong, Zhe, Lee, Jong, Kumar, Aviral, Zhou, Luowei, Evens, Jonathan, Isaac, William, Irving, Geoffrey, Loper, Edward, Fink, Michael, Arkatkar, Isha, Chen, Nanxin, Shafran, Izhak, Petrychenko, Ivan, Chen, Zhe, Jia, Johnson, Levskaya, Anselm, Zhu, Zhenkai, Grabowski, Peter, Mao, Yu, Magni, Alberto, Yao, Kaisheng, Snaider, Javier, Casagrande, Norman, Palmer, Evan, Suganthan, Paul, Castaño, Alfonso, Giannoumis, Irene, Kim, Wooyeol, Rybiński, Mikołaj, Sreevatsa, Ashwin, Prendki, Jennifer, Soergel, David, Goedeckemeyer, Adrian, Gierke, Willi, Jafari, Mohsen, Gaba, Meenu, Wiesner, Jeremy, Wright, Diana Gage, Wei, Yawen, Vashisht, Harsha, Kulizhskaya, Yana, Hoover, Jay, Le, Maigo, Li, Lu, Iwuanyanwu, Chimezie, Liu, Lu, Ramirez, Kevin, Khorlin, Andrey, Cui, Albert, LIN, Tian, Wu, Marcus, Aguilar, Ricardo, Pallo, Keith, Chakladar, Abhishek, Perng, Ginger, Abellan, Elena Allica, Zhang, Mingyang, Dasgupta, Ishita, Kushman, Nate, Penchev, Ivo, Repina, Alena, Wu, Xihui, van der Weide, Tom, Ponnapalli, Priya, Kaplan, Caroline, Simsa, Jiri, Li, Shuangfeng, Dousse, Olivier, Piper, Jeff, Ie, Nathan, Pasumarthi, Rama, Lintz, Nathan, Vijayakumar, Anitha, Andor, Daniel, Valenzuela, Pedro, Lui, Minnie, Paduraru, Cosmin, Peng, Daiyi, Lee, Katherine, Zhang, Shuyuan, Greene, Somer, Nguyen, Duc Dung, Kurylowicz, Paula, Hardin, Cassidy, Dixon, Lucas, Janzer, Lili, Choo, Kiam, Feng, Ziqiang, Zhang, Biao, Singhal, Achintya, Du, Dayou, McKinnon, Dan, Antropova, Natasha, Bolukbasi, Tolga, Keller, Orgad, Reid, David, Finchelstein, Daniel, Raad, Maria Abi, Crocker, Remi, Hawkins, Peter, Dadashi, Robert, Gaffney, Colin, Franko, Ken, Bulanova, Anna, Leblond, Rémi, Chung, Shirley, Askham, Harry, Cobo, Luis C., Xu, Kelvin, Fischer, Felix, Xu, Jun, Sorokin, Christina, Alberti, Chris, Lin, 
Chu-Cheng, Evans, Colin, Dimitriev, Alek, Forbes, Hannah, Banarse, Dylan, Tung, Zora, Omernick, Mark, Bishop, Colton, Sterneck, Rachel, Jain, Rohan, Xia, Jiawei, Amid, Ehsan, Piccinno, Francesco, Wang, Xingyu, Banzal, Praseem, Mankowitz, Daniel J., Polozov, Alex, Krakovna, Victoria, Brown, Sasha, Bateni, MohammadHossein, Duan, Dennis, Firoiu, Vlad, Thotakuri, Meghana, Natan, Tom, Geist, Matthieu, Girgin, Ser tan, Li, Hui, Ye, Jiayu, Roval, Ofir, Tojo, Reiko, Kwong, Michael, Lee-Thorp, James, Yew, Christopher, Sinopalnikov, Danila, Ramos, Sabela, Mellor, John, Sharma, Abhishek, Wu, Kathy, Miller, David, Sonnerat, Nicolas, Vnukov, Denis, Greig, Rory, Beattie, Jennifer, Caveness, Emily, Bai, Libin, Eisenschlos, Julian, Korchemniy, Alex, Tsai, Tomy, Jasarevic, Mimi, Kong, Weize, Dao, Phuong, Zheng, Zeyu, Liu, Frederick, Zhu, Rui, Teh, Tian Huey, Sanmiya, Jason, Gladchenko, Evgeny, Trdin, Nejc, Toyama, Daniel, Rosen, Evan, Tavakkol, Sasan, Xue, Linting, Elkind, Chen, Woodman, Oliver, Carpenter, John, Papamakarios, George, Kemp, Rupert, Kafle, Sushant, Grunina, Tanya, Sinha, Rishika, Talbert, Alice, Wu, Diane, Owusu-Afriyie, Denese, Thornton, Chloe, Pont-Tuset, Jordi, Narayana, Pradyumna, Li, Jing, Fatehi, Saaber, Wieting, John, Ajmeri, Omar, Uria, Benigno, Ko, Yeongil, Knight, Laura, Héliou, Amélie, Niu, Ning, Gu, Shane, Pang, Chenxi, Li, Yeqing, Levine, Nir, Stolovich, Ariel, Santamaria-Fernandez, Rebeca, Goenka, Sonam, Yustalim, Wenny, Strudel, Robin, Elqursh, Ali, Deck, Charlie, Lee, Hyo, Li, Zonglin, Levin, Kyle, Hoffmann, Raphael, Holtmann-Rice, Dan, Bachem, Olivier, Arora, Sho, Koh, Christy, Yeganeh, Soheil Hassas, Põder, Siim, Tariq, Mukarram, Sun, Yanhua, Ionita, Lucian, Seyedhosseini, Mojtaba, Tafti, Pouya, Liu, Zhiyu, Gulati, Anmol, Liu, Jasmine, Ye, Xinyu, Chrzaszcz, Bart, Wang, Lily, Sethi, Nikhil, Li, Tianrun, Brown, Ben, Singh, Shreya, Fan, Wei, Parisi, Aaron, Stanton, Joe, Koverkathu, Vinod, Choquette-Choo, Christopher A., Li, Yunjie, Lu, TJ, Shroff, 
Prakash, Varadarajan, Mani, Bahargam, Sanaz, Willoughby, Rob, Gaddy, David, Desjardins, Guillaume, Cornero, Marco, Robenek, Brona, Mittal, Bhavishya, Albrecht, Ben, Shenoy, Ashish, Moiseev, Fedor, Jacobsson, Henrik, Ghaffarkhah, Alireza, Rivière, Morgane, Walton, Alanna, Crepy, Clément, Parrish, Alicia, Zhou, Zongwei, Farabet, Clement, Radebaugh, Carey, Srinivasan, Praveen, van der Salm, Claudia, Fidjeland, Andreas, Scellato, Salvatore, Latorre-Chimoto, Eri, Klimczak-Plucińska, Hanna, Bridson, David, de Cesare, Dario, Hudson, Tom, Mendolicchio, Piermaria, Walker, Lexi, Morris, Alex, Mauger, Matthew, Guseynov, Alexey, Reid, Alison, Odoom, Seth, Loher, Lucia, Cotruta, Victor, Yenugula, Madhavi, Grewe, Dominik, Petrushkina, Anastasia, Duerig, Tom, Sanchez, Antonio, Yadlowsky, Steve, Shen, Amy, Globerson, Amir, Webb, Lynette, Dua, Sahil, Li, Dong, Bhupatiraju, Surya, Hurt, Dan, Qureshi, Haroon, Agarwal, Ananth, Shani, Tomer, Eyal, Matan, Khare, Anuj, Belle, Shreyas Rammohan, Wang, Lei, Tekur, Chetan, Kale, Mihir Sanjay, Wei, Jinliang, Sang, Ruoxin, Saeta, Brennan, Liechty, Tyler, Sun, Yi, Zhao, Yao, Lee, Stephan, Nayak, Pandu, Fritz, Doug, Vuyyuru, Manish Reddy, Aslanides, John, Vyas, Nidhi, Wicke, Martin, Ma, Xiao, Eltyshev, Evgenii, Martin, Nina, Cate, Hardie, Manyika, James, Amiri, Keyvan, Kim, Yelin, Xiong, Xi, Kang, Kai, Luisier, Florian, Tripuraneni, Nilesh, Madras, David, Guo, Mandy, Waters, Austin, Wang, Oliver, Ainslie, Joshua, Baldridge, Jason, Zhang, Han, Pruthi, Garima, Bauer, Jakob, Yang, Feng, Mansour, Riham, Gelman, Jason, Xu, Yang, Polovets, George, Liu, Ji, Cai, Honglong, Chen, Warren, Sheng, XiangHai, Xue, Emily, Ozair, Sherjil, Angermueller, Christof, Li, Xiaowei, Sinha, Anoop, Wang, Weiren, Wiesinger, Julia, Koukoumidis, Emmanouil, Tian, Yuan, Iyer, Anand, Gurumurthy, Madhu, Goldenson, Mark, Shah, Parashar, Blake, MK, Yu, Hongkun, Urbanowicz, Anthony, Palomaki, Jennimaria, Fernando, Chrisantha, Durden, Ken, Mehta, Harsh, Momchev, Nikola, 
Rahimtoroghi, Elahe, Georgaki, Maria, Raul, Amit, Ruder, Sebastian, Redshaw, Morgan, Lee, Jinhyuk, Zhou, Denny, Jalan, Komal, Li, Dinghua, Hechtman, Blake, Schuh, Parker, Nasr, Milad, Milan, Kieran, Mikulik, Vladimir, Franco, Juliana, Green, Tim, Nguyen, Nam, Kelley, Joe, Mahendru, Aroma, Hu, Andrea, Howland, Joshua, Vargas, Ben, Hui, Jeffrey, Bansal, Kshitij, Rao, Vikram, Ghiya, Rakesh, Wang, Emma, Ye, Ke, Sarr, Jean Michel, Preston, Melanie Moranski, Elish, Madeleine, Li, Steve, Kaku, Aakash, Gupta, Jigar, Pasupat, Ice, Juan, Da-Cheng, Someswar, Milan, M., Tejvi, Chen, Xinyun, Amini, Aida, Fabrikant, Alex, Chu, Eric, Dong, Xuanyi, Muthal, Amruta, Buthpitiya, Senaka, Jauhari, Sarthak, Khandelwal, Urvashi, Hitron, Ayal, Ren, Jie, Rinaldi, Larissa, Drath, Shahar, Dabush, Avigail, Jiang, Nan-Jiang, Godhia, Harshal, Sachs, Uli, Chen, Anthony, Fan, Yicheng, Taitelbaum, Hagai, Noga, Hila, Dai, Zhuyun, Wang, James, Hamer, Jenny, Ferng, Chun-Sung, Elkind, Chenel, Atias, Aviel, Lee, Paulina, Listík, Vít, Carlen, Mathias, van de Kerkhof, Jan, Pikus, Marcin, Zaher, Krunoslav, Müller, Paul, Zykova, Sasha, Stefanec, Richard, Gatsko, Vitaly, Hirnschall, Christoph, Sethi, Ashwin, Xu, Xingyu Federico, Ahuja, Chetan, Tsai, Beth, Stefanoiu, Anca, Feng, Bo, Dhandhania, Keshav, Katyal, Manish, Gupta, Akshay, Parulekar, Atharva, Pitta, Divya, Zhao, Jing, Bhatia, Vivaan, Bhavnani, Yashodha, Alhadlaq, Omar, Li, Xiaolin, Danenberg, Peter, Tu, Dennis, Pine, Alex, Filippova, Vera, Ghosh, Abhipso, Limonchik, Ben, Urala, Bhargava, Lanka, Chaitanya Krishna, Clive, Derik, Li, Edward, Wu, Hao, Hongtongsak, Kevin, Li, Ianna, Thakkar, Kalind, Omarov, Kuanysh, Majmundar, Kushal, Alverson, Michael, Kucharski, Michael, Patel, Mohak, Jain, Mudit, Zabelin, Maksim, Pelagatti, Paolo, Kohli, Rohan, Kumar, Saurabh, Kim, Joseph, Sankar, Swetha, Shah, Vineet, Ramachandruni, Lakshmi, Zeng, Xiangkai, Bariach, Ben, Weidinger, Laura, Vu, Tu, Andreev, Alek, He, Antoine, Hui, Kevin, Kashem, Sheleem, Subramanya, 
Amar, Hsiao, Sissie, Hassabis, Demis, Kavukcuoglu, Koray, Sadovsky, Adam, Le, Quoc, Strohman, Trevor, Wu, Yonghui, Petrov, Slav, Dean, Jeffrey, and Vinyals, Oriol
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
- Abstract
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most capable Gemini Ultra model advances the state of the art in 30 of the 32 benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
- Published
- 2023
5. William Moorcroft, Potter
- Author
-
Mallinson, Jonathan
- Subjects
art, Arts and Crafts, biography, cultural history, decorative arts, industry, Moorcroft, pottery, production, bic Book Industry Communication::A The arts::AF Art forms::AFP Ceramic arts, pottery, glass, bic Book Industry Communication::W Lifestyle, sport & leisure::WF Handicrafts, decorative arts & crafts::WFN Pottery, ceramics & glass crafts, bic Book Industry Communication::A The arts::AF Art forms::AFT Decorative arts, bic Book Industry Communication::A The arts::AK Industrial / commercial art & design, bic Book Industry Communication::H Humanities::HB History::HBT History: specific events & topics::HBTB Social & cultural history, bic Book Industry Communication::H Humanities::HB History::HBL History: earliest times to present day::HBLW 20th century history: c 1900 to c 2000
- Abstract
William Moorcroft (1872-1945) was one of the most celebrated potters of the early twentieth century. His career extended from the Arts and Crafts movement of the late Victorian age to the Austerity aesthetics of the Second World War. Rejecting mass production and patronised by Royalty, Moorcroft’s work was a synthesis of studio and factory, art and industry. He considered it his vocation to create an everyday art, both functional and decorative, affordable by more than a privileged few: ‘If only the people in the world would concentrate upon making all things beautiful, and if all people concentrated on developing the arts of Peace, what a world it might be,’ he wrote in a letter to his daughter in 1930. 'William Moorcroft, Potter: Individuality by Design' is a pioneering study by Jonathan Mallinson, Emeritus Fellow of Trinity College, Oxford. It follows the career of William Moorcroft through a wealth of private papers, letters and diaries, business correspondence and published reviews in newspapers, trade magazines and art journals. Richly illustrated with examples of his pottery, it explores what lay behind the unique impact of work sought by museums and treasured in homes the world over. The book examines an artist’s very individual response to the turbulent half century in which he worked. It will appeal to both specialists and general readers with an interest in pottery, the decorative arts, and the cultural history of the times.
- Published
- 2023
- Full Text
- View/download PDF
6. Small Language Models Improve Giants by Rewriting Their Outputs
- Author
-
Vernikos, Giorgos, Bražinskas, Arthur, Adamek, Jakub, Mallinson, Jonathan, Severyn, Aliaksei, and Malmi, Eric
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Despite the impressive performance of large language models (LLMs), they often lag behind specialized models in various tasks. LLMs only use a fraction of the existing training data for in-context learning, while task-specific models harness the full dataset for fine-tuning. In this work, we tackle the problem of leveraging training data to improve the performance of LLMs without fine-tuning. Our approach directly targets LLM predictions without requiring access to their weights. We create a pool of candidates from the LLM through few-shot prompting and we employ a compact model, the LM-corrector (LMCor), specifically trained to merge these candidates to produce an enhanced output. Our experiments on four natural language generation tasks demonstrate that even a small LMCor model (250M) substantially improves the few-shot performance of LLMs (62B), matching and even outperforming standard fine-tuning. Furthermore, we illustrate the robustness of LMCor against different prompts, thereby minimizing the need for extensive prompt engineering. Finally, we show that LMCor can be seamlessly integrated with different LLMs at inference, serving as a plug-and-play module to improve their performance. (Comment: Accepted at EACL 2024.)
- Published
- 2023
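The sample-then-merge setup described in the abstract above reduces to a small amount of glue code. This is a sketch under stated assumptions: `llm_sample` and `lmcor` are hypothetical callables standing in for the large model and the trained corrector, and `k` is an illustrative candidate count.

```python
def correct_with_lmcor(prompt, llm_sample, lmcor, k=3):
    # Draw k few-shot candidate outputs from the large model.
    candidates = [llm_sample(prompt) for _ in range(k)]
    # The small trained corrector merges the candidates into one
    # improved output; no access to the LLM's weights is needed.
    return lmcor(prompt, candidates)
```

For example, with a toy corrector that simply picks the longest candidate, the function returns that candidate unchanged; the real LMCor is a trained seq2seq model rather than a heuristic.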
7. Re-présentant les Lettres d'une Péruvienne en 1752: illustration et illusion
- Author
-
Mallinson, Jonathan
- Published
- 2011
- Full Text
- View/download PDF
8. Teaching Small Language Models to Reason
- Author
-
Magister, Lucie Charlotte, Mallinson, Jonathan, Adamek, Jakub, Malmi, Eric, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Chain-of-thought prompting successfully improves the reasoning capabilities of large language models, achieving state-of-the-art results on a range of datasets. However, these reasoning capabilities only appear to emerge in models with a size of over 100 billion parameters. In this paper, we explore the transfer of such reasoning capabilities to models with fewer than 100 billion parameters via knowledge distillation. Specifically, we finetune a student model on the chain-of-thought outputs generated by a larger teacher model. Our experiments show that the proposed method improves task performance across arithmetic, commonsense and symbolic reasoning datasets. For example, the accuracy of T5 XXL on GSM8K improves from 8.11% to 21.99% when finetuned on PaLM-540B generated chains of thought.
- Published
- 2022
9. Text Generation with Text-Editing Models
- Author
-
Malmi, Eric, Dong, Yue, Mallinson, Jonathan, Chuklin, Aleksandr, Adamek, Jakub, Mirylenka, Daniil, Stahlberg, Felix, Krause, Sebastian, Kumar, Shankar, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language
- Abstract
Text-editing models have recently become a prominent alternative to seq2seq models for monolingual text-generation tasks such as grammatical error correction, simplification, and style transfer. These tasks share a common trait - they exhibit a large amount of textual overlap between the source and target texts. Text-editing models take advantage of this observation and learn to generate the output by predicting edit operations applied to the source sequence. In contrast, seq2seq models generate outputs word-by-word from scratch, thus making them slow at inference time. Text-editing models provide several benefits over seq2seq models, including faster inference speed, higher sample efficiency, and better control and interpretability of the outputs. This tutorial provides a comprehensive overview of text-editing models and current state-of-the-art approaches, and analyzes their pros and cons. We discuss challenges related to productionization and how these models can be used to mitigate hallucination and bias, both pressing challenges in the field of text generation. (Comment: Accepted as a tutorial at NAACL 2022.)
- Published
- 2022
10. EdiT5: Semi-Autoregressive Text-Editing with T5 Warm-Start
- Author
-
Mallinson, Jonathan, Adamek, Jakub, Malmi, Eric, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language - Abstract
We present EdiT5 - a novel semi-autoregressive text-editing model designed to combine the strengths of non-autoregressive text-editing and autoregressive decoding. EdiT5 is faster during inference than conventional sequence-to-sequence (seq2seq) models, while being capable of modelling flexible input-output transformations. This is achieved by decomposing the generation process into three sub-tasks: (1) tagging to decide on the subset of input tokens to be preserved in the output, (2) re-ordering to define their order in the output text, and (3) insertion to infill the missing tokens that are not present in the input. The tagging and re-ordering steps, which are responsible for generating the largest portion of the output, are non-autoregressive, while the insertion step uses an autoregressive decoder. Depending on the task, EdiT5 on average requires significantly fewer autoregressive steps, demonstrating speedups of up to 25x when compared to seq2seq models. Quality-wise, EdiT5 is initialized with a pre-trained T5 checkpoint, yielding comparable performance to T5 in high-resource settings when evaluated on three NLG tasks: Sentence Fusion, Grammatical Error Correction, and Decontextualization, while clearly outperforming T5 in low-resource settings., Comment: To be published in Findings of EMNLP 2022
- Published
- 2022
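The three-step decomposition in the EdiT5 abstract above (tagging, re-ordering, insertion) can be illustrated with a toy decode. This is a minimal sketch of the control flow only, assuming tag, order, and insertion decisions are already predicted; the real model learns these jointly and uses an autoregressive decoder for the insertion step.

```python
def edit_generate(source_tokens, keep_tags, order, insertions):
    """Toy EdiT5-style decode:
    (1) tagging keeps a subset of the source tokens,
    (2) re-ordering permutes the kept tokens, and
    (3) insertion fills in new tokens at given output positions
        (in the real model, the only autoregressive step).
    """
    # Step 1: tagging (1 = keep, 0 = delete).
    kept = [t for t, k in zip(source_tokens, keep_tags) if k]
    # Step 2: re-ordering via a predicted permutation of kept tokens.
    reordered = [kept[i] for i in order]
    # Step 3: insertion; (position, token) pairs, applied right-to-left
    # so earlier insertions do not shift later positions.
    out = list(reordered)
    for pos, tok in sorted(insertions, reverse=True):
        out.insert(pos, tok)
    return out


# Toy example: re-order two kept tokens, then insert one new token.
output = edit_generate(["cat", "the", "sat"],
                       keep_tags=[1, 1, 1],
                       order=[1, 0, 2],
                       insertions=[(3, "down")])
```

Because steps (1) and (2) are a single parallel pass over the input, only the (typically short) insertion step pays an autoregressive cost.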
11. RED-ACE: Robust Error Detection for ASR using Confidence Embeddings
- Author
-
Gekhman, Zorik, Zverinski, Dina, Mallinson, Jonathan, and Beryozkin, Genady
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
ASR Error Detection (AED) models aim to post-process the output of Automatic Speech Recognition (ASR) systems, in order to detect transcription errors. Modern approaches usually use text-based input, comprised solely of the ASR transcription hypothesis, disregarding additional signals from the ASR model. Instead, we propose to utilize the ASR system's word-level confidence scores for improving AED performance. Specifically, we add an ASR Confidence Embedding (ACE) layer to the AED model's encoder, allowing us to jointly encode the confidence scores and the transcribed text into a contextualized representation. Our experiments show the benefits of ASR confidence scores for AED, their complementary effect over the textual signal, as well as the effectiveness and robustness of ACE for combining these signals. To foster further research, we publish a novel AED dataset consisting of ASR outputs on the LibriSpeech corpus with annotated transcription errors., Comment: Accepted as a short paper in EMNLP 2022
- Published
- 2022
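The ACE layer described in the RED-ACE abstract above can be sketched as adding a confidence-bucket embedding to each token embedding before encoding. This is a toy, framework-free illustration; the bucketing scheme, table layout, and dimensions are assumptions, and the real model learns the embedding table end-to-end inside a Transformer encoder.

```python
def ace_embed(token_embs, confidences, bucket_edges, conf_table):
    """Toy ACE layer: bucket each word-level ASR confidence score and
    add the corresponding (learned) embedding to the token embedding,
    so the encoder sees the transcribed text and the ASR confidence
    signal jointly in one contextualized representation.
    """
    out = []
    for emb, conf in zip(token_embs, confidences):
        # Bucket index = number of edges the confidence meets or exceeds.
        bucket = sum(conf >= edge for edge in bucket_edges)
        out.append([e + c for e, c in zip(emb, conf_table[bucket])])
    return out


# Toy setup: 2-dim embeddings, one bucket edge at 0.5, and a made-up
# embedding table (in practice these vectors are learned parameters).
table = {0: [0.0, 0.0], 1: [0.1, 0.1]}
high = ace_embed([[1.0, 0.0]], [0.9], [0.5], table)  # confident word
low = ace_embed([[1.0, 0.0]], [0.2], [0.5], table)   # uncertain word
```

The point of the sketch is the joint encoding: a low-confidence word ends up with a different representation than the same word transcribed with high confidence.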
12. 2. 1901–04: The End of the Beginning
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
13. 12. 1932–35: Individuality and Industrial Art
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
14. 9. 1924–25: Recognition of the Artist Potter
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
15. Introduction: William Moorcroft, Potter
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
16. 4. 1910–12: Approaching a Crossroads
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
17. Conclusion: Individuality by Design
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
18. 11. 1929–31: No Ordinary Potter
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
19. 1. 1897–1900: The Making of a Potter
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
20. 10. 1926–28: Re-negotiating the Future
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
21. 6. 1913–14: A New Beginning
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
22. 13. 1936–39: Pottery for a Troubled World
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
23. 3. 1905–09: Experiment and Adversity
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
24. 14. 1939–45: Adversity and Resolution
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
25. 8. 1919–23: A Lone Furrow
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
26. 7. 1914–18: The Art of Survival
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
27. 5. 1912–13: Breaking with Macintyre’s
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
28. A Simple Recipe for Multilingual Grammatical Error Correction
- Author
-
Rothe, Sascha, Mallinson, Jonathan, Malmi, Eric, Krause, Sebastian, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language - Abstract
This paper presents a simple recipe to train state-of-the-art multilingual Grammatical Error Correction (GEC) models. We achieve this by first proposing a language-agnostic method to generate a large number of synthetic examples. The second ingredient is to use large-scale multilingual language models (up to 11B parameters). Once fine-tuned on language-specific supervised sets, we surpass the previous state-of-the-art results on GEC benchmarks in four languages: English, Czech, German and Russian. Having established a new set of baselines for GEC, we make our results easily reproducible and accessible by releasing the cLang-8 dataset. It is produced by using our best model, which we call gT5, to clean the targets of a widely used yet noisy lang-8 dataset. cLang-8 greatly simplifies typical GEC training pipelines composed of multiple fine-tuning stages -- we demonstrate that performing a single fine-tuning step on cLang-8 with off-the-shelf language models yields further accuracy improvements over an already top-performing gT5 model for English.
- Published
- 2021
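The first ingredient of the recipe above, language-agnostic generation of synthetic GEC examples, amounts to corrupting clean text and training the model to undo the corruption. The sketch below is purely illustrative: the specific noising operations and probabilities are assumptions, not the paper's actual corruption scheme.

```python
import random


def corrupt(sentence, rng, p=0.15):
    """Language-agnostic noising sketch: randomly drop or duplicate
    tokens to turn clean text into a synthetic 'ungrammatical' source.
    The clean sentence serves as the correction target. Works on any
    whitespace-tokenized language, which is what makes the approach
    language-agnostic.
    """
    out = []
    for tok in sentence.split():
        r = rng.random()
        if r < p:            # drop the token entirely
            continue
        elif r < 2 * p:      # duplicate the token
            out.extend([tok, tok])
        else:                # keep the token unchanged
            out.append(tok)
    return " ".join(out)


# Each (noisy, clean) pair becomes one synthetic training example.
rng = random.Random(0)
clean = "she goes to school every day"
pair = (corrupt(clean, rng), clean)
```

A real pipeline would apply such a procedure at scale to web text in each target language before the supervised fine-tuning stage.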
29. Felix: Flexible Text Editing Through Tagging and Insertion
- Author
-
Mallinson, Jonathan, Severyn, Aliaksei, Malmi, Eric, and Garrido, Guillermo
- Subjects
Computer Science - Computation and Language - Abstract
We present Felix - a flexible text-editing approach for generation, designed to derive the maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training. In contrast to conventional sequence-to-sequence (seq2seq) models, Felix is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model. Both of these models are chosen to be non-autoregressive to guarantee faster inference. Felix performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification.
- Published
- 2020
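The two-sub-task decomposition in the Felix abstract above can be illustrated with a toy decode: a tagger keeps tokens and opens mask slots, a pointer fixes the output order, and a masked-LM fills every slot in parallel. The tag vocabulary, function names, and the callable standing in for the masked language model are all assumptions for illustration.

```python
def felix_decode(source, tags, pointer_order, mlm_fill):
    """Toy Felix-style generation.

    tags: per-source-token decisions, e.g. "KEEP", "DELETE", or
          "KEEP|MASK" (keep the token, then open an insertion slot).
    pointer_order: permutation over the kept tokens and slots,
          standing in for the Pointer mechanism.
    mlm_fill: callable (position, tokens) -> token, standing in for a
          Masked Language Model that in-fills all slots in parallel
          (both stages are non-autoregressive in the real model).
    """
    kept = []
    for tok, tag in zip(source, tags):
        if tag == "KEEP":
            kept.append(tok)
        elif tag == "KEEP|MASK":
            kept.extend([tok, "[MASK]"])
        # "DELETE": drop the token entirely
    ordered = [kept[i] for i in pointer_order]
    return [mlm_fill(i, ordered) if t == "[MASK]" else t
            for i, t in enumerate(ordered)]


# Toy insertion: open one slot after "she" and fill it with "often".
result = felix_decode(["she", "sings"],
                      tags=["KEEP|MASK", "KEEP"],
                      pointer_order=[0, 1, 2],
                      mlm_fill=lambda i, toks: "often")
```

Because the MLM fills all masked slots in one parallel pass, there is no token-by-token decoding loop at all, which is the source of Felix's inference speed.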
30. Universal rewriting via machine translation
- Author
-
Mallinson, Jonathan, Lapata, Maria, and Sennrich, Rico
- Abstract
Natural language allows for the same meaning (semantics) to be expressed in multiple different ways, i.e. paraphrasing. This thesis examines automatic approaches for paraphrasing, focusing on three paraphrasing subtasks: unconstrained paraphrasing, where there are no constraints on the output; simplification, where the output must be simpler than the input; and text compression, where the output must be shorter than the input. Whilst we can learn paraphrasing from supervised data, this data is sparse and expensive to create. This thesis is concerned with the use of transfer learning to improve paraphrasing when there is no supervised data. In particular, we address the following question: can transfer learning be used to overcome a lack of paraphrasing data? To answer this question we split it into three subquestions: (1) No supervised data exists for a specific paraphrasing task; can bilingual data be used as a source of training data for paraphrasing? (2) Supervised paraphrasing data exists in one language but not in another; can bilingual data be used to transfer paraphrasing training data from one language to another? (3) Can the output of encoder-decoder paraphrasing models be controlled?
- Published
- 2021
- Full Text
- View/download PDF
31. Controllable Sentence Simplification: Employing Syntactic and Lexical Constraints
- Author
-
Mallinson, Jonathan and Lapata, Mirella
- Subjects
Computer Science - Computation and Language - Abstract
Sentence simplification aims to make sentences easier to read and understand. Recent approaches have shown promising results with sequence-to-sequence models which have been developed assuming homogeneous target audiences. In this paper we argue that different users have different simplification needs (e.g. dyslexics vs. non-native speakers), and propose CROSS, a ContROllable Sentence Simplification model, which allows us to control both the level of simplicity and the type of the simplification. We achieve this by enriching a Transformer-based architecture with syntactic and lexical constraints (which can be set or learned from data). Empirical results on two benchmark datasets show that constraints are key to successful simplification, offering flexible generation output.
- Published
- 2019
32. Learning to Paraphrase for Question Answering
- Author
-
Dong, Li, Mallinson, Jonathan, Reddy, Siva, and Lapata, Mirella
- Subjects
Computer Science - Computation and Language - Abstract
Question answering (QA) systems are sensitive to the many different ways natural language expresses the same information need. In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks. Our method is trained end-to-end using question-answer pairs as a supervision signal. A question and its paraphrases serve as input to a neural scoring model which assigns higher weights to linguistic expressions most likely to yield correct answers. We evaluate our approach on QA over Freebase and answer sentence selection. Experimental results on three datasets show that our framework consistently improves performance, achieving competitive results despite the use of simple QA models., Comment: EMNLP 2017
- Published
- 2017
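The weighting scheme in the abstract above, a scoring model that assigns higher weights to paraphrases likely to yield correct answers, can be sketched as a softmax over paraphrase scores used to marginalise per-paraphrase QA scores. All numbers and names below are illustrative, assuming the paraphrase scores and QA scores are already computed.

```python
import math


def weighted_answer_score(paraphrase_scores, qa_scores):
    """Toy version of the framework's weighting: normalise the scoring
    model's outputs over the question's paraphrases with a softmax,
    then combine the per-paraphrase QA scores under those weights.
    """
    # Numerically stable softmax over the paraphrase scores.
    m = max(paraphrase_scores)
    exps = [math.exp(s - m) for s in paraphrase_scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Expected QA score under the paraphrase distribution.
    return sum(w * q for w, q in zip(weights, qa_scores))


# Two paraphrases with equal scores contribute equally; boosting one
# paraphrase's score shifts the answer score toward its QA result.
uniform = weighted_answer_score([0.0, 0.0], [1.0, 0.0])
boosted = weighted_answer_score([10.0, 0.0], [1.0, 0.0])
```

Training end-to-end on question-answer pairs then pushes weight toward paraphrases whose QA scores are correct, without paraphrase-level supervision.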
33. Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext
- Author
-
Wieting, John, Mallinson, Jonathan, and Gimpel, Kevin
- Subjects
Computer Science - Computation and Language - Abstract
We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b). We use neural machine translation to generate sentential paraphrases via back-translation of bilingual sentence pairs. We evaluate the paraphrase pairs by their ability to serve as training data for learning paraphrastic sentence embeddings. We find that the data quality is stronger than prior work based on bitext and on par with manually-written English paraphrase pairs, with the advantage that our approach can scale up to generate large training sets for many languages and domains. We experiment with several language pairs and data sources, and develop a variety of data filtering techniques. In the process, we explore how neural machine translation output differs from human-written sentences, finding clear differences in length, the amount of repetition, and the use of rare words.
- Published
- 2017
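The back-translation procedure in the abstract above, pairing each English sentence in a bitext with a machine translation of its foreign side back into English, can be sketched as follows. The stub translation function is an assumption standing in for a real neural machine translation system.

```python
def backtranslate_pairs(bitext, translate_to_en):
    """Sketch of paraphrase mining via back-translation: for each
    (english, foreign) sentence pair, translate the foreign side back
    into English; the original English sentence and its back-translation
    form a sentential paraphrase pair for training embeddings.
    """
    pairs = []
    for english, foreign in bitext:
        back_translation = translate_to_en(foreign)
        # Skip trivial pairs where the NMT system reproduced the
        # reference exactly (no paraphrastic signal).
        if back_translation != english:
            pairs.append((english, back_translation))
    return pairs


# Toy stand-in for an NMT system (assumption for illustration only).
toy_nmt = {"le chat dormait": "the cat was sleeping"}.get
bitext = [("the cat slept", "le chat dormait")]
pairs = backtranslate_pairs(bitext, toy_nmt)
```

The appeal noted in the abstract is scale: this procedure needs only bitext and an NMT system, so it extends to many languages and domains without manually written paraphrases.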
34. William Moorcroft, Potter
- Author
-
Mallinson, Jonathan, primary
- Published
- 2023
- Full Text
- View/download PDF
35. Fast Text Generation with Text-Editing Models
- Author
-
Malmi, Eric, primary, Dong, Yue, additional, Mallinson, Jonathan, additional, Chuklin, Aleksandr, additional, Adamek, Jakub, additional, Mirylenka, Daniil, additional, Stahlberg, Felix, additional, Krause, Sebastian, additional, Kumar, Shankar, additional, and Severyn, Aliaksei, additional
- Published
- 2023
- Full Text
- View/download PDF
36. Teaching Small Language Models to Reason
- Author
-
Magister, Lucie Charlotte, primary, Mallinson, Jonathan, additional, Adamek, Jakub, additional, Malmi, Eric, additional, and Severyn, Aliaksei, additional
- Published
- 2023
- Full Text
- View/download PDF
37. RED-ACE: Robust Error Detection for ASR using Confidence Embeddings
- Author
-
Gekhman, Zorik, primary, Zverinski, Dina, additional, Mallinson, Jonathan, additional, and Beryozkin, Genady, additional
- Published
- 2022
- Full Text
- View/download PDF
38. Text Generation with Text-Editing Models
- Author
-
Malmi, Eric, primary, Dong, Yue, additional, Mallinson, Jonathan, additional, Chuklin, Aleksandr, additional, Adamek, Jakub, additional, Mirylenka, Daniil, additional, Stahlberg, Felix, additional, Krause, Sebastian, additional, Kumar, Shankar, additional, and Severyn, Aliaksei, additional
- Published
- 2022
- Full Text
- View/download PDF
39. EdiT5: Semi-Autoregressive Text Editing with T5 Warm-Start
- Author
-
Mallinson, Jonathan, primary, Adamek, Jakub, additional, Malmi, Eric, additional, and Severyn, Aliaksei, additional
- Published
- 2022
- Full Text
- View/download PDF
40. Attila, ou les tyrannies du tyran. [Attila, or the tyrant's tyrannies. In French]
- Author
-
Mallinson, Jonathan
- Published
- 1996
41. Zero-Shot Crosslingual Sentence Simplification
- Author
-
Mallinson, Jonathan, Sennrich, Rico; https://orcid.org/0000-0002-1438-4741, and Lapata, Mirella
- Abstract
Sentence simplification aims to make sentences easier to read and understand. Recent approaches have shown promising results with encoder-decoder models trained on large amounts of parallel data which often only exists in English. We propose a zero-shot modeling framework which transfers simplification knowledge from English to another language (for which no parallel simplification corpus exists) while generalizing across languages and tasks. A shared transformer encoder constructs language-agnostic representations, with a combination of task-specific encoder layers added on top (e.g., for translation and simplification). Empirical results using both human and automatic metrics show that our approach produces better simplifications than unsupervised and pivot-based methods.
- Published
- 2020
42. A Simple Recipe for Multilingual Grammatical Error Correction
- Author
-
Rothe, Sascha, primary, Mallinson, Jonathan, additional, Malmi, Eric, additional, Krause, Sebastian, additional, and Severyn, Aliaksei, additional
- Published
- 2021
- Full Text
- View/download PDF
43. PREFACE
- Author
-
Mallinson, Jonathan, primary
- Published
- 2010
- Full Text
- View/download PDF
44. FELIX: Flexible Text Editing Through Tagging and Insertion
- Author
-
Mallinson, Jonathan, primary, Severyn, Aliaksei, additional, Malmi, Eric, additional, and Garrido, Guillermo, additional
- Published
- 2020
- Full Text
- View/download PDF
45. Zero-Shot Crosslingual Sentence Simplification
- Author
-
Mallinson, Jonathan, primary, Sennrich, Rico, additional, and Lapata, Mirella, additional
- Published
- 2020
- Full Text
- View/download PDF
46. University of Edinburgh’s submission to the Document-level Generation and Translation Shared Task
- Author
-
Puduppully, Ratish, primary, Mallinson, Jonathan, additional, and Lapata, Mirella, additional
- Published
- 2019
- Full Text
- View/download PDF
47. Clelie: Histoire romaine
- Author
-
Mallinson, Jonathan
- Subjects
Clelie: Histoire romaine (Book) -- Book reviews, Books -- Book reviews, Literature/writing, Regional focus/area studies - Published
- 2005
48. Sentence Compression for Arbitrary Languages via Multilingual Pivoting
- Author
-
Mallinson, Jonathan, primary, Sennrich, Rico, additional, and Lapata, Mirella, additional
- Published
- 2018
- Full Text
- View/download PDF
49. Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext
- Author
-
Wieting, John, primary, Mallinson, Jonathan, additional, and Gimpel, Kevin, additional
- Published
- 2017
- Full Text
- View/download PDF
50. Learning to Paraphrase for Question Answering
- Author
-
Dong, Li, primary, Mallinson, Jonathan, additional, Reddy, Siva, additional, and Lapata, Mirella, additional
- Published
- 2017
- Full Text
- View/download PDF