Author: "Xu, Kelvin" / Publication Year Range: Last 10 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Kelvin"' showing total 41 results

1. Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries

Author: Vodrahalli, Kiran, Ontanon, Santiago, Tripuraneni, Nilesh, Xu, Kelvin, Jain, Sanil, Shivanna, Rakesh, Hui, Jeffrey, Dikkala, Nishanth, Kazemi, Mehran, Fatemi, Bahare, Anil, Rohan, Dyer, Ethan, Shakeri, Siamak, Vij, Roopali, Mehta, Harsh, Ramasesh, Vinay, Le, Quoc, Chi, Ed, Lu, Yifeng, Firat, Orhan, Lazaridou, Angeliki, Lespiau, Jean-Baptiste, Attaluri, Nithya, and Olszewska, Kate
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We introduce Michelangelo: a minimal, synthetic, and unleaked long-context reasoning evaluation for large language models which is also easy to automatically score. This evaluation is derived via a novel, unifying framework for evaluations over arbitrarily long contexts which measure the model's ability to do more than retrieve a single piece of information from its context. The central idea of the Latent Structure Queries framework (LSQ) is to construct tasks which require a model to "chisel away" the irrelevant information in the context, revealing a latent structure in the context. To verify a model's understanding of this latent structure, we query the model for details of the structure. Using LSQ, we produce three diagnostic long-context evaluations across code and natural-language domains intended to provide a stronger signal of long-context language model capabilities. We perform evaluations on several state-of-the-art models and demonstrate both that a) the proposed evaluations are high-signal and b) that there is significant room for improvement in synthesizing long-context information.
Published: 2024

2. Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Author: Hron, Jiri, Culp, Laura, Elsayed, Gamaleldin, Liu, Rosanne, Adlam, Ben, Bileschi, Maxwell, Bohnet, Bernd, Co-Reyes, JD, Fiedel, Noah, Freeman, C. Daniel, Gur, Izzeddin, Kenealy, Kathleen, Lee, Jaehoon, Liu, Peter J., Mishra, Gaurav, Mordatch, Igor, Nova, Azade, Novak, Roman, Parisi, Aaron, Pennington, Jeffrey, Rizkowsky, Alex, Simpson, Isabelle, Sedghi, Hanie, Sohl-dickstein, Jascha, Swersky, Kevin, Vikram, Sharad, Warkentin, Tris, Xiao, Lechao, Xu, Kelvin, Snoek, Jasper, and Kornblith, Simon
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: While many capabilities of language models (LMs) improve with increased training budget, the influence of scale on hallucinations is not yet fully understood. Hallucinations come in many forms, and there is no universally accepted definition. We thus focus on studying only those hallucinations where a correct answer appears verbatim in the training set. To fully control the training data content, we construct a knowledge graph (KG)-based dataset, and use it to train a set of increasingly large LMs. We find that for a fixed dataset, larger and longer-trained LMs hallucinate less. However, hallucinating on ≤5% of the training data requires an order of magnitude larger model, and thus an order of magnitude more compute, than Hoffmann et al. (2022) reported was optimal. Given this costliness, we study how hallucination detectors depend on scale. While we see detector size improves performance on fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
Comment: Published at COLM 2024. 16 pages, 11 figures
Published: 2024

3. Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Author: Snell, Charlie, Lee, Jaehoon, Xu, Kelvin, and Kumar, Aviral
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Enabling LLMs to improve their outputs by using more test-time computation is a critical step towards building generally self-improving agents that can operate on open-ended natural language. In this paper, we study the scaling of inference-time computation in LLMs, with a focus on answering the question: if an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt? Answering this question has implications not only on the achievable performance of LLMs, but also on the future of LLM pretraining and how one should tradeoff inference-time and pre-training compute. Despite its importance, little research attempted to understand the scaling behaviors of various test-time inference methods. Moreover, current work largely provides negative results for a number of these strategies. In this work, we analyze two primary mechanisms to scale test-time computation: (1) searching against dense, process-based verifier reward models; and (2) updating the model's distribution over a response adaptively, given the prompt at test time. We find that in both cases, the effectiveness of different approaches to scaling test-time compute critically varies depending on the difficulty of the prompt. This observation motivates applying a "compute-optimal" scaling strategy, which acts to most effectively allocate test-time compute adaptively per prompt. Using this compute-optimal strategy, we can improve the efficiency of test-time compute scaling by more than 4x compared to a best-of-N baseline. Additionally, in a FLOPs-matched evaluation, we find that on problems where a smaller base model attains somewhat non-trivial success rates, test-time compute can be used to outperform a 14x larger model.
Published: 2024

4. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Author: Gemini Team, Georgiev, Petko, Lei, Ving Ian, Burnell, Ryan, Bai, Libin, Gulati, Anmol, Tanzer, Garrett, Vincent, Damien, Pan, Zhufeng, Wang, Shibo, Mariooryad, Soroosh, Ding, Yifan, Geng, Xinyang, Alcober, Fred, Frostig, Roy, Omernick, Mark, Walker, Lexi, Paduraru, Cosmin, Sorokin, Christina, Tacchetti, Andrea, Gaffney, Colin, Daruki, Samira, Sercinoglu, Olcan, Gleicher, Zach, Love, Juliette, Voigtlaender, Paul, Jain, Rohan, Surita, Gabriela, Mohamed, Kareem, Blevins, Rory, Ahn, Junwhan, Zhu, Tao, Kawintiranon, Kornraphop, Firat, Orhan, Gu, Yiming, Zhang, Yujing, Rahtz, Matthew, Faruqui, Manaal, Clay, Natalie, Gilmer, Justin, Co-Reyes, JD, Penchev, Ivo, Zhu, Rui, Morioka, Nobuyuki, Hui, Kevin, Haridasan, Krishna, Campos, Victor, Mahdieh, Mahdis, Guo, Mandy, Hassan, Samer, Kilgour, Kevin, Vezer, Arpi, Cheng, Heng-Tze, de Liedekerke, Raoul, Goyal, Siddharth, Barham, Paul, Strouse, DJ, Noury, Seb, Adler, Jonas, Sundararajan, Mukund, Vikram, Sharad, Lepikhin, Dmitry, Paganini, Michela, Garcia, Xavier, Yang, Fan, Valter, Dasha, Trebacz, Maja, Vodrahalli, Kiran, Asawaroengchai, Chulayuth, Ring, Roman, Kalb, Norbert, Soares, Livio Baldini, Brahma, Siddhartha, Steiner, David, Yu, Tianhe, Mentzer, Fabian, He, Antoine, Gonzalez, Lucas, Xu, Bibo, Kaufman, Raphael Lopez, Shafey, Laurent El, Oh, Junhyuk, Hennigan, Tom, Driessche, George van den, Odoom, Seth, Lucic, Mario, Roelofs, Becca, Lall, Sid, Marathe, Amit, Chan, Betty, Ontanon, Santiago, He, Luheng, Teplyashin, Denis, Lai, Jonathan, Crone, Phil, Damoc, Bogdan, Ho, Lewis, Riedel, Sebastian, Lenc, Karel, Yeh, Chih-Kuan, Chowdhery, Aakanksha, Xu, Yang, Kazemi, Mehran, Amid, Ehsan, Petrushkina, Anastasia, Swersky, Kevin, Khodaei, Ali, Chen, Gowoon, Larkin, Chris, Pinto, Mario, Yan, Geng, Badia, Adria Puigdomenech, Patil, Piyush, Hansen, Steven, Orr, Dave, Arnold, Sebastien M. 
R., Grimstad, Jordan, Dai, Andrew, Douglas, Sholto, Sinha, Rishika, Yadav, Vikas, Chen, Xi, Gribovskaya, Elena, Austin, Jacob, Zhao, Jeffrey, Patel, Kaushal, Komarek, Paul, Austin, Sophia, Borgeaud, Sebastian, Friso, Linda, Goyal, Abhimanyu, Caine, Ben, Cao, Kris, Chung, Da-Woon, Lamm, Matthew, Barth-Maron, Gabe, Kagohara, Thais, Olszewska, Kate, Chen, Mia, Shivakumar, Kaushik, Agarwal, Rishabh, Godhia, Harshal, Rajwar, Ravi, Snaider, Javier, Dotiwalla, Xerxes, Liu, Yuan, Barua, Aditya, Ungureanu, Victor, Zhang, Yuan, Batsaikhan, Bat-Orgil, Wirth, Mateo, Qin, James, Danihelka, Ivo, Doshi, Tulsee, Chadwick, Martin, Chen, Jilin, Jain, Sanil, Le, Quoc, Kar, Arjun, Gurumurthy, Madhu, Li, Cheng, Sang, Ruoxin, Liu, Fangyu, Lamprou, Lampros, Munoz, Rich, Lintz, Nathan, Mehta, Harsh, Howard, Heidi, Reynolds, Malcolm, Aroyo, Lora, Wang, Quan, Blanco, Lorenzo, Cassirer, Albin, Griffith, Jordan, Das, Dipanjan, Lee, Stephan, Sygnowski, Jakub, Fisher, Zach, Besley, James, Powell, Richard, Ahmed, Zafarali, Paulus, Dominik, Reitter, David, Borsos, Zalan, Joshi, Rishabh, Pope, Aedan, Hand, Steven, Selo, Vittorio, Jain, Vihan, Sethi, Nikhil, Goel, Megha, Makino, Takaki, May, Rhys, Yang, Zhen, Schalkwyk, Johan, Butterfield, Christina, Hauth, Anja, Goldin, Alex, Hawkins, Will, Senter, Evan, Brin, Sergey, Woodman, Oliver, Ritter, Marvin, Noland, Eric, Giang, Minh, Bolina, Vijay, Lee, Lisa, Blyth, Tim, Mackinnon, Ian, Reid, Machel, Sarvana, Obaid, Silver, David, Chen, Alexander, Wang, Lily, Maggiore, Loren, Chang, Oscar, Attaluri, Nithya, Thornton, Gregory, Chiu, Chung-Cheng, Bunyan, Oskar, Levine, Nir, Chung, Timothy, Eltyshev, Evgenii, Si, Xiance, Lillicrap, Timothy, Brady, Demetra, Aggarwal, Vaibhav, Wu, Boxi, Xu, Yuanzhong, McIlroy, Ross, Badola, Kartikeya, Sandhu, Paramjit, Moreira, Erica, Stokowiec, Wojciech, Hemsley, Ross, Li, Dong, Tudor, Alex, Shyam, Pranav, Rahimtoroghi, Elahe, Haykal, Salem, Sprechmann, Pablo, Zhou, Xiang, Mincu, Diana, Li, Yujia, Addanki, Ravi, Krishna, 
Kalpesh, Wu, Xiao, Frechette, Alexandre, Eyal, Matan, Dafoe, Allan, Lacey, Dave, Whang, Jay, Avrahami, Thi, Zhang, Ye, Taropa, Emanuel, Lin, Hanzhao, Toyama, Daniel, Rutherford, Eliza, Sano, Motoki, Choe, HyunJeong, Tomala, Alex, Safranek-Shrader, Chalence, Kassner, Nora, Pajarskas, Mantas, Harvey, Matt, Sechrist, Sean, Fortunato, Meire, Lyu, Christina, Elsayed, Gamaleldin, Kuang, Chenkai, Lottes, James, Chu, Eric, Jia, Chao, Chen, Chih-Wei, Humphreys, Peter, Baumli, Kate, Tao, Connie, Samuel, Rajkumar, Santos, Cicero Nogueira dos, Andreassen, Anders, Rakićević, Nemanja, Grewe, Dominik, Kumar, Aviral, Winkler, Stephanie, Caton, Jonathan, Brock, Andrew, Dalmia, Sid, Sheahan, Hannah, Barr, Iain, Miao, Yingjie, Natsev, Paul, Devlin, Jacob, Behbahani, Feryal, Prost, Flavien, Sun, Yanhua, Myaskovsky, Artiom, Pillai, Thanumalayan Sankaranarayana, Hurt, Dan, Lazaridou, Angeliki, Xiong, Xi, Zheng, Ce, Pardo, Fabio, Li, Xiaowei, Horgan, Dan, Stanton, Joe, Ambar, Moran, Xia, Fei, Lince, Alejandro, Wang, Mingqiu, Mustafa, Basil, Webson, Albert, Lee, Hyo, Anil, Rohan, Wicke, Martin, Dozat, Timothy, Sinha, Abhishek, Piqueras, Enrique, Dabir, Elahe, Upadhyay, Shyam, Boral, Anudhyan, Hendricks, Lisa Anne, Fry, Corey, Djolonga, Josip, Su, Yi, Walker, Jake, Labanowski, Jane, Huang, Ronny, Misra, Vedant, Chen, Jeremy, Skerry-Ryan, RJ, Singh, Avi, Rijhwani, Shruti, Yu, Dian, Castro-Ros, Alex, Changpinyo, Beer, Datta, Romina, Bagri, Sumit, Hrafnkelsson, Arnar Mar, Maggioni, Marcello, Zheng, Daniel, Sulsky, Yury, Hou, Shaobo, Paine, Tom Le, Yang, Antoine, Riesa, Jason, Rogozinska, Dominika, Marcus, Dror, Badawy, Dalia El, Zhang, Qiao, Wang, Luyu, Miller, Helen, Greer, Jeremy, Sjos, Lars Lowe, Nova, Azade, Zen, Heiga, Chaabouni, Rahma, Rosca, Mihaela, Jiang, Jiepu, Chen, Charlie, Liu, Ruibo, Sainath, Tara, Krikun, Maxim, Polozov, Alex, Lespiau, Jean-Baptiste, Newlan, Josh, Cankara, Zeyncep, Kwak, Soo, Xu, Yunhan, Chen, Phil, Coenen, Andy, Meyer, Clemens, Tsihlas, Katerina, Ma, Ada, 
Gottweis, Juraj, Xing, Jinwei, Gu, Chenjie, Miao, Jin, Frank, Christian, Cankara, Zeynep, Ganapathy, Sanjay, Dasgupta, Ishita, Hughes-Fitt, Steph, Chen, Heng, Reid, David, Rong, Keran, Fan, Hongmin, van Amersfoort, Joost, Zhuang, Vincent, Cohen, Aaron, Gu, Shixiang Shane, Mohananey, Anhad, Ilic, Anastasija, Tobin, Taylor, Wieting, John, Bortsova, Anna, Thacker, Phoebe, Wang, Emma, Caveness, Emily, Chiu, Justin, Sezener, Eren, Kaskasoli, Alex, Baker, Steven, Millican, Katie, Elhawaty, Mohamed, Aisopos, Kostas, Lebsack, Carl, Byrd, Nathan, Dai, Hanjun, Jia, Wenhao, Wiethoff, Matthew, Davoodi, Elnaz, Weston, Albert, Yagati, Lakshman, Ahuja, Arun, Gao, Isabel, Pundak, Golan, Zhang, Susan, Azzam, Michael, Sim, Khe Chai, Caelles, Sergi, Keeling, James, Sharma, Abhanshu, Swing, Andy, Li, YaGuang, Liu, Chenxi, Bostock, Carrie Grimes, Bansal, Yamini, Nado, Zachary, Anand, Ankesh, Lipschultz, Josh, Karmarkar, Abhijit, Proleev, Lev, Ittycheriah, Abe, Yeganeh, Soheil Hassas, Polovets, George, Faust, Aleksandra, Sun, Jiao, Rrustemi, Alban, Li, Pen, Shivanna, Rakesh, Liu, Jeremiah, Welty, Chris, Lebron, Federico, Baddepudi, Anirudh, Krause, Sebastian, Parisotto, Emilio, Soricut, Radu, Xu, Zheng, Bloxwich, Dawn, Johnson, Melvin, Neyshabur, Behnam, Mao-Jones, Justin, Wang, Renshen, Ramasesh, Vinay, Abbas, Zaheer, Guez, Arthur, Segal, Constant, Nguyen, Duc Dung, Svensson, James, Hou, Le, York, Sarah, Milan, Kieran, Bridgers, Sophie, Gworek, Wiktor, Tagliasacchi, Marco, Lee-Thorp, James, Chang, Michael, Guseynov, Alexey, Hartman, Ale Jakse, Kwong, Michael, Zhao, Ruizhe, Kashem, Sheleem, Cole, Elizabeth, Miech, Antoine, Tanburn, Richard, Phuong, Mary, Pavetic, Filip, Cevey, Sebastien, Comanescu, Ramona, Ives, Richard, Yang, Sherry, Du, Cosmo, Li, Bo, Zhang, Zizhao, Iinuma, Mariko, Hu, Clara Huiyi, Roy, Aurko, Bijwadia, Shaan, Zhu, Zhenkai, Martins, Danilo, Saputro, Rachel, Gergely, Anita, Zheng, Steven, Jia, Dawei, Antonoglou, Ioannis, Sadovsky, Adam, Gu, Shane, Bi, Yingying, 
Andreev, Alek, Samangooei, Sina, Khan, Mina, Kocisky, Tomas, Filos, Angelos, Kumar, Chintu, Bishop, Colton, Yu, Adams, Hodkinson, Sarah, Mittal, Sid, Shah, Premal, Moufarek, Alexandre, Cheng, Yong, Bloniarz, Adam, Lee, Jaehoon, Pejman, Pedram, Michel, Paul, Spencer, Stephen, Feinberg, Vladimir, Xiong, Xuehan, Savinov, Nikolay, Smith, Charlotte, Shakeri, Siamak, Tran, Dustin, Chesus, Mary, Bohnet, Bernd, Tucker, George, von Glehn, Tamara, Muir, Carrie, Mao, Yiran, Kazawa, Hideto, Slone, Ambrose, Soparkar, Kedar, Shrivastava, Disha, Cobon-Kerr, James, Sharman, Michael, Pavagadhi, Jay, Araya, Carlos, Misiunas, Karolis, Ghelani, Nimesh, Laskin, Michael, Barker, David, Li, Qiujia, Briukhov, Anton, Houlsby, Neil, Glaese, Mia, Lakshminarayanan, Balaji, Schucher, Nathan, Tang, Yunhao, Collins, Eli, Lim, Hyeontaek, Feng, Fangxiaoyu, Recasens, Adria, Lai, Guangda, Magni, Alberto, De Cao, Nicola, Siddhant, Aditya, Ashwood, Zoe, Orbay, Jordi, Dehghani, Mostafa, Brennan, Jenny, He, Yifan, Xu, Kelvin, Gao, Yang, Saroufim, Carl, Molloy, James, Wu, Xinyi, Arnold, Seb, Chang, Solomon, Schrittwieser, Julian, Buchatskaya, Elena, Radpour, Soroush, Polacek, Martin, Giordano, Skye, Bapna, Ankur, Tokumine, Simon, Hellendoorn, Vincent, Sottiaux, Thibault, Cogan, Sarah, Severyn, Aliaksei, Saleh, Mohammad, Thakoor, Shantanu, Shefey, Laurent, Qiao, Siyuan, Gaba, Meenu, Chang, Shuo-yiin, Swanson, Craig, Zhang, Biao, Lee, Benjamin, Rubenstein, Paul Kishan, Song, Gan, Kwiatkowski, Tom, Koop, Anna, Kannan, Ajay, Kao, David, Schuh, Parker, Stjerngren, Axel, Ghiasi, Golnaz, Gibson, Gena, Vilnis, Luke, Yuan, Ye, Ferreira, Felipe Tiengo, Kamath, Aishwarya, Klimenko, Ted, Franko, Ken, Xiao, Kefan, Bhattacharya, Indro, Patel, Miteyan, Wang, Rui, Morris, Alex, Strudel, Robin, Sharma, Vivek, Choy, Peter, Hashemi, Sayed Hadi, Landon, Jessica, Finkelstein, Mara, Jhakra, Priya, Frye, Justin, Barnes, Megan, Mauger, Matthew, Daun, Dennis, Baatarsukh, Khuslen, Tung, Matthew, Farhan, Wael, Michalewski, Henryk, 
Viola, Fabio, Quitry, Felix de Chaumont, Lan, Charline Le, Hudson, Tom, Wang, Qingze, Fischer, Felix, Zheng, Ivy, White, Elspeth, Dragan, Anca, Alayrac, Jean-baptiste, Ni, Eric, Pritzel, Alexander, Iwanicki, Adam, Isard, Michael, Bulanova, Anna, Zilka, Lukas, Dyer, Ethan, Sachan, Devendra, Srinivasan, Srivatsan, Muckenhirn, Hannah, Cai, Honglong, Mandhane, Amol, Tariq, Mukarram, Rae, Jack W., Wang, Gary, Ayoub, Kareem, FitzGerald, Nicholas, Zhao, Yao, Han, Woohyun, Alberti, Chris, Garrette, Dan, Krishnakumar, Kashyap, Gimenez, Mai, Levskaya, Anselm, Sohn, Daniel, Matak, Josip, Iturrate, Inaki, Chang, Michael B., Xiang, Jackie, Cao, Yuan, Ranka, Nishant, Brown, Geoff, Hutter, Adrian, Mirrokni, Vahab, Chen, Nanxin, Yao, Kaisheng, Egyed, Zoltan, Galilee, Francois, Liechty, Tyler, Kallakuri, Praveen, Palmer, Evan, Ghemawat, Sanjay, Liu, Jasmine, Tao, David, Thornton, Chloe, Green, Tim, Jasarevic, Mimi, Lin, Sharon, Cotruta, Victor, Tan, Yi-Xuan, Fiedel, Noah, Yu, Hongkun, Chi, Ed, Neitz, Alexander, Heitkaemper, Jens, Sinha, Anu, Zhou, Denny, Sun, Yi, Kaed, Charbel, Hulse, Brice, Mishra, Swaroop, Georgaki, Maria, Kudugunta, Sneha, Farabet, Clement, Shafran, Izhak, Vlasic, Daniel, Tsitsulin, Anton, Ananthanarayanan, Rajagopal, Carin, Alen, Su, Guolong, Sun, Pei, V, Shashank, Carvajal, Gabriel, Broder, Josef, Comsa, Iulia, Repina, Alena, Wong, William, Chen, Warren Weilun, Hawkins, Peter, Filonov, Egor, Loher, Lucia, Hirnschall, Christoph, Wang, Weiyi, Ye, Jingchen, Burns, Andrea, Cate, Hardie, Wright, Diana Gage, Piccinini, Federico, Zhang, Lei, Lin, Chu-Cheng, Gog, Ionel, Kulizhskaya, Yana, Sreevatsa, Ashwin, Song, Shuang, Cobo, Luis C., Iyer, Anand, Tekur, Chetan, Garrido, Guillermo, Xiao, Zhuyun, Kemp, Rupert, Zheng, Huaixiu Steven, Li, Hui, Agarwal, Ananth, Ngani, Christel, Goshvadi, Kati, Santamaria-Fernandez, Rebeca, Fica, Wojciech, Chen, Xinyun, Gorgolewski, Chris, Sun, Sean, Garg, Roopal, Ye, Xinyu, Eslami, S. M. 
Ali, Hua, Nan, Simon, Jon, Joshi, Pratik, Kim, Yelin, Tenney, Ian, Potluri, Sahitya, Thiet, Lam Nguyen, Yuan, Quan, Luisier, Florian, Chronopoulou, Alexandra, Scellato, Salvatore, Srinivasan, Praveen, Chen, Minmin, Koverkathu, Vinod, Dalibard, Valentin, Xu, Yaming, Saeta, Brennan, Anderson, Keith, Sellam, Thibault, Fernando, Nick, Huot, Fantine, Jung, Junehyuk, Varadarajan, Mani, Quinn, Michael, Raul, Amit, Le, Maigo, Habalov, Ruslan, Clark, Jon, Jalan, Komal, Bullard, Kalesha, Singhal, Achintya, Luong, Thang, Wang, Boyu, Rajayogam, Sujeevan, Eisenschlos, Julian, Jia, Johnson, Finchelstein, Daniel, Yakubovich, Alex, Balle, Daniel, Fink, Michael, Agarwal, Sameer, Li, Jing, Dvijotham, Dj, Pal, Shalini, Kang, Kai, Konzelmann, Jaclyn, Beattie, Jennifer, Dousse, Olivier, Wu, Diane, Crocker, Remi, Elkind, Chen, Jonnalagadda, Siddhartha Reddy, Lee, Jong, Holtmann-Rice, Dan, Kallarackal, Krystal, Liu, Rosanne, Vnukov, Denis, Vats, Neera, Invernizzi, Luca, Jafari, Mohsen, Zhou, Huanjie, Taylor, Lilly, Prendki, Jennifer, Wu, Marcus, Eccles, Tom, Liu, Tianqi, Kopparapu, Kavya, Beaufays, Francoise, Angermueller, Christof, Marzoca, Andreea, Sarcar, Shourya, Dib, Hilal, Stanway, Jeff, Perbet, Frank, Trdin, Nejc, Sterneck, Rachel, Khorlin, Andrey, Li, Dinghua, Wu, Xihui, Goenka, Sonam, Madras, David, Goldshtein, Sasha, Gierke, Willi, Zhou, Tong, Liu, Yaxin, Liang, Yannie, White, Anais, Li, Yunjie, Singh, Shreya, Bahargam, Sanaz, Epstein, Mark, Basu, Sujoy, Lao, Li, Ozturel, Adnan, Crous, Carl, Zhai, Alex, Lu, Han, Tung, Zora, Gaur, Neeraj, Walton, Alanna, Dixon, Lucas, Zhang, Ming, Globerson, Amir, Uy, Grant, Bolt, Andrew, Wiles, Olivia, Nasr, Milad, Shumailov, Ilia, Selvi, Marco, Piccinno, Francesco, Aguilar, Ricardo, McCarthy, Sara, Khalman, Misha, Shukla, Mrinal, Galic, Vlado, Carpenter, John, Villela, Kevin, Zhang, Haibin, Richardson, Harry, Martens, James, Bosnjak, Matko, Belle, Shreyas Rammohan, Seibert, Jeff, Alnahlawi, Mahmoud, McWilliams, Brian, Singh, Sankalp, Louis, 
Annie, Ding, Wen, Popovici, Dan, Simicich, Lenin, Knight, Laura, Mehta, Pulkit, Gupta, Nishesh, Shi, Chongyang, Fatehi, Saaber, Mitrovic, Jovana, Grills, Alex, Pagadora, Joseph, Petrova, Dessie, Eisenbud, Danielle, Zhang, Zhishuai, Yates, Damion, Mittal, Bhavishya, Tripuraneni, Nilesh, Assael, Yannis, Brovelli, Thomas, Jain, Prateek, Velimirovic, Mihajlo, Akbulut, Canfer, Mu, Jiaqi, Macherey, Wolfgang, Kumar, Ravin, Xu, Jun, Qureshi, Haroon, Comanici, Gheorghe, Wiesner, Jeremy, Gong, Zhitao, Ruddock, Anton, Bauer, Matthias, Felt, Nick, GP, Anirudh, Arnab, Anurag, Zelle, Dustin, Rothfuss, Jonas, Rosgen, Bill, Shenoy, Ashish, Seybold, Bryan, Li, Xinjian, Mudigonda, Jayaram, Erdogan, Goker, Xia, Jiawei, Simsa, Jiri, Michi, Andrea, Yao, Yi, Yew, Christopher, Kan, Steven, Caswell, Isaac, Radebaugh, Carey, Elisseeff, Andre, Valenzuela, Pedro, McKinney, Kay, Paterson, Kim, Cui, Albert, Latorre-Chimoto, Eri, Kim, Solomon, Zeng, William, Durden, Ken, Ponnapalli, Priya, Sosea, Tiberiu, Choquette-Choo, Christopher A., Manyika, James, Robenek, Brona, Vashisht, Harsha, Pereira, Sebastien, Lam, Hoi, Velic, Marko, Owusu-Afriyie, Denese, Lee, Katherine, Bolukbasi, Tolga, Parrish, Alicia, Lu, Shawn, Park, Jane, Venkatraman, Balaji, Talbert, Alice, Rosique, Lambert, Cheng, Yuchung, Sozanschi, Andrei, Paszke, Adam, Kumar, Praveen, Austin, Jessica, Li, Lu, Salama, Khalid, Kim, Wooyeol, Dukkipati, Nandita, Baryshnikov, Anthony, Kaplanis, Christos, Sheng, XiangHai, Chervonyi, Yuri, Unlu, Caglar, Casas, Diego de Las, Askham, Harry, Tunyasuvunakool, Kathryn, Gimeno, Felix, Poder, Siim, Kwak, Chester, Miecnikowski, Matt, Dimitriev, Alek, Parisi, Aaron, Liu, Dangyi, Tsai, Tomy, Shevlane, Toby, Kouridi, Christina, Garmon, Drew, Goedeckemeyer, Adrian, Brown, Adam R., Vijayakumar, Anitha, Elqursh, Ali, Jazayeri, Sadegh, Huang, Jin, Carthy, Sara Mc, Hoover, Jay, Kim, Lucy, Kumar, Sandeep, Chen, Wei, Biles, Courtney, Bingham, Garrett, Rosen, Evan, Wang, Lisa, Tan, Qijun, Engel, David, Pongetti, 
Francesco, de Cesare, Dario, Hwang, Dongseong, Yu, Lily, Pullman, Jennifer, Narayanan, Srini, Levin, Kyle, Gopal, Siddharth, Li, Megan, Aharoni, Asaf, Trinh, Trieu, Lo, Jessica, Casagrande, Norman, Vij, Roopali, Matthey, Loic, Ramadhana, Bramandia, Matthews, Austin, Carey, CJ, Johnson, Matthew, Goranova, Kremena, Shah, Rohin, Ashraf, Shereen, Dasgupta, Kingshuk, Larsen, Rasmus, Wang, Yicheng, Vuyyuru, Manish Reddy, Jiang, Chong, Ijazi, Joana, Osawa, Kazuki, Smith, Celine, Boppana, Ramya Sree, Bilal, Taylan, Koizumi, Yuma, Xu, Ying, Altun, Yasemin, Shabat, Nir, Bariach, Ben, Korchemniy, Alex, Choo, Kiam, Ronneberger, Olaf, Iwuanyanwu, Chimezie, Zhao, Shubin, Soergel, David, Hsieh, Cho-Jui, Cai, Irene, Iqbal, Shariq, Sundermeyer, Martin, Chen, Zhe, Bursztein, Elie, Malaviya, Chaitanya, Biadsy, Fadi, Shroff, Prakash, Dhillon, Inderjit, Latkar, Tejasi, Dyer, Chris, Forbes, Hannah, Nicosia, Massimo, Nikolaev, Vitaly, Greene, Somer, Georgiev, Marin, Wang, Pidong, Martin, Nina, Sedghi, Hanie, Zhang, John, Banzal, Praseem, Fritz, Doug, Rao, Vikram, Wang, Xuezhi, Zhang, Jiageng, Patraucean, Viorica, Du, Dayou, Mordatch, Igor, Jurin, Ivan, Liu, Lewis, Dubey, Ayush, Mohan, Abhi, Nowakowski, Janek, Ion, Vlad-Doru, Wei, Nan, Tojo, Reiko, Raad, Maria Abi, Hudson, Drew A., Keshava, Vaishakh, Agrawal, Shubham, Ramirez, Kevin, Wu, Zhichun, Nguyen, Hoang, Liu, Ji, Sewak, Madhavi, Petrini, Bryce, Choi, DongHyun, Philips, Ivan, Wang, Ziyue, Bica, Ioana, Garg, Ankush, Wilkiewicz, Jarek, Agrawal, Priyanka, Guo, Danhao, Xue, Emily, Shaik, Naseer, Leach, Andrew, Khan, Sadh MNM, Wiesinger, Julia, Jerome, Sammy, Chakladar, Abhishek, Wang, Alek Wenjiao, Ornduff, Tina, Abu, Folake, Ghaffarkhah, Alireza, Wainwright, Marcus, Cortes, Mario, Liu, Frederick, Maynez, Joshua, Terzis, Andreas, Samangouei, Pouya, Mansour, Riham, Kępa, Tomasz, Aubet, François-Xavier, Algymr, Anton, Banica, Dan, Weisz, Agoston, Orban, Andras, Senges, Alexandre, Andrejczuk, Ewa, Geller, Mark, Santo, Niccolo Dal, Anklin, 
Valentin, Merey, Majd Al, Baeuml, Martin, Strohman, Trevor, Bai, Junwen, Petrov, Slav, Wu, Yonghui, Hassabis, Demis, Kavukcuoglu, Koray, Dean, Jeffrey, and Vinyals, Oriol
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Published: 2024

5. Gemini: A Family of Highly Capable Multimodal Models

Author: Gemini Team, Anil, Rohan, Borgeaud, Sebastian, Alayrac, Jean-Baptiste, Yu, Jiahui, Soricut, Radu, Schalkwyk, Johan, Dai, Andrew M., Hauth, Anja, Millican, Katie, Silver, David, Johnson, Melvin, Antonoglou, Ioannis, Schrittwieser, Julian, Glaese, Amelia, Chen, Jilin, Pitler, Emily, Lillicrap, Timothy, Lazaridou, Angeliki, Firat, Orhan, Molloy, James, Isard, Michael, Barham, Paul R., Hennigan, Tom, Lee, Benjamin, Viola, Fabio, Reynolds, Malcolm, Xu, Yuanzhong, Doherty, Ryan, Collins, Eli, Meyer, Clemens, Rutherford, Eliza, Moreira, Erica, Ayoub, Kareem, Goel, Megha, Krawczyk, Jack, Du, Cosmo, Chi, Ed, Cheng, Heng-Tze, Ni, Eric, Shah, Purvi, Kane, Patrick, Chan, Betty, Faruqui, Manaal, Severyn, Aliaksei, Lin, Hanzhao, Li, YaGuang, Cheng, Yong, Ittycheriah, Abe, Mahdieh, Mahdis, Chen, Mia, Sun, Pei, Tran, Dustin, Bagri, Sumit, Lakshminarayanan, Balaji, Liu, Jeremiah, Orban, Andras, Güra, Fabian, Zhou, Hao, Song, Xinying, Boffy, Aurelien, Ganapathy, Harish, Zheng, Steven, Choe, HyunJeong, Weisz, Ágoston, Zhu, Tao, Lu, Yifeng, Gopal, Siddharth, Kahn, Jarrod, Kula, Maciej, Pitman, Jeff, Shah, Rushin, Taropa, Emanuel, Merey, Majd Al, Baeuml, Martin, Chen, Zhifeng, Shafey, Laurent El, Zhang, Yujing, Sercinoglu, Olcan, Tucker, George, Piqueras, Enrique, Krikun, Maxim, Barr, Iain, Savinov, Nikolay, Danihelka, Ivo, Roelofs, Becca, White, Anaïs, Andreassen, Anders, von Glehn, Tamara, Yagati, Lakshman, Kazemi, Mehran, Gonzalez, Lucas, Khalman, Misha, Sygnowski, Jakub, Frechette, Alexandre, Smith, Charlotte, Culp, Laura, Proleev, Lev, Luan, Yi, Chen, Xi, Lottes, James, Schucher, Nathan, Lebron, Federico, Rrustemi, Alban, Clay, Natalie, Crone, Phil, Kocisky, Tomas, Zhao, Jeffrey, Perz, Bartek, Yu, Dian, Howard, Heidi, Bloniarz, Adam, Rae, Jack W., Lu, Han, Sifre, Laurent, Maggioni, Marcello, Alcober, Fred, Garrette, Dan, Barnes, Megan, Thakoor, Shantanu, Austin, Jacob, Barth-Maron, Gabriel, Wong, William, Joshi, Rishabh, Chaabouni, Rahma, Fatiha, Deeni, Ahuja, Arun, Tomar, 
Gaurav Singh, Senter, Evan, Chadwick, Martin, Kornakov, Ilya, Attaluri, Nithya, Iturrate, Iñaki, Liu, Ruibo, Li, Yunxuan, Cogan, Sarah, Chen, Jeremy, Jia, Chao, Gu, Chenjie, Zhang, Qiao, Grimstad, Jordan, Hartman, Ale Jakse, Garcia, Xavier, Pillai, Thanumalayan Sankaranarayana, Devlin, Jacob, Laskin, Michael, Casas, Diego de Las, Valter, Dasha, Tao, Connie, Blanco, Lorenzo, Badia, Adrià Puigdomènech, Reitter, David, Chen, Mianna, Brennan, Jenny, Rivera, Clara, Brin, Sergey, Iqbal, Shariq, Surita, Gabriela, Labanowski, Jane, Rao, Abhi, Winkler, Stephanie, Parisotto, Emilio, Gu, Yiming, Olszewska, Kate, Addanki, Ravi, Miech, Antoine, Louis, Annie, Teplyashin, Denis, Brown, Geoff, Catt, Elliot, Balaguer, Jan, Xiang, Jackie, Wang, Pidong, Ashwood, Zoe, Briukhov, Anton, Webson, Albert, Ganapathy, Sanjay, Sanghavi, Smit, Kannan, Ajay, Chang, Ming-Wei, Stjerngren, Axel, Djolonga, Josip, Sun, Yuting, Bapna, Ankur, Aitchison, Matthew, Pejman, Pedram, Michalewski, Henryk, Yu, Tianhe, Wang, Cindy, Love, Juliette, Ahn, Junwhan, Bloxwich, Dawn, Han, Kehang, Humphreys, Peter, Sellam, Thibault, Bradbury, James, Godbole, Varun, Samangooei, Sina, Damoc, Bogdan, Kaskasoli, Alex, Arnold, Sébastien M. 
R., Vasudevan, Vijay, Agrawal, Shubham, Riesa, Jason, Lepikhin, Dmitry, Tanburn, Richard, Srinivasan, Srivatsan, Lim, Hyeontaek, Hodkinson, Sarah, Shyam, Pranav, Ferret, Johan, Hand, Steven, Garg, Ankush, Paine, Tom Le, Li, Jian, Li, Yujia, Giang, Minh, Neitz, Alexander, Abbas, Zaheer, York, Sarah, Reid, Machel, Cole, Elizabeth, Chowdhery, Aakanksha, Das, Dipanjan, Rogozińska, Dominika, Nikolaev, Vitaliy, Sprechmann, Pablo, Nado, Zachary, Zilka, Lukas, Prost, Flavien, He, Luheng, Monteiro, Marianne, Mishra, Gaurav, Welty, Chris, Newlan, Josh, Jia, Dawei, Allamanis, Miltiadis, Hu, Clara Huiyi, de Liedekerke, Raoul, Gilmer, Justin, Saroufim, Carl, Rijhwani, Shruti, Hou, Shaobo, Shrivastava, Disha, Baddepudi, Anirudh, Goldin, Alex, Ozturel, Adnan, Cassirer, Albin, Xu, Yunhan, Sohn, Daniel, Sachan, Devendra, Amplayo, Reinald Kim, Swanson, Craig, Petrova, Dessie, Narayan, Shashi, Guez, Arthur, Brahma, Siddhartha, Landon, Jessica, Patel, Miteyan, Zhao, Ruizhe, Villela, Kevin, Wang, Luyu, Jia, Wenhao, Rahtz, Matthew, Giménez, Mai, Yeung, Legg, Keeling, James, Georgiev, Petko, Mincu, Diana, Wu, Boxi, Haykal, Salem, Saputro, Rachel, Vodrahalli, Kiran, Qin, James, Cankara, Zeynep, Sharma, Abhanshu, Fernando, Nick, Hawkins, Will, Neyshabur, Behnam, Kim, Solomon, Hutter, Adrian, Agrawal, Priyanka, Castro-Ros, Alex, Driessche, George van den, Wang, Tao, Yang, Fan, Chang, Shuo-yiin, Komarek, Paul, McIlroy, Ross, Lučić, Mario, Zhang, Guodong, Farhan, Wael, Sharman, Michael, Natsev, Paul, Michel, Paul, Bansal, Yamini, Qiao, Siyuan, Cao, Kris, Shakeri, Siamak, Butterfield, Christina, Chung, Justin, Rubenstein, Paul Kishan, Agrawal, Shivani, Mensch, Arthur, Soparkar, Kedar, Lenc, Karel, Chung, Timothy, Pope, Aedan, Maggiore, Loren, Kay, Jackie, Jhakra, Priya, Wang, Shibo, Maynez, Joshua, Phuong, Mary, Tobin, Taylor, Tacchetti, Andrea, Trebacz, Maja, Robinson, Kevin, Katariya, Yash, Riedel, Sebastian, Bailey, Paige, Xiao, Kefan, Ghelani, Nimesh, Aroyo, Lora, Slone, Ambrose, Houlsby, 
Neil, Xiong, Xuehan, Yang, Zhen, Gribovskaya, Elena, Adler, Jonas, Wirth, Mateo, Lee, Lisa, Li, Music, Kagohara, Thais, Pavagadhi, Jay, Bridgers, Sophie, Bortsova, Anna, Ghemawat, Sanjay, Ahmed, Zafarali, Liu, Tianqi, Powell, Richard, Bolina, Vijay, Iinuma, Mariko, Zablotskaia, Polina, Besley, James, Chung, Da-Woon, Dozat, Timothy, Comanescu, Ramona, Si, Xiance, Greer, Jeremy, Su, Guolong, Polacek, Martin, Kaufman, Raphaël Lopez, Tokumine, Simon, Hu, Hexiang, Buchatskaya, Elena, Miao, Yingjie, Elhawaty, Mohamed, Siddhant, Aditya, Tomasev, Nenad, Xing, Jinwei, Greer, Christina, Miller, Helen, Ashraf, Shereen, Roy, Aurko, Zhang, Zizhao, Ma, Ada, Filos, Angelos, Besta, Milos, Blevins, Rory, Klimenko, Ted, Yeh, Chih-Kuan, Changpinyo, Soravit, Mu, Jiaqi, Chang, Oscar, Pajarskas, Mantas, Muir, Carrie, Cohen, Vered, Lan, Charline Le, Haridasan, Krishna, Marathe, Amit, Hansen, Steven, Douglas, Sholto, Samuel, Rajkumar, Wang, Mingqiu, Austin, Sophia, Lan, Chang, Jiang, Jiepu, Chiu, Justin, Lorenzo, Jaime Alonso, Sjösund, Lars Lowe, Cevey, Sébastien, Gleicher, Zach, Avrahami, Thi, Boral, Anudhyan, Srinivasan, Hansa, Selo, Vittorio, May, Rhys, Aisopos, Konstantinos, Hussenot, Léonard, Soares, Livio Baldini, Baumli, Kate, Chang, Michael B., Recasens, Adrià, Caine, Ben, Pritzel, Alexander, Pavetic, Filip, Pardo, Fabio, Gergely, Anita, Frye, Justin, Ramasesh, Vinay, Horgan, Dan, Badola, Kartikeya, Kassner, Nora, Roy, Subhrajit, Dyer, Ethan, Campos, Víctor Campos, Tomala, Alex, Tang, Yunhao, Badawy, Dalia El, White, Elspeth, Mustafa, Basil, Lang, Oran, Jindal, Abhishek, Vikram, Sharad, Gong, Zhitao, Caelles, Sergi, Hemsley, Ross, Thornton, Gregory, Feng, Fangxiaoyu, Stokowiec, Wojciech, Zheng, Ce, Thacker, Phoebe, Ünlü, Çağlar, Zhang, Zhishuai, Saleh, Mohammad, Svensson, James, Bileschi, Max, Patil, Piyush, Anand, Ankesh, Ring, Roman, Tsihlas, Katerina, Vezer, Arpi, Selvi, Marco, Shevlane, Toby, Rodriguez, Mikel, Kwiatkowski, Tom, Daruki, Samira, Rong, Keran, Dafoe, Allan, 
FitzGerald, Nicholas, Gu-Lemberg, Keren, Khan, Mina, Hendricks, Lisa Anne, Pellat, Marie, Feinberg, Vladimir, Cobon-Kerr, James, Sainath, Tara, Rauh, Maribeth, Hashemi, Sayed Hadi, Ives, Richard, Hasson, Yana, Noland, Eric, Cao, Yuan, Byrd, Nathan, Hou, Le, Wang, Qingze, Sottiaux, Thibault, Paganini, Michela, Lespiau, Jean-Baptiste, Moufarek, Alexandre, Hassan, Samer, Shivakumar, Kaushik, van Amersfoort, Joost, Mandhane, Amol, Joshi, Pratik, Goyal, Anirudh, Tung, Matthew, Brock, Andrew, Sheahan, Hannah, Misra, Vedant, Li, Cheng, Rakićević, Nemanja, Dehghani, Mostafa, Liu, Fangyu, Mittal, Sid, Oh, Junhyuk, Noury, Seb, Sezener, Eren, Huot, Fantine, Lamm, Matthew, De Cao, Nicola, Chen, Charlie, Mudgal, Sidharth, Stella, Romina, Brooks, Kevin, Vasudevan, Gautam, Liu, Chenxi, Chain, Mainak, Melinkeri, Nivedita, Cohen, Aaron, Wang, Venus, Seymore, Kristie, Zubkov, Sergey, Goel, Rahul, Yue, Summer, Krishnakumaran, Sai, Albert, Brian, Hurley, Nate, Sano, Motoki, Mohananey, Anhad, Joughin, Jonah, Filonov, Egor, Kępa, Tomasz, Eldawy, Yomna, Lim, Jiawern, Rishi, Rahul, Badiezadegan, Shirin, Bos, Taylor, Chang, Jerry, Jain, Sanil, Padmanabhan, Sri Gayatri Sundara, Puttagunta, Subha, Krishna, Kalpesh, Baker, Leslie, Kalb, Norbert, Bedapudi, Vamsi, Kurzrok, Adam, Lei, Shuntong, Yu, Anthony, Litvin, Oren, Zhou, Xiang, Wu, Zhichun, Sobell, Sam, Siciliano, Andrea, Papir, Alan, Neale, Robby, Bragagnolo, Jonas, Toor, Tej, Chen, Tina, Anklin, Valentin, Wang, Feiran, Feng, Richie, Gholami, Milad, Ling, Kevin, Liu, Lijuan, Walter, Jules, Moghaddam, Hamid, Kishore, Arun, Adamek, Jakub, Mercado, Tyler, Mallinson, Jonathan, Wandekar, Siddhinita, Cagle, Stephen, Ofek, Eran, Garrido, Guillermo, Lombriser, Clemens, Mukha, Maksim, Sun, Botu, Mohammad, Hafeezul Rahman, Matak, Josip, Qian, Yadi, Peswani, Vikas, Janus, Pawel, Yuan, Quan, Schelin, Leif, David, Oana, Garg, Ankur, He, Yifan, Duzhyi, Oleksii, Älgmyr, Anton, Lottaz, Timothée, Li, Qi, Yadav, Vikas, Xu, Luyao, Chinien, Alex, Shivanna, 
Rakesh, Chuklin, Aleksandr, Li, Josie, Spadine, Carrie, Wolfe, Travis, Mohamed, Kareem, Das, Subhabrata, Dai, Zihang, He, Kyle, von Dincklage, Daniel, Upadhyay, Shyam, Maurya, Akanksha, Chi, Luyan, Krause, Sebastian, Salama, Khalid, Rabinovitch, Pam G, M, Pavan Kumar Reddy, Selvan, Aarush, Dektiarev, Mikhail, Ghiasi, Golnaz, Guven, Erdem, Gupta, Himanshu, Liu, Boyi, Sharma, Deepak, Shtacher, Idan Heimlich, Paul, Shachi, Akerlund, Oscar, Aubet, François-Xavier, Huang, Terry, Zhu, Chen, Zhu, Eric, Teixeira, Elico, Fritze, Matthew, Bertolini, Francesco, Marinescu, Liana-Eleonora, Bölle, Martin, Paulus, Dominik, Gupta, Khyatti, Latkar, Tejasi, Chang, Max, Sanders, Jason, Wilson, Roopa, Wu, Xuewei, Tan, Yi-Xuan, Thiet, Lam Nguyen, Doshi, Tulsee, Lall, Sid, Mishra, Swaroop, Chen, Wanming, Luong, Thang, Benjamin, Seth, Lee, Jasmine, Andrejczuk, Ewa, Rabiej, Dominik, Ranjan, Vipul, Styrc, Krzysztof, Yin, Pengcheng, Simon, Jon, Harriott, Malcolm Rose, Bansal, Mudit, Robsky, Alexei, Bacon, Geoff, Greene, David, Mirylenka, Daniil, Zhou, Chen, Sarvana, Obaid, Goyal, Abhimanyu, Andermatt, Samuel, Siegler, Patrick, Horn, Ben, Israel, Assaf, Pongetti, Francesco, Chen, Chih-Wei "Louis", Selvatici, Marco, Silva, Pedro, Wang, Kathie, Tolins, Jackson, Guu, Kelvin, Yogev, Roey, Cai, Xiaochen, Agostini, Alessandro, Shah, Maulik, Nguyen, Hung, Donnaile, Noah Ó, Pereira, Sébastien, Friso, Linda, Stambler, Adam, Kuang, Chenkai, Romanikhin, Yan, Geller, Mark, Yan, ZJ, Jang, Kane, Lee, Cheng-Chun, Fica, Wojciech, Malmi, Eric, Tan, Qijun, Banica, Dan, Balle, Daniel, Pham, Ryan, Huang, Yanping, Avram, Diana, Shi, Hongzhi, Singh, Jasjot, Hidey, Chris, Ahuja, Niharika, Saxena, Pranab, Dooley, Dan, Potharaju, Srividya Pranavi, O'Neill, Eileen, Gokulchandran, Anand, Foley, Ryan, Zhao, Kai, Dusenberry, Mike, Liu, Yuan, Mehta, Pulkit, Kotikalapudi, Ragha, Safranek-Shrader, Chalence, Goodman, Andrew, Kessinger, Joshua, Globen, Eran, Kolhar, Prateek, Gorgolewski, Chris, Ibrahim, Ali, Song, Yang, 
Eichenbaum, Ali, Brovelli, Thomas, Potluri, Sahitya, Lahoti, Preethi, Baetu, Cip, Ghorbani, Ali, Chen, Charles, Crawford, Andy, Pal, Shalini, Sridhar, Mukund, Gurita, Petru, Mujika, Asier, Petrovski, Igor, Cedoz, Pierre-Louis, Li, Chenmei, Chen, Shiyuan, Santo, Niccolò Dal, Goyal, Siddharth, Punjabi, Jitesh, Kappaganthu, Karthik, Kwak, Chester, LV, Pallavi, Velury, Sarmishta, Choudhury, Himadri, Hall, Jamie, Shah, Premal, Figueira, Ricardo, Thomas, Matt, Lu, Minjie, Zhou, Ting, Kumar, Chintu, Jurdi, Thomas, Chikkerur, Sharat, Ma, Yenai, Yu, Adams, Kwak, Soo, Ähdel, Victor, Rajayogam, Sujeevan, Choma, Travis, Liu, Fei, Barua, Aditya, Ji, Colin, Park, Ji Ho, Hellendoorn, Vincent, Bailey, Alex, Bilal, Taylan, Zhou, Huanjie, Khatir, Mehrdad, Sutton, Charles, Rzadkowski, Wojciech, Macintosh, Fiona, Shagin, Konstantin, Medina, Paul, Liang, Chen, Zhou, Jinjing, Shah, Pararth, Bi, Yingying, Dankovics, Attila, Banga, Shipra, Lehmann, Sabine, Bredesen, Marissa, Lin, Zifan, Hoffmann, John Eric, Lai, Jonathan, Chung, Raynald, Yang, Kai, Balani, Nihal, Bražinskas, Arthur, Sozanschi, Andrei, Hayes, Matthew, Alcalde, Héctor Fernández, Makarov, Peter, Chen, Will, Stella, Antonio, Snijders, Liselotte, Mandl, Michael, Kärrman, Ante, Nowak, Paweł, Wu, Xinyi, Dyck, Alex, Vaidyanathan, Krishnan, R, Raghavender, Mallet, Jessica, Rudominer, Mitch, Johnston, Eric, Mittal, Sushil, Udathu, Akhil, Christensen, Janara, Verma, Vishal, Irving, Zach, Santucci, Andreas, Elsayed, Gamaleldin, Davoodi, Elnaz, Georgiev, Marin, Tenney, Ian, Hua, Nan, Cideron, Geoffrey, Leurent, Edouard, Alnahlawi, Mahmoud, Georgescu, Ionut, Wei, Nan, Zheng, Ivy, Scandinaro, Dylan, Jiang, Heinrich, Snoek, Jasper, Sundararajan, Mukund, Wang, Xuezhi, Ontiveros, Zack, Karo, Itay, Cole, Jeremy, Rajashekhar, Vinu, Tumeh, Lara, Ben-David, Eyal, Jain, Rishub, Uesato, Jonathan, Datta, Romina, Bunyan, Oskar, Wu, Shimu, Zhang, John, Stanczyk, Piotr, Zhang, Ye, Steiner, David, Naskar, Subhajit, Azzam, Michael, Johnson, Matthew, 
Paszke, Adam, Chiu, Chung-Cheng, Elias, Jaume Sanchez, Mohiuddin, Afroz, Muhammad, Faizan, Miao, Jin, Lee, Andrew, Vieillard, Nino, Park, Jane, Zhang, Jiageng, Stanway, Jeff, Garmon, Drew, Karmarkar, Abhijit, Dong, Zhe, Lee, Jong, Kumar, Aviral, Zhou, Luowei, Evens, Jonathan, Isaac, William, Irving, Geoffrey, Loper, Edward, Fink, Michael, Arkatkar, Isha, Chen, Nanxin, Shafran, Izhak, Petrychenko, Ivan, Chen, Zhe, Jia, Johnson, Levskaya, Anselm, Zhu, Zhenkai, Grabowski, Peter, Mao, Yu, Magni, Alberto, Yao, Kaisheng, Snaider, Javier, Casagrande, Norman, Palmer, Evan, Suganthan, Paul, Castaño, Alfonso, Giannoumis, Irene, Kim, Wooyeol, Rybiński, Mikołaj, Sreevatsa, Ashwin, Prendki, Jennifer, Soergel, David, Goedeckemeyer, Adrian, Gierke, Willi, Jafari, Mohsen, Gaba, Meenu, Wiesner, Jeremy, Wright, Diana Gage, Wei, Yawen, Vashisht, Harsha, Kulizhskaya, Yana, Hoover, Jay, Le, Maigo, Li, Lu, Iwuanyanwu, Chimezie, Liu, Lu, Ramirez, Kevin, Khorlin, Andrey, Cui, Albert, LIN, Tian, Wu, Marcus, Aguilar, Ricardo, Pallo, Keith, Chakladar, Abhishek, Perng, Ginger, Abellan, Elena Allica, Zhang, Mingyang, Dasgupta, Ishita, Kushman, Nate, Penchev, Ivo, Repina, Alena, Wu, Xihui, van der Weide, Tom, Ponnapalli, Priya, Kaplan, Caroline, Simsa, Jiri, Li, Shuangfeng, Dousse, Olivier, Piper, Jeff, Ie, Nathan, Pasumarthi, Rama, Lintz, Nathan, Vijayakumar, Anitha, Andor, Daniel, Valenzuela, Pedro, Lui, Minnie, Paduraru, Cosmin, Peng, Daiyi, Lee, Katherine, Zhang, Shuyuan, Greene, Somer, Nguyen, Duc Dung, Kurylowicz, Paula, Hardin, Cassidy, Dixon, Lucas, Janzer, Lili, Choo, Kiam, Feng, Ziqiang, Zhang, Biao, Singhal, Achintya, Du, Dayou, McKinnon, Dan, Antropova, Natasha, Bolukbasi, Tolga, Keller, Orgad, Reid, David, Finchelstein, Daniel, Raad, Maria Abi, Crocker, Remi, Hawkins, Peter, Dadashi, Robert, Gaffney, Colin, Franko, Ken, Bulanova, Anna, Leblond, Rémi, Chung, Shirley, Askham, Harry, Cobo, Luis C., Xu, Kelvin, Fischer, Felix, Xu, Jun, Sorokin, Christina, Alberti, Chris, Lin, 
Chu-Cheng, Evans, Colin, Dimitriev, Alek, Forbes, Hannah, Banarse, Dylan, Tung, Zora, Omernick, Mark, Bishop, Colton, Sterneck, Rachel, Jain, Rohan, Xia, Jiawei, Amid, Ehsan, Piccinno, Francesco, Wang, Xingyu, Banzal, Praseem, Mankowitz, Daniel J., Polozov, Alex, Krakovna, Victoria, Brown, Sasha, Bateni, MohammadHossein, Duan, Dennis, Firoiu, Vlad, Thotakuri, Meghana, Natan, Tom, Geist, Matthieu, Girgin, Ser tan, Li, Hui, Ye, Jiayu, Roval, Ofir, Tojo, Reiko, Kwong, Michael, Lee-Thorp, James, Yew, Christopher, Sinopalnikov, Danila, Ramos, Sabela, Mellor, John, Sharma, Abhishek, Wu, Kathy, Miller, David, Sonnerat, Nicolas, Vnukov, Denis, Greig, Rory, Beattie, Jennifer, Caveness, Emily, Bai, Libin, Eisenschlos, Julian, Korchemniy, Alex, Tsai, Tomy, Jasarevic, Mimi, Kong, Weize, Dao, Phuong, Zheng, Zeyu, Liu, Frederick, Zhu, Rui, Teh, Tian Huey, Sanmiya, Jason, Gladchenko, Evgeny, Trdin, Nejc, Toyama, Daniel, Rosen, Evan, Tavakkol, Sasan, Xue, Linting, Elkind, Chen, Woodman, Oliver, Carpenter, John, Papamakarios, George, Kemp, Rupert, Kafle, Sushant, Grunina, Tanya, Sinha, Rishika, Talbert, Alice, Wu, Diane, Owusu-Afriyie, Denese, Thornton, Chloe, Pont-Tuset, Jordi, Narayana, Pradyumna, Li, Jing, Fatehi, Saaber, Wieting, John, Ajmeri, Omar, Uria, Benigno, Ko, Yeongil, Knight, Laura, Héliou, Amélie, Niu, Ning, Gu, Shane, Pang, Chenxi, Li, Yeqing, Levine, Nir, Stolovich, Ariel, Santamaria-Fernandez, Rebeca, Goenka, Sonam, Yustalim, Wenny, Strudel, Robin, Elqursh, Ali, Deck, Charlie, Lee, Hyo, Li, Zonglin, Levin, Kyle, Hoffmann, Raphael, Holtmann-Rice, Dan, Bachem, Olivier, Arora, Sho, Koh, Christy, Yeganeh, Soheil Hassas, Põder, Siim, Tariq, Mukarram, Sun, Yanhua, Ionita, Lucian, Seyedhosseini, Mojtaba, Tafti, Pouya, Liu, Zhiyu, Gulati, Anmol, Liu, Jasmine, Ye, Xinyu, Chrzaszcz, Bart, Wang, Lily, Sethi, Nikhil, Li, Tianrun, Brown, Ben, Singh, Shreya, Fan, Wei, Parisi, Aaron, Stanton, Joe, Koverkathu, Vinod, Choquette-Choo, Christopher A., Li, Yunjie, Lu, TJ, Shroff, 
Prakash, Varadarajan, Mani, Bahargam, Sanaz, Willoughby, Rob, Gaddy, David, Desjardins, Guillaume, Cornero, Marco, Robenek, Brona, Mittal, Bhavishya, Albrecht, Ben, Shenoy, Ashish, Moiseev, Fedor, Jacobsson, Henrik, Ghaffarkhah, Alireza, Rivière, Morgane, Walton, Alanna, Crepy, Clément, Parrish, Alicia, Zhou, Zongwei, Farabet, Clement, Radebaugh, Carey, Srinivasan, Praveen, van der Salm, Claudia, Fidjeland, Andreas, Scellato, Salvatore, Latorre-Chimoto, Eri, Klimczak-Plucińska, Hanna, Bridson, David, de Cesare, Dario, Hudson, Tom, Mendolicchio, Piermaria, Walker, Lexi, Morris, Alex, Mauger, Matthew, Guseynov, Alexey, Reid, Alison, Odoom, Seth, Loher, Lucia, Cotruta, Victor, Yenugula, Madhavi, Grewe, Dominik, Petrushkina, Anastasia, Duerig, Tom, Sanchez, Antonio, Yadlowsky, Steve, Shen, Amy, Globerson, Amir, Webb, Lynette, Dua, Sahil, Li, Dong, Bhupatiraju, Surya, Hurt, Dan, Qureshi, Haroon, Agarwal, Ananth, Shani, Tomer, Eyal, Matan, Khare, Anuj, Belle, Shreyas Rammohan, Wang, Lei, Tekur, Chetan, Kale, Mihir Sanjay, Wei, Jinliang, Sang, Ruoxin, Saeta, Brennan, Liechty, Tyler, Sun, Yi, Zhao, Yao, Lee, Stephan, Nayak, Pandu, Fritz, Doug, Vuyyuru, Manish Reddy, Aslanides, John, Vyas, Nidhi, Wicke, Martin, Ma, Xiao, Eltyshev, Evgenii, Martin, Nina, Cate, Hardie, Manyika, James, Amiri, Keyvan, Kim, Yelin, Xiong, Xi, Kang, Kai, Luisier, Florian, Tripuraneni, Nilesh, Madras, David, Guo, Mandy, Waters, Austin, Wang, Oliver, Ainslie, Joshua, Baldridge, Jason, Zhang, Han, Pruthi, Garima, Bauer, Jakob, Yang, Feng, Mansour, Riham, Gelman, Jason, Xu, Yang, Polovets, George, Liu, Ji, Cai, Honglong, Chen, Warren, Sheng, XiangHai, Xue, Emily, Ozair, Sherjil, Angermueller, Christof, Li, Xiaowei, Sinha, Anoop, Wang, Weiren, Wiesinger, Julia, Koukoumidis, Emmanouil, Tian, Yuan, Iyer, Anand, Gurumurthy, Madhu, Goldenson, Mark, Shah, Parashar, Blake, MK, Yu, Hongkun, Urbanowicz, Anthony, Palomaki, Jennimaria, Fernando, Chrisantha, Durden, Ken, Mehta, Harsh, Momchev, Nikola, 
Rahimtoroghi, Elahe, Georgaki, Maria, Raul, Amit, Ruder, Sebastian, Redshaw, Morgan, Lee, Jinhyuk, Zhou, Denny, Jalan, Komal, Li, Dinghua, Hechtman, Blake, Schuh, Parker, Nasr, Milad, Milan, Kieran, Mikulik, Vladimir, Franco, Juliana, Green, Tim, Nguyen, Nam, Kelley, Joe, Mahendru, Aroma, Hu, Andrea, Howland, Joshua, Vargas, Ben, Hui, Jeffrey, Bansal, Kshitij, Rao, Vikram, Ghiya, Rakesh, Wang, Emma, Ye, Ke, Sarr, Jean Michel, Preston, Melanie Moranski, Elish, Madeleine, Li, Steve, Kaku, Aakash, Gupta, Jigar, Pasupat, Ice, Juan, Da-Cheng, Someswar, Milan, M., Tejvi, Chen, Xinyun, Amini, Aida, Fabrikant, Alex, Chu, Eric, Dong, Xuanyi, Muthal, Amruta, Buthpitiya, Senaka, Jauhari, Sarthak, Khandelwal, Urvashi, Hitron, Ayal, Ren, Jie, Rinaldi, Larissa, Drath, Shahar, Dabush, Avigail, Jiang, Nan-Jiang, Godhia, Harshal, Sachs, Uli, Chen, Anthony, Fan, Yicheng, Taitelbaum, Hagai, Noga, Hila, Dai, Zhuyun, Wang, James, Hamer, Jenny, Ferng, Chun-Sung, Elkind, Chenel, Atias, Aviel, Lee, Paulina, Listík, Vít, Carlen, Mathias, van de Kerkhof, Jan, Pikus, Marcin, Zaher, Krunoslav, Müller, Paul, Zykova, Sasha, Stefanec, Richard, Gatsko, Vitaly, Hirnschall, Christoph, Sethi, Ashwin, Xu, Xingyu Federico, Ahuja, Chetan, Tsai, Beth, Stefanoiu, Anca, Feng, Bo, Dhandhania, Keshav, Katyal, Manish, Gupta, Akshay, Parulekar, Atharva, Pitta, Divya, Zhao, Jing, Bhatia, Vivaan, Bhavnani, Yashodha, Alhadlaq, Omar, Li, Xiaolin, Danenberg, Peter, Tu, Dennis, Pine, Alex, Filippova, Vera, Ghosh, Abhipso, Limonchik, Ben, Urala, Bhargava, Lanka, Chaitanya Krishna, Clive, Derik, Li, Edward, Wu, Hao, Hongtongsak, Kevin, Li, Ianna, Thakkar, Kalind, Omarov, Kuanysh, Majmundar, Kushal, Alverson, Michael, Kucharski, Michael, Patel, Mohak, Jain, Mudit, Zabelin, Maksim, Pelagatti, Paolo, Kohli, Rohan, Kumar, Saurabh, Kim, Joseph, Sankar, Swetha, Shah, Vineet, Ramachandruni, Lakshmi, Zeng, Xiangkai, Bariach, Ben, Weidinger, Laura, Vu, Tu, Andreev, Alek, He, Antoine, Hui, Kevin, Kashem, Sheleem, Subramanya, 
Amar, Hsiao, Sissie, Hassabis, Demis, Kavukcuoglu, Koray, Sadovsky, Adam, Le, Quoc, Strohman, Trevor, Wu, Yonghui, Petrov, Slav, Dean, Jeffrey, and Vinyals, Oriol
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Published: 2023

6. Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Author: Singh, Avi, Co-Reyes, John D., Agarwal, Rishabh, Anand, Ankesh, Patil, Piyush, Garcia, Xavier, Liu, Peter J., Harrison, James, Lee, Jaehoon, Xu, Kelvin, Parisi, Aaron, Kumar, Abhishek, Alemi, Alex, Rizkowsky, Alex, Nova, Azade, Adlam, Ben, Bohnet, Bernd, Elsayed, Gamaleldin, Sedghi, Hanie, Mordatch, Igor, Simpson, Isabelle, Gur, Izzeddin, Snoek, Jasper, Pennington, Jeffrey, Hron, Jiri, Kenealy, Kathleen, Swersky, Kevin, Mahajan, Kshiteej, Culp, Laura, Xiao, Lechao, Bileschi, Maxwell L., Constant, Noah, Novak, Roman, Liu, Rosanne, Warkentin, Tris, Qian, Yundi, Bansal, Yamini, Dyer, Ethan, Neyshabur, Behnam, Sohl-Dickstein, Jascha, and Fiedel, Noah
Subjects: Computer Science - Machine Learning
Abstract: Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times. Testing on advanced MATH reasoning and APPS coding benchmarks using PaLM-2 models, we find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data. Overall, our findings suggest self-training with feedback can substantially reduce dependence on human-generated data. Comment: Accepted to TMLR. Camera-ready version. First three authors contributed equally.
Published: 2023

7. Frontier Language Models are not Robust to Adversarial Arithmetic, or 'What do I need to say so you agree 2+2=5?'

Author: Freeman, C. Daniel, Culp, Laura, Parisi, Aaron, Bileschi, Maxwell L, Elsayed, Gamaleldin F, Rizkowsky, Alex, Simpson, Isabelle, Alemi, Alex, Nova, Azade, Adlam, Ben, Bohnet, Bernd, Mishra, Gaurav, Sedghi, Hanie, Mordatch, Igor, Gur, Izzeddin, Lee, Jaehoon, Co-Reyes, JD, Pennington, Jeffrey, Xu, Kelvin, Swersky, Kevin, Mahajan, Kshiteej, Xiao, Lechao, Liu, Rosanne, Kornblith, Simon, Constant, Noah, Liu, Peter J., Novak, Roman, Qian, Yundi, Fiedel, Noah, and Sohl-Dickstein, Jascha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment. This problem is comprised of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete. Even in the simple setting of 1-digit addition problems, it is easy to find adversarial prompts that make all tested models (including PaLM2, GPT4, Claude2) misbehave, and even to steer models to a particular wrong answer. We additionally provide a simple algorithm for finding successful attacks by querying those same models, which we name "prompt inversion rejection sampling" (PIRS). We finally show that models can be partially hardened against these attacks via reinforcement learning and via agentic constitutional loops. However, we were not able to make a language model fully robust against adversarial arithmetic attacks.
Published: 2023

8. Small-scale proxies for large-scale Transformer training instabilities

Author: Wortsman, Mitchell, Liu, Peter J., Xiao, Lechao, Everett, Katie, Alemi, Alex, Adlam, Ben, Co-Reyes, John D., Gur, Izzeddin, Kumar, Abhishek, Novak, Roman, Pennington, Jeffrey, Sohl-dickstein, Jascha, Xu, Kelvin, Lee, Jaehoon, Gilmer, Justin, and Kornblith, Simon
Subjects: Computer Science - Machine Learning
Abstract: Teams that have trained large Transformer-based models have reported training instabilities at large scale that did not appear when training with the same hyperparameters at smaller scales. Although the causes of such instabilities are of scientific interest, the amount of resources required to reproduce them has made investigation difficult. In this work, we seek ways to reproduce and study training stability and instability at smaller scales. First, we focus on two sources of training instability described in previous work: the growth of logits in attention layers (Dehghani et al., 2023) and divergence of the output logits from the log probabilities (Chowdhery et al., 2022). By measuring the relationship between learning rate and loss across scales, we show that these instabilities also appear in small models when training at high learning rates, and that mitigations previously employed at large scales are equally effective in this regime. This prompts us to investigate the extent to which other known optimizer and model interventions influence the sensitivity of the final loss to changes in the learning rate. To this end, we study methods such as warm-up, weight decay, and the $\mu$Param (Yang et al., 2022), and combine techniques to train small models that achieve similar losses across orders of magnitude of learning rate variation. Finally, to conclude our exploration we study two cases where instabilities can be predicted before they emerge by examining the scaling behavior of model activation and gradient norms.
Published: 2023

9. PaLM 2 Technical Report

Author: Anil, Rohan, Dai, Andrew M., Firat, Orhan, Johnson, Melvin, Lepikhin, Dmitry, Passos, Alexandre, Shakeri, Siamak, Taropa, Emanuel, Bailey, Paige, Chen, Zhifeng, Chu, Eric, Clark, Jonathan H., Shafey, Laurent El, Huang, Yanping, Meier-Hellstern, Kathy, Mishra, Gaurav, Moreira, Erica, Omernick, Mark, Robinson, Kevin, Ruder, Sebastian, Tay, Yi, Xiao, Kefan, Xu, Yuanzhong, Zhang, Yujing, Abrego, Gustavo Hernandez, Ahn, Junwhan, Austin, Jacob, Barham, Paul, Botha, Jan, Bradbury, James, Brahma, Siddhartha, Brooks, Kevin, Catasta, Michele, Cheng, Yong, Cherry, Colin, Choquette-Choo, Christopher A., Chowdhery, Aakanksha, Crepy, Clément, Dave, Shachi, Dehghani, Mostafa, Dev, Sunipa, Devlin, Jacob, Díaz, Mark, Du, Nan, Dyer, Ethan, Feinberg, Vlad, Feng, Fangxiaoyu, Fienber, Vlad, Freitag, Markus, Garcia, Xavier, Gehrmann, Sebastian, Gonzalez, Lucas, Gur-Ari, Guy, Hand, Steven, Hashemi, Hadi, Hou, Le, Howland, Joshua, Hu, Andrea, Hui, Jeffrey, Hurwitz, Jeremy, Isard, Michael, Ittycheriah, Abe, Jagielski, Matthew, Jia, Wenhao, Kenealy, Kathleen, Krikun, Maxim, Kudugunta, Sneha, Lan, Chang, Lee, Katherine, Lee, Benjamin, Li, Eric, Li, Music, Li, Wei, Li, YaGuang, Li, Jian, Lim, Hyeontaek, Lin, Hanzhao, Liu, Zhongtao, Liu, Frederick, Maggioni, Marcello, Mahendru, Aroma, Maynez, Joshua, Misra, Vedant, Moussalem, Maysam, Nado, Zachary, Nham, John, Ni, Eric, Nystrom, Andrew, Parrish, Alicia, Pellat, Marie, Polacek, Martin, Polozov, Alex, Pope, Reiner, Qiao, Siyuan, Reif, Emily, Richter, Bryan, Riley, Parker, Ros, Alex Castro, Roy, Aurko, Saeta, Brennan, Samuel, Rajkumar, Shelby, Renee, Slone, Ambrose, Smilkov, Daniel, So, David R., Sohn, Daniel, Tokumine, Simon, Valter, Dasha, Vasudevan, Vijay, Vodrahalli, Kiran, Wang, Xuezhi, Wang, Pidong, Wang, Zirui, Wang, Tao, Wieting, John, Wu, Yuhuai, Xu, Kelvin, Xu, Yunhan, Xue, Linting, Yin, Pengcheng, Yu, Jiahui, Zhang, Qiao, Zheng, Steven, Zheng, Ce, Zhou, Weikang, Zhou, Denny, Petrov, Slav, and Wu, Yonghui
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond faster, for a more natural pace of interaction. PaLM 2 demonstrates robust reasoning capabilities exemplified by large improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 exhibits stable performance on a suite of responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities. When discussing the PaLM 2 family, it is important to distinguish between pre-trained models (of various sizes), fine-tuned variants of these models, and the user-facing products that use these models. In particular, user-facing products typically include additional pre- and post-processing steps. Additionally, the underlying models may evolve over time. Therefore, one should not expect the performance of user-facing products to exactly match the results reported in this report.
Published: 2023

10. Multi-objective Optimization Model for Momentum Change Based on Genetic Algorithm

Author: Zhang, Shuo, Kong, Ziqi, Xu, Kelvin, Shi, Guangxiao, Kong, Zixiao, Li, Xia, Zan, Jinjin, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Xiankun, editor, and Chen, Wei, editor
Published: 2024

11. Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

Author: Xu, Kelvin, Hu, Zheyuan, Doshi, Ria, Rovinsky, Aaron, Kumar, Vikash, Gupta, Abhishek, and Levine, Sergey
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Complex and contact-rich robotic manipulation tasks, particularly those that involve multi-fingered hands and underactuated object manipulation, present a significant challenge to any control method. Methods based on reinforcement learning offer an appealing choice for such settings, as they can enable robots to learn to delicately balance contact forces and dexterously reposition objects without strong modeling assumptions. However, running reinforcement learning on real-world dexterous manipulation systems often requires significant manual engineering. This negates the benefits of autonomous data collection and ease of use that reinforcement learning should in principle provide. In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction. The core principle underlying our system is that, in a vision-based setting, users should be able to provide high-level intermediate supervision that circumvents challenges in teleoperation or kinesthetic teaching which allow a robot to not only learn a task efficiently but also to autonomously practice. Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples, a reinforcement learning procedure that learns the task autonomously without interventions, and experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world, without simulation, manual modeling, or reward engineering. Comment: First two authors contributed equally.
Published: 2022

12. Autonomous Reinforcement Learning: Formalism and Benchmarking

Author: Sharma, Archit, Xu, Kelvin, Sardana, Nikhil, Gupta, Abhishek, Hausman, Karol, Levine, Sergey, and Finn, Chelsea
Subjects: Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Reinforcement learning (RL) provides a naturalistic framing for learning through trial and error, which is appealing both because of its simplicity and effectiveness and because of its resemblance to how humans and animals acquire skills through experience. However, real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world, whereas common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts. This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms, such as robots. In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials. We introduce a simulated benchmark EARL around this framework, containing a set of diverse and challenging simulated tasks reflective of the hurdles introduced to learning when only a minimal reliance on extrinsic intervention can be assumed. We show that standard approaches to episodic RL and existing approaches struggle as interventions are minimized, underscoring the need for developing new algorithms for reinforcement learning with a greater focus on autonomy.
Published: 2021

13. Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention

Author: Gupta, Abhishek, Yu, Justin, Zhao, Tony Z., Kumar, Vikash, Rovinsky, Aaron, Xu, Kelvin, Devlin, Thomas, and Levine, Sergey
Subjects: Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Reinforcement Learning (RL) algorithms can in principle acquire complex robotic skills by learning from large amounts of data in the real world, collected via trial and error. However, most RL algorithms use a carefully engineered setup in order to collect data, requiring human supervision and intervention to provide episodic resets. This is particularly evident in challenging robotics problems, such as dexterous manipulation. To make data collection scalable, such applications require reset-free algorithms that are able to learn autonomously, without explicit instrumentation or human intervention. Most prior work in this area handles single-task learning. However, we might also want robots that can perform large repertoires of skills. At first, this would appear to only make the problem harder. However, the key observation we make in this work is that an appropriately chosen multi-task RL setting actually alleviates the reset-free learning challenge, with minimal additional machinery required. In effect, solving a multi-task problem can directly solve the reset-free problem since different combinations of tasks can serve to perform resets for other tasks. By learning multiple tasks together and appropriately sequencing them, we can effectively learn all of the tasks together reset-free. This type of multi-task learning can effectively scale reset-free learning schemes to much more complex problems, as we demonstrate in our experiments. We propose a simple scheme for multi-task learning that tackles the reset-free learning problem, and show its effectiveness at learning to solve complex dexterous manipulation tasks in both hardware and simulation without any explicit resets. This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention. Comment: Published at ICRA 2021. First four authors contributed equally.
Published: 2021

14. Continual Learning of Control Primitives: Skill Discovery via Reset-Games

Author: Xu, Kelvin, Verma, Siddharth, Finn, Chelsea, and Levine, Sergey
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed. First, in real world settings, when an agent attempts a task and fails, the environment must somehow "reset" so that the agent can attempt the task again. While easy in simulation, this could require considerable human effort in the real world, especially if the number of trials is very large. Second, real world learning often involves complex, temporally extended behavior that is often difficult to acquire with random exploration. While these two problems may at first appear unrelated, in this work, we show how a single method can allow an agent to acquire skills with minimal supervision while removing the need for resets. We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills". We propose a general-sum game formulation that balances the objectives of resetting and learning skills, and demonstrate that this approach improves performance on reset-free tasks, and additionally show that the skills we obtain can be used to significantly accelerate downstream learning. Comment: To appear at NeurIPS 2020.
Published: 2020

15. Resolving hidden pixels beyond the resolution limit of projection imaging by square aperture

Author: Xu, Kelvin J. and Xu, Gu
Published: 2023
Full Text: View/download PDF

16. Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples

Author: Triantafillou, Eleni, Zhu, Tyler, Dumoulin, Vincent, Lamblin, Pascal, Evci, Utku, Xu, Kelvin, Goroshin, Ross, Gelada, Carles, Swersky, Kevin, Manzagol, Pierre-Antoine, and Larochelle, Hugo
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Few-shot classification refers to learning a classifier for new classes given only a few examples. While a plethora of models have emerged to tackle it, we find the procedure and datasets that are used to assess their progress lacking. To address this limitation, we propose Meta-Dataset: a new benchmark for training and evaluating models that is large-scale, consists of diverse datasets, and presents more realistic tasks. We experiment with popular baselines and meta-learners on Meta-Dataset, along with a competitive method that we propose. We analyze performance as a function of various characteristics of test tasks and examine the models' ability to leverage diverse training sources for improving their generalization. We also propose a new set of baselines for quantifying the benefit of meta-learning in Meta-Dataset. Our extensive experimentation has uncovered important research challenges and we hope to inspire work in these directions., Comment: Code available at https://github.com/google-research/meta-dataset
Published: 2019

17. Probabilistic Model-Agnostic Meta-Learning

Author: Finn, Chelsea, Xu, Kelvin, and Levine, Sergey
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Meta-learning for few-shot learning entails acquiring a prior over previous tasks and experiences, such that new tasks can be learned from small amounts of data. However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e.g., a classifier) for that task that is accurate. In this paper, we propose a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution. Our approach extends model-agnostic meta-learning, which adapts to new tasks via gradient descent, to incorporate a parameter distribution that is trained via a variational lower bound. At meta-test time, our algorithm adapts via a simple procedure that injects noise into gradient descent, and at meta-training time, the model is trained such that this stochastic adaptation procedure produces samples from the approximate model posterior. Our experimental results show that our method can sample plausible classifiers and regressors in ambiguous few-shot learning problems. We also show how reasoning about ambiguity can be used for downstream active learning problems., Comment: NeurIPS 2018. First two authors contributed equally. Supplementary results available at https://sites.google.com/view/probabilistic-maml/
Published: 2018

18. Learning a Prior over Intent via Meta-Inverse Reinforcement Learning

Author: Xu, Kelvin, Ratner, Ellis, Dragan, Anca, Levine, Sergey, and Finn, Chelsea
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: A significant challenge for the practical application of reinforcement learning in the real world is the need to specify an oracle reward function that correctly defines a task. Inverse reinforcement learning (IRL) seeks to avoid this challenge by instead inferring a reward function from expert behavior. While appealing, it can be impractically expensive to collect datasets of demonstrations that cover the variation common in the real world (e.g. opening any type of door). Thus in practice, IRL must commonly be performed with only a limited set of demonstrations where it can be exceedingly difficult to unambiguously recover a reward function. In this work, we exploit the insight that demonstrations from other tasks can be used to constrain the set of possible reward functions by learning a "prior" that is specifically optimized for the ability to infer expressive reward functions from limited numbers of demonstrations. We demonstrate that our method can efficiently recover rewards from images for novel tasks and provide intuition as to how our approach is analogous to learning a prior.
Published: 2018

19. Trust-PCL: An Off-Policy Trust Region Method for Continuous Control

Author: Nachum, Ofir, Norouzi, Mohammad, Xu, Kelvin, and Schuurmans, Dale
Subjects: Computer Science - Artificial Intelligence
Abstract: Trust region methods, such as TRPO, are often used to stabilize policy optimization algorithms in reinforcement learning (RL). While current trust region strategies are effective for continuous control, they typically require a prohibitively large amount of on-policy interaction with the environment. To address this problem, we propose an off-policy trust region method, Trust-PCL. The algorithm is the result of observing that the optimal policy and state values of a maximum reward objective with a relative-entropy regularizer satisfy a set of multi-step pathwise consistencies along any path. Thus, Trust-PCL is able to maintain optimization stability while exploiting off-policy data to improve sample efficiency. When evaluated on a number of continuous control tasks, Trust-PCL improves the solution quality and sample efficiency of TRPO., Comment: ICLR 2018
Published: 2017

20. Bridging the Gap Between Value and Policy Based Reinforcement Learning

Author: Nachum, Ofir, Norouzi, Mohammad, Xu, Kelvin, and Schuurmans, Dale
Subjects: Computer Science - Artificial Intelligence, Computer Science - Learning, Statistics - Machine Learning
Abstract: We establish a new connection between value and policy based reinforcement learning (RL) based on a relationship between softmax temporal value consistency and policy optimality under entropy regularization. Specifically, we show that softmax consistent action values correspond to optimal entropy regularized policy probabilities along any action sequence, regardless of provenance. From this observation, we develop a new RL algorithm, Path Consistency Learning (PCL), that minimizes a notion of soft consistency error along multi-step action sequences extracted from both on- and off-policy traces. We examine the behavior of PCL in different scenarios and show that PCL can be interpreted as generalizing both actor-critic and Q-learning algorithms. We subsequently deepen the relationship by showing how a single model can be used to represent both a policy and the corresponding softmax state values, eliminating the need for a separate critic. The experimental evaluation demonstrates that PCL significantly outperforms strong actor-critic and Q-learning baselines across several benchmarks., Comment: NIPS 2017
Published: 2017

21. Unsupervised Perceptual Rewards for Imitation Learning

Author: Sermanet, Pierre, Xu, Kelvin, and Levine, Sergey
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a reward function takes considerable hand engineering and often requires additional sensors to be installed just to measure whether the task has been executed successfully. Furthermore, many interesting tasks consist of multiple implicit intermediate steps that must be executed in sequence. Even when the final outcome can be measured, it does not necessarily provide feedback on these intermediate steps. To address these issues, we propose leveraging the abstraction power of intermediate visual representations learned by deep models to quickly infer perceptual reward functions from small numbers of demonstrations. We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps. This method makes use of the features in a pre-trained deep model, but does not require any explicit specification of sub-goals. The resulting reward functions can then be used by an RL agent to learn to perform the task in real-world settings. To evaluate the learned reward, we present qualitative results on two real-world tasks and a quantitative evaluation against a human-designed reward function. We also show that our method can be used to learn a real-world door opening skill using a real robot, even when the demonstration used for reward learning is provided by a human using their own hand. To our knowledge, these are the first results showing that complex robotic manipulation skills can be learned directly and without supervised labels from a video of a human performing the task. Supplementary material and data are available at https://sermanet.github.io/rewards
Published: 2016

22. An Actor-Critic Algorithm for Sequence Prediction

Author: Bahdanau, Dzmitry, Brakel, Philemon, Xu, Kelvin, Goyal, Anirudh, Lowe, Ryan, Pineau, Joelle, Courville, Aaron, and Bengio, Yoshua
Subjects: Computer Science - Learning
Abstract: We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase, and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task, and for German-English machine translation. Our analysis paves the way for such methods to be applied in natural language generation tasks, such as machine translation, caption generation, and dialogue modelling.
Published: 2016

23. Theano: A Python framework for fast computation of mathematical expressions

Author: The Theano Development Team, Al-Rfou, Rami, Alain, Guillaume, Almahairi, Amjad, Angermueller, Christof, Bahdanau, Dzmitry, Ballas, Nicolas, Bastien, Frédéric, Bayer, Justin, Belikov, Anatoly, Belopolsky, Alexander, Bengio, Yoshua, Bergeron, Arnaud, Bergstra, James, Bisson, Valentin, Snyder, Josh Bleecher, Bouchard, Nicolas, Boulanger-Lewandowski, Nicolas, Bouthillier, Xavier, de Brébisson, Alexandre, Breuleux, Olivier, Carrier, Pierre-Luc, Cho, Kyunghyun, Chorowski, Jan, Christiano, Paul, Cooijmans, Tim, Côté, Marc-Alexandre, Côté, Myriam, Courville, Aaron, Dauphin, Yann N., Delalleau, Olivier, Demouth, Julien, Desjardins, Guillaume, Dieleman, Sander, Dinh, Laurent, Ducoffe, Mélanie, Dumoulin, Vincent, Kahou, Samira Ebrahimi, Erhan, Dumitru, Fan, Ziye, Firat, Orhan, Germain, Mathieu, Glorot, Xavier, Goodfellow, Ian, Graham, Matt, Gulcehre, Caglar, Hamel, Philippe, Harlouchet, Iban, Heng, Jean-Philippe, Hidasi, Balázs, Honari, Sina, Jain, Arjun, Jean, Sébastien, Jia, Kai, Korobov, Mikhail, Kulkarni, Vivek, Lamb, Alex, Lamblin, Pascal, Larsen, Eric, Laurent, César, Lee, Sean, Lefrancois, Simon, Lemieux, Simon, Léonard, Nicholas, Lin, Zhouhan, Livezey, Jesse A., Lorenz, Cory, Lowin, Jeremiah, Ma, Qianli, Manzagol, Pierre-Antoine, Mastropietro, Olivier, McGibbon, Robert T., Memisevic, Roland, van Merriënboer, Bart, Michalski, Vincent, Mirza, Mehdi, Orlandi, Alberto, Pal, Christopher, Pascanu, Razvan, Pezeshki, Mohammad, Raffel, Colin, Renshaw, Daniel, Rocklin, Matthew, Romero, Adriana, Roth, Markus, Sadowski, Peter, Salvatier, John, Savard, François, Schlüter, Jan, Schulman, John, Schwartz, Gabriel, Serban, Iulian Vlad, Serdyuk, Dmitriy, Shabanian, Samira, Simon, Étienne, Spieckermann, Sigurd, Subramanyam, S. Ramana, Sygnowski, Jakub, Tanguay, Jérémie, van Tulder, Gijs, Turian, Joseph, Urban, Sebastian, Vincent, Pascal, Visin, Francesco, de Vries, Harm, Warde-Farley, David, Webb, Dustin J., Willson, Matthew, Xu, Kelvin, Xue, Lijun, Yao, Li, Zhang, Saizheng, and Zhang, Ying
Subjects: Computer Science - Symbolic Computation, Computer Science - Learning, Computer Science - Mathematical Software
Abstract: Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it., Comment: 19 pages, 5 figures
Published: 2016

24. A Controller-Recognizer Framework: How necessary is recognition for control?

Author: Moczulski, Marcin, Xu, Kelvin, Courville, Aaron, and Cho, Kyunghyun
Subjects: Computer Science - Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently there has been growing interest in building active visual object recognizers, as opposed to the usual passive recognizers which classify a given static image into a predefined set of object categories. In this paper we propose to generalize these recently proposed end-to-end active visual recognizers into a controller-recognizer framework. A model in the controller-recognizer framework consists of a controller, which interfaces with an external manipulator, and a recognizer which classifies the visual input adjusted by the manipulator. We describe two recently proposed models, the recurrent attention model and the spatial transformer network, as representative examples of controller-recognizer models. Based on this description we observe that most existing end-to-end controller-recognizers tightly, or completely, couple a controller and recognizer. We ask whether this tight coupling is necessary, and try to answer this empirically by building a controller-recognizer model with a decoupled controller and recognizer. Our experiments revealed that it is not always necessary to tightly couple them and that by decoupling a controller and recognizer, there is a possibility of building a generic controller that is pretrained and works together with any subsequent recognizer.
Published: 2015

25. On Using Monolingual Corpora in Neural Machine Translation

Author: Gulcehre, Caglar, Firat, Orhan, Xu, Kelvin, Cho, Kyunghyun, Barrault, Loic, Lin, Huei-Chi, Bougares, Fethi, Schwenk, Holger, and Bengio, Yoshua
Subjects: Computer Science - Computation and Language
Abstract: Recent work on end-to-end neural network-based architectures for machine translation has shown promising results for En-Fr and En-De translation. Arguably, one of the major factors behind this success has been the availability of high quality parallel corpora. In this work, we investigate how to leverage abundant monolingual corpora for neural machine translation. Compared to a phrase-based and hierarchical baseline, we obtain up to 1.96 BLEU improvement on the low-resource language pair Turkish-English, and 1.59 BLEU on the focused domain task of Chinese-English chat messages. While our method was initially targeted toward such tasks with less parallel data, we show that it also extends to high resource languages such as Cs-En and De-En where we obtain an improvement of 0.39 and 0.47 BLEU scores over the neural machine translation baselines, respectively., Comment: 9 pages, 2 figures
Published: 2015

26. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

Author: Xu, Kelvin, Ba, Jimmy, Kiros, Ryan, Cho, Kyunghyun, Courville, Aaron, Salakhutdinov, Ruslan, Zemel, Richard, and Bengio, Yoshua
Subjects: Computer Science - Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images. We describe how we can train this model in a deterministic manner using standard backpropagation techniques and stochastically by maximizing a variational lower bound. We also show through visualization how the model is able to automatically learn to fix its gaze on salient objects while generating the corresponding words in the output sequence. We validate the use of attention with state-of-the-art performance on three benchmark datasets: Flickr8k, Flickr30k and MS COCO.
Published: 2015

27. On integrating a language model into neural machine translation

Author: Gulcehre, Caglar, Firat, Orhan, Xu, Kelvin, Cho, Kyunghyun, and Bengio, Yoshua
Published: 2017
Full Text: View/download PDF

28. Towards Adaptive, Continual Embodied Agents

Author: Xu, Kelvin and Levine, Sergey
Abstract: In recent years, artificial learning systems have demonstrated tremendous advances on a number of challenging domains such as computer vision, natural language processing and speech recognition. A striking characteristic of these recent advances has been the seemingly simple formula of combining flexible deep function approximators with large datasets collected for specific problems. These systems struggle, however, to leverage their learned capabilities when generalizing to new inputs or acquiring new capabilities, often requiring re-training from scratch on a similarly large dataset. This is in stark contrast to humans, who have a remarkable ability to build upon their prior experiences and learn new concepts from only a few examples. In the first part of this thesis, we will study the question of how to construct systems that mimic this ability to adapt rapidly to new tasks. One of the core principles that will underlie this part of the thesis will be to leverage structure in a large number of prior experiences/tasks to enable fast adaptation and uncertainty. We will start first by studying the setting of reward specification, a common challenge in reinforcement learning, and next study how a probabilistic framing of the meta-learning setting can enable reasoning under uncertainty. In the second part of this thesis, given the established potential that prior datasets of tasks have for accelerating learning, we will ask the natural question of how to enable agents to collect data completely autonomously. This would remove the need of a human to "curate" the dataset of tasks for the artificial agent and enable fully scalable never ending embodied learning. The central theme of the approach we take will be to consider the online real world nature of "tasks" that an agent must solve, and through it revisit the basic assumptions of episodic RL. Finally, we will conclude with a demonstration of these ideas in the domain of real world dexterous manipula
Published: 2022

29. Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

Author: Xu, Kelvin, primary, Hu, Zheyuan, additional, Doshi, Ria, additional, Rovinsky, Aaron, additional, Kumar, Vikash, additional, Gupta, Abhishek, additional, and Levine, Sergey, additional
Published: 2023
Full Text: View/download PDF

30. LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

Author: Abdulhai, Marwa, White, Isadora, Snell, Charlie, Sun, Charles, Hong, Joey, Zhai, Yuexiang, Xu, Kelvin, and Levine, Sergey
Abstract: Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional or goal-directed agents and might necessitate considerable prompt tuning. This becomes particularly apparent in multi-turn conversations: even the best current LLMs rarely ask clarifying questions, engage in explicit information gathering, or take actions now that lead to better decisions after multiple turns. Reinforcement learning has the potential to leverage the powerful modeling capabilities of LLMs, as well as their internal representation of textual interactions, to create capable goal-directed language agents. This can enable intentional and temporally extended interactions, such as with humans, through coordinated persuasion and carefully crafted questions, or in goal-directed play through text games to bring about desired final outcomes. However, enabling this requires the community to develop stable and reliable reinforcement learning algorithms that can effectively train LLMs. Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms. Our paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for LLMs, together with an open-source research framework containing a basic toolkit for getting started on multi-turn RL with offline value-based and policy-based RL methods. Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
Published: 2023

31. Binding Strength and Hydrogen Bond Numbers between COVID-19 RBD and HVR of Antibody

Author: Wang, Ryan Taoran, primary, Xu, Alex Fan, additional, Zhou, Qi, additional, Song, Tinglu, additional, Xu, Kelvin J., additional, and Xu, Gu, additional
Published: 2021
Full Text: View/download PDF

32. Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention

Author: Gupta, Abhishek, primary, Yu, Justin, additional, Zhao, Tony Z., additional, Kumar, Vikash, additional, Rovinsky, Aaron, additional, Xu, Kelvin, additional, Devlin, Thomas, additional, and Levine, Sergey, additional
Published: 2021
Full Text: View/download PDF

33. Binding strength and hydrogen bond numbers between Covid-19 RBD and HVR of antibody

Author: Wang, Ryan Taoran, primary, Xu, Alex Fan, additional, Zhou, Qi, additional, Song, Tinglu, additional, Xu, Kelvin J., additional, and Xu, Gu, additional
Published: 2020
Full Text: View/download PDF

34. Hysteresis and Instability Predicted in Moisture Degradation of Perovskite Solar Cells

Author: Xu, Kelvin J., primary, Wang, Ryan T., additional, Xu, Alex F., additional, Chen, Jason Y., additional, and Xu, Gu, additional
Published: 2020
Full Text: View/download PDF

35. Exploring Attention Based Model for Captioning Images

Author: Xu, Kelvin, Bengio, Yoshua, and Courville, Aaron
Subjects: Neural Networks, Representation Learning, Variational Inference, Reinforcement Learning, Sequence Modelling, Deep Learning, Attention, Caption Generation, Supervised Learning
Abstract: Understanding the content of images is arguably the primary goal of computer vision. Beyond merely saying what is in an image, one test of a system's understanding of an image is its ability to describe the contents of an image in natural language (a task we will refer to in this thesis as "image captioning"). Since 2012, neural networks have exploded as the de facto modelling tool for many important applications in machine learning. Inspired by recent work in machine translation and object detection, this thesis explores such models that can describe the content of images. In addition, it explores how the notion of "attention" can be both parameterized by neural networks and usefully employed for image captioning. More technically, this thesis presents a single attention based neural network that can describe images. It describes how to train such models in a purely deterministic manner using standard backpropagation and stochastically by considering techniques used in variational inference and reinforcement learning. Surprisingly, we show through visualization how the model is able to automatically learn an intuitive gaze of salient objects corresponding to words in the output sequence. We validate the use of an attention based approach with state-of-the-art performance on three benchmark datasets: Flickr8k, Flickr30k and MS COCO.
Published: 2018

36. Privacy-Preserving Fall Detection with Deep Learning on mmWave Radar Signal

Author: Sun, Yangfan, primary, Hang, Renlong, additional, Li, Zhu, additional, Jin, Mouqing, additional, and Xu, Kelvin, additional
Published: 2019
Full Text: View/download PDF

37. The challenges in managing perioperative hypoglycemia in a child on modified Atkins diet

Author: Mak, WenJie, primary, Xu, Kelvin, additional, Lim, Sophie WeiYan, additional, and Hee, Hwan Ing, additional
Published: 2019
Full Text: View/download PDF

38. Automatically inferring task context for continual learning

Author: Collins, Jasmine, primary, Xu, Kelvin, additional, Olshausen, Bruno, additional, and Cheung, Brian, additional
Published: 2019
Full Text: View/download PDF

39. Unsupervised Perceptual Rewards for Imitation Learning

Author: Sermanet, Pierre, primary, Xu, Kelvin, additional, and Levine, Sergey, additional
Published: 2017
Full Text: View/download PDF

40. Structural basis for corepressor assembly by the orphan nuclear receptor TLX

Author: Zhi, Xiaoyong, primary, Zhou, X. Edward, additional, He, Yuanzheng, additional, Searose-Xu, Kelvin, additional, Zhang, Chun-Li, additional, Tsai, Chih-Cheng, additional, Melcher, Karsten, additional, and Xu, H. Eric, additional
Published: 2015
Full Text: View/download PDF

41. Structural basis for corepressor assembly by the orphan nuclear receptor TLX.

Author: Xiaoyong Zhi, X. Edward Zhou, Yuanzheng He, Searose-Xu, Kelvin, Chun-Li Zhang, Chih-Cheng Tsai, Melcher, Karsten, and Xu, H. Eric
Subjects: *NUCLEAR receptors (Biochemistry), *HORMONE receptors, *CELL receptors, *GENETIC repressors, *PEPTIDES
Abstract: The orphan nuclear receptor TLX regulates neural stem cell self-renewal in the adult brain and functions primarily as a transcription repressor through recruitment of Atrophin corepressors, which bind to TLX via a conserved peptide motif termed the Atro box. Here we report crystal structures of the human and insect TLX ligand-binding domain in complex with Atro box peptides. In these structures, TLX adopts an autorepressed conformation in which its helix H12 occupies the coactivator-binding groove. Unexpectedly, H12 in this autorepressed conformation forms a novel binding pocket with residues from helix H3 that accommodates a short helix formed by the conserved ALXXLXXY motif of the Atro box. Mutations that weaken the TLX-Atrophin interaction compromise the repressive activity of TLX, demonstrating that this interaction is required for Atrophin to confer repressor activity to TLX. Moreover, the autorepressed conformation is conserved in the repressor class of orphan nuclear receptors, and mutations of corresponding residues in other members of this class of receptors diminish their repressor activities. Together, our results establish the functional conservation of the autorepressed conformation and define a key sequence motif in the Atro box that is essential for TLX-mediated repression. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF
