Author: "Schulman, A." / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Schulman, A."' showing total 333 results

1. Measuring short-form factuality in large language models

Author: Wei, Jason, Karina, Nguyen, Chung, Hyung Won, Jiao, Yunxin Joy, Papay, Spencer, Glaese, Amelia, Schulman, John, and Fedus, William
Subjects: Computer Science - Computation and Language
Abstract: We present SimpleQA, a benchmark that evaluates the ability of language models to answer short, fact-seeking questions. We prioritized two properties in designing this eval. First, SimpleQA is challenging, as it is adversarially collected against GPT-4 responses. Second, responses are easy to grade, because questions are created such that there exists only a single, indisputable answer. Each answer in SimpleQA is graded as either correct, incorrect, or not attempted. A model with ideal behavior would get as many questions correct as possible while not attempting the questions for which it is not confident it knows the correct answer. SimpleQA is a simple, targeted evaluation for whether models "know what they know," and our hope is that this benchmark will remain relevant for the next few generations of frontier models. SimpleQA can be found at https://github.com/openai/simple-evals.
Comment: Blog post: https://openai.com/index/introducing-simpleqa/
Published: 2024

2. Rule Based Rewards for Language Model Safety

Author: Mu, Tong, Helyar, Alec, Heidecke, Johannes, Achiam, Joshua, Vallone, Andrea, Kivlichan, Ian, Lin, Molly, Beutel, Alex, Schulman, John, and Weng, Lilian
Subjects: Computer Science - Artificial Intelligence
Abstract: Reinforcement learning based fine-tuning of large language models (LLMs) on human preferences has been shown to enhance both their capabilities and safety behavior. However, in cases related to safety, without precise instructions to human annotators, the data collected may cause the model to become overly cautious, or to respond in an undesirable style, such as being judgmental. Additionally, as model capabilities and usage patterns evolve, there may be a costly need to add or relabel data to modify safety behavior. We propose a novel preference modeling approach that utilizes AI feedback and only requires a small amount of human data. Our method, Rule Based Rewards (RBR), uses a collection of rules for desired or undesired behaviors (e.g. refusals should not be judgmental) along with an LLM grader. In contrast to prior methods using AI feedback, our method uses fine-grained, composable, LLM-graded few-shot prompts as reward directly in RL training, resulting in greater control, accuracy and ease of updating. We show that RBRs are an effective training method, achieving an F1 score of 97.1, compared to a human-feedback baseline of 91.7, resulting in much higher safety-behavior accuracy through better balancing usefulness and safety.
Comment: Accepted at NeurIPS 2024
Published: 2024

3. GPT-4o System Card

Author: OpenAI, Hurst, Aaron, Lerer, Adam, Goucher, Adam P., Perelman, Adam, Ramesh, Aditya, Clark, Aidan, Ostrow, AJ, Welihinda, Akila, Hayes, Alan, Radford, Alec, Mądry, Aleksander, Baker-Whitcomb, Alex, Beutel, Alex, Borzunov, Alex, Carney, Alex, Chow, Alex, Kirillov, Alex, Nichol, Alex, Paino, Alex, Renzin, Alex, Passos, Alex Tachard, Kirillov, Alexander, Christakis, Alexi, Conneau, Alexis, Kamali, Ali, Jabri, Allan, Moyer, Allison, Tam, Allison, Crookes, Amadou, Tootoochian, Amin, Tootoonchian, Amin, Kumar, Ananya, Vallone, Andrea, Karpathy, Andrej, Braunstein, Andrew, Cann, Andrew, Codispoti, Andrew, Galu, Andrew, Kondrich, Andrew, Tulloch, Andrew, Mishchenko, Andrey, Baek, Angela, Jiang, Angela, Pelisse, Antoine, Woodford, Antonia, Gosalia, Anuj, Dhar, Arka, Pantuliano, Ashley, Nayak, Avi, Oliver, Avital, Zoph, Barret, Ghorbani, Behrooz, Leimberger, Ben, Rossen, Ben, Sokolowsky, Ben, Wang, Ben, Zweig, Benjamin, Hoover, Beth, Samic, Blake, McGrew, Bob, Spero, Bobby, Giertler, Bogo, Cheng, Bowen, Lightcap, Brad, Walkin, Brandon, Quinn, Brendan, Guarraci, Brian, Hsu, Brian, Kellogg, Bright, Eastman, Brydon, Lugaresi, Camillo, Wainwright, Carroll, Bassin, Cary, Hudson, Cary, Chu, Casey, Nelson, Chad, Li, Chak, Shern, Chan Jun, Conger, Channing, Barette, Charlotte, Voss, Chelsea, Ding, Chen, Lu, Cheng, Zhang, Chong, Beaumont, Chris, Hallacy, Chris, Koch, Chris, Gibson, Christian, Kim, Christina, Choi, Christine, McLeavey, Christine, Hesse, Christopher, Fischer, Claudia, Winter, Clemens, Czarnecki, Coley, Jarvis, Colin, Wei, Colin, Koumouzelis, Constantin, Sherburn, Dane, Kappler, Daniel, Levin, Daniel, Levy, Daniel, Carr, David, Farhi, David, Mely, David, Robinson, David, Sasaki, David, Jin, Denny, Valladares, Dev, Tsipras, Dimitris, Li, Doug, Nguyen, Duc Phong, Findlay, Duncan, Oiwoh, Edede, Wong, Edmund, Asdar, Ehsan, Proehl, Elizabeth, Yang, Elizabeth, Antonow, Eric, Kramer, Eric, Peterson, Eric, Sigler, Eric, Wallace, Eric, Brevdo, Eugene, Mays, Evan, 
Khorasani, Farzad, Such, Felipe Petroski, Raso, Filippo, Zhang, Francis, von Lohmann, Fred, Sulit, Freddie, Goh, Gabriel, Oden, Gene, Salmon, Geoff, Starace, Giulio, Brockman, Greg, Salman, Hadi, Bao, Haiming, Hu, Haitang, Wong, Hannah, Wang, Haoyu, Schmidt, Heather, Whitney, Heather, Jun, Heewoo, Kirchner, Hendrik, Pinto, Henrique Ponde de Oliveira, Ren, Hongyu, Chang, Huiwen, Chung, Hyung Won, Kivlichan, Ian, O'Connell, Ian, Osband, Ian, Silber, Ian, Sohl, Ian, Okuyucu, Ibrahim, Lan, Ikai, Kostrikov, Ilya, Sutskever, Ilya, Kanitscheider, Ingmar, Gulrajani, Ishaan, Coxon, Jacob, Menick, Jacob, Pachocki, Jakub, Aung, James, Betker, James, Crooks, James, Lennon, James, Kiros, Jamie, Leike, Jan, Park, Jane, Kwon, Jason, Phang, Jason, Teplitz, Jason, Wei, Jason, Wolfe, Jason, Chen, Jay, Harris, Jeff, Varavva, Jenia, Lee, Jessica Gan, Shieh, Jessica, Lin, Ji, Yu, Jiahui, Weng, Jiayi, Tang, Jie, Yu, Jieqi, Jang, Joanne, Candela, Joaquin Quinonero, Beutler, Joe, Landers, Joe, Parish, Joel, Heidecke, Johannes, Schulman, John, Lachman, Jonathan, McKay, Jonathan, Uesato, Jonathan, Ward, Jonathan, Kim, Jong Wook, Huizinga, Joost, Sitkin, Jordan, Kraaijeveld, Jos, Gross, Josh, Kaplan, Josh, Snyder, Josh, Achiam, Joshua, Jiao, Joy, Lee, Joyce, Zhuang, Juntang, Harriman, Justyn, Fricke, Kai, Hayashi, Kai, Singhal, Karan, Shi, Katy, Karthik, Kavin, Wood, Kayla, Rimbach, Kendra, Hsu, Kenny, Nguyen, Kenny, Gu-Lemberg, Keren, Button, Kevin, Liu, Kevin, Howe, Kiel, Muthukumar, Krithika, Luther, Kyle, Ahmad, Lama, Kai, Larry, Itow, Lauren, Workman, Lauren, Pathak, Leher, Chen, Leo, Jing, Li, Guy, Lia, Fedus, Liam, Zhou, Liang, Mamitsuka, Lien, Weng, Lilian, McCallum, Lindsay, Held, Lindsey, Ouyang, Long, Feuvrier, Louis, Zhang, Lu, Kondraciuk, Lukas, Kaiser, Lukasz, Hewitt, Luke, Metz, Luke, Doshi, Lyric, Aflak, Mada, Simens, Maddie, Boyd, Madelaine, Thompson, Madeleine, Dukhan, Marat, Chen, Mark, Gray, Mark, Hudnall, Mark, Zhang, Marvin, Aljubeh, Marwan, Litwin, Mateusz, Zeng, 
Matthew, Johnson, Max, Shetty, Maya, Gupta, Mayank, Shah, Meghan, Yatbaz, Mehmet, Yang, Meng Jia, Zhong, Mengchao, Glaese, Mia, Chen, Mianna, Janner, Michael, Lampe, Michael, Petrov, Michael, Wu, Michael, Wang, Michele, Fradin, Michelle, Pokrass, Michelle, Castro, Miguel, de Castro, Miguel Oom Temudo, Pavlov, Mikhail, Brundage, Miles, Wang, Miles, Khan, Minal, Murati, Mira, Bavarian, Mo, Lin, Molly, Yesildal, Murat, Soto, Nacho, Gimelshein, Natalia, Cone, Natalie, Staudacher, Natalie, Summers, Natalie, LaFontaine, Natan, Chowdhury, Neil, Ryder, Nick, Stathas, Nick, Turley, Nick, Tezak, Nik, Felix, Niko, Kudige, Nithanth, Keskar, Nitish, Deutsch, Noah, Bundick, Noel, Puckett, Nora, Nachum, Ofir, Okelola, Ola, Boiko, Oleg, Murk, Oleg, Jaffe, Oliver, Watkins, Olivia, Godement, Olivier, Campbell-Moore, Owen, Chao, Patrick, McMillan, Paul, Belov, Pavel, Su, Peng, Bak, Peter, Bakkum, Peter, Deng, Peter, Dolan, Peter, Hoeschele, Peter, Welinder, Peter, Tillet, Phil, Pronin, Philip, Tillet, Philippe, Dhariwal, Prafulla, Yuan, Qiming, Dias, Rachel, Lim, Rachel, Arora, Rahul, Troll, Rajan, Lin, Randall, Lopes, Rapha Gontijo, Puri, Raul, Miyara, Reah, Leike, Reimar, Gaubert, Renaud, Zamani, Reza, Wang, Ricky, Donnelly, Rob, Honsby, Rob, Smith, Rocky, Sahai, Rohan, Ramchandani, Rohit, Huet, Romain, Carmichael, Rory, Zellers, Rowan, Chen, Roy, Chen, Ruby, Nigmatullin, Ruslan, Cheu, Ryan, Jain, Saachi, Altman, Sam, Schoenholz, Sam, Toizer, Sam, Miserendino, Samuel, Agarwal, Sandhini, Culver, Sara, Ethersmith, Scott, Gray, Scott, Grove, Sean, Metzger, Sean, Hermani, Shamez, Jain, Shantanu, Zhao, Shengjia, Wu, Sherwin, Jomoto, Shino, Wu, Shirong, Shuaiqi, Xia, Phene, Sonia, Papay, Spencer, Narayanan, Srinivas, Coffey, Steve, Lee, Steve, Hall, Stewart, Balaji, Suchir, Broda, Tal, Stramer, Tal, Xu, Tao, Gogineni, Tarun, Christianson, Taya, Sanders, Ted, Patwardhan, Tejal, Cunninghman, Thomas, Degry, Thomas, Dimson, Thomas, Raoux, Thomas, Shadwell, Thomas, Zheng, Tianhao, Underwood, 
Todd, Markov, Todor, Sherbakov, Toki, Rubin, Tom, Stasi, Tom, Kaftan, Tomer, Heywood, Tristan, Peterson, Troy, Walters, Tyce, Eloundou, Tyna, Qi, Valerie, Moeller, Veit, Monaco, Vinnie, Kuo, Vishal, Fomenko, Vlad, Chang, Wayne, Zheng, Weiyi, Zhou, Wenda, Manassra, Wesam, Sheu, Will, Zaremba, Wojciech, Patil, Yash, Qian, Yilei, Kim, Yongjik, Cheng, Youlong, Zhang, Yu, He, Yuchen, Zhang, Yuchen, Jin, Yujia, Dai, Yunxing, and Malkov, Yury
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.
Published: 2024

4. Diversity in Evolutionary Dynamics

Author: Rabani, Yuval, Schulman, Leonard J., and Sinclair, Alistair
Subjects: Quantitative Biology - Populations and Evolution, Computer Science - Computational Engineering, Finance, and Science, Computer Science - Computer Science and Game Theory
Abstract: We consider the dynamics imposed by natural selection on the populations of two competing, sexually reproducing, haploid species. In this setting, the fitness of any genome varies over time due to the changing population mix of the competing species; crucially, this fitness variation arises naturally from the model itself, without the need for imposing it exogenously as is typically the case. Previous work on this model [14] showed that, in the special case where each of the two species exhibits just two phenotypes, genetic diversity is maintained at all times. This finding supported the tenet that sexual reproduction is advantageous because it promotes diversity, which increases the survivability of a species. In the present paper we consider the more realistic case where there are more than two phenotypes available to each species. The conclusions about diversity in general turn out to be very different from the two-phenotype case. Our first result is negative: namely, we show that sexual reproduction does not guarantee the maintenance of diversity at all times, i.e., the result of [14] does not generalize. Our counterexample consists of two competing species with just three phenotypes each. We show that, for any time $t_0$ and any $\varepsilon>0$, there is a time $t\ge t_0$ at which the combined diversity of both species is smaller than $\varepsilon$. Our main result is a complementary positive statement, which says that in any non-degenerate example, diversity is maintained in a weaker, "infinitely often" sense. Thus, our results refute the supposition that sexual reproduction ensures diversity at all times, but affirm a weaker assertion that extended periods of high diversity are necessarily a recurrent event.
Published: 2024

5. An Interactive Agent Foundation Model

Author: Durante, Zane, Sarkar, Bidipta, Gong, Ran, Taori, Rohan, Noda, Yusuke, Tang, Paul, Adeli, Ehsan, Lakshmikanth, Shrinidhi Kowshika, Schulman, Kevin, Milstein, Arnold, Terzopoulos, Demetri, Famoti, Ade, Kuno, Noboru, Llorens, Ashley, Vo, Hoi, Ikeuchi, Katsu, Fei-Fei, Li, Gao, Jianfeng, Wake, Naoki, and Huang, Qiuyuan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction, enabling a versatile and adaptable AI framework. We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare. Our model demonstrates its ability to generate meaningful and contextually relevant outputs in each area. The strength of our approach lies in its generality, leveraging a variety of data sources such as robotics sequences, gameplay data, large-scale video datasets, and textual information for effective multimodal and multi-task learning. Our approach provides a promising avenue for developing generalist, action-taking, multimodal systems.
Published: 2024

6. Causal Discovery under Latent Class Confounding

Author: Mazaheri, Bijan, Gordon, Spencer, Rabani, Yuval, and Schulman, Leonard
Subjects: Computer Science - Machine Learning, Computer Science - Computational Complexity, Mathematics - Statistics Theory
Abstract: An acyclic causal structure can be described with a directed acyclic graph (DAG), where arrows indicate the possibility of direct causation. The task of learning this structure from data is known as "causal discovery." Diverse populations or changing environments can sometimes give rise to data that is heterogeneous in the following sense: each population/environment is a "source" which idiosyncratically determines the forms of those direct causal effects. From this perspective, the source is a latent common cause for every observed variable. While some methods for causal discovery are able to work around latent confounding in special cases, especially when only few observables are confounded, a global confounder is a difficult challenge. The only known ways to deal with latent global confounding involve assumptions that limit the structural equations and/or noise functions. We demonstrate that globally confounded causal structures can still be identifiable with arbitrary structural equations and noise functions, so long as the number of latent classes remains small relative to the size and sparsity of the underlying DAG.
Published: 2023

7. Identifiability of Product of Experts Models

Author: Gordon, Spencer L., Kant, Manav, Ma, Eric, Schulman, Leonard J., and Staicu, Andrei
Subjects: Computer Science - Machine Learning, Mathematics - Algebraic Geometry, Mathematics - Statistics Theory, 62E10, 62F99, 68T05, I.2.6
Abstract: Product of experts (PoE) are layered networks in which the value at each node is an AND (or product) of the values (possibly negated) at its inputs. These were introduced as a neural network architecture that can efficiently learn to generate high-dimensional data which satisfy many low-dimensional constraints -- thereby allowing each individual expert to perform a simple task. PoEs have found a variety of applications in learning. We study the problem of identifiability of a product of experts model having a layer of binary latent variables, and a layer of binary observables that are iid conditional on the latents. The previous best upper bound on the number of observables needed to identify the model was exponential in the number of parameters. We show: (a) When the latents are uniformly distributed, the model is identifiable with a number of observables equal to the number of parameters (and hence best possible). (b) In the more general case of arbitrarily distributed latents, the model is identifiable for a number of observables that is still linear in the number of parameters (and within a factor of two of best-possible). The proofs rely on root interlacing phenomena for some special three-term recurrences.
Comment: 24 pages, 2 figures
Published: 2023

8. Nuclear Pleomorphism in Canine Cutaneous Mast Cell Tumors: Comparison of Reproducibility and Prognostic Relevance between Estimates, Manual Morphometry and Algorithmic Morphometry

Author: Haghofer, Andreas, Parlak, Eda, Bartel, Alexander, Donovan, Taryn A., Assenmacher, Charles-Antoine, Bolfa, Pompei, Dark, Michael J., Fuchs-Baumgartinger, Andrea, Klang, Andrea, Jäger, Kathrin, Klopfleisch, Robert, Merz, Sophie, Richter, Barbara, Schulman, F. Yvonne, Janout, Hannah, Ganz, Jonathan, Scharinger, Josef, Aubreville, Marc, Winkler, Stephan M., Kiupel, Matti, and Bertram, Christof A.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Variation in nuclear size and shape is an important criterion of malignancy for many tumor types; however, categorical estimates by pathologists have poor reproducibility. Measurements of nuclear characteristics (morphometry) can improve reproducibility, but manual methods are time consuming. The aim of this study was to explore the limitations of estimates and develop alternative morphometric solutions for canine cutaneous mast cell tumors (ccMCT). We assessed the following nuclear evaluation methods for measurement accuracy, reproducibility, and prognostic utility: 1) anisokaryosis (karyomegaly) estimates by 11 pathologists; 2) gold standard manual morphometry of at least 100 nuclei; 3) practicable manual morphometry with stratified sampling of 12 nuclei by 9 pathologists; and 4) automated morphometry using a deep learning-based segmentation algorithm. The study dataset comprised 96 ccMCT with available outcome information. Inter-rater reproducibility of karyomegaly estimates was low ($\kappa$ = 0.226), while it was good (ICC = 0.654) for practicable morphometry of the standard deviation (SD) of nuclear size. As compared to gold standard manual morphometry (AUC = 0.839, 95% CI: 0.701 - 0.977), the prognostic value (tumor-specific survival) of SDs of nuclear area for practicable manual morphometry (12 nuclei) and automated morphometry were high with an area under the ROC curve (AUC) of 0.868 (95% CI: 0.737 - 0.991) and 0.943 (95% CI: 0.889 - 0.996), respectively. This study supports the use of manual morphometry with stratified sampling of 12 nuclei and algorithmic morphometry to overcome the poor reproducibility of estimates.
Published: 2023

9. Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity

Author: Gordon, Spencer L., Jahn, Erik, Mazaheri, Bijan, Rabani, Yuval, and Schulman, Leonard J.
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning
Abstract: We consider the problem of identifying, from statistics, a distribution of discrete random variables $X_1,\ldots,X_n$ that is a mixture of $k$ product distributions. The best previous sample complexity for $n \in O(k)$ was $(1/\zeta)^{O(k^2 \log k)}$ (under a mild separation assumption parameterized by $\zeta$). The best known lower bound was $\exp(\Omega(k))$. It is known that $n\geq 2k-1$ is necessary and sufficient for identification. We show, for any $n\geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$. We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$. Our results are obtained by combining (a) a classic method for robust tensor decomposition, (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
Published: 2023

10. Let's Verify Step by Step

Author: Lightman, Hunter, Kosaraju, Vineet, Burda, Yura, Edwards, Harri, Baker, Bowen, Lee, Teddy, Leike, Jan, Schulman, John, Sutskever, Ilya, and Cobbe, Karl
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: In recent years, large language models have greatly improved in their ability to perform complex multi-step reasoning. However, even state-of-the-art models still regularly produce logical mistakes. To train more reliable models, we can turn either to outcome supervision, which provides feedback for a final result, or process supervision, which provides feedback for each intermediate reasoning step. Given the importance of training reliable models, and given the high cost of human feedback, it is important to carefully compare both methods. Recent work has already begun this comparison, but many questions still remain. We conduct our own investigation, finding that process supervision significantly outperforms outcome supervision for training models to solve problems from the challenging MATH dataset. Our process-supervised model solves 78% of problems from a representative subset of the MATH test set. Additionally, we show that active learning significantly improves the efficacy of process supervision. To support related research, we also release PRM800K, the complete dataset of 800,000 step-level human feedback labels used to train our best reward model.
Published: 2023

11. The impact of local pinning sites in magnetic tunnel junctions with non-homogeneous free layers

Author: Jenkins, Alex. S., Martins, Leandro, Benetti, Luana, Schulman, Alejandro, Anacleto, Pedro, Claro, Marcel, Paz, Elvira, Çaha, Ihsan, Deepak, Francis Leonard, and Ferreira, Ricardo
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Pinning at local defects is a significant roadblock for the successful implementation of technological paradigms which rely on the dynamic properties of non-trivial magnetic textures. In this report a comprehensive study of the influence of local pinning sites for non-homogeneous magnetic layers integrated as the free layer of a magnetic tunnel junction is presented, both experimentally and with corresponding micromagnetic simulations. The pinning sites are found to be extremely detrimental to the frequency controllability of the devices, a key requirement for their use as synapses in frequency-multiplexed artificial neural networks. In addition to describing the impact of the local pinning sites in the more conventional NiFe, a vortex-based magnetic tunnel junction with an amorphous free layer is presented which shows significantly improved frequency selectivity, marking a clear direction for the design of future low-power devices.
Comment: 5 figures publication
Published: 2023

12. Enhancing Spin Transfer Torque in Magnetic Tunnel Junction Devices: Exploring the Influence of Capping Layer Materials and Thickness on Device Characteristics

Author: Parvini, Tahereh Sadat, Paz, Elvira, Böhnert, Tim, Schulman, Alejandro, Benetti, Luana, Oberbauer, Felix, Walowski, Jakob, Moradi, Farshad, Ferreira, Ricardo, and Münzenberg, Markus
Subjects: Physics - Optics
Abstract: We have developed and optimized two categories of spin transfer torque magnetic tunnel junctions (STT-MTJs) that exhibit a high tunnel magnetoresistance (TMR) ratio, low critical current, high output power in the microwatt range, and auto-oscillation behavior. These characteristics demonstrate the potential of STT-MTJs for low-power, high-speed, and reliable spintronic applications, including magnetic memory, logic, and signal processing. The only distinguishing factor between the two categories, denoted as A-MTJs and B-MTJs, is the composition of their free layers, 2 CoFeB/0.21 Ta/6 CoFeSiB for A-MTJs and 2 CoFeB/0.21 Ta/7 NiFe for B-MTJs. Our study reveals that B-MTJs exhibit lower critical currents for auto-oscillation than A-MTJs. We found that both stacks have comparable saturation magnetization and anisotropy field, suggesting that the difference in auto-oscillation behavior is due to the higher damping of A-MTJs compared to B-MTJs. To verify this hypothesis, we employed the all-optical time-resolved magneto-optical Kerr effect (TRMOKE) technique, which confirmed that STT-MTJs with lower damping exhibited auto-oscillation at lower critical current values. Additionally, our study aimed to optimize the STT-MTJ performance by investigating the impact of the capping layer on the device's response to electronic and optical stimuli.
Published: 2023

13. GPT-4 Technical Report

Author: OpenAI, Achiam, Josh, Adler, Steven, Agarwal, Sandhini, Ahmad, Lama, Akkaya, Ilge, Aleman, Florencia Leoni, Almeida, Diogo, Altenschmidt, Janko, Altman, Sam, Anadkat, Shyamal, Avila, Red, Babuschkin, Igor, Balaji, Suchir, Balcom, Valerie, Baltescu, Paul, Bao, Haiming, Bavarian, Mohammad, Belgum, Jeff, Bello, Irwan, Berdine, Jake, Bernadett-Shapiro, Gabriel, Berner, Christopher, Bogdonoff, Lenny, Boiko, Oleg, Boyd, Madelaine, Brakman, Anna-Luisa, Brockman, Greg, Brooks, Tim, Brundage, Miles, Button, Kevin, Cai, Trevor, Campbell, Rosie, Cann, Andrew, Carey, Brittany, Carlson, Chelsea, Carmichael, Rory, Chan, Brooke, Chang, Che, Chantzis, Fotis, Chen, Derek, Chen, Sully, Chen, Ruby, Chen, Jason, Chen, Mark, Chess, Ben, Cho, Chester, Chu, Casey, Chung, Hyung Won, Cummings, Dave, Currier, Jeremiah, Dai, Yunxing, Decareaux, Cory, Degry, Thomas, Deutsch, Noah, Deville, Damien, Dhar, Arka, Dohan, David, Dowling, Steve, Dunning, Sheila, Ecoffet, Adrien, Eleti, Atty, Eloundou, Tyna, Farhi, David, Fedus, Liam, Felix, Niko, Fishman, Simón Posada, Forte, Juston, Fulford, Isabella, Gao, Leo, Georges, Elie, Gibson, Christian, Goel, Vik, Gogineni, Tarun, Goh, Gabriel, Gontijo-Lopes, Rapha, Gordon, Jonathan, Grafstein, Morgan, Gray, Scott, Greene, Ryan, Gross, Joshua, Gu, Shixiang Shane, Guo, Yufei, Hallacy, Chris, Han, Jesse, Harris, Jeff, He, Yuchen, Heaton, Mike, Heidecke, Johannes, Hesse, Chris, Hickey, Alan, Hickey, Wade, Hoeschele, Peter, Houghton, Brandon, Hsu, Kenny, Hu, Shengli, Hu, Xin, Huizinga, Joost, Jain, Shantanu, Jain, Shawn, Jang, Joanne, Jiang, Angela, Jiang, Roger, Jin, Haozhun, Jin, Denny, Jomoto, Shino, Jonn, Billie, Jun, Heewoo, Kaftan, Tomer, Kaiser, Łukasz, Kamali, Ali, Kanitscheider, Ingmar, Keskar, Nitish Shirish, Khan, Tabarak, Kilpatrick, Logan, Kim, Jong Wook, Kim, Christina, Kim, Yongjik, Kirchner, Jan Hendrik, Kiros, Jamie, Knight, Matt, Kokotajlo, Daniel, Kondraciuk, Łukasz, Kondrich, Andrew, Konstantinidis, Aris, Kosic, Kyle, Krueger, 
Gretchen, Kuo, Vishal, Lampe, Michael, Lan, Ikai, Lee, Teddy, Leike, Jan, Leung, Jade, Levy, Daniel, Li, Chak Ming, Lim, Rachel, Lin, Molly, Lin, Stephanie, Litwin, Mateusz, Lopez, Theresa, Lowe, Ryan, Lue, Patricia, Makanju, Anna, Malfacini, Kim, Manning, Sam, Markov, Todor, Markovski, Yaniv, Martin, Bianca, Mayer, Katie, Mayne, Andrew, McGrew, Bob, McKinney, Scott Mayer, McLeavey, Christine, McMillan, Paul, McNeil, Jake, Medina, David, Mehta, Aalok, Menick, Jacob, Metz, Luke, Mishchenko, Andrey, Mishkin, Pamela, Monaco, Vinnie, Morikawa, Evan, Mossing, Daniel, Mu, Tong, Murati, Mira, Murk, Oleg, Mély, David, Nair, Ashvin, Nakano, Reiichiro, Nayak, Rajeev, Neelakantan, Arvind, Ngo, Richard, Noh, Hyeonwoo, Ouyang, Long, O'Keefe, Cullen, Pachocki, Jakub, Paino, Alex, Palermo, Joe, Pantuliano, Ashley, Parascandolo, Giambattista, Parish, Joel, Parparita, Emy, Passos, Alex, Pavlov, Mikhail, Peng, Andrew, Perelman, Adam, Peres, Filipe de Avila Belbute, Petrov, Michael, Pinto, Henrique Ponde de Oliveira, Michael, Pokorny, Pokrass, Michelle, Pong, Vitchyr H., Powell, Tolly, Power, Alethea, Power, Boris, Proehl, Elizabeth, Puri, Raul, Radford, Alec, Rae, Jack, Ramesh, Aditya, Raymond, Cameron, Real, Francis, Rimbach, Kendra, Ross, Carl, Rotsted, Bob, Roussez, Henri, Ryder, Nick, Saltarelli, Mario, Sanders, Ted, Santurkar, Shibani, Sastry, Girish, Schmidt, Heather, Schnurr, David, Schulman, John, Selsam, Daniel, Sheppard, Kyla, Sherbakov, Toki, Shieh, Jessica, Shoker, Sarah, Shyam, Pranav, Sidor, Szymon, Sigler, Eric, Simens, Maddie, Sitkin, Jordan, Slama, Katarina, Sohl, Ian, Sokolowsky, Benjamin, Song, Yang, Staudacher, Natalie, Such, Felipe Petroski, Summers, Natalie, Sutskever, Ilya, Tang, Jie, Tezak, Nikolas, Thompson, Madeleine B., Tillet, Phil, Tootoonchian, Amin, Tseng, Elizabeth, Tuggle, Preston, Turley, Nick, Tworek, Jerry, Uribe, Juan Felipe Cerón, Vallone, Andrea, Vijayvergiya, Arun, Voss, Chelsea, Wainwright, Carroll, Wang, Justin Jay, Wang, Alvin, Wang, Ben, 
Ward, Jonathan, Wei, Jason, Weinmann, CJ, Welihinda, Akila, Welinder, Peter, Weng, Jiayi, Weng, Lilian, Wiethoff, Matt, Willner, Dave, Winter, Clemens, Wolrich, Samuel, Wong, Hannah, Workman, Lauren, Wu, Sherwin, Wu, Jeff, Wu, Michael, Xiao, Kai, Xu, Tao, Yoo, Sarah, Yu, Kevin, Yuan, Qiming, Zaremba, Wojciech, Zellers, Rowan, Zhang, Chong, Zhang, Marvin, Zhao, Shengjia, Zheng, Tianhao, Zhuang, Juntang, Zhuk, William, and Zoph, Barret
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
Comment: 100 pages; updated authors list; fixed author names and added citation
Published: 2023

14. Scaling laws for single-agent reinforcement learning

Author: Hilton, Jacob, Tang, Jie, and Schulman, John
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Recent work has shown that, in generative modeling, cross-entropy loss improves smoothly with model size and training compute, following a power law plus constant scaling law. One challenge in extending these results to reinforcement learning is that the main performance objective of interest, mean episode return, need not vary smoothly. To overcome this, we introduce *intrinsic performance*, a monotonic function of the return defined as the minimum compute required to achieve the given return across a family of models of different sizes. We find that, across a range of environments, intrinsic performance scales as a power law in model size and environment interactions. Consequently, as in generative modeling, the optimal model size scales as a power law in the training compute budget. Furthermore, we study how this relationship varies with the environment and with other properties of the training setup. In particular, using a toy MNIST-based environment, we show that varying the "horizon length" of the task mostly changes the coefficient but not the exponent of this relationship., Comment: 33 pages
Published: 2023

15. Need for 'special' states in a deterministic theory of quantum mechanics

Author: Schulman, L. S.
Subjects: Quantum Physics
Abstract: There are several theories or processes which may underlie quantum mechanics and make it deterministic. Some references are given in the main text. Any such theory, plus a number of reasonable assumptions, implies the existence of what I have called "special" states. The assumptions are conservation laws, obedience (up to a point) of Schrödinger's equation, and a single world, in the sense of the many worlds interpretation (the last one a consequence of any deterministic theory). This article also, for clarity, gives an example of a "special" state. There is an experimental test of the "special" state theory., Comment: 5 pages, 2 figures. For v2 I have added the condition (for the validity of my argument) that there not be degrees of freedom invisible to the Schrödinger equation
Published: 2023

16. Multilayer spintronic neural networks with radio-frequency connections

Author: Ross, Andrew, Leroux, Nathan, de Riz, Arnaud, Marković, Danijela, Sanz-Hernández, Dédalo, Trastoy, Juan, Bortolotti, Paolo, Querlioz, Damien, Martins, Leandro, Benetti, Luana, Claro, Marcel S., Anacleto, Pedro, Schulman, Alejandro, Taris, Thierry, Begueret, Jean-Baptiste, Saïghi, Sylvain, Jenkins, Alex S., Ferreira, Ricardo, Vincent, Adrien F., Mizrahi, Alice, and Grollier, Julie
Subjects: Computer Science - Emerging Technologies
Abstract: Spintronic nano-synapses and nano-neurons perform complex cognitive computations with high accuracy thanks to their rich, reproducible and controllable magnetization dynamics. These dynamical nanodevices could transform artificial intelligence hardware, provided that they implement state-of-the-art deep neural networks. However, there is today no scalable way to connect them in multilayers. Here we show that the flagship nano-components of spintronics, magnetic tunnel junctions, can be connected into multilayer neural networks where they implement both synapses and neurons thanks to their magnetization dynamics, and communicate by processing, transmitting and receiving radio frequency (RF) signals. We build a hardware spintronic neural network composed of nine magnetic tunnel junctions connected in two layers, and show that it natively classifies nonlinearly-separable RF inputs with an accuracy of 97.7%. Using physical simulations, we demonstrate that a large network of nanoscale junctions can achieve state-of-the-art identification of drones from their RF transmissions, without digitization, and consuming only a few milliwatts, which is a gain of more than four orders of magnitude in power consumption compared to currently used techniques. This study lays the foundation for deep, dynamical, spintronic neural networks.
Published: 2022

17. Classification of multi-frequency RF signals by extreme learning, using magnetic tunnel junctions as neurons and synapses

Author: Leroux, Nathan, Marković, Danijela, Sanz-Hernández, Dédalo, Trastoy, Juan, Bortolotti, Paolo, Schulman, Alejandro, Benetti, Luana, Jenkins, Alex, Ferreira, Ricardo, Grollier, Julie, and Mizrahi, Alice
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Computer Science - Artificial Intelligence, Computer Science - Emerging Technologies
Abstract: Extracting information from radiofrequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications from radars to health. These RF inputs are composed of multiple frequencies. Here we show that magnetic tunnel junctions can process analogue RF inputs with multiple frequencies in parallel and perform synaptic operations. Using a backpropagation-free method called extreme learning, we classify noisy images encoded by RF signals, using experimental data from magnetic tunnel junctions functioning as both synapses and neurons. We achieve the same accuracy as an equivalent software neural network. These results are a key step for embedded radiofrequency artificial intelligence., Comment: 9 pages, 5 figures
Published: 2022

18. Scaling Laws for Reward Model Overoptimization

Author: Gao, Leo, Schulman, John, and Hilton, Jacob
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In reinforcement learning from human feedback, it is common to optimize against a reward model trained to predict human preferences. Because the reward model is an imperfect proxy, optimizing its value too much can hinder ground truth performance, in accordance with Goodhart's law. This effect has been frequently observed, but not carefully measured due to the expense of collecting human preference data. In this work, we use a synthetic setup in which a fixed "gold-standard" reward model plays the role of humans, providing labels used to train a proxy reward model. We study how the gold reward model score changes as we optimize against the proxy reward model using either reinforcement learning or best-of-$n$ sampling. We find that this relationship follows a different functional form depending on the method of optimization, and that in both cases its coefficients scale smoothly with the number of reward model parameters. We also study the effect on this relationship of the size of the reward model dataset, the number of reward model and policy parameters, and the coefficient of the KL penalty added to the reward in the reinforcement learning setup. We explore the implications of these empirical results for theoretical considerations in AI alignment.
Published: 2022

19. Efficient Training of Language Models to Fill in the Middle

Author: Bavarian, Mohammad, Jun, Heewoo, Tezak, Nikolas, Schulman, John, McLeavey, Christine, Tworek, Jerry, and Chen, Mark
Subjects: Computer Science - Computation and Language
Abstract: We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end. While this data augmentation has garnered much interest in recent years, we provide extensive evidence that training models with a large fraction of data transformed in this way does not harm the original left-to-right generative capability, as measured by perplexity and sampling evaluations across a wide range of scales. Given the usefulness, simplicity, and efficiency of training models to fill-in-the-middle (FIM), we suggest that future autoregressive language models be trained with FIM by default. To this end, we run a series of ablations on key hyperparameters, such as the data transformation frequency, the structure of the transformation, and the method of selecting the infill span. We use these ablations to prescribe strong default settings and best practices to train FIM models. We have released our best infilling model trained with best practices in our API, and release our infilling benchmarks to aid future research.
Published: 2022
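
Illustrative note on record 19: the abstract describes a transformation that moves a span of text from the middle of a document to its end. A minimal sketch of that idea follows. The sentinel strings `<PRE>`, `<SUF>`, `<MID>`, the uniform choice of cut points, and the function names are assumptions made for illustration; the abstract does not specify any of them.

```python
import random

# Hypothetical sentinel strings; the abstract does not name any markers.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def fim_transform(doc: str, rng: random.Random) -> str:
    """Move a randomly chosen middle span of `doc` to the end."""
    # Assumed span-selection rule: two cut points drawn uniformly at random.
    i, j = sorted(rng.randrange(len(doc) + 1) for _ in range(2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    # Order after the transformation: prefix, suffix, then the moved middle.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def fim_inverse(text: str) -> str:
    """Recover the original document (assumes it contains no sentinels)."""
    body = text[len(PRE):]
    prefix, rest = body.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix
```

Because the moved span is only relocated, the original left-to-right text can be recovered by reversing the move, which is what `fim_inverse` does in this sketch.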

20. A simple method to reprogram the binding specificity of DNA-coated colloids that crystallize

Author: Moerman, Pepijn G., Fang, Huang, Videbæk, Thomas E., Rogers, W. Benjamin, and Schulman, Rebecca
Subjects: Condensed Matter - Soft Condensed Matter
Abstract: DNA-coated colloids can crystallize into a multitude of lattices, ranging from face-centered cubic to diamond and thereby contribute to our understanding of crystallization and open avenues to producing structures with useful photonic properties. Despite the broad potential design space of DNA-coated colloids, the design cycle for synthesizing DNA-coated particles is slow: preparing a particle with a new type of DNA sequence takes more than one day and requires custom-made and chemically modified DNA that typically takes the supplier over a month to synthesize. Here, we introduce a method to generate particles with custom sequences from a single feed stock in under an hour at ambient conditions. Our method appends new DNA domains onto the DNA grafted to colloidal particles based on a template that takes the supplier less than a week to produce. The resultant particles crystallize as readily and at the same temperature as those produced via direct chemical synthesis. Moreover, we show that particles coated with a single sequence can be converted into a variety of building blocks with differing specificities by appending different DNA sequences to them. This approach to DNA-coated particle preparation will make it practical to identify optimal and complex particle sequence designs and to expand the use of DNA-coated colloids to a much broader range of investigators and commercial entities., Comment: manuscript 9 pages 6 figures. SI, 12 pages 6 figures
Published: 2022

21. Trackers Bounce Back: Measuring Evasion of Partitioned Storage in the Wild

Author: Randall, Audrey, Snyder, Peter, Ukani, Alisha, Snoeren, Alex, Voelker, Geoff, Savage, Stefan, and Schulman, Aaron
Subjects: Computer Science - Cryptography and Security
Abstract: This work presents a systematic study of navigational tracking, the latest development in the cat-and-mouse game between browsers and online trackers. Navigational tracking allows trackers to aggregate users' activities and behaviors across sites by modifying their navigation requests. This technique is particularly important because it circumvents the increasing efforts by browsers to partition or block third-party storage, which was previously necessary for most cross-website tracking. While previous work has studied specific navigational tracking techniques (i.e. "bounce tracking"), our work is the first effort to systematically study and measure the entire category of navigational tracking techniques. We describe and measure the frequency of two different navigational tracking techniques on the Web, and find that navigational tracking is present on slightly more than ten percent of all navigations that we made. Our contributions include identifying 214 domains belonging to at least 104 organizations tracking users across sites through link decoration techniques using direct or indirect navigation flows. We identify a further 23 domains belonging to at least 16 organizations tracking users through bounce tracking (i.e. bouncing users through unrelated third parties to generate user profiles). We also improve on prior techniques for differentiating user identifiers from non-sensitive information, which is necessary to detect one class of navigational tracking. We discuss how our findings can be used to protect users from navigational tracking, and commit to releasing both our complete dataset and our measurement pipeline.
Published: 2022

22. Training language models to follow instructions with human feedback

Author: Ouyang, Long, Wu, Jeff, Jiang, Xu, Almeida, Diogo, Wainwright, Carroll L., Mishkin, Pamela, Zhang, Chong, Agarwal, Sandhini, Slama, Katarina, Ray, Alex, Schulman, John, Hilton, Jacob, Kelton, Fraser, Miller, Luke, Simens, Maddie, Askell, Amanda, Welinder, Peter, Christiano, Paul, Leike, Jan, and Lowe, Ryan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.
Published: 2022

23. Rosebud: Making FPGA-Accelerated Middlebox Development More Pleasant

Author: Khazraee, Moein, Forencich, Alex, Papen, George, Snoeren, Alex C., and Schulman, Aaron
Subjects: Computer Science - Hardware Architecture
Abstract: We introduce an approach to designing FPGA-accelerated middleboxes that simplifies development, debugging, and performance tuning by decoupling the tasks of hardware-accelerator implementation and software-application programming. Rosebud is a framework that links hardware accelerators to a high-performance packet processing pipeline through a standardized hardware/software interface. This separation of concerns allows hardware developers to focus on optimizing custom accelerators while freeing software programmers to reuse, configure, and debug accelerators in a fashion akin to software libraries. We show the benefits of the Rosebud framework by building a firewall based on a large blacklist and porting the Pigasus IDS pattern-matching accelerator in less than a month. Our experiments demonstrate that Rosebud delivers high performance, serving ~200 Gbps of traffic while adding only 0.7-7 microseconds of latency., Comment: 20 pages. Final version, to appear in ASPLOS23
Published: 2022

24. Causal Inference Despite Limited Global Confounding via Mixture Models

Author: Gordon, Spencer L., Mazaheri, Bijan, Rabani, Yuval, and Schulman, Leonard J.
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning, 68W40, 62F99, 62-09, F.2, G.3
Abstract: A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite $k$-mixture of such models is graphically represented by a larger graph which has an additional "hidden" (or "latent") random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to causal inference, where $U$ models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. By solving the mixture problem and recovering the joint probability distribution with $U$, traditionally unidentifiable causal relationships become identifiable. Using a reduction to the more well-studied "product" case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs., Comment: Published in CLeaR 2023
Published: 2021

25. WebGPT: Browser-assisted question-answering with human feedback

Author: Nakano, Reiichiro, Hilton, Jacob, Balaji, Suchir, Wu, Jeff, Ouyang, Long, Kim, Christina, Hesse, Christopher, Jain, Shantanu, Kosaraju, Vineet, Saunders, William, Jiang, Xu, Cobbe, Karl, Eloundou, Tyna, Krueger, Gretchen, Button, Kevin, Knight, Matthew, Chess, Benjamin, and Schulman, John
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit., Comment: 32 pages
Published: 2021

26. Training Verifiers to Solve Math Word Problems

Author: Cobbe, Karl, Kosaraju, Vineet, Bavarian, Mohammad, Chen, Mark, Jun, Heewoo, Kaiser, Lukasz, Plappert, Matthias, Tworek, Jerry, Hilton, Jacob, Nakano, Reiichiro, Hesse, Christopher, and Schulman, John
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: State-of-the-art language models can match human performance on many tasks, but they still struggle to robustly perform multi-step mathematical reasoning. To diagnose the failures of current models and support research, we introduce GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems. We find that even the largest transformer models fail to achieve high test performance, despite the conceptual simplicity of this problem distribution. To increase performance, we propose training verifiers to judge the correctness of model completions. At test time, we generate many candidate solutions and select the one ranked highest by the verifier. We demonstrate that verification significantly improves performance on GSM8K, and we provide strong empirical evidence that verification scales more effectively with increased data than a finetuning baseline.
Published: 2021

27. Batch size-invariance for policy optimization

Author: Hilton, Jacob, Cobbe, Karl, and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We say an algorithm is batch size-invariant if changes to the batch size can largely be compensated for by changes to other hyperparameters. Stochastic gradient descent is well-known to have this property at small batch sizes, via the learning rate. However, some policy optimization algorithms (such as PPO) do not have this property, because of how they control the size of policy updates. In this work we show how to make these algorithms batch size-invariant. Our key insight is to decouple the proximal policy (used for controlling policy updates) from the behavior policy (used for off-policy corrections). Our experiments help explain why these algorithms work, and additionally show how they can make more efficient use of stale data., Comment: 32 pages. Code is available at https://github.com/openai/ppo-ewma
Published: 2021

28. Unsolved Problems in ML Safety

Author: Hendrycks, Dan, Carlini, Nicholas, Schulman, John, and Steinhardt, Jacob
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Machine learning (ML) systems are rapidly increasing in size, are acquiring new capabilities, and are increasingly deployed in high-stakes settings. As with other powerful technologies, safety for ML should be a leading research priority. In response to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address. We present four problems ready for research, namely withstanding hazards ("Robustness"), identifying hazards ("Monitoring"), reducing inherent model hazards ("Alignment"), and reducing systemic hazards ("Systemic Safety"). Throughout, we clarify each problem's motivation and provide concrete research directions., Comment: Position Paper
Published: 2021

29. ZLeaks: Passive Inference Attacks on Zigbee based Smart Homes

Author: Shafqat, Narmeen, Dubois, Daniel J., Choffnes, David, Schulman, Aaron, Bharadia, Dinesh, and Ranganathan, Aanjhan
Subjects: Computer Science - Cryptography and Security
Abstract: Zigbee is an energy-efficient wireless IoT protocol that is increasingly being deployed in smart home settings. In this work, we analyze the privacy guarantees of the Zigbee protocol. Specifically, we present ZLeaks, a tool that passively identifies in-home devices or events from the encrypted Zigbee traffic by 1) inferring a single application layer (APL) command in the event's traffic, and 2) exploiting the device's periodic reporting pattern and interval. This enables an attacker to infer a user's habits or determine if the smart home is vulnerable to unauthorized entry. We evaluated ZLeaks' efficacy on 19 unique Zigbee devices across several categories and 5 popular smart hubs in three different scenarios: controlled RF shield, living smart-home IoT lab, and third-party Zigbee captures. We were able to i) identify unknown events and devices (without a-priori device signatures) using a command inference approach with 83.6% accuracy, ii) automatically extract devices' reporting signatures, iii) determine known devices using the reporting signatures with 99.8% accuracy, and iv) identify APL commands in a public capture with 91.2% accuracy. In short, we highlight the trade-off between designing a low-power, low-cost wireless network and achieving privacy guarantees. We have also released the ZLeaks tool for the benefit of the research community., Comment: An updated version of the authors' previous submission (arXiv:2107.10830). It has been accepted at the 20th International Conference on Applied Cryptography and Network Security, ACNS 2022
Published: 2021

30. A Refined Approximation for Euclidean k-Means

Author: Grandoni, Fabrizio, Ostrovsky, Rafail, Rabani, Yuval, Schulman, Leonard J., and Venkat, Rakesh
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Computational Geometry
Abstract: In the Euclidean $k$-Means problem we are given a collection of $n$ points $D$ in a Euclidean space and a positive integer $k$. Our goal is to identify a collection of $k$ points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in $D$ and the closest center. This problem is known to be APX-hard and the current best approximation ratio is a primal-dual $6.357$ approximation based on a standard LP for the problem [Ahmadian et al. FOCS'17, SICOMP'20]. In this note we show how a minor modification of Ahmadian et al.'s analysis leads to a slightly improved $6.12903$ approximation. As a related result, we also show that the mentioned LP has integrality gap at least $\frac{16+\sqrt{5}}{15}>1.2157$., Comment: Corrected a confusing typo in a formula on page 5 and added one remark
Published: 2021
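
Illustrative note on record 30: the objective stated in the abstract, the sum over points in $D$ of the squared Euclidean distance to the closest center, can be written out directly. The sketch below only evaluates that objective for a given set of centers; it is not the approximation algorithm from the paper, and the function name `kmeans_cost` is chosen here for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum of squared Euclidean distances from each point to its closest center."""
    total = 0.0
    for x in points:
        # Squared distance from x to every candidate center; keep the smallest.
        total += min(
            sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers
        )
    return total
```

For example, with points (0, 0), (2, 0), (10, 0) and centers (1, 0), (10, 0), the first two points each contribute 1 and the third contributes 0, so the cost is 2.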

31. Shapes as Product Differentiation: Neural Network Embedding in the Analysis of Markets for Fonts

Author: Han, Sukjin, Schulman, Eric H., Grauman, Kristen, and Ramakrishnan, Santhosh
Subjects: Economics - Econometrics, Computer Science - Computer Vision and Pattern Recognition
Abstract: Many differentiated products have key attributes that are unstructured and thus high-dimensional (e.g., design, text). Instead of treating unstructured attributes as unobservables in economic models, quantifying them can be important to answer interesting economic questions. To propose an analytical framework for these types of products, this paper considers one of the simplest design products, fonts, and investigates merger and product differentiation using an original dataset from the world's largest online marketplace for fonts. We quantify font shapes by constructing embeddings from a deep convolutional neural network. Each embedding maps a font's shape onto a low-dimensional vector. In the resulting product space, designers are assumed to engage in Hotelling-type spatial competition. From the image embeddings, we construct two alternative measures that capture the degree of design differentiation. We then study the causal effects of a merger on the merging firm's creative decisions using the constructed measures in a synthetic control method. We find that the merger causes the merging firm to increase the visual variety of font design. Notably, such effects are not captured when using traditional measures for product offerings (e.g., specifications and the number of products) constructed from structured data.
Published: 2021

32. Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark

Author: Mohanty, Sharada, Poonganam, Jyotish, Gaidon, Adrien, Kolobov, Andrey, Wulfe, Blake, Chakraborty, Dipam, Šemetulskis, Gražvydas, Schapke, João, Kubilius, Jonas, Pašukonis, Jurgis, Klimas, Linas, Hausknecht, Matthew, MacAlpine, Patrick, Tran, Quang Nhat, Tumiel, Thomas, Tang, Xiaocheng, Chen, Xinwei, Hesse, Christopher, Hilton, Jacob, Guss, William Hebgen, Genc, Sahika, Schulman, John, and Cobbe, Karl
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The NeurIPS 2020 Procgen Competition was designed as a centralized benchmark with clearly defined tasks for measuring Sample Efficiency and Generalization in Reinforcement Learning. Generalization remains one of the most fundamental challenges in deep reinforcement learning, and yet we do not have enough benchmarks to measure the progress of the community on Generalization in Reinforcement Learning. We present the design of a centralized benchmark for Reinforcement Learning which can help measure Sample Efficiency and Generalization in Reinforcement Learning by doing end-to-end evaluation of the training and rollout phases of thousands of user-submitted code bases in a scalable way. We designed the benchmark on top of the already existing Procgen Benchmark by defining clear tasks and standardizing the end-to-end evaluation setups. The design aims to maximize the flexibility available for researchers who wish to design future iterations of such benchmarks, and yet imposes necessary practical constraints to allow for a system like this to scale. This paper presents the competition setup and the details and analysis of the top solutions identified through this setup in the context of the 2020 iteration of the competition at NeurIPS.
Published: 2021

33. TerraWatt: Sustaining Sustainable Computing of Containers in Containers

Author: Switzer, Jennifer, McGuinness, Rob, Pannuto, Pat, Porter, George, Schulman, Aaron, and Raghavan, Barath
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Networking and Internet Architecture
Abstract: Each day the world inches closer to a climate catastrophe and a sustainability revolution. To avoid the former and achieve the latter we must transform our use of energy. Surprisingly, today's growing problem is that there is too much wind and solar power generation at the wrong times and in the wrong places. We argue for the construction of TerraWatt: a geographically-distributed, large-scale, zero-carbon compute infrastructure using renewable energy and older hardware. Delivering zero-carbon compute for general cloud workloads is challenging due to spatiotemporal power variability. We describe the systems challenges in using intermittent renewable power at scale to fuel such an older, decentralized compute infrastructure.
Published: 2021

34. Hadamard Extensions and the Identification of Mixtures of Product Distributions

Author: Gordon, Spencer L. and Schulman, Leonard J.
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning, 68W40, 62F99, F.2, G.3
Abstract: The Hadamard Extension of a matrix is the matrix consisting of all Hadamard products of subsets of its rows. This construction arises in the context of identifying a mixture of product distributions on binary random variables: full column rank of such extensions is a necessary ingredient of identification algorithms. We provide several results concerning when a Hadamard Extension has full column rank., Comment: V2: re-titled and slight edits
Published: 2021

35. The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors

Author: Guss, William H., Castro, Mario Ynocente, Devlin, Sam, Houghton, Brandon, Kuno, Noboru Sean, Loomis, Crissman, Milani, Stephanie, Mohanty, Sharada, Nakata, Keisuke, Salakhutdinov, Ruslan, Schulman, John, Shiroshita, Shinya, Topin, Nicholay, Ummadisingu, Avinash, and Vinyals, Oriol
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this direction, we propose this second iteration of the MineRL Competition. The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical, and sparse environments. To that end, participants compete under a limited environment sample-complexity budget to develop systems which solve the MineRL ObtainDiamond task in Minecraft, a sequential decision making environment requiring long-term planning, hierarchical control, and efficient exploration methods. The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment with different game textures and shaders. At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform where they are trained from scratch on a hold-out dataset-environment pair for a total of 4 days on a pre-specified hardware platform. In this follow-up iteration to the NeurIPS 2019 MineRL Competition, we implement new features to expand the scale and reach of the competition. In response to the feedback of the previous participants, we introduce a second minor track focusing on solutions without access to environment interactions of any kind except during test-time. Further, we aim to prompt domain agnostic submissions by implementing several novel competition mechanics including action-space randomization and desemantization of observations and actions., Comment: 37 pages, initial submission, accepted at NeurIPS. arXiv admin note: substantial text overlap with arXiv:1904.10079
Published: 2021

36. Growth and site-specific organization of micron-scale biomolecular devices on living mammalian cells

Author: Jia, Sisi, Phua, Siew Cheng, Nihongaki, Yuta, Li, Yizeng, Pacella, Michael, Li, Yi, Mohammed, Abdul M., Sun, Sean, Inoue, Takanari, and Schulman, Rebecca
Subjects: Quantitative Biology - Biomolecules
Abstract: Mesoscale molecular assemblies on the cell surface, such as cilia and filopodia, integrate information, control transport and amplify signals. Synthetic devices mimicking these structures could sensitively monitor these cellular functions and direct new ones. The challenges in creating such devices, however, are that they must be integrated with cells in a precise, kinetically controlled process, and that a device's structure and its precisely structured cell interface must then be maintained during active cellular function. Here we report the ability to integrate synthetic micro-scale filaments, DNA nanotubes, into a cell's architecture by anchoring them by their ends to specific receptors on the surfaces of mammalian cells. These filaments can act as shear stress meters: how anchored nanotubes bend at the cell surface quantitatively indicates the magnitude of shear stresses between 0-2 dyn per cm$^2$, a regime important for cell signaling. Nanotubes can also grow while anchored to cells, thus acting as dynamic components of cells. This approach to cell surface engineering, in which synthetic biomolecular assemblies are organized within existing cellular architecture, could make it possible to build new types of sensors, machines and scaffolds that can interface with, control and measure properties of cells.
Comment: 20 pages, 5 figures
Published: 2021
Full Text: View/download PDF

37. Source Identification for Mixtures of Product Distributions

Author: Gordon, Spencer L., Mazaheri, Bijan, Rabani, Yuval, and Schulman, Leonard J.
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning, 68W40, 62F99, F.2, G.3
Abstract: We give an algorithm for source identification of a mixture of $k$ product distributions on $n$ bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)} n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O'Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).
Published: 2020

38. Scaling Laws for Autoregressive Generative Modeling

Author: Henighan, Tom, Kaplan, Jared, Katz, Mor, Chen, Mark, Hesse, Christopher, Jackson, Jacob, Jun, Heewoo, Brown, Tom B., Dhariwal, Prafulla, Gray, Scott, Hallacy, Chris, Mann, Benjamin, Radford, Alec, Ramesh, Aditya, Ryder, Nick, Ziegler, Daniel M., Schulman, John, Amodei, Dario, and McCandlish, Sam
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image$\leftrightarrow$text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-law plus constant scaling law. The optimal model size also depends on the compute budget through a power-law, with exponents that are nearly universal across all data domains. The cross-entropy loss has an information theoretic interpretation as $S(\mathrm{True}) + D_{\mathrm{KL}}(\mathrm{True}\,\|\,\mathrm{Model})$, and the empirical scaling laws suggest a prediction for both the true data distribution's entropy and the KL divergence between the true and model distributions. With this interpretation, billion-parameter Transformers are nearly perfect models of the YFCC100M image distribution downsampled to an $8\times 8$ resolution, and we can forecast the model size needed to achieve any given reducible loss (i.e., $D_{\mathrm{KL}}$) in nats/image for other resolutions. We find a number of additional scaling laws in specific domains: (a) we identify a scaling relation for the mutual information between captions and images in multimodal models, and show how to answer the question "Is a picture worth a thousand words?"; (b) in the case of mathematical problem solving, we identify scaling laws for model performance when extrapolating beyond the training distribution; (c) we finetune generative image models for ImageNet classification and find smooth scaling of the classification loss and error rate, even as the generative loss levels off. Taken together, these results strengthen the case that scaling laws have important implications for neural network performance, including on downstream tasks.
Comment: 20+17 pages, 33 figures; added appendix with additional language results
Published: 2020

39. Phasic Policy Gradient

Author: Cobbe, Karl, Hilton, Jacob, Klimov, Oleg, and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases. In prior methods, one must choose between using a shared network or separate networks to represent the policy and value function. Using separate networks avoids interference between objectives, while using a shared network allows useful features to be shared. PPG is able to achieve the best of both worlds by splitting optimization into two phases, one that advances training and one that distills features. PPG also enables the value function to be more aggressively optimized with a higher level of sample reuse. Compared to PPO, we find that PPG significantly improves sample efficiency on the challenging Procgen Benchmark.
Published: 2020

40. Droplet Migration on Conical Fibers

Author: Fournier, Clementine, Lee, Carmen, Schulman, Rafael, Raphael, Elie, and Dalnoki-Veress, Kari
Subjects: Condensed Matter - Soft Condensed Matter, Physics - Fluid Dynamics
Abstract: The spontaneous migration of droplets on conical fibers is studied experimentally by depositing silicone oil droplets onto conical glass fibers. Their motion is recorded using optical microscopy and analysed to extract the relevant geometrical parameters of the system. The speed of the droplet can be predicted as a function of geometry and the fluid properties using a simple theoretical model, which balances viscous dissipation against the surface tension driving force. The experimental data are found to be in good agreement with the model.
Published: 2020

41. The Sparse Hausdorff Moment Problem, with Application to Topic Models

Author: Gordon, Spencer, Mazaheri, Bijan, Schulman, Leonard J., and Rabani, Yuval
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning
Abstract: We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support $k<\infty$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1,X_2,\dots,X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1,2,\dots,k\}$. Our focus is on accomplishing this with $m=2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. We give an algorithm for identifying a $k$-mixture using samples of $m=2k$ iid binary random variables using a sample of size $\left(1/w_{\min}\right)^2 \cdot\left(1/\zeta\right)^{O(k)}$ and post-sampling runtime of only $O(k^{2+o(1)})$ arithmetic operations. Here $w_{\min}$ is the minimum probability of an outcome of $U$, and $\zeta$ is the minimum separation between the distinct success probabilities of the $X_i$s. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $w_{\min}\cdot\zeta^{O(k)}$. It is known that the sample complexity of any solution to the identification problem must be at least exponential in $k$. Previous results demonstrated either worse sample complexity and worse $O(k^c)$ runtime for some $c$ substantially larger than $2$, or similar sample complexity and much worse $k^{O(k^2)}$ runtime.
Published: 2020

42. Leveraging Procedural Generation to Benchmark Reinforcement Learning

Author: Cobbe, Karl, Hesse, Christopher, Hilton, Jacob, and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased access to high quality training environments, and we provide detailed experimental protocols for using this benchmark. We empirically demonstrate that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation. We then use this benchmark to investigate the effects of scaling model size, finding that larger models significantly improve both sample efficiency and generalization.
Published: 2019

43. Using Social Stories™ to Implement Cognitive Behavioral Therapy via Zoom in Parents and Their Children with Autism and Anxiety

Author: Danielle M. Schulman
Abstract: Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by challenges in socialization and communication, and restricted, repetitive behaviors or interests across development (American Psychiatric Association [APA], 2013). Youth with ASD have an increased risk for developing anxiety disorders (Mattila et al., 2010). Anxiety in youth with ASD is associated with further socialization and communication impairment (Duvekot et al., 2010), behavioral issues (e.g., conduct problems, hyperactivity, inattention; Farrugia & Hudson, 2006), and emotional issues (e.g., self-injurious behavior, depression symptoms; Kerns et al., 2015). Effective anxiety interventions for this population are therefore needed due to the widespread impairment associated with anxiety (Chang et al., 2012; Duvekot et al., 2018; Kerns et al., 2015). Research shows that cognitive behavioral therapy (CBT) is an effective intervention for children with ASD and anxiety (Ung et al., 2015); however, barriers to implementing CBT exist, such as the distance and limited transportation to services (Stewart et al., 2017). These treatment barriers can be addressed via teletherapy. Moreover, Social Stories™ can be easily implemented via the computer (Sansoti & Powell-Smith, 2008). They have been included in CBT programs for children with ASD and anxiety but have not been used as the main intervention (Ooi et al., 2008; Schleismann & Gillis, 2011; Sofronoff et al., 2005; Sung et al., 2011). The current study examined the efficacy of an internet-based Social Story intervention for anxiety that includes aspects of CBT found effective in previous research (Hepburn et al., 2016; Schleismann & Gillis, 2011; Ung et al., 2015; Wise et al., 2019; Wood et al., 2009). For the present study, participants were six children between 8 and 12 years old diagnosed with ASD and an anxiety disorder. They were recruited from Hofstra University's Diagnostic and Research Institute for Autism Spectrum Disorders and local support groups for parents of children with ASD. Parents of participants also participated by observing sessions and implementing the intervention's homework assignments. It was hypothesized that from baseline to treatment, the intervention would contribute to decreased child anxiety levels, decreased child behavior avoidance, decreased parental stress, increased child self-efficacy and increased parental self-efficacy, based on parent- and self-reports. Additionally, it was hypothesized that these improvements would be maintained across a 2-week maintenance phase. A small-"n," multiple baseline design was used to investigate data across the intervention's 3-week baseline phase, 6-week treatment phase, and 2-week maintenance phase. Data were analyzed using visual analyses, percentage exceeding the median (PEM; Ma, 2006), and standard mean difference (SMD; Busk & Serlin, 1992). Dependent-samples t-tests were used to determine if treatment effects were maintained across the 2-week maintenance phase. Further, a time series analysis explored the relation among self-efficacy, anxiety, and parental stress. Results showed that each participant improved in at least one of the aforementioned variables, with most participants improving in their child-reported anxiety levels. Implications for future research, as well as the strengths and limitations of the study, are discussed. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Published: 2022

44. Teacher Mobility in the School District of Philadelphia, 2009-10 through 2015-16

Author: Philadelphia Education Research Consortium (PERC), Steinberg, Matthew P., Neild, Ruth Curran, Canuette, W. Kyle, Park, Sharin, Schulman, Emily, and Wright, Melissa
Abstract: Teachers are the most important influence in schools on student achievement, which makes attracting and retaining excellent teachers a high priority for all school districts. But public schools in large cities like Philadelphia are especially challenged to provide every student with a highly-effective teacher. Teacher mobility--that is, transferring from one school to another or leaving the profession entirely--is disproportionately concentrated in urban school districts and has negative consequences for student performance. For these reasons, it is critical for policymakers and school leaders in Philadelphia to have a clear picture of the extent and nature of teacher mobility. This report provides evidence on teacher mobility in the School District of Philadelphia (SDP) during the 2009-10 through 2015-16 school years. Using publicly available data from the Pennsylvania Department of Education, the authors examine the extent of teacher mobility as well as the characteristics of mobile teachers and the schools that they exit and enter. For this report, teacher mobility is defined as occurring when an SDP teacher does not return to the same school in the following year. Therefore, a mobile teacher is one who moved to another SDP school, moved to a Philadelphia charter school or a Pennsylvania public school outside of Philadelphia, or left public education in Pennsylvania.
Published: 2018

45. Edge Expansion and Spectral Gap of Nonnegative Matrices

Author: Mehta, Jenish C. and Schulman, Leonard J.
Subjects: Mathematics - Combinatorics, Computer Science - Discrete Mathematics
Abstract: The classic graphical Cheeger inequalities state that if $M$ is an $n\times n$ symmetric doubly stochastic matrix, then \[ \frac{1-\lambda_{2}(M)}{2}\leq\phi(M)\leq\sqrt{2\cdot(1-\lambda_{2}(M))} \] where $\phi(M)=\min_{S\subseteq[n],|S|\leq n/2}\left(\frac{1}{|S|}\sum_{i\in S,j\not\in S}M_{i,j}\right)$ is the edge expansion of $M$, and $\lambda_{2}(M)$ is the second largest eigenvalue of $M$. We study the relationship between $\phi(A)$ and the spectral gap $1-\text{Re}\lambda_{2}(A)$ for any doubly stochastic matrix $A$ (not necessarily symmetric), where $\lambda_{2}(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $\phi(A)$ is unaffected, i.e., $\phi(A)\leq\sqrt{2\cdot(1-\text{Re}\lambda_{2}(A))}$. With regards to the lower bound on $\phi(A)$, there are known constructions with \[ \phi(A)\in\Theta\left(\frac{1-\text{Re}\lambda_{2}(A)}{\log n}\right), \] indicating that at least a mild dependence on $n$ is necessary to lower bound $\phi(A)$. In our first result, we provide an exponentially better construction of $n\times n$ doubly stochastic matrices $A_{n}$, for which \[\phi(A_{n})\leq\frac{1-\text{Re}\lambda_{2}(A_{n})}{\sqrt{n}}.\] In fact, all nontrivial eigenvalues of our matrices are $0$, even though the matrices are highly nonexpanding. We further show that this bound is in the correct range (up to the exponent of $n$), by showing that for any doubly stochastic matrix $A$, \[\phi(A)\geq\frac{1-\text{Re}\lambda_{2}(A)}{35\cdot n}.\] Our second result extends these bounds to general nonnegative matrices $R$, obtaining a two-sided quantitative refinement of the Perron-Frobenius theorem in which the edge expansion $\phi(R)$ (appropriately defined), a quantitative measure of the irreducibility of $R$, controls the gap between the Perron-Frobenius eigenvalue and the next-largest real part of any eigenvalue.
Published: 2019
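
As an illustration of the quantities in the abstract above, the following is a minimal sketch (not taken from the paper) that brute-forces the edge expansion $\phi(M)$ for a small symmetric doubly stochastic matrix and checks it against the classic Cheeger bounds. The choice of example (a lazy random walk on an 8-cycle), the helper names `lazy_cycle` and `edge_expansion`, and the use of the closed-form circulant eigenvalue $\lambda_2 = \tfrac12 + \tfrac12\cos(2\pi/n)$ are all assumptions made for illustration.

```python
import math
from itertools import combinations


def lazy_cycle(n):
    """Lazy random walk on an n-cycle: stay with prob 1/2, step left/right with prob 1/4 each.

    The result is a symmetric doubly stochastic matrix (hypothetical example, n >= 3).
    """
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = 0.5
        M[i][(i + 1) % n] += 0.25
        M[i][(i - 1) % n] += 0.25
    return M


def edge_expansion(M):
    """Brute-force phi(M) = min over nonempty S with |S| <= n/2 of (1/|S|) * sum_{i in S, j not in S} M[i][j].

    Exponential in n, so only suitable for tiny matrices.
    """
    n = len(M)
    best = math.inf
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            inside = set(S)
            cut = sum(M[i][j] for i in S for j in range(n) if j not in inside)
            best = min(best, cut / size)
    return best


n = 8
M = lazy_cycle(n)
phi = edge_expansion(M)

# For this circulant matrix the eigenvalues are 1/2 + 1/2*cos(2*pi*k/n), k = 0..n-1,
# so the second largest eigenvalue is attained at k = 1.
lam2 = 0.5 + 0.5 * math.cos(2 * math.pi / n)
gap = 1 - lam2

print(f"phi = {phi:.4f}, (1 - lam2)/2 = {gap / 2:.4f}, sqrt(2(1 - lam2)) = {math.sqrt(2 * gap):.4f}")
```

For this symmetric example both Cheeger bounds hold; the paper's results concern how the lower bound degrades once symmetry is dropped, which this sketch does not attempt to reproduce.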

46. Policy Gradient Search: Online Planning and Expert Iteration without Search Trees

Author: Anthony, Thomas, Nishihara, Robert, Moritz, Philipp, Salimans, Tim, and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Monte Carlo Tree Search (MCTS) algorithms perform simulation-based search to improve policies online. During search, the simulation policy is adapted to explore the most promising lines of play. MCTS has been used by state-of-the-art programs for many problems; however, a disadvantage of MCTS is that it estimates the values of states with Monte Carlo averages, stored in a search tree, which does not scale to games with very high branching factors. We propose an alternative simulation-based search method, Policy Gradient Search (PGS), which adapts a neural network simulation policy online via policy gradient updates, avoiding the need for a search tree. In Hex, PGS achieves comparable performance to MCTS, and an agent trained using Expert Iteration with PGS was able to defeat MoHex 2.0, the strongest open-source Hex agent, in 9x9 Hex.
Published: 2019

47. Semi-Supervised Learning by Label Gradient Alignment

Author: Jackson, Jacob and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We present label gradient alignment, a novel algorithm for semi-supervised learning which imputes labels for the unlabeled data and trains on the imputed labels. We define a semantically meaningful distance metric on the input space by mapping a point (x, y) to the gradient of the model at (x, y). We then formulate an optimization problem whose objective is to minimize the distance between the labeled and the unlabeled data in this space, and we solve it by gradient descent on the imputed labels. We evaluate label gradient alignment using the standardized architecture introduced by Oliver et al. (2018) and demonstrate state-of-the-art accuracy in semi-supervised CIFAR-10 classification.
Published: 2019

48. Droplets capped with an elastic film can be round, elliptical, or nearly square

Author: Schulman, Rafael D. and Dalnoki-Veress, Kari
Subjects: Condensed Matter - Soft Condensed Matter, Condensed Matter - Mesoscale and Nanoscale Physics, Physics - Fluid Dynamics
Abstract: We present experiments which show that the partial wetting of droplets capped by taut elastic films is highly tunable. Adjusting the tension allows the contact angle and droplet morphology to be controlled. By exploiting these elastic boundaries, droplets can be made elliptical, with an adjustable aspect ratio, and can even be transformed into a nearly square shape. This system can be used to create tunable liquid lenses, and moreover, presents a unique approach to liquid patterning.
Comment: 5 pages, 4 figures
Published: 2018
Full Text: View/download PDF

49. Quantifying Generalization in Reinforcement Learning

Author: Cobbe, Karl, Klimov, Oleg, Hesse, Chris, Kim, Taehoon, and Schulman, John
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this paper, we investigate the problem of overfitting in deep reinforcement learning. Among the most common benchmarks in RL, it is customary to use the same environments for both training and testing. This practice offers relatively little insight into an agent's ability to generalize. We address this issue by using procedurally generated environments to construct distinct training and test sets. Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. We then show that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.
Published: 2018

50. LA100 Equity Strategies

Author: Anderson, Kate, Day, Megan, Romero Lankao, Patricia, Berdahl, Sonja, Rauser, Casandra, Bowen, Thomas, Fournier, Eric, Heath, Garvin, Hinojosa, Raul, Ong, Paul, Palmintier, Bryan, Pierce, Gregory, Pincetl, Stephanie, Prasanna, Ashreeta, Ravi, Vikram, Reyna, Janet, Lee, Dong-Yeon, Rosner, Nicole, Sandoval, Noah, Sekar, Ashok, Sheinberg, Rachel, Simeone, Christina, Stenger, Katelyn, Sun, Bingrong, Valenzuela, Abel, Wilson, Alana, Zhu, Yifang, Abraham, Sherin Ann, Blanco, Lis, Bolla, Greg, Bustamante, Leticia, Coffee, Daniel, Craer, Jennifer, Das, Paritosh, Denholm, Paul, Duwadi, Kapil, Fontanini, Anthony, Gonzalez, Silvia, Gu, Yu, He, Yueshuai, Hernandez, Ariana, Horsey, Ry, Katz, Sophie, Krishnamoorthy, Gayathri, Li, Yun, Lin, Yun, Liu, Lixi, Lockshin, Jane, Ma, Jiaqi, Maguire, Jeff, Marroquin, Isaias, Mooney, Meghan, Panda, Kinshuk, Pleitez, Marcelo, Robertson, Joe, Rodriguez, Ruth, Ruddick-Schulman, Saul, Sanchez-Hall, Magali, Sedzro, Kwami Senam, Velasquez, Leslie, Walzberg, Julien, White, Philip, Yu, Qiao, and Zimny-Schmitt, Daniel
Published: 2023
Full Text: View/download PDF
