Author: "Lambert, Nathan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lambert, Nathan"' showing total 172 results

1. Self-Directed Synthetic Dialogues and Revisions Technical Report

Author: Lambert, Nathan, Schoelkopf, Hailey, Gokaslan, Aaron, Soldaini, Luca, Pyatkin, Valentina, and Castricato, Louis
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Synthetic data has become an important tool in the fine-tuning of language models to follow instructions and solve complex problems. Nevertheless, the majority of open data to date is often lacking multi-turn data and collected on closed models, limiting progress on advancing open fine-tuning methods. We introduce Self Directed Synthetic Dialogues (SDSD), an experimental dataset consisting of guided conversations of language models talking to themselves. The dataset consists of multi-turn conversations generated with DBRX, Llama 2 70B, and Mistral Large, all instructed to follow a conversation plan generated prior to the conversation. We also explore including principles from Constitutional AI and other related works to create synthetic preference data via revisions to the final conversation turn. We hope this work encourages further exploration in multi-turn data and the use of open models for expanding the impact of synthetic data.
Comment: 25 pages, 3 figures, 4 tables
Published: 2024

2. WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Author: Han, Seungju, Rao, Kavel, Ettinger, Allyson, Jiang, Liwei, Lin, Bill Yuchen, Lambert, Nathan, Choi, Yejin, and Dziri, Nouha
Subjects: Computer Science - Computation and Language
Abstract: We introduce WildGuard -- an open, light-weight moderation tool for LLM safety that achieves three goals: (1) identifying malicious intent in user prompts, (2) detecting safety risks of model responses, and (3) determining model refusal rate. Together, WildGuard serves the increasing needs for automatic safety moderation and evaluation of LLM interactions, providing a one-stop tool with enhanced accuracy and broad coverage across 13 risk categories. While existing open moderation tools such as Llama-Guard2 score reasonably well in classifying straightforward model interactions, they lag far behind a prompted GPT-4, especially in identifying adversarial jailbreaks and in evaluating models' refusals, a key measure for evaluating safety behaviors in model responses. To address these challenges, we construct WildGuardMix, a large-scale and carefully balanced multi-task safety moderation dataset with 92K labeled examples that cover vanilla (direct) prompts and adversarial jailbreaks, paired with various refusal and compliance responses. WildGuardMix is a combination of WildGuardTrain, the training data of WildGuard, and WildGuardTest, a high-quality human-annotated moderation test set with 5K labeled items covering broad risk scenarios. Through extensive evaluations on WildGuardTest and ten existing public benchmarks, we show that WildGuard establishes state-of-the-art performance in open-source safety moderation across all the three tasks compared to ten strong existing open-source moderation models (e.g., up to 26.4% improvement on refusal detection). Importantly, WildGuard matches and sometimes exceeds GPT-4 performance (e.g., up to 3.9% improvement on prompt harmfulness identification). WildGuard serves as a highly effective safety moderator in an LLM interface, reducing the success rate of jailbreak attacks from 79.8% to 2.4%.
Comment: First two authors contributed equally. Third and fourth authors contributed equally
Published: 2024

3. Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Author: Ivison, Hamish, Wang, Yizhong, Liu, Jiacheng, Wu, Zeqiu, Pyatkin, Valentina, Lambert, Nathan, Smith, Noah A., Choi, Yejin, and Hajishirzi, Hannaneh
Subjects: Computer Science - Computation and Language
Abstract: Learning from preference feedback has emerged as an essential step for improving the generation quality and performance of modern language models (LMs). Despite its widespread use, the way preference-based learning is applied varies wildly, with differing data, learning algorithms, and evaluations used, making disentangling the impact of each aspect difficult. In this work, we identify four core aspects of preference-based learning: preference data, learning algorithm, reward model, and policy training prompts, systematically investigate the impact of these components on downstream model performance, and suggest a recipe for strong learning for preference feedback. Our findings indicate that all aspects are important for performance, with better preference data leading to the largest improvements, followed by the choice of learning algorithm, the use of improved reward models, and finally the use of additional unlabeled prompts for policy training. Notably, PPO outperforms DPO by up to 2.5% in math and 1.2% in general domains. High-quality preference data leads to improvements of up to 8% in instruction following and truthfulness. Despite significant gains of up to 5% in mathematical evaluation when scaling up reward models, we surprisingly observe marginal improvements in other categories. We publicly release the code used for training (https://github.com/hamishivi/EasyLM) and evaluating (https://github.com/allenai/open-instruct) our models, along with the models and datasets themselves (https://huggingface.co/collections/allenai/tulu-v25-suite-66676520fd578080e126f618).
Comment: Preprint
Published: 2024

4. Towards a Framework for Openness in Foundation Models: Proceedings from the Columbia Convening on Openness in Artificial Intelligence

Author: Basdevant, Adrien, François, Camille, Storchan, Victor, Bankston, Kevin, Bdeir, Ayah, Behlendorf, Brian, Debbah, Merouane, Kapoor, Sayash, LeCun, Yann, Surman, Mark, King-Turvey, Helen, Lambert, Nathan, Maffulli, Stefano, Marda, Nik, Shivkumar, Govind, and Tunney, Justine
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Over the past year, there has been a robust debate about the benefits and risks of open sourcing foundation models. However, this discussion has often taken place at a high level of generality or with a narrow focus on specific technical attributes. In part, this is because defining open source for foundation models has proven tricky, given its significant differences from traditional software development. In order to inform more practical and nuanced decisions about opening AI systems, including foundation models, this paper presents a framework for grappling with openness across the AI stack. It summarizes previous work on this topic, analyzes the various potential reasons to pursue openness, and outlines how openness varies in different parts of the AI stack, both at the model and at the system level. In doing so, its authors hope to provide a common descriptive framework to deepen a nuanced and rigorous understanding of openness in AI and enable further work around definitions of openness and safety in AI.
Published: 2024

5. D2PO: Discriminator-Guided DPO with Response Evaluation Models

Author: Singhal, Prasann, Lambert, Nathan, Niekum, Scott, Goyal, Tanya, and Durrett, Greg
Subjects: Computer Science - Computation and Language
Abstract: Varied approaches for aligning language models have been proposed, including supervised fine-tuning, RLHF, and direct optimization methods such as DPO. Although DPO has rapidly gained popularity due to its straightforward training process and competitive results, there is an open question of whether there remain practical advantages of using a discriminator, like a reward model, to evaluate responses. We propose D2PO, discriminator-guided DPO, an approach for the online setting where preferences are being collected throughout learning. As we collect gold preferences, we use these not only to train our policy, but to train a discriminative response evaluation model to silver-label even more synthetic data for policy training. We explore this approach across a set of diverse tasks, including a realistic chat setting, and find that our approach leads to higher-quality outputs compared to DPO with the same data budget, and greater efficiency in terms of preference data requirements. Furthermore, we show conditions under which silver labeling is most helpful: it is most effective when training the policy with DPO, outperforming traditional PPO, and benefits from maintaining a separate discriminator from the policy model.
Comment: 20 pages, 12 figures, Accepted to COLM 2024
Published: 2024

6. Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback

Author: Conitzer, Vincent, Freedman, Rachel, Heitzig, Jobst, Holliday, Wesley H., Jacobs, Bob M., Lambert, Nathan, Mossé, Milan, Pacuit, Eric, Russell, Stuart, Schoelkopf, Hailey, Tewolde, Emanuel, and Zwicker, William S.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Computer Science and Game Theory, 68T01, 68T50, 91B14, 91B12, I.2.0, I.2.7, K.4.2, I.2.m, J.4
Abstract: Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" preferences or otherwise use it to make collective choices about model behavior? In this paper, we argue that the field of social choice is well positioned to address these questions, and we discuss ways forward for this agenda, drawing on discussions in a recent workshop on Social Choice for AI Ethics and Safety held in Berkeley, CA, USA in December 2023.
Comment: 15 pages, 4 figures
Published: 2024

7. RewardBench: Evaluating Reward Models for Language Modeling

Author: Lambert, Nathan, Pyatkin, Valentina, Morrison, Jacob, Miranda, LJ, Lin, Bill Yuchen, Chandu, Khyathi, Dziri, Nouha, Kumar, Sachin, Zick, Tom, Choi, Yejin, Smith, Noah A., and Hajishirzi, Hannaneh
Subjects: Computer Science - Machine Learning
Abstract: Reward models (RMs) are at the crux of successfully using RLHF to align pretrained models to human preferences, yet there has been relatively little study that focuses on evaluation of those models. Evaluating reward models presents an opportunity to understand the opaque technologies used for alignment of language models and which values are embedded in them. Resources for reward model training and understanding are sparse in the nascent open-source community around them. To enhance scientific understanding of reward models, we present RewardBench, a benchmark dataset and code-base for evaluation. The RewardBench dataset is a collection of prompt-chosen-rejected trios spanning chat, reasoning, and safety, to benchmark how reward models perform on challenging, structured and out-of-distribution queries. We create specific comparison datasets for RMs that have subtle, but verifiable reasons (e.g. bugs, incorrect facts) why one answer should be preferred to another. On the RewardBench leaderboard, we evaluate reward models trained with a variety of methods, such as the direct MLE training of classifiers and the implicit reward modeling of Direct Preference Optimization (DPO). We present many findings on propensity for refusals, reasoning limitations, and instruction following shortcomings of various reward models towards a better understanding of the RLHF process.
Comment: 44 pages, 19 figures, 12 tables
Published: 2024

8. A Survey on Data Selection for Language Models

Author: Albalak, Alon, Elazar, Yanai, Xie, Sang Michael, Longpre, Shayne, Lambert, Nathan, Wang, Xinyi, Muennighoff, Niklas, Hou, Bairu, Pan, Liangming, Jeong, Haewon, Raffel, Colin, Chang, Shiyu, Hashimoto, Tatsunori, and Wang, William Yang
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training. However, naively training a model on all available data may not be optimal (or feasible), as the quality of available text data can vary. Filtering out data can also decrease the carbon footprint and financial costs of training models by reducing the amount of training required. Data selection methods aim to determine which candidate data points to include in the training dataset and how to appropriately sample from the selected data points. The promise of improved data selection methods has caused the volume of research in the area to rapidly expand. However, because deep learning is mostly driven by empirical evidence and experimentation on large-scale data is expensive, few organizations have the resources for extensive data selection research. Consequently, knowledge of effective data selection practices has become concentrated within a few organizations, many of which do not openly share their findings and methodologies. To narrow this gap in knowledge, we present a comprehensive review of existing literature on data selection methods and related research areas, providing a taxonomy of existing approaches. By describing the current landscape of research, this work aims to accelerate progress in data selection by establishing an entry point for new and established researchers. Additionally, throughout this review we draw attention to noticeable holes in the literature and conclude the paper by proposing promising avenues for future research.
Comment: Paper list available at https://github.com/alon-albalak/data-selection-survey
Published: 2024

9. OLMo: Accelerating the Science of Language Models

Author: Groeneveld, Dirk, Beltagy, Iz, Walsh, Pete, Bhagia, Akshita, Kinney, Rodney, Tafjord, Oyvind, Jha, Ananya Harsh, Ivison, Hamish, Magnusson, Ian, Wang, Yizhong, Arora, Shane, Atkinson, David, Authur, Russell, Chandu, Khyathi Raghavi, Cohan, Arman, Dumas, Jennifer, Elazar, Yanai, Gu, Yuling, Hessel, Jack, Khot, Tushar, Merrill, William, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Pyatkin, Valentina, Ravichander, Abhilasha, Schwenk, Dustin, Shah, Saurabh, Smith, Will, Strubell, Emma, Subramani, Nishant, Wortsman, Mitchell, Dasigi, Pradeep, Lambert, Nathan, Richardson, Kyle, Zettlemoyer, Luke, Dodge, Jesse, Lo, Kyle, Soldaini, Luca, Smith, Noah A., and Hajishirzi, Hannaneh
Subjects: Computer Science - Computation and Language
Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, we have built OLMo, a competitive, truly Open Language Model, to enable the scientific study of language models. Unlike most prior efforts that have only released model weights and inference code, we release OLMo alongside open training data and training and evaluation code. We hope this release will empower the open research community and inspire a new wave of innovation.
Published: 2024

10. Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Author: Soldaini, Luca, Kinney, Rodney, Bhagia, Akshita, Schwenk, Dustin, Atkinson, David, Authur, Russell, Bogin, Ben, Chandu, Khyathi, Dumas, Jennifer, Elazar, Yanai, Hofmann, Valentin, Jha, Ananya Harsh, Kumar, Sachin, Lucy, Li, Lyu, Xinxi, Lambert, Nathan, Magnusson, Ian, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Ravichander, Abhilasha, Richardson, Kyle, Shen, Zejiang, Strubell, Emma, Subramani, Nishant, Tafjord, Oyvind, Walsh, Pete, Zettlemoyer, Luke, Smith, Noah A., Hajishirzi, Hannaneh, Beltagy, Iz, Groeneveld, Dirk, Dodge, Jesse, and Lo, Kyle
Subjects: Computer Science - Computation and Language
Abstract: Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to reproduce them. As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations. To facilitate scientific research on language model pretraining, we curate and release Dolma, a three-trillion-token English corpus, built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials. We extensively document Dolma, including its design principles, details about its construction, and a summary of its contents. We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices. Finally, we open-source our data curation toolkit to enable reproduction of our work as well as support further research in large-scale data curation.
Comment: Accepted at ACL 2024; Dataset: https://hf.co/datasets/allenai/dolma; Code: https://github.com/allenai/dolma
Published: 2024

11. Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

Author: Ivison, Hamish, Wang, Yizhong, Pyatkin, Valentina, Lambert, Nathan, Peters, Matthew, Dasigi, Pradeep, Jang, Joel, Wadden, David, Smith, Noah A., Beltagy, Iz, and Hajishirzi, Hannaneh
Subjects: Computer Science - Computation and Language
Abstract: Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques. We test and incorporate a number of these advances into TÜLU, resulting in TÜLU 2, a suite of improved TÜLU models for advancing the understanding and best practices of adapting pretrained language models to downstream tasks and user preferences. Concretely, we release: (1) TÜLU-V2-mix, an improved collection of high-quality instruction datasets; (2) TÜLU 2, LLAMA-2 models finetuned on the V2 mixture; (3) TÜLU 2+DPO, TÜLU 2 models trained with direct preference optimization (DPO), including the largest DPO-trained model to date (TÜLU 2+DPO 70B); (4) CODE TÜLU 2, CODE LLAMA models finetuned on our V2 mix that outperform CODE LLAMA and its instruction-tuned variant, CODE LLAMA-Instruct. Our evaluation from multiple perspectives shows that the TÜLU 2 suite achieves state-of-the-art performance among open models and matches or exceeds the performance of GPT-3.5-turbo-0301 on several benchmarks. We release all the checkpoints, data, training and evaluation code to facilitate future open efforts on adapting large language models.
Comment: technical report; fixed zephyr numbers
Published: 2023

12. The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback

Author: Lambert, Nathan and Calandra, Roberto
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) more capable in complex settings. RLHF proceeds as collecting human preference data, training a reward model on said data, and optimizing a base ML model with respect to said reward for extrinsic evaluation metrics (e.g. MMLU, GSM8k). RLHF relies on many assumptions about how the various pieces fit together, such as a reward model capturing human preferences and an RL optimizer extracting the right signal from a reward model. As the RLHF process involves many distinct design decisions, it is easy to assume that multiple processes are correlated and therefore numerically linked. This apparent correlation is often not true, where reward models are easily overoptimized or RL optimizers can reduce performance on tasks not modeled in the data. Notable manifestations of models trained with imperfect RLHF systems are those that are prone to refusing basic requests for safety reasons or appearing lazy in generations. As chat model evaluation becomes increasingly nuanced, the reliance on a perceived link between reward model training, RL scores, and downstream performance drives these issues, which we describe as an objective mismatch. In this paper, we illustrate the causes of this issue, reviewing relevant literature from model-based reinforcement learning, and argue for solutions. By solving objective mismatch in RLHF, the ML models of the future will be more precisely aligned to user instructions for both safety and helpfulness.
Comment: 11 pages, 5 figures
Published: 2023

13. Zephyr: Direct Distillation of LM Alignment

Author: Tunstall, Lewis, Beeching, Edward, Lambert, Nathan, Rajani, Nazneen, Rasul, Kashif, Belkada, Younes, Huang, Shengyi, von Werra, Leandro, Fourrier, Clémentine, Habib, Nathan, Sarrazin, Nathan, Sanseviero, Omar, Rush, Alexander M., and Wolf, Thomas
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e. they do not respond well to natural prompts. To distill this property, we experiment with the use of preference data from AI Feedback (AIF). Starting from a dataset of outputs ranked by a teacher model, we apply distilled direct preference optimization (dDPO) to learn a chat model with significantly improved intent alignment. The approach requires only a few hours of training without any additional sampling during fine-tuning. The final result, Zephyr-7B, sets the state-of-the-art on chat benchmarks for 7B parameter models, and requires no human annotation. In particular, results on MT-Bench show that Zephyr-7B surpasses Llama2-Chat-70B, the best open-access RLHF-based model. Code, models, data, and tutorials for the system are available at https://github.com/huggingface/alignment-handbook.
Published: 2023

14. Entangled Preferences: The History and Risks of Reinforcement Learning and Human Feedback

Author: Lambert, Nathan, Gilbert, Thomas Krendl, and Zick, Tom
Subjects: Computer Science - Computers and Society
Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) easier to use and more effective. A core piece of the RLHF process is the training and utilization of a model of human preferences that acts as a reward function for optimization. This approach, which operates at the intersection of many stakeholders and academic disciplines, remains poorly understood. RLHF reward models are often cited as being central to achieving performance, yet very few descriptors of capabilities, evaluations, training methods, or open-source models exist. Given this lack of information, further study and transparency is needed for learned RLHF reward models. In this paper, we illustrate the complex history of optimizing preferences, and articulate lines of inquiry to understand the sociotechnical context of reward models. In particular, we highlight the ontological differences between costs, rewards, and preferences at stake in RLHF's foundations, related methodological tensions, and possible research directions to improve general understanding of how reward models function.
Comment: 13 pages, 1 figure
Published: 2023

15. A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning

Author: Wei, Ran, Lambert, Nathan, McDonald, Anthony, Garcia, Alfredo, and Calandra, Roberto
Subjects: Computer Science - Machine Learning
Abstract: Model-based Reinforcement Learning (MBRL) aims to make agents more sample-efficient, adaptive, and explainable by learning an explicit model of the environment. While the capabilities of MBRL agents have significantly improved in recent years, how to best learn the model is still an unresolved question. The majority of MBRL algorithms aim at training the model to make accurate predictions about the environment and subsequently using the model to determine the most rewarding actions. However, recent research has shown that model predictive accuracy is often not correlated with action quality, tracing the root cause to the objective mismatch between accurate dynamics model learning and policy optimization of rewards. A number of interrelated solution categories to the objective mismatch problem have emerged as MBRL continues to mature as a research area. In this work, we provide an in-depth survey of these solution categories and propose a taxonomy to foster future research.
Published: 2023

16. Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Author: Shoker, Sarah, Reddie, Andrew, Barrington, Sarah, Booth, Ruby, Brundage, Miles, Chahal, Husanjot, Depp, Michael, Drexel, Bill, Gupta, Ritwik, Favaro, Marina, Hecla, Jake, Hickey, Alan, Konaev, Margarita, Kumar, Kirthi, Lambert, Nathan, Lohn, Andrew, O'Keefe, Cullen, Rajani, Nazneen, Sellitto, Michael, Trager, Robert, Walker, Leah, Wehsener, Alexa, and Young, Jessica
Subjects: Computer Science - Computers and Society
Abstract: Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and the interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Security Lab at the University of California brought together a multistakeholder group to think through the tools and strategies to mitigate the potential risks introduced by foundation models to international security. Originating in the Cold War, confidence-building measures (CBMs) are actions that reduce hostility, prevent conflict escalation, and improve trust between parties. The flexibility of CBMs makes them a key instrument for navigating the rapid changes in the foundation model landscape. Participants identified the following CBMs that directly apply to foundation models and which are further explained in these conference proceedings: 1. crisis hotlines 2. incident sharing 3. model, transparency, and system cards 4. content provenance and watermarks 5. collaborative red teaming and table-top exercises and 6. dataset and evaluation sharing. Because most foundation model developers are non-government entities, many CBMs will need to involve a wider stakeholder community. These measures can be implemented either by AI labs or by relevant government actors.
Published: 2023

17. BLISS: Interplanetary Exploration with Swarms of Low-Cost Spacecraft

Author: Alvara, Alexander N., Lee, Lydia, Sin, Emmanuel, Lambert, Nathan, Westphal, Andrew J., and Pister, Kristofer S. J.
Subjects: Electrical Engineering and Systems Science - Systems and Control, Electrical Engineering and Systems Science - Image and Video Processing, Physics - Space Physics
Abstract: Leveraging advancements in micro-scale technology, we propose a fleet of autonomous, low-cost, small solar sails for interplanetary exploration. The Berkeley Low-cost Interplanetary Solar Sail (BLISS) project aims to utilize small-scale technologies to create a fleet of tiny interplanetary femto-spacecraft for rapid, low-cost exploration of the inner solar system. This paper describes the hardware required to build a nearly 10 g spacecraft using a 1 m² solar sail steered by micro-electromechanical systems (MEMS) inchworm actuators. The trajectory control to a NEO, here 101955 Bennu, is detailed along with the low-level actuation control of the solar sail and the specifications of proposed onboard communication and computation. Two other applications are also briefly considered: sample return from dozens of Jupiter-family comets and interstellar comet rendezvous and imaging. The paper concludes by discussing the fundamental scaling limits and future directions for steerable autonomous miniature solar sails with onboard custom computers and sensors.
Comment: 16 pages, 13 figures, 5 tables, 23 equations, and just over 10 years
Published: 2023

18. Measuring Data

Author: Mitchell, Margaret, Luccioni, Alexandra Sasha, Lambert, Nathan, Gerchick, Marissa, McMillan-Major, Angelina, Ozoani, Ezinwanne, Rajani, Nazneen, Thrush, Tristan, Jernite, Yacine, and Kiela, Douwe
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height, width, and volume, data measurements quantify different attributes of data along common dimensions that support comparison. Several lines of research have proposed what we refer to as measurements, with differing terminology; we bring some of this work together, particularly in fields of computer vision and language, and build from it to motivate measuring data as a critical component of responsible AI development. Measuring data aids in systematically building and analyzing machine learning (ML) data towards specific goals and gaining better control of what modern ML systems will learn. We conclude with a discussion of the many avenues of future work, the limitations of data measurements, and how to leverage these measurement approaches in research and practice.
Published: 2022

19. Reward Reports for Reinforcement Learning

Author: Gilbert, Thomas Krendl, Lambert, Nathan, Dean, Sarah, Zick, Tom, and Snoswell, Aaron
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: Building systems that are good for society in the face of complex societal effects requires a dynamic approach. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning has shown that the effects of feedback and optimization objectives on system behavior can be wide-ranging and unpredictable. In this paper we sketch a framework for documenting deployed and iteratively updated learning systems, which we call Reward Reports. Taking inspiration from various contributions to the technical literature on reinforcement learning, we outline Reward Reports as living documents that track updates to design choices and assumptions behind what a particular automated system is optimizing for. They are intended to track dynamic phenomena arising from system deployment, rather than merely static properties of models or data. After presenting the elements of a Reward Report, we discuss a concrete example: Meta's BlenderBot 3 chatbot. Several others for game-playing (DeepMind's MuZero), content recommendation (MovieLens), and traffic control (Project Flow) are included in the appendix.
Published: 2022

20. Investigating Compounding Prediction Errors in Learned Dynamics Models

Author: Lambert, Nathan, Pister, Kristofer, and Calandra, Roberto
Subjects: Computer Science - Machine Learning
Abstract: Accurately predicting the consequences of agents' actions is a key prerequisite for planning in robotic control. Model-based reinforcement learning (MBRL) is one paradigm which relies on the iterative learning and prediction of state-action transitions to solve a task. Deep MBRL has become a popular candidate, using a neural network to learn a dynamics model that predicts with each pass from high-dimensional states to actions. These "one-step" predictions are known to become inaccurate over longer horizons of composed prediction - called the compounding error problem. Given the prevalence of the compounding error problem in MBRL and related fields of data-driven control, we set out to understand the properties of and conditions causing these long-horizon errors. In this paper, we explore the effects of subcomponents of a control problem on long term prediction error: including choosing a system, collecting data, and training a model. These detailed quantitative studies on simulated and real-world data show that the underlying dynamics of a system are the strongest factor determining the shape and magnitude of prediction error. Given a clearer understanding of compounding prediction error, researchers can implement new types of models beyond "one-step" that are more useful for control., Comment: 25 pages, 19 figures
Published: 2022

21. Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems

Author: Gilbert, Thomas Krendl, Dean, Sarah, Zick, Tom, and Lambert, Nathan
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: In the long term, reinforcement learning (RL) is considered by many AI theorists to be the most promising path to artificial general intelligence. This places RL practitioners in a position to design systems that have never existed before and lack prior documentation in law and policy. Public agencies could intervene on complex dynamics that were previously too opaque to deliberate about, and long-held policy ambitions would finally be made tractable. In this whitepaper we illustrate this potential and how it might be technically enacted in the domains of energy infrastructure, social media recommender systems, and transportation. Alongside these unprecedented interventions come new forms of risk that exacerbate the harms already generated by standard machine learning tools. We correspondingly present a new typology of risks arising from RL design choices, falling under four categories: scoping the horizon, defining rewards, pruning information, and training multiple agents. Rather than allowing RL systems to unilaterally reshape human domains, policymakers need new mechanisms for the rule of reason, foreseeability, and interoperability that match the risks these systems pose. We argue that criteria for these choices may be drawn from emerging subfields within antitrust, tort, and administrative law. It will then be possible for courts, federal and state agencies, and non-governmental organizations to play more active roles in RL specification and evaluation. Building on the "model cards" and "datasheets" frameworks proposed by Mitchell et al. and Gebru et al., we argue the need for Reward Reports for AI systems. Reward Reports are living documents for proposed RL deployments that demarcate design choices., Comment: 60 pages
Published: 2022

22. The Challenges of Exploration for Offline Reinforcement Learning

Author: Lambert, Nathan, Wulfmeier, Markus, Whitney, William, Byravan, Arunkumar, Bloesch, Michael, Dasagi, Vibhavari, Hertweck, Tim, and Riedmiller, Martin
Subjects: Computer Science - Machine Learning
Abstract: Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour. The second step has been widely studied in the offline setting, but just as critical to data-efficient RL is the collection of informative data. The task-agnostic setting for data collection, where the task is not known a priori, is of particular interest due to the possibility of collecting a single dataset and using it to solve several downstream tasks as they arise. We investigate this setting via curiosity-based intrinsic motivation, a family of exploration methods which encourage the agent to explore those states or transitions it has not yet learned to model. With Explore2Offline, we propose to evaluate the quality of collected data by transferring the collected data and inferring policies with reward relabelling and standard offline RL algorithms. We evaluate a wide variety of data collection strategies, including a new exploration agent, Intrinsic Model Predictive Control (IMPC), using this scheme and demonstrate their performance on various tasks. We use this decoupled framework to strengthen intuitions about exploration and the data prerequisites for effective offline RL.
Published: 2022

23. BotNet: A Simulator for Studying the Effects of Accurate Communication Models on Multi-agent and Swarm Control

Author: Selden, Mark, Zhou, Jason, Campos, Felipe, Lambert, Nathan, Drew, Daniel, and Pister, Kristofer S. J.
Subjects: Computer Science - Robotics
Abstract: Decentralized control in multi-robot systems is dependent on accurate and reliable communication between agents. Important communication factors, such as latency and packet delivery ratio, are strong functions of the number of agents in the network. Findings from studies of mobile and high node-count radio-frequency (RF) mesh networks have only been transferred to the domain of multi-robot systems to a limited extent, and typical multi-agent robotic simulators often depend on simple propagation models that do not reflect the behavior of realistic RF networks. In this paper, we present a new open source swarm robotics simulator, BotNet, with an embedded standards-compliant time-synchronized channel hopping (6TiSCH) RF mesh network simulator. Using this simulator we show how more accurate communications models can limit even simple multi-robot control tasks such as flocking and formation control, with agent counts ranging from 10 up to 2500 agents. The experimental results are used to motivate changes to the inter-robot communication propagation models and other networking components currently used in practice in order to bridge the sim-to-real gap., Comment: 9 pages, 8 figures
Published: 2021

24. Axes for Sociotechnical Inquiry in AI Research

Author: Dean, Sarah, Gilbert, Thomas Krendl, Lambert, Nathan, and Zick, Tom
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence
Abstract: The development of artificial intelligence (AI) technologies has far exceeded the investigation of their relationship with society. Sociotechnical inquiry is needed to mitigate the harms of new technologies whose potential impacts remain poorly understood. To date, subfields of AI research develop primarily individual views on their relationship with sociotechnics, while tools for external investigation, comparison, and cross-pollination are lacking. In this paper, we propose four directions for inquiry into new and evolving areas of technological development: value--what progress and direction does a field promote, optimization--how the defined system within a problem formulation relates to broader dynamics, consensus--how agreement is achieved and who is included in building it, and failure--what methods are pursued when the problem specification is found wanting. The paper provides a lexicon for sociotechnical inquiry and illustrates it through the example of consumer drone technology., Comment: 9 pages, 1 figure
Published: 2021
Full Text: View/download PDF

25. MBRL-Lib: A Modular Library for Model-based Reinforcement Learning

Author: Pineda, Luis, Amos, Brandon, Zhang, Amy, Lambert, Nathan O., and Calandra, Roberto
Subjects: Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Systems and Control
Abstract: Model-based reinforcement learning is a compelling framework for data-efficient learning of agents that interact with the world. This family of algorithms has many subcomponents that need to be carefully selected and tuned. As a result the entry-bar for researchers to approach the field and to deploy it in real-world tasks can be daunting. In this paper, we present MBRL-Lib -- a machine learning library for model-based reinforcement learning in continuous state-action spaces based on PyTorch. MBRL-Lib is designed as a platform for both researchers, to easily develop, debug and compare new algorithms, and non-expert users, to lower the entry-bar of deploying state-of-the-art algorithms. MBRL-Lib is open-source at https://github.com/facebookresearch/mbrl-lib.
Published: 2021

26. BLISS: Interplanetary exploration with swarms of low-cost spacecraft

Author: Alvara, Alexander N., Lee, Lydia, Sin, Emmanuel, Lambert, Nathan, Westphal, Andrew J., and Pister, Kristofer S.J.
Published: 2024
Full Text: View/download PDF

27. On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Author: Zhang, Baohe, Rajan, Raghu, Pineda, Luis, Lambert, Nathan, Biedenkapp, André, Chua, Kurtland, Hutter, Frank, and Calandra, Roberto
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Electrical Engineering and Systems Science - Systems and Control
Abstract: Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a result, they often possess tens of hyperparameters and architectural choices. For this reason, MBRL typically requires significant human expertise before it can be applied to new problems and domains. To alleviate this problem, we propose to use automatic hyperparameter optimization (HPO). We demonstrate that this problem can be tackled effectively with automated HPO, which we demonstrate to yield significantly improved performance compared to human experts. In addition, we show that tuning of several MBRL hyperparameters dynamically, i.e. during the training itself, further improves the performance compared to using static hyperparameters which are kept fixed for the whole training. Finally, our experiments provide valuable insights into the effects of several hyperparameters, such as plan horizon or learning rate and their influence on the stability of training and resulting rewards., Comment: 19 pages, accepted by AISTATS 2021
Published: 2021

28. AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks

Author: Andrus, McKane, Dean, Sarah, Gilbert, Thomas Krendl, Lambert, Nathan, and Zick, Tom
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence
Abstract: Despite interest in communicating ethical problems and social contexts within the undergraduate curriculum to advance Public Interest Technology (PIT) goals, interventions at the graduate level remain largely unexplored. This may be due to the conflicting ways through which distinct Artificial Intelligence (AI) research tracks conceive of their interface with social contexts. In this paper we track the historical emergence of sociotechnical inquiry in three distinct subfields of AI research: AI Safety, Fair Machine Learning (Fair ML) and Human-in-the-Loop (HIL) Autonomy. We show that for each subfield, perceptions of PIT stem from the particular dangers faced by past integration of technical systems within a normative social order. We further interrogate how these histories dictate the response of each subfield to conceptual traps, as defined in the Science and Technology Studies literature. Finally, through a comparative analysis of these currently siloed fields, we present a roadmap for a unified approach to sociotechnical graduate pedagogy in AI., Comment: 8 Pages
Published: 2021

29. Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning

Author: Lambert, Nathan O., Wilcox, Albert, Zhang, Howard, Pister, Kristofer S. J., and Calandra, Roberto
Subjects: Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Accurately predicting the dynamics of robotic systems is crucial for model-based control and reinforcement learning. The most common way to estimate dynamics is by fitting a one-step ahead prediction model and using it to recursively propagate the predicted state distribution over long horizons. Unfortunately, this approach is known to compound even small prediction errors, making long-term predictions inaccurate. In this paper, we propose a new parametrization to supervised learning on state-action data to stably predict at longer horizons -- that we call a trajectory-based model. This trajectory-based model takes an initial state, a future time index, and control parameters as inputs, and directly predicts the state at the future time index. Experimental results in simulated and real-world robotic tasks show that trajectory-based models yield significantly more accurate long term predictions, improved sample efficiency, and the ability to predict task reward. With these improved prediction properties, we conclude with a demonstration of methods for using the trajectory-based model for control., Comment: 8 pages, +4 pages appendix
Published: 2020

30. Nonholonomic Yaw Control of an Underactuated Flying Robot with Model-based Reinforcement Learning

Author: Lambert, Nathan, Schindler, Craig, Drew, Daniel, and Pister, Kristofer
Subjects: Computer Science - Robotics
Abstract: Nonholonomic control is a candidate to control nonlinear systems with path-dependent states. We investigate an underactuated flying micro-aerial-vehicle, the ionocraft, that requires nonholonomic control in the yaw-direction for complete attitude control. Deploying an analytical control law involves substantial engineering design and is sensitive to inaccuracy in the system model. With specific assumptions on assembly and system dynamics, we derive a Lie bracket for yaw control of the ionocraft. As a comparison to the significant engineering effort required for an analytic control law, we implement a data-driven model-based reinforcement learning yaw controller in a simulated flight task. We demonstrate that a simple model-based reinforcement learning framework can match the derived Lie bracket control (in yaw rate and chosen actions) in a few minutes of flight data, without a pre-defined dynamics function. This paper shows that learning-based approaches are useful as a tool for synthesis of nonlinear control laws previously only addressable through expert-based design., Comment: 7 pages, 1 page appendix
Published: 2020
Full Text: View/download PDF

31. Learning for Microrobot Exploration: Model-based Locomotion, Sparse-robust Navigation, and Low-power Deep Classification

Author: Lambert, Nathan O., Toddywala, Farhan, Liao, Brian, Zhu, Eric, Lee, Lydia, and Pister, Kristofer S. J.
Subjects: Computer Science - Robotics
Abstract: Building intelligent autonomous systems at any scale is challenging. The sensing and computation constraints of a microrobot platform make the problems harder. We present improvements to learning-based methods for on-board learning of locomotion, classification, and navigation of microrobots. We show how simulated locomotion can be achieved with model-based reinforcement learning via on-board sensor data distilled into control. Next, we introduce a sparse, linear detector and a Dynamic Thresholding method to FAST Visual Odometry for improved navigation in the noisy regime of mm scale imagery. We end with a new image classifier capable of classification with fewer than one million multiply-and-accumulate (MAC) operations by combining fast downsampling, efficient layer structures and hard activation functions. These are promising steps toward using state-of-the-art algorithms in the power-limited world of edge-intelligence and microrobots., Comment: 6 pages; 2 pages appendices
Published: 2020

32. Objective Mismatch in Model-based Reinforcement Learning

Author: Lambert, Nathan, Amos, Brandon, Yadan, Omry, and Calandra, Roberto
Subjects: Computer Science - Machine Learning, Computer Science - Robotics, Statistics - Machine Learning
Abstract: Model-based reinforcement learning (MBRL) has been shown to be a powerful framework for data-efficiently learning control of continuous tasks. Recent work in MBRL has mostly focused on using more advanced function approximators and planning schemes, with little development of the general framework. In this paper, we identify a fundamental issue of the standard MBRL framework -- what we call the objective mismatch issue. Objective mismatch arises when one objective is optimized in the hope that a second, often uncorrelated, metric will also be optimized. In the context of MBRL, we characterize the objective mismatch between training the forward dynamics model w.r.t. the likelihood of the one-step ahead prediction, and the overall goal of improving performance on a downstream control task. For example, this issue can emerge with the realization that dynamics models effective for a specific task do not necessarily need to be globally accurate, and vice versa globally accurate models might not be sufficiently accurate locally to obtain good control performance on a specific task. In our experiments, we study this objective mismatch issue and demonstrate that the likelihood of one-step ahead predictions is not always correlated with control performance. This observation highlights a critical limitation in the MBRL framework which will require further research to be fully understood and addressed. We propose an initial method to mitigate the mismatch issue by re-weighting dynamics model training. Building on it, we conclude with a discussion about other potential directions of research for addressing this issue., Comment: 9 pages, 2 pages references, 5 pages appendices
Published: 2020

33. Learning Generalizable Locomotion Skills with Hierarchical Reinforcement Learning

Author: Li, Tianyu, Lambert, Nathan, Calandra, Roberto, Meier, Franziska, and Rai, Akshara
Subjects: Computer Science - Robotics
Abstract: Learning to locomote to arbitrary goals on hardware remains a challenging problem for reinforcement learning. In this paper, we present a hierarchical learning framework that improves sample-efficiency and generalizability of locomotion skills on real-world robots. Our approach divides the problem of goal-oriented locomotion into two sub-problems: learning diverse primitive skills, and using model-based planning to sequence these skills. We parametrize our primitives as cyclic movements, improving sample-efficiency of learning on an 18-degree-of-freedom robot. Then, we learn coarse dynamics models over primitive cycles and use them in a model predictive control framework. This allows us to learn to walk to arbitrary goals up to 12m away, after about two hours of training from scratch on hardware. Our results on a Daisy hexapod hardware and simulation demonstrate the efficacy of our approach at reaching distant targets, in different environments and with sensory noise., Comment: Submitted to 2020 ICRA
Published: 2019

34. Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning

Author: Lambert, Nathan O., Drew, Daniel S., Yaconelli, Joseph, Calandra, Roberto, Levine, Sergey, and Pister, Kristofer S. J.
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times. With the rising number of robotic and mechatronic systems deployed across areas ranging from industrial automation to intelligent toys, the need for a general approach to generating low-level controllers is increasing. To address the challenge of rapidly generating low-level controllers, we argue for using model-based reinforcement learning (MBRL) trained on relatively small amounts of automatically generated (i.e., without system simulation) data. In this paper, we explore the capabilities of MBRL on a Crazyflie centimeter-scale quadrotor with rapid dynamics to predict and control at <50Hz. To our knowledge, this is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge. Our controller leverages rapid simulation of a neural network forward dynamics model on a GPU-enabled base station, which then transmits the best current action to the quadrotor firmware via radio. In our experiments, the quadrotor achieved hovering capability of up to 6 seconds with 3 minutes of experimental training data., Comment: Accepted to IROS and RA-L, 2019. For more information, see the website: https://sites.google.com/berkeley.edu/mbrl-quadrotor/. 9 pages, 12 figures
Published: 2019

35. Femoral Artery Closure Devices vs Manual Compression During Cardiac Catheterization and Percutaneous Coronary Intervention

Author: Kreutz, Rolf P., Phookan, Sujoy, Bahrami, Hamid, Sinha, Anjan K., Breall, Jeffrey A., Revtyak, George E., Ephrem, Georges, Zenisek, Joseph R., Frick, Kyle A., Jaradat, Ziad A., Abu Romeh, Ibrahim S., O’Leary, Brian A., Ansari, Hamza Z., Ferguson, Andrew D., Zawacki, Kevin E., Hoque, Mohammad Z., Iqtidar, Ali F., Lambert, Nathan D., and von der Lohe, Elisabeth
Published: 2022
Full Text: View/download PDF

36. Clinical Outcomes of Patients With Acute Myocardial Infarction in Health Professional Shortage Areas in Indiana

Author: Gunderman, David J., primary, Kumar, Ashish, additional, Munguia-Vazquez, Raymundo, additional, Vora, Keyur, additional, Shah, Chirag, additional, Lambert, Nathan, additional, Cavanaugh, Brendan, additional, Dharmakumar, Rohan, additional, and Kalra, Ankur, additional
Published: 2024
Full Text: View/download PDF

37. Supervision as a mechanism in teacher wellbeing: A Q-methodological study of school staff viewpoints

Author: Beech, Kirsty, primary, Gulliford, Anthea, additional, and Lambert, Nathan, additional
Published: 2023
Full Text: View/download PDF

38. BLISS: Interplanetary exploration with swarms of low-cost spacecraft

Author: Alvara, Alexander N., primary, Lee, Lydia, additional, Sin, Emmanuel, additional, Lambert, Nathan, additional, Westphal, Andrew J., additional, and Pister, Kristofer S.J., additional
Published: 2023
Full Text: View/download PDF

39. Synergy of Prediction and Control in Model-based Reinforcement Learning

Author: Lambert, Nathan Owen
Subjects: Artificial intelligence, Robotics, Dynamics models, Reinforcement learning
Abstract: Model-based reinforcement learning (MBRL) has often been touted for its potential to improve on the sample-efficiency, generalization, and safety of existing reinforcement learning algorithms. These model-based algorithms constrain the policy optimization during trial-and-error learning to include a structured representation of the environment dynamics. To date, the posited benefits have largely been left as directions for future work. This thesis attempts to illustrate the central mechanism in MBRL: how a learned dynamics model interacts with decision making. A better understanding of this interaction will point the field in the direction of enabling the posited benefits. This thesis encompasses the interaction of model-learning with decision making with respect to two central issues: compounding prediction errors and objective mismatch. The compounding error challenge emerges from accumulating errors on recursive passes of any one-step transition model. Most dynamics models are trained for single-step accuracy, which often results in models with substantial long-term prediction error. Additionally, the model being trained for accurate transitions need not guarantee high-performance policies on the downstream task. The lack of correlation between model and policy metrics in separate optimization is coined and studied as Objective Mismatch. These challenges are primarily studied in the context of sample-based model predictive control (MPC) algorithms, where the learned model is used to simulate trajectories and their resulting predicted rewards. To mitigate compounding error and objective mismatch, the trajectory-based dynamics model is a feedforward prediction parametrization containing a direct representation of time. This model represents one small but important step towards more useful dynamics models in model-based reinforcement learning. This thesis concludes with future directions on the synergy of prediction and control in MBRL, primarily focused on state-abstractions, temporal correlation, and future prediction methodologies.
Published: 2022

40. Reward Reports for Reinforcement Learning

Author: Gilbert, Thomas Krendl, primary, Lambert, Nathan, additional, Dean, Sarah, additional, Zick, Tom, additional, Snoswell, Aaron, additional, and Mehta, Soham, additional
Published: 2023
Full Text: View/download PDF

41. Keratoplasty Outcomes in Patients With Uveitis

Author: Hennein, Lauren, Lambert, Nathan G., Chamberlain, Winston, Hirabayashi, Kristin, Rose-Nussbaumer, Jennifer, and Schallhorn, Julie M.
Published: 2020
Full Text: View/download PDF

42. Case Series of Stickler Syndrome Presenting With Acute Angle Closure

Author: Walters, Alexander, Lambert, Nathan, Bricel, Seth, Hwang, Thomas, Ing, Eliesa, and Tehrani, Shandiz
Published: 2020
Full Text: View/download PDF

43. Risk factors and biomarkers of age-related macular degeneration

Author: Lambert, Nathan G., ElShelmani, Hanan, Singh, Malkit K., Mansergh, Fiona C., Wride, Michael A., Padilla, Maximilian, Keegan, David, Hogg, Ruth E., and Ambati, Balamurali K.
Published: 2016
Full Text: View/download PDF

44. Intraocular pressure study using monitored forced-infusion system phacoemulsification technology

Author: Jensen, Jason D., Boulter, Tyler, Lambert, Nathan G., Zaugg, Brian, Stagg, Brian C., Pettey, Jeff H., and Olson, Randall J.
Published: 2016
Full Text: View/download PDF

45. Comparison of a torsional and a standard tip with a monitored forced infusion phacoemulsification system

Author: Boulter, Tyler, Jensen, Jason D., Christensen, Michael D., Lambert, Nathan G., Zaugg, Brian, Stagg, Brian C., Pettey, Jeff H., and Olson, Randall J.
Published: 2016
Full Text: View/download PDF

46. Super-Infinite: The Transformations of John Donne by Katherine Rundell (review)

Author: Lambert, Nathanael
Published: 2024
Full Text: View/download PDF

47. The Temporal Stability and Predictive Validity of Pupils' Causal Attributions for Difficult Classroom Behaviour

Author: Lambert, Nathan and Miller, Andy
Abstract: Background: Recent studies have investigated the causal attributions for difficult pupil behaviour made by teachers, pupils, and parents but none have investigated the temporal stability or predictive validity of these attributions. Aims: This study examines the causal attributions made for difficult classroom behaviour by students on two occasions 30 months apart. The longitudinal stability of these attributions is considered as is the predictive validity of the first set of attributions in relation to teachers' later judgments about individual students' behaviour. Sample: Two hundred and seventeen secondary school age pupils (114 males, 103 females) provided data on the two occasions. Teachers also rated each student's behaviour at the two times. Method: A questionnaire listing 63 possible causes of classroom misbehaviour was delivered to pupils firstly when they were in Year 7 (aged 11-12) and then again, 30 months later. Responses were analysed through exploratory factor analysis (EFA). Additionally, teachers were asked to rate the standard of behaviour of each of the students on the two occasions. Results: EFA of the Years 7 and 10 data indicated that pupils' attributions yielded broadly similar five-factor models with the perceived relative importance of these factors remaining the same. Analysis also revealed a predictive relationship between pupils' attributions regarding the factor named "culture of misbehaviour" in Year 7, and teachers' judgments of their standard of behaviour in Year 10. Conclusion: The present study suggests that young adolescents' causal attributions for difficult classroom behaviour remain stable over time and are predictive of teachers' later judgments about their behaviour.
Published: 2010
Full Text: View/download PDF

48. Young People's Views about Their Involvement in Decision-Making

Author: Aston, Hermione J. and Lambert, Nathan
Abstract: This paper reports on research conducted over a two-year period in a large Educational Psychology Service (EPS) in England. Researchers were keen to ascertain the views of young people and EPS members about young people being directly involved in educational decision-making and how their "genuine" involvement in such decision-making might be best achieved. Focus groups were employed as a means of gathering data which were analysed using Content Analysis. Young people and EPS members ultimately identified "culture", "attitudes", "environment", and "systems" as being the most important factors in ensuring the genuine involvement of young people in decision-making.
Published: 2010
Full Text: View/download PDF

49. Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning

Author: Lambert, Nathan, primary, Wilcox, Albert, additional, Zhang, Howard, additional, Pister, Kristofer S. J., additional, and Calandra, Roberto, additional
Published: 2021
Full Text: View/download PDF

50. Predicting Flying Robot Dynamics with Deep Learning

Author: Li, Brian, primary and Lambert, Nathan, additional
Published: 2021
Full Text: View/download PDF
