404 results for "Jennifer Wortman"
Search Results
2. How do authors' perceptions of their papers compare with co-authors' perceptions and peer-review decisions?
- Author
-
Charvi Rastogi, Ivan Stelmakh, Alina Beygelzimer, Yann N Dauphin, Percy Liang, Jennifer Wortman Vaughan, Zhenyu Xue, Hal Daumé III, Emma Pierson, and Nihar B Shah
- Subjects
Medicine, Science
- Abstract
How do author perceptions match up to the outcomes of the peer-review process and the perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we surveyed the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based on scientific contribution, and (iii) the change in their perception of their own papers after seeing the reviews. The salient results are: (1) Authors overestimated the acceptance probability of their papers roughly three-fold: the median prediction was 70% for an approximately 25% acceptance rate. (2) Female authors exhibited a marginally higher (statistically significant) miscalibration than male authors; predictions of authors invited to serve as meta-reviewers or reviewers were similarly calibrated, but better than those of authors who were not invited to review. (3) Authors' relative rankings of the scientific contribution of two submissions they made generally agreed with their predicted acceptance probabilities (93% agreement), but in a notable 7% of responses, authors predicted a worse outcome for the paper they ranked higher. (4) The author-provided rankings disagreed with the peer-review decisions about a third of the time; when co-authors ranked their jointly authored papers, they disagreed at a similar rate, about a third of the time. (5) At least 30% of respondents for both accepted and rejected papers said that their perception of their own paper improved after the review process. Stakeholders in peer review should take these findings into account when setting their expectations of peer review.
- Published
- 2024
- Full Text
- View/download PDF
3. Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
- Author
-
Cooper, A. Feder, Choquette-Choo, Christopher A., Bogen, Miranda, Jagielski, Matthew, Filippova, Katja, Liu, Ken Ziyu, Chouldechova, Alexandra, Hayes, Jamie, Huang, Yangsibo, Mireshghallah, Niloofar, Shumailov, Ilia, Triantafillou, Eleni, Kairouz, Peter, Mitchell, Nicole, Liang, Percy, Ho, Daniel E., Choi, Yejin, Koyejo, Sanmi, Delgado, Fernando, Grimmelmann, James, Shmatikov, Vitaly, De Sa, Christopher, Barocas, Solon, Cyphert, Amy, Lemley, Mark, boyd, danah, Vaughan, Jennifer Wortman, Brundage, Miles, Bau, David, Neel, Seth, Jacobs, Abigail Z., Terzis, Andreas, Wallach, Hanna, Papernot, Nicolas, and Lee, Katherine
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
- Comment
- Presented at the 2nd Workshop on Generative AI and Law at ICML (July 2024)
- Published
- 2024
4. Challenges in Human-Agent Communication
- Author
-
Bansal, Gagan, Vaughan, Jennifer Wortman, Amershi, Saleema, Horvitz, Eric, Fourney, Adam, Mozannar, Hussein, Dibia, Victor, and Weld, Daniel S.
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
Remarkable advancements in modern generative foundation models have enabled the development of sophisticated and highly capable autonomous agents that can observe their environment, invoke tools, and communicate with other agents to solve problems. Although such agents can communicate with users through natural language, their complexity and wide-ranging failure modes present novel challenges for human-AI interaction. Building on prior research and informed by a communication grounding perspective, we contribute to the study of human-agent communication by identifying and analyzing twelve key communication challenges that these systems pose. These include challenges in conveying information from the agent to the user, challenges in enabling the user to convey information to the agent, and overarching challenges that need to be considered across all human-agent communication. We illustrate each challenge through concrete examples and identify open directions of research. Our findings provide insights into critical gaps in human-agent communication research and serve as an urgent call for new design patterns, principles, and guidelines to support transparency and control in these systems.
- Published
- 2024
5. Dimensions of Generative AI Evaluation Design
- Author
-
Dow, P. Alex, Vaughan, Jennifer Wortman, Barocas, Solon, Atalla, Chad, Chouldechova, Alexandra, and Wallach, Hanna
- Subjects
Computer Science - Computers and Society
- Abstract
There are few principles or guidelines to ensure evaluations of generative AI (GenAI) models and systems are effective. To help address this gap, we propose a set of general dimensions that capture critical choices involved in GenAI evaluation design. These dimensions include the evaluation setting, the task type, the input source, the interaction style, the duration, the metric type, and the scoring method. By situating GenAI evaluations within these dimensions, we aim to guide decision-making during GenAI evaluation design and provide a structure for comparing different evaluations. We illustrate the utility of the proposed set of general dimensions using two examples: a hypothetical evaluation of the fairness of a GenAI system and three real-world GenAI evaluations of biological threats.
- Comment
- NeurIPS 2024 Workshop on Evaluating Evaluations (EvalEval)
- Published
- 2024
6. Evaluating Generative AI Systems is a Social Science Measurement Challenge
- Author
-
Wallach, Hanna, Desai, Meera, Pangakis, Nicholas, Cooper, A. Feder, Wang, Angelina, Barocas, Solon, Chouldechova, Alexandra, Atalla, Chad, Blodgett, Su Lin, Corvi, Emily, Dow, P. Alex, Garcia-Gathright, Jean, Olteanu, Alexandra, Reed, Stefanie, Sheng, Emily, Vann, Dan, Vaughan, Jennifer Wortman, Vogel, Matthew, Washington, Hannah, and Jacobs, Abigail Z.
- Subjects
Computer Science - Computers and Society
- Abstract
Across academia, industry, and government, there is an increasing awareness that the measurement tasks involved in evaluating generative AI (GenAI) systems are especially difficult. We argue that these measurement tasks are highly reminiscent of measurement tasks found throughout the social sciences. With this in mind, we present a framework, grounded in measurement theory from the social sciences, for measuring concepts related to the capabilities, impacts, opportunities, and risks of GenAI systems. The framework distinguishes between four levels: the background concept, the systematized concept, the measurement instrument(s), and the instance-level measurements themselves. This four-level approach differs from the way measurement is typically done in ML, where researchers and practitioners appear to jump straight from background concepts to measurement instruments, with little to no explicit systematization in between. As well as surfacing assumptions, thereby making it easier to understand exactly what the resulting measurements do and do not mean, this framework has two important implications for evaluating evaluations: First, it can enable stakeholders from different worlds to participate in conceptual debates, broadening the expertise involved in evaluating GenAI systems. Second, it brings rigor to operational debates by offering a set of lenses for interrogating the validity of measurement instruments and their resulting measurements.
- Comment
- NeurIPS 2024 Workshop on Evaluating Evaluations (EvalEval)
- Published
- 2024
7. Supporting Industry Computing Researchers in Assessing, Articulating, and Addressing the Potential Negative Societal Impact of Their Work
- Author
-
Deng, Wesley Hanwen, Barocas, Solon, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
Recent years have witnessed increasing calls for computing researchers to grapple with the societal impacts of their work. Tools such as impact assessments have gained prominence as a method to uncover potential impacts, and a number of publication venues now encourage authors to include an impact statement in their submissions. Despite this push, little is known about the way researchers assess, articulate, and address the potential negative societal impact of their work -- especially in industry settings, where research outcomes are often quickly integrated into products. In addition, while there are nascent efforts to support researchers in this task, there remains a dearth of empirically informed tools and processes. Through interviews with 25 industry computing researchers across different companies and research areas, we first identify four key factors that influence how they grapple with (or choose not to grapple with) the societal impact of their research. To develop an effective impact assessment template tailored to industry computing researchers' needs, we conduct an iterative co-design process with these 25 industry researchers and an additional 16 researchers and practitioners with prior experience and expertise in reviewing and developing impact assessments or broad responsible computing practices. Through the co-design process, we develop 10 design considerations to facilitate the effective design, development, and adaptation of an impact assessment template for use in industry research settings and beyond, as well as our own "Societal Impact Assessment" template with concrete scaffolds. We explore the effectiveness of this template through a user study with 15 industry research interns, revealing both its strengths and limitations. Finally, we discuss the implications for future researchers and organizations seeking to foster more responsible research practices.
- Published
- 2024
- Full Text
- View/download PDF
8. (De)Noise: Moderating the Inconsistency Between Human Decision-Makers
- Author
-
Grgić-Hlača, Nina, Ali, Junaid, Gummadi, Krishna P., and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Computers and Society
- Abstract
Prior research in psychology has found that people's decisions are often inconsistent. An individual's decisions vary across time, and decisions vary even more across people. Inconsistencies have been identified not only in subjective matters, like matters of taste, but also in settings one might expect to be more objective, such as sentencing, job performance evaluations, or real estate appraisals. In our study, we explore whether algorithmic decision aids can be used to moderate the degree of inconsistency in human decision-making in the context of real estate appraisal. In a large-scale human-subject experiment, we study how different forms of algorithmic assistance influence the way that people review and update their estimates of real estate prices. We find that both (i) asking respondents to review their estimates in a series of algorithmically chosen pairwise comparisons and (ii) providing respondents with traditional machine advice are effective strategies for influencing human responses. Compared to simply reviewing initial estimates one by one, the aforementioned strategies lead to (i) a higher propensity to update initial estimates, (ii) a higher accuracy of post-review estimates, and (iii) a higher degree of consistency between the post-review estimates of different respondents. While these effects are more pronounced with traditional machine advice, the approach of reviewing algorithmically chosen pairs can be implemented in a wider range of settings, since it does not require access to ground truth data.
- Comment
- To appear in CSCW 2024
- Published
- 2024
9. 'I'm Not Sure, But...': Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust
- Author
-
Kim, Sunnie S. Y., Liao, Q. Vera, Vorvoreanu, Mihaela, Ballard, Stephanie, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
Widely deployed large language models (LLMs) can produce convincing yet incorrect outputs, potentially misleading users who may rely on them as if they were correct. To reduce such overreliance, there have been calls for LLMs to communicate their uncertainty to end users. However, there has been little empirical work examining how users perceive and act upon LLMs' expressions of uncertainty. We explore this question through a large-scale, pre-registered, human-subject experiment (N=404) in which participants answer medical questions with or without access to responses from a fictional LLM-infused search engine. Using both behavioral and self-reported measures, we examine how different natural language expressions of uncertainty impact participants' reliance, trust, and overall task performance. We find that first-person expressions (e.g., "I'm not sure, but...") decrease participants' confidence in the system and tendency to agree with the system's answers, while increasing participants' accuracy. An exploratory analysis suggests that this increase can be attributed to reduced (but not fully eliminated) overreliance on incorrect answers. While we observe similar effects for uncertainty expressed from a general perspective (e.g., "It's not clear, but..."), these effects are weaker and not statistically significant. Our findings suggest that using natural language expressions of uncertainty may be an effective approach for reducing overreliance on LLMs, but that the precise language used matters. This highlights the importance of user testing before deploying LLMs at scale.
- Comment
- Accepted to FAccT 2024. This version includes the appendix
- Published
- 2024
- Full Text
- View/download PDF
10. Canvil: Designerly Adaptation for LLM-Powered User Experiences
- Author
-
Feng, K. J. Kevin, Liao, Q. Vera, Xiao, Ziang, Vaughan, Jennifer Wortman, Zhang, Amy X., and McDonald, David W.
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
Advancements in large language models (LLMs) are sparking a proliferation of LLM-powered user experiences (UX). In product teams, designers often craft UX to meet user needs, but it is unclear how they engage with LLMs as a novel design material. Through a formative study with 12 designers, we find that designers seek a translational mechanism that enables design requirements to shape and be shaped by LLM behavior, motivating a need for designerly adaptation to facilitate this translation. We then built Canvil, a Figma widget that operationalizes designerly adaptation. We used Canvil as a technology probe in a group-based design study (6 groups, N=17), finding that designers constructively iterated on both adaptation approaches and interface designs to enhance end-user interaction with LLMs. Furthermore, designers identified promising collaborative workflows for designerly adaptation. Our work opens new avenues for processes and tools that foreground designers' user-centered expertise in LLM-powered applications. Canvil is available for public use at https://www.figma.com/community/widget/1277396720888327660.
- Published
- 2024
11. Open Datasheets: Machine-readable Documentation for Open Datasets and Responsible AI Assessments
- Author
-
Roman, Anthony Cintron, Vaughan, Jennifer Wortman, See, Valerie, Ballard, Steph, Torres, Jehu, Robinson, Caleb, and Ferres, Juan M. Lavista
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
- Abstract
This paper introduces a no-code, machine-readable documentation framework for open datasets, with a focus on responsible AI (RAI) considerations. The framework aims to improve the comprehensibility and usability of open datasets, facilitating easier discovery and use, better understanding of content and context, and evaluation of dataset quality and accuracy. The proposed framework is designed to streamline the evaluation of datasets, helping researchers, data scientists, and other open data users quickly identify datasets that meet their needs and organizational policies or regulations. The paper also discusses the implementation of the framework and provides recommendations to maximize its potential. The framework is expected to enhance the quality and reliability of data used in research and decision-making, fostering the development of more responsible and trustworthy AI systems.
- Published
- 2023
12. Revitalization of a Forward Genetic Screen Identifies Three New Regulators of Fungal Secondary Metabolism in the Genus Aspergillus
- Author
-
Brandon T. Pfannenstiel, Xixi Zhao, Jennifer Wortman, Philipp Wiemann, Kurt Throckmorton, Joseph E. Spraker, Alexandra A. Soukup, Xingyu Luo, Daniel L. Lindner, Fang Yun Lim, Benjamin P. Knox, Brian Haas, Gregory J. Fischer, Tsokyi Choera, Robert A. E. Butchko, Jin-Woo Bok, Katharyn J. Affeldt, Nancy P. Keller, and Jonathan M. Palmer
- Subjects
Aspergillus nidulans, forward genetics, whole-genome sequencing, secondary metabolism, Microbiology, QR1-502
- Abstract
The study of aflatoxin in Aspergillus spp. has garnered the attention of many researchers due to aflatoxin’s carcinogenic properties and frequency as a food and feed contaminant. Significant progress has been made by utilizing the model organism Aspergillus nidulans to characterize the regulation of sterigmatocystin (ST), the penultimate precursor of aflatoxin. A previous forward genetic screen identified 23 A. nidulans mutants involved in regulating ST production. Six mutants were characterized from this screen using classical mapping (five mutations in mcsA) and complementation with a cosmid library (one mutation in laeA). The remaining mutants were backcrossed and sequenced using Illumina and Ion Torrent sequencing platforms. All but one mutant contained one or more sequence variants in predicted open reading frames. Deletion of these genes resulted in identification of mutant alleles responsible for the loss of ST production in 12 of the 17 remaining mutants. Eight of these mutations were in genes already known to affect ST synthesis (laeA, mcsA, fluG, and stcA), while the remaining four mutations (in laeB, sntB, and hamI) were in previously uncharacterized genes not known to be involved in ST production. Deletion of laeB, sntB, and hamI in A. flavus results in loss of aflatoxin production, confirming that these regulators are conserved in the aflatoxigenic aspergilli. This report highlights the multifaceted regulatory mechanisms governing secondary metabolism in Aspergillus. Additionally, these data contribute to the increasing number of studies showing that forward genetic screens of fungi coupled with whole-genome resequencing is a robust and cost-effective technique.
IMPORTANCE: In a postgenomic world, reverse genetic approaches have displaced their forward genetic counterparts. The techniques used in forward genetics to identify loci of interest were typically very cumbersome and time-consuming, relying on Mendelian traits in model organisms. 
The current work was pursued not only to identify alleles involved in regulation of secondary metabolism but also to demonstrate a return to forward genetics to track phenotypes and to discover genetic pathways that could not be predicted through a reverse genetics approach. While identification of mutant alleles from whole-genome sequencing has been done before, here we illustrate the possibility of coupling this strategy with a genetic screen to identify multiple alleles of interest. Sequencing of classically derived mutants revealed several uncharacterized genes, which represent novel pathways to regulate and control the biosynthesis of sterigmatocystin and of aflatoxin, a societally and medically important mycotoxin.
- Published
- 2017
- Full Text
- View/download PDF
13. Has the Machine Learning Review Process Become More Arbitrary as the Field Has Grown? The NeurIPS 2021 Consistency Experiment
- Author
-
Beygelzimer, Alina, Dauphin, Yann N., Liang, Percy, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Machine Learning, Computer Science - Digital Libraries
- Abstract
We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process. We observe that the two committees disagree on their accept/reject recommendations for 23% of the papers and that, consistent with the results from 2014, approximately half of the list of accepted papers would change if the review process were randomly rerun. Our analysis suggests that making the conference more selective would increase the arbitrariness of the process. Taken together with previous research, our results highlight the inherent difficulty of objectively measuring the quality of research, and suggest that authors should not be excessively discouraged by rejected work.
- Published
- 2023
14. AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
- Author
-
Liao, Q. Vera and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
The rise of powerful large language models (LLMs) brings about tremendous opportunities for innovation but also looming risks for individuals and society at large. We have reached a pivotal moment for ensuring that LLMs and LLM-infused applications are developed and deployed responsibly. However, a central pillar of responsible AI -- transparency -- is largely missing from the current discourse around LLMs. It is paramount to pursue new approaches to provide transparency for LLMs, and years of research at the intersection of AI and human-computer interaction (HCI) highlight that we must do so with a human-centered perspective: Transparency is fundamentally about supporting appropriate human understanding, and this understanding is sought by different stakeholders with different goals in different contexts. In this new era of LLMs, we must develop and design approaches to transparency by considering the needs of stakeholders in the emerging LLM ecosystem, the novel types of LLM-infused applications being built, and the new usage patterns and challenges around LLMs, all while building on lessons learned about how people process, interact with, and make use of information. We reflect on the unique challenges that arise in providing transparency for LLMs, along with lessons learned from HCI and responsible AI research that has taken a human-centered perspective on AI transparency. We then lay out four common approaches that the community has taken to achieve transparency -- model reporting, publishing evaluation results, providing explanations, and communicating uncertainty -- and call out open questions around how these approaches may or may not be applied to LLMs. We hope this provides a starting point for discussion and a useful roadmap for future research.
- Published
- 2023
15. GAM Coach: Towards Interactive and User-centered Algorithmic Recourse
- Author
-
Wang, Zijie J., Vaughan, Jennifer Wortman, Caruana, Rich, and Chau, Duen Horng
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
- Abstract
Machine learning (ML) recourse techniques are increasingly used in high-stakes domains, providing end users with actions to alter ML predictions, but they assume ML developers understand what input variables can be changed. However, a recourse plan's actionability is subjective and unlikely to match developers' expectations completely. We present GAM Coach, a novel open-source system that adapts integer linear programming to generate customizable counterfactual explanations for Generalized Additive Models (GAMs), and leverages interactive visualizations to enable end users to iteratively generate recourse plans meeting their needs. A quantitative user study with 41 participants shows our tool is usable and useful, and users prefer personalized recourse plans over generic plans. Through a log analysis, we explore how users discover satisfactory recourse plans, and provide empirical evidence that transparency can lead to more opportunities for everyday users to discover counterintuitive patterns in ML models. GAM Coach is available at: https://poloclub.github.io/gam-coach/.
- Comment
- Accepted to CHI 2023. 20 pages, 12 figures. For a demo video, see https://youtu.be/ubacP34H9XE. For a live demo, visit https://poloclub.github.io/gam-coach/
- Published
- 2023
- Full Text
- View/download PDF
16. Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience
- Author
-
Liao, Q. Vera, Subramonyam, Hariharan, Wang, Jennifer, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
Despite the widespread use of artificial intelligence (AI), designing user experiences (UX) for AI-powered systems remains challenging. UX designers face hurdles understanding AI technologies, such as pre-trained language models, as design materials. This limits their ability to ideate and make decisions about whether, where, and how to use AI. To address this problem, we bridge the literature on AI design and AI transparency to explore whether and how frameworks for transparent model reporting can support design ideation with pre-trained models. By interviewing 23 UX practitioners, we find that practitioners frequently work with pre-trained models, but lack support for UX-led ideation. Through a scenario-based design task, we identify common goals that designers seek model understanding for and pinpoint their model transparency information needs. Our study highlights the pivotal role that UX designers can play in Responsible AI and calls for supporting their understanding of AI limitations through model transparency and interrogation.
- Comment
- Accepted at ACM CHI Conference on Human Factors in Computing Systems (CHI 2023)
- Published
- 2023
17. Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions
- Author
-
Vasconcelos, Helena, Bansal, Gagan, Fourney, Adam, Liao, Q. Vera, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
Large-scale generative models have enabled the development of AI-powered code completion tools to assist programmers in writing code. However, much like other AI-powered tools, AI-powered code completions are not always accurate, potentially introducing bugs or even security vulnerabilities into code if not properly detected and corrected by a human programmer. One technique that has been proposed and implemented to help programmers identify potential errors is to highlight uncertain tokens. However, there have been no empirical studies exploring the effectiveness of this technique -- nor investigating the different and not-yet-agreed-upon notions of uncertainty in the context of generative models. We explore the question of whether conveying information about uncertainty enables programmers to more quickly and accurately produce code when collaborating with an AI-powered code completion tool, and if so, what measure of uncertainty best fits programmers' needs. Through a mixed-methods study with 30 programmers, we compare three conditions: providing the AI system's code completion alone, highlighting tokens with the lowest likelihood of being generated by the underlying generative model, and highlighting tokens with the highest predicted likelihood of being edited by a programmer. We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits, and is subjectively preferred by study participants. In contrast, highlighting tokens according to their probability of being generated does not provide any benefit over the baseline with no highlighting. We further explore the design space of how to convey uncertainty in AI-powered code completion tools, and find that programmers prefer highlights that are granular, informative, interpretable, and not overwhelming.
- Comment
- ACM Transactions on Computer-Human Interaction (TOCHI) 2024
- Published
- 2023
- Full Text
- View/download PDF
18. Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations
- Author
-
Chen, Valerie, Liao, Q. Vera, Vaughan, Jennifer Wortman, and Bansal, Gagan
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
AI explanations are often mentioned as a way to improve human-AI decision-making, but empirical studies have not found consistent evidence of explanations' effectiveness and, on the contrary, suggest that they can increase overreliance when the AI system is wrong. While many factors may affect reliance on AI support, one important factor is how decision-makers reconcile their own intuition -- beliefs or heuristics, based on prior knowledge, experience, or pattern recognition, used to make judgments -- with the information provided by the AI system to determine when to override AI predictions. We conduct a think-aloud, mixed-methods study with two explanation types (feature- and example-based) for two prediction tasks to explore how decision-makers' intuition affects their use of AI predictions and explanations, and ultimately their choice of when to rely on AI. Our results identify three types of intuition involved in reasoning about AI predictions and explanations: intuition about the task outcome, features, and AI limitations. Building on these, we summarize three observed pathways for decision-makers to apply their own intuition and override AI predictions. We use these pathways to explain why (1) the feature-based explanations we used did not improve participants' decision outcomes and increased their overreliance on AI, and (2) the example-based explanations we used improved decision-makers' performance over feature-based explanations and helped achieve complementary human-AI performance. Overall, our work identifies directions for further development of AI decision-support systems and explanation methods that help decision-makers effectively apply their intuition to achieve appropriate reliance on AI.
- Comment
- To appear in CSCW 2023
- Published
- 2023
19. Evolution of Extensively Drug-Resistant Tuberculosis over Four Decades: Whole Genome Sequencing and Dating Analysis of Mycobacterium tuberculosis Isolates from KwaZulu-Natal.
- Author
-
Keira A Cohen, Thomas Abeel, Abigail Manson McGuire, Christopher A Desjardins, Vanisha Munsamy, Terrance P Shea, Bruce J Walker, Nonkqubela Bantubani, Deepak V Almeida, Lucia Alvarado, Sinéad B Chapman, Nomonde R Mvelase, Eamon Y Duffy, Michael G Fitzgerald, Pamla Govender, Sharvari Gujja, Susanna Hamilton, Clinton Howarth, Jeffrey D Larimer, Kashmeel Maharaj, Matthew D Pearson, Margaret E Priest, Qiandong Zeng, Nesri Padayatchi, Jacques Grosset, Sarah K Young, Jennifer Wortman, Koleka P Mlisana, Max R O'Donnell, Bruce W Birren, William R Bishai, Alexander S Pym, and Ashlee M Earl
- Subjects
Medicine - Abstract
Background: The continued advance of antibiotic resistance threatens the treatment and control of many infectious diseases. This is exemplified by the largest global outbreak of extensively drug-resistant (XDR) tuberculosis (TB) identified in Tugela Ferry, KwaZulu-Natal, South Africa, in 2005 that continues today. It is unclear whether the emergence of XDR-TB in KwaZulu-Natal was due to recent inadequacies in TB control in conjunction with HIV or other factors. Understanding the origins of drug resistance in this fatal outbreak of XDR will inform the control and prevention of drug-resistant TB in other settings. In this study, we used whole genome sequencing and dating analysis to determine if XDR-TB had emerged recently or had ancient antecedents. Methods and findings: We performed whole genome sequencing and drug susceptibility testing on 337 clinical isolates of Mycobacterium tuberculosis collected in KwaZulu-Natal from 2008 to 2013, in addition to three historical isolates, collected from patients in the same province and including an isolate from the 2005 Tugela Ferry XDR outbreak, a multidrug-resistant (MDR) isolate from 1994, and a pansusceptible isolate from 1995. We utilized an array of whole genome comparative techniques to assess the relatedness among strains, to establish the order of acquisition of drug resistance mutations, including the timing of acquisitions leading to XDR-TB in the LAM4 spoligotype, and to calculate the number of independent evolutionary emergences of MDR and XDR. Our sequencing and analysis revealed a 50-member clone of XDR M. tuberculosis that was highly related to the Tugela Ferry XDR outbreak strain. 
We estimated that mutations conferring isoniazid and streptomycin resistance in this clone were acquired 50 y prior to the Tugela Ferry outbreak (katG S315T [isoniazid]; gidB 130 bp deletion [streptomycin]; 1957 [95% highest posterior density (HPD): 1937-1971]), with the subsequent emergence of MDR and XDR occurring 20 y (rpoB L452P [rifampicin]; pncA 1 bp insertion [pyrazinamide]; 1984 [95% HPD: 1974-1992]) and 10 y (rpoB D435G [rifampicin]; rrs 1400 [kanamycin]; gyrA A90V [ofloxacin]; 1995 [95% HPD: 1988-1999]) prior to the outbreak, respectively. We observed frequent de novo evolution of MDR and XDR, with 56 and nine independent evolutionary events, respectively. Isoniazid resistance evolved before rifampicin resistance 46 times, whereas rifampicin resistance evolved prior to isoniazid only twice. We identified additional putative compensatory mutations to rifampicin in this dataset. One major limitation of this study is that the conclusions with respect to ordering and timing of acquisition of mutations may not represent universal patterns of drug resistance emergence in other areas of the globe. Conclusions: In the first whole genome-based analysis of the emergence of drug resistance among clinical isolates of M. tuberculosis, we show that the ancestral precursor of the LAM4 XDR outbreak strain in Tugela Ferry gained mutations to first-line drugs at the beginning of the antibiotic era. Subsequent accumulation of stepwise resistance mutations, occurring over decades and prior to the explosion of HIV in this region, yielded MDR and XDR, permitting the emergence of compensatory mutations. Our results suggest that drug-resistant strains circulating today reflect not only vulnerabilities of current TB control efforts but also those that date back 50 y. 
In drug-resistant TB, isoniazid resistance was overwhelmingly the initial resistance mutation to be acquired, which would not be detected by current rapid molecular diagnostics employed in South Africa that assess only rifampicin resistance.
- Published
- 2015
- Full Text
- View/download PDF
20. How do Authors' Perceptions of their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?
- Author
-
Rastogi, Charvi, Stelmakh, Ivan, Beygelzimer, Alina, Dauphin, Yann N., Liang, Percy, Vaughan, Jennifer Wortman, Xue, Zhenyu, Daumé III, Hal, Pierson, Emma, and Shah, Nihar B.
- Subjects
Computer Science - Machine Learning ,Computer Science - Databases ,Computer Science - Digital Libraries - Abstract
How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based on scientific contribution, and (iii) the change in their perception about their own papers after seeing the reviews. The salient results are: (1) Authors have roughly a three-fold overestimate of the acceptance probability of their papers: The median prediction is 70% for an approximately 25% acceptance rate. (2) Female authors exhibit a marginally higher (statistically significant) miscalibration than male authors; predictions of authors invited to serve as meta-reviewers or reviewers are similarly calibrated, but better than authors who were not invited to review. (3) Authors' relative ranking of the scientific contribution of two submissions they made generally agrees (93%) with their predicted acceptance probabilities, but there is a notable 7% of responses where authors think their better paper will face a worse outcome. (4) The author-provided rankings disagreed with the peer-review decisions about a third of the time; when co-authors ranked their jointly authored papers, co-authors disagreed at a similar rate -- about a third of the time. (5) At least 30% of respondents of both accepted and rejected papers said that their perception of their own paper improved after the review process. The stakeholders in peer review should take these findings into account in setting their expectations from peer review.
- Published
- 2022
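The "three-fold overestimate" reported in the abstract above is a simple ratio of the median predicted acceptance probability to the actual acceptance rate; a minimal sketch of that arithmetic, using the two figures the abstract gives:

```python
# Figures from the abstract: median predicted acceptance probability vs.
# the approximate NeurIPS 2021 acceptance rate.
median_prediction = 0.70
acceptance_rate = 0.25

# Ratio of predicted to actual: ~2.8x, i.e. roughly a three-fold overestimate.
overestimate_factor = median_prediction / acceptance_rate
print(f"Overestimate factor: {overestimate_factor:.1f}x")
```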
21. Interpretable Distribution Shift Detection using Optimal Transport
- Author
-
Hulkund, Neha, Fusi, Nicolo, Vaughan, Jennifer Wortman, and Alvarez-Melis, David
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
We propose a method to identify and characterize distribution shifts in classification datasets based on optimal transport. It allows the user to identify the extent to which each class is affected by the shift, and retrieves corresponding pairs of samples to provide insights on its nature. We illustrate its use on synthetic and natural shift examples. While the results we present are preliminary, we hope that this inspires future work on interpretable methods for analyzing distribution shifts., Comment: Presented at ICML 2022 DataPerf Workshop
- Published
- 2022
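The abstract above describes measuring, per class, how strongly a dataset is affected by a distribution shift. A heavily simplified, one-dimensional sketch of that idea (not the paper's method, which uses general optimal transport; `per_class_shift` and the synthetic data are invented for illustration) can be written with SciPy's 1-D Wasserstein distance:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def per_class_shift(x_src, y_src, x_tgt, y_tgt):
    """1-D Wasserstein distance between source and target samples, per class."""
    return {c: wasserstein_distance(x_src[y_src == c], x_tgt[y_tgt == c])
            for c in np.unique(y_src)}

# Synthetic example: shift only class 1 in the target data.
rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, 1000)
y_src = rng.integers(0, 2, 1000)
x_tgt = x_src + np.where(y_src == 1, 2.0, 0.0)

shifts = per_class_shift(x_src, y_src, x_tgt, y_src)
# shifts[1] is large (class 1 was translated by 2), shifts[0] is ~0.
```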
22. Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values
- Author
-
Wang, Zijie J., Kale, Alex, Nori, Harsha, Stella, Peter, Nunnally, Mark E., Chau, Duen Horng, Vorvoreanu, Mihaela, Vaughan, Jennifer Wortman, and Caruana, Rich
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions--potentially causing harms once deployed. However, how to take action to address these patterns is not always clear. In a collaboration between ML and human-computer interaction researchers, physicians, and data scientists, we develop GAM Changer, the first interactive system to help domain experts and data scientists easily and responsibly edit Generalized Additive Models (GAMs) and fix problematic patterns. With novel interaction techniques, our tool puts interpretability into action--empowering users to analyze, validate, and align model behaviors with their knowledge and values. Physicians have started to use our tool to investigate and fix pneumonia and sepsis risk prediction models, and an evaluation with 7 data scientists working in diverse domains highlights that our tool is easy to use, meets their model editing needs, and fits into their current workflows. Built with modern web technologies, our tool runs locally in users' web browsers or computational notebooks, lowering the barrier to use. GAM Changer is available at the following public demo link: https://interpret.ml/gam-changer., Comment: Accepted at KDD 2022. 11 pages, 19 figures. For a demo video, see https://youtu.be/D6whtfInqTc. For a live demo, visit https://interpret.ml/gam-changer
- Published
- 2022
- Full Text
- View/download PDF
23. Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata
- Author
-
Heger, Amy K., Marquis, Liz B., Vorvoreanu, Mihaela, Wallach, Hanna, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Artificial Intelligence - Abstract
Data is central to the development and evaluation of machine learning (ML) models. However, the use of problematic or inappropriate datasets can result in harms when the resulting models are deployed. To encourage responsible AI practice through more deliberate reflection on datasets and transparency around the processes by which they are created, researchers and practitioners have begun to advocate for increased data documentation and have proposed several data documentation frameworks. However, there is little research on whether these data documentation frameworks meet the needs of ML practitioners, who both create and consume datasets. To address this gap, we set out to understand ML practitioners' data documentation perceptions, needs, challenges, and desiderata, with the goal of deriving design requirements that can inform future data documentation frameworks. We conducted a series of semi-structured interviews with 14 ML practitioners at a single large, international technology company. We had them answer a list of questions taken from datasheets for datasets (Gebru, 2021). Our findings show that current approaches to data documentation are largely ad hoc and myopic in nature. Participants expressed needs for data documentation frameworks to be adaptable to their contexts, integrated into their existing tools and workflows, and automated wherever possible. Despite the fact that data documentation frameworks are often motivated from the perspective of responsible AI, participants did not make the connection between the questions that they were asked to answer and their responsible AI implications. In addition, participants often had difficulties prioritizing the needs of dataset consumers and providing information that someone unfamiliar with their datasets might need to know. Based on these findings, we derive seven design requirements for future data documentation frameworks., Comment: Camera-ready preprint of paper accepted to CSCW 2022
- Published
- 2022
24. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement.
- Author
-
Bruce J Walker, Thomas Abeel, Terrance Shea, Margaret Priest, Amr Abouelliel, Sharadha Sakthikumar, Christina A Cuomo, Qiandong Zeng, Jennifer Wortman, Sarah K Young, and Ashlee M Earl
- Subjects
Medicine ,Science - Abstract
Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3-5 kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy as compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants including duplications and resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.
- Published
- 2014
- Full Text
- View/download PDF
25. Standardized metadata for human pathogen/vector genomic sequences.
- Author
-
Vivien G Dugan, Scott J Emrich, Gloria I Giraldo-Calderón, Omar S Harb, Ruchi M Newman, Brett E Pickett, Lynn M Schriml, Timothy B Stockwell, Christian J Stoeckert, Dan E Sullivan, Indresh Singh, Doyle V Ward, Alison Yao, Jie Zheng, Tanya Barrett, Bruce Birren, Lauren Brinkac, Vincent M Bruno, Elizabet Caler, Sinéad Chapman, Frank H Collins, Christina A Cuomo, Valentina Di Francesco, Scott Durkin, Mark Eppinger, Michael Feldgarden, Claire Fraser, W Florian Fricke, Maria Giovanni, Matthew R Henn, Erin Hine, Julie Dunning Hotopp, Ilene Karsch-Mizrachi, Jessica C Kissinger, Eun Mi Lee, Punam Mathur, Emmanuel F Mongodin, Cheryl I Murphy, Garry Myers, Daniel E Neafsey, Karen E Nelson, William C Nierman, Julia Puzak, David Rasko, David S Roos, Lisa Sadzewicz, Joana C Silva, Bruno Sobral, R Burke Squires, Rick L Stevens, Luke Tallon, Herve Tettelin, David Wentworth, Owen White, Rebecca Will, Jennifer Wortman, Yun Zhang, and Richard H Scheuermann
- Subjects
Medicine ,Science - Abstract
High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant.
- Published
- 2014
- Full Text
- View/download PDF
26. REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research
- Author
-
Smith, Jessie J., Amershi, Saleema, Barocas, Solon, Wallach, Hanna, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computers and Society - Abstract
Transparency around limitations can improve the scientific rigor of research, help ensure appropriate interpretation of research findings, and make research claims more credible. Despite these benefits, the machine learning (ML) research community lacks well-developed norms around disclosing and discussing limitations. To address this gap, we conduct an iterative design process with 30 ML and ML-adjacent researchers to develop and test REAL ML, a set of guided activities to help ML researchers recognize, explore, and articulate the limitations of their research. Using a three-stage interview and survey study, we identify ML researchers' perceptions of limitations, as well as the challenges they face when recognizing, exploring, and articulating limitations. We develop REAL ML to address some of these practical challenges, and highlight additional cultural challenges that will require broader shifts in community norms to address. We hope our study and REAL ML help move the ML research community toward more active and appropriate engagement with limitations., Comment: This work appears in the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22)
- Published
- 2022
- Full Text
- View/download PDF
27. Emergence of Epidemic Multidrug-Resistant Enterococcus faecium from Animal and Commensal Strains
- Author
-
François Lebreton, Willem van Schaik, Abigail Manson McGuire, Paul Godfrey, Allison Griggs, Varun Mazumdar, Jukka Corander, Lu Cheng, Sakina Saif, Sarah Young, Qiandong Zeng, Jennifer Wortman, Bruce Birren, Rob J. L. Willems, Ashlee M. Earl, and Michael S. Gilmore
- Subjects
Microbiology ,QR1-502 - Abstract
ABSTRACT Enterococcus faecium, natively a gut commensal organism, emerged as a leading cause of multidrug-resistant hospital-acquired infection in the 1980s. As the living record of its adaptation to changes in habitat, we sequenced the genomes of 51 strains, isolated from various ecological environments, to understand how E. faecium emerged as a leading hospital pathogen. Because of the scale and diversity of the sampled strains, we were able to resolve the lineage responsible for epidemic, multidrug-resistant human infection from other strains and to measure the evolutionary distances between groups. We found that the epidemic hospital-adapted lineage is rapidly evolving and emerged approximately 75 years ago, concomitant with the introduction of antibiotics, from a population that included the majority of animal strains, and not from human commensal lines. We further found that the lineage that included most strains of animal origin diverged from the main human commensal line approximately 3,000 years ago, a time that corresponds to increasing urbanization of humans, development of hygienic practices, and domestication of animals, which we speculate contributed to their ecological separation. Each bifurcation was accompanied by the acquisition of new metabolic capabilities and colonization traits on mobile elements and the loss of function and genome remodeling associated with mobile element insertion and movement. As a result, diversity within the species, in terms of sequence divergence as well as gene content, spans a range usually associated with speciation. IMPORTANCE Enterococci, in particular vancomycin-resistant Enterococcus faecium, recently emerged as a leading cause of hospital-acquired infection worldwide. In this study, we examined genome sequence data to understand the bacterial adaptations that accompanied this transformation from microbes that existed for eons as members of host microbiota. 
We observed changes in the genomes that paralleled changes in human behavior. An initial bifurcation within the species appears to have occurred at a time that corresponds to the urbanization of humans and domestication of animals, and a more recent bifurcation parallels the introduction of antibiotics in medicine and agriculture. In response to the opportunity to fill niches associated with changes in human activity, a rapidly evolving lineage emerged, a lineage responsible for the vast majority of multidrug-resistant E. faecium infections.
- Published
- 2013
- Full Text
- View/download PDF
28. Comparative Genomics of Vancomycin-Resistant Staphylococcus aureus Strains and Their Positions within the Clade Most Commonly Associated with Methicillin-Resistant S. aureus Hospital-Acquired Infection in the United States
- Author
-
Veronica N. Kos, Christopher A. Desjardins, Allison Griggs, Gustavo Cerqueira, Andries Van Tonder, Matthew T. G. Holden, Paul Godfrey, Kelli L. Palmer, Kip Bodi, Emmanuel F. Mongodin, Jennifer Wortman, Michael Feldgarden, Trevor Lawley, Steven R. Gill, Brian J. Haas, Bruce Birren, and Michael S. Gilmore
- Subjects
Microbiology ,QR1-502 - Abstract
ABSTRACT Methicillin-resistant Staphylococcus aureus (MRSA) strains are leading causes of hospital-acquired infections in the United States, and clonal cluster 5 (CC5) is the predominant lineage responsible for these infections. Since 2002, there have been 12 cases of vancomycin-resistant S. aureus (VRSA) infection in the United States—all CC5 strains. To understand this genetic background and what distinguishes it from other lineages, we generated and analyzed high-quality draft genome sequences for all available VRSA strains. Sequence comparisons show unambiguously that each strain independently acquired Tn1546 and that all VRSA strains last shared a common ancestor over 50 years ago, well before the occurrence of vancomycin resistance in this species. In contrast to existing hypotheses on what predisposes this lineage to acquire Tn1546, the barrier posed by restriction systems appears to be intact in most VRSA strains. However, VRSA (and other CC5) strains were found to possess a constellation of traits that appears to be optimized for proliferation in precisely the types of polymicrobic infection where transfer could occur. They lack a bacteriocin operon that would be predicted to limit the occurrence of non-CC5 strains in mixed infection and harbor a cluster of unique superantigens and lipoproteins to confound host immunity. A frameshift in dprA, which in other microbes influences uptake of foreign DNA, may also make this lineage conducive to foreign DNA acquisition. IMPORTANCE Invasive methicillin-resistant Staphylococcus aureus (MRSA) infection now ranks among the leading causes of death in the United States. Vancomycin is a key last-line bactericidal drug for treating these infections. However, since 2002, vancomycin resistance has entered this species. Of the now 12 cases of vancomycin-resistant S. aureus (VRSA), each was believed to represent a new acquisition of the vancomycin-resistant transposon Tn1546 from enterococcal donors. 
All acquisitions of Tn1546 so far have occurred in MRSA strains of the clonal cluster 5 genetic background, the most common hospital lineage causing hospital-acquired MRSA infection. To understand the nature of these strains, we determined and examined the nucleotide sequences of the genomes of all available VRSA. Genome comparison identified candidate features that position strains of this lineage well for acquiring resistance to antibiotics in mixed infection.
- Published
- 2012
- Full Text
- View/download PDF
29. Comparative Genomics of Enterococci: Variation in Enterococcus faecalis, Clade Structure in E. faecium, and Defining Characteristics of E. gallinarum and E. casseliflavus
- Author
-
Kelli L. Palmer, Paul Godfrey, Allison Griggs, Veronica N. Kos, Jeremy Zucker, Christopher Desjardins, Gustavo Cerqueira, Dirk Gevers, Suzanne Walker, Jennifer Wortman, Michael Feldgarden, Brian Haas, Bruce Birren, and Michael S. Gilmore
- Subjects
Microbiology ,QR1-502 - Abstract
ABSTRACT The enterococci are Gram-positive lactic acid bacteria that inhabit the gastrointestinal tracts of diverse hosts. However, Enterococcus faecium and E. faecalis have emerged as leading causes of multidrug-resistant hospital-acquired infections. The mechanism by which a well-adapted commensal evolved into a hospital pathogen is poorly understood. In this study, we examined high-quality draft genome data for evidence of key events in the evolution of the leading causes of enterococcal infections, including E. faecalis, E. faecium, E. casseliflavus, and E. gallinarum. We characterized two clades within what is currently classified as E. faecium and identified traits characteristic of each, including variation in operons for cell wall carbohydrate and putative capsule biosynthesis. We examined the extent of recombination between the two E. faecium clades and identified two strains with mosaic genomes. We determined the underlying genetics for the defining characteristics of the motile enterococci E. casseliflavus and E. gallinarum. Further, we identified species-specific traits that could be used to advance the detection of medically relevant enterococci and their identification to the species level. IMPORTANCE The enterococci, in particular, vancomycin-resistant enterococci, have emerged as leading causes of multidrug-resistant hospital-acquired infections. In this study, we examined genome sequence data to define traits with the potential to influence host-microbe interactions and to identify sequences and biochemical functions that could form the basis for the rapid identification of enterococcal species or lineages of importance in clinical and environmental samples.
- Published
- 2012
- Full Text
- View/download PDF
30. Assessing the Fairness of AI Systems: AI Practitioners' Processes, Challenges, and Needs for Support
- Author
-
Madaio, Michael, Egede, Lisa, Subramonyam, Hariharan, Vaughan, Jennifer Wortman, and Wallach, Hanna
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computers and Society ,Computer Science - Human-Computer Interaction - Abstract
Various tools and practices have been developed to support practitioners in identifying, assessing, and mitigating fairness-related harms caused by AI systems. However, prior research has highlighted gaps between the intended design of these tools and practices and their use within particular contexts, including gaps caused by the role that organizational factors play in shaping fairness work. In this paper, we investigate these gaps for one such practice: disaggregated evaluations of AI systems, intended to uncover performance disparities between demographic groups. By conducting semi-structured interviews and structured workshops with thirty-three AI practitioners from ten teams at three technology companies, we identify practitioners' processes, challenges, and needs for support when designing disaggregated evaluations. We find that practitioners face challenges when choosing performance metrics, identifying the most relevant direct stakeholders and demographic groups on which to focus, and collecting datasets with which to conduct disaggregated evaluations. More generally, we identify impacts on fairness work stemming from a lack of engagement with direct stakeholders or domain experts, business imperatives that prioritize customers over marginalized groups, and the drive to deploy AI systems at scale., Comment: Camera-ready preprint of paper accepted to the CSCW conference
- Published
- 2021
31. GAM Changer: Editing Generalized Additive Models with Interactive Visualization
- Author
-
Wang, Zijie J., Kale, Alex, Nori, Harsha, Stella, Peter, Nunnally, Mark, Chau, Duen Horng, Vorvoreanu, Mihaela, Vaughan, Jennifer Wortman, and Caruana, Rich
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
Recent strides in interpretable machine learning (ML) research reveal that models exploit undesirable patterns in the data to make predictions, which potentially causes harms in deployment. However, it is unclear how we can fix these models. We present our ongoing work, GAM Changer, an open-source interactive system to help data scientists and domain experts easily and responsibly edit their Generalized Additive Models (GAMs). With novel visualization techniques, our tool puts interpretability into action -- empowering human users to analyze, validate, and align model behaviors with their knowledge and values. Built using modern web technologies, our tool runs locally in users' computational notebooks or web browsers without requiring extra compute resources, lowering the barrier to creating more responsible ML models. GAM Changer is available at https://interpret.ml/gam-changer., Comment: 7 pages, 15 figures, accepted to the Research2Clinics workshop at NeurIPS 2021. For a demo video, see https://youtu.be/2gVSoPoSeJ8. For a live demo, visit https://interpret.ml/gam-changer/
- Published
- 2021
32. (De)Noise: Moderating the Inconsistency Between Human Decision-Makers.
- Author
-
Nina Grgic-Hlaca, Junaid Ali, Krishna P. Gummadi, and Jennifer Wortman Vaughan
- Published
- 2024
- Full Text
- View/download PDF
33. An Equivalence Between Fair Division and Wagering Mechanisms.
- Author
-
Rupert Freeman, Jens Witkowski, Jennifer Wortman Vaughan, and David M. Pennock
- Published
- 2024
- Full Text
- View/download PDF
34. Tinker, Tailor, Configure, Customize: The Articulation Work of Contextualizing an AI Fairness Checklist.
- Author
-
Michael A. Madaio, Jingya Chen, Hanna M. Wallach, and Jennifer Wortman Vaughan
- Published
- 2024
- Full Text
- View/download PDF
35. Evolution of extensively drug-resistant tuberculosis over four decades revealed by whole genome sequencing of Mycobacterium tuberculosis from KwaZulu-Natal, South Africa
- Author
-
Keira A Cohen, Thomas Abeel, Abigail Manson McGuire, Christopher A Desjardins, Vanisha Munsamy, Terrance P Shea, Bruce J Walker, Nonkqubela Bantubani, Deepak V Almeida, Lucia Alvarado, Sinead Chapman, Nomonde R Mvelase, Eamon Y Duffy, Michael G Fitzgerald, Pamla Govender, Sharvari Gujja, Susanna Hamilton, Clint Howarth, Jeffrey D Larimer, Kashmeel Maharaj, Matthew D Pearson, Margaret E Priest, Qiandong Zeng, Nesri Padayatchi, Jacques Grosset, Sarah K Young, Jennifer Wortman, Koleka P Mlisana, Max R O’Donnell, Bruce W Birren, William R Bishai, Alexander S Pym, and Ashlee M Earl
- Subjects
Tuberculosis ,South Africa ,Microbiology ,QR1-502 - Abstract
The largest global outbreak of extensively drug-resistant (XDR) tuberculosis (TB) was identified in Tugela Ferry, KwaZulu-Natal (KZN), South Africa in 2005. The antecedents and timing of the emergence of drug resistance in this fatal epidemic XDR outbreak are unknown, and it is unclear whether drug resistance in this region continues to be driven by clonal spread or by the development of de novo resistance. Whole genome sequencing and drug susceptibility testing (DST) were performed on 337 clinical isolates of Mycobacterium tuberculosis (M.tb) collected in KZN from 2008 to 2013, in addition to three historical isolates, one of which was isolated during the Tugela Ferry outbreak. Using a variety of whole genome comparative approaches, 11 drug-resistant clones of M.tb circulating from 2008 to 2013 were identified, including a 50-member clone of XDR M.tb that was highly related to the Tugela Ferry XDR outbreak strain. The evolutionary trajectory from first-line drug resistance to XDR in this clone was calculated to span more than four decades, beginning at the start of the antibiotic era. Frequent de novo evolution of MDR and XDR was also observed, with 56 and nine independent evolutionary events, respectively. Thus, ongoing amplification of drug resistance in KwaZulu-Natal is driven by both clonal spread and de novo acquisition of resistance. In drug-resistant TB, isoniazid resistance was overwhelmingly the initial resistance mutation to be acquired, which would not be detected by current rapid molecular diagnostics that assess only rifampicin resistance.
- Published
- 2015
36. From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence
- Author
-
Alvarez-Melis, David, Kaur, Harmanpreet, Daumé III, Hal, Wallach, Hanna, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the social sciences, and propose a list of design principles for machine-generated explanations that are meaningful to humans. Using the concept of weight of evidence from information theory, we develop a method for generating explanations that adhere to these principles. We show that this method can be adapted to handle high-dimensional, multi-class settings, yielding a flexible framework for generating explanations. We demonstrate that these explanations can be estimated accurately from finite samples and are robust to small perturbations of the inputs. We also evaluate our method through a qualitative user study with machine learning practitioners, where we observe that the resulting explanations are usable despite some participants struggling with background concepts like prior class probabilities. Finally, we conclude by surfacing design implications for interpretability tools in general., Comment: HCOMP 2021
- Published
- 2021
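The weight-of-evidence quantity this abstract builds on has a standard information-theoretic form, WoE(h : e) = log P(e | h) − log P(e | ¬h). A minimal illustrative sketch, where the probabilities and function name are hypothetical examples rather than anything from the paper:

```python
import math

def weight_of_evidence(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Weight of evidence (in nats) that evidence e lends to hypothesis h:
    log P(e | h) - log P(e | not h). Positive values favor h."""
    return math.log(p_e_given_h) - math.log(p_e_given_not_h)

# Example: evidence observed 80% of the time under h but only 20% under not-h.
woe = weight_of_evidence(0.8, 0.2)
print(round(woe, 3))  # log(4) ≈ 1.386, so the evidence weighs in favor of h
```

Because weights of evidence are log-likelihood ratios, evidence from independent sources adds up, which is part of what makes the quantity attractive as a building block for human-oriented explanations.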
37. Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs
- Author
-
Barocas, Solon, Guo, Anhong, Kamar, Ece, Krones, Jacquelyn, Morris, Meredith Ringel, Vaughan, Jennifer Wortman, Wadsworth, Duncan, and Wallach, Hanna
- Subjects
Computer Science - Computers and Society ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Disaggregated evaluations of AI systems, in which system performance is assessed and reported separately for different groups of people, are conceptually simple. However, their design involves a variety of choices. Some of these choices influence the results that will be obtained, and thus the conclusions that can be drawn; others influence the impacts -- both beneficial and harmful -- that a disaggregated evaluation will have on people, including the people whose data is used to conduct the evaluation. We argue that a deeper understanding of these choices will enable researchers and practitioners to design careful and conclusive disaggregated evaluations. We also argue that better documentation of these choices, along with the underlying considerations and tradeoffs that have been made, will help others when interpreting an evaluation's results and conclusions.
- Published
- 2021
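The basic computation such an evaluation starts from, performance reported separately per group, can be sketched as follows; the grouping variable, toy data, and function name here are illustrative assumptions, not from the paper:

```python
from collections import defaultdict

def disaggregated_accuracy(records):
    """Compute accuracy separately for each group.
    `records` is an iterable of (group, prediction, label) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, pred, label in records:
        total[group] += 1
        correct[group] += int(pred == label)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical evaluation records: (group, model prediction, true label).
data = [("A", 1, 1), ("A", 0, 1), ("B", 1, 1), ("B", 0, 0)]
print(disaggregated_accuracy(data))  # {'A': 0.5, 'B': 1.0}
```

The paper's point is that the hard part lies in the choices around this computation, which groups to define, which metric to report, how much data per group suffices, not in the arithmetic itself.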
38. Incentive-Compatible Forecasting Competitions
- Author
-
Witkowski, Jens, Freeman, Rupert, Vaughan, Jennifer Wortman, Pennock, David M., and Krause, Andreas
- Subjects
Computer Science - Computer Science and Game Theory - Abstract
We initiate the study of incentive-compatible forecasting competitions in which multiple forecasters make predictions about one or more events and compete for a single prize. We have two objectives: (1) to incentivize forecasters to report truthfully and (2) to award the prize to the most accurate forecaster. Proper scoring rules incentivize truthful reporting if all forecasters are paid according to their scores. However, incentives become distorted if only the best-scoring forecaster wins a prize, since forecasters can often increase their probability of having the highest score by reporting more extreme beliefs. In this paper, we introduce two novel forecasting competition mechanisms. Our first mechanism is incentive compatible and guaranteed to select the most accurate forecaster with probability higher than any other forecaster. Moreover, we show that in the standard single-event, two-forecaster setting and under mild technical conditions, no other incentive-compatible mechanism selects the most accurate forecaster with higher probability. Our second mechanism is incentive compatible when forecasters' beliefs are such that information about one event does not lead to belief updates on other events, and it selects the best forecaster with probability approaching 1 as the number of events grows. Our notion of incentive compatibility is more general than previous definitions of dominant strategy incentive compatibility in that it allows for reports to be correlated with the event outcomes. Moreover, our mechanisms are easy to implement and can be generalized to the related problems of outputting a ranking over forecasters and hiring a forecaster with high accuracy on future events., Comment: 38 pages. Relative to the previous version Appendix A and Theorem 5 are new. This version additionally contains some expanded exposition
- Published
- 2021
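The abstract's premise, that proper scoring rules reward truthful reports when forecasters are paid their scores, can be checked numerically. A small sketch with the quadratic (Brier-style) score; the belief value and function names are illustrative:

```python
def quadratic_score(report: float, outcome: int) -> float:
    """Quadratic (Brier-style) proper scoring rule; higher is better."""
    return 1.0 - (outcome - report) ** 2

def expected_score(report: float, belief: float) -> float:
    """Expected score of a report under the forecaster's true belief."""
    return belief * quadratic_score(report, 1) + (1 - belief) * quadratic_score(report, 0)

belief = 0.7
best = max((i / 100 for i in range(101)), key=lambda r: expected_score(r, belief))
print(best)  # 0.7: when paid by score, truthful reporting is optimal
```

Under winner-take-all, by contrast, only the highest score pays off, so a forecaster can often gain by exaggerating toward 0 or 1; that distortion is exactly what the paper's mechanisms are designed to remove.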
39. Mathematical Foundations for Social Computing
- Author
-
Chen, Yiling, Ghosh, Arpita, Kearns, Michael, Roughgarden, Tim, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Computers and Society ,Computer Science - Human-Computer Interaction - Abstract
Social computing encompasses the mechanisms through which people interact with computational systems: crowdsourcing systems, ranking and recommendation systems, online prediction markets, citizen science projects, and collaboratively edited wikis, to name a few. These systems share the common feature that humans are active participants, making choices that determine the input to, and therefore the output of, the system. The output of these systems can be viewed as a joint computation between machine and human, and can be richer than what either could produce alone. The term social computing is often used as a synonym for several related areas, such as "human computation" and subsets of "collective intelligence"; we use it in its broadest sense to encompass all of these things. Social computing is blossoming into a rich research area of its own, with contributions from diverse disciplines including computer science, economics, and other social sciences. Yet a broad mathematical foundation for social computing has yet to be established, with a plethora of under-explored opportunities for mathematical research to impact social computing. As in other fields, there is great potential for mathematical work to influence and shape the future of social computing. However, we are far from having the systematic and principled understanding of the advantages, limitations, and potentials of social computing required to match the impact on applications that has occurred in other fields. In June 2015, we brought together roughly 25 experts in related fields to discuss the promise and challenges of establishing mathematical foundations for social computing. This document captures several of the key ideas discussed., Comment: A Computing Community Consortium (CCC) workshop report, 15 pages
- Published
- 2020
40. Greedy Algorithm almost Dominates in Smoothed Contextual Bandits
- Author
-
Raghavan, Manish, Slivkins, Aleksandrs, Vaughan, Jennifer Wortman, and Wu, Zhiwei Steven
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that always "exploits" by choosing an action that currently looks optimal. We ask under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm in the linear contextual bandits model. We improve on prior results to show that a greedy approach almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold, and that this regret is at most $\tilde O(T^{1/3})$., Comment: Results in this paper, without any proofs, have been announced in an extended abstract (Raghavan et al., 2018a), and fleshed out in the technical report (Raghavan et al., 2018b [arXiv:1806.00543]). This manuscript covers a subset of results from Raghavan et al. (2018a,b), focusing on the greedy algorithm, and is streamlined accordingly
- Published
- 2020
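A minimal sketch of the greedy algorithm the abstract analyzes: per-arm ridge-regression reward estimates with no explicit exploration, run on synthetic diverse (smoothed) contexts. All parameters and data below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_bandit(contexts, theta_true, noise=0.1, lam=1.0):
    """Greedy linear contextual bandit: always pull the arm whose
    ridge-regression estimate currently predicts the highest reward
    (no explicit exploration)."""
    k, d = theta_true.shape
    A = np.stack([lam * np.eye(d) for _ in range(k)])  # per-arm regularized Gram matrices
    b = np.zeros((k, d))
    rewards = []
    for x in contexts:
        theta_hat = np.stack([np.linalg.solve(A[a], b[a]) for a in range(k)])
        a = int(np.argmax(theta_hat @ x))  # exploit the current estimates
        r = theta_true[a] @ x + noise * rng.normal()
        A[a] += np.outer(x, x)  # ridge-regression update for the chosen arm only
        b[a] += r * x
        rewards.append(r)
    return float(np.mean(rewards))

# Diverse contexts, echoing the abstract's diversity (smoothness) conditions.
contexts = rng.normal(size=(500, 3))
theta = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
avg = greedy_bandit(contexts, theta)
print(round(avg, 2))
```

The intuition matches the abstract: because the contexts themselves vary, the chosen arm's estimate keeps improving "for free," so the exploitation-only policy can approach the performance of algorithms that explore explicitly.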
41. Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience.
- Author
-
Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, and Jennifer Wortman Vaughan
- Published
- 2023
- Full Text
- View/download PDF
42. Supporting Industry Computing Researchers in Assessing, Articulating, and Addressing the Potential Negative Societal Impact of Their Work.
- Author
-
Wesley Hanwen Deng, Solon Barocas, and Jennifer Wortman Vaughan
- Published
- 2024
- Full Text
- View/download PDF
43. Canvil: Designerly Adaptation for LLM-Powered User Experiences.
- Author
-
K. J. Kevin Feng, Q. Vera Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, and David W. McDonald
- Published
- 2024
- Full Text
- View/download PDF
44. "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust.
- Author
-
Sunnie S. Y. Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, and Jennifer Wortman Vaughan
- Published
- 2024
- Full Text
- View/download PDF
45. No-Regret and Incentive-Compatible Online Learning
- Author
-
Freeman, Rupert, Pennock, David M., Podimata, Chara, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Science and Game Theory ,Statistics - Machine Learning - Abstract
We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm's predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that each expert's best strategy is to report his true beliefs about the realization of each event. To achieve this goal, we build on the literature on wagering mechanisms, a type of multi-agent scoring rule. We provide algorithms that achieve no regret and incentive compatibility for myopic experts for both the full and partial information settings. In experiments on datasets from FiveThirtyEight, our algorithms have regret comparable to classic no-regret algorithms, which are not incentive-compatible. Finally, we identify an incentive-compatible algorithm for forward-looking strategic agents that exhibits diminishing regret in practice., Comment: Appears in ICML2020
- Published
- 2020
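A sketch of the classic, non-incentive-compatible no-regret baseline the abstract compares against: multiplicative weights (Hedge) over experts forecasting binary events. The data, learning rate, and function name are illustrative assumptions, not the paper's mechanism:

```python
import math

def hedge(expert_preds, outcomes, eta=0.5):
    """Multiplicative weights (Hedge): a classic no-regret algorithm that
    aggregates expert forecasts of binary events; weights of inaccurate
    experts decay exponentially with their squared-error loss."""
    n_experts = len(expert_preds[0])
    w = [1.0] * n_experts
    total_loss = 0.0
    for preds, y in zip(expert_preds, outcomes):
        s = sum(w)
        forecast = sum(wi * p for wi, p in zip(w, preds)) / s  # weighted average
        total_loss += (forecast - y) ** 2
        w = [wi * math.exp(-eta * (p - y) ** 2) for wi, p in zip(w, preds)]
    return total_loss

# Expert 0 tracks the events well; expert 1 always hedges at 0.5.
preds = [(0.9, 0.5), (0.1, 0.5), (0.8, 0.5), (0.2, 0.5)] * 25
outs = [1, 0, 1, 0] * 25
loss = hedge(preds, outs)
print(round(loss, 2))
```

On this sequence the best expert's total squared loss is 2.5 and the constant-0.5 expert's is 25.0; Hedge's loss should land near the former. The strategic problem the paper addresses is that weighting schemes like this one can reward experts for misreporting their beliefs to gain influence.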
46. Weight of Evidence as a Basis for Human-Oriented Explanations
- Author
-
Alvarez-Melis, David, Daumé III, Hal, Vaughan, Jennifer Wortman, and Wallach, Hanna
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
Interpretability is an elusive but highly sought-after characteristic of modern machine learning methods. Recent work has focused on interpretability via $\textit{explanations}$, which justify individual model predictions. In this work, we take a step towards reconciling machine explanations with those that humans produce and prefer by taking inspiration from the study of explanation in philosophy, cognitive science, and the social sciences. We identify key aspects in which these human explanations differ from current machine explanations, distill them into a list of desiderata, and formalize them into a framework via the notion of $\textit{weight of evidence}$ from information theory. Finally, we instantiate this framework in two simple applications and show it produces intuitive and comprehensible explanations., Comment: Human-Centric Machine Learning (HCML) Workshop @ NeurIPS 2019
- Published
- 2019
47. Toward Fairness in AI for People with Disabilities: A Research Roadmap
- Author
-
Guo, Anhong, Kamar, Ece, Vaughan, Jennifer Wortman, Wallach, Hanna, and Morris, Meredith Ringel
- Subjects
Computer Science - Computers and Society ,Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
AI technologies have the potential to dramatically impact the lives of people with disabilities (PWD). Indeed, improving the lives of PWD is a motivator for many state-of-the-art AI systems, such as automated speech recognition tools that can caption videos for people who are deaf and hard of hearing, or language prediction algorithms that can augment communication for people with speech or cognitive disabilities. However, widely deployed AI systems may not work properly for PWD, or worse, may actively discriminate against them. These considerations regarding fairness in AI for PWD have thus far received little attention. In this position paper, we identify potential areas of concern regarding how several AI technology categories may impact particular disability constituencies if care is not taken in their design, development, and testing. We intend for this risk assessment of how various classes of AI might interact with various classes of disability to provide a roadmap for future research that is needed to gather data, test these hypotheses, and build more inclusive algorithms., Comment: ACM ASSETS 2019 Workshop on AI Fairness for People with Disabilities
- Published
- 2019
48. Truthful Aggregation of Budget Proposals
- Author
-
Freeman, Rupert, Pennock, David M., Peters, Dominik, and Vaughan, Jennifer Wortman
- Subjects
Computer Science - Computer Science and Game Theory - Abstract
We consider a participatory budgeting problem in which each voter submits a proposal for how to divide a single divisible resource (such as money or time) among several possible alternatives (such as public projects or activities) and these proposals must be aggregated into a single aggregate division. Under $\ell_1$ preferences -- for which a voter's disutility is given by the $\ell_1$ distance between the aggregate division and the division he or she most prefers -- the social welfare-maximizing mechanism, which minimizes the average $\ell_1$ distance between the outcome and each voter's proposal, is incentive compatible (Goel et al. 2016). However, it fails to satisfy the natural fairness notion of proportionality, placing too much weight on majority preferences. Leveraging a connection between market prices and the generalized median rules of Moulin (1980), we introduce the independent markets mechanism, which is both incentive compatible and proportional. We unify the social welfare-maximizing mechanism and the independent markets mechanism by defining a broad class of moving phantom mechanisms that includes both. We show that every moving phantom mechanism is incentive compatible. Finally, we characterize the social welfare-maximizing mechanism as the unique Pareto-optimal mechanism in this class, suggesting an inherent tradeoff between Pareto optimality and proportionality., Comment: 28 pages, final journal version
- Published
- 2019
- Full Text
- View/download PDF
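The generalized median rules of Moulin that the abstract leverages can be illustrated on a single coordinate: the median of the voters' reports pooled with fixed "phantom" values is strategyproof. A sketch with hypothetical phantom positions; this shows only the one-coordinate idea, not the paper's normalized independent markets mechanism:

```python
import statistics

def phantom_median(reports, phantoms):
    """Generalized (phantom) median on a single coordinate: the median of
    voter reports pooled with fixed phantom values."""
    return statistics.median(reports + phantoms)

reports = [0.1, 0.4, 0.9]         # voters' preferred shares for one project
phantoms = [0.0, 0.5, 1.0, 1.0]   # hypothetical fixed phantom positions
outcome = phantom_median(reports, phantoms)
print(outcome)  # 0.5

# Strategyproofness check: the middle voter (peak 0.4) cannot pull the
# outcome closer to their peak by misreporting.
for lie in [i / 10 for i in range(11)]:
    moved = phantom_median([0.1, lie, 0.9], phantoms)
    assert abs(moved - 0.4) >= abs(outcome - 0.4)
```

The paper's moving phantom mechanisms, as the abstract describes, slide the phantom positions in a coordinated way across all alternatives so that the per-coordinate outcomes sum to the full budget while each coordinate retains this strategyproof median structure.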
49. Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations.
- Author
-
Valerie Chen, Q. Vera Liao, Jennifer Wortman Vaughan, and Gagan Bansal
- Published
- 2023
- Full Text
- View/download PDF
50. Incentive-Compatible Forecasting Competitions.
- Author
-
Jens Witkowski, Rupert Freeman, Jennifer Wortman Vaughan, David M. Pennock, and Andreas Krause 0001
- Published
- 2023
- Full Text
- View/download PDF