393 results for "Narayanan, Arvind"
Search Results
2. 8. Where Do We Go from Here?
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
3. Index
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
4. 4. The Long Road to Generative AI
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
5. Half Title Page, Title Page, Copyright, Dedication
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
6. References
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
7. 6. Why Can't AI Fix Social Media?
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
8. 7. Why Do Myths about AI Persist?
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
9. 5. Is Advanced AI an Existential Threat?
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
10. 3. Why Can't AI Predict the Future?
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
11. 1. Introduction
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
12. Cover
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
13. 2. How Predictive AI Goes Wrong
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2024
14. CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
- Author
- Siegel, Zachary S., Kapoor, Sayash, Nadgir, Nitya, Stroebl, Benedikt, and Narayanan, Arvind
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems
- Abstract
AI agents have the potential to aid users on a variety of consequential tasks, including conducting scientific research. To spur the development of useful agents, we need benchmarks that are challenging, but more crucially, directly correspond to real-world tasks of interest. This paper introduces such a benchmark, designed to measure the accuracy of AI agents in tackling a crucial yet surprisingly challenging aspect of scientific research: computational reproducibility. This task, fundamental to the scientific process, involves reproducing the results of a study using the provided code and data. We introduce CORE-Bench (Computational Reproducibility Agent Benchmark), a benchmark consisting of 270 tasks based on 90 scientific papers across three disciplines (computer science, social science, and medicine). Tasks in CORE-Bench consist of three difficulty levels and include both language-only and vision-language tasks. We provide an evaluation system to measure the accuracy of agents in a fast and parallelizable way, saving days of evaluation time for each run compared to a sequential implementation. We evaluated two baseline agents: the general-purpose AutoGPT and a task-specific agent called CORE-Agent. We tested both variants using two underlying language models: GPT-4o and GPT-4o-mini. The best agent achieved an accuracy of 21% on the hardest task, showing the vast scope for improvement in automating routine scientific tasks. Having agents that can reproduce existing work is a necessary step towards building agents that can conduct novel research and could verify and improve the performance of other research agents. We hope that CORE-Bench can improve the state of reproducibility and spur the development of future research agents.
- Comment
- Benchmark harness and code available at http://github.com/siegelz/core-bench
- Published
- 2024
15. AI Agents That Matter
- Author
- Kapoor, Sayash, Stroebl, Benedikt, Siegel, Zachary S., Nadgir, Nitya, and Narayanan, Arvind
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
AI agents are an exciting new research direction, and agent development is driven by benchmarks. Our analysis of current agent benchmarks and evaluation practices reveals several shortcomings that hinder their usefulness in real-world applications. First, there is a narrow focus on accuracy without attention to other metrics. As a result, SOTA agents are needlessly complex and costly, and the community has reached mistaken conclusions about the sources of accuracy gains. Our focus on cost in addition to accuracy motivates the new goal of jointly optimizing the two metrics. We design and implement one such optimization, showing its potential to greatly reduce cost while maintaining accuracy. Second, the benchmarking needs of model and downstream developers have been conflated, making it hard to identify which agent would be best suited for a particular application. Third, many agent benchmarks have inadequate holdout sets, and sometimes none at all. This has led to agents that are fragile because they take shortcuts and overfit to the benchmark in various ways. We prescribe a principled framework for avoiding overfitting. Finally, there is a lack of standardization in evaluation practices, leading to a pervasive lack of reproducibility. We hope that the steps we introduce for addressing these shortcomings will spur the development of agents that are useful in the real world and not just accurate on benchmarks.
- Published
- 2024
16. The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
- Author
- Longpre, Shayne, Biderman, Stella, Albalak, Alon, Schoelkopf, Hailey, McDuff, Daniel, Kapoor, Sayash, Klyman, Kevin, Lo, Kyle, Ilharco, Gabriel, San, Nay, Rauh, Maribeth, Skowron, Aviya, Vidgen, Bertie, Weidinger, Laura, Narayanan, Arvind, Sanh, Victor, Adelani, David, Liang, Percy, Bommasani, Rishi, Henderson, Peter, Luccioni, Sasha, Jernite, Yacine, and Soldaini, Luca
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation, frameworks, guides, and practical tools) that support informed data selection, processing, and understanding, precise and limitation-aware artifact documentation, efficient model training, advance awareness of the environmental impact from training, careful model evaluation of capabilities, risks, and claims, as well as responsible model release, licensing and deployment practices. We hope this curated collection of resources helps guide more responsible development. The process of curating this list enabled us to review the AI development ecosystem, revealing what tools are critically missing, misused, or over-used in existing practices. We find that (i) tools for data sourcing, model evaluation, and monitoring are critically under-serving ethical and real-world needs, (ii) evaluations for model safety, capabilities, and environmental impact all lack reproducibility and transparency, (iii) text and particularly English-centric analyses continue to dominate over multilingual and multi-modal analyses, and (iv) evaluation of systems, rather than just models, is needed so that capabilities and impact are assessed in context.
- Published
- 2024
17. AI Risk Management Should Incorporate Both Safety and Security
- Author
- Qi, Xiangyu, Huang, Yangsibo, Zeng, Yi, Debenedetti, Edoardo, Geiping, Jonas, He, Luxi, Huang, Kaixuan, Madhushani, Udari, Sehwag, Vikash, Shi, Weijia, Wei, Boyi, Xie, Tinghao, Chen, Danqi, Chen, Pin-Yu, Ding, Jeffrey, Jia, Ruoxi, Ma, Jiaqi, Narayanan, Arvind, Su, Weijie J, Wang, Mengdi, Xiao, Chaowei, Li, Bo, Song, Dawn, Henderson, Peter, and Mittal, Prateek
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
- Abstract
The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this paper, we advocate that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security, and unambiguously take into account the perspectives of both disciplines in order to devise more effective and holistic risk mitigation approaches. Unfortunately, this vision is often obfuscated, as the definitions of the basic concepts of "safety" and "security" themselves are often inconsistent and lack consensus across communities. With AI risk management being increasingly cross-disciplinary, this issue is particularly salient. In light of this conceptual challenge, we introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security, aiming to facilitate a shared understanding and effective collaboration across communities.
- Published
- 2024
18. REFORMS: Consensus-based Recommendations for Machine-learning-based Science
- Author
- Kapoor, Sayash, Cantrell, Emily M, Peng, Kenny, Pham, Thanh Hien, Bail, Christopher A, Gundersen, Odd Erik, Hofman, Jake M, Hullman, Jessica, Lones, Michael A, Malik, Momin M, Nanayakkara, Priyanka, Poldrack, Russell A, Raji, Inioluwa Deborah, Roberts, Michael, Salganik, Matthew J, Serra-Garcia, Marta, Stewart, Brandon M, Vandewiele, Gilles, and Narayanan, Arvind
- Subjects
Information and Computing Sciences, Philosophy and Religious Studies, History and Philosophy of Specific Fields, Machine Learning, Consensus, Humans, Reproducibility of Results, Science
- Abstract
Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.
- Published
- 2024
19. A Safe Harbor for AI Evaluation and Red Teaming
- Author
- Longpre, Shayne, Kapoor, Sayash, Klyman, Kevin, Ramaswami, Ashwin, Bommasani, Rishi, Blili-Hamelin, Borhane, Huang, Yangsibo, Skowron, Aviya, Yong, Zheng-Xin, Kotha, Suhas, Zeng, Yi, Shi, Weiyan, Yang, Xianjun, Southen, Reid, Robey, Alexander, Chao, Patrick, Yang, Diyi, Jia, Ruoxi, Kang, Daniel, Pentland, Sandy, Narayanan, Arvind, Liang, Percy, and Henderson, Peter
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse disincentivize good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal. Although some companies offer researcher access programs, they are an inadequate substitute for independent research access, as they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspensions or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests, without exacerbating model misuse. We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.
- Published
- 2024
20. On the Societal Impact of Open Foundation Models
- Author
- Kapoor, Sayash, Bommasani, Rishi, Klyman, Kevin, Longpre, Shayne, Ramaswami, Ashwin, Cihon, Peter, Hopkins, Aspen, Bankston, Kevin, Biderman, Stella, Bogen, Miranda, Chowdhury, Rumman, Engler, Alex, Henderson, Peter, Jernite, Yacine, Lazar, Seth, Maffulli, Stefano, Nelson, Alondra, Pineau, Joelle, Skowron, Aviya, Song, Dawn, Storchan, Victor, Zhang, Daniel, Ho, Daniel E., Liang, Percy, and Narayanan, Arvind
- Subjects
Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Foundation models are powerful technologies: how they are released publicly directly shapes their societal impact. In this position paper, we focus on open foundation models, defined here as those with broadly available model weights (e.g. Llama 2, Stable Diffusion XL). We identify five distinctive properties (e.g. greater customizability, poor monitoring) of open foundation models that lead to both their benefits and risks. Open foundation models present significant benefits, with some caveats, that span innovation, competition, the distribution of decision-making power, and transparency. To understand their risks of misuse, we design a risk assessment framework for analyzing their marginal risk. Across several misuse vectors (e.g. cyberattacks, bioweapons), we find that current research is insufficient to effectively characterize the marginal risk of open foundation models relative to pre-existing technologies. The framework helps explain why the marginal risk is low in some cases, clarifies disagreements about misuse risks by revealing that past work has focused on different subsets of the framework with different assumptions, and articulates a way forward for more constructive debate. Overall, our work helps support a more grounded assessment of the societal impact of open foundation models by outlining what research is needed to empirically validate their theoretical benefits and risks.
- Published
- 2024
21. Foundation Model Transparency Reports
- Author
- Bommasani, Rishi, Klyman, Kevin, Longpre, Shayne, Xiong, Betty, Kapoor, Sayash, Maslej, Nestor, Narayanan, Arvind, and Liang, Percy
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how foundation model developers should provide transparency about the development and deployment of their models, we propose Foundation Model Transparency Reports, drawing upon the transparency reporting practices in social media. While external documentation of societal harms prompted social media transparency reports, our objective is to institutionalize transparency reporting for foundation models while the industry is still nascent. To design our reports, we identify 6 design principles given the successes and shortcomings of social media transparency reporting. To further schematize our reports, we draw upon the 100 transparency indicators from the Foundation Model Transparency Index. Given these indicators, we measure the extent to which they overlap with the transparency requirements included in six prominent government policies (e.g., the EU AI Act, the US Executive Order on Safe, Secure, and Trustworthy AI). Well-designed transparency reports could reduce compliance costs, in part due to overlapping regulatory requirements across different jurisdictions. We encourage foundation model developers to regularly publish transparency reports, building upon recommendations from the G7 and the White House.
- Published
- 2024
22. How large language models can reshape collective intelligence
- Author
- Burton, Jason W., Lopez-Lopez, Ezequiel, Hechtlinger, Shahar, Rahwan, Zoe, Aeschbach, Samuel, Bakker, Michiel A., Becker, Joshua A., Berditchevskaia, Aleks, Berger, Julian, Brinkmann, Levin, Flek, Lucie, Herzog, Stefan M., Huang, Saffron, Kapoor, Sayash, Narayanan, Arvind, Nussberger, Anne-Marie, Yasseri, Taha, Nickl, Pietro, Almaatouq, Abdullah, Hahn, Ulrike, Kurvers, Ralf H. J. M., Leavy, Susan, Rahwan, Iyad, Siddarth, Divya, Siu, Alice, Woolley, Anita W., Wulff, Dirk U., and Hertwig, Ralph
- Published
- 2024
- Full Text
- View/download PDF
23. Promises and pitfalls of artificial intelligence for legal applications
- Author
- Kapoor, Sayash, Henderson, Peter, and Narayanan, Arvind
- Subjects
Computer Science - Computers and Society, Computer Science - Artificial Intelligence
- Abstract
Is AI set to redefine the legal profession? We argue that this claim is not supported by the current evidence. We dive into AI's increasingly prevalent roles in three types of legal tasks: information processing; tasks involving creativity, reasoning, or judgment; and predictions about the future. We find that the ease of evaluating legal applications varies greatly across legal tasks, based on the ease of identifying correct answers and the observability of information relevant to the task at hand. Tasks that would lead to the most significant changes to the legal profession are also the ones most prone to overoptimism about AI capabilities, as they are harder to evaluate. We make recommendations for better evaluation and deployment of AI in legal contexts.
- Published
- 2024
24. REFORMS: Reporting Standards for Machine Learning Based Science
- Author
- Kapoor, Sayash, Cantrell, Emily, Peng, Kenny, Pham, Thanh Hien, Bail, Christopher A., Gundersen, Odd Erik, Hofman, Jake M., Hullman, Jessica, Lones, Michael A., Malik, Momin M., Nanayakkara, Priyanka, Poldrack, Russell A., Raji, Inioluwa Deborah, Roberts, Michael, Salganik, Matthew J., Serra-Garcia, Marta, Stewart, Brandon M., Vandewiele, Gilles, and Narayanan, Arvind
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Methodology
- Abstract
Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear reporting standards for ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (Reporting Standards For Machine Learning Based Science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed based on a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.
- Published
- 2023
25. Security policy audits: why and how
- Author
- Narayanan, Arvind and Lee, Kevin
- Subjects
Computer Science - Cryptography and Security, Computer Science - Computers and Society
- Abstract
Information security isn't just about software and hardware -- it's at least as much about policies and processes. But the research community overwhelmingly focuses on the former over the latter, while gaping policy and process problems persist. In this experience paper, we describe a series of security policy audits that we conducted, exposing policy flaws affecting billions of users that can be -- and often are -- exploited by low-tech attackers who don't need to use any tools or exploit software vulnerabilities. The solutions, in turn, need to be policy-based. We advocate for the study of policies and processes, point out its intellectual and practical challenges, lay out our theory of change, and present a research agenda.
- Published
- 2022
- Full Text
- View/download PDF
26. Leakage and the Reproducibility Crisis in ML-based Science
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Methodology
- Abstract
The use of machine learning (ML) methods for prediction and forecasting has become widespread across the quantitative sciences. However, there are many known methodological pitfalls, including data leakage, in ML-based science. In this paper, we systematically investigate reproducibility issues in ML-based science. We show that data leakage is indeed a widespread problem and has led to severe reproducibility failures. Specifically, through a survey of literature in research communities that adopted ML methods, we find 17 fields where errors have been found, collectively affecting 329 papers and in some cases leading to wildly overoptimistic conclusions. Based on our survey, we present a fine-grained taxonomy of 8 types of leakage that range from textbook errors to open research problems. We argue for fundamental methodological changes to ML-based science so that cases of leakage can be caught before publication. To that end, we propose model info sheets for reporting scientific claims based on ML models that would address all types of leakage identified in our survey. To investigate the impact of reproducibility errors and the efficacy of model info sheets, we undertake a reproducibility study in a field where complex ML models are believed to vastly outperform older statistical models such as Logistic Regression (LR): civil war prediction. We find that all papers claiming the superior performance of complex ML models compared to LR models fail to reproduce due to data leakage, and complex ML models don't perform substantively better than decades-old LR models. While none of these errors could have been caught by reading the papers, model info sheets would enable the detection of leakage in each case.
- Published
- 2022
27. How Algorithms Shape the Distribution of Political Advertising: Case Studies of Facebook, Google, and TikTok
- Author
- Papakyriakopoulos, Orestis, Tessono, Christelle, Narayanan, Arvind, and Kshirsagar, Mihir
- Subjects
Computer Science - Social and Information Networks, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction
- Abstract
Online platforms play an increasingly important role in shaping democracy by influencing the distribution of political information to the electorate. In recent years, political campaigns have spent heavily on the platforms' algorithmic tools to target voters with online advertising. While the public interest in understanding how platforms perform the task of shaping the political discourse has never been higher, the efforts of the major platforms to make the necessary disclosures to understand their practices fall woefully short. In this study, we collect and analyze a dataset containing over 800,000 ads and 2.5 million videos about the 2020 U.S. presidential election from Facebook, Google, and TikTok. We conduct the first large-scale data analysis of public data to critically evaluate how these platforms amplified or moderated the distribution of political advertisements. We conclude with recommendations for how to improve the disclosures so that the public can hold the platforms and political advertisers accountable.
- Comment
- Forthcoming in: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (AIES'22), August 1-3, 2022, Oxford, United Kingdom. ACM, New York, NY, USA, 15 pages
- Published
- 2022
- Full Text
- View/download PDF
28. The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning
- Author
- Hullman, Jessica, Kapoor, Sayash, Nanayakkara, Priyanka, Gelman, Andrew, and Narayanan, Arvind
- Subjects
Computer Science - Machine Learning
- Abstract
Recent arguments that machine learning (ML) is facing a reproducibility and replication crisis suggest that some published claims in ML research cannot be taken at face value. These concerns inspire analogies to the replication crisis affecting the social and medical sciences. They also inspire calls for the integration of statistical approaches to causal inference and predictive modeling. A deeper understanding of what reproducibility concerns in supervised ML research have in common with the replication crisis in experimental science puts the new concerns in perspective, and helps researchers avoid "the worst of both worlds," where ML researchers begin borrowing methodologies from explanatory modeling without understanding their limitations and vice versa. We contribute a comparative analysis of concerns about inductive learning that arise in causal attribution as exemplified in psychology versus predictive modeling as exemplified in ML. We identify themes that recur in reform discussions, like overreliance on asymptotic theory and non-credible beliefs about real-world data generating processes. We argue that in both fields, claims from learning are implied to generalize outside the specific environment studied (e.g., the input dataset or subject sample, modeling implementation, etc.) but are often impossible to refute due to undisclosed sources of variance in the learning pipeline. In particular, errors being acknowledged in ML expose cracks in long-held beliefs that optimizing predictive accuracy using huge datasets absolves one from having to consider a true data generating process or formally represent uncertainty in performance claims. We conclude by discussing risks that arise when sources of errors are misdiagnosed and the need to acknowledge the role of human inductive biases in learning and reform.
- Published
- 2022
- Full Text
- View/download PDF
29. The Impact of User Location on Cookie Notices (Inside and Outside of the European Union)
- Author
- van Eijk, Rob, Asghari, Hadi, Winter, Philipp, and Narayanan, Arvind
- Subjects
Computer Science - Computers and Society
- Abstract
The web is global, but privacy laws differ by country. Which set of privacy rules do websites follow? We empirically study this question by detecting and analyzing cookie notices in an automated way. We crawl 1,500 European, American, and Canadian websites from each of 18 countries. We detect cookie notices on 40 percent of websites in our sample. We treat the presence or absence of cookie notices, as well as visual differences, as proxies for differences in privacy rules. Using a series of regression models, we find that the website's Top Level Domain explains a substantial portion of the variance in cookie notice metrics, but the user's vantage point does not. This suggests that websites follow one set of privacy rules for all their users. There is one exception to this finding: cookie notices differ when accessing .com domains from inside versus outside of the EU. We highlight ways in which future research could build on our preliminary findings.
- Comment
- Peer-reviewed and presented at IEEE Workshop on Technology and Consumer Protection 2019 (ConPro '19)
- Published
- 2021
30. Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
- Author
- Peng, Kenny, Mathur, Arunesh, and Narayanan, Arvind
- Subjects
Computer Science - Machine Learning, Computer Science - Computers and Society
- Abstract
Machine learning datasets have elicited concerns about privacy, bias, and unethical applications, leading to the retraction of prominent datasets such as DukeMTMC, MS-Celeb-1M, and Tiny Images. In response, the machine learning community has called for higher ethical standards in dataset creation. To help inform these efforts, we studied three influential but ethically problematic face and person recognition datasets -- Labeled Faces in the Wild (LFW), MS-Celeb-1M, and DukeMTMC -- by analyzing nearly 1000 papers that cite them. We found that the creation of derivative datasets and models, broader technological and social change, the lack of clarity of licenses, and dataset management practices can introduce a wide range of ethical concerns. We conclude by suggesting a distributed approach to harm mitigation that considers the entire life cycle of a dataset.
- Published
- 2021
31. Simulation as Experiment: An Empirical Critique of Simulation Research on Recommender Systems
- Author
- Winecoff, Amy A., Sun, Matthew, Lucherini, Eli, and Narayanan, Arvind
- Subjects
Computer Science - Computers and Society, Computer Science - Multiagent Systems
- Abstract
Simulation can enable the study of recommender system (RS) evolution while circumventing many of the issues of empirical longitudinal studies; simulations are comparatively easier to implement, are highly controlled, and pose no ethical risk to human participants. How simulation can best contribute to scientific insight about RS alongside qualitative and quantitative empirical approaches is an open question. Philosophers and researchers have long debated the epistemological nature of simulation compared to wholly theoretical or empirical methods. Simulation is often implicitly or explicitly conceptualized as occupying a middle ground between empirical and theoretical approaches, allowing researchers to realize the benefits of both. However, what is often ignored in such arguments is that without firm grounding in any single methodological tradition, simulation studies have no agreed upon scientific norms or standards, resulting in a patchwork of theoretical motivations, approaches, and implementations that are difficult to reconcile. In this position paper, we argue that simulation studies of RS are conceptually similar to empirical experimental approaches and therefore can be evaluated using the standards of empirical research methods. Using this empirical lens, we argue that the combination of high heterogeneity in approaches and low transparency in methods in simulation studies of RS has limited their interpretability, generalizability, and replicability. We contend that by adopting standards and practices common in empirical disciplines, simulation researchers can mitigate many of these weaknesses.
- Published
- 2021
32. T-RECS: A Simulation Tool to Study the Societal Impact of Recommender Systems
- Author
- Lucherini, Eli, Sun, Matthew, Winecoff, Amy, and Narayanan, Arvind
- Subjects
Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems
- Abstract
Simulation has emerged as a popular method to study the long-term societal consequences of recommender systems. This approach allows researchers to specify their theoretical model explicitly and observe the evolution of system-level outcomes over time. However, performing simulation-based studies often requires researchers to build their own simulation environments from the ground up, which creates a high barrier to entry, introduces room for implementation error, and makes it difficult to disentangle whether observed outcomes are due to the model or the implementation. We introduce T-RECS, an open-source Python package designed for researchers to simulate recommendation systems and other types of sociotechnical systems in which an algorithm mediates the interactions between multiple stakeholders, such as users and content creators. To demonstrate the flexibility of T-RECS, we perform a replication of two prior simulation-based studies of sociotechnical systems. We additionally show how T-RECS can be used to generate novel insights with minimal overhead. Our tool promotes reproducibility in this area of research, provides a unified language for simulating sociotechnical systems, and removes the friction of implementing simulations from scratch.
- Comment
- 17 pages, 5 figures; updated Figure 2(b) after fixing small bug in replication code (see Github for more details)
- Published
- 2021
33. Resurrecting Address Clustering in Bitcoin
- Author
- Möser, Malte and Narayanan, Arvind
- Subjects
Computer Science - Cryptography and Security
- Abstract
Blockchain analysis is essential for understanding how cryptocurrencies like Bitcoin are used in practice, and address clustering is a cornerstone of blockchain analysis. However, current techniques rely on heuristics that have not been rigorously evaluated or optimized. In this paper, we tackle several challenges of change address identification and clustering. First, we build a ground truth set of transactions with known change from the Bitcoin blockchain that can be used to validate the efficacy of individual change address detection heuristics. Equipped with this data set, we develop new techniques to predict change outputs with low false positive rates. After applying our prediction model to the Bitcoin blockchain, we analyze the resulting clustering and develop ways to detect and prevent cluster collapse. Finally, we assess the impact our enhanced clustering has on two exemplary applications.
- Comment
- Financial Cryptography and Data Security, 2022
- Published
- 2021
34. ‘On Table’ Versus ‘Off Table’ Direct Anterior Approach Total Hip Arthroplasty: Is There a Difference?
- Author
- Narayanan, Arvind S., Densley, Sebastian M., McCauley, Julie C., Kulidjian, Anna A., Bugbee, William D., and Wilde, Jeffrey M.
- Published
- 2024
- Full Text
- View/download PDF
35. Virtual Classrooms and Real Harms: Remote Learning at U.S. Universities
- Author
- Cohney, Shaanan, Teixeira, Ross, Kohlbrenner, Anne, Narayanan, Arvind, Kshirsagar, Mihir, Shvartzshnaider, Yan, and Sanfilippo, Madelyn
- Subjects
Computer Science - Cryptography and Security
- Abstract
Universities have been forced to rely on remote educational technology to facilitate the rapid shift to online learning. In doing so, they acquire new risks of security vulnerabilities and privacy violations. To help universities navigate this landscape, we develop a model that describes the actors, incentives, and risks, informed by surveying 49 educators and 14 administrators at U.S. universities. Next, we develop a methodology for administrators to assess security and privacy risks of these products. We then conduct a privacy and security analysis of 23 popular platforms using a combination of sociological analyses of privacy policies and 129 state laws, alongside a technical assessment of platform software. Based on our findings, we develop recommendations for universities to mitigate the risks to their stakeholders.
- Published
- 2020
36. An In-Depth Measurement Analysis of 5G mmWave PHY Latency and Its Impact on End-to-End Delay
- Author
- Fezeu, Rostand A. K., Ramadan, Eman, Ye, Wei, Minneci, Benjamin, Xie, Jack, Narayanan, Arvind, Hassan, Ahmad, Qian, Feng, Zhang, Zhi-Li, Chandrashekar, Jaideep, and Lee, Myungjin; edited by Brunstrom, Anna, Flores, Marcel, and Fiore, Marco
- Published
- 2023
- Full Text
- View/download PDF
37. Privacy Policies over Time: Curation and Analysis of a Million-Document Dataset
- Author
- Amos, Ryan, Acar, Gunes, Lucherini, Eli, Kshirsagar, Mihir, Narayanan, Arvind, and Mayer, Jonathan
- Subjects
Computer Science - Computers and Society
- Abstract
Automated analysis of privacy policies has proved a fruitful research direction, with developments such as automated policy summarization, question answering systems, and compliance detection. Prior research has been limited to analysis of privacy policies from a single point in time or from short spans of time, as researchers did not have access to a large-scale, longitudinal, curated dataset. To address this gap, we developed a crawler that discovers, downloads, and extracts archived privacy policies from the Internet Archive's Wayback Machine. Using the crawler and following a series of validation and quality control steps, we curated a dataset of 1,071,488 English language privacy policies, spanning over two decades and over 130,000 distinct websites. Our analyses of the data paint a troubling picture of the transparency and accessibility of privacy policies. By comparing the occurrence of tracking-related terminology in our dataset to prior web privacy measurements, we find that privacy policies have consistently failed to disclose the presence of common tracking technologies and third parties. We also find that over the last twenty years privacy policies have become even more difficult to read, doubling in length and increasing a full grade in the median reading level. Our data indicate that self-regulation for first-party websites has stagnated, while self-regulation for third parties has increased but is dominated by online advertising trade associations. Finally, we contribute to the literature on privacy regulation by demonstrating the historic impact of the GDPR on privacy policies.
- Comment
- 16 pages, 13 figures, public dataset
- Published
- 2020
- Full Text
- View/download PDF
38. REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
- Author
- Wang, Angelina, Liu, Alexander, Zhang, Ryan, Kleiman, Anat, Kim, Leslie, Zhao, Dora, Shirai, Iroha, Narayanan, Arvind, and Russakovsky, Olga
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Machine learning models are known to perpetuate and even amplify the biases present in the data. However, these data biases frequently do not become apparent until after the models are deployed. Our work tackles this issue and enables the preemptive analysis of large-scale datasets. REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset, surfacing potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based. Object-based biases relate to the size, context, or diversity of the depicted objects. Person-based metrics focus on analyzing the portrayal of people within the dataset. Geography-based analyses consider the representation of different geographic locations. These three dimensions are deeply intertwined in how they interact to bias a dataset, and REVISE sheds light on this; the responsibility then lies with the user to consider the cultural and historical context, and to determine which of the revealed biases may be problematic. The tool further assists the user by suggesting actionable steps that may be taken to mitigate the revealed biases. Overall, the key aim of our work is to tackle the machine learning bias problem early in the pipeline. REVISE is available at https://github.com/princetonvisualai/revise-tool
- Comment
- Extended version of ECCV 2020 Spotlight paper
- Published
- 2020
39. Leakage and the reproducibility crisis in machine-learning-based science
- Author
- Kapoor, Sayash and Narayanan, Arvind
- Published
- 2023
- Full Text
- View/download PDF
40. A First Look at Commercial 5G Performance on Smartphones
- Author
- Narayanan, Arvind, Ramadan, Eman, Carpenter, Jason, Liu, Qingxu, Liu, Yu, Qian, Feng, and Zhang, Zhi-Li
- Subjects
Computer Science - Networking and Internet Architecture, Computer Science - Performance
- Abstract
We conduct, to our knowledge, a first measurement study of commercial 5G performance on smartphones by closely examining 5G networks of three carriers (two mmWave carriers, one mid-band carrier) in three U.S. cities. We conduct extensive field tests on 5G performance in diverse urban environments. We systematically analyze the handoff mechanisms in 5G and their impact on network performance. We explore the feasibility of using location and possibly other environmental information to predict the network performance. We also study the app performance (web browsing and HTTP download) over 5G. Our study consumes more than 15 TB of cellular data. Conducted when 5G just made its debut, it provides a "baseline" for studying how 5G performance evolves, and identifies key research directions on improving 5G users' experience in a cross-layer manner. We have released the data collected from our study (referred to as 5Gophers) at https://fivegophers.umn.edu/www20.
- Comment
- Published at The Web Conference 2020 (WWW 2020). Please include WWW in any citations
- Published
- 2019
- Full Text
- View/download PDF
41. Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites
- Author
- Mathur, Arunesh, Acar, Gunes, Friedman, Michael J., Lucherini, Elena, Mayer, Jonathan, Chetty, Marshini, and Narayanan, Arvind
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Computers and Society
- Abstract
Dark patterns are user interface design choices that benefit an online service by coercing, steering, or deceiving users into making unintended and potentially harmful decisions. We present automated techniques that enable experts to identify dark patterns on a large set of websites. Using these techniques, we study shopping websites, which often use dark patterns to influence users into making more purchases or disclosing more information than they would otherwise. Analyzing ~53K product pages from ~11K shopping websites, we discover 1,818 dark pattern instances, together representing 15 types and 7 broader categories. We examine these dark patterns for deceptive practices, and find 183 websites that engage in such practices. We also uncover 22 third-party entities that offer dark patterns as a turnkey solution. Finally, we develop a taxonomy of dark pattern characteristics that describes the underlying influence of the dark patterns and their potential harm on user decision-making. Based on our findings, we make recommendations for stakeholders including researchers and regulators to study, mitigate, and minimize the use of these patterns.
- Comment
- 32 pages, 11 figures, ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2019)
- Published
- 2019
- Full Text
- View/download PDF
42. Keeping the Smart Home Private with Smart(er) IoT Traffic Shaping
- Author
- Apthorpe, Noah, Huang, Danny Yuxing, Reisman, Dillon, Narayanan, Arvind, and Feamster, Nick
- Subjects
Computer Science - Cryptography and Security
- Abstract
The proliferation of smart home Internet of Things (IoT) devices presents unprecedented challenges for preserving privacy within the home. In this paper, we demonstrate that a passive network observer (e.g., an Internet service provider) can infer private in-home activities by analyzing Internet traffic from commercially available smart home devices even when the devices use end-to-end transport-layer encryption. We evaluate common approaches for defending against these types of traffic analysis attacks, including firewalls, virtual private networks, and independent link padding, and find that none sufficiently conceal user activities with reasonable data overhead. We develop a new defense, "stochastic traffic padding" (STP), that makes it difficult for a passive network adversary to reliably distinguish genuine user activities from generated traffic patterns designed to look like user interactions. Our analysis provides a theoretical bound on an adversary's ability to accurately detect genuine user activities as a function of the amount of additional cover traffic generated by the defense technique.
- Comment
- 21 pages, 9 figures, 4 tables. This article draws heavily from arXiv:1705.06805, arXiv:1705.06809, and arXiv:1708.05044. Camera-ready version
- Published
- 2018
- Full Text
- View/download PDF
43. Correction to: An In-Depth Measurement Analysis of 5G mmWave PHY Latency and Its Impact on End-to-End Delay
- Author
- Fezeu, Rostand A. K., Ramadan, Eman, Ye, Wei, Minneci, Benjamin, Xie, Jack, Narayanan, Arvind, Hassan, Ahmad, Qian, Feng, Zhang, Zhi-Li, Chandrashekar, Jaideep, and Lee, Myungjin
- Published
- 2023
- Full Text
- View/download PDF
44. An In-Depth Measurement Analysis of 5G mmWave PHY Latency and Its Impact on End-to-End Delay
- Author
- Fezeu, Rostand A. K., Ramadan, Eman, Ye, Wei, Minneci, Benjamin, Xie, Jack, Narayanan, Arvind, Hassan, Ahmad, Qian, Feng, Zhang, Zhi-Li, Chandrashekar, Jaideep, and Lee, Myungjin
- Published
- 2023
- Full Text
- View/download PDF
45. Resurrecting Address Clustering in Bitcoin
- Author
- Möser, Malte and Narayanan, Arvind; edited by Eyal, Ittay and Garay, Juan
- Published
- 2022
- Full Text
- View/download PDF
46. REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
- Author
- Wang, Angelina, Liu, Alexander, Zhang, Ryan, Kleiman, Anat, Kim, Leslie, Zhao, Dora, Shirai, Iroha, Narayanan, Arvind, and Russakovsky, Olga
- Published
- 2022
- Full Text
- View/download PDF
47. Formal Barriers to Longest-Chain Proof-of-Stake Protocols
- Author
- Brown-Cohen, Jonah, Narayanan, Arvind, Psomas, Christos-Alexandros, and Weinberg, S. Matthew
- Subjects
Computer Science - Computer Science and Game Theory, Computer Science - Cryptography and Security
- Abstract
The security of most existing cryptocurrencies is based on a concept called Proof-of-Work, in which users must solve a computationally hard cryptopuzzle to authorize transactions ('one unit of computation, one vote'). This leads to enormous expenditure on hardware and electricity in order to collect the rewards associated with transaction authorization. Proof-of-Stake is an alternative concept that instead selects users to authorize transactions proportional to their wealth ('one coin, one vote'). Some aspects of the two paradigms are the same. For instance, obtaining voting power in Proof-of-Stake has a monetary cost just as in Proof-of-Work: a coin cannot be freely duplicated any more easily than a unit of computation. However, some aspects are fundamentally different. In particular, exactly because Proof-of-Stake is wasteless, there is no inherent resource cost to deviating (commonly referred to as the 'Nothing-at-Stake' problem). In contrast to prior work, we focus on incentive-driven deviations (any participant will deviate if doing so yields higher revenue) instead of adversarial corruption (an adversary may take over a significant fraction of the network, but the remaining players follow the protocol). The main results of this paper are several formal barriers to designing incentive-compatible Proof-of-Stake cryptocurrencies (that don't apply to Proof-of-Work).
- Published
- 2018
48. Endorsements on Social Media: An Empirical Study of Affiliate Marketing Disclosures on YouTube and Pinterest
- Author
- Mathur, Arunesh, Narayanan, Arvind, and Chetty, Marshini
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Computers and Society, Computer Science - Social and Information Networks
- Abstract
Online advertisements that masquerade as non-advertising content pose numerous risks to users. Such hidden advertisements appear on social media platforms when content creators or "influencers" endorse products and brands in their content. While the Federal Trade Commission (FTC) requires content creators to disclose their endorsements in order to prevent deception and harm to users, we do not know whether and how content creators comply with the FTC's guidelines. In this paper, we studied disclosures within affiliate marketing, an endorsement-based advertising strategy used by social media content creators. We examined whether content creators follow the FTC's disclosure guidelines, how they word the disclosures, and whether these disclosures help users identify affiliate marketing content as advertisements. To do so, we first measured the prevalence of and identified the types of disclosures in over 500,000 YouTube videos and 2.1 million Pinterest pins. We then conducted a user study with 1,791 participants to test the efficacy of these disclosures. Our findings reveal that only about 10% of affiliate marketing content on both platforms contains any disclosures at all. Further, users fail to understand shorter, non-explanatory disclosures. Based on our findings, we make various design and policy suggestions to help improve advertising disclosure practices on social media platforms.
- Comment
- 26 pages, 6 figures, ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2018)
- Published
- 2018
- Full Text
- View/download PDF
49. Privacy, ethics, and data access: A case study of the Fragile Families Challenge
- Author
- Lundberg, Ian, Narayanan, Arvind, Levy, Karen, and Salganik, Matthew J.
- Subjects
Computer Science - Computers and Society
- Abstract
Stewards of social science data face a fundamental tension. On one hand, they want to make their data accessible to as many researchers as possible to facilitate new discoveries. At the same time, they want to restrict access to their data as much as possible in order to protect the people represented in the data. In this paper, we provide a case study addressing this common tension in an uncommon setting: the Fragile Families Challenge, a scientific mass collaboration designed to yield insights that could improve the lives of disadvantaged children in the United States. We describe our process of threat modeling, threat mitigation, and third-party guidance. We also describe the ethical principles that formed the basis of our process. We are open about our process and the trade-offs that we made in the hopes that others can improve on what we have done.
- Comment
- 60 pages, 9 figures, 1 table
- Published
- 2018
- Full Text
- View/download PDF
50. An Empirical Study of Affiliate Marketing Disclosures on YouTube and Pinterest
- Author
- Mathur, Arunesh, Narayanan, Arvind, and Chetty, Marshini
- Subjects
Computer Science - Social and Information Networks, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction
- Abstract
While disclosures relating to various forms of Internet advertising are well established and follow specific formats, endorsement marketing disclosures are often open-ended in nature and written by individual publishers. Because such marketing often appears as part of publishers' actual content, ensuring that it is adequately disclosed is critical so that end-users can identify it as such. In this paper, we characterize disclosures relating to affiliate marketing -- a type of endorsement-based marketing -- on two popular social media platforms: YouTube and Pinterest. We find that only roughly one-tenth of affiliate content on both platforms contains disclosures. Based on our findings, we make policy recommendations geared towards various stakeholders in the affiliate marketing industry, highlighting how both social media platforms and affiliate companies can enable better disclosure practices.
- Published
- 2018