Author: "Li, Toby Jia-Jun" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Li, Toby Jia-Jun"' showing total 141 results

Start Over Author "Li, Toby Jia-Jun"

141 results on '"Li, Toby Jia-Jun"'

1. Limitations of the LLM-as-a-Judge Approach for Evaluating LLM Outputs in Expert Knowledge Tasks

Author: Szymanski, Annalisa, Ziems, Noah, Eicher-Miller, Heather A., Li, Toby Jia-Jun, Jiang, Meng, and Metoyer, Ronald A.
Subjects: Computer Science - Human-Computer Interaction
Abstract: The potential of using Large Language Models (LLMs) themselves to evaluate LLM outputs offers a promising method for assessing model performance across various contexts. Previous research indicates that LLM-as-a-judge exhibits a strong correlation with human judges in the context of general instruction following. However, for instructions that require specialized knowledge, the validity of using LLMs as judges remains uncertain. In our study, we applied a mixed-methods approach, conducting pairwise comparisons in which both subject matter experts (SMEs) and LLMs evaluated outputs from domain-specific tasks. We focused on two distinct fields: dietetics, with registered dietitian experts, and mental health, with clinical psychologist experts. Our results showed that SMEs agreed with LLM judges 68% of the time in the dietetics domain and 64% in mental health when evaluating overall preference. Additionally, the results indicated variations in SME-LLM agreement across domain-specific aspect questions. Our findings emphasize the importance of keeping human experts in the evaluation process, as LLMs alone may not provide the depth of understanding required for complex, knowledge specific tasks. We also explore the implications of LLM evaluations across different domains and discuss how these insights can inform the design of evaluation workflows that ensure better alignment between human experts and LLMs in interactive systems.
Published: 2024

2. ChartifyText: Automated Chart Generation from Data-Involved Texts via LLM

Author: Zhang, Songheng, Wang, Lei, Li, Toby Jia-Jun, Shen, Qiaomu, Cao, Yixin, and Wang, Yong
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Information Retrieval
Abstract: Text documents with numerical values involved are widely used in various applications such as scientific research, economy, public health and journalism. However, it is difficult for readers to quickly interpret such data-involved texts and gain deep insights. To fill this research gap, this work aims to automatically generate charts to accurately convey the underlying data and ideas to readers, which is essentially a challenging task. The challenges originate from text ambiguities, intrinsic sparsity and uncertainty of data in text documents, and subjective sentiment differences. Specifically, we propose ChartifyText, a novel fully-automated approach that leverages Large Language Models (LLMs) to convert complex data-involved texts to expressive charts. It consists of two major modules: tabular data inference and expressive chart generation. The tabular data inference module employs systematic prompt engineering to guide the LLM (e.g., GPT-4) to infer table data, where data ranges, uncertainties, missing data values and corresponding subjective sentiments are explicitly considered. The expressive chart generation module augments standard charts with intuitive visual encodings and concise texts to accurately convey the underlying data and insights. We extensively evaluate the effectiveness of ChartifyText on real-world data-involved text documents through case studies, in-depth interviews with three visualization experts, and a carefully-designed user study with 15 participants. The results demonstrate the usefulness and effectiveness of ChartifyText in helping readers efficiently and effectively make sense of data-involved texts.
Published: 2024

3. CLEAR: Towards Contextual LLM-Empowered Privacy Policy Analysis and Risk Generation for Large Language Model Applications

Author: Chen, Chaoran, Zhou, Daodao, Ye, Yanfang, Li, Toby Jia-jun, and Yao, Yaxing
Subjects: Computer Science - Human-Computer Interaction
Abstract: The rise of end-user applications powered by large language models (LLMs), including both conversational interfaces and add-ons to existing graphical user interfaces (GUIs), introduces new privacy challenges. However, many users remain unaware of the risks. This paper explores methods to increase user awareness of privacy risks associated with LLMs in end-user applications. We conducted five co-design workshops to uncover user privacy concerns and their demand for contextual privacy information within LLMs. Based on these insights, we developed CLEAR (Contextual LLM-Empowered Privacy Policy Analysis and Risk Generation), a just-in-time contextual assistant designed to help users identify sensitive information, summarize relevant privacy policies, and highlight potential risks when sharing information with LLMs. We evaluated the usability and usefulness of CLEAR across in two example domains: ChatGPT and the Gemini plugin in Gmail. Our findings demonstrated that CLEAR is easy to use and improves user understanding of data practices and privacy risks. We also discussed LLM's duality in posing and mitigating privacy risks, offering design and policy implications.
Published: 2024

4. Careful About What App Promotion Ads Recommend! Detecting and Explaining Malware Promotion via App Promotion Graph

Author: Ma, Shang, Chen, Chaoran, Yang, Shao, Hou, Shifu, Li, Toby Jia-Jun, Xiao, Xusheng, Xie, Tao, and Ye, Yanfang
Subjects: Computer Science - Cryptography and Security, Computer Science - Computers and Society
Abstract: In Android apps, their developers frequently place app promotion ads, namely advertisements to promote other apps. Unfortunately, the inadequate vetting of ad content allows malicious developers to exploit app promotion ads as a new distribution channel for malware. To help detect malware distributed via app promotion ads, in this paper, we propose a novel approach, named ADGPE, that synergistically integrates app user interface (UI) exploration with graph learning to automatically collect app promotion ads, detect malware promoted by these ads, and explain the promotion mechanisms employed by the detected malware. Our evaluation on 18, 627 app promotion ads demonstrates the substantial risks in the app promotion ecosystem., Comment: NDSS Symposium 2025 Accepted Papers
Published: 2024

5. Why am I seeing this: Democratizing End User Auditing for Online Content Recommendations

Author: Chen, Chaoran, Li, Leyang, Cao, Luke, Ye, Yanfang, Li, Tianshi, Yao, Yaxing, and Li, Toby Jia-jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Personalized recommendation systems tailor content based on user attributes, which are either provided or inferred from private data. Research suggests that users often hypothesize about reasons behind contents they encounter (e.g., "I see this jewelry ad because I am a woman"), but they lack the means to confirm these hypotheses due to the opaqueness of these systems. This hinders informed decision-making about privacy and system use and contributes to the lack of algorithmic accountability. To address these challenges, we introduce a new interactive sandbox approach. This approach creates sets of synthetic user personas and corresponding personal data that embody realistic variations in personal attributes, allowing users to test their hypotheses by observing how a website's algorithms respond to these personas. We tested the sandbox in the context of targeted advertisement. Our user study demonstrates its usability, usefulness, and effectiveness in empowering end-user auditing in a case study of targeting ads.
Published: 2024

6. Comparing Criteria Development Across Domain Experts, Lay Users, and Models in Large Language Model Evaluation

Author: Szymanski, Annalisa, Gebreegziabher, Simret Araya, Anuyah, Oghenemaro, Metoyer, Ronald A., and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Large Language Models (LLMs) are increasingly utilized for domain-specific tasks, yet integrating domain expertise into evaluating their outputs remains challenging. A common approach to evaluating LLMs is to use metrics, or criteria, which are assertions used to assess performance that help ensure that their outputs align with domain-specific standards. Previous efforts have involved developers, lay users, or the LLMs themselves in creating these criteria, however, evaluation particularly from a domain expertise perspective, remains understudied. This study explores how domain experts contribute to LLM evaluation by comparing their criteria with those generated by LLMs and lay users. We further investigate how the criteria-setting process evolves, analyzing changes between a priori and a posteriori stages. Our findings emphasize the importance of involving domain experts early in the evaluation process while utilizing complementary strengths of lay users and LLMs. We suggest implications for designing workflows that leverage these strengths at different evaluation stages.
Published: 2024

7. Supporting Co-Adaptive Machine Teaching through Human Concept Learning and Cognitive Theories

Author: Gebreegziabher, Simret Araya, Yang, Yukun, Glassman, Elena L., and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: An important challenge in interactive machine learning, particularly in subjective or ambiguous domains, is fostering bi-directional alignment between humans and models. Users teach models their concept definition through data labeling, while refining their own understandings throughout the process. To facilitate this, we introduce MOCHA, an interactive machine learning tool informed by two theories of human concept learning and cognition. First, it utilizes a neuro-symbolic pipeline to support Variation Theory-based counterfactual data generation. By asking users to annotate counterexamples that are syntactically and semantically similar to already-annotated data but predicted to have different labels, the system can learn more effectively while helping users understand the model and reflect on their own label definitions. Second, MOCHA uses Structural Alignment Theory to present groups of counterexamples, helping users comprehend alignable differences between data items and annotate them in batch. We validated MOCHA's effectiveness and usability through a lab study with 18 participants.
Published: 2024

8. LADICA: A Large Shared Display Interface for Generative AI Cognitive Assistance in Co-Located Team Collaboration

Author: Zhang, Zheng, Peng, Weirui, Chen, Xinyue, Cao, Luke, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Large shared displays, such as digital whiteboards, are useful for supporting co-located team collaborations by helping members perform cognitive tasks such as brainstorming, organizing ideas, and making comparisons. While recent advancement in Large Language Models (LLMs) has catalyzed AI support for these displays, most existing systems either only offer limited capabilities or diminish human control, neglecting the potential benefits of natural group dynamics. Our formative study identified cognitive challenges teams encounter, such as diverse ideation, knowledge sharing, mutual awareness, idea organization, and synchronization of live discussions with the external workspace. In response, we introduce LADICA, a large shared display interface that helps collaborative teams brainstorm, organize, and analyze ideas through multiple analytical lenses, while fostering mutual awareness of ideas and concepts. Furthermore, LADICA facilitates the real-time extraction of key information from verbal discussions and identifies relevant entities. A lab study confirmed LADICA's usability and usefulness., Comment: 21 pages
Published: 2024

9. SQLucid: Grounding Natural Language Database Queries with Interactive Explanations

Author: Tian, Yuan, Kummerfeld, Jonathan K., Li, Toby Jia-Jun, and Zhang, Tianyi
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Computation and Language
Abstract: Though recent advances in machine learning have led to significant improvements in natural language interfaces for databases, the accuracy and reliability of these systems remain limited, especially in high-stakes domains. This paper introduces SQLucid, a novel user interface that bridges the gap between non-expert users and complex database querying processes. SQLucid addresses existing limitations by integrating visual correspondence, intermediate query results, and editable step-by-step SQL explanations in natural language to facilitate user understanding and engagement. This unique blend of features empowers users to understand and refine SQL queries easily and precisely. Two user studies and one quantitative experiment were conducted to validate SQLucid's effectiveness, showing significant improvement in task completion accuracy and user confidence compared to existing interfaces. Our code is available at https://github.com/magic-YuanTian/SQLucid., Comment: Accepted to UIST'24
Published: 2024

10. Context-aware Code Summary Generation

Author: Su, Chia-Yi, Bansal, Aakash, Huang, Yu, Li, Toby Jia-Jun, and McMillan, Collin
Subjects: Computer Science - Software Engineering
Abstract: Code summary generation is the task of writing natural language descriptions of a section of source code. Recent advances in Large Language Models (LLMs) and other AI-based technologies have helped make automatic code summarization a reality. However, the summaries these approaches write tend to focus on a narrow area of code. The results are summaries that explain what that function does internally, but lack a description of why the function exists or its purpose in the broader context of the program. In this paper, we present an approach for including this context in recent LLM-based code summarization. The input to our approach is a Java method and that project in which that method exists. The output is a succinct English description of why the method exists in the project. The core of our approach is a 350m parameter language model we train, which can be run locally to ensure privacy. We train the model in two steps. First we distill knowledge about code summarization from a large model, then we fine-tune the model using data from a study of human programmer who were asked to write code summaries. We find that our approach outperforms GPT-4 on this task., Comment: 21 pages, 5 figures, preprint under review
Published: 2024

11. Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning

Author: Gebreegziabher, Simret Araya, Ai, Kuangshi, Zhang, Zheng, Glassman, Elena L., and Li, Toby Jia-Jun
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Human-Computer Interaction
Abstract: Active Learning (AL) allows models to learn interactively from user feedback. This paper introduces a counterfactual data augmentation approach to AL, particularly addressing the selection of datapoints for user querying, a pivotal concern in enhancing data efficiency. Our approach is inspired by Variation Theory, a theory of human concept learning that emphasizes the essential features of a concept by focusing on what stays the same and what changes. Instead of just querying with existing datapoints, our approach synthesizes artificial datapoints that highlight potential key similarities and differences among labels using a neuro-symbolic pipeline combining large language models (LLMs) and rule-based models. Through an experiment in the example domain of text classification, we show that our approach achieves significantly higher performance when there are fewer annotated data. As the annotated training data gets larger the impact of the generated data starts to diminish showing its capability to address the cold start problem in AL. This research sheds light on integrating theories of human learning into the optimization of AL.
Published: 2024

12. Flowy: Supporting UX Design Decisions Through AI-Driven Pattern Annotation in Multi-Screen User Flows

Author: Lu, Yuwen, Tong, Ziang, Zhao, Qinyi, Oh, Yewon, Wang, Bryan, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Many recent AI-powered UX design tools focus on generating individual static UI screens from natural language. However, they overlook the crucial aspect of interactions and user experiences across multiple screens. Through formative studies with UX professionals, we identified limitations of these tools in supporting realistic UX design workflows. In response, we designed and developed Flowy, an app that augments designers' information foraging process in ideation by supplementing specific user flow examples with distilled design pattern knowledge. Flowy utilizes large multimodal AI models and a high-quality user flow dataset to help designers identify and understand relevant abstract design patterns in the design space for multi-screen user flows. Our user study with professional UX designers demonstrates how Flowy supports realistic UX tasks. Our design considerations in Flowy, such as representations with appropriate levels of abstraction and assisted navigation through the solution space, are generalizable to other creative tasks and embody a human-centered, intelligence augmentation approach to using AI in UX design.
Published: 2024

13. Crepe: A Mobile Screen Data Collector Using Graph Query

Author: Lu, Yuwen, Chen, Meng, Zhao, Qi, Cox, Victor, Yang, Yang, Jiang, Meng, Brockman, Jay, Kay, Tamara, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Collecting mobile datasets remains challenging for academic researchers due to limited data access and technical barriers. Commercial organizations often possess exclusive access to mobile data, leading to a "data monopoly" that restricts the independence of academic research. Existing open-source mobile data collection frameworks primarily focus on mobile sensing data rather than screen content, which is crucial for various research studies. We present Crepe, a no-code Android app that enables researchers to collect information displayed on screen through simple demonstrations of target data. Crepe utilizes a novel Graph Query technique which augments the structures of mobile UI screens to support flexible identification, location, and collection of specific data pieces. The tool emphasizes participants' privacy and agency by providing full transparency over collected data and allowing easy opt-out. We designed and built Crepe for research purposes only and in scenarios where researchers obtain explicit consent from participants. Code for Crepe will be open-sourced to support future academic research data collection.
Published: 2024

14. Programmer Visual Attention During Context-Aware Code Summarization

Author: Bansal, Aakash, Wallace, Robert, Karas, Zachary, Tang, Ningzhi, Huang, Yu, Li, Toby Jia-Jun, and McMillan, Collin
Subjects: Computer Science - Software Engineering
Abstract: Abridged: Programmer attention represents the visual focus of programmers on parts of the source code in pursuit of programming tasks. We conducted an in-depth human study with XY Java programmers, where each programmer generated summaries for 40 methods from five large Java projects over five one-hour sessions. We used eye-tracking equipment to map the visual attention of programmers while they wrote the summaries. We also rate the quality of each summary. We found eye-gaze patterns and metrics that define common behaviors between programmer attention during context-aware code summarization. Specifically, we found that programmers need to read significantly (p<0.01) fewer words and make significantly fewer revisits to words (p\textless0.03) as they summarize more methods during a session, while maintaining the quality of summaries. We also found that the amount of source code a participant looks at correlates with a higher quality summary, but this trend follows a bell-shaped curve, such that after a threshold reading more source code leads to a significant decrease (p<0.01) in the quality of summaries. We also gathered insight into the type of methods in the project that provide the most contextual information for code summarization based on programmer attention. Specifically, we observed that programmers spent a majority of their time looking at methods inside the same class as the target method to be summarized. Surprisingly, we found that programmers spent significantly less time looking at methods in the call graph of the target method. We discuss how our empirical observations may aid future studies towards modeling programmer attention and improving context-aware automatic source code summarization., Comment: 10 pages, 4 figures, 4 tables. this is a pre-print submitted to IEEE Transactions on Software Engineering for review
Published: 2024

15. A Study on Developer Behaviors for Validating and Repairing LLM-Generated Code Using Eye Tracking and IDE Actions

Author: Tang, Ningzhi, Chen, Meng, Ning, Zheng, Bansal, Aakash, Huang, Yu, McMillan, Collin, and Li, Toby Jia-Jun
Subjects: Computer Science - Software Engineering, Computer Science - Human-Computer Interaction
Abstract: The increasing use of large language model (LLM)-powered code generation tools, such as GitHub Copilot, is transforming software engineering practices. This paper investigates how developers validate and repair code generated by Copilot and examines the impact of code provenance awareness during these processes. We conducted a lab study with 28 participants, who were tasked with validating and repairing Copilot-generated code in three software projects. Participants were randomly divided into two groups: one informed about the provenance of LLM-generated code and the other not. We collected data on IDE interactions, eye-tracking, cognitive workload assessments, and conducted semi-structured interviews. Our results indicate that, without explicit information, developers often fail to identify the LLM origin of the code. Developers generally employ similar validation and repair strategies for LLM-generated code, but exhibit behaviors such as frequent switching between code and comments, different attentional focus, and a tendency to delete and rewrite code. Being aware of the code's provenance led to improved performance, increased search efforts, more frequent Copilot usage, and higher cognitive workload. These findings enhance our understanding of how developers interact with LLM-generated code and carry implications for designing tools that facilitate effective human-LLM collaboration in software development.
Published: 2024

16. CoCo Matrix: Taxonomy of Cognitive Contributions in Co-writing with Intelligent Agents

Author: Wan, Ruyuan, Gebreegziabhe, Simret, Li, Toby Jia-Jun, and Badillo-Urquiola, Karla
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: In recent years, there has been a growing interest in employing intelligent agents in writing. Previous work emphasizes the evaluation of the quality of end product-whether it was coherent and polished, overlooking the journey that led to the product, which is an invaluable dimension of the creative process. To understand how to recognize human efforts in co-writing with intelligent writing systems, we adapt Flower and Hayes' cognitive process theory of writing and propose CoCo Matrix, a two-dimensional taxonomy of entropy and information gain, to depict the new human-agent co-writing model. We define four quadrants and situate thirty-four published systems within the taxonomy. Our research found that low entropy and high information gain systems are under-explored, yet offer promising future directions in writing tasks that benefit from the agent's divergent planning and the human's focused translation. CoCo Matrix, not only categorizes different writing systems but also deepens our understanding of the cognitive processes in human-agent co-writing. By analyzing minimal changes in the writing process, CoCo Matrix serves as a proxy for the writer's mental model, allowing writers to reflect on their contributions. This reflection is facilitated through the measured metrics of information gain and entropy, which provide insights irrespective of the writing system used.
Published: 2024
Full Text: View/download PDF

17. MIMOSA: Human-AI Co-Creation of Computational Spatial Audio Effects on Videos

Author: Ning, Zheng, Zhang, Zheng, Ban, Jerrick, Jiang, Kaiwen, Gan, Ruohong, Tian, Yapeng, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Multimedia
Abstract: Spatial audio offers more immersive video consumption experiences to viewers; however, creating and editing spatial audio often expensive and requires specialized equipment and skills, posing a high barrier for amateur video creators. We present MIMOSA, a human-AI co-creation tool that enables amateur users to computationally generate and manipulate spatial audio effects. For a video with only monaural or stereo audio, MIMOSA automatically grounds each sound source to the corresponding sounding object in the visual scene and enables users to further validate and fix the errors in the locations of sounding objects. Users can also augment the spatial audio effect by flexibly manipulating the sounding source positions and creatively customizing the audio effect. The design of MIMOSA exemplifies a human-AI collaboration approach that, instead of utilizing state-of art end-to-end "black-box" ML models, uses a multistep pipeline that aligns its interpretable intermediate results with the user's workflow. A lab user study with 15 participants demonstrates MIMOSA's usability, usefulness, expressiveness, and capability in creating immersive spatial audio effects in collaboration with users.
Published: 2024
Full Text: View/download PDF

18. A Taxonomy for Human-LLM Interaction Modes: An Initial Exploration

Author: Gao, Jie, Gebreegziabher, Simret Araya, Choo, Kenny Tsu Wei, Li, Toby Jia-Jun, Perrault, Simon Tangi, and Malone, Thomas W.
Subjects: Computer Science - Human-Computer Interaction
Abstract: With ChatGPT's release, conversational prompting has become the most popular form of human-LLM interaction. However, its effectiveness is limited for more complex tasks involving reasoning, creativity, and iteration. Through a systematic analysis of HCI papers published since 2021, we identified four key phases in the human-LLM interaction flow - planning, facilitating, iterating, and testing - to precisely understand the dynamics of this process. Additionally, we have developed a taxonomy of four primary interaction modes: Mode 1: Standard Prompting, Mode 2: User Interface, Mode 3: Context-based, and Mode 4: Agent Facilitator. This taxonomy was further enriched using the "5W1H" guideline method, which involved a detailed examination of definitions, participant roles (Who), the phases that happened (When), human objectives and LLM abilities (What), and the mechanics of each interaction mode (How). We anticipate this taxonomy will contribute to the future design and evaluation of human-LLM interaction., Comment: 11 pages, 4 figures, 3 tables. Accepted at CHI Late-Breaking Work 2024
Published: 2024
Full Text: View/download PDF

19. 'I'm categorizing LLM as a productivity tool': Examining ethics of LLM use in HCI research practices

Author: Kapania, Shivani, Wang, Ruiyi, Li, Toby Jia-Jun, Li, Tianshi, and Shen, Hong
Subjects: Computer Science - Human-Computer Interaction
Abstract: Large language models are increasingly applied in real-world scenarios, including research and education. These models, however, come with well-known ethical issues, which may manifest in unexpected ways in human-computer interaction research due to the extensive engagement with human subjects. This paper reports on research practices related to LLM use, drawing on 16 semi-structured interviews and a survey conducted with 50 HCI researchers. We discuss the ways in which LLMs are already being utilized throughout the entire HCI research pipeline, from ideation to system development and paper writing. While researchers described nuanced understandings of ethical issues, they were rarely or only partially able to identify and address those ethical concerns in their own projects. This lack of action and reliance on workarounds was explained through the perceived lack of control and distributed responsibility in the LLM supply chain, the conditional nature of engaging with ethics, and competing priorities. Finally, we reflect on the implications of our findings and present opportunities to shape emerging norms of engaging with large language models in HCI research.
Published: 2024

20. EyeTrans: Merging Human and Machine Attention for Neural Code Summarization

Author: Zhang, Yifan, Li, Jiliang, Karas, Zachary, Bansal, Aakash, Li, Toby Jia-Jun, McMillan, Collin, Leach, Kevin, and Huang, Yu
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Neural code summarization leverages deep learning models to automatically generate brief natural language summaries of code snippets. The development of Transformer models has led to extensive use of attention during model design. While existing work has primarily and almost exclusively focused on static properties of source code and related structural representations like the Abstract Syntax Tree (AST), few studies have considered human attention, that is, where programmers focus while examining and comprehending code. In this paper, we develop a method for incorporating human attention into machine attention to enhance neural code summarization. To facilitate this incorporation and vindicate this hypothesis, we introduce EyeTrans, which consists of three steps: (1) we conduct an extensive eye-tracking human study to collect and pre-analyze data for model training, (2) we devise a data-centric approach to integrate human attention with machine attention in the Transformer architecture, and (3) we conduct comprehensive experiments on two code summarization tasks to demonstrate the effectiveness of incorporating human attention into Transformers. Integrating human attention leads to an improvement of up to 29.91% in Functional Summarization and up to 6.39% in General Code Summarization performance, demonstrating the substantial benefits of this combination. We further explore performance in terms of robustness and efficiency by creating challenging summarization scenarios in which EyeTrans exhibits interesting properties. We also visualize the attention map to depict the simplifying effect of machine attention in the Transformer by incorporating human attention. This work has the potential to propel AI research in software engineering by introducing more human-centered approaches and data.
Published: 2024
Full Text: View/download PDF

21. SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers

Author: Ning, Zheng, Wimer, Brianna L., Jiang, Kaiwen, Chen, Keyi, Ban, Jerrick, Tian, Yapeng, Zhao, Yuhang, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Multimedia
Abstract: Blind or Low-Vision (BLV) users often rely on audio descriptions (AD) to access video content. However, conventional static ADs can leave out detailed information in videos, impose a high mental load, neglect the diverse needs and preferences of BLV users, and lack immersion. To tackle these challenges, we introduce SPICA, an AI-powered system that enables BLV users to interactively explore video content. Informed by prior empirical studies on BLV video consumption, SPICA offers novel interactive mechanisms for supporting temporal navigation of frame captions and spatial exploration of objects within key frames. Leveraging an audio-visual machine learning pipeline, SPICA augments existing ADs by adding interactivity, spatial sound effects, and individual object descriptions without requiring additional human annotation. Through a user study with 14 BLV participants, we evaluated the usability and usefulness of SPICA and explored user behaviors, preferences, and mental models when interacting with augmented ADs.
Published: 2024

22. AI Assistance for UX: A Literature Review Through Human-Centered AI

Author: Lu, Yuwen, Yang, Yuewen, Zhao, Qinyi, Zhang, Chengzhi, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Recent advancements in HCI and AI research attempt to support user experience (UX) practitioners with AI-enabled tools. Despite the potential of emerging models and new interaction mechanisms, mainstream adoption of such tools remains limited. We took the lens of Human-Centered AI and presented a systematic literature review of 359 papers, aiming to synthesize the current landscape, identify trends, and uncover UX practitioners' unmet needs in AI support. Guided by the Double Diamond design framework, our analysis uncovered that UX practitioners' unique focuses on empathy building and experiences across UI screens are often overlooked. Simplistic AI automation can obstruct the valuable empathy-building process. Furthermore, focusing solely on individual UI screens without considering interactions and user flows reduces the system's practical value for UX designers. Based on these findings, we call for a deeper understanding of UX mindsets and more designer-centric datasets and evaluation metrics, for HCI and AI communities to collaboratively work toward effective AI support for UX.
Published: 2024

23. Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks

Author: Lu, Yuxuan, Yao, Bingsheng, Zhang, Shao, Wang, Yun, Zhang, Peng, Lu, Tun, Li, Toby Jia-Jun, and Wang, Dakuo
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) have demonstrated considerable advances, and several claims have been made about their exceeding human performance. However, in real-world tasks, domain knowledge is often required. Low-resource learning methods like Active Learning (AL) have been proposed to tackle the cost of domain expert annotation, raising this question: Can LLMs surpass compact models trained with expert annotations in domain-specific tasks? In this work, we conduct an empirical experiment on four datasets from three different domains comparing SOTA LLMs with small models trained on expert annotations with AL. We found that small models can outperform GPT-3.5 with a few hundreds of labeled data, and they achieve higher or similar performance with GPT-4 despite that they are hundreds time smaller. Based on these findings, we posit that LLM predictions can be used as a warmup method in real-world applications and human experts remain indispensable in tasks involving data annotation driven by domain-specific knowledge.
Published: 2023

24. From Awareness to Action: Exploring End-User Empowerment Interventions for Dark Patterns in UX

Author: Lu, Yuwen, Zhang, Chao, Yang, Yuewen, Yao, Yaxing, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: The study of UX dark patterns, i.e., UI designs that seek to manipulate user behaviors, often for the benefit of online services, has drawn significant attention in the CHI and CSCW communities in recent years. To complement previous studies in addressing dark patterns from (1) the designer's perspective on education and advocacy for ethical designs; and (2) the policymaker's perspective on new regulations, we propose an end-user-empowerment intervention approach that helps users (1) raise the awareness of dark patterns and understand their underlying design intents; (2) take actions to counter the effects of dark patterns using a web augmentation approach. Through a two-phase co-design study, including 5 co-design workshops (N=12) and a 2-week technology probe study (N=15), we reported findings on the understanding of users' needs, preferences, and challenges in handling dark patterns and investigated the feedback and reactions to users' awareness of and action on dark patterns being empowered in a realistic in-situ setting., Comment: Accepted to CSCW 2024
Published: 2023

25. UI Layout Generation with LLMs Guided by UI Grammar

Author: Lu, Yuwen, Tong, Ziang, Zhao, Qinyi, Zhang, Chengzhi, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: The recent advances in Large Language Models (LLMs) have stimulated interest among researchers and industry professionals, particularly in their application to tasks concerning mobile user interfaces (UIs). This position paper investigates the use of LLMs for UI layout generation. Central to our exploration is the introduction of UI grammar -- a novel approach we proposed to represent the hierarchical structure inherent in UI screens. The aim of this approach is to guide the generative capacities of LLMs more effectively and improve the explainability and controllability of the process. Initial experiments conducted with GPT-4 showed the promising capability of LLMs to produce high-quality user interfaces via in-context learning. Furthermore, our preliminary comparative study suggested the potential of the grammar-based approach in improving the quality of generative results in specific aspects., Comment: ICML 2023 Workshop on AI and HCI
Published: 2023

26. Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation

Author: Suh, Sangho, Chen, Meng, Min, Bryan, Li, Toby Jia-Jun, and Xia, Haijun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: Thanks to their generative capabilities, large language models (LLMs) have become an invaluable tool for creative processes. These models have the capacity to produce hundreds and thousands of visual and textual outputs, offering abundant inspiration for creative endeavors. But are we harnessing their full potential? We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas, rather than empowering them to explore the vast latent design space in generative models. To address this limitation, we propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses. We demonstrate the feasibility and usefulness of this framework through the design and development of an interactive system, Luminate, and a user study with 14 professional writers. Our work advances how we interact with LLMs for creative tasks, introducing a way to harness the creative potential of LLMs.
Published: 2023
Full Text: View/download PDF

27. An Empathy-Based Sandbox Approach to Bridge the Privacy Gap among Attitudes, Goals, Knowledge, and Behaviors

Author: Chen, Chaoran, Li, Weijun, Song, Wenxin, Ye, Yanfang, Yao, Yaxing, and Li, Toby Jia-jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Managing privacy to reach privacy goals is challenging, as evidenced by the privacy attitude-behavior gap. Mitigating this discrepancy requires solutions that account for both system opaqueness and users' hesitations in testing different privacy settings due to fears of unintended data exposure. We introduce an empathy-based approach that allows users to experience how privacy attributes may alter system outcomes in a risk-free sandbox environment from the perspective of artificially generated personas. To generate realistic personas, we introduce a novel pipeline that augments the outputs of large language models (e.g., GPT-4) using few-shot learning, contextualization, and chain of thoughts. Our empirical studies demonstrated the adequate quality of generated personas and highlighted the changes in privacy-related applications (e.g., online advertising) caused by different personas. Furthermore, users demonstrated cognitive and emotional empathy towards the personas when interacting with our sandbox. We offered design implications for downstream applications in improving user privacy literacy.
Published: 2023
Full Text: View/download PDF

28. Impact of Human-AI Interaction on User Trust and Reliance in AI-Assisted Qualitative Coding

Author: Gao, Jie, Cao, Junming, Yeo, ShunYi, Choo, Kenny Tsu Wei, Zhang, Zheng, Li, Toby Jia-Jun, Zhao, Shengdong, and Perrault, Simon Tangi
Subjects: Computer Science - Human-Computer Interaction
Abstract: While AI shows promise for enhancing the efficiency of qualitative analysis, the unique human-AI interaction resulting from varied coding strategies makes it challenging to develop a trustworthy AI-assisted qualitative coding system (AIQCs) that supports coding tasks effectively. We bridge this gap by exploring the impact of varying coding strategies on user trust and reliance on AI. We conducted a mixed-methods split-plot 3x3 study, involving 30 participants, and a follow-up study with 6 participants, exploring varying text selection and code length in the use of our AIQCs system for qualitative analysis. Our results indicate that qualitative open coding should be conceptualized as a series of distinct subtasks, each with differing levels of complexity, and therefore, should be given tailored design considerations. We further observed a discrepancy between perceived and behavioral measures, and emphasized the potential challenges of under- and over-reliance on AIQCs systems. Additional design implications were also proposed for consideration., Comment: 27 pages with references, 9 figures, 5 tables
Published: 2023

29. AutoDroid: LLM-powered Task Automation in Android

Author: Wen, Hao, Li, Yuanchun, Liu, Guohong, Zhao, Shanhui, Yu, Tao, Li, Toby Jia-Jun, Jiang, Shiqi, Liu, Yunhao, Zhang, Yaqin, and Liu, Yunxin
Subjects: Computer Science - Artificial Intelligence, Computer Science - Software Engineering
Abstract: Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones. However, existing approaches suffer from poor scalability due to the limited language understanding ability and the non-trivial manual efforts required from developers or end-users. The recent advance of large language models (LLMs) in language understanding and reasoning inspires us to rethink the problem from a model-centric perspective, where task preparation, comprehension, and execution are handled by a unified language model. In this work, we introduce AutoDroid, a mobile task automation system capable of handling arbitrary tasks on any Android application without manual efforts. The key insight is to combine the commonsense knowledge of LLMs and domain-specific knowledge of apps through automated dynamic analysis. The main components include a functionality-aware UI representation method that bridges the UI with the LLM, exploration-based memory injection techniques that augment the app-specific domain knowledge of LLM, and a multi-granularity query optimization module that reduces the cost of model inference. We integrate AutoDroid with off-the-shelf LLMs including online GPT-4/GPT-3.5 and on-device Vicuna, and evaluate its performance on a new benchmark for memory-augmented Android task automation with 158 common tasks. The results demonstrated that AutoDroid is able to precisely generate actions with an accuracy of 90.9%, and complete tasks with a success rate of 71.3%, outperforming the GPT-4-powered baselines by 36.4% and 39.7%. The demo, benchmark suites, and source code of AutoDroid will be released at url{https://autodroid-sys.github.io/}., Comment: Published in MobiCom 2024; Original title: "Empowering LLM to use Smartphone for Intelligent Task Automation"
Published: 2023

30. Modeling Programmer Attention as Scanpath Prediction

Author: Bansal, Aakash, Su, Chia-Yi, Karas, Zachary, Zhang, Yifan, Huang, Yu, Li, Toby Jia-Jun, and McMillan, Collin
Subjects: Computer Science - Software Engineering, Computer Science - Human-Computer Interaction
Abstract: This paper launches a new effort at modeling programmer attention by predicting eye movement scanpaths. Programmer attention refers to what information people intake when performing programming tasks. Models of programmer attention refer to machine prediction of what information is important to people. Models of programmer attention are important because they help researchers build better interfaces, assistive technologies, and more human-like AI. For many years, researchers in SE have built these models based on features such as mouse clicks, key logging, and IDE interactions. Yet the holy grail in this area is scanpath prediction -- the prediction of the sequence of eye fixations a person would take over a visual stimulus. A person's eye movements are considered the most concrete evidence that a person is taking in a piece of information. Scanpath prediction is a notoriously difficult problem, but we believe that the emergence of lower-cost, higher-accuracy eye tracking equipment and better large language models of source code brings a solution within grasp. We present an eye tracking experiment with 27 programmers and a prototype scanpath predictor to present preliminary results and obtain early community feedback., Comment: Accepter at ASE2023 NIER Track. 4 pages + 1 page for references, 4 figures, 1 table
Published: 2023

31. PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data

Author: Zhang, Zheng, Ning, Zheng, Xu, Chenliang, Tian, Yapeng, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction
Abstract: Audio-visual learning seeks to enhance the computer's multi-modal perception leveraging the correlation between the auditory and visual modalities. Despite their many useful downstream tasks, such as video retrieval, AR/VR, and accessibility, the performance and adoption of existing audio-visual models have been impeded by the availability of high-quality datasets. Annotating audio-visual datasets is laborious, expensive, and time-consuming. To address this challenge, we designed and developed an efficient audio-visual annotation tool called Peanut. Peanut's human-AI collaborative pipeline separates the multi-modal task into two single-modal tasks, and utilizes state-of-the-art object detection and sound-tagging models to reduce the annotators' effort to process each frame and the number of manually-annotated frames needed. A within-subject user study with 20 participants found that Peanut can significantly accelerate the audio-visual data annotation process while maintaining high annotation accuracy., Comment: 18 pages, published in UIST'23
Published: 2023
Full Text: View/download PDF

32. Shaping the Emerging Norms of Using Large Language Models in Social Computing Research

Author: Shen, Hong, Li, Tianshi, Li, Toby Jia-Jun, Park, Joon Sung, and Yang, Diyi
Subjects: Computer Science - Human-Computer Interaction
Abstract: The emergence of Large Language Models (LLMs) has brought both excitement and concerns to social computing research. On the one hand, LLMs offer unprecedented capabilities in analyzing vast amounts of textual data and generating human-like responses, enabling researchers to delve into complex social phenomena. On the other hand, concerns are emerging regarding the validity, privacy, and ethics of the research when LLMs are involved. This SIG aims at offering an open space for social computing researchers who are interested in understanding the impacts of LLMs to discuss their current practices, perspectives, challenges when engaging with LLMs in their everyday work and collectively shaping the emerging norms of using LLMs in social computing research.
Published: 2023

33. Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations

Author: Tian, Yuan, Zhang, Zheng, Ning, Zheng, Li, Toby Jia-Jun, Kummerfeld, Jonathan K., and Zhang, Tianyi
Subjects: Computer Science - Databases, Computer Science - Computation and Language, I.2.7
Abstract: Relational databases play an important role in business, science, and more. However, many users cannot fully unleash the analytical power of relational databases, because they are not familiar with database languages such as SQL. Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors. Our experiments on multiple datasets, as well as a user study with 24 participants, demonstrate that our approach can achieve better performance than multiple SOTA approaches. Our code and datasets are available at https://github.com/magic-YuanTian/STEPS., Comment: Accepted to EMNLP 2023
Published: 2023

34. VISAR: A Human-AI Argumentative Writing Assistant with Visual Programming and Rapid Draft Prototyping

Author: Zhang, Zheng, Gao, Jie, Dhaliwal, Ranjodh Singh, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: In argumentative writing, writers must brainstorm hierarchical writing goals, ensure the persuasiveness of their arguments, and revise and organize their plans through drafting. Recent advances in large language models (LLMs) have made interactive text generation through a chat interface (e.g., ChatGPT) possible. However, this approach often neglects implicit writing context and user intent, lacks support for user control and autonomy, and provides limited assistance for sensemaking and revising writing plans. To address these challenges, we introduce VISAR, an AI-enabled writing assistant system designed to help writers brainstorm and revise hierarchical goals within their writing context, organize argument structures through synchronized text editing and visual programming, and enhance persuasiveness with argumentation spark recommendations. VISAR allows users to explore, experiment with, and validate their writing plans using automatic draft prototyping. A controlled lab study confirmed the usability and effectiveness of VISAR in facilitating the argumentative writing planning process., Comment: 30 pages, published in UIST'23
Published: 2023
Full Text: View/download PDF

35. CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models

Author: Gao, Jie, Guo, Yuchen, Lim, Gionnieve, Zhang, Tianqin, Zhang, Zheng, Li, Toby Jia-Jun, and Perrault, Simon Tangi
Subjects: Computer Science - Human-Computer Interaction
Abstract: Collaborative Qualitative Analysis (CQA) can enhance qualitative analysis rigor and depth by incorporating varied viewpoints. Nevertheless, ensuring a rigorous CQA procedure itself can be both demanding and costly. To lower this bar, we take a theoretical perspective to design the CollabCoder workflow, that integrates Large Language Models (LLMs) into key inductive CQA stages: independent open coding, iterative discussions, and final codebook creation. In the open coding phase, CollabCoder offers AI-generated code suggestions and records decision-making data. During discussions, it promotes mutual understanding by sharing this data within the coding team and using quantitative metrics to identify coding (dis)agreements, aiding in consensus-building. In the code grouping stage, CollabCoder provides primary code group suggestions, lightening the cognitive load of finalizing the codebook. A 16-user evaluation confirmed the effectiveness of CollabCoder, demonstrating its advantages over existing software and providing empirical insights into the role of LLMs in the CQA practice., Comment: Will be published at the ACM CHI Conference on Human Factors in Computing Systems (CHI'24)
Published: 2023

36. KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction

Author: Zhang, Shao, Jia, Yuting, Xu, Hui, Wang, Dakuo, Li, Toby Jia-jun, Wen, Ying, Wang, Xinbing, and Zhou, Chenghu
Subjects: Computer Science - Digital Libraries, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling Al-driven scientific discovery. However, the current process is difficult, error-prone, and laborious due to (1) the enormous amount of scientific literature available; (2) the highly-specialized scientific domains; (3) the diverse modalities of information (text, figure, table); and, (4) the silos of scientific knowledge in different publications with inconsistent formats and structures. Informed by a formative study and iterated with participatory design workshops, we designed and developed KnowledgeShovel, an Al-in-the-Loop document annotation system for researchers to construct scientific knowledge bases. The design of KnowledgeShovel introduces a multi-step multi-modal human-AI collaboration pipeline that aligns with users' existing workflows to improve data accuracy while reducing the human burden. A follow-up user evaluation with 7 geoscience researchers shows that KnowledgeShovel can enable efficient construction of scientific knowledge bases with satisfactory accuracy., Comment: 33 pages, 17 figures, manuscript submitted to CHI2023
Published: 2022

37. A Bottom-Up End-User Intelligent Assistant Approach to Empower Gig Workers against AI Inequality

Author: Li, Toby Jia-Jun, Lu, Yuwen, Clark, Jaylexia, Chen, Meng, Cox, Victor, Jiang, Meng, Yang, Yang, Kay, Tamara, Wood, Danielle, and Brockman, Jay
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: The growing inequality in gig work between workers and platforms has become a critical social issue as gig work plays an increasingly prominent role in the future of work. The AI inequality is caused by (1) the technology divide in who has access to AI technologies in gig work; and (2) the data divide in who owns the data in gig work leads to unfair working conditions, growing pay gap, neglect of workers' diverse preferences, and workers' lack of trust in the platforms. In this position paper, we argue that a bottom-up approach that empowers individual workers to access AI-enabled work planning support and share data among a group of workers through a network of end-user-programmable intelligent assistants is a practical way to bridge AI inequality in gig work under the current paradigm of privately owned platforms. This position paper articulates a set of research challenges, potential approaches, and community engagement opportunities, seeking to start a dialogue on this important research topic in the interdisciplinary CHIWORK community., Comment: In 2022 Symposium on Human-Computer Interaction for Work (CHIWORK 2022)
Published: 2022

38. Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension

Author: Xu, Ying, Wang, Dakuo, Yu, Mo, Ritchie, Daniel, Yao, Bingsheng, Wu, Tongshuang, Zhang, Zheng, Li, Toby Jia-Jun, Bradford, Nora, Sun, Branda, Hoang, Tran Bao, Sang, Yisi, Hou, Yufang, Ma, Xiaojuan, Yang, Diyi, Peng, Nanyun, Yu, Zhou, and Warschauer, Mark
Subjects: Computer Science - Computation and Language
Abstract: Question answering (QA) is a fundamental means to facilitate assessment and training of narrative comprehension skills for both machines and young children, yet there is scarcity of high-quality QA datasets carefully designed to serve this purpose. In particular, existing datasets rarely distinguish fine-grained reading skills, such as the understanding of varying narrative elements. Drawing on the reading education research, we introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students. Generated by educational experts based on an evidence-based theoretical framework, FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories, covering seven types of narrative elements or relations. Our dataset is valuable in two folds: First, we ran existing QA models on our dataset and confirmed that this annotation helps assess models' fine-grained learning skills. Second, the dataset supports question generation (QG) task in the education domain. Through benchmarking with QG models, we show that the QG model trained on FairytaleQA is capable of asking high-quality and more diverse questions., Comment: Accepted to ACL 2022
Published: 2022

39. StoryBuddy: A Human-AI Collaborative Chatbot for Parent-Child Interactive Storytelling with Flexible Parental Involvement

Author: Zhang, Zheng, Xu, Ying, Wang, Yanhao, Yao, Bingsheng, Ritchie, Daniel, Wu, Tongshuang, Yu, Mo, Wang, Dakuo, and Li, Toby Jia-Jun
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Despite its benefits for children's skill development and parent-child bonding, many parents do not often engage in interactive storytelling by having story-related dialogues with their child due to limited availability or challenges in coming up with appropriate questions. While recent advances made AI generation of questions from stories possible, the fully-automated approach excludes parent involvement, disregards educational goals, and underoptimizes for child engagement. Informed by need-finding interviews and participatory design (PD) results, we developed StoryBuddy, an AI-enabled system for parents to create interactive storytelling experiences. StoryBuddy's design highlighted the need for accommodating dynamic user needs between the desire for parent involvement and parent-child bonding and the goal of minimizing parent intervention when busy. The PD revealed varied assessment and educational goals of parents, which StoryBuddy addressed by supporting configuring question types and tracking child progress. A user study validated StoryBuddy's usability and suggested design insights for future parent-AI collaboration systems., Comment: Published at CHI 2022
Published: 2022
Full Text: View/download PDF

40. It is AI's Turn to Ask Humans a Question: Question-Answer Pair Generation for Children's Story Books

Author: Yao, Bingsheng, Wang, Dakuo, Wu, Tongshuang, Zhang, Zheng, Li, Toby Jia-Jun, Yu, Mo, and Xu, Ying
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Existing question answering (QA) techniques are created mainly to answer questions asked by humans. But in educational applications, teachers often need to decide what questions they should ask, in order to help students to improve their narrative understanding capabilities. We design an automated question-answer generation (QAG) system for this education scenario: given a story book at the kindergarten to eighth-grade level as input, our system can automatically generate QA pairs that are capable of testing a variety of dimensions of a student's comprehension skills. Our proposed QAG model architecture is demonstrated using a new expert-annotated FairytaleQA dataset, which has 278 child-friendly storybooks with 10,580 QA pairs. Automatic and human evaluations show that our model outperforms state-of-the-art QAG baseline systems. On top of our QAG system, we also start to build an interactive story-telling application for the future real-world deployment in this educational scenario., Comment: Accepted to ACL 2022
Published: 2021

41. A Need-finding Study for Understanding Text Entry in Smartphone App Usage

Author: Li, Toby Jia-Jun and Myers, Brad A.
Subjects: Computer Science - Human-Computer Interaction
Abstract: Text entry makes up about one-fourth of the smartphone interaction events, and is known to be challenging and difficult. However, there has been little study about the characteristics of text entry in the context of smartphone app usage. In this paper, we present a mixed-method in-situ study conducted in 2016 with 17 active smartphone users to better understand text entry in smartphone app usage. Our results show 80% of text was entered into communication apps, with different apps exhibiting distinct usage patterns. We found that structured data such as URLs and email addresses are rarely typed but instead are auto-completed or replaced with search, copy-and-paste is rarely used, and sessions of smartphone usage with text entry involve more apps and last longer. We conclude with a discussion about the implications on the development of systems to better support mobile interaction.
Published: 2021

42. Screen2Vec: Semantic Embedding of GUI Screens and GUI Components

Author: Li, Toby Jia-Jun, Popowski, Lindsay, Mitchell, Tom M., and Myers, Brad A.
Subjects: Computer Science - Human-Computer Interaction
Abstract: Representing the semantics of GUI screens and components is crucial to data-driven computational methods for modeling user-GUI interactions and mining GUI designs. Existing GUI semantic representations are limited to encoding either the textual content, the visual design and layout patterns, or the app contexts. Many representation techniques also require significant manual data annotation efforts. This paper presents Screen2Vec, a new self-supervised technique for generating representations in embedding vectors of GUI screens and components that encode all of the above GUI features without requiring manual annotation using the context of user interaction traces. Screen2Vec is inspired by the word embedding method Word2Vec, but uses a new two-layer pipeline informed by the structure of GUIs and interaction traces and incorporates screen- and app-specific metadata. Through several sample downstream tasks, we demonstrate Screen2Vec's key useful properties: representing between-screen similarity through nearest neighbors, composability, and capability to represent user tasks., Comment: Accepted to CHI Conference on Human Factors in Computing Systems (CHI 2021)
Published: 2021
Full Text: View/download PDF

43. Geno: A Developer Tool for Authoring Multimodal Interaction on Existing Web Applications

Author: Sarmah, Ritam Jyoti, Ding, Yunpeng, Wang, Di, Lee, Cheuk Yin Phipson, Li, Toby Jia-Jun, and Chen, Xiang 'Anthony'
Subjects: Computer Science - Human-Computer Interaction
Abstract: Supporting voice commands in applications presents significant benefits to users. However, adding such support to existing GUI-based web apps is effort-consuming with a high learning barrier, as shown in our formative study, due to the lack of unified support for creating multimodal interfaces. We present Geno---a developer tool for adding the voice input modality to existing web apps without requiring significant NLP expertise. Geno provides a high-level workflow for developers to specify functionalities to be supported by voice (intents), create language models for detecting intents and the relevant information (parameters) from user utterances, and fulfill the intents by either programmatically invoking the corresponding functions or replaying GUI actions on the web app. Geno further supports multimodal references to GUI context in voice commands (e.g. "move this [event] to next week" while pointing at an event with the cursor). In a study, developers with little NLP expertise were able to add multimodal voice command support for two existing web apps using Geno.
Published: 2020

44. Privacy-Preserving Script Sharing in GUI-based Programming-by-Demonstration Systems

Author: Li, Toby Jia-Jun, Chen, Jingya, Canfield, Brandon, and Myers, Brad A.
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Software Engineering
Abstract: An important concern in end user development (EUD) is accidentally embedding personal information in program artifacts when sharing them. This issue is particularly important in GUI-based programming-by-demonstration (PBD) systems due to the lack of direct developer control of script contents. Prior studies reported that these privacy concerns were the main barrier to script sharing in EUD. We present a new approach that can identify and obfuscate the potential personal information in GUI-based PBD scripts based on the uniqueness of information entries with respect to the corresponding app GUI context. Compared with the prior approaches, ours supports broader types of personal information beyond explicitly pre-specified ones, requires minimal user effort, addresses the threat of re-identification attacks, and can work with third-party apps from any task domain. Our approach also recovers obfuscated fields locally on the script consumer's side to preserve the shared scripts' transparency, readability, robustness, and generalizability. Our evaluation shows that our approach (1) accurately identifies the potential personal information in scripts across different apps in diverse task domains; (2) allows end-user developers to feel comfortable sharing their own scripts; and (3) enables script consumers to understand the operation of shared scripts despite the obfuscated fields., Comment: In the Proceedings of the ACM on Human-Computer Interaction (PACM) Vol.4 No. CSCW1. (CSCW 2020)
Published: 2020
Full Text: View/download PDF

45. Towards Effective Human-AI Collaboration in GUI-Based Interactive Task Learning Agents

Author: Li, Toby Jia-Jun, Chen, Jingya, Mitchell, Tom M., and Myers, Brad A.
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: We argue that a key challenge in enabling usable and useful interactive task learning for intelligent agents is to facilitate effective Human-AI collaboration. We reflect on our past 5 years of efforts on designing, developing and studying the SUGILITE system, discuss the issues on incorporating recent advances in AI with HCI principles in mixed-initiative interactions and multi-modal interactions, and summarize the lessons we learned. Lastly, we identify several challenges and opportunities, and describe our ongoing work
Published: 2020

46. Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations

Author: Li, Toby Jia-Jun, Radensky, Marissa, Jia, Justin, Singarajah, Kirielle, Mitchell, Tom M., and Myers, Brad A.
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: Natural language programming is a promising approach to enable end users to instruct new tasks for intelligent agents. However, our formative study found that end users would often use unclear, ambiguous or vague concepts when naturally instructing tasks in natural language, especially when specifying conditionals. Existing systems have limited support for letting the user teach agents new concepts or explaining unclear concepts. In this paper, we describe a new multi-modal domain-independent approach that combines natural language programming and programming-by-demonstration to allow users to first naturally describe tasks and associated conditions at a high level, and then collaborate with the agent to recursively resolve any ambiguities or vagueness through conversations and demonstrations. Users can also define new procedures and concepts by demonstrating and referring to contents within GUIs of existing mobile apps. We demonstrate this approach in PUMICE, an end-user programmable agent that implements this approach. A lab study with 10 users showed its usability., Comment: The AAAI-20 Workshop on Intelligent Process Automation (IPA-20)
Published: 2019

47. Not at Home on the Range: Peer Production and the Urban/Rural Divide

Author: Johnson, Isaac, Lin, Allen Yilun, Li, Toby Jia-Jun, Hall, Andrew, Halfaker, Aaron, Schöning, Johannes, and Hecht, Brent
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Computers and Society, Computer Science - Social and Information Networks, H.5.m
Abstract: Wikipedia articles about places, OpenStreetMap features, and other forms of peer-produced content have become critical sources of geographic knowledge for humans and intelligent technologies. In this paper, we explore the effectiveness of the peer production model across the rural/urban divide, a divide that has been shown to be an important factor in many online social systems. We find that in both Wikipedia and OpenStreetMap, peer-produced content about rural areas is of systematically lower quality, is less likely to have been produced by contributors who focus on the local area, and is more likely to have been generated by automated software agents (i.e. bots). We then codify the systemic challenges inherent to characterizing rural phenomena through peer production and discuss potential solutions., Comment: 10 pages, published on CHI'16
Published: 2019
Full Text: View/download PDF

48. Demonstration + Natural Language: Multimodal Interfaces for GUI-Based Interactive Task Learning Agents

Author: Li, Toby Jia-Jun, Mitchell, Tom M., Myers, Brad A., Vanderdonckt, Jean, Editor-in-Chief, Li, Yang, editor, and Hilliges, Otmar, editor
Published: 2021
Full Text: View/download PDF

49. AutoDroid: LLM-powered Task Automation in Android

Author: Wen, Hao, primary, Li, Yuanchun, additional, Liu, Guohong, additional, Zhao, Shanhui, additional, Yu, Tao, additional, Li, Toby Jia-Jun, additional, Jiang, Shiqi, additional, Liu, Yunhao, additional, Zhang, Yaqin, additional, and Liu, Yunxin, additional
Published: 2024
Full Text: View/download PDF

50. CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models

Author: Gao, Jie, primary, Guo, Yuchen, additional, Lim, Gionnieve, additional, Zhang, Tianqin, additional, Zhang, Zheng, additional, Li, Toby Jia-Jun, additional, and Perrault, Simon Tangi, additional
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

141 results on '"Li, Toby Jia-Jun"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources