47 results on '"Kokil Jaidka"'
Search Results
2. Social media and anti-immigrant prejudice: a multi-method analysis of the role of social media use, threat perceptions, and cognitive ability
- Author
-
Saifuddin Ahmed, Kokil Jaidka, Vivian Hsueh Hua Chen, Mengxuan Cai, Anfan Chen, Claire Stravato Emes, Valerie Yu, and Arul Chib
- Subjects
social media, realistic threat, symbolic threat, cognitive ability, emotion, prejudice, Psychology, BF1-990
- Abstract
Introduction: The discourse on immigration and immigrants is central to contemporary political and public discussions. Analyzing online conversations about immigrants provides valuable insights into public opinion, complemented by data from questionnaires on how attitudes are formed. Methods: The research includes two studies examining the expressive and informational use of social media. Study 1 conducted a computational text analysis of comments on Singaporean Facebook pages and forums, focusing on how social media is used to discuss immigrants. Study 2 utilized survey data to examine the use of social media at the individual level, testing the relationships between cognitive ability, perceptions of threat, negative emotions towards immigrants, and social media usage within the Integrated Threat Theory framework. Results: Study 1 found that discussions about immigrants on social media often involved negative emotions and concerns about economic impact, such as competition for jobs and crime. Complementing these findings about perceived economic threats, Study 2 showed that individuals with higher social media usage and greater perceptions of threat were more likely to have negative emotions towards immigrants. These relationships were mediated by perceptions of threat and were stronger in individuals with lower cognitive abilities. Discussion: The findings from both studies demonstrate the role of social media in shaping public attitudes towards immigrants, highlighting how perceived threats influence these attitudes. This research suggests the importance of considering how digital platforms contribute to public opinion on immigration, with implications for understanding the dynamics of attitude formation in the digital age.
- Published
- 2024
- Full Text
- View/download PDF
3. Building a Multimodal Classifier of Email Behavior: Towards a Social Network Understanding of Organizational Communication
- Author
-
Harsh Shah, Kokil Jaidka, Lyle Ungar, Jesse Fagan, and Travis Grosser
- Subjects
email, organization, social network analysis, text classification, computational linguistics, transformers, Information technology, T58.5-58.64
- Abstract
Within organizational settings, communication dynamics are influenced by various factors, such as email content, historical interactions, and interpersonal relationships. We introduce the Email MultiModal Architecture (EMMA) to model these dynamics and predict future communication behavior. EMMA uses data related to an email sender’s social network, performance metrics, and peer endorsements to predict the probability of receiving an email response. Our primary analysis is based on a dataset of 0.6 million corporate emails from 4320 employees between 2012 and 2014. By integrating features that capture a sender’s organizational influence and likability within a multimodal structure, EMMA offers improved performance over models that rely solely on linguistic attributes. Our findings indicate that EMMA enhances email reply prediction accuracy by up to 12.5% compared to leading text-centric models. EMMA also demonstrates high accuracy on other email datasets, reinforcing its utility and generalizability in diverse contexts. Our findings point to the need for multimodal approaches to better model communication patterns within organizations and teams and to better understand how relationships and histories shape communication trajectories.
- Published
- 2023
- Full Text
- View/download PDF
4. Cross-platform- and subgroup-differences in the well-being effects of Twitter, Instagram, and Facebook in the United States
- Author
-
Kokil Jaidka
- Subjects
Medicine, Science
- Abstract
Spatial aggregates of survey and web search data make it possible to identify the heterogeneous well-being effects of social media platforms. This study reports evidence from different sources of longitudinal data that suggests that the well-being effects of social media differ across platforms and population groups. The well-being effects of frequent social media visits are consistently positive for Facebook but negative for Instagram. Group-level analyses suggest that the positive well-being effects are experienced mainly by white, high-income populations at both the individual and the county level, while the adverse effects of Instagram use are observed on younger and Black populations. The findings are corroborated when geocoded web search data from Google is used and when self-reports from surveys are used in place of region-level aggregates. Greater Instagram use in regions is also linked to higher depression diagnoses across most sociodemographic groups.
- Published
- 2022
- Full Text
- View/download PDF
5. Electoral and Public Opinion Forecasts with Social Media Data: A Meta-Analysis
- Author
-
Marko M. Skoric, Jing Liu, and Kokil Jaidka
- Subjects
social media, public opinion, computational methods, meta-analysis, Information technology, T58.5-58.64
- Abstract
In recent years, many studies have used social media data to make estimates of electoral outcomes and public opinion. This paper reports the findings from a meta-analysis examining the predictive power of social media data by focusing on various sources of data and different methods of prediction; i.e., (1) sentiment analysis, and (2) analysis of structural features. Our results, based on the data from 74 published studies, show significant variance in the accuracy of predictions, which were on average behind the established benchmarks in traditional survey research. In terms of the approaches used, the study shows that machine learning-based estimates are generally superior to those derived from pre-existing lexica, and that a combination of structural features and sentiment analyses provides the most accurate predictions. Furthermore, our study shows some differences in the predictive power of social media data across different levels of political democracy and different electoral systems. We also note that since the accuracy of election and public opinion forecasts varies depending on which statistical estimates are used, the scientific community should aim to adopt a more standardized approach to analyzing and reporting social media data-derived predictions in the future.
- Published
- 2020
- Full Text
- View/download PDF
6. Protests against #delhigangrape on Twitter: Analyzing India’s Arab Spring
- Author
-
Saifuddin Ahmed and Kokil Jaidka
- Subjects
Twitter, protest, social movement, information dissemination, citizen journalism, protest reporting, Political science (General), JA1-92
- Abstract
This study offers a comprehensive approach towards analyzing and explaining the role of Twitter in shaping and facilitating social movements especially during protests. It presents automatic and manual analyses of the tweet themes, usage characteristics and major Twitter users during a public outcry against a gangrape incident in Delhi, the capital city of India. Our results identified Twitter as an important channel for the diffusion of ideas and news among a vast set of adopters in defiance of geographical boundaries. Results of the content analyses highlight the prominent use of social media resources in disseminating information on Twitter, and the remarkable role of Twitter users as citizen journalists during the days of the protest. Results of the social network analysis suggest that major role players on Twitter were the offline protest leaders.
- Published
- 2013
- Full Text
- View/download PDF
7. Silenced on social media: the gatekeeping functions of shadowbans in the American Twitterverse
- Author
-
Kokil Jaidka, Subhayan Mukerjee, and Yphtach Lelkes
- Subjects
Linguistics and Language, Communication, Language and Linguistics
- Abstract
Algorithms play a critical role in steering online attention on social media. Many have alleged that algorithms can perpetuate bias. This study audited shadowbanning, where a user or their content is temporarily hidden on Twitter. We repeatedly tested whether a stratified random sample of American Twitter accounts (n ≈ 25,000) had been subject to various forms of shadowbans. We then identified the type of user and tweet characteristics that predict a shadowban. In general, shadowbans are rare. We found that accounts with bot-like behavior were more likely to face shadowbans, while verified accounts were less likely to be shadowbanned. The replies by Twitter accounts that posted offensive tweets and tweets about politics (from both the left and the right) were more likely to be downtiered. The findings have implications for algorithmic accountability and the design of future audit studies of social media platforms.
- Published
- 2023
- Full Text
- View/download PDF
8. Talking politics: Building and validating data-driven lexica to measure political discussion quality
- Author
-
Kokil Jaidka
- Published
- 2022
- Full Text
- View/download PDF
9. The Political Landscape of the U.S. Twitterverse
- Author
-
Subhayan Mukerjee, Kokil Jaidka, and Yphtach Lelkes
- Subjects
Sociology and Political Science, Communication
- Abstract
Prior research suggests that Twitter users in the United States are more politically engaged and more partisan compared to the American citizenry -- a public that is otherwise characterized by low levels of political knowledge and disinterest in political affairs. This study seeks to understand this disconnect by conducting an observational analysis of the most popular accounts on American Twitter. We identify opinion leaders by drawing a random sample of ordinary American Twitter users and observing whom they follow. We estimate the ideological leaning and political relevance of these opinion leaders as well as crowd-source how they are perceived by ordinary Americans. We find little evidence that American Twitter is as politicized as it is made out to be, with politics and hard news outlets constituting a small subset of these opinion leaders. We find no evidence of polarization among these opinion leaders either. While certain professional categories such as political pundits and political figures are more polarized than others, the overall polarization dissipates further when we factor in the rate at which the opinion leaders tweet: a large number of vocal non-partisan opinion leaders drowns out the partisan voices on the platform. Our results suggest that the degree to which Twitter is political has likely been overstated in the past. Our findings have implications for how we use Twitter to represent public opinion in the United States.
- Published
- 2022
- Full Text
- View/download PDF
10. Questionable and Open Research Practices: Attitudes and Perceptions among Quantitative Communication Researchers
- Author
-
Bert N Bakker, Kokil Jaidka, Timothy Dörr, Neil Fasching, Yphtach Lelkes, and Political Communication & Journalism (ASCoR, FMG)
- Subjects
Linguistics and Language, Communication, Language and Linguistics
- Abstract
Recent contributions have questioned the credibility of quantitative communication research. While questionable research practices (QRPs) are believed to be widespread, evidence for this belief is, primarily, derived from other disciplines. Therefore, it is largely unknown to what extent QRPs are used in quantitative communication research and whether researchers embrace open research practices (ORPs). We surveyed first and corresponding authors of publications in the top-20 journals in communication science. Many researchers report using one or more QRPs. We find widespread pluralistic ignorance: QRPs are generally rejected, but researchers believe they are prevalent. At the same time, we find optimism about the use of open science practices. In all, our study has implications for theories in communication that rely upon a cumulative body of empirical work: these theories are negatively affected by QRPs but can gain credibility if based upon ORPs. We outline an agenda to move forward as a discipline.
- Published
- 2021
11. Tweets and Votes: A Four-Country Comparison of Volumetric and Sentiment Analysis Approaches
- Author
-
S. Ahmed, Kokil Jaidka, and M. M. Skoric
- Abstract
This study analyzes different methodological approaches followed in social media literature and their accuracy in predicting the general elections of four countries. Volumetric approaches and unsupervised and supervised sentiment approaches are adopted for generating 12 metrics to compute predicted vote shares. The findings suggest that Twitter-based predictions can produce accurate results for elections, given the digital environment of a country. A cross-country analysis helps to evaluate the quality of predictions and the influence of different contexts, such as technological development and democratic setups. We recommend that future scholars combine volume, sentiment and network aspects of social media to model voting intentions in developing societies.
- Published
- 2021
- Full Text
- View/download PDF
12. Social media use and anti-immigrant attitudes: evidence from a survey and automated linguistic analysis of Facebook posts
- Author
-
Vivian Hsueh-Hua Chen, Saifuddin Ahmed, Arul Chib, Rosalie Hooi, Kokil Jaidka, and Wee Kim Wee School of Communication and Information
- Subjects
Communication, media_common.quotation_subject, Political Trust, 05 social sciences, Immigration, Communication [Social sciences], 050801 communication & media studies, Education, 0508 media and communications, Linguistic analysis, Perception, 0502 economics and business, 050211 marketing, Social media, Sociology, Social Media, Social psychology, Prejudice (legal term), media_common
- Abstract
Social media has a role to play in shaping the dynamic relations between immigrants and citizens. This study examines the effects of threat perceptions, consumptive and expressive use of social media, and political trust on attitudes against immigrants in Singapore. Study 1, based on a survey analysis (N = 310), suggests that symbolic, but not realistic, threat perception is positively associated with anti-immigrant attitudes. The consumptive use of social media and political trust are negatively related to anti-immigrant attitudes. Moderation analyses suggest that consumptive social media use has negative consequences for individuals with increased symbolic threat perception and high political trust. But is there a correspondence between consumptive and expressive use of social media in terms of predicting prejudicial attitudes? Study 2 benchmarks the survey findings against participants’ opinion expression via Facebook posts (N = 146,332) discussing immigrants. Automated linguistic analyses reveal that self-reported survey measures correlate with the expressive use of social media for discussing immigrants. Higher anti-immigrant attitudes are associated with higher negative sentiment, anger, and swear words in discussing immigrants. The findings highlight the need to pay attention to the combined influence of social media use and individual political beliefs when analyzing intergroup relations. This work was supported by the Ministry of Education (MOE), Singapore [grant number MOE2017-T2-2-145].
- Published
- 2021
- Full Text
- View/download PDF
13. The Association for the Advancement of Artificial Intelligence 2020 Workshop Program
- Author
-
Grace Bang, Guy Barash, Ryan Bea, Jacques Cali, Mauricio Castillo-Effen, Xin Chen, Niyati Chhaya, Rachel Cummings, Rohan Dhoopar, Sebastijan Dumanci, Huáscar Espinoza, Eitan Farchi, Ferdinando Fioretto, Raquel Fuentetaja, Christopher Geib, Odd Erik Gundersen, José Hernández-Orallo, Xiaowei Huang, Kokil Jaidka, Sarah Keren, Seokhwan Kim, Michel Galley, Xiaomo Liu, Tyler Lu, Zhiqiang Ma, Richard Mallah, John McDermid, Martin Michalowski, Reuth Mirsky, Seán Ó hÉigeartaigh, Deepak Ramachandran, Javier Segovia-Aguas, Onn Shehory, Arash Shaban-Nejad, Vered Shwartz, Siddharth Srivastava, Kartik Talamadupula, Jian Tang, Pascal Van Hentenryck, Dell Zhang, and Jian Zhang
- Subjects
Engineering, Artificial Intelligence, business.industry, Association (object-oriented programming), Artificial intelligence, business, Range (computer programming)
- Abstract
The Association for the Advancement of Artificial Intelligence 2020 Workshop Program included twenty-three workshops covering a wide range of topics in artificial intelligence. This report contains the required reports, which were submitted by most, but not all, of the workshop chairs.
- Published
- 2020
- Full Text
- View/download PDF
14. Beyond Positive Emotion: Deconstructing Happy Moments Based on Writing Prompts
- Author
-
Kokil Jaidka, Niyati Chhaya, Saran Mumick, Matthew Killingsworth, Alon Halevy, and Lyle Ungar
- Abstract
This study reports experiments with the newly-released CL-Aff HappyDB dataset, which looks beyond positive emotion in modeling descriptions of happy moments collected through writing prompts. The widespread adoption of social media has improved researchers' access to unsolicited expressions and behaviors. However, most of the approaches to analyzing these expressions involve a keyword search and focus on predicting sentiment or emotional content rather than understanding a deeper psychological state, such as happiness. The CL-Aff HappyDB dataset is the first effort to distinguish the personal agency and social interaction in writings about happiness, which do not yet have an exact equivalent concept in existing text-based approaches. We report that state-of-the-art approaches for emotion detection have different topical characteristics, and do not generalize well to detect happiness in the CL-Aff HappyDB dataset. Language models trained on the dataset, on the other hand, generalize to social media writing and are a valid approach for downstream tasks, such as predicting life satisfaction from social media posts.
- Published
- 2020
- Full Text
- View/download PDF
15. Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods
- Author
-
Lyle H. Ungar, Margaret L. Kern, Kokil Jaidka, H. Andrew Schwartz, Salvatore Giorgi, and Johannes C. Eichstaedt
- Subjects
020205 medical informatics, Computer science, Twitter, Word count, Population, Big data, Social Sciences, 050109 social psychology, Sample (statistics), 02 engineering and technology, big data, Phone, Language assessment, 0202 electrical engineering, electronic engineering, information engineering, 0501 psychology and cognitive sciences, Social media, education, Estimation, education.field_of_study, Multidisciplinary, Computer Sciences, business.industry, 05 social sciences, Data science, machine learning, subjective well-being, language analysis, Physical Sciences, Psychological and Cognitive Sciences, business
- Abstract
Significance: Spatial aggregation of Twitter language may make it possible to monitor the subjective well-being of populations on a large scale. Text analysis methods need to yield robust estimates to be dependable. On the one hand, we find that data-driven machine learning-based methods offer accurate and robust measurements of regional well-being across the United States when evaluated against gold-standard Gallup survey measures. On the other hand, we find that standard English word-level methods (such as Linguistic Inquiry and Word Count 2015’s Positive emotion dictionary and Language Assessment by Mechanical Turk) can yield estimates of county well-being inversely correlated with survey estimates, due to regional cultural and socioeconomic differences in language use. Some of the most frequent misleading words can be removed to improve the accuracy of these word-level methods.
Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used.
- Published
- 2020
- Full Text
- View/download PDF
16. Using Graph-Aware Reinforcement Learning to Identify Winning Strategies in Diplomacy Games (Student Abstract)
- Author
-
Hansin Ahuja, Lynnette Hui Xian Ng, and Kokil Jaidka
- Subjects
FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Computation and Language, Computers and Society (cs.CY), General Medicine, Computation and Language (cs.CL)
- Abstract
This abstract proposes a goal-oriented approach to detecting and modeling complex social phenomena in multiparty discourse in an online political strategy game. We developed a two-tier approach that first encodes sociolinguistic behavior as linguistic features and then uses reinforcement learning to estimate the advantage afforded to any player. In the first tier, sociolinguistic behaviors, such as Friendship and Reasoning, that speakers use to influence others are encoded as linguistic features to identify the persuasive strategies applied by each player in simultaneous two-party dialogues. In the second tier, a reinforcement learning approach is used to estimate a graph-aware reward function to quantify the advantage afforded to each player based on their standing in this multiparty setup. We apply this technique to the game Diplomacy, using a dataset comprising over 15,000 messages exchanged between 78 users. Our graph-aware approach shows robust performance compared to a context-agnostic setup.
- Published
- 2021
17. Cross-platform- and subgroup-differences in the well-being effects of Twitter, Instagram, and Facebook in the United States
- Author
-
Kokil Jaidka
- Subjects
Multidisciplinary, Surveys and Questionnaires, Humans, Names, Social Media, United States
- Abstract
Spatial aggregates of survey and web search data make it possible to identify the heterogeneous well-being effects of social media platforms. This study reports evidence from different sources of longitudinal data that suggests that the well-being effects of social media differ across platforms and population groups. The well-being effects of frequent social media visits are consistently positive for Facebook but negative for Instagram. Group-level analyses suggest that the positive well-being effects are experienced mainly by white, high-income populations at both the individual and the county level, while the adverse effects of Instagram use are observed on younger and Black populations. The findings are corroborated when geocoded web search data from Google is used and when self-reports from surveys are used in place of region-level aggregates. Greater Instagram use in regions is also linked to higher depression diagnoses across most sociodemographic groups.
- Published
- 2021
18. Beyond Anonymity: Network Affordances, Under Deindividuation, Improve Social Media Discussion Quality
- Author
-
Alvin Zhou, Kokil Jaidka, Sophie Lecheler, Jana Laura Egelhofer, and Yphtach Lelkes
- Subjects
History, Operationalization, Deindividuation, Polymers and Plastics, Computer Networks and Communications, business.industry, media_common.quotation_subject, Internet privacy, Rationality, Industrial and Manufacturing Engineering, Computer Science Applications, Incivility, Quality (business), Social media, Sociology, Business and International Management, business, Social identity theory, media_common, Anonymity
- Abstract
The online sphere allows people to be personally anonymous while simultaneously being socially identifiable. Twitter users can use a pseudonym but signal allegiance to a political party in their profile (e.g., #MAGA). We explore the interplay of these two dimensions of anonymity on a custom-built social media platform that allowed us to examine the causal effects of personal and social anonymity on discussion quality. We find no support for the hypothesis that personal anonymity breeds incivility or lowers discussion quality in discussions on gun rights. On the other hand, when personal anonymity is combined with social identifiability (operationalized as political party visibility), it improves several features linked to discussion quality, that is, higher rationality and lower incivility. We discuss the mechanisms that might explain the results and offer recommendations for future experiments about the design of social media platforms.
- Published
- 2021
- Full Text
- View/download PDF
19. Modeling Constraints Can Identify Winning Arguments in Multi-Party Interactions (Student Abstract)
- Author
-
Suzanna Sia, Kokil Jaidka, Niyati Chhaya, and Kevin Duh
- Subjects
General Medicine
- Abstract
In contexts where debate and deliberation are the norm, participants are regularly presented with new information that conflicts with their original beliefs. When required to update their beliefs (belief alignment), they may choose arguments that align with their worldview (confirmation bias). We test this and competing hypotheses in a constraint-based modeling approach to predict the winning arguments in multi-party interactions in the Reddit ChangeMyView dataset. We impose structural constraints that reflect competing hypotheses on a hierarchical generative Variational Auto-encoder. Our findings suggest that when arguments are further from the initial belief state of the target, they are more likely to succeed.
- Published
- 2022
- Full Text
- View/download PDF
20. The Psychology of Semantic Spaces: Experiments with Positive Emotion (Student Abstract)
- Author
-
Xuan Liu, Kokil Jaidka, and Niyati Chhaya
- Subjects
General Medicine
- Abstract
Psychological concepts can help computational linguists to better model the latent semantic spaces of emotions, and understand the underlying states motivating the sharing or suppressing of emotions. This abstract applies the understanding of agency and social interaction in the happiness semantic space to its role in positive emotion. First, BERT-based fine-tuning yields an expanded seed set to understand the vocabulary of the latent space. Next, results benchmarked against many emotion datasets suggest that the approach is valid, robust, offers an improvement over direct prediction, and is useful for downstream predictive tasks related to psychological states.
- Published
- 2022
- Full Text
- View/download PDF
21. Social Media Reveals Urban-Rural Differences in Stress across China
- Author
-
Jesse Cui, Tingdan Zhang, Kokil Jaidka, Dandan Pang, Garrick Sherman, Vinit Jakhetiya, Lyle H Ungar, and Sharath Chandra Guntuku
- Subjects
Social and Information Networks (cs.SI), FOS: Computer and information sciences, Computer Science - Computers and Society, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, QA75 Electronic computers. Computer science, Computers and Society (cs.CY), Computer Science - Social and Information Networks, Computation and Language (cs.CL)
- Abstract
Modeling differential stress expressions in urban and rural regions in China can provide a better understanding of the effects of urbanization on psychological well-being in a country that has rapidly grown economically in the last two decades. This paper studies linguistic differences in the experiences and expressions of stress in urban-rural China from Weibo posts from over 65,000 users across 329 counties using hierarchical mixed-effects models. We analyzed phrases, topical themes, and psycho-linguistic word choices in Weibo posts mentioning stress to better understand appraisal differences surrounding psychological stress in urban and rural communities in China; we then compared them with large-scale polls from Gallup. After controlling for socioeconomic and gender differences, we found that rural communities tend to express stress in emotional and personal themes such as relationships, health, and opportunity while users in urban areas express stress using relative, temporal, and external themes such as work, politics, and economics. These differences exist beyond controlling for GDP and urbanization, indicating a fundamentally different lifestyle between rural and urban residents in very specific environments, arguably having different sources of stress. We found corroborative trends in physical, financial, and social wellness with urbanization in Gallup polls. Accepted at AAAI Conference on Web and Social Media (ICWSM) 2022.
- Published
- 2021
22. Reports of the Workshops Held at the 2019 AAAI Conference on Artificial Intelligence
- Author
-
Guy Barash, Mauricio Castillo-Effen, Niyati Chhaya, Peter Clark, Huáscar Espinoza, Eitan Farchi, Christopher Geib, Odd Erik Gundersen, Seán Ó hÉigeartaigh, José Hernández-Orallo, Chiori Hori, Xiaowei Huang, Kokil Jaidka, Pavan Kapanipathi, Sarah Keren, Seokhwan Kim, Marc Lanctot, Danny Lange, Julian McAuley, David Martinez, Marwan Mattar, Mausam, Martin Michalowski, Reuth Mirsky, Roozbeh Mottaghi, Joseph Osborn, Julien Perolat, Martin Schmid, Arash Shaban-Nejad, Onn Shehory, Biplav Srivastava, William Streilein, Kartik Talamadupula, Julian Togelius, Koichiro Yoshino, Quanshi Zhang, and Imed Zitouni
- Subjects
Computer science, business.industry, Deep learning, Robotics, Plan (drawing), Recommender system, computer.software_genre, Knowledge extraction, Artificial Intelligence, Reinforcement learning, Artificial intelligence, Dialog system, business, computer, Agile software development
- Abstract
The workshop program of the Association for the Advancement of Artificial Intelligence’s 33rd Conference on Artificial Intelligence (AAAI-19) was held in Honolulu, Hawaii, on Sunday and Monday, January 27–28, 2019. There were fifteen workshops in the program: Affective Content Analysis: Modeling Affect-in-Action, Agile Robotics for Industrial Automation Competition, Artificial Intelligence for Cyber Security, Artificial Intelligence Safety, Dialog System Technology Challenge, Engineering Dependable and Secure Machine Learning Systems, Games and Simulations for Artificial Intelligence, Health Intelligence, Knowledge Extraction from Games, Network Interpretability for Deep Learning, Plan, Activity, and Intent Recognition, Reasoning and Learning for Human-Machine Dialogues, Reasoning for Complex Question Answering, Recommender Systems Meet Natural Language Processing, Reinforcement Learning in Games, and Reproducible AI. This report contains brief summaries of all the workshops that were held.
- Published
- 2019
- Full Text
- View/download PDF
23. Brevity is the Soul of Twitter: The Constraint Affordance and Political Discussion
- Author
-
Alvin Zhou, Yphtach Lelkes, Kokil Jaidka, and Wee Kim Wee School of Communication and Information
- Subjects
History ,Linguistics and Language ,Natural experiment ,Polymers and Plastics ,media_common.quotation_subject ,Political communication ,050801 communication & media studies ,Industrial and Manufacturing Engineering ,Language and Linguistics ,Politics ,0508 media and communications ,050602 political science & public administration ,Social media ,Sociology ,Business and International Management ,Affordance ,media_common ,Social network ,business.industry ,Politeness ,Communication ,05 social sciences ,Communication [Social sciences] ,Public relations ,Deliberation ,0506 political science ,Political Discussion ,Political Communication ,Public sphere ,Computational sociology ,business - Abstract
Many hoped that social networking sites would allow for the open exchange of information and a revival of the public sphere. Unfortunately, conversations on social media are often toxic and not conducive to healthy political discussions. Twitter, the most widely used social network for political discussions, doubled the limit of characters in a tweet in November 2017, which provided an opportunity to study the effect of technological affordances on political discussions using a discontinuous time series design. Using supervised and unsupervised natural language processing methods, we analyzed 358,242 tweet replies to U.S. politicians from January 2017 to March 2018. We show that doubling the permissible length of a tweet led to less uncivil, more polite, and more constructive discussions online. However, the declining trend in the empathy and respectfulness of these tweets raises concerns about the implications of the changing norms for the quality of political deliberation.
- Published
- 2019
- Full Text
- View/download PDF
24. WikiTalkEdit: A Dataset for modeling Editors’ behaviors on Wikipedia
- Author
-
Iknoor Singh, Kokil Jaidka, Lyle H. Ungar, Andrea Ceolin, and Niyati Chhaya
- Subjects
Matching (statistics) ,Computer science ,business.industry ,media_common.quotation_subject ,05 social sciences ,02 engineering and technology ,computer.software_genre ,Style (sociolinguistics) ,Evidentiality ,0202 electrical engineering, electronic engineering, information engineering ,Criticism ,020201 artificial intelligence & image processing ,Conversation ,Artificial intelligence ,0509 other social sciences ,Dialog box ,050904 information & library sciences ,Baseline (configuration management) ,F1 score ,business ,computer ,Natural language processing ,media_common - Abstract
This study introduces and analyzes WikiTalkEdit, a dataset of conversations and edit histories from Wikipedia, for research in online cooperation and conversation modeling. The dataset comprises dialog triplets from the Wikipedia Talk pages, and editing actions on the corresponding articles being discussed. We show how the data supports the classic understanding of style matching, where positive emotion and the use of first-person pronouns predict a positive emotional change in a Wikipedia contributor. However, they do not predict editorial behavior. On the other hand, feedback invoking evidentiality and criticism, and references to Wikipedia’s community norms, is more likely to persuade the contributor to perform edits but is less likely to lead to a positive emotion. We developed baseline classifiers trained on pre-trained RoBERTa features that can predict editorial change with an F1 score of .54, as compared to an F1 score of .66 for predicting emotional change. A diagnostic analysis of persisting errors is also provided. We conclude with possible applications and recommendations for future work. The dataset is publicly available for the research community at https://github.com/kj2013/WikiTalkEdit/.
- Published
- 2021
- Full Text
- View/download PDF
25. Questionable and open research practices: attitudes and perceptions among quantitative communication researchers
- Author
-
Yphtach Lelkes, Neil Fasching, Timothy Dörr, Kokil Jaidka, and Bert N. Bakker
- Subjects
Open research ,business.industry ,Perception ,media_common.quotation_subject ,Public relations ,Psychology ,business ,media_common - Abstract
Recent contributions have questioned the credibility of quantitative communication research. While questionable research practices are believed to be widespread, evidence for this claim is primarily derived from other disciplines. Before change in communication research can happen, it is important to document the extent to which QRPs are used and whether researchers are open to the changes proposed by the so-called open science agenda. We conducted a large survey among authors of papers published in the top-20 journals in communication science in the last ten years (N=1039). A non-trivial percent of researchers report using one or more QRPs. While QRPs are generally considered unacceptable, researchers perceive QRPs to be common among their colleagues. At the same time, we find optimism about the use of open science practices in communication research. We end with a series of recommendations outlining what journals, institutions and researchers can do moving forward.
- Published
- 2020
- Full Text
- View/download PDF
26. Reports of the Workshops of the 32nd AAAI Conference on Artificial Intelligence
- Author
-
Joseph C. Osborn, Nicholas Mattei, Martin Michalowski, Reuth Mirsky, Bruno Bouchard, William W. Streilein, Sarah Keren, Kokil Jaidka, Amit Sheth, David R. Martinez, Ilan Shimshoni, Arunesh Sinha, Howie Shrobe, Amelie Gyrard, K. Brent Venable, Atanu R. Sinha, Anna Zamansky, Eitan Farchi, Georgios Theocharous, Roni Khardon, Sébastien Gaboury, Noam Brown, Onn Shehory, Arash Shaban-Nejad, Kevin Bouchard, Biplav Srivastava, Neal Wagner, Parisa Kordjamshidi, Christopher W. Geib, Niyati Chhaya, and Cem Safak Sahin
- Subjects
Engineering ,business.industry ,Plan (drawing) ,Preference handling ,GeneralLiterature_MISCELLANEOUS ,Marketing science ,Knowledge extraction ,Artificial Intelligence ,Affective content analysis ,Smart environment ,Artificial intelligence ,Internet of Things ,business ,Intent recognition - Abstract
The AAAI-18 workshop program included 15 workshops covering a wide range of topics in AI. Workshops were held Sunday and Monday, February 2–7, 2018, at the Hilton New Orleans Riverside in New Orleans, Louisiana, USA. This report contains summaries of the Affective Content Analysis workshop; the Artificial Intelligence Applied to Assistive Technologies and Smart Environments; the AI and Marketing Science workshop; the Artificial Intelligence for Cyber Security workshop; the AI for Imperfect-Information Games; the Declarative Learning Based Programming workshop; the Engineering Dependable and Secure Machine Learning Systems workshop; the Health Intelligence workshop; the Knowledge Extraction from Games workshop; the Plan, Activity, and Intent Recognition workshop; the Planning and Inference workshop; the Preference Handling workshop; the Reasoning and Learning for Human-Machine Dialogues workshop; and the AI Enhanced Internet of Things Data Processing for Intelligent Applications workshop.
- Published
- 2018
- Full Text
- View/download PDF
27. Do birds of different feather flock together? Analyzing the political use of social media through a language-based approach in a multilingual context
- Author
-
Jaeho Cho, Saifuddin Ahmed, and Kokil Jaidka
- Subjects
05 social sciences ,Media studies ,050801 communication & media studies ,Context (language use) ,language.human_language ,Homophily ,Human-Computer Interaction ,Politics ,0508 media and communications ,Arts and Humanities (miscellaneous) ,General election ,0502 economics and business ,language ,050211 marketing ,Social media ,Sociology ,Social network analysis ,General Psychology ,Period (music) ,Malay - Abstract
This study analyzes the political use of Twitter in the run-up to the 2013 Malaysian General Election. It follows a content and social network analysis approach to investigate the interplay of language and political partisanship in social media use, among Twitter users in Malaysia. In the period leading up to the 2013 elections, Twitter posts collected under the hashtag #GE13 reveal that communities that post in English versus the Malay language, differ in how they use Twitter and with whom they interact. As compared to English users, Malay users are more likely to seek political information and express their political opinion. In online discussions, we observe language-based homophily within the English and Malay language communities, but there are some cross-cutting interactions between opposing political communities. We discuss the implications of our findings for the political use of new communication technologies in multi-ethnic and multilingual societies.
- Published
- 2018
- Full Text
- View/download PDF
28. Corrigendum to: Questionable and Open Research Practices: Attitudes and Perceptions among Quantitative Communication Researchers
- Author
-
Bert N Bakker, Kokil Jaidka, Timothy Dörr, Neil Fasching, and Yphtach Lelkes
- Subjects
Linguistics and Language ,Communication ,Language and Linguistics - Published
- 2022
- Full Text
- View/download PDF
29. Introduction to the special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL)
- Author
-
Guillaume Cabanac, Dietmar Wolfram, Muthu Kumar Chandrasekaran, Philipp Mayr, Kokil Jaidka, Ingo Frommholz, and Min-Yen Kan
- Subjects
Information retrieval ,Information seeking ,business.industry ,Computer science ,05 social sciences ,Sensemaking ,Library and Information Sciences ,050905 science studies ,computer.software_genre ,Digital library ,Information extraction ,Universal Networking Language ,Human–computer information retrieval ,Question answering ,State (computer science) ,Artificial intelligence ,0509 other social sciences ,050904 information & library sciences ,business ,computer ,Natural language processing - Abstract
The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometric, information retrieval (IR), text mining, and natural language processing techniques can assist to address this challenge, but have yet to be widely used in digital libraries (DL). This special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL) was compiled after the first joint BIRNDL workshop that was held at the joint conference on digital libraries (JCDL 2016) in Newark, New Jersey, USA. It brought together IR and DL researchers and professionals to elaborate on new approaches in natural language processing, information retrieval, scientometric, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. This special issue includes 14 papers: four extended papers originating from the first BIRNDL workshop 2016 and the BIR workshop at ECIR 2016, four extended system reports of the CL-SciSumm Shared Task 2016 and six original research papers submitted via the open call for papers.
- Published
- 2017
- Full Text
- View/download PDF
30. Leveling the playing field: The use of Twitter by politicians during the 2014 Indian general election campaign
- Author
-
Jaeho Cho, Kokil Jaidka, and Saifuddin Ahmed
- Subjects
Inequality ,Computer Networks and Communications ,business.industry ,media_common.quotation_subject ,05 social sciences ,Face (sociological concept) ,050801 communication & media studies ,Advertising ,Public relations ,CONTEST ,0506 political science ,Politics ,0508 media and communications ,Resource (project management) ,General election ,Political science ,050602 political science & public administration ,Social media ,Electrical and Electronic Engineering ,business ,Affordance ,media_common - Abstract
In this study, it is theorized that the communicative affordances offered by social media platforms will enable politically under-resourced candidates to contest the marginalization they face in traditional media. Multivariate analyses were conducted of the tweets of 205 political candidates of the 2014 Indian general election. Findings reveal that fringe party candidates received the least media attention and tended to use Twitter more frequently than major party candidates, especially for interaction and mobilization. Minor party candidates also received less media attention, albeit their Twitter usage patterns were not significantly different than major party candidates. The results illustrate that social media platforms can help overcome resource inequality in politics. The larger implications of this study are discussed.
- Published
- 2017
- Full Text
- View/download PDF
31. The internet and participation inequality : a multilevel examination of 108 countries
- Author
-
Ahmed, S., Cho, J., Kokil Jaidka, Eichstaedt, J. C., Ungar, L. H., and Wee Kim Wee School of Communication and Information
- Subjects
Communication and Media Studies ,Internet ,Television and Digital Media ,political inequality ,Journalism and Professional Writing ,Civic Participation ,education ,Government Intervention ,participation gap ,Communication [Social sciences] ,press freedom ,humanities ,Film - Abstract
This study investigates the role of the Internet in civic participation inequality across 108 countries. Merging individual-level survey data from the 2016 Gallup World Poll with country-level indices, we conduct multilevel analyses to answer three broader sets of questions: (1) Does access to the Internet increase the likelihood of civic participation? (2) Does Internet access amplify or lessen socioeconomic stratification in civic participation? (3) Do press freedom and government intervention as contextual factors shape the role of the Internet in civic participation inequality? The findings suggest that Internet access increases the likelihood of civic participation while it also deepens socioeconomic stratification in participation. Cross-level interactions unveil that the intervening role of the Internet remains unaffected by press freedom, but government intervention through the promotion of ICT use can help control the growing inequality. We discuss the theoretical implications of these findings for political inequality research and the applied global significance.
- Published
- 2020
32. Framing social conflicts in news coverage and social media: A multicountry comparative study
- Author
-
Kokil Jaidka, Saifuddin Ahmed, and Jaeho Cho
- Subjects
Communication and Media Studies ,Sociology and Political Science ,media_common.quotation_subject ,Twitter ,050801 communication & media studies ,Geopolitics ,0508 media and communications ,Journalism and Professional Writing ,Political science ,050602 political science & public administration ,Information system ,Social conflict ,Social media ,News media ,media_common ,Singapore ,riot ,Communication ,05 social sciences ,Censorship ,Media studies ,Communication & Media Studies ,international news ,0506 political science ,geopolitical proximity ,Framing (social sciences) ,Multinational corporation ,Framing - Abstract
This study attempts to understand how geopolitical proximity influences framing of social conflicts in news coverage and social media discussions. Within the context of the 2013 Little India riot in Singapore, manual content and automated linguistic analyses are conducted on 227 news articles and 4,495 tweets. A multinational comparison suggests that news media follow the traditional hypothesis of geopolitical proximity and international news coverage. However, Twitter seems less constrained by geopolitical boundaries of news making, allowing citizens to bypass press censorship in an alternate information system. The reasons for framing differences across mediums and between countries are explored. Implications of these findings and limitations of the study are discussed.
- Published
- 2019
- Full Text
- View/download PDF
33. Predicting elections from social media: a three-country, three-method comparative study
- Author
-
Saifuddin Ahmed, Marko M. Skoric, Martin Hilbert, and Kokil Jaidka
- Subjects
Social network ,Computer science ,business.industry ,Communication ,Journalism And Professional Writing ,05 social sciences ,Sentiment analysis ,050801 communication & media studies ,Education ,0508 media and communications ,0502 economics and business ,Econometrics ,Asian country ,050211 marketing ,Social media ,Communication And Media Studies ,Robustness (economics) ,business - Abstract
This study introduces and evaluates the robustness of different volumetric, sentiment, and social network approaches to predict the elections in three Asian countries (Malaysia, India, and Pakistan) from Twitter posts. We find that predictive power of social media performs well for India and Pakistan but is not effective for Malaysia. Overall, we find that it is useful to consider the recency of Twitter posts while using it to predict a real outcome, such as an election result. Sentiment information mined using machine learning models was the most accurate predictor of election outcomes. Social network information is stable despite sudden surges in political discussions, e.g., around elections-related news events. Methods combining sentiment and volume information, or sentiment and social network information, are effective at predicting smaller vote shares, e.g., vote shares in the case of independent candidates and regional parties. We conclude with a detailed discussion on the caveats of social media analysis for predicting real-world outcomes and recommendations for future work.
- Published
- 2019
- Full Text
- View/download PDF
34. Questionable and Open Research Practices: Attitudes and Perceptions among Quantitative Communication Researchers.
- Author
-
Bakker, Bert N, Jaidka, Kokil, Dörr, Timothy, Fasching, Neil, and Lelkes, Yphtach
- Subjects
*QUANTITATIVE research , *COMMUNICATIONS research , *RESEARCH methodology , *RESEARCH methodology evaluation , *OPEN data movement - Abstract
Recent contributions have questioned the credibility of quantitative communication research. While questionable research practices (QRPs) are believed to be widespread, evidence for this belief is, primarily, derived from other disciplines. Therefore, it is largely unknown to what extent QRPs are used in quantitative communication research and whether researchers embrace open research practices (ORPs). We surveyed first and corresponding authors of publications in the top-20 journals in communication science. Many researchers report using one or more QRPs. We find widespread pluralistic ignorance: QRPs are generally rejected, but researchers believe they are prevalent. At the same time, we find optimism about the use of open science practices. In all, our study has implications for theories in communication that rely upon a cumulative body of empirical work: these theories are negatively affected by QRPs but can gain credibility if based upon ORPs. We outline an agenda to move forward as a discipline.
- Published
- 2021
- Full Text
- View/download PDF
35. Social Media and Electoral Predictions: A Meta-Analytic Review
- Author
-
Jing Liu, Marko M. Skoric, and Kokil Jaidka
- Subjects
Social media ,Sociology ,Positive economics - Published
- 2019
- Full Text
- View/download PDF
36. Report on the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)
- Author
-
Philipp Mayr, Muthu Kumar Chandrasekaran, and Kokil Jaidka
- Subjects
FOS: Computer and information sciences ,Hardware and Architecture ,05 social sciences ,Digital Libraries (cs.DL) ,Computer Science - Digital Libraries ,0509 other social sciences ,050905 science studies ,050904 information & library sciences ,Information Retrieval (cs.IR) ,Management Information Systems ,Computer Science - Information Retrieval - Abstract
The 3rd joint BIRNDL workshop was held at the 41st ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018) in Ann Arbor, USA. BIRNDL 2018 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated three paper sessions and the 4th edition of the CL-SciSumm Shared Task. Comment: 6 pages, to appear in SIGIR Forum
- Published
- 2018
37. Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)
- Author
-
Muthu Kumar Chandrasekaran, Kokil Jaidka, and Philipp Mayr
- Subjects
Information retrieval ,business.industry ,Computer science ,Information seeking ,05 social sciences ,02 engineering and technology ,Scientometrics ,Bibliometrics ,computer.software_genre ,Digital library ,Automatic summarization ,Information extraction ,Text mining ,User experience design ,Citation analysis ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,0509 other social sciences ,Computational linguistics ,050904 information & library sciences ,business ,computer ,Natural language processing - Abstract
The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Information retrieval (IR), bibliometric and natural language processing (NLP) techniques could enhance scholarly search, retrieval and user experience but are not yet widely used. To this purpose, we propose the third iteration of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL). The workshop is intended to stimulate IR, NLP researchers and Digital Library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, text mining and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The BIRNDL workshop will incorporate multiple invited talks, paper sessions, a poster session and the 4th edition of the Computational Linguistics (CL) Scientific Summarization Shared Task.
- Published
- 2018
- Full Text
- View/download PDF
38. Modeling and Visualizing Locus of Control with Facebook Language
- Author
-
Kokil Jaidka, Buffone, A., Eichstaedt, J., Rouhizadeh, M., and Ungar, L. H.
- Abstract
A body of literature has demonstrated that users' psychological traits such as personality can be predicted from their posts on social media. However, there is still a gap between the computational and descriptive analyses of the language features associated with different psychological traits, and their use by social scientists and psychologists to make deeper behavioral inferences. In this study, we aim to bridge this gap with a visualization that situates the language associated with one psychological trait in the context of other psychological dimensions. We predict Locus of Control (LoC), an individual's perception of personal control over events in their lives, from their Facebook language (F1=0.82). We then look at how language explains the relationship of LoC with consciousness and emotional stability.
- Published
- 2018
- Full Text
- View/download PDF
39. Facebook versus Twitter: Differences in Self-Disclosure and Trait Prediction
- Author
-
Kokil Jaidka, Sharath Guntuku, and Lyle Ungar
- Abstract
This study compares self-disclosure on Facebook and Twitter through the lens of demographic and psychological traits. Predictive evaluation reveals that language models trained on Facebook posts are more accurate at predicting age, gender, stress, and empathy than those trained on Twitter posts. Qualitative analyses of the underlying linguistic and demographic differences reveal that users are significantly more likely to disclose information about their family, personal concerns, and emotions and provide a more 'honest' self-representation on Facebook. On the other hand, the same users significantly preferred to disclose their needs, drives, and ambitions on Twitter. The higher predictive performance of Facebook is also partly due to the greater volume of language on Facebook than Twitter: Facebook and Twitter are equally good at predicting user traits when the same-sized language samples are used to train language models. We explore the implications of these differences in cross-platform user trait prediction.
- Published
- 2018
- Full Text
- View/download PDF
40. Identifying Locus of Control in Social Media Language
- Author
-
Lyle H. Ungar, H. Andrew Schwartz, Kokil Jaidka, Anneke Buffone, Laura Smith, and Masoud Rouhizadeh
- Subjects
Cognitive science ,Locus of control ,Computer science ,Rhetorical question ,Social media ,Semantics ,Control (linguistics) ,Syntax ,Style (sociolinguistics) ,Task (project management) - Abstract
Individuals express their locus of control, or “control”, in their language when they identify whether or not they are in control of their circumstances. Although control is a core concept underlying rhetorical style, it is not clear whether control is expressed by how or by what authors write. We explore the roles of syntax and semantics in expressing users’ sense of control (i.e., being “controlled by” or “in control of” their circumstances) in a corpus of annotated Facebook posts. We present rich insights into these linguistic aspects and find that while the language signaling control is easy to identify, it is more challenging to label it as internally or externally controlled, with lexical features outperforming syntactic features at the task. Our findings could have important implications for studying self-expression in social media.
- Published
- 2018
- Full Text
- View/download PDF
41. Diachronic degradation of language models: Insights from social media
- Author
-
Lyle H. Ungar, Niyati Chhaya, and Kokil Jaidka
- Subjects
Computer science ,0202 electrical engineering, electronic engineering, information engineering ,Profiling (information science) ,020201 artificial intelligence & image processing ,Social media ,02 engineering and technology ,Language model ,010501 environmental sciences ,01 natural sciences ,Data science ,Natural language ,0105 earth and related environmental sciences - Abstract
Natural languages change over time because they evolve to the needs of their users and the socio-technological environment. This study investigates the diachronic accuracy of pre-trained language models for downstream tasks in machine learning and user profiling. It asks the question: given that the social media platform and its users remain the same, how is language changing over time? How can these differences be used to track the changes in the affect around a particular topic? To our knowledge, this is the first study to show that it is possible to measure diachronic semantic drifts within social media and within the span of a few years.
- Published
- 2018
- Full Text
- View/download PDF
42. Understanding and Measuring Psychological Stress using Social Media
- Author
-
Sharath Chandra Guntuku, Anneke Buffone, Kokil Jaidka, Johannes C. Eichstaedt, and Lyle H. Ungar
- Subjects
FOS: Computer and information sciences ,Computer Science - Computers and Society ,Computer Science - Computation and Language ,Computers and Society (cs.CY) ,Computation and Language (cs.CL) - Abstract
A body of literature has demonstrated that users' mental health conditions, such as depression and anxiety, can be predicted from their social media language. There is still a gap in the scientific understanding of how psychological stress is expressed on social media. Stress is one of the primary underlying causes and correlates of chronic physical illnesses and mental health conditions. In this paper, we explore the language of psychological stress with a dataset of 601 social media users, who answered the Perceived Stress Scale questionnaire and also consented to share their Facebook and Twitter data. Firstly, we find that stressed users post about exhaustion, losing control, increased self-focus and physical pain as compared to posts about breakfast, family-time, and travel by users who are not stressed. Secondly, we find that Facebook language is more predictive of stress than Twitter language. Thirdly, we demonstrate how the language-based models thus developed can be adapted and scaled to measure county-level trends. Since county-level language is easily available on Twitter using the Streaming API, we explore multiple domain adaptation algorithms to adapt user-level Facebook models to Twitter language. We find that domain-adapted and scaled social media-based measurements of stress outperform sociodemographic variables (age, gender, race, education, and income), against ground-truth survey-based stress measurements, both at the user- and the county-level in the U.S. Twitter language that scores higher in stress is also predictive of poorer health, less access to facilities and lower socioeconomic status in counties. We conclude with a discussion of the implications of using social media as a new tool for monitoring stress levels of both individuals and counties. Comment: Accepted for publication in the proceedings of ICWSM 2019
- Published
- 2018
- Full Text
- View/download PDF
43. Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)
- Author
-
Philipp Mayr, Kokil Jaidka, and Muthu Kumar Chandrasekaran
- Subjects
FOS: Computer and information sciences ,Computer science ,Bibliometrics ,050905 science studies ,computer.software_genre ,Computer Science - Information Retrieval ,World Wide Web ,Text mining ,Question answering ,Relevance (information retrieval) ,Digital Libraries (cs.DL) ,Cognitive models of information retrieval ,Information retrieval ,business.industry ,Information seeking ,05 social sciences ,Computer Science - Digital Libraries ,Digital library ,Data science ,Automatic summarization ,Information extraction ,Human–computer information retrieval ,Artificial intelligence ,0509 other social sciences ,Computational linguistics ,050904 information & library sciences ,business ,computer ,Natural language processing ,Information Retrieval (cs.IR) - Abstract
The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometrics, information retrieval (IR), text mining and NLP techniques could help in these search and look-up activities, but are not yet widely used. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, text mining and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The BIRNDL workshop at SIGIR 2017 will incorporate an invited talk, paper sessions and the third edition of the Computational Linguistics (CL) Scientific Summarization Shared Task. Comment: 2 pages, workshop paper accepted at the SIGIR 2017
- Published
- 2017
- Full Text
- View/download PDF
44. Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016)
- Author
-
Muthu Kumar Chandrasekaran, Guillaume Cabanac, Kokil Jaidka, Ingo Frommholz, Min-Yen Kan, Dietmar Wolfram, Philipp Mayr, Recherche d’Information et Synthèse d’Information (IRIT-IRIS), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Université Toulouse III - Paul Sabatier (UT3), National University of Singapore (NUS), University of Bedfordshire, Adobe Systems Inc., Leibniz-Institute for the Social Sciences [Mannheim] (GESIS ), UFR Santé, Médecine et Biologie Humaine, Centre National de la Recherche Scientifique - CNRS (FRANCE), Institut National Polytechnique de Toulouse - INPT (FRANCE), Université Toulouse III - Paul Sabatier - UT3 (FRANCE), Université Toulouse - Jean Jaurès - UT2J (FRANCE), Université Toulouse 1 Capitole - UT1 (FRANCE), Adobe System (USA), Leibniz Institute for the Social Sciences - GESIS (GERMANY), National University of Singapore - NUS (REPUBLIC OF SINGAPORE), University of Wisconsin - Milwaukee (USA), University of Bedfordshire (UNITED KINGDOM), and Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
- Subjects
Text mining ,Computer science ,Bibliometrics ,050905 science studies ,computer.software_genre ,Information retrieval ,Théorie de l'information ,Digital libraries ,business.industry ,Scale (chemistry) ,Natural language processing ,05 social sciences ,Recherche d'information ,Sensemaking ,Digital library ,Metadata ,[INFO.INFO-IT]Computer Science [cs]/Information Theory [cs.IT] ,[INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] ,Artificial intelligence ,0509 other social sciences ,Computational linguistics ,050904 information & library sciences ,business ,computer - Abstract
The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometric, information retrieval (IR), text mining and NLP techniques could help in these activities, but are not yet widely used in digital libraries. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometric and recommendation techniques which can advance the state of the art in scholarly document understanding, analysis and retrieval at scale.
- Published
- 2016
45. A literature review framework for multi-document summarization of research papers
- Author
-
Kokil Jaidka, Jin Cheon Na, Khoo Soo Guan, Christopher, and Wee Kim Wee School of Communication and Information
- Subjects
Library and information science::General [DRNTU] ,Humanities::Linguistics::Sociolinguistics::Computational linguistics [DRNTU] ,Humanities::Linguistics [DRNTU] ,Engineering::Computer science and engineering::Information systems::Information systems applications [DRNTU] - Abstract
This study is in the area of multi-document summarization of research papers. It addresses the gap identified between the structure and readability of human-written summaries and other automatic multi-document summaries, which only focus on selecting the more important information from the set of documents but neglect to consider its readability. In the context of this overall goal, the first part of this study develops a literature review framework which specifies the structural, rhetorical and content characteristics of human-written literature reviews. In the second part of the study, an automatic method is developed which partially implements this framework, to generate multi-document summaries of research papers emulating some characteristics of human-written literature reviews. The framework is based on extensive discourse and information analyses of literature reviews in the domain of information science. The corpus for analysis comprised 120 literature review sections published as a part of research papers in top international peer-reviewed information science journals – Journal of the American Society for Information Science and Technology (JASIST), Journal of Information Science (JIS) and Journal of Documentation (JDoc) – over the years 2000-2008. The macro-level analysis identifies the document structure within a literature review, which comprises 9 types of discourse elements. The sentence-level analysis identifies 22 rhetorical functions employed in literature reviews and 153 linguistic devices which frame information within sentences. The information analysis identifies significant associations between the source sections of selected sentences and the transformations performed on them. Results show that literature reviews are written in two main styles – integrative literature reviews and descriptive literature reviews.
Integrative literature reviews present information from several studies in a condensed form as a critical summary, possibly complemented with a comparison, evaluation or comment on the research gap. They focus on highlighting relationships amongst concepts or comparing studies against each other. Descriptive reviews present more experimental detail about previous studies, such as their approach, results and evaluation. These findings are incorporated into the multi-level literature review framework, comprising their macro-level structure and their rhetorical functions, as well as the information summarization strategies. Based on this framework, in the second part of the study a multi-document summarization method emulating characteristics of human literature reviews is developed to generate an integrative summary that combines information across the papers and highlights the agreements and disagreements among them. It extracts information concepts from research papers by imitating researchers’ preferences, integrates them across the set of related papers and organizes them as a topic tree; finally, it presents them using sentence templates which realize rhetorical functions. The method presented here focuses only on summarizing and comparing research objective information across papers, and hence it applies only those components of the framework which are appropriate for choosing and synthesizing research objective information. Automatic content evaluation shows no significant difference between the summaries generated by the automatic method and those of the baseline sentence extraction system, MEAD.
However, the quality characteristics of the automatic summaries are a significant improvement over MEAD summaries because about two-thirds of all assessors (35 PhD students and professors in Library and Information Science) preferred to use them over MEAD summaries; they are also perceived as significantly more useful for obtaining a research overview or seeing comparisons across studies. The automatic summaries are also considered more readable in the way they relate topics and sentences to each other. However, they still have grammatical errors and repetitions; to resolve those, it is recommended to include some post-processing steps in the automatic method. Assessors with different levels of research experience are found to hold different expectations from the final summary – the ones with less experience look for more details about individual studies; it can be inferred that they prefer a more descriptive literature review. More experienced assessors want to understand the bigger picture and the main themes of the research; evidently, they want a more integrative literature review. These insights can help in customizing the automatic method for its users.
This study is in the area of multi-document summarization of research papers. It addresses the gap identified between the quality of human-written summaries and other automatic multi-document summaries, which only focus on selecting the more important information from the set of documents but neglect to consider its readability. In the context of this overall goal, the first part of this study develops a literature review framework which specifies the structural, rhetorical and content characteristics of human-written literature reviews. The framework is based on extensive discourse and content analysis of literature reviews which identified the macro-level structure, sentence-level rhetorical functions and the authors' selection and transformation strategies which constitute literature reviews.
The second part of the study develops an automatic method to partially implement this framework and generate multi-document summaries of research papers emulating some characteristics of human-written literature reviews in selecting, integrating, organizing and framing information. Assessors perceive this automatic summary as significantly more useful and readable than the summaries of the baseline system, MEAD, which employs a sentence extraction method. Doctor of Philosophy (WKWSCI)
- Published
- 2014
46. Protests against #delhigangrape on Twitter: Analyzing India’s Arab Spring
- Author
-
Kokil Jaidka and Saifuddin Ahmed
- Subjects
Sociology and Political Science ,Twitter ,Media studies ,Information Dissemination ,Citizen journalism ,information dissemination ,social movement ,Computer Science Applications ,ComputingMilieux_GENERAL ,protest ,protest reporting ,lcsh:Political science (General) ,Political science ,Capital city ,citizen journalism ,Social media ,InformationSystems_MISCELLANEOUS ,lcsh:JA1-92 ,Social network analysis ,Social movement - Abstract
This study offers a comprehensive approach towards analyzing and explaining the role of Twitter in shaping and facilitating social movements especially during protests. It presents automatic and manual analyses of the tweet themes, usage characteristics and major Twitter users during a public outcry against a gangrape incident in Delhi, the capital city of India. Our results identified Twitter as an important channel for the diffusion of ideas and news among a vast set of adopters in defiance of geographical boundaries. Results of the content analyses highlight the prominent use of social media resources in disseminating information on Twitter, and the remarkable role of Twitter users as citizen journalists during the days of the protest. Results of the social network analysis suggest that major role players on Twitter were the offline protest leaders.
- Published
- 2013
47. The CL-scisumm shared task 2018: Results and key insights
- Author
-
Kokil Jaidka, Yasunaga, M., Chandrasekaran, M. K., Radev, D., and Kan, M.-Y.
- Subjects
FOS: Computer and information sciences ,Computer Science - Computation and Language ,Computation and Language (cs.CL) ,Information Retrieval (cs.IR) ,Computer Science - Information Retrieval - Abstract
This overview describes the official results of the CL-SciSumm Shared Task 2018 -- the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. This year, the dataset comprised 60 annotated sets of citing and reference papers from the open access research papers in the CL domain. The Shared Task was organized as a part of the 41st Annual Conference of the Special Interest Group in Information Retrieval (SIGIR), held in Ann Arbor, USA in July 2018. We compare the participating systems in terms of two evaluation metrics. The annotated dataset and evaluation scripts can be accessed and used by the community from: https://github.com/WING-NUS/scisumm-corpus. BIRNDL @ SIGIR 2018. arXiv admin note: substantial text overlap with arXiv:1907.09854