Descriptor: "Natural Language Processing" / Topic: computer - Searchworks@Jio Institute Digital Library Search Results

Your search for "Natural Language Processing" returned 171,671 results.

1. Measuring Implicit Bias in ICU Notes Using Word-Embedding Neural Network Models.

Author: Cobert, Julien, Mills, Hunter, Lee, Albert, Gologorskaya, Oksana, Espejo, Edie, Jeon, Sun, Boscardin, W, Heintz, Timothy, Kennedy, Christopher, Ashana, Deepshikha, Chapman, Allyson, Raghunathan, Karthik, Smith, Alex, and Lee, Sei
Subjects: critical care, inequity, linguistics, machine learning, natural language processing, Humans, Natural Language Processing, Intensive Care Units, Neural Networks, Computer, Algorithms, Critical Illness, Bias, Electronic Health Records, Male, Female
Abstract: BACKGROUND: Language in nonmedical data sets is known to transmit human-like biases when used in natural language processing (NLP) algorithms that can reinforce disparities. It is unclear if NLP algorithms of medical notes could lead to similar transmissions of biases. RESEARCH QUESTION: Can we identify implicit bias in clinical notes, and are biases stable across time and geography? STUDY DESIGN AND METHODS: To determine whether different racial and ethnic descriptors are similar contextually to stigmatizing language in ICU notes and whether these relationships are stable across time and geography, we identified notes on critically ill adults admitted to the University of California, San Francisco (UCSF), from 2012 through 2022 and to Beth Israel Deaconess Hospital (BIDMC) from 2001 through 2012. Because word meaning is derived largely from context, we trained unsupervised word-embedding algorithms to measure the similarity (cosine similarity) quantitatively of the context between a racial or ethnic descriptor (eg, African-American) and a stigmatizing target word (eg, noncooperative) or group of words (violence, passivity, noncompliance, nonadherence). RESULTS: In UCSF notes, Black descriptors were less likely to be similar contextually to violent words compared with White descriptors. Contrastingly, in BIDMC notes, Black descriptors were more likely to be similar contextually to violent words compared with White descriptors. The UCSF data set also showed that Black descriptors were more similar contextually to passivity and noncompliance words compared with Latinx descriptors. INTERPRETATION: Implicit bias is identifiable in ICU notes. Racial and ethnic group descriptors carry different contextual relationships to stigmatizing words, depending on when and where notes were written. Because NLP models seem able to transmit implicit bias from training data, use of NLP algorithms in clinical prediction could reinforce disparities. Active debiasing strategies may be necessary to achieve algorithmic fairness when using language models in clinical research.
Published: 2024
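
Note: the cosine-similarity measure named in the abstract above can be illustrated with a minimal sketch. The function name `cosine_similarity` and the three-dimensional toy vectors are hypothetical stand-ins, not the study's code or its trained embeddings; real word embeddings typically have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumed values, for illustration only).
descriptor = [1.0, 2.0, 0.0]
target_word = [2.0, 4.0, 0.0]  # points in the same direction as descriptor
unrelated = [0.0, 0.0, 3.0]    # orthogonal to descriptor

print(cosine_similarity(descriptor, target_word))  # close to 1: similar contexts
print(cosine_similarity(descriptor, unrelated))    # 0: no shared direction
```

Values near 1 indicate that two words appear in similar contexts, values near 0 indicate little contextual overlap, and negative values indicate opposing directions in the embedding space.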

2. Artificial Intelligence and Machine Learning.

Author: Muthuraj and Singla, Shrutika
Subjects: BIOLOGICAL evolution, REINFORCEMENT (Psychology), DATA security, ARTIFICIAL intelligence, NATURAL language processing, DEEP learning, ARTIFICIAL neural networks, MACHINE learning, ALGORITHMS, USER interfaces
Abstract: Artificial Intelligence (AI) and Machine Learning (ML) have rapidly gained prominence as transformative technologies with immense potential to revolutionize various industries and domains. This research paper presents a comprehensive review of AI and ML, encompassing their fundamental concepts, techniques, and applications. Additionally, it explores recent advancements in the field and offers valuable insights into the future prospects of AI and ML. The paper discusses the historical evolution of AI, the different approaches to AI development, and the components that constitute AI systems. Furthermore, it delves into the core concepts and algorithms of ML, including supervised, unsupervised, and reinforcement learning, as well as the advent of deep learning and neural networks. The applications of AI and ML across diverse domains such as natural language processing, computer vision, healthcare, and finance are also discussed. Recent advancements, such as transfer learning, generative adversarial networks, explainable AI, and federated learning, are highlighted, along with the challenges and limitations faced by these technologies, such as ethical concerns, data quality issues, and interpretability challenges. The paper concludes by presenting future perspectives, including the integration of AI with other technologies, advancements in human-computer interaction, and the impact of quantum computing on ML. This research emphasizes the importance of ongoing research and development in AI and ML and the need to address ethical, security, and interpretability considerations for responsible and beneficial implementation in society. [ABSTRACT FROM AUTHOR]
Published: 2023

3. Identifying suicide documentation in clinical notes through zero‐shot learning.

Author: Workman, Terri Elizabeth, Goulet, Joseph L., Brandt, Cynthia A., Warren, Allison R., Eleazer, Jacob, Skanderson, Melissa, Lindemann, Luke, Blosnich, John R., O'Leary, John, and Zeng‐Treitler, Qing
Abstract: Background and Aims: In deep learning, a major difficulty in identifying suicidality and its risk factors in clinical notes is the lack of training samples given the small number of true positive instances among the number of patients screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero‐shot learning. Our general aim was to develop a tool that leveraged zero‐shot learning to effectively identify suicidality documentation in all types of clinical notes. Methods: US Veterans Affairs clinical notes served as data. The training data set label was determined using diagnostic codes of suicide attempt and self‐harm. We used a base string associated with the target label of suicidality to provide auxiliary information by narrowing the positive training cases to those containing the base string. We trained a deep neural network by mapping the training documents' contents to a semantic space. For comparison, we trained another deep neural network using the identical training data set labels, and bag‐of‐words features. Results: The zero‐shot learning model outperformed the baseline model in terms of area under the curve, sensitivity, specificity, and positive predictive value at multiple probability thresholds. In applying a 0.90 probability threshold, the methodology identified notes documenting suicidality but not associated with a relevant ICD‐10‐CM code, with 94% accuracy. Conclusion: This method can effectively identify suicidality without manual annotation. Key points: Due to data sparsity, suicidality documentation is difficult to identify in clinical notes. Zero‐shot learning addresses the data sparsity issue. Zero‐shot learning enables identification of suicidality and its risks in clinical notes, and associated patients, where no diagnostic code relevant to suicide or self‐harm has been recorded. [ABSTRACT FROM AUTHOR]
Published: 2023

4. Identifying suicide documentation in clinical notes through zero‐shot learning

Author: Terri Elizabeth Workman, Joseph L. Goulet, Cynthia A. Brandt, Allison R. Warren, Jacob Eleazer, Melissa Skanderson, Luke Lindemann, John R. Blosnich, John O'Leary, and Qing Zeng‐Treitler
Subjects: computer, international classification of diseases, natural language processing, neural networks, suicide, Veterans, Medicine
Abstract: Background and Aims: In deep learning, a major difficulty in identifying suicidality and its risk factors in clinical notes is the lack of training samples given the small number of true positive instances among the number of patients screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero‐shot learning. Our general aim was to develop a tool that leveraged zero‐shot learning to effectively identify suicidality documentation in all types of clinical notes. Methods: US Veterans Affairs clinical notes served as data. The training data set label was determined using diagnostic codes of suicide attempt and self‐harm. We used a base string associated with the target label of suicidality to provide auxiliary information by narrowing the positive training cases to those containing the base string. We trained a deep neural network by mapping the training documents’ contents to a semantic space. For comparison, we trained another deep neural network using the identical training data set labels, and bag‐of‐words features. Results: The zero‐shot learning model outperformed the baseline model in terms of area under the curve, sensitivity, specificity, and positive predictive value at multiple probability thresholds. In applying a 0.90 probability threshold, the methodology identified notes documenting suicidality but not associated with a relevant ICD‐10‐CM code, with 94% accuracy. Conclusion: This method can effectively identify suicidality without manual annotation.
Published: 2023

5. Bio-AnswerFinder: a system to find answers to questions from biomedical texts

Author: Ozyurt, Ibrahim Burak, Bandrowski, Anita, and Grethe, Jeffrey S
Subjects: Data Management and Data Science, Information and Computing Sciences, Artificial Intelligence, Good Health and Well Being, Biomedical Research, Data Mining, Humans, Information Storage and Retrieval, Natural Language Processing, Neural Networks, Computer, Data Format, Library and Information Studies, Bioinformatics and computational biology, Data management and data science
Abstract: The ever accelerating pace of biomedical research results in corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes the easy access to that implicit knowledge more vital over time. Toward the goal of making implicit knowledge in the biomedical literature easily accessible to biomedical researchers, we introduce a question answering system called Bio-AnswerFinder. Bio-AnswerFinder uses a weighted-relaxed word mover's distance based similarity on word/phrase embeddings learned from PubMed abstracts to rank answers after question focus entity type filtering. Our approach retrieves relevant documents iteratively via enhanced keyword queries from a traditional search engine. To improve document retrieval performance, we introduced a supervised long short term memory neural network to select keywords from the question to facilitate iterative keyword search. Our unsupervised baseline system achieves a mean reciprocal rank score of 0.46 and Precision@1 of 0.32 on 936 questions from BioASQ. The answer sentences are further ranked by a fine-tuned bidirectional encoder representation from transformers (BERT) classifier trained using 100 answer candidate sentences per question for 492 BioASQ questions. To test ranking performance, we report a blind test on 100 questions that three independent annotators scored. These experts preferred BERT based reranking with 7% improvement on MRR and 13% improvement on Precision@1 scores on average.
Published: 2020
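
Note: the two ranking metrics reported in the abstract above, mean reciprocal rank (MRR) and Precision@1, can be sketched as follows. The function names and the toy question/answer data are hypothetical and are not taken from Bio-AnswerFinder or BioASQ; the sketch assumes each question comes with a ranked list of candidate answers and a set of answers judged correct.

```python
def reciprocal_rank(ranked_answers, correct_answers):
    """Return 1/rank of the first correct answer, or 0.0 if none is found."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in correct_answers:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_answers, correct_answers) pairs."""
    return sum(reciprocal_rank(ranked, correct) for ranked, correct in runs) / len(runs)


def precision_at_1(runs):
    """Fraction of questions whose top-ranked answer is correct."""
    hits = sum(1 for ranked, correct in runs if ranked and ranked[0] in correct)
    return hits / len(runs)


# Hypothetical toy data: one (ranking, gold answers) pair per question.
runs = [
    (["a", "b", "c"], {"a"}),  # correct answer at rank 1, reciprocal rank 1.0
    (["x", "y", "z"], {"y"}),  # correct answer at rank 2, reciprocal rank 0.5
    (["p", "q", "r"], {"s"}),  # no correct answer retrieved, reciprocal rank 0.0
    (["m", "n"], {"n"}),       # correct answer at rank 2, reciprocal rank 0.5
]

print(mean_reciprocal_rank(runs))  # prints 0.5
print(precision_at_1(runs))        # prints 0.25
```

MRR rewards placing a correct answer near the top of the list, while Precision@1 only counts questions whose first-ranked answer is correct, which is why a system's Precision@1 is never higher than its MRR.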

6. Detecting conversation topics in primary care office visits from transcripts of patient-provider interactions

Author: Park, Jihyun, Kotzias, Dimitrios, Kuo, Patty, Logan, Robert L, Merced, Kritzia, Singh, Sameer, Tanana, Michael, Taniskidou, Efi Karra, Lafata, Jennifer Elston, Atkins, David C, Tai-Seale, Ming, Imel, Zac E, and Smyth, Padhraic
Subjects: Clinical Research, Bioengineering, Health Services, Generic health relevance, Good Health and Well Being, Aged, Communication, Datasets as Topic, Humans, Machine Learning, Medical Records, Middle Aged, Natural Language Processing, Neural Networks, Computer, Office Visits, Physician-Patient Relations, Primary Health Care, Tape Recording, classification, supervised machine learning, patient care, communication, Information and Computing Sciences, Engineering, Medical and Health Sciences, Medical Informatics
Abstract: Objective: Amid electronic health records, laboratory tests, and other technology, office-based patient and provider communication is still the heart of primary medical care. Patients typically present multiple complaints, requiring physicians to decide how to balance competing demands. How this time is allocated has implications for patient satisfaction, payments, and quality of care. We investigate the effectiveness of machine learning methods for automated annotation of medical topics in patient-provider dialog transcripts. Materials and methods: We used dialog transcripts from 279 primary care visits to predict talk-turn topic labels. Different machine learning models were trained to operate on single or multiple local talk-turns (logistic classifiers, support vector machines, gated recurrent units) as well as sequential models that integrate information across talk-turn sequences (conditional random fields, hidden Markov models, and hierarchical gated recurrent units). Results: Evaluation was performed using cross-validation to measure 1) classification accuracy for talk-turns and 2) precision, recall, and F1 scores at the visit level. Experimental results showed that sequential models had higher classification accuracy at the talk-turn level and higher precision at the visit level. Independent models had higher recall scores at the visit level compared with sequential models. Conclusions: Incorporating sequential information across talk-turns improves the accuracy of topic prediction in patient-provider dialog by smoothing out noisy information from talk-turns. Although the results are promising, more advanced prediction techniques and larger labeled datasets will likely be required to achieve prediction performance appropriate for real-world clinical applications.
Published: 2019

7. Interactive Conversational Agents for Health Promotion, Prevention, and Care: Protocol for a Mixed Methods Systematic Scoping Review.

Author: Sasseville, Maxime, Sanchez, Romina H. Barony, Yameogo, Achille R., Bergeron-Drolet, Laurie-Ann, Bergeron, Frédéric, and Gagnon, Marie-Pierre
Subjects: HEALTH promotion, CHATBOTS, MIXED methods research, NATURAL language processing, DATA extraction, DIGITAL health
Abstract: Background: Interactive conversational agents, also known as "chatbots," are computer programs that use natural language processing to engage in conversations with humans to provide or collect information. Although the literature on the development and use of chatbots for health interventions is growing, important knowledge gaps remain, such as identifying design aspects relevant to health care and functions to offer transparency in decision-making automation. Objective: This paper presents the protocol for a scoping review that aims to identify and categorize the interactive conversational agents currently used in health care. Methods: A mixed methods systematic scoping review will be conducted according to the Arksey and O'Malley framework and the guidance of Peters et al for systematic scoping reviews. A specific search strategy will be formulated for 5 of the most relevant databases to identify studies published in the last 20 years. Two reviewers will independently apply the inclusion criteria using the full texts and extract data. We will use structured narrative summaries of main themes to present a portrait of the current scope of available interactive conversational agents targeting health promotion, prevention, and care. We will also summarize the differences and similarities between these conversational agents. Results: The search strategy and screening steps were completed in March 2022. Data extraction and analysis started in May 2022, and the results are expected to be published in October 2022. Conclusions: This fundamental knowledge will be useful for the development of interactive conversational agents adapted to specific groups in vulnerable situations in health care and community settings. [ABSTRACT FROM AUTHOR]
Published: 2022

8. A nursing note-aware deep neural network for predicting mortality risk after hospital discharge.

Author: Huang, Yong-Zhen, Chen, Yan-Ming, Lin, Chih-Cheng, Chiu, Hsiao-Yean, and Chang, Yung-Chun
Subjects: *RISK assessment, *PREDICTION models, *CRITICALLY ill, *PATIENTS, *RECEIVER operating characteristic curves, *PATIENT readmissions, *NURSING records, *SYSTEMS development, *CATASTROPHIC illness, *DISCHARGE planning, *NATURAL language processing, *DESCRIPTIVE statistics, *ARTIFICIAL neural networks, *INTENSIVE care units, *ELECTRONIC health records, *SURVIVAL analysis (Biometry), MORTALITY risk factors
Abstract: ICU readmissions and post-discharge mortality pose significant challenges. Previous studies used EHRs and machine learning models, but mostly focused on structured data. Nursing records contain crucial unstructured information, but their utilization is challenging. Natural language processing (NLP) can extract structured features from clinical text. This study proposes the Crucial Nursing Description Extractor (CNDE) to predict post-ICU discharge mortality rates and identify high-risk patients for unplanned readmission by analyzing electronic nursing records. Developed a deep neural network (NurnaNet) with the ability to perceive nursing records, combined with a bio-clinical medicine pre-trained language model (BioClinicalBERT) to analyze the electronic health records (EHRs) in the MIMIC III dataset to predict the death of patients within six month and two year risk. A cohort and system development design was used. Based on data extracted from MIMIC-III, a database of critically ill in the US between 2001 and 2012, the results were analyzed. We calculated patients' age using admission time and date of birth information from the MIMIC dataset. Patients under 18 or over 89 years old, or who died in the hospital, were excluded. We analyzed 16,973 nursing records from patients' ICU stays. We have developed a technology called the Crucial Nursing Description Extractor (CNDE), which extracts key content from text. We use the logarithmic likelihood ratio to extract keywords and combine BioClinicalBERT. We predict the survival of discharged patients after six months and two years and evaluate the performance of the model using precision, recall, the F1-score, the receiver operating characteristic curve (ROC curve), the area under the curve (AUC), and the precision–recall curve (PR curve). The research findings indicate that NurnaNet achieved good F1-scores (0.67030, 0.70874) within six months and two years. Compared to using BioClinicalBERT alone, there was an improvement in performance of 2.05% and 1.08% for predictions within six months and two years, respectively. CNDE can effectively reduce long-form records and extract key content. NurnaNet has a good F1-score in analyzing the data of nursing records, which helps to identify the risk of death of patients after leaving the hospital and adjust the regular follow-up and treatment plan of relevant medical care as soon as possible. [ABSTRACT FROM AUTHOR]
Published: 2024

9. Robust Scene Parsing by Mining Supportive Knowledge From Dataset

Author: Siwei Lyu, Zhicheng Jiao, Ao Luo, Hong Cheng, Fan Yang, Yuezun Li, and Xin Li
Subjects: Parsing, Artificial neural network, Pixel, Computer Networks and Communications, Computer science, business.industry, computer.software_genre, Computer Science Applications, Image (mathematics), Task (project management), Artificial Intelligence, Segmentation, Artificial intelligence, business, Raw data, Representation (mathematics), computer, Software, Natural language processing
Abstract: Scene parsing, or semantic segmentation, aims at labeling all pixels in an image with the predefined categories of things and stuff. Learning a robust representation for each pixel is crucial for this task. Existing state-of-the-art (SOTA) algorithms employ deep neural networks to learn (discover) the representations needed for parsing from raw data. Nevertheless, these networks discover desired features or representations only from the given image (content), ignoring more generic knowledge contained in the dataset. To overcome this deficiency, we make the first attempt to explore the meaningful supportive knowledge, including general visual concepts (i.e., the generic representations for objects and stuff) and their relations from the whole dataset to enhance the underlying representations of a specific scene for better scene parsing. Specifically, we propose a novel supportive knowledge mining module (SKMM) and a knowledge augmentation operator (KAO), which can be easily plugged into modern scene parsing networks. By taking image-specific content and dataset-level supportive knowledge into full consideration, the resulting model, called knowledge augmented neural network (KANN), can better understand the given scene and provide greater representational power. Experiments are conducted on three challenging scene parsing and semantic segmentation datasets: Cityscapes, Pascal-Context, and ADE20K. The results show that our KANN is effective and achieves better results than all existing SOTA methods.
Published: 2023

10. Learning Enhanced Acoustic Latent Representation for Small Scale Affective Corpus with Adversarial Cross Corpora Integration

Author: Chun-Min Chang and Chi-Chun Lee
Subjects: Domain adaptation, business.industry, Computer science, Scale (chemistry), computer.software_genre, Autoencoder, Human-Computer Interaction, Constraint (information theory), Adversarial system, Artificial intelligence, Representation (mathematics), Transfer of learning, business, computer, Feature learning, Software, Natural language processing
Abstract: Achieving robust cross contexts speech emotion recognition (SER) has become a critical next direction of research for wide adoption of SER technology. The core challenge is in the large variability of affective speech that is highly contextualized. Prior works have worked on this as a transfer learning problem that mostly focuses on developing domain adaptation strategy. However, many of the existing speech emotion corpora, even those considered as large scale, are still limited in size resulting in an unsatisfactory transfer result. On the other hand, directly collecting context-specific corpus often results in an even smaller data size leading to an inevitably non-robust accuracy. In order to mitigate this issue, we propose the concept of enhancing the affect-related variability when learning the in-context acoustic latent representation by integrating out-of-context emotion data. Specifically, we utilize adversarial autoencoder network as our backbone with multiple out-of-context emotion labels derived for each in-context samples that serve as an auxiliary constraint in learning the latent representation. We extensively evaluate our framework using three in-context databases with three out-of-context databases. In this work, we demonstrate not only an improved recognition accuracy but also a comprehensive analysis on the effectiveness of this representation learning strategy.
Published: 2023

11. A Deep Learning Framework for News Readers’ Emotion Prediction Based on Features From News Article and Pseudo Comments

Author: Zhao Sun, Ying Wang, Xu Mou, Xintong Li, Qinke Peng, and Muhammad Fiaz Bashir
Subjects: Structure (mathematical logic), business.industry, Computer science, Process (engineering), Deep learning, Representation (arts), computer.software_genre, ENCODE, Computer Science Applications, Human-Computer Interaction, Control and Systems Engineering, Encoding (semiotics), The Internet, Social media, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Software, Natural language processing, Information Systems
Abstract: With the rapid development of the Internet, readers tend to share their views and emotions about news events. Predicting these emotions provides a vital role in social media applications (e.g., sentiment retrieval, opinion summary, and election prediction). However, news articles usually consist of objective texts that lack emotion words, making emotion prediction challenging. From prior studies, we know that comments that come directly from readers are full of emotions. Therefore, in this article, we propose a deep learning framework that first merges article and comment information to predict readers' emotions. At the same time, in the prediction process, we design a pseudo comment representation for unpublished news articles by the comments of published news. In addition, a better model is required to encode articles that contain implicit emotions. To solve this problem, we propose a block emotion attention network (BEAN) to encode news articles better. It includes an emotion attention mechanism and a hierarchical structure to capture emotion words and generate structural information during encoding. Experiments performed on three public datasets show that BEAN achieves the state-of-the-art average Pearson (AP) and accuracy (Acc@1). Moreover, results on four self-collected datasets show that both the introduction of emotional comments and BEAN in our framework improve the ability to predict readers' emotions.
Published: 2023

12. Varieties of Error and Varieties of Evidence in Scientific Inference

Author: Juergen Landes and Barbara Osimani
Subjects: Philosophy, History, History and Philosophy of Science, business.industry, Scientific inference, Artificial intelligence, business, computer.software_genre, computer, Natural language processing, Mathematics
Published: 2023

13. Scalable Identity-Oriented Speech Retrieval

Author: Rongzhong Lian, Lixin Fan, Chen Zhang, Jinhua Peng, Chaotao Chen, Yawen Li, Lei Chen, and Di Jiang
Subjects: Computer science, business.industry, Search engine indexing, Speech retrieval, Financial risk management, Snippet, computer.software_genre, Computer Science Applications, Task (project management), Computational Theory and Mathematics, Scalable system, Scalability, Identity (object-oriented programming), Artificial intelligence, business, computer, Natural language processing, Information Systems
Abstract: With the prevalence of voice devices in our daily life, speech data is accumulated at an unprecedented speed. The vast amount of speech data form an invaluable database for security surveillance and financial risk management. However, the speeches collected from different sources are not necessarily annotated with regard to the speaker identity, making the task of retrieving all the speech records for a given identity extremely challenging. In this paper, we propose a scalable system for Identity-Oriented Speech Retrieval (IO-SR), which seamlessly integrates speaker modeling and deep indexing techniques. Given a speech snippet and a speech database, IO-SR efficiently retrieves all speech snippets that are uttered by the same speaker as the given one. Evaluations on an industrial dataset containing millions of speech snippets show that our system achieves superior performance compared with the state-of-the-arts.
Published: 2023

14. Pose-Guided Hierarchical Semantic Decomposition and Composition for Human Parsing

Author: Beibei Yang, Changqian Yu, Changxin Gao, Nong Sang, and Jin-Gang Yu
Subjects: Structure (mathematical logic), Parsing, Intersection (set theory), business.industry, Computer science, computer.software_genre, Computer Science Applications, Task (project management), Human-Computer Interaction, Control and Systems Engineering, Decomposition (computer science), Segmentation, Artificial intelligence, Electrical and Electronic Engineering, Focus (optics), business, Pose, computer, Software, Natural language processing, Information Systems
Abstract: Human parsing is a fine-grained semantic segmentation task, which needs to understand human semantic parts. Most existing methods model human parsing as a general semantic segmentation, which ignores the inherent relationship among hierarchical human parts. In this work, we propose a pose-guided hierarchical semantic decomposition and composition framework for human parsing. Specifically, our method includes a semantic maintained decomposition and composition (SMDC) module and a pose distillation (PC) module. SMDC progressively disassembles the human body to focus on the more concise regions of interest in the decomposition stage and then gradually assembles human parts under the guidance of pose information in the composition stage. Notably, SMDC maintains the atomic semantic labels during both stages to avoid the error propagation issue of the hierarchical structure. To further take advantage of the relationship of human parts, we introduce pose information as explicit guidance for the composition. However, the discrete structure prediction in pose estimation is against the requirement of the continuous region in human parsing. To this end, we design a PC module to broadcast the maximum responses of pose estimation to form the continuous structure in the way of knowledge distillation. The experimental results on the look-into-person (LIP) and PASCAL-Person-Part datasets demonstrate the superiority of our method compared with the state-of-the-art methods, that is, 55.21% mean Intersection of Union (mIoU) on LIP and 69.88% mIoU on PASCAL-Person-Part.
Published: 2023
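
Note: the mIoU figure quoted in the abstract above is a mean of per-class intersection-over-union scores (the metric is more commonly written "Intersection over Union"). The sketch below is a simplified illustration on a single flattened label map; the function name `mean_iou` and the toy labels are hypothetical, and benchmark evaluation code may instead accumulate intersections and unions over a whole dataset before averaging.

```python
def mean_iou(predicted, target, num_classes):
    """Mean Intersection over Union across classes present in either label map.

    `predicted` and `target` are equal-length flat lists of integer class labels,
    one entry per pixel.
    """
    if len(predicted) != len(target):
        raise ValueError("label maps must have the same number of pixels")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target) if p == cls or t == cls)
        if union > 0:  # skip classes that appear in neither map
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the label maps")
    return sum(ious) / len(ious)


# Hypothetical six-pixel label maps with three classes (0, 1, 2).
predicted = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

# Per-class IoU: class 0 is 1/3, class 1 is 2/3, class 2 is 1/2.
print(mean_iou(predicted, target, num_classes=3))
```

For these toy labels the mean of the three per-class scores is 0.5, up to floating-point rounding.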

15. The Accuracy Trap: The Values and Meaning of Algorithmic Mapping, from Mineral Extraction to Climate Change

Author: William Rankin
Subjects: History, business.industry, Geography, Planning and Development, Climate change, Environmental Science (miscellaneous), computer.software_genre, Trap (computing), Arts and Humanities (miscellaneous), Extraction (military), Meaning (existential), Artificial intelligence, business, computer, Natural language processing, Mathematics
Abstract: For specialists and non-specialists alike, maps are one of the central ways that the environment becomes visible and comprehensible. Since the 1960s, both the practice and the values of environmental mapping have been transformed by new algorithmic methods for turning point-by-point measurements into a smooth cartographic image, especially when visualising the invisible geographies of pollution, climate change, and underground resources. The resulting maps are now ubiquitous. This article argues that algorithmic methods - especially those created by a small group of French mining engineers and installed widely in software by the 1990s - shifted the values and meaning of environmental mapping away from a traditional concern with qualitative realism to a new emphasis on quantitative accuracy. This was a shift not just in the goals of mapping but in the kind of environment that maps ultimately construct. The prioritisation of accuracy should therefore not be seen as a straightforward improvement, as its associated values raise difficult conceptual problems, both historical and historiographic, about scale, expertise and the role of human judgement in the creation of environmental fact. Historicising the techniques and meanings of mapping is especially important as environmental historians consider new geospatial methods - including algorithmic methods - in their own work.
Published: 2023

16. Automatic emotion recognition for groups: a review

Author: Koen V. Hindriks, Emmeke Anna Veltmeijer, and Charlotte Gerritsen
Subjects: Coping (psychology), Modalities, Computer science, business.industry, computer.software_genre, Code (semiotics), Human-Computer Interaction, Automatic group, Crowds, Robustness (computer science), Social media, Artificial intelligence, Emotion recognition, business, computer, Software, Natural language processing
Abstract: This review aims to summarize and describe research on the topic of automatic group emotion recognition. In recent years, the topic of emotion analysis of groups or crowds has gained interest, with studies performing emotion detection in different contexts, using different datasets and modalities (such as images, video, audio, social media messages), and taking different approaches. Articles are included after an innovative search method, including Dense Query Extraction and automatic cross-referencing. Discussed are the types of groups and emotion models considered in automatic emotion recognition research, common datasets for all modalities, general approaches taken, and reported performances. These performances are discussed, followed by an analysis of the application possibilities of the discussed methods. To ensure clear, replicable, and comparable studies, we suggest research should test on multiple, common datasets and report on multiple metrics, when possible. Implementation details and code should be made available where possible. An area of interest for future work is to build systems with more real-world application possibilities, coping with changing group sizes, different emotional subgroups, and changing emotions over time, while having a higher robustness and working with datasets with reduced biases.
Published: 2023

17. Detecting Dependency-Related Sentiment Features for Aspect-Level Sentiment Classification

Author: Changxi Zhu, Jingyun Xu, Xing Zhang, Xingwei Tan, and Yi Cai
Subjects: Dependency (UML), Artificial neural network, Computer science, Polarity (physics), Dependency relation, business.industry, Parse tree, computer.software_genre, Term (time), Human-Computer Interaction, Syntactic structure, Artificial intelligence, business, computer, Software, Sentence, Natural language processing
Abstract: Aspect level sentiment classification aims to classify the sentiment polarity of a given aspect term or aspect category in a sentence. For sentiment classification towards a given aspect term, since a sentence may contain more than one aspect term, there may exist some opinions which are not the modifiers of the given aspect term. It is necessary to capture relevant opinion for a certain aspect term. Previous works use the relative distance between an aspect term and all other words in a sentence, in order to capture the nearest opinion of the aspect term. This can be infeasible when the sentence has a complex syntactic structure. In this paper, we detect the dependency relation between an aspect term and its related sentiment words in the dependency parse tree. Then, we integrate this relationship into CNN and Bi-LSTM respectively. Experiments show that the related sentiment features for an aspect term is helpful for models to discriminate its sentiment polarity, and our proposed models achieve state-of-the-art results among neural network models.
Published: 2023
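
The contrast drawn in the abstract of item 17, between picking the opinion word nearest in the sentence and picking the one nearest in the dependency parse tree, can be illustrated with a minimal sketch. This is not the authors' model (which integrates the dependency relation into CNN and Bi-LSTM networks). The example sentence, the hand-annotated `heads` array, the opinion lexicon, and the function names are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical example sentence; the aspect term is "battery" (index 1).
tokens = ["The", "battery", "of", "this", "cheap", "phone", "is", "excellent"]
# heads[i] is the index of the syntactic head of token i; -1 marks the root.
# Hand-annotated for this example (assumption), not produced by a parser.
heads = [1, 7, 5, 5, 5, 1, 7, -1]
opinion_words = {"cheap", "excellent"}  # hypothetical opinion lexicon


def tree_distances(heads, start):
    """Breadth-first search over the undirected dependency tree."""
    adjacency = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            adjacency[child].add(head)
            adjacency[head].add(child)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def nearest_opinion(tokens, heads, aspect_index, opinion_words):
    """Return (nearest by word position, nearest by tree distance)."""
    candidates = [i for i, t in enumerate(tokens) if t.lower() in opinion_words]
    by_position = min(candidates, key=lambda i: abs(i - aspect_index))
    dist = tree_distances(heads, aspect_index)
    by_tree = min(candidates, key=lambda i: dist[i])
    return tokens[by_position], tokens[by_tree]


print(nearest_opinion(tokens, heads, 1, opinion_words))
```

For this sentence the call returns `('cheap', 'excellent')`: relative word distance selects "cheap", which modifies "phone", while the tree distance selects "excellent", the word that actually describes the battery.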

18. Embedding Refinement Framework for Targeted Aspect-Based Sentiment Analysis

Author: Lin Gui, Jiachen Du, Rongdi Yin, Bin Liang, Ruifeng Xu, Yulan He, and Min Yang
Subjects: Commonsense knowledge, Computer science, business.industry, Deep learning, Sentiment analysis, Context (language use), computer.software_genre, QA76, Sparse coefficient, Human-Computer Interaction, Benchmark (computing), Embedding, Artificial intelligence, business, computer, Software, Word (computer architecture), Natural language processing
Abstract: The state-of-the-art approaches to Targeted Aspect-Based Sentiment Analysis (TABSA) are mostly deep learning models based on attention mechanisms. One problem in most previous studies is that embeddings of targets and aspects are either pre-trained from large external corpora or randomly initialized. We argue that affective commonsense knowledge and words indicative of sentiment could be used to learn better target and aspect embeddings. We therefore propose an embedding refinement framework called RAEC (Refining Affective Embedding from Context), in which sentiment concepts extracted from affective commonsense knowledge and word relative location information are incorporated to derive context-affective embeddings. Furthermore, a sparse coefficient vector is exploited in refining the embeddings of targets and aspects separately. In this way, embeddings of targets and aspects can capture the highly relevant affective words. Experimental results on two benchmark datasets show that our framework can be easily integrated with existing embedding-based TABSA models and achieves state-of-the-art results compared to models relying on pre-trained word embeddings or built on other embedding refinement methods.
Published: 2023

19. Cyclic Selection: Auxiliaries Are Merged, Not Inserted

Author: Asia Pietraszko
Subjects: Linguistics and Language, Computer science, business.industry, Artificial intelligence, computer.software_genre, business, computer, Language and Linguistics, Selection (genetic algorithm), Linguistics, Natural language processing
Abstract: Traditional approaches to verbal periphrasis (compound tenses) treat auxiliary verbs as lexical items that enter syntactic derivation like any other lexical item, via Selection/Merge. An alternative view is that auxiliary verbs are inserted into a previously built structure (e.g., Bach 1967, Arregi 2000, Embick 2000, Cowper 2010, Bjorkman 2011, Arregi and Klecha 2015). Arguments for the insertion approach include auxiliaries’ last-resort distribution and the fact that, in many languages, auxiliaries are not systematically associated with a given inflectional category (Bjorkman’s (2011) “overflow” distribution). Here, I argue against the insertion approach. I demonstrate that the overflow pattern and last-resort distribution follow from Cyclic Selection (Pietraszko 2017)—a Merge counterpart of Cyclic Agree (Béjar and Rezac 2009). I also show that the insertion approach makes wrong predictions about compound tenses in Swahili, a language with overflow periphrasis. Under my approach, an auxiliary verb is a verbal head externally merged as a specifier of a functional head, such as T. It then undergoes m-merger with that head, instantiating an External-Merge version of Matushansky’s (2006) conception of head movement.
Published: 2023

20. Natural language processing and deep learning chatbot using long short term memory algorithm

Author: S. Balaji and E. Kasthuri
Subjects: 010302 applied physics, business.industry, Computer science, Deep learning, 02 engineering and technology, General Medicine, 021001 nanoscience & nanotechnology, computer.software_genre, 01 natural sciences, Chatbot, Long short term memory, Human interaction, 0103 physical sciences, Artificial intelligence, 0210 nano-technology, business, computer, Natural language processing
Abstract: Chatbots are becoming increasingly popular as virtual assistants, and many businesses are launching If-This-Then-That programs to help them get started. Such systems, on the other hand, often generate chatbots that are stagnant and difficult to handle. The market for chatbots is increasing. Computer systems are often required, and online connections are improved by allowing users to express their needs, desires, or questions naturally and clearly by speaking, tapping, and talking. They're easy to use, perfect for people of all ages, and have the most detailed responses to questions. The majority of today's education is distributed through e-learning. A chatbot is one of the most powerful ways for students to read, as it can answer questions at any time without the need for human interaction. This chatbot is highly capable of overcoming student uncertainty without the need for human interaction. The If-This-Then-That paradigm is used by the bulk of chatbots. Natural language processing and deep learning technologies were used to build this chatbot. As a consequence, the chatbot will comprehend questions at a higher level.
Published: 2023

21. Improved Technology Similarity Measurement in the Medical Field based on Subject-Action-Object Semantic Structure: A Case Study of Alzheimer's Disease

Author: Rongrong Li, Yuqin Liu, Shuo Zhang, and Xuefeng Wang
Subjects: Structure (mathematical logic), Vocabulary, Computer science, business.industry, Strategy and Management, media_common.quotation_subject, Unified Medical Language System, Object (computer science), computer.software_genre, Semantic network, Feature (linguistics), Similarity (psychology), Artificial intelligence, Electrical and Electronic Engineering, International Patent Classification, business, computer, Natural language processing, media_common
Abstract: This article presents an improved method of measuring technology similarity by introducing a subject-action-object (SAO) analysis that uses the feature weights of semantic structure and professional vocabulary to measure technology similarity in the medical field. First, the SAO semantic structures are extracted and cleaned; then the structures related to technology are identified using a semantic network of the unified medical language system (UMLS). Second, the similarity between the SAO semantic structures is evaluated using semantic information from the Metathesaurus of the UMLS. Third, the feature weights of the SAO semantic structure are introduced to represent the importance of the patentees’ technology features. Finally, using the SAO and weight information, each patentee's vector is constructed to measure the technology complementarity between different patentees. This study conducts empirical research on Alzheimer's disease. The results indicate that the proposed method for measuring technology similarity enables finer distinctions with more reliable outcomes than the traditional methods that are based on keywords and international patent classification.
Published: 2023

22. Joint Embedding of Deep Visual and Semantic Features for Medical Image Report Generation

Author: Jian Zhang, Yan Yang, Qingming Huang, Weidong Han, Jun Yu, and Hanliang Jiang
Subjects: Modality (human–computer interaction), Computer science, business.industry, Pipeline (computing), computer.software_genre, Computer Science Applications, Image (mathematics), Task (project management), Encoding (memory), Signal Processing, Media Technology, Benchmark (computing), Embedding, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Natural language processing, Natural language
Abstract: Medical image report generation (MeIRG) aims at generating associated diagnosis descriptions with natural language sentences from medical images, which is essential in the computer-aided diagnosis system. Nevertheless, this task remains challenging in that medical images and linguistic expressions should be understood jointly which however show great discrepancies in the modality. To fill this visual-to-semantic gap, we propose a novel framework that follows the encoder-decoder pipeline. Our framework is characterized by encoding both deep visual and semantic embeddings through a triple-branch network (TriNet) during the encoding phase. The visual attention branch captures attended visual embeddings from medical images with the soft-attention mechanism. The medical report (MeRP) embedding branch predicts semantic report embeddings. The embedding branch of medical subject headings (MeSH) obtains semantic embeddings of related medical tags as complementary information. Then, outputs of these branches are fused and fed into a decoder for the report generation. Experimental results on two benchmark datasets have demonstrated the excellent performance of our method.
Published: 2023

23. Hierarchical Multiscale Recurrent Neural Networks for Detecting Suicide Notes

Author: Nina Dethlefs, Alexander P. Turner, Geeth de Mel, and Annika Marie Schoene
Subjects: business.industry, Computer science, Word count, computer.software_genre, Suicide prevention, Task (project management), Human-Computer Interaction, Linguistic analysis, Recurrent neural network, Personal pronoun, Social media, Artificial intelligence, Baseline (configuration management), business, computer, Software, Natural language processing
Abstract: Recent statistics in suicide prevention show that people are increasingly posting their last words online, and with the unprecedented availability of textual data from social media platforms, researchers have the opportunity to analyse such data. Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to communicate. In this paper, we investigate whether it is possible to automatically identify suicide notes from other types of social media blogs in two document-level classification tasks. The first task aims to identify suicide notes from depressed and blog posts in a balanced dataset, whilst the second experiment looks at how well suicide notes can be classified when there is a vast amount of neutral text data, which makes the task more applicable to real-world scenarios. Furthermore, we perform a linguistic analysis using LIWC (Linguistic Inquiry and Word Count). We present a learning model for modelling long sequences in two experiment series. We achieve an f1-score of 88.26% over the baselines of 0.60 in experiment 1 and 96.1% over the baseline in experiment 2. Finally, we show through visualisations which features the learning model identifies; these include emotions such as love and personal pronouns.
Published: 2023

24. Emotional Attention Detection and Correlation Exploration for Image Emotion Distribution Learning

Author: Zhiwei Xu and Shangfei Wang
Subjects: Distribution (number theory), business.industry, Computer science, Perspective (graphical), Relationship learning, computer.software_genre, Image (mathematics), Arousal, Human-Computer Interaction, Correlation, Benchmark (computing), Graph (abstract data type), Artificial intelligence, business, GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries), computer, Software, Natural language processing
Abstract: Current works on image emotion distribution learning typically extract visual representations from the holistic image or explore emotion-related regions in the image from a global-wise perspective. However, different regions of an image contribute differently to the arousal of each emotion. Existing works do not deeply explore corresponding emotion-aware regions of each emotion in the image, nor do they fully capture the relationship between each emotion-aware region and the emotion labels. In this paper, we propose a novel attention based emotion distribution learning method, which can explore the emotion-related regions of images from the perspective of each emotion category, and can conduct region relationship learning. Specifically, we introduce a semantic guided attention detection network to generate class-wise attention maps for each emotion and a global-wise attention map for the holistic image. Meanwhile, an emotional graph-based network is adopted to capture the correlation between each region and the emotion distribution. Experiments on several benchmark datasets demonstrate the superiority of the proposed method compared to related works.
Published: 2023

25. An Efficient Approach of Product Recommendation System using NLP Technique

Author: Akhilesh Kumar Sharma, Rachit Adhvaryu, Suthar Dhruvi Pankajkumar, Prajapati Parthkumar Gordhanbhai, Amit Kumar, and Bhavna Bajpai
Subjects: business.industry, Computer science, Feature vector, Visitor pattern, General Medicine, Recommender system, Clothing, computer.software_genre, Artificial intelligence, Product (category theory), Architecture, business, computer, Natural language processing
Abstract: As we are moving toward an age of digital globalization and online shopping, there is an increasing need for an efficient and reliable system that can help consumers and visitors find suitable products. Currently, various websites display the searched product when a visitor comes to their website. What we need is a system that can recommend products similar to the searched product. This will help the consumer find another product in case the item is unavailable, the searched product is not good enough, or they would like to look through different similar products. A good recommendation system has also been found to be financially beneficial for companies. It has been found that a consumer is 35% more likely to buy a product if the recommendation is good enough. The proposed approach to the product recommendation problem makes use of the Amazon Apparel database, which contains data on 180,000 apparel items. We use NLP technologies and a CNN to help predict similar products. The title of the product is used as a major attribute during NLP analysis and recommendation. Finally, a CNN is used to create a feature vector from product images, and this vector is combined with all the other vectors for prediction. We compare the distance between the vectors of all products and recommend the products with the least distance. The VGG-16 architecture is used to extract the features from the images.
Published: 2023
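
The final step described in the abstract of item 25, comparing the distance between product vectors and recommending the products with the least distance, can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses bag-of-words counts over product titles and Euclidean distance, whereas the paper also combines CNN image features extracted with VGG-16. The toy `catalog`, the product IDs, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy catalog: product ID -> product title.
catalog = {
    "A1": "blue cotton shirt men",
    "A2": "blue cotton shirt women",
    "A3": "black leather boots",
    "A4": "red cotton dress",
}


def title_vector(title, vocabulary):
    """Bag-of-words count vector for a title (an assumed representation)."""
    counts = Counter(title.lower().split())
    return [counts[word] for word in vocabulary]


def recommend(query_id, catalog, k=2):
    """Return the k product IDs whose title vectors are closest to the query."""
    vocabulary = sorted({w for t in catalog.values() for w in t.lower().split()})
    vectors = {pid: title_vector(t, vocabulary) for pid, t in catalog.items()}
    query = vectors[query_id]
    distances = sorted(
        (math.dist(query, vector), pid)  # Euclidean distance, Python 3.8+
        for pid, vector in vectors.items()
        if pid != query_id
    )
    return [pid for _, pid in distances[:k]]


print(recommend("A1", catalog))
```

With this toy catalog the call returns `['A2', 'A4']`: the women's shirt differs from the query in two title words, the cotton dress in five, and the boots in seven.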

26. Multi-Level Query Interaction for Temporal Language Grounding

Author: Haoyu Tang, Lin Wang, Jihua Zhu, Qinghai Zheng, and Tianwei Zhang
Subjects: Computational complexity theory, Computer science, Interface (Java), business.industry, Mechanical Engineering, computer.software_genre, Semantics, Computer Science Applications, Task (computing), Automotive Engineering, Benchmark (computing), Graph (abstract data type), Artificial intelligence, business, computer, Word (computer architecture), Natural language processing, Sentence
Abstract: Understanding what is happening in surveillance video is important for the human-machine interface in transportation systems, where temporal language grounding is one of the key tasks. It aims to localize the desired moment in an untrimmed video given a sentence query that is relevant to the moment. This task is challenging for the following reasons: 1) the requirement of understanding the video contents and query semantics comprehensively, and 2) building the bridge between the cross-modal semantics. To tackle these problems, early methods first sample video clips and then match them with the sentence to find the most relevant one. To reduce the computational complexity associated with video clip sampling, recent methods directly predict the temporal boundaries of the desired moment on the fused features of the sentence and the video frames. However, the previous methods often learn the word-level or phrase-level features of the sentence, or directly generate the global sentence representation by attention mechanisms or a graph network. We argue that applying only word-level or phrase-level semantic information and cross-modal interactions is not enough to fully capture the correspondence between the video and the query. To this end, we propose a novel Multi-level Query Exploration and Interaction (MQEI) model, which explores the semantics at both the word and phrase levels and captures the multi-level interactions between the video and the query through an attention module. Extensive experiments on two public benchmark datasets, ActivityNet Captions and Charades-STA, demonstrate that the proposed model can outperform all the state-of-the-art methods consistently.
Published: 2022

27. Structured Multimodal Attentions for TextVQA

Author: Qi Wu, Chenyu Gao, Yuliang Liu, Peng Wang, Anton van den Hengel, Hui Li, and Qi Zhu
Subjects: FOS: Computer and information sciences, Vocabulary, Computer science, Computer Vision and Pattern Recognition (cs.CV), media_common.quotation_subject, Computer Science - Computer Vision and Pattern Recognition, computer.software_genre, Discriminative model, Artificial Intelligence, Question answering, Humans, media_common, Modality (human–computer interaction), business.industry, Applied Mathematics, Optical character recognition, Computational Theory and Mathematics, Graph (abstract data type), Neural Networks, Computer, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Algorithms, Software, Natural language, Generative grammar, Natural language processing
Abstract: In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above. SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it. Finally, the outputs from the above modules are processed by a global-local attentional answering module to produce an answer splicing together tokens from both OCR and general vocabulary iteratively by following M4C. Our proposed model outperforms the SoTA models on TextVQA dataset and two tasks of ST-VQA dataset among all models except pre-training based TAP. Demonstrating strong reasoning ability, it also won first place in TextVQA Challenge 2020. We extensively test different OCR methods on several reasoning models and investigate the impact of gradually increased OCR performance on TextVQA benchmark. With better OCR results, different models share dramatic improvement over the VQA accuracy, but our model benefits most blessed by strong textual-visual reasoning ability. To grant our method an upper bound and make a fair testing base available for further works, we also provide human-annotated ground-truth OCR annotations for the TextVQA dataset, which were not given in the original release. The code and ground-truth OCR annotations for the TextVQA dataset are available at https://github.com/ChenyuGAO-CS/SMA. Comment: Winner of TextVQA Challenge 2020; accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence
Published: 2022

28. Content and Style Aware Generation of Text-Line Images for Handwriting Recognition

Author: Lei Kang, Mauricio Villegas, Alicia Fornés, Marçal Rusiñol, and Pau Riba
Subjects: FOS: Computer and information sciences, Handwriting, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, computer.software_genre, Synthetic data, Pattern Recognition, Automated, Style (sociolinguistics), Artificial Intelligence, business.industry, Applied Mathematics, Volume (computing), Visual appearance, ComputingMethodologies_PATTERNRECOGNITION, Computational Theory and Mathematics, Handwriting recognition, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Computer Vision and Pattern Recognition, State (computer science), Artificial intelligence, Line (text file), business, computer, Algorithms, Software, Natural language processing
Abstract: Handwritten Text Recognition has achieved an impressive performance in public benchmarks. However, due to the high inter- and intra-class variability between handwriting styles, such recognizers need to be trained using huge volumes of manually labeled training data. To alleviate this labor-consuming problem, synthetic data produced with TrueType fonts has been often used in the training loop to gain volume and augment the handwriting style variability. However, there is a significant style bias between synthetic and real data which hinders the improvement of recognition performance. To deal with such limitations, we propose a generative method for handwritten text-line images, which is conditioned on both visual appearance and textual content. Our method is able to produce long text-line samples with diverse handwriting styles. Once properly trained, our method can also be adapted to new target data by only accessing unlabeled text-line images to mimic handwritten styles and produce images with any textual content. Extensive experiments have been done on making use of the generated samples to boost Handwritten Text Recognition performance. Both qualitative and quantitative results demonstrate that the proposed approach outperforms the current state of the art. Comment: Accepted to TPAMI
Published: 2022

29. Explaining Semi-Supervised Text Alignment Through Visualization

Author: Stefan Jänicke, David Joseph Wrisley, and Christofer Meinecke
Subjects: Visual analytics, Computer science, business.industry, media_common.quotation_subject, Semantics, computer.software_genre, Computer Graphics and Computer-Aided Design, Syntax, Visualization, Data visualization, Reading (process), Signal Processing, Task analysis, Domain knowledge, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Software, Natural language processing, media_common
Abstract: The analysis of variance in complex text traditions is an arduous task when carried out manually. Text alignment algorithms provide domain experts with a robust alternative to such repetitive tasks. Existing white-box approaches allow the digital humanities to establish syntax-based metrics taking into account the spelling, morphology and order of words. However, they produce limited results, as semantic meanings are typically not taken into account. Our interdisciplinary collaboration between visualization and digital humanities combined a semi-supervised text alignment approach based on word embeddings that take not only syntactic but also semantic text features into account, thereby improving the overall quality of the alignment. In our collaboration, we developed different visual interfaces that communicate the word distribution in high-dimensional vector space generated by the underlying neural network for increased transparency, assessment of the tool's reliability and overall improved hypothesis generation. We further offer visual means to enable the expert reader to feed domain knowledge into the system at multiple levels with the aim of improving both the product and the process of text alignment. This ultimately illustrates how visualization can engage with and augment complex modes of reading in the humanities.
Published: 2022

30. Exemplar-model account of categorization and recognition when training instances never repeat

Author: Robert M. Nosofsky and Mingjia Hu
Subjects: Linguistics and Language, Dissociation (neuropsychology), business.industry, Contrast (statistics), Experimental and Cognitive Psychology, PsycINFO, Exemplar theory, computer.software_genre, Language and Linguistics, Small set, Categorization, Null (SQL), Concept learning, Artificial intelligence, Psychology, business, computer, Natural language processing
Abstract: In a novel version of the classic dot-pattern prototype-distortion paradigm of category learning, Homa et al. (2019) tested a condition in which individual training instances never repeated, and observed results that they claimed severely challenged exemplar models of classification and recognition. Among the results was a dissociation in which participants classified transfer items with high accuracy in the no-repeat condition, yet in old-new recognition tests showed no ability to discriminate between old and new items of the same level of distortion from the prototype. In addition, speed of classification learning was no faster in a condition in which a small set of training instances was repeated continuously compared with the no-repeat condition. Here we show through computer-simulation modeling that exemplar models naturally capture the classification-recognition dissociation in the no-repeat condition, as well as a wide variety of other qualitative effects reported by Homa et al. (2019). We also conduct new conceptual-replication experiments to investigate their reported null effect of repeated versus nonrepeated training instances on speed of classification learning. In contrast to Homa et al. (2019) we find that speed of learning is substantially faster in the repeat condition than in the no-repeat condition, precisely as exemplar models predict. The exemplar model also captures a wide variety of transfer effects observed following the completion of category learning, including the classification-recognition dissociation observed across the repeat and no-repeat conditions. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Published: 2022

31. Early prediction of writing quality using keystroke logging

Author: Luuk Van Waes, Rianne Conijn, Christine L. Cook, Menno van Zaanen, Human Technology Interaction, and EAISI Foundational
Subjects: Academic writing, Computer science, computer.software_genre, Keystroke logging, Early prediction, Education, Task (project management), Educational sciences, Computer. Automation, 060201 languages & linguistics, business.industry, 4. Education, 05 social sciences, Writing process, Educational technology, 050301 education, Contrast (statistics), Linguistics, Regression analysis, 06 humanities and the arts, Automatic summarization, Writing quality, Writing processes, Computational Theory and Mathematics, Literature, 0602 languages and literature, Artificial intelligence, business, 0503 education, computer, Natural language processing
Abstract: Feedback is important to improve writing quality; however, to provide timely and personalized feedback is a time-intensive task. Currently, most literature focuses on providing (human or machine) support on product characteristics, especially after a draft is submitted. However, this does not assist students who struggle during the writing process. Therefore, in this study, we investigate the use of keystroke analysis to predict writing quality throughout the writing process. Keystroke data were analyzed from 126 English as a second language learners performing a timed academic summarization task. Writing quality was measured using participants’ final grade. Based on previous literature, 54 keystroke features were extracted. Correlational analyses were conducted to identify the relationship between keystroke features and writing quality. Next, machine learning models (regression and classification) were used to predict final grade and classify students who might need support at several points during the writing process. The results show that, in contrast to previous work, the relationship between writing quality and keystroke data was rather limited. None of the regression models outperformed the baseline, and the classification models were only slightly better than the majority class baseline (highest AUC = 0.57). In addition, the relationship between keystroke features and writing quality changed throughout the course of the writing process. To conclude, the relationship between keystroke data and writing quality might be less clear than previously posited.
Published: 2022
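
The abstract of item 31 reports a highest AUC of 0.57, only slightly better than the majority class baseline. A minimal sketch of why that is a weak result: AUC is the probability that a randomly chosen positive example is scored above a randomly chosen negative one, so a classifier that gives every student the same score (as a majority-class predictor does) obtains exactly 0.5. The labels and scores below are invented for illustration and are not data from the study; the function name `auc` is an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = student needs support, 0 = does not.
labels = [0, 0, 1, 1, 0, 1]
model_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # invented classifier scores
constant_scores = [0.0] * len(labels)           # same score for everyone

print(auc(labels, model_scores))
print(auc(labels, constant_scores))
```

The constant scorer yields 0.5 regardless of the class balance, which is the reference point against which a value such as 0.57 should be read.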

32. Active Learning Based 3D Semantic Labeling From Images and Videos

Author: Hainan Cui, Hongmin Liu, Hanqing Jiang, Zhanyi Hu, Shuhan Shen, and Mengqi Rong
Subjects: business.industry, Semantic labeling, Active learning (machine learning), Computer science, Media Technology, Artificial intelligence, Electrical and Electronic Engineering, business, computer.software_genre, computer, Natural language processing
Published: 2022

33. DeepSIM: Deep Semantic Information-Based Automatic Mandelbug Classification

Author: Guanping Xiao, Zheng Zheng, Kishor S. Trivedi, Xiaoting Du, and Zenghui Zhou
Subjects: Word embedding, business.industry, Computer science, Deep learning, computer.software_genre, Semantic data model, Convolutional neural network, Software, Heisenbug, Classifier (linguistics), Artificial intelligence, Electrical and Electronic Engineering, Safety, Risk, Reliability and Quality, business, computer, Word (computer architecture), Natural language processing
Abstract: Understanding and predicting types of bugs are of practical importance for developers to improve the testing efficiency and take appropriate steps to address bugs in software releases. However, due to the complex conditions under which faults manifest and the complexity of the classification rules, the automatic classification of Mandelbugs is a difficult task. In this article, we present a deep semantic information-based Mandelbug classification method that combines a semantic model with a deep learning classifier and makes use of both labeled and unlabeled bug reports. By training the bug report semantic model on millions of bug reports, each word in the text of a bug report is represented as a word embedding that preserves the semantic relationship among the words. Then, a convolutional neural network model is designed to capture the high-level features of bug reports to obtain a more accurate classification. Moreover, the effects of the semantic model size and domain on the classification results are investigated, and the quality of word embeddings is evaluated by analyzing several important parameters.
Published: 2022

34. A Task-oriented Chatbot Based on LSTM and Reinforcement Learning

Author: Tai-Liang Chou and Yu-Ling Hsueh
Subjects: Sequence, Discriminator, General Computer Science, business.industry, Computer science, Deep learning, computer.software_genre, Chatbot, Reinforcement learning, Artificial intelligence, Raw data, business, computer, Natural language processing, Sentence, Generator (mathematics)
Abstract: Thanks to the advancements in deep learning, chatbots are widely used in messaging applications. Undoubtedly, a chatbot is a new way of interaction between humans and machines. However, most of the chatbots act as a simple question answering system that responds with formulated answers. Traditional conversational chatbots usually adopt a retrieval-based model that requires a large amount of conversational data for retrieving various intents. Hence, training a chatbot model that uses low-resource conversational data to generate more diverse dialogues is desirable. We propose a method to build a task-oriented chatbot using a sentence generation model that generates sequences based on the generative adversarial network. The architecture of our model contains a generator that generates a diverse sentence and a discriminator that judges the sentences by comparing the generated and the ground-truth sentences. In the generator, we combine the attention model with the sequence-to-sequence model using hierarchical long short-term memory to extract sentence information. For the discriminator, our reward mechanism assigns low rewards for repeated sentences and high rewards for diverse sentences. Extensive experiments are presented to demonstrate the utility of our model that generates more diverse and information-rich sentences than those of the existing approaches.
Published: 2022

35. A meta-analysis of the line bisection task in children

Author: Marietta Papadatou-Pastou, Gemma Learmonth, and Kaul D
Subjects: endocrine system, Landmark, business.industry, Computer science, General Medicine, computer.software_genre, Task (project management), Text mining, Arts and Humanities (miscellaneous), Meta-analysis, Artificial intelligence, business, computer, Natural language processing, General Psychology
Abstract: Meta-analyses have shown subtle, group-level asymmetries of spatial attention in adults favouring the left hemispace (pseudoneglect). However, no meta-analysis has synthesized data on children. We performed a random-effects meta-analysis of spatial biases in children aged ≤16 years. Databases (PsycINFO, Web of Science, Scopus) and pre-print servers (bioRxiv, medRxiv, PsyArXiv) were searched for studies involving typically developing children with a mean age of ≤16, who were tested using line bisection. Thirty-three datasets, from 31 studies, involving 2101 children, were included. No bias was identified overall, but there was a small leftward bias in a subgroup where all children were aged ≤16. Moderator analysis found symmetrical neglect, with right-handed actions resulting in right-biased bisections, and left-handed actions in left-biased bisections. Bisections were more leftward in studies with a higher percentage of boys relative to girls. Mean age, hand preference, and control group status did not moderate biases, and there was no difference between children aged ≤7 and ≥7 years, although the number of studies in each moderator analysis was small. There was no evidence of small study bias. We conclude that pseudoneglect may be present in children but is dependent on individual characteristics (sex) and/or task demands (hand used). Registration: Open Science Framework (https://osf.io/n68fz/).
Published: 2022

36. Hierarchical self attention based sequential labelling model for Bhojpuri, Maithili and Magahi languages

Author: Anil Kumar Singh, Rajesh Kumar Mundotiya, and Swasti Mishra
Subjects: General Computer Science, Machine translation, Structured support vector machine, Computer science, Bhojpuri, business.industry, computer.software_genre, language.human_language, Maithili, Information extraction, Language technology, language, Artificial intelligence, Language family, business, computer, Natural language processing, Chunking (computing)
Abstract: Sequential labelling plays a vital role in numerous Natural Language Processing (NLP) applications such as Machine Translation and Information Extraction. One such task is Part-of-Speech (POS) tagging, which assigns a sequence of grammatical categories to a given sentence; another is Chunking, which groups the words into ‘chunks’, or what can be called minimal phrases. Bhojpuri, Maithili and Magahi are low-resource languages of the Indo-Aryan language family, widely spoken in central north-eastern India. The creation of an annotated corpus for POS tagging and Chunking, followed by the building of an initial automatic tool for these problems, is the first attempt towards building language technology tools for these languages. The annotated corpus was used to develop POS Taggers and Chunkers based on various machine learning algorithms (TnT, CRF, MEMM and Structured SVM) and the state-of-the-art LSTM-CNN-CRF model, and their results were then compared with those of two newly proposed deep learning-based models, Self-Attention Hierarchical Bi-LSTM CRF (SAHBiLC) and a fine-tuned version of it, Fine-SAHBiLC. The SAHBiLC and Fine-SAHBiLC models outperform the other models on Bhojpuri (Accuracy for POS and Chunking is 0.86% and 0.94%, respectively), Maithili (Accuracy for POS and Chunking is 0.86% and 0.95%, respectively) and Magahi (Accuracy for POS is 0.86%).
Published: 2022

37. Moroccan Arabic vocabulary generation using a rule-based approach

Author: Ridouane Tachicart and Karim Bouzoubaa
Subjects: Vocabulary, General Computer Science, Machine translation, Computer science, business.industry, media_common.quotation_subject, Concatenation, Spell, 020206 networking & telecommunications, Rule-based system, 02 engineering and technology, Lexicon, computer.software_genre, Set (abstract data type), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, State (computer science), Artificial intelligence, business, computer, Natural language processing, media_common
Abstract: NLP resources play a crucial role in the building of many NLP applications. The importance of these resources depends not only on their size and coverage but also on the richness and the precision of the annotated information they provide. In the case of resource-scarce languages such as Moroccan Arabic, the building of NLP applications is limited due to the lack of these resources. To overcome this problem, we follow a rule-based approach to generate a Moroccan morphological vocabulary (MORV) which constitutes the first step addressing the problem of Moroccan morphological generation. MORV is designed and implemented based on two main components: On one hand, an MA lexicon and a list of fully annotated affixes and clitics that we have created specifically to ensure the generation process. On the other hand, a set of rules covering the concatenation and the orthographic adjustments of the generated words. Moreover, given a base form, MORV outputs more than 4.5 M Moroccan words with rich morphological features such as tense, gender, number, state, etc. We tested the coverage of MORV on texts collected from Moroccan social media and realized that it reaches a vocabulary coverage of 84% and a precision of 94%. This system is a benefit for building other NLP applications such as spell checking, morphological analysis, and machine translation.
Published: 2022

38. Open Named Entity Modeling From Embedding Distribution

Author: Wang Tao, Ying Luo, Luo Si, Li Linlin, and Hai Zhao
Subjects: Structure (mathematical logic), Word embedding, business.industry, Computer science, Space (commercial competition), computer.software_genre, Computer Science Applications, Named entity, Computational Theory and Mathematics, Named-entity recognition, Embedding, Artificial intelligence, business, computer, Natural language processing, Word (computer architecture), Information Systems
Abstract: In this paper, we report our discovery on named entity distribution in general word embedding space, which supports an open definition of multilingual named entities, in contrast to the previous closed and constrained definition through a named entity dictionary, which is usually derived from human labor and relies on scheduled updates. Our initial visualization of monolingual word embeddings indicates that named entities tend to gather together regardless of named entity type and language difference, which enables us to model all named entities using a specific geometric structure inside embedding space, namely, the named entity hypersphere. For the monolingual case, the proposed named entity model gives an open description of diverse named entity types and different languages. For the cross-lingual case, mapping the proposed named entity model provides a novel way to build named entity datasets for resource-poor languages. Finally, the proposed named entity model may serve as a very useful clue to significantly enhance state-of-the-art named entity recognition systems generally.
Published: 2022

39. Contextual semantic embeddings based on fine-tuned AraBERT model for Arabic text multi-class categorization

Author: Fatima-Zahra El-Alami, Noureddine En Nahnahi, and Said Ouatik El Alaoui
Subjects: Word embedding, General Computer Science, business.industry, Computer science, 020206 networking & telecommunications, 02 engineering and technology, computer.software_genre, Class (biology), Support vector machine, Categorization, 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), 020201 artificial intelligence & image processing, Artificial intelligence, business, Transfer of learning, computer, Feature learning, Natural language processing, Sentence
Abstract: Although pre-trained word embedding models have advanced a wide range of natural language processing applications, they ignore the contextual information and meaning within the text. In this paper, we investigate the potential of the pre-trained Arabic BERT (Bidirectional Encoder Representations from Transformers) model to learn universal contextualized sentence representations, aiming to showcase its usefulness for Arabic text multi-class categorization. We propose to exploit the pre-trained AraBERT for contextual text representation learning in two different ways: as a transfer learning model and as a feature extractor. On the one hand, we employ the Arabic BERT (AraBERT) model after fine-tuning its parameters on the OSAC datasets to transfer its knowledge to Arabic text categorization. On the other hand, we examine AraBERT's performance as a feature extractor model by combining it with several classifiers, including CNN, LSTM, Bi-LSTM, MLP, and SVM. Finally, we conduct an exhaustive set of experiments comparing two BERT models, namely AraBERT and multilingual BERT. The findings show that the fine-tuned AraBERT model achieves state-of-the-art performance and attains up to 99% in terms of F1-score and accuracy.
Published: 2022

40. Offline Arabic handwritten word recognition: A transfer learning approach

Author: Mohamed Awni, Hazem M. Abbas, and Mahmoud I. Khalil
Subjects: General Computer Science, Artificial neural network, Computer science, business.industry, Deep learning, computer.software_genre, Lexicon, Convolution, Task (project management), ComputingMethodologies_PATTERNRECOGNITION, Word recognition, Artificial intelligence, business, Transfer of learning, computer, Natural language processing, Word (computer architecture)
Abstract: Offline Arabic handwritten word recognition is still a challenging task. Many deep learning approaches perform admirably on this task if the lexicon size is not too large and the number of training samples is sufficient for the training process. The transfer learning technique is commonly used to compensate for the lack of training samples, but there is a wide controversy about the effectiveness of applying it to cross-domain tasks. In this paper, we examine the performance of three deep convolution neural networks that have been randomly initialized for recognizing Arabic handwritten words. Then, we evaluate the performance of the ResNet18 model that has been pre-trained on the ImageNet dataset for the same task. Finally, we propose an approach based on sequentially transferring the mid-level word image representations through two consecutive phases using the ResNet18 model. We carried out four different sets of experiments using two popular offline Arabic handwritten word datasets: the AlexU-W and the IFN/ENIT (v2.0p1e) to figure out the most effective way of applying transfer learning. Our results demonstrate that using the ImageNet as a source dataset improves the recognition accuracy of the ten frequently misclassified words in the IFN/ENIT dataset by 14%, while our proposed approach gives a rise of 35.45%. In the whole dataset, we achieved recognition accuracy up to 96.11%, which is nearly a 2.5% enhancement compared with other state-of-the-art methods.
Published: 2022

41. Towards the automatic generation of Arabic Lexical Recognition Tests using orthographic and phonological similarity maps

Author: Saeed Salah, Raid Zaghal, Mohammad Nassar, and Osama Hamed
Subjects: Vocabulary, General Computer Science, business.industry, Computer science, media_common.quotation_subject, Confusion matrix, 020206 networking & telecommunications, Phonology, 02 engineering and technology, Pronunciation, computer.software_genre, Word lists by frequency, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Language proficiency, Artificial intelligence, Precision and recall, business, computer, Orthography, Natural language processing, media_common
Abstract: Lexical Recognition Test (LRT) themes are one of the main methods that are widely used to measure language proficiency in some common languages such as English, German and Spanish. However, similar research for Arabic is still at the development stage, and existing proposals mainly use human-crafted methods. In this paper, a new methodology, based on a newly developed algorithm, is proposed with the aim of automatically constructing high-quality nonwords for a quick measurement of Arabic proficiency levels (Arabic LRT). The suggested algorithm automatically generates nonwords based on special characteristics of Arabic, namely orthography (spelling), phonology (pronunciation), n-grams, and the word frequency map, which is an important factor in creating a multi-level test. The proposed algorithm was tested with the help of a large dataset of Arabic vocabulary. For this purpose, a Web-based application following the suggested methodology was designed and implemented to facilitate the process of collecting and analyzing learners’ responses. The experimental results show that the LRT questions automatically generated by the proposed system confused the learners. This is clear from the confusion matrix, which showed that one third (1/3) of the generated nonwords were able to distract the learners (with accuracy 65%). Consequently, recall and precision have smaller values, 0.52 and 0.48, respectively.
Published: 2022

42. Progress in Machine Translation

Author: Liang Huang, Zhongjun He, Hua Wu, Kenneth Church, and Haifeng Wang
Subjects: Environmental Engineering, General Computer Science, Machine translation, Computer science, business.industry, Materials Science (miscellaneous), General Chemical Engineering, media_common.quotation_subject, General Engineering, Energy Engineering and Power Technology, Translation (geometry), computer.software_genre, Field (computer science), Quality (business), Artificial intelligence, business, computer, Natural language processing, Transformer (machine learning model), media_common
Abstract: After more than 70 years of evolution, great achievements have been made in machine translation. Especially in recent years, translation quality has been greatly improved with the emergence of neural machine translation (NMT). In this article, we first review the history of machine translation from rule-based machine translation to example-based machine translation and statistical machine translation. We then introduce NMT in more detail, including the basic framework and the current dominant framework, Transformer, as well as multilingual translation models to deal with the data sparseness problem. In addition, we introduce cutting-edge simultaneous translation methods that achieve a balance between translation quality and latency. We then describe various products and applications of machine translation. At the end of this article, we briefly discuss challenges and future research directions in this field.
Published: 2022

43. Attention-based position-aware framework for aspect-based opinion mining using bidirectional long short-term memory

Author: Chetana Prakash and Azizkhan F Pathan
Subjects: General Computer Science, business.industry, Computer science, Deep learning, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Sentiment analysis, Context (language use), computer.software_genre, Lexicon, SemEval, Variety (cybernetics), Artificial intelligence, business, computer, Natural language processing, Word (computer architecture), Sentence
Abstract: Aspect-based Opinion Mining is a form of fine-grained Sentiment Analysis that models the semasiological relationship between aspect terms and context words in a sentence. The presence of a variety of context words has a significant impact on a sentence's sentiment polarity. As a result, while designing a model, it is necessary to consider the interaction of aspects and context words. Although existing approaches have taken into account an aspect’s position in a sentence, much of the research has not explored the use of Sentiment Lexicons with Deep Learning algorithms. In this paper, we propose a framework for an Attention-based position-aware Bidirectional Long Short-Term Memory network for Aspect-based Opinion Mining that incorporates a Sentiment Intensity Lexicon. The aspect word’s pre-trained vector is adjusted to be closer to semantically and sentimentally similar nearest neighbors and further away from sentimentally dissimilar neighbors. The proposed framework calculates aspect weights by concatenating the external knowledge in the form of lexicon sentiment intensity scores with word embeddings and position information. The framework is evaluated on the SemEval 2014 dataset. The results of the experiments illustrate that injecting external knowledge into the Bidirectional Long Short-Term Memory network can improve classification accuracy significantly.
Published: 2022

44. Adapting to the pandemic crisis through the lens of language: On the English semantic neologism lockdown and its adoption into Czech

Author: Zora Obstová, Aleš Klégr, and Ondřej Tichý
Subjects: Czech, Computer science, business.industry, computer.software_genre, language.human_language, language, General Materials Science, Lexico, Artificial intelligence, business, computer, Interim report, Digitization, Natural language processing, computer.programming_language
Published: 2022

45. Efficient Relational Sentence Ordering Network

Author: Yingming Li, Baiyun Cui, and Zhongfei Mark Zhang
Subjects: Computer science, business.industry, Applied Mathematics, computer.software_genre, Automatic summarization, Task (computing), Computational Theory and Mathematics, Code refactoring, Artificial Intelligence, Pointer (computer programming), Computer Vision and Pattern Recognition, Language model, Artificial intelligence, Paragraph, business, computer, Encoder, Software, Natural language processing, Sentence
Abstract: In this paper, we propose a novel deep Efficient Relational Sentence Ordering Network (referred to as ERSON) by leveraging pre-trained language model in both encoder and decoder architectures to strengthen the coherence modeling of the entire model. Specifically, we first introduce a divide-and-fuse BERT (referred to as DF-BERT), a new refactor of BERT network, where lower layers in the improved model encode each sentence in the paragraph independently, which are shared by different sentence pairs, and the higher layers learn the cross-attention between sentence pairs jointly. It enables us to capture the semantic concepts and contextual information between the sentences of the paragraph, while significantly reducing the runtime and memory consumption without sacrificing the model performance. Besides, a Relational Pointer Decoder (referred to as RPD) is developed, which utilizes the pre-trained Next Sentence Prediction (NSP) task of BERT to capture the useful relative ordering information between sentences to enhance the order predictions. In addition, a variety of knowledge distillation based losses are added as auxiliary supervision to further improve the ordering performance. The extensive evaluations on Sentence Ordering, Order Discrimination, and Multi-Document Summarization tasks show the superiority of ERSON to the state-of-the-art ordering methods.
Published: 2022

46. A survey on deep learning for textual emotion analysis in social networks

Author: Zhouhao Ouyang, Lihong Cao, Aimin Yang, Xinguang Li, Weijia Jia, Sancheng Peng, Yongmei Zhou, and Shui Yu
Subjects: 0805 Distributed Computing, 1005 Communications Technologies, 1203 Design Practice and Management, Computer Networks and Communications, business.industry, Computer science, Deep learning, Emotion classification, computer.software_genre, Categorization, Hardware and Architecture, Artificial intelligence, business, Feature learning, computer, Natural language processing, Sentence
Abstract: Textual Emotion Analysis (TEA) aims to extract and analyze user emotional states in texts. There has been rapid development of various Deep Learning (DL) methods that have proven successful in many domains such as audio, image, and natural language processing. This trend has drawn increasing numbers of researchers away from traditional machine learning to DL for their scientific research. In this paper, we provide an overview on TEA based on DL methods. After introducing a background for emotion analysis that includes defining emotion, emotion classification methods, and application domains of emotion analysis, we summarize DL technology, and the word/sentence representation learning method. We then categorize existing TEA methods based on text structures and linguistic types: text-oriented monolingual methods, text conversations-oriented monolingual methods, text-oriented cross-linguistic methods, and emoji-oriented cross-linguistic methods. We close by discussing emotion analysis challenges and future research trends. We hope that our survey will assist interested readers in understanding the relationship between TEA and DL methods while also improving TEA development.
Published: 2022

47. Arabic light-based stemmer using new rules

Author: Hamood Alshalabi, Fatima N. Al-Aswadi, Nazlia Omar, Kamal Ali Alezabi, and Sabrina Tiun
Subjects: Structure (mathematical logic), General Computer Science, Arabic, business.industry, Computer science, computer.software_genre, language.human_language, Prefix, Infix, Morpheme, language, Artificial intelligence, business, computer, Word length, Natural language processing
Abstract: Superior stemming algorithms aid significantly in many natural language processing (NLP) applications such as information retrieval. The Arabic light-based stemmer is one of the most important stemming algorithms. However, partially due to the highly inflected and complex morphological structure of the Arabic language, most of the existing Arabic light-based stemmer algorithms eliminate a small number of suffixes, prefixes, or both in the process of recognising the infix patterns to determine roots. The elimination of suffixes and prefixes leads to many inefficient results. Hence, this study aims to develop an improved light-based algorithm for the Arabic stemmer by proposing an appropriate list of suffixes and prefixes, which is supported by rules according to word length (without using a morpheme or patterns on a stem). Our improved Dlight Arabic stemmer focuses on determining and removing the infix patterns under many word-length rules and according to a specific order of stemming stages, to extract the double, triple and quadruple roots from long and short Arabic words. To evaluate our proposed light-based Arabic stemmer, we compared it against existing Arabic stemmers, namely Light10, Condlight and ARLST. The experimental results showed that the proposed Develop Arabic Light-Based Stemmer (Dlight) obtained the best performance, with an F-measure of 68%, while the other three Arabic stemmers yielded slightly lower F-measures. Finally, establishing an appropriate list of suffixes and prefixes with word length rules to stem Arabic words can improve the performance of a light-based Arabic stemmer.
Published: 2022

48. Preference and Constraint Factor Model for Event Recommendation

Author: Yulu Du, Yujie Zhang, Yi'an Lai, and Xiangwu Meng
Subjects: Computational Theory and Mathematics, business.industry, Computer science, Event (relativity), Artificial intelligence, computer.software_genre, business, computer, Natural language processing, Preference, Computer Science Applications, Information Systems, Constraint factor
Published: 2022

49. Integration of morphological features and contextual weightage using monotonic chunk attention for part of speech tagging

Author: Rupjyoti Baruah, Rajesh Kumar Mundotiya, Anil Kumar Singh, and Arpit Mehta
Subjects: Dependency (UML), General Computer Science, business.industry, Computer science, Deep learning, Treebank, computer.software_genre, language.human_language, Sequence labeling, Telugu, language, Artificial intelligence, Marathi, business, computer, Encoder, Natural language processing, Word (computer architecture)
Abstract: Part-of-Speech (POS) tagging is a fundamental sequence labeling problem in Natural Language Processing. Recent deep learning sequential models combine forward and backward word information for POS tagging. The information from the words in the context of the current word plays a vital role in capturing non-continuous relationships. We propose Monotonic chunk-wise attention with CNN-GRU-Softmax (MCCGS), a deep learning architecture that makes use of this essential information. Its core components are an Input Encoder (IE), which encodes word- and character-level information; a Contextual Encoder (CE), which assigns weightage to adjacent words; and a Disambiguator (D), which resolves intra-label dependencies. Moreover, different morphological features have been integrated into the core components of the MCCGS architecture as MCCGS-IE, MCCGS-CE and MCCGS-D. The MCCGS architecture is validated on 21 languages from the Universal Dependency (UD) treebank. The state-of-the-art models (Type constraints, Retrofitting, Distant Supervision from Disparate Sources, and Position-aware Self Attention), MCCGS, and its variants MCCGS-IE, MCCGS-CE and MCCGS-D obtain mean accuracies of 83.65%, 81.29%, 84.10%, 90.18%, 90.40%, 91.40%, 90.90%, and 92.30%, respectively. The proposed model architecture provides state-of-the-art accuracy on low-resource languages such as Marathi (93.58%), Tamil (87.50%), Telugu (96.69%) and Sanskrit (97.28%) from the UD treebank, and Hindi (95.64%) and Urdu (87.47%) from the Hindi-Urdu multi-representational treebank.
Published: 2022

50. Big data, actually: Examining systematic messaging in 188 romantic comedies using unsupervised machine learning

Author: Melissa M. Moore and Yotam Ophir
Subjects: Cultural Studies, business.industry, Communication, Big data, computer.software_genre, Romance, Interpersonal relationship, Unsupervised learning, Psychology (miscellaneous), Artificial intelligence, business, Psychology, computer, Applied Psychology, Natural language processing
Published: 2022
