Descriptor: "Topic model" / Publication Year Range: Last 3 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Topic model"' showing total 791 results

1. Topic optimization–incorporated collaborative recommendation for social tagging

Author: Pan, Xuwei, Zeng, Xuemei, and Ding, Ling
Published: 2024
Full Text: View/download PDF

2. Language and the use of law are predictive of judge gender and seniority.

Author: Font-Pomarol, Lluc, Piga, Angelo, Nasarre-Aznar, Sergio, Sales-Pardo, Marta, and Guimerà, Roger
Subjects: JUDGES, GENDER differences (Psychology), IMPLICIT bias, LEGAL judgments, EMPLOYEE seniority
Abstract: There are examples of how unconscious bias can influence actions of people. In the judiciary, however, despite some examples there is no general theory on whether different demographic attributes such as gender, seniority or ethnicity affect case sentencing. We aim to gain insight into this issue by analyzing over 100k decisions of three different areas of law with the goal of understanding whether judge identity or judge attributes such as gender and seniority can be inferred from decision documents. We find that stylistic features of decisions are predictive of judge identities, their gender and their seniority, a finding that is aligned with results from analysis of written texts outside the judiciary. Surprisingly, we find that features based on legislation cited are also predictive of judge identities and attributes. While own content reuse by judges can explain our ability to predict judge identities, no specific reduced set of features can explain the differences we find in the legislation cited of decisions when we group judges by gender or seniority. Our findings open the door for further research on how these differences translate into how judges apply the law and, ultimately, to promote a more transparent and fair judiciary system. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. Testing high-dimensional multinomials with applications to text analysis.

Author: Cai, T Tony, Ke, Zheng T, and Turner, Paxton
Subjects: DISTRIBUTION (Probability theory), CENTRAL limit theorem, ATTRIBUTION of authorship, FILM reviewing, GAUSSIAN distribution
Abstract: Motivated by applications in text mining and discrete distribution inference, we test for equality of probability mass functions of K groups of high-dimensional multinomial distributions. Special cases of this problem include global testing for topic models, two-sample testing in authorship attribution, and closeness testing for discrete distributions. A test statistic, which is shown to have an asymptotic standard normal distribution under the null hypothesis, is proposed. This parameter-free limiting null distribution holds true without requiring identical multinomial parameters within each group or equal group sizes. The optimal detection boundary for this testing problem is established, and the proposed test is shown to achieve this optimal detection boundary across the entire parameter space of interest. The proposed method is demonstrated in simulation studies and applied to analyse two real-world datasets to examine, respectively, variation among customer reviews of Amazon movies and the diversity of statistical paper abstracts. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. Global strategies for disruptive technology protection and regulation: evidence from policy textual analysis.

Author: Wang, Chenlin, Liu, Xiaojuan, and Shen, Jing
Subjects: *ELECTRICITY markets, *MARKET design & structure (Economics), *CONTENT analysis, *SUSTAINABLE development, *POLICY analysis
Abstract: Disruptive technology (DT) has the power to reshape markets and societal structures, necessitating both protection and regulation. This paper aims to examine DT protection and regulation strategies at both micro and macro levels through policy texts. We construct an analysis framework integrating Responsible Research and Innovation (RRI) principles, innovation theory, and the DT growth process. Key measures are extracted by content analysis, and a dynamic topic model is used to discover evolving policy focuses. The framework is applied to analyze DT policy files of major powers. The results indicate that supply-side policy tools are employed to promote sustainable development in DT innovation, while environmental tools are utilised for early governance to preempt potential risks and innovation damage. The focuses of DT policies exhibit three evolution trends: attenuation, reinforcement, and fluctuation. Protective policies show concentration and coherence, while regulatory policies demonstrate integration and correlation. This study broadens existing perspectives and analytical frameworks, providing valuable insights for DT development and governance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Big data in transportation: a systematic literature analysis and topic classification.

Author: Tzika-Kostopoulou, Danai, Nathanail, Eftihia, and Kokkinos, Konstantinos
Subjects: BIG data, CONVOLUTIONAL neural networks, SMART cities, URBAN planning, CLASSIFICATION
Abstract: This paper identifies trends in the application of big data in the transport sector and categorizes research work across scientific subfields. The systematic analysis considered literature published between 2012 and 2022. A total of 2671 studies were evaluated from a dataset of 3532 collected papers, and bibliometric techniques were applied to capture the evolution of research interest over the years and identify the most influential studies. The proposed unsupervised classification model defined categories and classified the relevant articles based on their particular scientific interest using representative keywords from the title, abstract, and keywords (referred to as top words). The model's performance was verified with an accuracy of 91% using Naïve Bayesian and Convolutional Neural Networks approach. The analysis identified eight research topics, with urban transport planning and smart city applications being the dominant categories. This paper contributes to the literature by proposing a methodology for literature analysis, identifying emerging scientific areas, and highlighting potential directions for future research. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. A relational background knowledge boosting based topic model for Chinese poems.

Author: Lei Peng and Porntrakoon, Paitoon
Subjects: CHINESE poetry, POETRY (Literary form), GIBBS sampling
Abstract: Classical Chinese poetry has become increasingly popular in recent years, and modeling its topics is a promising area of research. Chinese poems are characteristically short, but traditional topic models perform poorly on short texts because of text sparsity. Topic models should therefore be improved to suit classical Chinese poems. In this paper, a relational background knowledge boosting based topic model (RBKBTM) was proposed to overcome the text sparsity of Chinese poems. We incorporated background information into the model, which expanded the text content from the semantic perspective. The background knowledge was combined using word embedding and TextRank and was then fed into the core computing process. Subsequently, a new sampling formula was derived. Our proposed model was tested on three different tasks using three different datasets. The results demonstrate that the incorporated background knowledge can effectively overcome text sparsity, improving the performance and effectiveness of the topic model. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

7. Supervised Dynamic Correlated Topic Model for Classifying Categorical Time Series.

Author: Pais, Namitha, Ravishanker, Nalini, and Rajasekaran, Sanguthevar
Subjects: *ESCHERICHIA coli, *TIME series analysis, *SUPERVISED learning, *EXPECTATION-maximization algorithms, *KALMAN filtering
Abstract: In this paper, we describe the supervised dynamic correlated topic model (sDCTM) for classifying categorical time series. This model extends the correlated topic model used for analyzing textual documents to a supervised framework that features dynamic modeling of latent topics. sDCTM treats each time series as a document and each categorical value in the time series as a word in the document. We assume that the observed time series is generated by an underlying latent stochastic process. We develop a state-space framework to model the dynamic evolution of the latent process, i.e., the hidden thematic structure of the time series. Our model provides a Bayesian supervised learning (classification) framework using a variational Kalman filter EM algorithm. The E-step and M-step, respectively, approximate the posterior distribution of the latent variables and estimate the model parameters. The fitted model is then used for the classification of new time series and for information retrieval that is useful for practitioners. We assess our method using simulated data. As an illustration to real data, we apply our method to promoter sequence identification data to classify E. coli DNA sub-sequences by uncovering hidden patterns or motifs that can serve as markers for promoter presence. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

8. Model Analytic in Fintech User Comment Features Using LDA-CNN on Imbalanced Data.

Author: Widiantoro, Albertus Dwiyoga, Mustafid, Mustafid, and Sanjaya, Ridwan
Subjects: FINANCIAL technology, PEER-to-peer lending, CONVOLUTIONAL neural networks
Abstract: Peer-to-peer (P2P) lending platforms are growing significantly, and users always leave comments on the application to provide ratings. User comments are important to analyze to see the needs and constraints of fintech users. The purpose of the research is to create an analytical model that effectively addresses the problem of limited accuracy in classification due to data imbalance in P2P lending platforms. The research aims to improve feature detection and overall model quality by effectively managing imbalanced data. The design involves a combination of techniques. First, the Latent Dirichlet Allocation (LDA) method is used to organize topics and label data. To address the data imbalance, the study employs Random Over Sampling (ROS) and Neighborhood Cleaning Rule (NCL). The final classification is performed using Convolutional Neural Networks (CNN). Additionally, a comparative analysis with other algorithms like LSTM and CNN-LSTM is carried out to validate the effectiveness of the proposed approach. The findings reveal that the CNN-ROS-NCL model is capable of managing imbalanced data, which improves class distribution and enhances the model's quality by reducing noise and misleading samples. The CNN model achieved a classification accuracy of 94.66% on 10 feature classes, suggesting a significant improvement in feature detection and classification performance on the P2P lending platform. The originality of this research lies in the innovative integration of LDA for topic analysis with CNN for classification, a novel approach in the context of fintech feature development. This combination has not previously been used in fintech and offers a new way to automatically detect features in fintech P2P lending user comment datasets by identifying key topics. The research contributes to the enhancement of fintech applications and services by providing a model that improves the understanding and processing of user comments on P2P platforms. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. Improving Jobs-Resumes Classification: A Labor Market Intelligence Approach.

Author: Beristain, Saúl Iván, Barbosa, Rutilio Rodolfo López, and Barriocanal, Elena García
Subjects: MACHINE learning, INFORMATION technology, JOB resumes, JOB vacancies, LABOR market
Abstract: This research proposes a framework to improve the efficiency of classifying skill descriptions on resumes and matching them with job vacancies, using labor market intelligence over a dataset of resumes harvested from social networks. To carry out the experiments, a Kaggle dataset was downloaded containing information from the LinkedIn social network with more than 200,000 records that were later filtered and pre-processed to generate a topic model to classify the entire dataset. Later, using machine learning algorithms, prediction exercises were performed to determine the most efficient match. This model offers high percentages of efficiency when predicting the job position of a candidate in information technology (IT) areas. This prediction is achieved due to the reduction of categories in these areas generated by the creation of the corresponding topic model to match the resume with the job position. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. Language and the use of law are predictive of judge gender and seniority

Author: Lluc Font-Pomarol, Angelo Piga, Sergio Nasarre-Aznar, Marta Sales-Pardo, and Roger Guimerà
Subjects: Gender differences, Topic model, Judicial decisions, Computer applications to medicine. Medical informatics, R858-859.7
Abstract: There are examples of how unconscious bias can influence actions of people. In the judiciary, however, despite some examples there is no general theory on whether different demographic attributes such as gender, seniority or ethnicity affect case sentencing. We aim to gain insight into this issue by analyzing over 100k decisions of three different areas of law with the goal of understanding whether judge identity or judge attributes such as gender and seniority can be inferred from decision documents. We find that stylistic features of decisions are predictive of judge identities, their gender and their seniority, a finding that is aligned with results from analysis of written texts outside the judiciary. Surprisingly, we find that features based on legislation cited are also predictive of judge identities and attributes. While own content reuse by judges can explain our ability to predict judge identities, no specific reduced set of features can explain the differences we find in the legislation cited of decisions when we group judges by gender or seniority. Our findings open the door for further research on how these differences translate into how judges apply the law and, ultimately, to promote a more transparent and fair judiciary system.
Published: 2024
Full Text: View/download PDF

11. Does Topic Consistency Matter? A Study of Critic and User Reviews in the Movie Industry.

Author: Kim, Eunsoo, Ding, MengQi, Wang, Xin, and Lu, Shijie
Subjects: FILM reviewing, MOTION picture plots & themes, MOTION picture industry, FILM critics, MOTION picture audiences, CONSUMERS' reviews, WORD of mouth advertising, ECONOMIC demand, FILM box office revenue
Abstract: Online review platforms often present reviews from both critics and general users. In this research, the authors propose a measure called "topic consistency" to capture the degree of overlap between critic and user review content. High topic consistency suggests greater information recall due to repeated presentation of the same topics, which may increase the memorability of movie attributes and therefore positively affect movie demand. The authors measure the topic consistency between critic and user reviews using topic models and further study the financial consequences of this measure using data for movies released in the United States. Topic consistency is positively associated with subsequent box office revenue, suggesting a positive relationship between topic consistency and movie demand. Furthermore, the effect of topic consistency on demand is the greatest for movies with mediocre review ratings and when the review ratings from critics are close to those from users. Using lab experiments, the authors provide evidence of the causal link between topic consistency and consumers' willingness to watch a movie, and support for the potential mediation through the information recall of reviews. Movie producers and advertisers should consider highlighting or inducing a central theme for critics and users to discuss, as the more the review content of critics and users overlaps, the higher a movie's revenue. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

12. The Journey of Language Models in Understanding Natural Language

Author: Liu, Yuanrui, Zhou, Jingping, Sang, Guobiao, Huang, Ruilong, Zhao, Xinzhe, Fang, Jintao, Wang, Tiexin, Li, Bohan, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Jin, Cheqing, editor, Yang, Shiyu, editor, Shang, Xuequn, editor, Wang, Haofen, editor, and Zhang, Yong, editor
Published: 2024
Full Text: View/download PDF

13. Research on the Characteristics and Evolution of Digital Economic Policy Topics Based on BERTopic Model

Author: Li, Ye, Zhang, Cunyang, Zhao, Linfang, Appolloni, Andrea, Series Editor, Caracciolo, Francesco, Series Editor, Ding, Zhuoqi, Series Editor, Gogas, Periklis, Series Editor, Huang, Gordon, Series Editor, Nartea, Gilbert, Series Editor, Ngo, Thanh, Series Editor, Striełkowski, Wadim, Series Editor, Liao, Junfeng, editor, Li, Hongbo, editor, and Ng, Edward H. K., editor
Published: 2024
Full Text: View/download PDF

14. Enhancing LDA Method by the Use of Feature Maximization

Author: Lamirel, Jean-Charles, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Villmann, Thomas, editor, Kaden, Marika, editor, Geweniger, Tina, editor, and Schleif, Frank-Michael, editor
Published: 2024
Full Text: View/download PDF

15. CoTE: A Flexible Method for Joint Learning of Topic and Embedding Models

Author: Zhao, Bo, Yuan, Chunfeng, Huang, Yihua, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Song, Xiangyu, editor, Feng, Ruyi, editor, Chen, Yunliang, editor, Li, Jianxin, editor, and Min, Geyong, editor
Published: 2024
Full Text: View/download PDF

16. Enhancing LSTM and Fusing Articles of Law for Legal Text Summarization

Author: Chen, Zhe, Ye, Lin, Zhang, Hongli, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

17. Differentiable Topics Guided New Paper Recommendation

Author: Li, Wen, Xie, Yi, Jiang, Hailan, Sun, Yuqing, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

18. Topic and knowledge-enhanced modeling for edge-enabled IoT user identity linkage across social networks

Author: Rui Huang, Tinghuai Ma, Huan Rong, Kai Huang, Nan Bi, Ping Liu, and Tao Du
Subjects: Internet of things, User identity linkage, Cross-social network, Topic model, Knowledge graph, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
Abstract: The Internet of Things (IoT) devices spawn growing diverse social platforms and online data at the network edge, propelling the development of cross-platform applications. To integrate cross-platform data, user identity linkage is envisioned as a promising technique by detecting whether different accounts from multiple social networks belong to the same identity. The profile and social relationship information of IoT users may be inconsistent, which deteriorates the reliability of the effectiveness of identity linkage. To this end, we propose a topic and knowledge-enhanced model for edge-enabled IoT user identity linkage across social networks, named TKM, which conducts feature representation of user generated contents from both post-level and account-level for identity linkage. Specifically, a topic-enhanced method is designed to extract features at the post-level. Meanwhile, we develop an external knowledge-based Siamese neural network for user-generated content alignment at the account-level. Finally, we show the superiority of TKM over existing methods on two real-world datasets. The results demonstrate the improvement in prediction and retrieval performance achieved by utilizing both post-level and account-level representation for identity linkage across social networks.
Published: 2024
Full Text: View/download PDF

19. Discovering the relationship between the number of film review topics and box office with NLP techniques.

Author: Li, Bo, Dai, Wei, Liu, Shang, and Shi, Yong
Abstract: Numerous movie reviews on the internet can be used to analyze audience experiences of movies and provide inspiration for the operation and production of movies. In this work, we integrate and match the review data on Douban with the box office data on the Maoyan website and use a topic model from natural language processing to determine the number of review topics for each movie, then establish a regression equation to analyze the relationship between the number of topics and movie box office. Our analysis shows a significant positive correlation between film box office and the number of movie review topics, and we provide an explanation for this. Finally, we also analyze the practical application value of this work. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Sharing the same bed with different dreams: Topic modeling the research-practice gap in public relations 2011-2020.

Author: Wang, Xiao and Zhang, Maggie Mengqing
Subjects: PUBLIC relations, SCHOLARLY periodicals, KNOWLEDGE transfer
Abstract: Prior empirical efforts in uncovering the research-practice gap in public relations have often been restricted to perceptions and evaluations of people participating in the investigation. Moving beyond the linear perspective on knowledge transfer that has dominated relevant discussions for decades, this study adopted topic modeling as an inductive analytical approach to examine a comprehensive set of texts representing the perspective of scholars and practitioners over a 10-year period from 2011 to 2020. A comparison of 35 topics discerned from academic journals (1,209 titles/abstracts) and professional texts (2,378 articles) revealed that a total of 18 topics were peculiar to each corpus, providing sound evidence of the substantial divide between scholars and practitioners. However, the two communities shared common or comparable concerns over 17 topics, suggesting a significant convergence on crucial issues. Moreover, scholars and practitioners assigned varying weights to these topics in their publications, which indicated noteworthy differences in the primary areas of interest for both communities. In addition to deepening our understanding of the width and nuances of the research-practice gap in the field of public relations in a quantitative way, findings obtained from this study also signal the direction toward which scholars and practitioners should make progress to bridge the gap. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Topic and knowledge-enhanced modeling for edge-enabled IoT user identity linkage across social networks.

Author: Huang, Rui, Ma, Tinghuai, Rong, Huan, Huang, Kai, Bi, Nan, Liu, Ping, and Du, Tao
Subjects: SOCIAL networks, USER-generated content, ONLINE social networks, SOCIAL media, INTERNET of things, SOCIAL belonging, FEATURE extraction
Abstract: The Internet of Things (IoT) devices spawn growing diverse social platforms and online data at the network edge, propelling the development of cross-platform applications. To integrate cross-platform data, user identity linkage is envisioned as a promising technique by detecting whether different accounts from multiple social networks belong to the same identity. The profile and social relationship information of IoT users may be inconsistent, which deteriorates the reliability of the effectiveness of identity linkage. To this end, we propose a topic and knowledge-enhanced model for edge-enabled IoT user identity linkage across social networks, named TKM, which conducts feature representation of user generated contents from both post-level and account-level for identity linkage. Specifically, a topic-enhanced method is designed to extract features at the post-level. Meanwhile, we develop an external knowledge-based Siamese neural network for user-generated content alignment at the account-level. Finally, we show the superiority of TKM over existing methods on two real-world datasets. The results demonstrate the improvement in prediction and retrieval performance achieved by utilizing both post-level and account-level representation for identity linkage across social networks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. Research on sales and ethics: Mapping the past and charting the future.

Author: Hartmann, Nathaniel N., Wieland, Heiko, Gustafson, Brandon, and Habel, Johannes
Subjects: RESEARCH ethics, RESEARCH questions, PERIODICAL articles, SALES management, PERIODICAL publishing
Abstract: The scholarly literature at the intersection of sales and ethics is vast and, therefore, difficult to summarize. To explore the state of the sales–ethics landscape, the authors apply probabilistic topic modeling to a dataset composed of 293 journal articles published from 1980 to 2022. The critical examination of the results leads to a framework that identifies 32 topics and groups these topics into five high-level topic areas. Building on these topics and topic areas, the authors explore where future research on sales and ethics should focus, using in-depth interviews with 30 scholars and 15 practitioners. The results of these interviews reveal important implications for sales and ethics and overarching research questions regarding (1) understanding existing realities, (2) understanding new realities, and (3) advancing research practices. In doing so, this study provides a platform for much-needed research and scholarly discourse on sales and ethics. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Judicial hierarchy and discursive influence.

Author: Herron, Felix, Carlson, Keith, Rockmore, Daniel N., and Livermore, Michael A.
Subjects: *FEDERAL courts, *LEGAL language, *APPELLATE courts, *CONSTITUTIONAL courts, *COMPLEXITY (Philosophy)
Abstract: We apply a dynamic influence model to the opinions of the US federal courts to examine the role of the US Supreme Court in influencing the direction of legal discourse in the federal courts. We propose two mechanisms for how the Court affects innovation in legal language: a selection mechanism where the Court's influence primarily derives from its discretionary jurisdiction, and an authorship mechanism in which the Court's influence derives directly from its own innovations. To test these alternative hypotheses, we develop a novel influence measure based on a dynamic topic model that separates the Court's own language innovations from those of the lower courts. Applying this measure to the US federal courts, we find that the Supreme Court primarily exercises influence through the selection mechanism, with modest additional influence attributable to the authorship mechanism. This article is part of the theme issue 'A complexity science approach to law and governance'. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. The flood, the traitors, and the protectors: affect and white identity in the Internet Research Agency's Islamophobic propaganda on Twitter.

Author: Ganesh, Bharath and Faggiani, Nicolò
Subjects: *ISLAMOPHOBIA, *MUSLIMS, *VOTERS, *DISINFORMATION, *RACISM, *TRAITORS
Abstract: Between 2015 and 2017, the Internet Research Agency (IRA) – a Kremlin-backed "troll farm" based in St. Petersburg – executed a propaganda campaign on Twitter to target US voters. Scholarship has expended relatively little effort to study the role of Islamophobia in the IRA's propaganda campaign. Following critical disinformation research, this article demonstrates that Islamophobia, affect, and white identity played a crucial role in the IRA's targeting of right-wing US voters. With an official release of tweets and associated visual content from Twitter, we use topic modeling and visual analysis to explore both how, and to what extent, the IRA used Islamophobia in its propaganda. To do so, we develop a multimodal distant reading technique to study how the IRA aligned users with contemporary far right social movements by deploying racial and emotional appeals that center on narrating a transnational white identity under threat from Islam and Muslims. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. Recommendation method based on a mixed-interest topic model.

Author: 邱云飞 and 田丰维
Abstract: To solve the cold-start problem caused by sparse user interest in cross-domain project recommendation, this paper proposes PA-LDA, a recommendation method based on a mixed-interest topic model. PA-LDA uses a P-LDA module, which generates the interest topic distribution for a target project by mining users' historical behavior data. P-LDA then performs parameter estimation to build the model from the interaction between topics and content words, which helps measure users' interest in the target project. PA-LDA uses an A-LDA module to measure domain interest in the target project, and then applies a top-k method to recommend target projects based on the two combined interest measurements. The effectiveness and efficiency of the method are verified by experiments on two real data sets, EdX and GCSE. The research explains how user interest and domain interest affect recommendation. It also captures interest features in multi-dimensional domain recommendation, which improves the adaptability and accuracy of recommendation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. A quantitative window on the history of statistics: topic-modelling 120 years of Biometrika.

Author: Bertoldi, Nicola, Lareau, Francis, Pence, Charles H, and Malaterre, Christophe
Subjects: *INTELLECTUAL development, *HISTORY of publishing, *STATISTICS, *DIGITAL technology, *PERIODICAL articles, *THEMATIC analysis
Abstract: As one of the oldest continuously publishing journals in statistics (published since 1901), Biometrika provides a unique window onto the history of statistics and its epistemic development throughout the 20th and the beginning of the 21st centuries. While the early history of the discipline, with the works of key figures, such as Karl Pearson, Francis Galton, or Ronald Fisher, is relatively well known, the later (and longer) episodes of its intellectual development remain understudied. By applying digital tools to the full-text corpus of the journal articles (N = 5,596), the objective of this study is to provide a novel quantitative exploration of the history of the statistical sciences via an all-encompassing view of 120 years of Biometrika. To this aim, topic-modelling analyses are used and provide insights into the epistemic content of the journal and its evolution. Striking changes in the thematic content of the journal are documented and quantified for the first time, from the decline of Pearsonian and Weldonian biometrical research and the journal's tight connection to biology in the 1930s to the rise of modern statistical methods beginning in the 1960s and 1970s. Newly developed approaches are used to infer author networks from publication topics. The resulting network of authors shows the existence of several communities, well-aligned with topic clusters and their evolution through time. It also highlights the role of specific figures over more than a century of publishing history and provides a first window onto the foundation, development, and diverse applications of the statistical sciences. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Direction-Oriented Topic Modeling with Applications in Traffic Scene Analysis.

Author: Ahmadi, Parvin and Gholampour, Iman
Abstract: Unlike text analysis for which topic models are historically developed, traffic video analysis is dealing with much simpler topics, made of restricted motion patterns. In this paper, we propose a dual-layer direction-oriented framework for more efficient traffic motion patterns description based on topic models through considering the simplicity of traffic topics. The aforesaid framework compels the involved topic models to learn the foreknown visually meaningful motion patterns that exist in traffic scenes, as developed theoretically in this paper. Experimental results produced by common datasets show that the proposed method provides more intuitive topics for traffic flow description. Based on experimental results, our framework outperforms other topic-model based methods by 4% to more than 11% in detecting abnormal events, in terms of the area under the Receiver Operating Characteristic curve. In addition to that, in a scene analysis evaluation at intersections equipped with traffic signals, our method reaches 4% higher traffic phase detection accuracy, compared to conventional topic models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. How to discover consumer attention to design topics of fast fashion: a topic modeling approach

Author: Pan, Xuwei, Li, Jihu, Luo, Jianhong, and Zhan, Wenbang
Published: 2024
Full Text: View/download PDF

29. Temporal analysis of computational economics: a topic modeling approach

Author: Mishra, Malvika, Vishwakarma, Santosh Kumar, Malviya, Lokesh, and Anjana, S.
Published: 2024
Full Text: View/download PDF

30. Understanding Food Waste in Bulk in Incheon: Based on Naver Blog Big Data

Author: Junsuk Choi, Joonhyeong Joseph Kim, and Sang Mook Lee
Subjects: food waste, text mining, correlation analysis, topic model, lda, Business, HF5001-6182, Finance, HG1-9999
Abstract: Purpose: This study aims to provide insight into the contemporary food waste issue for the better management of food waste in large quantities in Incheon, one specific region in Korea, based on the analysis of the Naver Blog corpus, employing analysis techniques including text mining and LDA topic modeling. Design/methodology/approach: In order to achieve the aforementioned objectives, the current study has employed the R program for the analysis (e.g., TF, tf-idf, correlation, and LDA) of 868 Naver Blog posts which included information about food waste, and/or garbage produced in bulk by foodservice operators in the context of Incheon. Findings: The frequently addressed keywords in the dataset include food, waste, garbage, processor, workplace, food waste dewaterer, microorganism, pulverizer, plastic, compressor, restaurant, sink, cafeteria, and bean sprouts. The correlation analysis demonstrated that waste is largely generated by food material, and food material is closely related to specific types of private companies (e.g., cafeterias) and public places (e.g., military bases, prisons, hospitals). The LDA identified three topics: the implications for food waste produced by the workplace, recent equipment and technologies used for food waste processing, and effect of waste on the environment and call for remedies. Research limitations/implications: While this study has shed light on contemporary issues in relation to food waste in Incheon, it is suggested that the involved parties in the waste management industry pay more attention to the development of effective waste management strategies by hospitality operators in a specific region. Originality/value: This study responds to a lack of understanding underpinning foodservice operators who produce a large quantity of food waste in Incheon, albeit much attention has been paid to some recent research on food waste.
Published: 2024
Full Text: View/download PDF

31. Analyzing Alzheimer’s Disease Research Trends: Insights From Improved Dynamic Topic Modeling

Author: Juan Shen and Vladimir Y. Mariano
Subjects: Alzheimer’s disease, topic model, topic mining, trends analysis, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This study addresses the need for a comprehensive understanding of evolving research trends and challenges in Alzheimer’s disease research literature from 2016 to 2023. Employing an improved Dynamic Topic Model (DTM), we analyze the landscape from four critical perspectives: identifying predominant topics, analyzing variability in topic intensity, tracing evolutionary trajectories, and delineating development patterns of key research terms. Through an intensive investigation of four pivotal topics, we emphasize the imperative for sustained attention to these areas, crucial for guiding future research initiatives. Our findings reveal notable fluctuations in topic intensity, primarily attributed to nascent research domains lacking well-defined directions and cohesive research teams. Moreover, we observe a tendency for topics of high similarity to converge over time, signifying maturation and consolidation within the field. Importantly, our study underscores how focal points in Alzheimer’s disease research shift across developmental stages, shaped by dynamic interactions among the research community’s social dynamics, technological advancements, and evolving scientific priorities.
Published: 2024
Full Text: View/download PDF

32. Big topic modeling based on a two-level hierarchical latent Beta-Liouville allocation for large-scale data and parameter streaming.

Author: Ihou, Koffi Eddy and Bouguila, Nizar
Abstract: As an extension to the standard symmetric latent Dirichlet allocation topic model, we implement asymmetric Beta-Liouville as a conjugate prior to the multinomial and therefore propose the maximum a posteriori for latent Beta-Liouville allocation as an alternative to maximum likelihood estimator for models such as probabilistic latent semantic indexing, unigrams, and mixture of unigrams. Since most Bayesian posteriors, for complex models, are intractable in general, we propose a point estimate (the mode) that offers a much tractable solution. The maximum a posteriori hypotheses using point estimates are much easier than full Bayesian analysis that integrates over the entire parameter space. We show that the proposed maximum a posteriori reduces the three-level hierarchical latent Beta-Liouville allocation to two-level topic mixture as we marginalize out the latent variables. In each document, the maximum a posteriori provides a soft assignment and constructs dense expectation–maximization probabilities over each word (responsibilities) for accurate estimates. For simplicity, we present a stochastic at word-level online expectation–maximization algorithm as an optimization method for maximum a posteriori latent Beta-Liouville allocation estimation whose unnormalized reparameterization is equivalent to a stochastic collapsed variational Bayes. This implicit connection between the collapsed space and expectation–maximization-based maximum a posteriori latent Beta-Liouville allocation shows its flexibility and helps in providing alternative to model selection. We characterize efficiency in the proposed approach for its ability to simultaneously stream both large-scale data and parameters seamlessly. The performance of the model using predictive perplexities as evaluation method shows the robustness of the proposed technique with text document datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Robust Chinese Short Text Entity Disambiguation Method Based on Feature Fusion and Contrastive Learning.

Author: Mei, Qishun and Li, Xuhui
Subjects: *FEATURE extraction, *ANNOTATIONS
Abstract: To address the limitations of existing methods of short-text entity disambiguation, specifically in terms of their insufficient feature extraction and reliance on massive training samples, we propose an entity disambiguation model called COLBERT, which fuses LDA-based topic features and BERT-based semantic features, as well as using contrastive learning, to enhance the disambiguation process. Experiments on a publicly available Chinese short-text entity disambiguation dataset show that the proposed model achieves an F1-score of 84.0%, which outperforms the benchmark method by 0.6%. Moreover, our model achieves an F1-score of 74.5% with a limited number of training samples, which is 2.8% higher than the benchmark method. These results demonstrate that our model achieves better effectiveness and robustness and can reduce the burden of data annotation as well as training costs. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Identifying Key Issues in Integration of Autonomous Ships in Container Ports: A Machine-Learning-Based Systematic Literature Review.

Author: Hirata, Enna and Hansen, Annette Skovsted
Subjects: LITERATURE reviews, CONTAINER terminals, CONTAINER ships, SHIPPING containers, INTERNET security laws, THIRD-party logistics, AUTONOMOUS underwater vehicles
Abstract: Background: Autonomous ships have the potential to increase operational efficiency and reduce carbon footprints through technology and innovation. However, there is no comprehensive literature review of all the different types of papers related to autonomous ships, especially with regard to their integration with ports. This paper takes a systematic review approach to extract and summarize the main topics related to autonomous ships in the fields of container shipping and port management. Methods: A machine learning method is used to extract the main topics from more than 2000 journal publications indexed in WoS and Scopus. Results: The research findings highlight key issues related to technology, cybersecurity, data governance, regulations, and legal frameworks, providing a different perspective compared to human manual reviews of papers. Conclusions: Our search results confirm several recommendations. First, from a technological perspective, it is advised to increase support for the research and development of autonomous underwater vehicles and unmanned aerial vehicles, establish safety standards, mandate testing of wave model evaluation systems, and promote international standardization. Second, from a cyber–physical systems perspective, efforts should be made to strengthen logistics and supply chains for autonomous ships, establish data governance protocols, enforce strict control over IoT device data, and strengthen cybersecurity measures. Third, from an environmental perspective, measures should be implemented to address the environmental impact of autonomous ships. This can be achieved by promoting international agreements from a global societal standpoint and clarifying the legal framework regarding liability in the event of accidents. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Seeded Sequential LDA: A Semi-Supervised Algorithm for Topic-Specific Analysis of Sentences.

Author: Watanabe, Kohei and Baturo, Alexander
Subjects: *ALGORITHMS, *CLASSIFICATION algorithms, *RESEARCH personnel, *CONTENT analysis, *THESIS statements (Rhetoric)
Abstract: Topic models have been widely used by researchers across disciplines to automatically analyze large textual data. However, they often fail to automate content analysis, because the algorithms cannot accurately classify individual sentences into pre-defined topics. Aiming to make topic classification more theoretically grounded and content analysis in general more topic-specific, we have developed Seeded Sequential Latent Dirichlet allocation (LDA), extending the existing LDA algorithm, and implementing it in a widely accessible open-source package. Taking a large corpus of speeches delivered by delegates at the United Nations General Assembly as an example, we explain how our algorithm differs from the original algorithm; why it can classify sentences more accurately; how it accepts pre-defined topics in deductive or semi-deductive analysis; how such ex-ante topic mapping differs from ex-post topic mapping; how it enables topic-specific framing analysis in applied research. We also offer practical guidance on how to determine the optimal number of topics and select seed words for the algorithm. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Study on the evolution of hot topics in the urban development.

Author: Zhou, Ping and Jiang, Difei
Abstract: Urbanization is crucially important for people to improve the quality of life. Thus, it is of importance to study the evolution of hot topics to explore the functions of cities for meeting the increasing demands of people. In this paper, we explored the semantic analysis of hot topics and trends in urban studies from the literature, which provides a research direction for future studies. Based on articles collected from the Science Citation Index Expanded and Conference Proceedings Citation Index-Science databases from 2000 to 2016, we found that the number of urban studies increased in stability during that time. Followed by England and China, USA was the largest contributor for studies in this field. Based on the keywords and abstracts of these articles, we extracted the topics of the study using a clustering method and topic model, and calculated the hot values of the topics. Finally, we obtained 15 hot topics in the field of urban studies, among which "city", "school", "regional economic", and "estate" were the hottest topics that indicated the focus of the research study. An anomaly detection method was used to analyze the change trend of topics' hot values, and we found that the hot value of these topics overall were on the rise, especially "urban education" and "urban planning" increased significantly, which indicated that they attracted an increasing amount of scholars' attention, but the hot value of "health" and "Gis" decreased significantly recently, which suggested that research interest in these two topics is decreasing. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Cereal-legume intercropping: a smart review using topic modelling.

Author: Landschoot, Sofie, Zustovi, Riccardo, Dewitte, Kevin, Randall, Nicola P., Maenhout, Steven, and Haesaert, Geert
Subjects: CATCH crops, SUSTAINABLE agriculture, PLANT exudates, BUCKWHEAT, NITROGEN fixation, BIBLIOMETRICS, INTERCROPPING, TRITICALE
Abstract: Introduction: Over the last decade, there has been a growing interest in cereal-legume intercropping for sustainable agriculture. As a result numerous papers, including reviews, focus on this topic. Screening this large amount of papers, to identify knowledge gaps and future research opportunities, manually, would be a complex and time consuming task. Materials and methods: Bibliometric analysis combined with text mining and topic modelling, to automatically find topics and to derive a representation of intercropping papers as a potential solution to reduce the workload was tested. Both common (e.g. wheat and soybean) as well as underutilized crops (e.g. buckwheat, lupin, triticale) were the focus of this study. The corpus used for the analysis was retrieved from Web of Science and Scopus on 5th September 2022 and consisted of 4,732 papers. Results: The number of papers on cereal-legume intercropping increased in recent years, with most studies being located in China. Literature mainly dealt with the cereals maize and wheat and the legume soybean whereas buckwheat and lupin received little attention from academic researchers. These underutilized crops are certainly interesting to be used as intercropping partners, however, additional research on optimization of management and cultivar's choice is important. Yield and nitrogen fixation are the most commonly studied traits in cereal-legume intercropping. Last decade, there is an increasing interest in climate resilience, sustainability and biodiversity. Also the term "ecosystem services" came into play, but still with a low frequency. The regulating services and provisioning services seem to be the most studied, in contrast terms related to potential cultural services were not encountered. Discussion: In conclusion, based on this review several research opportunities were identified. Minor crops like lupin and buckwheat need to be evaluated for their role as intercropping partners. The interaction between species based on e.g. root exudates needs to be further unraveled. Also diseases, pests and weeds in relation to intercropping deserve more attention and finally more in-depth research on the additional benefits/ecosystem services associated with intercropping systems is necessary. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. The Mediated Construction of Crises--Combining Automated and Qualitative Content Analysis to Investigate the Use of Crisis Labels in Headlines of Swiss News Media between 1998 and 2020.

Author: Vogler, Daniel and Meissner, Florian
Subjects: PRESS, COVID-19 pandemic, PUBLIC health, GLOBAL Financial Crisis, 2008-2009, AUTOMOBILE drivers
Abstract: The recent accumulation of crises has led scholars to diagnose that crises increasingly dominate news headlines. However, there is little empirical evidence for this diagnosis because previous research often misses the longitudinal perspective. To address this gap in research, we used automated content analysis to investigate to what extent five Swiss newspapers used the crisis label in their headlines between 1998 and 2020. In the next step, we applied topic modeling to the dataset of 10,458 articles with crisis labels in their headlines to detect which topics were covered under the crisis label. Finally, we used a qualitative content analysis to name and describe the automatically identified topics. Our exploratory longitudinal design calls into question the diagnosis of the increasing use of crisis labels in media reporting. Instead, the 2008 financial crisis and the COVID-19 pandemic stand out as strong drivers of crisis labeling in headlines. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Visualising Knowledge, Research Hotspots and Trends of Literacy Studies in the Context of Library, 1969-2021.

Author: Jana, Anupta and Rout, Rosalien
Subjects: LITERATURE studies, LIBRARY research, INFORMATION literacy, BIBLIOMETRICS, INFORMATION retrieval
Abstract: In this study, we conducted an in-depth analysis spanning 53 years, from 1969 to 2021, focusing on the field of literacy studies within the context of libraries. Our exploration involved a dataset of 4,986 articles retrieved from the Scopus database. Our primary objective was to visualize knowledge by identifying and exploring prominent trends and hotspots in literacy studies. To achieve this, we adopted a comprehensive approach. The methodology employed in this study combined traditional approaches with contemporary tools. The dataset was analyzed using the R software for conventional methodologies, while MATLAB was utilized for cutting-edge techniques. The multifaceted approach allowed us to uncover patterns of continuous growth, identify key contributors, and employ the Latent Dirichlet Allocation (LDA) model to recognize emerging and significant topics. The study revealed a consistent pattern of continuous growth in the field of literacy studies, indicating the acquisition of new knowledge over time. Key contributors, including productive authors, influential journals, and active countries, were identified. The application of the LDA model enabled us to recognize newly emerged, developed, and important topics. The significance of this research lies in its contribution to understanding the dynamic landscape of literacy studies within library contexts, offering valuable insights for future research and practical applications in the field. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. Critical Uncertainty Analysis of the Application of New-Generation Digital Technology in China's Construction Industry—A Study Based on the LDA-DEMATEL-ISM Improvement Model.

Author: Li, Hui, Sun, Yanpeng, Zhang, Jingxiao, Liu, Die, Han, Zhengji, and Li, Yu
Subjects: DIGITAL technology, DIGITAL transformation, CONSTRUCTION industry, CRITICAL analysis, RATE of return, RESEARCH personnel
Abstract: As the main driving force for the digital transformation of the construction industry, the uncertainty of digital technology in the application process has seriously hindered the high-quality development of the construction industry. In order to promote the wide application of digital technology in the construction industry and clarify the key uncertainties in its application process, this paper identifies the uncertainty index system of digital technology application based on the LDA topic model and literature analysis; the DEMATEL-ISM method is used to construct the multilevel hierarchical structure model of the uncertainty indicators in the application of digital technology to study the mutual influence among the indicators and to find the key uncertainty indicators. The research results show that the uncertainty indicators of the application of digital technology in the construction industry are divided into five levels: policy, industry, personnel, economy and law, and that the perfection of the policy guarantee system is the key uncertainty indicator for the investment return period of digital technology application. The standard contract model for digital technology is a direct uncertainty indicator for the application of digital technology in the construction industry. The results of this study help researchers and practitioners to focus on the key barriers and provide a list of key elements for construction companies to promote the application of next-generation digital technologies to improve digitalization of the construction industry. This study also provides a policy reference to further promote the digital transformation of the construction industry. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Uncovering the structures of privacy research using bibliometric network analysis and topic modelling

Author: van Dijk, Friso, Gadellaa, Joost, van Toledo, Chaïm, Spruit, Marco, Brinkkemper, Sjaak, and Brinkhuis, Matthieu
Published: 2023
Full Text: View/download PDF

42. Survey of Automatic Labeling Methods for Topic Models

Author: HE Dongbin, TAO Sha, ZHU Yanhong, REN Yanzhao, CHU Yunxia
Subjects: topic model, latent dirichlet allocation (lda), topic labeling, topic label, Electronic computers. Computer science, QA75.5-76.95
Abstract: Topic models are often used in modeling unstructured corpora and discrete data to extract latent topics. As topics are generally expressed in the form of word lists, it is usually difficult for users to understand the meanings of topics, especially when users lack knowledge in the subject area. Although manually labeling topics can generate more explanatory and easily understandable topic labels, the cost is too high for the method to be feasible. Therefore, research on automatic labeling of discovered topics provides solutions to the problem. Firstly, the currently most popular technique, latent Dirichlet allocation (LDA), is elaborated and analyzed. According to the three different representations of topic labels, based on phrases, abstracts, and pictures, the topic labeling methods are classified into three types. Then, centered on improving the interpretability of topics, with different types of generated topic labels utilized, the relevant research in recent years is sorted out, analyzed, and summarized. The applicable scenarios and usability of different labels are also discussed. Meanwhile, methods are further categorized according to their different characteristics. The focus is placed on the quantitative and qualitative analysis of the abstract topic labels generated through lexical-based, submodular optimization, and graph-based methods. The differences between separate methods with respect to the learning types, technologies used, and data sources are then compared. Finally, the existing problems and development trends of research on automatic topic labeling are discussed. Basing methods on deep learning, integrating them with sentiment analysis, and continuously expanding the applicable scenarios of topic labeling will be the directions of future development.
Published: 2023
Full Text: View/download PDF

43. CFMf topic-model: comparison with LDA and Top2Vec

Author: Lamirel, Jean-Charles, Lareau, Francis, and Malaterre, Christophe
Published: 2024
Full Text: View/download PDF

44. Discovering knowledge map and evolutionary path of HRM and ER: using the STM combined with Word2vec

Author: Yu, Dejian and Xiang, Bo
Published: 2023
Full Text: View/download PDF

45. Unsupervised learning for medical data: A review of probabilistic factorization methods.

Author: Neijzen, Dorien and Lunter, Gerton
Subjects: *MATRIX decomposition, *NONNEGATIVE matrices, *LOW-rank matrices, *PRINCIPAL components analysis, *K-means clustering, *AKAIKE information criterion
Abstract: We review popular unsupervised learning methods for the analysis of high‐dimensional data encountered in, for example, genomics, medical imaging, cohort studies, and biobanks. We show that four commonly used methods, principal component analysis, K‐means clustering, nonnegative matrix factorization, and latent Dirichlet allocation, can be written as probabilistic models underpinned by a low‐rank matrix factorization. In addition to highlighting their similarities, this formulation clarifies the various assumptions and restrictions of each approach, which eases identifying the appropriate method for specific applications for applied medical researchers. We also touch upon the most important aspects of inference and model selection for the application of these methods to health data. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

46. Cross-disciplinary ripple effects of risk governance: a consideration from a regulatory science perspective.

Author: 村上道夫, 小野恭子, 井上知也, 西川佳孝, 小島直也, 岩崎雄一, 平井祐介, 藤井健吉, and 永井孝志
Subjects: DECISION making, STRUCTURAL models, RISK assessment, CLIMATE change, FRAMES (Social sciences)
Abstract: In this review, we selected Klinke and Renn (2002), entitled “A new approach to risk evaluation and management: risk-based, precaution-based, and discourse-based strategies,” published in Risk Analysis, as the most influential article from a regulatory science perspective. In this review paper, we first summarize what the Klinke and Renn (2002) paper claimed, and then classify the topics by structural topic modeling in order to analyze in what fields the Klinke and Renn (2002) paper was cited. Representative references were extracted from each classified topic to further investigate in what context Klinke and Renn (2002) paper were cited. In addition, we also organized the citation status in literature other than journal papers. Through the analyses, we found the Klinke and Renn (2002) paper was cited in a wide variety of topics, including uncertainty and decision making, regulation, systems and decision making, communication, climate change, management, systems and evaluation, and infrastructure. The citations could be divided into four main categories: characteristics of risk, criteria for assessing risk, proposed risk classes, and risk governance frames. In particular, the discussion of risk governance frames proved to have a broad impact on cross disciplines. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

47. A survey of topic models: From a whole-cycle perspective.

Author: Cheng, Gang, You, Qinliang, Shi, Lei, Wang, Zhenxue, Luo, Jia, and Li, Tianbin
Subjects: *INFORMATION science, *RESEARCH personnel, *SOCIAL networks, *STATISTICS, *MODEL theory, *NURSING informatics
Abstract: With the rapid development of information science and social networks, the Internet has accumulated various data containing valuable information and topics. The topic model has become one of the primary semantic modeling and classification methods. It has been widely studied in academia and industry. However, most topic models only focus on long texts and often suffer from semantic sparsity problems. The sparse, short text content and irregular data have brought major challenges to the application of topic models in semantic modeling and topic discovery. To overcome these challenges, researchers have explored topic models and achieved excellent results. However, most of the current topic models are applicable to a specific model task. The majority of current reviews ignore the whole-cycle perspective and framework. It brings great challenges for novices to learn topic models. To deal with the above challenges, we investigate more than a hundred papers on topic models and summarize the research progress on the entire topic model process, including theory, method, datasets, and evaluation indicator. In addition, we also analyzed the statistical data results of the topic model through experiments and introduced its applications in different fields. The paper provides a whole-cycle learning path for novices. It encourages researchers to give more attention to the topic model algorithm and the theory itself without paying extra attention to understanding the relevant datasets, evaluation methods and latest progress. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

48. A systematic review of the use of topic models for short text social media analysis.

Author: Laureate, Caitlin Doogan Poet, Buntine, Wray, and Linger, Henry
Abstract: Recently, research on short text topic models has addressed the challenges of social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these evaluation measures do not inform whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in a way that exceeds the models' limitations. These scenarios threaten the validity of topic model development and the insights produced by researchers employing topic modelling as a methodology. However, there is currently a lack of information about how and why topic models are used in applied research. As such, we performed a systematic literature review of 189 articles where topic modelling was used for social media analysis to understand how and why topic models are used for social media analysis. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We have found that researchers use topic models sub-optimally. There is a lack of methodological support for researchers to build and interpret topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

49. Survey of automatic labeling methods for topic models.

Author: 何东彬, 陶莎, 朱艳红, 任延昭, and 褚云霞
Abstract: Copyright of Journal of Frontiers of Computer Science & Technology is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2023
Full Text: View/download PDF

50. Fuel vehicles or new energy vehicles? A study on the differentiation of vehicle consumer demand based on online reviews.

Author: Wang, Xiaoguang, Cheng, Yue, Lv, Tao, and Cai, Rongjiang
Abstract: Purpose: The authors hope to filter valuable information from online reviews, obtain objective and accurate information about the demands of auto consumers and help auto companies develop more reasonable production and marketing strategies for healthy and sustainable development. This paper aims to discuss the aforementioned objectives. Design/methodology/approach: The authors collected review data from online automotive forums and generated a corpus after pre-processing. Then, the authors extracted consumer demands and topics using the LDA model. Finally, the authors used a trained Word2vec tool to extend the consumer demand topics. Findings: Different types of vehicle consumers have the same demands, such as "Space," "Power Performance," and "Brand Comparison," and distinct demands, such as "Appearance," "Safety," "Service," and "New Energy Features"; consumers who buy new energy vehicles are still accustomed to comparing with the brands or models of fuel vehicles; new energy vehicle consumers pay more attention to services and service quality during the purchasing and using process. Research limitations/implications: The development time of new energy vehicles is relatively short, with some models being available for only one year or even six months. The smaller amount of available data may impact the applicability of topic models. The sample size, especially for new energy vehicles, needs to be increased to improve the general applicability of topic models further. Practical implications: First, this measure helps online review websites improve their existing review publication mechanisms, enhance the overall quality of online review content, increase user traffic and promote the healthy development of online review websites. Second, this allows for timely adjustments in future product production and sales plans and further enhances automotive companies' ability to leverage online reviews for Internet marketing. Originality/value: The authors have improved the accuracy and stability of the fused topic model, providing a scientific and efficient research tool for multi-dimensional topic mining of online reviews. With the help of research results, consumers can more easily understand the discussion topics and thus filter out valuable reference information. As a result, automotive companies may gain information about consumer demands and product quality feedback and thus quickly adjust production and marketing strategies to increase sales and market share. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
