11 results on "Topic model"
Search Results
2. Exploring Time-aware Multi-pattern Group Venue Recommendation in LBSNs.
- Author
BI LIANG, XIANGWU MENG, and YUJIE ZHANG
- Abstract
Location-based social networks (LBSNs) have become a popular platform for users to share their activities with friends and families, which provide abundant information for us to study issues of group venue recommendation by utilizing the characteristics of check-in data. Although there are some studies on group recommendation for venues, few studies consider the group’s venue preference in different temporal patterns. In this article, we discover that the group’s activity venue has a temporal effect, that is, the group’s preference for the activity venue is different at different times. For example, a couple of lovers prefer to travel to tropical regions in winter and relax in bars in the evening. Based on this discovery, we present a Time-aware Multi-pattern (TaMp) topic model to capture the group’s interest in the activity venue in multiple temporal patterns (including the daily pattern, the weekly pattern, the monthly pattern, and the quarterly pattern). The TaMp model takes into account the topic, members, temporality, and venue information of group activities and the latent relations among them, especially the strong correlation between the activity time and the corresponding activity venue. Then, we propose a group venue recommendation method based on the TaMp model. In addition, an improved grouping algorithm (iGA) in LBSNs is put forward to enhance the rationality of grouping and the accuracy of group venue recommendation. We conduct comprehensive experiments to evaluate the performance of TaMp on two real-world datasets. The results show that our proposed method outperforms the state-of-the-art group venue recommendation and demonstrates the significance of temporal effects in explaining group activities. [ABSTRACT FROM AUTHOR]
- Published
- 2023
3. Topic-based Video Analysis: A Survey.
- Author
PAL, RATNABALI; SEKH, ARIF AHMED; DOGRA, DEBI PROSAD; KAR, SAMARJIT; ROY, PARTHA PRATIM; and PRASAD, DILIP K.
- Subjects
*VIDEO surveillance, *CLOSED-circuit television, *COMPUTERS, *APPLICATION software, *VIDEOS, *VIDEO recording, *COMPUTER vision, *SUPERVISED learning
- Abstract
Manual processing of a large volume of video data captured through closed-circuit television is challenging due to various reasons. First, manual analysis is highly time-consuming. Moreover, as surveillance videos are recorded in dynamic conditions such as in the presence of camera motion, varying illumination, or occlusion, conventional supervised learning may not work always. Thus, computer vision-based automatic surveillance scene analysis is carried out in unsupervised ways. Topic modelling is one of the emerging fields used in unsupervised information processing. Topic modelling is used in text analysis, computer vision applications, and other areas involving spatio-temporal data. In this article, we discuss the scope, variations, and applications of topic modelling, particularly focusing on surveillance video analysis. We have provided a methodological survey on existing topic models, their features, underlying representations, characterization, and applications in visual surveillance's perspective. Important research papers related to topic modelling in visual surveillance have been summarized and critically analyzed in this article. [ABSTRACT FROM AUTHOR]
- Published
- 2022
4. Active Learning for Effectively Fine-Tuning Transfer Learning to Downstream Task.
- Author
Bashar, Md Abul and Nayak, Richi
- Subjects
*ACTIVE learning, *NATURAL language processing, *LANGUAGE models, *DEEP learning
- Abstract
Language model (LM) has become a common method of transfer learning in Natural Language Processing (NLP) tasks when working with small labeled datasets. An LM is pretrained using an easily available large unlabelled text corpus and is fine-tuned with the labelled data to apply to the target (i.e., downstream) task. As an LM is designed to capture the linguistic aspects of semantics, it can be biased to linguistic features. We argue that exposing an LM model during fine-tuning to instances that capture diverse semantic aspects (e.g., topical, linguistic, semantic relations) present in the dataset will improve its performance on the underlying task. We propose a Mixed Aspect Sampling (MAS) framework to sample instances that capture different semantic aspects of the dataset and use the ensemble classifier to improve the classification performance. Experimental results show that MAS performs better than random sampling as well as the state-of-the-art active learning models to abuse detection tasks where it is hard to collect the labelled data for building an accurate classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2021
5. Seed-Guided Topic Model for Document Filtering and Classification.
- Author
CHENLIANG LI, SHIQIAN CHEN, JIAN XING, AIXIN SUN, and ZONGYANG MA
- Subjects
*CLASSIFICATION, *SEEDS, *SEMANTICS, *VOCABULARY
- Abstract
One important necessity is to filter out the irrelevant information and organize the relevant information into meaningful categories. However, developing text classifiers often requires a large number of labeled documents as training examples. Manually labeling documents is costly and time-consuming. More importantly, it becomes unrealistic to know all the categories covered by the documents beforehand. Recently, a few methods have been proposed to label documents by using a small set of relevant keywords for each category, known as dataless text classification. In this article, we propose a seed-guided topic model for the dataless text filtering and classification (named DFC). Given a collection of unlabeled documents, and for each specified category a small set of seed words that are relevant to the semantic meaning of the category, DFC filters out the irrelevant documents and classifies the relevant documents into the corresponding categories through topic influence. DFC models two kinds of topics: category-topics and general-topics. Also, there are two kinds of category-topics: relevant-topics and irrelevant-topics. Each relevant-topic is associated with one specific category, representing its semantic meaning. The irrelevant-topics represent the semantics of the unknown categories covered by the document collection. And the general-topics capture the global semantic information. DFC assumes that each document is associated with a single category-topic and a mixture of general-topics. A novelty of the model is that DFC learns the topics by exploiting the explicit word co-occurrence patterns between the seed words and regular words (i.e., non-seed words) in the document collection. A document is then filtered, or classified, based on its posterior category-topic assignment.
Experiments on two widely used datasets show that DFC consistently outperforms the state-of-the-art dataless text classifiers for both classification with filtering and classification without filtering. In many tasks, DFC can also achieve comparable or even better classification accuracy than the state-of-the-art supervised learning solutions. Our experimental results further show that DFC is insensitive to the tuning parameters. Moreover, we conduct a thorough study about the impact of seed words for existing dataless text classification techniques. The results reveal that it is not using more seed words but the document coverage of the seed words for the corresponding category that affects the dataless classification performance. [ABSTRACT FROM AUTHOR]
- Published
- 2019
6. Enhancing Topic Modeling for Short Texts with Auxiliary Word Embeddings.
- Author
CHENLIANG LI, YU DUAN, HAORAN WANG, ZHIQIAN ZHANG, AIXIN SUN, and ZONGYANG MA
- Subjects
*SEMANTICS, *COMPARATIVE linguistics, *POISSON distribution, *LANGUAGE & languages, *LEXICOLOGY, *DISTRIBUTION (Probability theory)
- Abstract
Many applications require semantic understanding of short texts, and inferring discriminative and coherent latent topics is a critical and fundamental task in these applications. Conventional topic models largely rely on word co-occurrences to derive topics from a collection of documents. However, due to the length of each document, short texts are much more sparse in terms of word co-occurrences. Recent studies show that the Dirichlet Multinomial Mixture (DMM) model is effective for topic inference over short texts by assuming that each piece of short text is generated by a single topic. However, DMM has two main limitations. First, even though it seems reasonable to assume that each short text has only one topic because of its shortness, the definition of "shortness" is subjective and the length of the short texts is dataset dependent. That is, the single-topic assumption may be too strong for some datasets. To address this limitation, we propose to model the topic number as a Poisson distribution, allowing each short text to be associated with a small number of topics (e.g., one to three topics). This model is named PDMM. Second, DMM (and also PDMM) does not have access to background knowledge (e.g., semantic relations between words) when modeling short texts. When a human being interprets a piece of short text, the understanding is not solely based on its content words, but also their semantic relations. Recent advances in word embeddings offer effective learning of word semantic relations from a large corpus. Such auxiliary word embeddings enable us to address the second limitation. To this end, we propose to promote the semantically related words under the same topic during the sampling process, by using the generalized Pólya urn (GPU) model. Through the GPU model, background knowledge about word semantic relations learned from millions of external documents can be easily exploited to improve topic modeling for short texts. 
By directly extending the PDMM model with the GPU model, we propose two more effective topic models for short texts, named GPU-DMM and GPU-PDMM. Through extensive experiments on two real-world short text collections in two languages, we demonstrate that PDMM achieves better topic representations than state-of-the-art models, measured by topic coherence. The learned topic representation leads to better accuracy in a text classification task, as an indirect evaluation. Both GPU-DMM and GPU-PDMM further improve topic coherence and text classification accuracy. GPU-PDMM outperforms GPU-DMM at the price of higher computational costs. [ABSTRACT FROM AUTHOR]
- Published
- 2018
7. Personalized Microtopic Recommendation on Microblogs.
- Author
Li, Yang; Jiang, Jing; Liu, Ting; Qiu, Minghui; and Sun, Xiaofei
- Subjects
*MICROBLOGS, *RECOMMENDER systems, *INFORMATION filtering, *DATA analysis, *CONTENT analysis
- Abstract
Microblogging services such as Sina Weibo and Twitter allow users to create tags explicitly indicated by the # symbol. In Sina Weibo, these tags are called microtopics, and in Twitter, they are called hashtags. In Sina Weibo, each microtopic has a designated page and can be directly visited or commented on. Recommending these microtopics to users based on their interests can help users efficiently acquire information. However, it is non-trivial to recommend microtopics to users to satisfy their information needs. In this article, we investigate the task of personalized microtopic recommendation, which exhibits two challenges. First, users usually do not give explicit ratings to microtopics. Second, there exists rich information about users and microtopics, for example, users' published content and biographical information, but it is not clear how to best utilize such information. To address the above two challenges, we propose a joint probabilistic latent factor model to integrate rich information into a matrix factorization-based solution to microtopic recommendation. Our model builds on top of collaborative filtering, content analysis, and feature regression. Using two real-world datasets, we evaluate our model with different kinds of content and contextual information. Experimental results show that our model significantly outperforms a few competitive baseline methods, especially in the circumstance where users have few adoption behaviors. [ABSTRACT FROM AUTHOR]
- Published
- 2017
8. A Spatial-Temporal Topic Model for the Semantic Annotation of POIs in LBSNs.
- Author
He, Tieke; Yin, Hongzhi; Chen, Zhenyu; Zhou, Xiaofang; Sadiq, Shazia; and Luo, Bin
- Subjects
*LOCATION-based services, *SPATIOTEMPORAL processes, *SEMANTIC Web, *SOCIAL networks, *PROBABILISTIC inference
- Abstract
Semantic tags of points of interest (POIs) are a crucial prerequisite for location search, recommendation services, and data cleaning. However, most POIs in location-based social networks (LBSNs) are either tag-missing or tag-incomplete. This article aims to develop semantic annotation techniques to automatically infer tags for POIs. We first analyze two LBSN datasets and observe that there are two types of tags, category-related ones and sentimental ones, which have unique characteristics. Category-related tags are hierarchical, whereas sentimental ones are category-aware. All existing related work has adopted classification methods to predict high-level category-related tags in the hierarchy, but they cannot apply to infer either low-level category tags or sentimental ones. In light of this, we propose a latent-class probabilistic generative model, namely the spatial-temporal topic model (STM), to infer personal interests, the temporal and spatial patterns of topics/semantics embedded in users’ check-in activities, the interdependence between category-topic and sentiment-topic, and the correlation between sentimental tags and rating scores from users’ check-in and rating behaviors. Then, this learned knowledge is utilized to automatically annotate all POIs with both category-related and sentimental tags in a unified way. We conduct extensive experiments to evaluate the performance of the proposed STM on a real large-scale dataset. The experimental results show the superiority of our proposed STM, and we also observe that the real challenge of inferring category-related tags for POIs lies in the low-level ones of the hierarchy and that the challenge of predicting sentimental tags are those with neutral ratings. [ABSTRACT FROM AUTHOR]
- Published
- 2016
9. STCAPLRS.
- Author
Fang, Quan; Xu, Changsheng; Hossain, M. Shamim; and Muhammad, G.
- Subjects
*SOCIAL media research, *GEOGRAPHIC spatial analysis, *REGRESSION analysis, *END users (Information technology), *INTERNET users
- Abstract
Newly emerging location-based social media network services (LBSMNS) provide valuable resources to understand users’ behaviors based on their location histories. The location-based behaviors of a user are generally influenced by both user intrinsic interest and the location preference, and moreover are spatial-temporal context dependent. In this article, we propose a spatial-temporal context-aware personalized location recommendation system (STCAPLRS), which offers a particular user a set of location items such as points of interest or venues (e.g., restaurants and shopping malls) within a geospatial range by considering personal interest, local preference, and spatial-temporal context influence. STCAPLRS can make accurate recommendation and facilitate people’s local visiting and new location exploration by exploiting the context information of user behavior, associations between users and location items, and the location and content information of location items. Specifically, STCAPLRS consists of two components: offline modeling and online recommendation. The core module of the offline modeling part is a context-aware regression mixture model that is designed to model the location-based user behaviors in LBSMNS to learn the interest of each individual user, the local preference of each individual location, and the context-aware influence factors. The online recommendation part takes a querying user along with the corresponding querying spatial-temporal context as input and automatically combines the learned interest of the querying user, the local preference of the querying location, and the context-aware influence factor to produce the top-k recommendations. We evaluate the performance of STCAPLRS on two real-world datasets: Dianping and Foursquare. The results demonstrate the superiority of STCAPLRS in recommending location items for users in terms of both effectiveness and efficiency. 
Moreover, the experimental analysis results also illustrate the excellent interpretability of STCAPLRS. [ABSTRACT FROM AUTHOR]
- Published
- 2016
10. Object-Oriented Travel Package Recommendation.
- Author
CHANG TAN, QI LIU, ENHONG CHEN, HUI XIONG, and XIANG WU
- Subjects
*RECOMMENDER systems, *TRAVEL -- Information services, *TOURISTS, *INFORMATION filtering systems, *CONSUMER preferences
- Abstract
The article discusses the Object-Oriented Recommender System (ORS) for personalized travel package recommendations to tourists. The ability of the ORS to import additional context information to the travel package recommendation process in a cost-effective way is highlighted. The Object that collects the feature-value pairs is discussed. The use of two models in the ORS framework for extracting the implicit relationships among Objects is discussed.
- Published
- 2014
11. Dynamic Joint Sentiment-Topic Model.
- Author
CHENGHUA LIN, WEI GAO, KAM-FAI WONG, and YULAN HE
- Subjects
*SOCIAL media research, *SENTIMENT analysis, *DATA distribution, *COMMUNICATIONS research
- Abstract
Social media data are produced continuously by a large and uncontrolled number of users. The dynamic nature of such data requires the sentiment and topic analysis model to be also dynamically updated, capturing the most recent language use of sentiments and topics in text. We propose a dynamic Joint Sentiment-Topic model (dJST) which allows the detection and tracking of views of current and recurrent interests and shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic-specific word distributions are generated according to the word distributions at previous epochs. We study three different ways of accounting for such dependency information: (1) sliding window where the current sentiment-topic word distributions are dependent on the previous sentiment-topic-specific word distributions in the last S epochs; (2) skip model where history sentiment topic word distributions are considered by skipping some epochs in between; and (3) multiscale model where previous long- and short-timescale distributions are taken into consideration. We derive efficient online inference procedures to sequentially update the model with newly arrived data and show the effectiveness of our proposed model on the Mozilla add-on reviews crawled between 2007 and 2011. [ABSTRACT FROM AUTHOR]
- Published
- 2014
Discovery Service for Jio Institute Digital Library