217 results on '"Qin Lu"'
Search Results
2. Improving Attention Model Based on Cognition Grounded Data for Sentiment Analysis
- Author
- Minglei Li, Qin Lu, Yunfei Long, Chu-Ren Huang, and Rong Xiang
- Subjects
Computer science ,business.industry ,media_common.quotation_subject ,05 social sciences ,Sentiment analysis ,Cognition ,Context (language use) ,02 engineering and technology ,Lexicon ,computer.software_genre ,Preference ,Human-Computer Interaction ,Reading (process) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,0509 other social sciences ,050904 information & library sciences ,business ,computer ,Software ,Word (computer architecture) ,Natural language processing ,Sentence ,media_common - Abstract
Attention models are proposed in sentiment analysis and other classification tasks because some words are more important than others to train the attention models. However, most existing methods either use local context based information, affective lexicons, or user preference information. In this work, we propose a novel attention model trained by cognition grounded eye-tracking data. First, a reading prediction model is built using eye-tracking data as dependent data and other features in the context as independent data. The predicted reading time is then used to build a cognition grounded attention layer for neural sentiment analysis. Our model can capture attentions in context both in terms of words at sentence level as well as sentences at document level. Other attention mechanisms can also be incorporated together to capture other aspects of attentions, such as local attention, and affective lexicons. Results of our work include two parts. The first part compares our proposed cognition grounded attention model with other state-of-the-art sentiment analysis models. The second part compares our model with an attention model based on other lexicon based sentiment resources. Evaluations show that sentiment analysis using cognition grounded attention model outperforms the state-of-the-art sentiment analysis methods significantly. Comparisons to affective lexicons also indicate that using cognition grounded eye-tracking data has advantages over other sentiment resources by considering both word information and context information. This work brings insight to how cognition grounded data can be integrated into natural language processing (NLP) tasks.
- Published
- 2021
3. Lexical data augmentation for sentiment analysis
- Author
- Chu-Ren Huang, Wenjie Li, Yunfei Long, Emmanuele Chersoni, Rong Xiang, and Qin Lu
- Subjects
Information Systems and Management ,Computer Networks and Communications ,business.industry ,Computer science ,Deep learning ,Sentiment analysis ,Library and Information Sciences ,computer.software_genre ,Abstract machine ,Learning methods ,Artificial intelligence ,business ,computer ,Natural language processing ,Information Systems - Abstract
Machine learning methods, especially deep learning models, have achieved impressive performance in various natural language processing tasks including sentiment analysis. However, deep lea...
- Published
- 2021
4. Tropical Cyclone Intensity Classification and Estimation Using Infrared Satellite Images With Deep Learning
- Author
- Xiao-Jie Wang, Xiao-Qin Lu, Chang-Jiang Zhang, and Lei-Ming Ma
- Subjects
Atmospheric Science ,010504 meteorology & atmospheric sciences ,Mean squared error ,Computer science ,Feature extraction ,intensity grade classification ,Geophysics. Cosmic physics ,0211 other engineering and technologies ,02 engineering and technology ,01 natural sciences ,Convolutional neural network ,intensity estimation ,Deep convolutional neural network (CNN) ,Computers in Earth Sciences ,Image resolution ,TC1501-1800 ,021101 geological & geomatics engineering ,0105 earth and related environmental sciences ,Pixel ,business.industry ,QC801-809 ,Deep learning ,Pattern recognition ,tropical cyclone (TC) ,Ocean engineering ,Geostationary orbit ,Satellite ,Artificial intelligence ,business - Abstract
A novel tropical cyclone (TC) intensity classification and estimation model (TCICENet) is proposed using infrared geostationary satellite images from the northwest Pacific Ocean basin in combination with a cascading deep convolutional neural network (CNN). The proposed model consists of two CNN network modules: a TC intensity classification (TCIC) module and a TC intensity estimation (TCIE) module. First, the TCIC module is utilized to divide TC intensity into three categories using infrared satellite images. Next, three TCIE models based on the CNN regression network that combine different intensity types of infrared satellite images with the TC best track data are presented. The three TCIE models consider classification error with the TCIC module in order to improve TCIE accuracy. A total of 1001 TCs from 1981-2019 were used to verify the proposed TCICENet model, with 844 TCs from 1981-2013 employed as training samples, 76 TCs from 2014-2016 used as validation samples, and 81 TCs from 2017-2019 used as testing samples. In order to reduce the computation burden of training the TCICENet model, various input image sizes were explored. An image size of 170 × 170 pixels achieved the best performance, with an overall root mean square error of 8.60 kt and a mean absolute error of 6.67 kt compared to the best track.
- Published
- 2021
5. Orthographic features for emotion classification in Chinese in informal short texts
- Author
- Yunfei Long, Qin Lu, I-Hsuan Chen, and Chu-Ren Huang
- Subjects
050101 languages & linguistics ,Linguistics and Language ,Computer science ,Emotion classification ,02 engineering and technology ,Library and Information Sciences ,computer.software_genre ,Language and Linguistics ,Education ,Code-mixing ,Task (project management) ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,0501 psychology and cognitive sciences ,business.industry ,Deep learning ,05 social sciences ,Orthographic projection ,Contrast (statistics) ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,020201 artificial intelligence & image processing ,Artificial intelligence ,Computational linguistics ,business ,computer ,Natural language processing - Abstract
Informal short texts on the web are rich in emotions as they often reflect unfiltered immediate reactions to breaking news events. The emotion density, however, stands in contrast to its poverty of linguistic contexts and features for emotion classification. This paper tackles that challenge by proposing orthographic features based on orthographic code mixing and code-switching for both non-ML and ML approaches. Our results show that orthographic features routinely outperform grammatical features for emotion classification for short texts in all approaches as expected. Orthographic features were also shown to make more significant contributions, especially in terms of precision and in formal texts when state of the art deep learning algorithms are applied. This result confirms the effectiveness of the orthographic change feature to the task of emotion classification. These results are argued to be applicable to all languages because of the common code-shifting in languages with non-Latin orthographies, and the use of non-letter symbols in all languages.
- Published
- 2020
6. Direct Crystallization of Proteins from Impure Sources
- Author
- Xi Zhang, Qing-Di Cheng, Bo Wang, Xiang-Bin Zeng, Qin-Qin Lu, Hai Hou, Yue Liu, Ahmad Fiaz, Da-Chuan Yin, Jin Li, and Chen-Yan Zhang
- Subjects
Thesaurus (information retrieval) ,010405 organic chemistry ,Computer science ,General Chemistry ,010402 general chemistry ,Condensed Matter Physics ,01 natural sciences ,0104 chemical sciences ,law.invention ,World Wide Web ,law ,natural sciences ,General Materials Science ,Crystallization - Abstract
In recent years, with the rapidly increasing demand for pure protein products in various fields (biomedicines, biochemical reagents, food industries, etc.), the need for low-cost, high-quality prot...
- Published
- 2020
7. Graph-Adaptive Semi-Supervised Tracking of Dynamic Processes Over Switching Network Modes
- Author
- Vassilis N. Ioannidis, Qin Lu, and Georgios B. Giannakis
- Subjects
Adaptive algorithm ,Computer science ,Signal Processing ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Topological graph theory ,Inference ,020206 networking & telecommunications ,02 engineering and technology ,Electrical and Electronic Engineering ,Network topology ,Algorithm ,Graph - Abstract
A plethora of network-science related applications call for inference of spatio-temporal graph processes. Such an inference task can be aided by the underlying graph topology that might jump over discrete modes. For example, the connectivity in dynamic brain networks switches among candidate topologies, each corresponding to a different emotional state, also known as the network mode. Taking advantage of limited nodal observations, the present contribution deals with semi-supervised tracking of dynamic processes over a given candidate set of graphs with unknown switches. Towards this end, a dynamical model is introduced to capture the per-slot spatial correlation using the active topology, as well as the temporal variation across slots through a state-space model. A scalable graph-adaptive Bayesian approach is developed, based on what is termed interacting multi-graph model (IMGM), to track the dynamic nodal processes and the active graph topology on-the-fly. Besides switching topologies, the proposed IMGM algorithm can accommodate various generalizations, including multiple dynamic functions, multiple kernels, and adaptive observation noise covariances. IMGM learns the dynamical model that best fits the data from a pool of available models. Thus, the resultant adaptive algorithm does not require offline model training. Numerical tests with synthetic and real datasets demonstrate the superior tracking performance of the novel approach compared to the mode-clairvoyant existing alternatives.
- Published
- 2020
8. A Two-Tier Service Filtering Model for Web Service QoS Prediction
- Author
- Mingyu Li, Qin Lu, and Mingge Zhang
- Subjects
Matching (statistics) ,General Computer Science ,Computer science ,QoS ,02 engineering and technology ,invalid service ,computer.software_genre ,Personalization ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Service (business) ,contextual information ,Information retrieval ,service filter ,Quality of service ,General Engineering ,Functional requirement ,Filter (signal processing) ,Key (cryptography) ,Service recommendation ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Web service ,lcsh:TK1-9971 ,computer - Abstract
Service recommendation technology is the key to realize the personalization of intelligent services. The recommended services need to meet functional requirements as well as non-functional requirements. Therefore, QoS-based service recommendation came into being. To perform intelligent service recommendations, matching users with convenient services based on QoS becomes an inevitable task. However, most of the service recommendation models are based on user interaction records to predict and recommend, ignoring the service-user correlation and unstable QoS values. In this article, we propose a new service recommendation model. We have performed two-tier filtering calculation on a large number of Web Services, filtering the contextual information of users and services and the instability of services. In the first filtering layer, we take the instability of QoS as an indicator to eliminate invalid services, which significantly reduces the service scale and eliminates the interference of invalid services on the recommendation to a certain extent. Further, we process the contextual information of both users and services in the second filtering layer. Considering the impact of the correlation between the service and the user, we use the geographic location information of the user and the service, and solve the combined features generated by the similarity between the user and the service to filter. Considering the sparsity of the service recommendation environment and the influence of noise generated by useless features, we use a model of factorization machine combined with the attention mechanism for computational processing. It effectively distinguishes the interactive importance of different features. We have conducted many experiments on a real dataset, and the results show that our model is better than most baseline models in terms of recommendation performance.
- Published
- 2020
9. Leveraging writing systems changes for deep learning based Chinese affective analysis
- Author
- Yunfei Long, Qin Lu, Yufei Zheng, Rong Xiang, Ying Jiao, and Wenhao Ying
- Subjects
Syntax (programming languages) ,business.industry ,Computer science ,Feature vector ,Deep learning ,05 social sciences ,050301 education ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Writing system ,Artificial Intelligence ,Scripting language ,Pattern recognition (psychology) ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Memory model ,business ,0503 education ,computer ,Software ,Natural language processing - Abstract
Affective analysis of social media text is in great demand. Online text written in Chinese communities often contains mixed scripts including major text written in Chinese, an ideograph-based writing system, and minor text using Latin letters, an alphabet-based writing system. This phenomenon is referred to as writing systems changes (WSCs). Past studies have shown that WSCs often reflect unfiltered immediate affections. However, the use of WSCs poses more challenges in Natural Language Processing tasks because WSCs can break the syntax of the major text. In this work, we present our work to use WSCs as an effective feature in a hybrid deep learning model with attention network. The WSCs scripts are first identified by their encoding range. Then, the document representation of the text is learned through a Long Short-Term Memory model and the minor text is learned by a separate Convolution Neural Network model. To further highlight the WSCs components, an attention mechanism is adopted to re-weight the feature vector before the classification layer. Experiments show that the proposed hybrid deep learning method which better incorporates WSCs features can further improve performance compared to the state-of-the-art classification models. The experimental result indicates that WSCs can serve as effective information in affective analysis of the social media text.
- Published
- 2019
10. AliMe Avatar: Multi-modal Content Production and Presentation for Live-streaming E-commerce
- Author
- Fu Sun, Haiqing Chen, Xikai Liu, Liming Pu, Ji Zhang, Jiashuo Zhang, Hehong Chen, Zhongzhou Zhao, Qin Lu, Bo Chen, Liqun Xie, Feng-Lin Li, Qi Huang, and Xuming Lin
- Subjects
Focus (computing) ,Multimedia ,Computer science ,business.industry ,media_common.quotation_subject ,E-commerce ,computer.software_genre ,Computer graphics ,Product (business) ,Presentation ,Broadcasting (networking) ,Mode (computer interface) ,business ,computer ,Avatar ,media_common - Abstract
We present AliMe Avatar, a Vtuber designed for live-streaming sales in the E-commerce field. To support the emerging live shopping mode, the core of our digital avatar is to enable customers to understand products and encourage customers to purchase in a virtual broadcasting room. Based on computer graphics & vision, natural language processing, and speech recognition & synthesis, our AI avatar is able to offer three kinds of key capabilities: custom appearance, product broadcasting, and multi-modal interaction. Currently, it has been launched online in the Taobao app, broadcasts 700+ hours and serves hundreds of thousands of customers per day. In this paper, we mainly focus on the product broadcasting part, demonstrate the system, present the underlying techniques, and share our experience in dealing with live-streaming E-commerce.
- Published
- 2021
11. Image caption generation method based on an interaction mechanism and scene concept selection module
- Author
- Qin Lu and Liping Zhang
- Subjects
Matching (graph theory) ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Process (computing) ,Semantics ,Image (mathematics) ,Data modeling ,Selection (linguistics) ,Systems architecture ,Computer vision ,Artificial intelligence ,business ,Natural language - Abstract
Image caption generation is the process of converting image content information into a smooth and accurate natural language. In the existing image caption generation methods, a consistent matching problem exists between the semantic information and the image information, and the existing methods encode based on only low-level spatial features or high-level text features, which limits the richness of the resulting image caption. To address this problem, this paper designs an interaction mechanism for image caption generation to achieve the mutual selection of image information (global information, target information) and semantic information in both directions. On the basis of the interaction mechanism, a scene concept selection module is designed, and the extracted scene concept information is selected by the interactive information generated by the interaction mechanism. This approach solves the problem of consistent matching between image information and semantic information and the problem of image caption enrichment. Experiments on MSCOCO data sets show that our model can generate accurate captions that match the scene information and is superior to many existing image caption generation models.
- Published
- 2021
12. Online Unsupervised Learning Using Ensemble Gaussian Processes with Random Features
- Author
- Qin Lu, Georgios Vasileios Karanikolas, and Georgios B. Giannakis
- Subjects
symbols.namesake ,Kernel (linear algebra) ,Rank (linear algebra) ,Computer science ,Dimensionality reduction ,Benchmark (computing) ,symbols ,Nonlinear dimensionality reduction ,Unsupervised learning ,Gaussian process ,Algorithm ,Ensemble learning - Abstract
Gaussian process latent variable models (GPLVMs) are powerful, yet computationally heavy tools for nonlinear dimensionality reduction. Existing scalable variants utilize low-rank kernel matrix approximants that in essence subsample the embedding space. This work develops an efficient online approach based on random features by replacing spatial with spectral subsampling. The novel approach bypasses the need for optimizing over spatial samples, without sacrificing performance. Different from GPLVM, whose performance depends on the choice of the kernel, the proposed algorithm relies on an ensemble of kernels - what allows adaptation to a wide range of operating environments. It further allows for initial exploration of a richer function space, relative to methods adhering to a single fixed kernel, followed by sequential contraction of the search space as more data become available. Tests on benchmark datasets demonstrate the effectiveness of the proposed method.
- Published
- 2021
13. Gaussian Process Temporal-Difference Learning with Scalability and Worst-Case Performance Guarantees
- Author
- Georgios B. Giannakis and Qin Lu
- Subjects
Mathematical optimization ,symbols.namesake ,Computer science ,Bellman equation ,Bounded function ,Benchmark (computing) ,symbols ,State space ,Estimator ,Reinforcement learning ,Temporal difference learning ,Gaussian process - Abstract
Value function approximation is a crucial module for policy evaluation in reinforcement learning when the state space is large or continuous. The present paper revisits policy evaluation via temporal-difference (TD) learning from the Gaussian process (GP) perspective. Leveraging random features to approximate the GP prior, an online scalable (OS) approach, termed OS-GPTD, is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs. To benchmark the performance of OS-GPTD even in the adversarial setting, where the modeling assumptions are violated, complementary worst-case analyses are performed. The cumulative Bellman error, as well as the long-term reward prediction error, are upper bounded relative to their counterparts from a fixed value function estimator with the entire state-reward trajectory in hindsight. Performance of the novel OS-GPTD is evaluated on two benchmark problems.
- Published
- 2021
14. Graph-Adaptive Incremental Learning Using an Ensemble of Gaussian Process Experts
- Author
- Konstantinos D. Polyzos, Georgios B. Giannakis, and Qin Lu
- Subjects
Computer science ,business.industry ,media_common.quotation_subject ,Node (networking) ,Network science ,Machine learning ,computer.software_genre ,symbols.namesake ,Kernel (statistics) ,Scalability ,symbols ,Graph (abstract data type) ,Artificial intelligence ,Uncertainty quantification ,Function (engineering) ,business ,Gaussian process ,computer ,media_common - Abstract
Graph-guided semi-supervised learning (SSL) is a major task emerging in a gamut of network science applications. However, most SSL approaches rely on deterministic similarity metrics for prediction, thus providing only point estimates of the sought function. To allow for uncertainty quantification, which is of utmost importance in safety-critical applications, this work tackles the SSL task in a Gaussian process (GP) based Bayesian framework to propagate the distribution of nonparametric function estimates. Specifically, an incremental learning scenario is considered, where prediction of the desired value of a new node per iteration is followed by processing the corresponding nodal observation. Capitalizing on random features for scalability, an ensemble of GP experts is employed, each associated with a unique kernel from a known dictionary, to choose the fitted kernel combination in a graph- and data-adaptive fashion, thus bypassing the need for offline model training. Experiments with synthetic and real data showcase the merits of the proposed approach.
- Published
- 2021
15. Interest Rate Derivatives Modeling and Risk Management in the HJM Framework
- Author
- Qin Lu and Donald R. Chambers
- Subjects
Heath–Jarrow–Morton framework ,Actuarial science ,Interest rate derivative ,business.industry ,Computer science ,business ,Risk management
- Published
- 2021
16. Metaphor Detection: Leveraging Culturally Grounded Eventive Information
- Author
- Yunfei Long, Qin Lu, I-Hsuan Chen, and Chu-Ren Huang
- Subjects
General Computer Science ,Computer science ,Metaphor ,media_common.quotation_subject ,General Engineering ,eventive information ,Ontology (information science) ,Linguistics ,Writing system ,Chinese radicals ,writing system ,General Materials Science ,ontology ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Metaphor detection ,lcsh:TK1-9971 ,media_common - Abstract
Metaphors are compact packages of information with rich cultural background information. As one of the most powerful linguistic forms with non-literal meaning, metaphor detection in natural language processing can be both challenging and rewarding. We propose an innovative method for metaphor detection and classification leveraging culturally grounded eventive information. This culturally grounded information is organized based on ontological structure, which in turn facilitates further semantic processing of the result of our classification. As a culturally bound ontological system, the Chinese writing system has basic concepts organized according to semantic radicals, which are symbols containing rich eventive information that represent categorical concepts. This paper illustrates the basic design principles of applying ontological structures in metaphor detection by taking into account radicals representing body parts, instruments, materials, and movements. Our approach to leverage the eventive information of the Chinese writing system in metaphor detection is based on the fact that such information is available as an integral part of the writing system of any text. We hypothesize that eventive information can be accessed through the “embodied” source domain information represented by the radicals without syntactic processing or annotation. In terms of the theory of metaphor, we further hypothesize that eventive types in the embodied source domain maps to, and hence can help to predict, eventive meaning in the target domain of metaphor. Our studies show that the event information encoded in lexical items can facilitate classification of metaphoric events and identification of metaphors in Chinese texts effectively. 
We achieved improvements in Chinese metaphor detection over state-of-the-art approaches in our first classification experiment, and our proposed approach is shown to be generalizable in a second experiment involving new sets of characters with the same radicals.
- Published
- 2019
17. A CRISPR-Cas autocatalysis-driven feedback amplification network for supersensitive DNA diagnostics
- Author
- Kai Shi, Shiyi Xie, Chunyang Lei, Zhou Nie, Denghui Gao, Renyun Tian, Qin Lu, Shuo Wang, and Haizhen Zhu
- Subjects
Computer science ,Biosensing Techniques ,Computational biology ,medicine.disease_cause ,Feedback ,Autocatalysis ,chemistry.chemical_compound ,Nucleic Acids ,medicine ,Humans ,CRISPR ,Research Articles ,Mutation ,Multidisciplinary ,SciAdv r-articles ,DNA ,Molecular diagnostics ,Chemistry ,genomic DNA ,chemistry ,Nucleic acid ,Synthetic Biology ,CRISPR-Cas Systems ,Function (biology) ,Research Article - Abstract
A CRISPR-Cas–driven positive feedback circuit with exponential dynamic enables ultrasensitive detection of genomic DNA. Artificial nucleic acid circuits with precisely controllable dynamic and function have shown great promise in biosensing, but their utility in molecular diagnostics is still restrained by the inability to process genomic DNA directly and moderate sensitivity. To address this limitation, we present a CRISPR-Cas–powered catalytic nucleic acid circuit, namely, CRISPR-Cas–only amplification network (CONAN), for isothermally amplified detection of genomic DNA. By integrating the stringent target recognition, helicase activity, and trans-cleavage activity of Cas12a, a Cas12a autocatalysis-driven artificial reaction network is programmed to construct a positive feedback circuit with exponential dynamic in CONAN. Consequently, CONAN achieves one-enzyme, one-step, real-time detection of genomic DNA with attomolar sensitivity. Moreover, CONAN increases the intrinsic single-base specificity of Cas12a, and enables the effective detection of hepatitis B virus infection and human bladder cancer–associated single-nucleotide mutation in clinical samples, highlighting its potential as a powerful tool for disease diagnostics.
- Published
- 2021
18. PolyU CBS-Comp at SemEval-2021 Task 1: Lexical Complexity Prediction (LCP)
- Author
- Chu-Ren Huang, Rong Xiang, Wenjie Li, Qin Lu, Emmanuele Chersoni, and Jinghang Gu
- Subjects
business.industry ,Computer science ,Context (language use) ,Gradient boosting ,Artificial intelligence ,business ,computer.software_genre ,computer ,Sentence ,Word (computer architecture) ,Natural language processing ,SemEval ,Task (project management) - Abstract
In this contribution, we describe the system presented by the PolyU CBS-Comp Team at the Task 1 of SemEval 2021, where the goal was the estimation of the complexity of words in a given sentence context. Our top system, based on a combination of lexical, syntactic, word embeddings and Transformers-derived features and on a Gradient Boosting Regressor, achieves a top correlation score of 0.754 on the subtask 1 for single words and 0.659 on the subtask 2 for multiword expressions.
- Published
- 2021
19. Quantification of Fatigue Damage for Structural Details in Slender Coastal Bridges Using Machine Learning-Based Methods
- Author
- Qin Lu, Jin Zhu, and Wei Zhang
- Subjects
Stress (mechanics) ,Computer science ,business.industry ,021105 building & construction ,0211 other engineering and technologies ,020101 civil engineering ,Fatigue damage ,02 engineering and technology ,Building and Construction ,Structural engineering ,business ,0201 civil engineering ,Civil and Structural Engineering - Abstract
Exposed to the challenging coastal environment, slender bridges could experience significant dynamic responses and complex stress states resulting from the coupled dynamic impacts of wind,...
- Published
- 2020
20. Semi-Supervised Learning of Processes Over Multi-Relational Graphs
- Author
- Qin Lu, Vassilis N. Ioannidis, and Georgios B. Giannakis
- Subjects
Theoretical computer science ,Computer science ,0202 electrical engineering, electronic engineering, information engineering ,Probabilistic logic ,020206 networking & telecommunications ,Network science ,02 engineering and technology ,Semi-supervised learning ,Graph - Abstract
Semi-supervised learning (SSL) of dynamic processes over graphs is encountered in several applications of network science. Most of the existing approaches are unable to handle graphs with multiple relations, which arise in various real-world networks. This work deals with SSL of dynamic processes over multi-relational graphs (MRGs). Towards this end, a structured dynamical model is introduced to capture the spatio-temporal nature of dynamic graph processes, and incorporate contributions from multiple relations of the graph in a probabilistic fashion. Given nodal samples over a subset of nodes and the MRG, the expectation-maximization (EM) algorithm is adapted to extrapolate nodal features over unobserved nodes, and infer the contributions from the multiple relations in the MRG simultaneously. Experiments with real data showcase the merits of the proposed approach.
- Published
- 2020
21. Lexical Data Augmentation for Text Classification in Deep Learning
- Author
- Rong Xiang, Chu-Ren Huang, Qin Lu, Emmanuele Chersoni, and Yunfei Long
- Subjects
Computer science ,business.industry ,Deep learning ,05 social sciences ,Substitution (logic) ,02 engineering and technology ,Accuracy improvement ,Machine learning ,computer.software_genre ,Variety (cybernetics) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,0509 other social sciences ,050904 information & library sciences ,business ,computer - Abstract
This paper presents our work on using part-of-speech focused lexical substitution for data augmentation (PLSDA) to enhance the prediction capabilities and the performance of deep learning models. This paper explains how PLSDA uses part-of-speech information to identify words and make use of different augmentation strategies to find semantically related substitutions to generate new instances for training. Evaluation of PLSDA is conducted on a variety of datasets across different text classification tasks. When PLSDA is applied to four deep learning models, results show that classifiers trained with PLSDA achieve 1.3% accuracy improvement on average.
- Published
- 2020
22. Risk perception and the warning strategy based on microscopic driving state
- Author
- Chao Li, Qian Li, Jun Bi, Dong-Fan Xie, Xiao-Mei Zhao, and Rong-Qin Lu
- Subjects
Adult ,Male ,Risk ,Automobile Driving ,Computer science ,Process (engineering) ,Acceleration ,Human Factors and Ergonomics ,Models, Biological ,0502 economics and business ,Humans ,0501 psychology and cognitive sciences ,Safety, Risk, Reliability and Quality ,050107 human factors ,Behavior ,050210 logistics & transportation ,05 social sciences ,Accidents, Traffic ,Public Health, Environmental and Occupational Health ,Risk perception ,Risk analysis (engineering) ,Female ,Perception ,State (computer science) ,Safety ,human activities - Abstract
The paper aimed to explore the relationship between risks and individuals’ driving states and then design an efficient method to help drivers avoid high risks. The relationship between risks and individuals’ driving states was studied in depth first. Microscopic driving states were categorized into different clusters, and it was found that the risks are distinct in different clusters and that a specific driver might experience different risks in the car-following process. Then, according to these findings, a risk warning strategy was designed to help drivers avoid high risks. The risk warning is active when the risk is higher than its threshold. The Helly models were used to mimic the drivers’ reaction in order to study the influence of the warning strategy. Simulation results showed that, when the risk warning is taken into account, the spacing clearly increases, the oscillations of velocity and acceleration shrink significantly, and risks in the driving process are damped. This is because drivers can perceive high risks during the driving process and then appropriately change their car-following decisions to avoid them. These findings are helpful for improving driving behaviors and promoting traffic safety.
- Published
- 2018
23. MTTFsite : cross-cell-type TF binding site prediction by using multi-task learning
- Author
-
Yunfei Long, Qin Lu, Hongpeng Wang, Lin Gui, Jiyun Zhou, and Ruifeng Xu
- Subjects
Statistics and Probability ,Cell type ,Computer science ,Multi-task learning ,Gene Expression ,Biochemistry ,03 medical and health sciences ,0302 clinical medicine ,Text mining ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Binding Sites ,business.industry ,Supervised learning ,Pattern recognition ,Expression (computer science) ,Genome Analysis ,Original Papers ,Computer Science Applications ,DNA binding site ,Computational Mathematics ,Computational Theory and Mathematics ,Artificial intelligence ,business ,030217 neurology & neurosurgery ,Protein Binding ,Transcription Factors - Abstract
Motivation The prediction of transcription factor binding sites (TFBSs) is crucial for gene expression analysis. Supervised learning approaches for TFBS predictions require large amounts of labeled data. However, many TFs of certain cell types either do not have sufficient labeled data or do not have any labeled data. Results In this paper, a multi-task learning framework (called MTTFsite) is proposed to address the lack of labeled data by leveraging labeled data available across cell types. The proposed MTTFsite contains a shared CNN to learn common features for all cell types and a private CNN for each cell type to learn private features. The common features aim to help predict TFBSs for all cell types, especially those that lack labeled data. MTTFsite is evaluated on 241 cell type TF pairs and compared with a baseline method that does not use any multi-task learning model and a fully shared multi-task model that uses only a shared CNN and does not use private CNNs. For cell types with insufficient labeled data, results show that MTTFsite performs better than the baseline method and the fully shared model on more than 89% of pairs. For cell types without any labeled data, MTTFsite outperforms the baseline method and the fully shared model on more than 80% and 93% of pairs, respectively. A novel gene expression prediction method (called TFChrome) using both MTTFsite and histone modification features is also presented. Results show that TFBSs predicted by MTTFsite alone can achieve good performance. When MTTFsite is combined with histone modification features, a significant 5.7% performance improvement is obtained. Availability and implementation The resource and executable code are freely available at http://hlt.hitsz.edu.cn/MTTFsite/ and http://www.hitsz-hlt.com:8080/MTTFsite/. Supplementary information Supplementary data are available at Bioinformatics online.
- Published
- 2019
24. A Multi-Task Service Recommendation Model Considering Dynamic and Static QoS
- Author
-
Liang Xinmei, Mingyu Li, Qin Lu, and Mingge Zhang
- Subjects
Service (business) ,Structure (mathematical logic) ,Computer science ,business.industry ,Quality of service ,Context (computing) ,The Internet ,Data mining ,Web service ,computer.software_genre ,business ,computer ,Task (project management) - Abstract
Recommending the best Web service for users is a necessary task in the internet environment. Many of the recommendation methods proposed so far have yielded good results, and methods based on Quality of Service (QoS) continue to emerge. However, most service recommendation methods consider static or dynamic QoS separately and do not fully consider the impact of their combination. In this paper, we propose a multi-task service recommendation model that not only models high-order and low-order features simultaneously but also considers the context information generated by the user invoking the service. Our model integrates Factorization Machine (FM) and Bi-directional Long Short-Term Memory (Bi-LSTM) into a deep neural network structure, leveraging their feature combination and deep mining capabilities. Furthermore, we use a pair of attention mechanisms to focus the task model on finding useful information related to the current output in the service data to improve the results of service recommendations. We conducted extensive experiments on real QoS data sets, and the results show that, compared with other mainstream recommendation methods, the recommendation performance of this method is greatly improved.
- Published
- 2019
25. Collaborative Filtering Algorithm Based on Rating Prediction and User Characteristics
- Author
-
Zhihao Zhang, Na Song, and Qin Lu
- Subjects
Computer science ,media_common.quotation_subject ,020208 electrical & electronic engineering ,02 engineering and technology ,Recommender system ,MovieLens ,k-nearest neighbors algorithm ,Similarity (network science) ,0202 electrical engineering, electronic engineering, information engineering ,Collaborative filtering ,Key (cryptography) ,020201 artificial intelligence & image processing ,Quality (business) ,Algorithm ,media_common - Abstract
Collaborative filtering directly predicts a user's potential favorite items based on the user's behavior records. It is one of the key technologies in personalized recommendation systems. The traditional similarity measurement method relies on users' rating data, so when the data is sparse the recommendation quality of the system decreases. To solve this problem, this paper proposes a collaborative filtering algorithm based on item rating prediction and user characteristics. The first step is to select the k nearest neighbor sets of the item using the KNN algorithm, then calculate the similarity between the items using the improved similarity measurement method, and initially predict the user's rating on the unrated item to alleviate the sparsity problem. The second step considers the user characteristics when predicting the similarity between users according to the item ratings. Finally, the algorithm combining item-based rating prediction and user characteristics is adopted to make recommendations for the user. The experimental results on the MovieLens and Douban datasets show that the proposed collaborative filtering algorithm based on rating prediction and user characteristics can effectively improve the quality of the recommendation system compared with the traditional algorithm.
- Published
- 2019
26. GTX.Digest.VCF: an online NGS data interpretation system based on intelligent gene ranking and large-scale text mining
- Author
-
Zhang Shaowei, Hua Wang, Zhuo Song, Yanwei Xi, Chengkun Wu, Yanhuang Jiang, Yanghui Zhang, Yu Shuojun, Qin Lu, and Lei Peng
- Subjects
0301 basic medicine ,lcsh:Internal medicine ,lcsh:QH426-470 ,Text mining ,Computer science ,NGS data interpretation ,Genomics ,Field (computer science) ,Ranking (information retrieval) ,03 medical and health sciences ,Annotation ,0302 clinical medicine ,Genetics ,Data Mining ,lcsh:RC31-1245 ,Distributed parallel computing ,Genetics (clinical) ,Information retrieval ,business.industry ,High-Throughput Nucleotide Sequencing ,Biomedical text mining ,Neural network ,lcsh:Genetics ,030104 developmental biology ,Workflow ,Analytics ,Data Interpretation, Statistical ,Gene prioritization ,Neural Networks, Computer ,business ,Software ,030217 neurology & neurosurgery - Abstract
Background An important task in the interpretation of sequencing data is to highlight pathogenic genes (or detrimental variants) in the field of Mendelian diseases. It is still challenging despite the recent rapid development of genomics and bioinformatics. A typical interpretation workflow includes annotation, filtration, manual inspection and literature review. Those steps are time-consuming and error-prone in the absence of systematic support. Therefore, we developed GTX.Digest.VCF, an online DNA sequencing interpretation system, which prioritizes genes and variants for novel disease-gene relation discovery and integrates text mining results to provide literature evidence for the discovery. Its phenotype-driven ranking and biological data mining approach significantly speeds up the whole interpretation process. Results The GTX.Digest.VCF system is freely available as a web portal at http://vcf.gtxlab.com for academic research. Evaluation on the DDD project dataset demonstrates an accuracy of 77% (235 out of 305 cases) for top-50 genes and an accuracy of 41.6% (127 out of 305 cases) for top-5 genes. Conclusions GTX.Digest.VCF provides an intelligent web portal for genomics data interpretation via the integration of bioinformatics tools, distributed parallel computing, and biomedical text mining. It can facilitate the application of genomic analytics in clinical research and practice.
- Published
- 2019
27. Research on Web Service Selection Based on Improved Skyline Algorithm
- Author
-
Liang Xinmei, Mingyu Li, and Qin Lu
- Subjects
Skyline ,Online chat ,Point (typography) ,Service set ,Computer science ,business.industry ,Quality of service ,02 engineering and technology ,Space (commercial competition) ,computer.software_genre ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,The Internet ,Web service ,business ,computer ,Algorithm - Abstract
With the rapid development of Internet technology, the Internet has entered people's daily lives, and people can do many things through it, such as online chat, online viewing of materials, online shopping, and sending and receiving mail. The Internet has brought great convenience to our lives while also promoting social and economic development. Due to the wide application of the Internet, a large amount of data is generated on the network. There are more and more web services with different functional and non-functional attributes, and it is increasingly difficult for people to choose among them. The traditional web service selection method faces great challenges when dealing with massive data. How to extract valuable services from massive web data has become an urgent problem. The traditional service selection method compares the QoS (quality of service) attributes of services to select the services with the best attributes. This method is time consuming. Therefore, this paper uses the database query technology Skyline to select services and extract the SP (Skyline point) services among the potential Web services. The Skyline algorithm is further improved by dividing the entire service set into regions. The improved Skyline algorithm can effectively filter out dominance checks among regions that have no dominance relationship, which saves memory space and greatly improves execution efficiency. Finally, the high accuracy and efficiency of the improved Skyline algorithm are verified on simulation data and a real data set.
- Published
- 2019
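The Skyline idea in the abstract above can be sketched in a few lines. This is a naive quadratic skyline, not the region-partitioned variant the paper proposes; the attribute tuples and the "lower is better" convention are assumptions made only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if service a dominates service b.

    Assumption: every QoS attribute is "lower is better"
    (e.g. response time, cost).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(services: List[Sequence[float]]) -> List[Sequence[float]]:
    """Naive O(n^2) skyline: keep services that no other service dominates."""
    return [s for s in services if not any(dominates(t, s) for t in services)]


# Hypothetical (response_time_ms, cost) pairs, not data from the paper.
services = [(120, 5.0), (90, 7.0), (150, 4.0), (130, 6.0), (90, 8.0)]
result = skyline(services)
```

Here `(130, 6.0)` is dominated by `(120, 5.0)` and `(90, 8.0)` by `(90, 7.0)`, so three skyline points remain. A region-based improvement would aim to skip many of the pairwise `dominates` calls.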
28. Collaborative filtering algorithm based on user interest change
- Author
-
Qin Lu and Na Song
- Subjects
Measurement method ,Service (systems architecture) ,Computer science ,Improved algorithm ,02 engineering and technology ,Pearson product-moment correlation coefficient ,MovieLens ,symbols.namesake ,Similarity (network science) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,Collaborative filtering ,020201 artificial intelligence & image processing ,Penalty method ,Algorithm - Abstract
The collaborative filtering algorithm can provide personalized service recommendations based on the user's personal interests. Because traditional similarity measurement methods do not account for changes in user interest, this paper proposes a collaborative filtering algorithm based on user interest changes. When calculating the user similarity, a time penalty function is added to the traditional Pearson correlation coefficient, and then score prediction is used to obtain the target user's rating data to make recommendations for the user. The improved algorithm can adapt to changes in user interest over time. The experimental results on the MovieLens dataset show that, compared with the traditional algorithm, the proposed algorithm can make recommendations for users more accurately.
- Published
- 2019
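A minimal sketch of a time-penalized Pearson similarity, in the spirit of the abstract above. The abstract does not state the penalty function, so the exponential decay on the gap between two users' rating times, the `decay` parameter, and the `(score, day)` rating format are all assumptions.

```python
import math
from typing import Dict, Tuple

Rating = Tuple[float, float]  # (score, timestamp in days); hypothetical format


def time_weighted_pearson(u: Dict[str, Rating],
                          v: Dict[str, Rating],
                          decay: float = 0.01) -> float:
    """Pearson correlation over co-rated items, with each item's contribution
    damped by exp(-decay * |t_u - t_v|) (an assumed penalty).
    With decay = 0 this reduces to the ordinary Pearson coefficient."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i][0] for i in common) / len(common)
    mean_v = sum(v[i][0] for i in common) / len(common)
    num = den_u = den_v = 0.0
    for i in common:
        weight = math.exp(-decay * abs(u[i][1] - v[i][1]))
        du, dv = u[i][0] - mean_u, v[i][0] - mean_v
        num += weight * du * dv
        den_u += du * du
        den_v += dv * dv
    if den_u == 0.0 or den_v == 0.0:
        return 0.0
    return num / math.sqrt(den_u * den_v)


# Toy users: same taste, but v rated items "b" and "c" much later than u.
u = {"a": (5, 0), "b": (3, 10), "c": (1, 20)}
v = {"a": (4, 0), "b": (3, 200), "c": (2, 400)}
plain = time_weighted_pearson(u, v, decay=0.0)
penalized = time_weighted_pearson(u, v, decay=0.01)
```

On this toy data the unpenalized similarity is 1, while the penalized value is lower because the agreement on item `"c"` is 380 days apart.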
29. Pseudo-random Number Sequence Generator Based on Chaotic Logistic-Tent System
- Author
-
Congxu Zhu, Shuai Li, and Qin Lu
- Subjects
Nonlinear Sciences::Chaotic Dynamics ,Pseudorandom number generator ,Sequence ,Computer science ,Histogram ,Chaotic ,NIST ,Randomness tests ,Algorithm ,Randomness ,Generator (mathematics) - Abstract
In this paper, a new scheme for designing a pseudo-random number generator (PRNG) based on a 1D discrete chaotic system is presented. Firstly, a new 1D compound discrete chaotic system, the Logistic-Tent map, is proposed. The Logistic-Tent system exhibits chaotic behavior over a wider range of parameters. Moreover, the chaotic array produced with this Logistic-Tent map has higher randomness. Secondly, a pseudo-random number generating algorithm is proposed. The pseudo-random numbers generated by this algorithm are uniformly distributed and can successfully pass the randomness tests of the NIST SP800-22 software package. Furthermore, a histogram test, information entropy analysis, and a sensitivity test confirmed that the proposed approach can generate pseudo-random number sequences with good cryptographic performance.
- Published
- 2019
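To make the idea of a chaotic-map bit generator concrete, here is a small sketch. The abstract does not give the map's formula, so the code uses one Logistic-Tent combination that appears in the chaos literature; the paper's map may differ. The thresholding rule, the burn-in length, and the parameter values are also assumptions, and the output is illustrative only, not a vetted cryptographic generator.

```python
from typing import List


def logistic_tent_step(x: float, r: float) -> float:
    """One iteration of an assumed Logistic-Tent combination on [0, 1),
    with control parameter r in (0, 4)."""
    logistic = r * x * (1.0 - x)
    if x < 0.5:
        return (logistic + (4.0 - r) * x / 2.0) % 1.0
    return (logistic + (4.0 - r) * (1.0 - x) / 2.0) % 1.0


def generate_bits(seed: float, r: float, n: int, burn_in: int = 100) -> List[int]:
    """Iterate the map and emit one bit per step by thresholding at 0.5
    (a simple, assumed extraction rule)."""
    x = seed
    for _ in range(burn_in):  # discard the transient
        x = logistic_tent_step(x, r)
    bits = []
    for _ in range(n):
        x = logistic_tent_step(x, r)
        bits.append(1 if x >= 0.5 else 0)
    return bits


bits = generate_bits(seed=0.3141, r=3.7, n=64)
```

The seed and `r` play the role of the key: the same pair always reproduces the same sequence. A sequence like this would still have to be run through a test suite such as NIST SP800-22 before any randomness claim could be made.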
30. Learning Graph Processes with Multiple Dynamical Models
- Author
-
Georgios B. Giannakis, Vassilis N. Ioannidis, Qin Lu, and Mario Coutino
- Subjects
Theoretical computer science ,Computer science ,Bayesian probability ,0202 electrical engineering, electronic engineering, information engineering ,Inference ,Graph (abstract data type) ,020206 networking & telecommunications ,02 engineering and technology ,Graph - Abstract
Network-science related applications frequently deal with inference of spatio-temporal processes. Such inference tasks can be aided by a graph whose topology contributes to the underlying spatio-temporal dependencies. Contemporary approaches extrapolate dynamic processes relying on a fixed dynamical model that is not adaptive to changes in the dynamics. Alleviating this limitation, the present work adopts a candidate set of graph-adaptive dynamical models with one active at any given time. Given partially observed nodal samples, a scalable Bayesian tracker is leveraged to infer the graph processes and learn the active dynamical model simultaneously in a data-driven fashion. The resulting algorithm is termed graph-adaptive interacting multiple dynamical models (Grad-IMDM). Numerical tests with synthetic and real data corroborate that the proposed Grad-IMDM is capable of tracking the graph processes and adapting to the dynamical model that best fits the data.
- Published
- 2019
31. Phrase embedding learning based on external and internal context with compositionality constraint
- Author
-
Dan Xiong, Minglei Li, Qin Lu, and Yunfei Long
- Subjects
Information Systems and Management ,Phrase ,Word embedding ,Principle of compositionality ,Computer science ,business.industry ,Context (language use) ,02 engineering and technology ,computer.software_genre ,Management Information Systems ,Constraint (information theory) ,Artificial Intelligence ,020204 information systems ,Face (geometry) ,Component (UML) ,0202 electrical engineering, electronic engineering, information engineering ,Embedding ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Software ,Natural language processing - Abstract
Different methods have been proposed to learn phrase embedding, which can be mainly divided into two strands. The first strand is based on the distributional hypothesis: it treats a phrase as one non-divisible unit and learns phrase embedding from its external context, similar to learning word embedding. However, distributional methods cannot make use of the information embedded in component words, and they also face the data sparseness problem. The second strand is based on the principle of compositionality and infers phrase embedding from the embedding of its component words. Compositional methods give erroneous results if a phrase is non-compositional. In this paper, we propose a hybrid method that linearly combines the distributional component and the compositional component under an individualized phrase compositionality constraint. The phrase compositionality is automatically computed based on the distributional embedding of the phrase and its component words. Evaluation on five phrase-level semantic tasks shows that our proposed method has the best overall performance. Most importantly, our method is more robust as it is less sensitive to datasets.
- Published
- 2018
32. Tracking Initially Unresolved Thrusting Objects Using an Optical Sensor
- Author
-
Karl Granstrom, Benny Milgrom, Peter Willett, Qin Lu, Yaakov Bar-Shalom, and Ronen Ben-Dov
- Subjects
020301 aerospace & aeronautics ,business.industry ,Estimation theory ,Computer science ,Aerospace Engineering ,020206 networking & telecommunications ,02 engineering and technology ,Filter (signal processing) ,Tracking (particle physics) ,Acceleration ,Cardinal point ,0203 mechanical engineering ,0202 electrical engineering, electronic engineering, information engineering ,Trajectory ,Point (geometry) ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
This paper considers the problem of estimating the three-dimensional states of a salvo of thrusting/ballistic endo-atmospheric objects using two-dimensional (2-D) Cartesian measurements from the focal plane array (FPA) of a single fixed optical sensor. Since the initial separations in the FP are smaller than the resolution of the sensor, there are merged FP measurements, compounding the usual false-alarm and missed-detection uncertainty. We present a two-step methodology. First, we assume a Wiener process acceleration model for the motion of the images of the objects in the optical sensor's FPA. We model the merged measurements with increased variance, and thence employ a multi-Bernoulli (MB) filter using the 2-D measurements in the FPA. Second, using the set of associated measurements for each confirmed MB track, we formulate a parameter estimation problem, whose maximum likelihood solution can be obtained via numerical search and can be used for impact point prediction. Simulation results illustrate the performance of the proposed method.
- Published
- 2018
33. Learning Heterogeneous Network Embedding From Text and Links
- Author
-
Dan Xiong, Chu-Ren Huang, Yunfei Long, Qin Lu, Chenglin Bi, Rong Xiang, and Minglei Li
- Subjects
Theoretical computer science ,Network embedding ,General Computer Science ,Computer science ,Node (networking) ,heterogeneous network ,General Engineering ,Context (language use) ,02 engineering and technology ,Recurrent neural network ,Text processing ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Embedding ,020201 artificial intelligence & image processing ,General Materials Science ,text processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Representation (mathematics) ,attention mechanism ,lcsh:TK1-9971 ,Heterogeneous network - Abstract
Finding methods to represent multiple types of nodes in heterogeneous networks is both challenging and rewarding, as there is much less work in this area compared with that of homogeneous networks. In this paper, we propose a novel approach to learn node embedding for heterogeneous networks through a joint learning framework of both network links and text associated with nodes. A novel attention mechanism is also used to make good use of text extended through links to obtain a much larger network context. Link embedding is first learned through a random-walk-based method to process multiple types of links. Text embedding is separately learned at both sentence level and document level to capture salient semantic information more comprehensively. Then, both types of embeddings are jointly fed into a hierarchical neural network model to learn node representation through mutual enhancement. The attention mechanism follows linked edges to obtain the context of adjacent nodes and so extend the context for node representation. The evaluation on a link prediction task in a heterogeneous network data set shows that our method outperforms the current state-of-the-art method by 2.5%-5.0% in AUC values with a p-value less than 10^-9, indicating very significant improvement.
- Published
- 2018
34. An ensemble approach for emotion cause detection with event extraction and multi-kernel SVMs
- Author
-
Dongyin Wu, Qin Lu, Ruifeng Xu, Lin Gui, and Jiannan Hu
- Subjects
Multidisciplinary ,Event (computing) ,business.industry ,Computer science ,Emotion classification ,02 engineering and technology ,010502 geochemistry & geophysics ,Machine learning ,computer.software_genre ,01 natural sciences ,Task (project management) ,Support vector machine ,Tree structure ,Component (UML) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,Representation (mathematics) ,business ,computer ,Emotion Markup Language ,0105 earth and related environmental sciences - Abstract
In this paper, we present a new challenging task for emotion analysis, namely emotion cause extraction. In this task, we focus on the detection of the emotion cause, a.k.a. the reason or the stimulant of an emotion, rather than the regular emotion classification or emotion component extraction. Since there is no open dataset available for this task, we first designed and annotated an emotion cause dataset which follows the scheme of the W3C Emotion Markup Language. We then present an emotion cause detection method using an event extraction framework, where a tree structure-based representation method is used to represent the events. Since the distribution of events is imbalanced in the training data, we propose an under-sampling-based bagging algorithm to solve this problem. Even with a limited training set, the proposed approach may still extract sufficient features for analysis by bagging multi-kernel SVMs. Evaluations show that our approach achieves an F-measure 7.04% higher than the state-of-the-art methods.
- Published
- 2017
35. Inferring Affective Meanings of Words from Word Embedding
- Author
-
Qin Lu, Minglei Li, Yunfei Long, and Lin Gui
- Subjects
Word embedding ,Computer science ,business.industry ,Sentiment analysis ,Context (language use) ,02 engineering and technology ,Lexicon ,computer.software_genre ,Machine learning ,Semantics ,Human-Computer Interaction ,Support vector machine ,ComputingMethodologies_PATTERNRECOGNITION ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Affective computing ,computer ,Software ,Natural language processing - Abstract
The affective lexicon is one of the most important resources in affective computing for text. Manually constructed affective lexicons have limited scale and thus only have limited use in practical systems. In this work, we propose a regression-based method to automatically infer multi-dimensional affective representations of words via their word embedding based on a set of seed words. This method can make use of the rich semantic meanings obtained from word embedding to extract meanings in some specific semantic space. This is based on the assumption that different features in word embedding contribute differently to a particular affective dimension and a particular feature in word embedding contributes differently to different affective dimensions. Evaluation on various affective lexicons shows that our method outperforms the state-of-the-art methods on all the lexicons under different evaluation metrics by large margins. We also explore different regression models and conclude that the Ridge regression model, the Bayesian Ridge regression model and Support Vector Regression with a linear kernel are the most suitable models. Compared to other state-of-the-art methods, our method also has a computational advantage. Experiments on a sentiment analysis task show that the lexicons extended by our method achieve better results than publicly available sentiment lexicons on eight sentiment corpora. The extended lexicons are publicly available for access.
- Published
- 2017
36. Affective awareness in neural sentiment analysis
- Author
-
Jing Li, Jinghang Gu, Chu-Ren Huang, Rong Xiang, Wenjie Li, Mingyu Wan, and Qin Lu
- Subjects
Information Systems and Management ,Artificial neural network ,Computer science ,Sentiment analysis ,02 engineering and technology ,Collective emotions ,Lexicon ,Affect control theory ,Management Information Systems ,Artificial Intelligence ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Valence (psychology) ,Software ,Cognitive psychology - Abstract
Sentiment analysis helps give artificial intelligence systems the ability to understand human attitudes in text. In this area, text sentiment is usually signaled by a few indicative words that convey affective meanings and arouse readers’ collective emotions. However, most existing sentiment analysis models rely predominantly on neural network architectures trained end to end with limited awareness of affective knowledge and, as a result, often fail to pinpoint the essential features for sentiment prediction. In this work, we present a novel approach for sentiment analysis by fusing external affective knowledge into neural networks. The affective knowledge is distilled from two sentiment lexicons grounded in two psychological theories, namely the Affect Control Theory and word affections in terms of Valence, Arousal, and Dominance. To examine the effects of affective knowledge on sentiment analysis, we conduct cross-dataset and cross-model experiments along with a detailed ablation analysis. Results show that our proposed method outperforms popular neural networks on all five benchmarks with consistent and significant improvement (1.4% accuracy on average). Further discussions demonstrate that all affective attributes contribute positively to model enhancement and that our model is robust to changes in lexicon size.
- Published
- 2021
37. Parallel combinatory multicarrier modulation in underwater acoustic communications
- Author
-
Deqing Wang, Shengli Zhou, Xiaoyi Hu, and Qin Lu
- Subjects
Channel code ,Frequency-shift keying ,010505 oceanography ,business.industry ,Computer science ,Galois theory ,020206 networking & telecommunications ,02 engineering and technology ,Spectral efficiency ,Topology ,01 natural sciences ,Computer Science Applications ,Modulation ,0202 electrical engineering, electronic engineering, information engineering ,Binary code ,Electrical and Electronic Engineering ,Low-density parity-check code ,Telecommunications ,business ,Multipath propagation ,Underwater acoustic communication ,Computer Science::Information Theory ,0105 earth and related environmental sciences ,Communication channel - Abstract
Parallel combinatory multicarrier (PCMC) modulation is a generalisation of the legacy multicarrier frequency shift keying (FSK) scheme, where for each group of M subcarriers, more than one (say L) subcarrier is chosen for simultaneous transmission. The PCMC scheme provides an effective way to increase the spectral efficiency while maintaining non-coherent detection at the receiver. This study provides an in-depth analysis of the PCMC scheme with emphasis on how to couple it with binary or non-binary channel coding. One favourable system configuration is identified, having parameters (L, M) = (4, 8), which increases the spectral efficiency by 50% relative to the most efficient scheme from the FSK family. Coupled with non-binary low-density parity-check (LDPC) coding over the Galois field GF(64), the PCMC scheme with (L, M) = (4, 8) is shown to have robust performance in multipath fading channels.
- Published
- 2017
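The 50% figure in the abstract above can be reproduced with a short calculation. This is one plausible accounting, not necessarily the paper's own derivation: it assumes a group carries floor(log2 C(M, L)) bits and that spectral efficiency is compared as bits per subcarrier, with M-ary FSK treated as the L = 1 case.

```python
import math


def bits_per_group(M: int, L: int) -> int:
    """Whole bits carried by choosing L active subcarriers out of M,
    i.e. floor(log2(C(M, L))), computed exactly with integers."""
    return math.comb(M, L).bit_length() - 1


def efficiency(M: int, L: int) -> float:
    """Bits per subcarrier (an assumed proxy for spectral efficiency)."""
    return bits_per_group(M, L) / M


pcmc = efficiency(8, 4)                                # C(8,4) = 70 -> 6 bits / 8 subcarriers
best_fsk = max(efficiency(M, 1) for M in (2, 4, 8, 16))  # M-ary FSK as L = 1
gain = pcmc / best_fsk - 1.0
```

Under these assumptions the best FSK option carries 0.5 bits per subcarrier (M = 2 or M = 4), while (L, M) = (4, 8) carries 0.75, a gain of 0.5. The 6 bits per group also equal the size of one GF(64) symbol, since 2^6 = 64.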
38. Negative transfer detection in transductive transfer learning
- Author
-
Lin Gui, Yu Zhou, Qin Lu, Jiachen Du, and Ruifeng Xu
- Subjects
Rademacher distribution ,business.industry ,Computer science ,Noise reduction ,Negative transfer ,02 engineering and technology ,Machine learning ,computer.software_genre ,Reduction (complexity) ,Noise ,Artificial Intelligence ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Performance improvement ,Set (psychology) ,business ,Transfer of learning ,Algorithm ,computer ,Software - Abstract
Transfer learning methods have been widely used in machine learning when training data is limited. However, class noise accumulated during learning iterations can lead to negative transfer, which can adversely affect performance when more training data is used. In this paper, we propose a novel method to identify noise samples for noise reduction. More importantly, the method can detect the point where negative transfer happens so that transfer learning can terminate near the top performance point. In this method, we use the sum of the Rademacher distribution to estimate the class noise rate of transferred data. Transferred data with a high probability of being labeled wrongly is removed to reduce noise accumulation. This negative sample reduction process can be repeated several times during transfer learning until we find the point where negative transfer occurs. As we can detect this point, our method is able not only to delay the onset of negative transfer, but also to stop transfer learning algorithms at the right place for top performance gain. Evaluation on a cross-lingual/domain opinion analysis data set shows that our algorithm achieves the state-of-the-art result. Furthermore, our system shows a monotonically increasing trend in performance when more training data are used, beating the performance degradation curse of most transfer learning methods when training data reaches a certain size.
- Published
- 2017
39. Ensemble with Estimation: Seeking for Optimization in Class Noisy Data
- Author
-
Qin Lu, Binyang Li, Xizhao Wang, Zhiyuan Wen, Lin Gui, and Ruifeng Xu
- Subjects
Training set ,Computer science ,business.industry ,Complex system ,Computational intelligence ,Pattern recognition ,02 engineering and technology ,Ensemble learning ,GeneralLiterature_MISCELLANEOUS ,Artificial Intelligence ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,A priori and a posteriori ,Learning methods ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Noisy data ,Classifier (UML) ,Software - Abstract
Class noise, also known as mislabeled data in the training set, can lead to poor classification accuracy no matter what machine learning methods are used. A reasonable estimation of class noise has a significant impact on the performance of learning methods. However, error in existing estimation is theoretically inevitable and affects the performance of the optimal classifier trained on noisy data. Instead of seeking a single optimal classifier on noisy data, in this work we use a set of weak classifiers, which are caused by the negative impacts of noisy data, to learn a strong ensemble classifier based on the training error and the estimation of class noise. With this strategy, the proposed ensemble with estimation method overcomes the gap between the estimation and the true distribution of class noise. Our proposed method does not require any a priori knowledge about class noise. We prove that the optimal ensemble classifier on the noisy distribution can approximate the optimal classifier on the clean distribution when the training set grows. Comparisons with existing algorithms show that our methods outperform state-of-the-art approaches on a large number of benchmark datasets in different domains. Both the theoretical analysis and the experimental results reveal that our method can improve performance, works well on clean data, and is robust to the algorithm parameter.
- Published
- 2019
40. Quantification of Fatigue Damage of Structural Details in Slender Coastal Bridges Using Machine Learning Based Methods
- Author
-
Jin Zhu, Qin Lu, and Wei Zhang
- Subjects
Computer science ,business.industry ,Service life ,Fatigue damage ,Structural engineering ,Structural health monitoring ,business ,Fatigue limit - Published
- 2019
41. Prediction of TF-Binding Site by Inclusion of Higher Order Position Dependencies
- Author
-
Lin Gui, Hongpeng Wang, Qin Lu, Jiyun Zhou, and Ruifeng Xu
- Subjects
Dependency (UML) ,Binding Sites ,Models, Statistical ,Computer science ,Applied Mathematics ,Feature extraction ,Computational Biology ,Proteins ,DNA ,Mice ,Order (biology) ,Position (vector) ,Databases, Genetic ,Genetics ,Animals ,Humans ,TF binding ,Neural Networks, Computer ,Hidden Markov model ,Biological system ,Biotechnology ,Cellular biophysics ,Protein Binding ,Transcription Factors - Abstract
Most proposed methods for TF-binding site (TFBS) predictions only use low order dependencies for predictions due to the lack of efficient methods to extract higher order dependencies. In this work, we first propose a novel method to extract higher order dependencies by applying CNN on histone modification features. We then propose a novel TFBS prediction method, referred to as CNN_TF, by incorporating low order and higher order dependencies. CNN_TF is first evaluated on 13 TFs in the mES cell. Results show that using higher order dependencies outperforms low order dependencies significantly on 11 TFs. This indicates that higher order dependencies are indeed more effective for TFBS predictions than low order dependencies. Further experiments show that using both low order dependencies and higher order dependencies improves performance significantly on 12 TFs, indicating the two dependency types are complementary. To evaluate the influence of cell-types on prediction performances, CNN_TF was applied to five TFs in five cell-types of humans. Even though low order dependencies and higher order dependencies show different contributions in different cell-types, they are always complementary in predictions. When comparing to several state-of-the-art methods, CNN_TF outperforms them by at least 5.3 percent in AUPR.
- Published
- 2019
42. Water Wave Optimization for Flow-Shop Scheduling
- Author
-
Min-Xia Zhang, Yi-Chen Du, Jia-Yu Wu, Xue Wu, and Xue-Qin Lu
- Subjects
Mathematical optimization ,021103 operations research ,Optimization problem ,Job shop scheduling ,Computer science ,business.industry ,0211 other engineering and technologies ,Evolutionary algorithm ,02 engineering and technology ,Flow shop scheduling ,Range (mathematics) ,Metaheuristic algorithms ,0202 electrical engineering, electronic engineering, information engineering ,Combinatorial optimization ,020201 artificial intelligence & image processing ,Local search (optimization) ,business ,Metaheuristic - Abstract
Flow-shop scheduling problem (FSP) is a well-known combinatorial optimization problem which has a wide range of practical applications. However, FSP is known to be NP-hard when there are more than two machines, for which traditional exact algorithms can only solve small-size problem instances, and many metaheuristic algorithms are mostly suitable for solving large-size instances. Water wave optimization (WWO) is a novel metaheuristic evolutionary algorithm that draws inspiration from shallow water wave model for optimization problems. In this paper, we propose two WWO algorithms for FSP. The first algorithm adapts the original evolutionary operators of the basic WWO according to the solution space of FSP. The second algorithm further improves the first algorithm with a self-adaptive local search procedure. Experimental results on test instances show that the proposed strategies are effective for solving FSP, and the WWO algorithm with self-adaptive local search exhibits significant performance advantages over many other well-known metaheuristic algorithms.
- Published
- 2019
43. Improving Multi-label Emotion Classification by Integrating both General and Domain-specific Knowledge
- Author
-
Wenhao Ying, Qin Lu, and Rong Xiang
- Subjects
business.industry ,Computer science ,Deep learning ,Emotion classification ,Sentiment analysis ,010501 environmental sciences ,computer.software_genre ,01 natural sciences ,Domain (software engineering) ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Domain knowledge ,General knowledge ,Artificial intelligence ,Language model ,0305 other medical science ,business ,computer ,Natural language processing ,0105 earth and related environmental sciences - Abstract
Deep learning based general language models have achieved state-of-the-art results in many popular tasks such as sentiment analysis and QA tasks. Text in domains like social media has its own salient characteristics. Domain knowledge should be helpful in domain relevant tasks. In this work, we devise a simple method to obtain domain knowledge and further propose a method to integrate domain knowledge with general knowledge based on deep language models to improve performance of emotion classification. Experiments on Twitter data show that even though a deep language model fine-tuned by a target domain data has attained comparable results to that of previous state-of-the-art models, this fine-tuned model can still benefit from our extracted domain knowledge to obtain more improvement. This highlights the importance of making use of domain knowledge in domain-specific applications.
- Published
- 2019
44. Research on Web Service Selection Based on User Preference
- Author
-
Qin Lu and Maoying Wu
- Subjects
Relation (database) ,Computer science ,Cosine similarity ,Fuzzy number ,TOPSIS ,Data mining ,Web service ,User requirements document ,computer.software_genre ,computer ,Preference ,Weighting - Abstract
At present, weights are often used to express user preferences for QoS (Quality of Service) attributes. Because of the user’s subjective judgment and the fuzziness of preference descriptions, weights calculated through traditional weighting methods rarely express the user preference correctly. To address the fuzziness of QoS attribute preference descriptions and improve the correctness of service selection, an order relation analysis method (G1-method) improved with fuzzy numbers is first adopted to represent the subjective weight of the user; the entropy weight method is then adopted to determine the objective weight of each QoS attribute; finally, the objective weight is used to revise the subjective weight to calculate the comprehensive weight. Based on the user preference, the service is selected by a TOPSIS method improved with cosine similarity. Experiments show that the uncertainty of user preference descriptions is effectively handled, the accuracy of service selection is improved through the improved TOPSIS method, and the selected service is more in line with the user requirements.
- Published
- 2018
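The objective-weight step in the abstract above can be illustrated with the textbook entropy weight method. This is a minimal sketch, not the authors' implementation: the function names, the sample QoS matrix, and the multiplicative rule used to revise subjective weights are all assumptions made for illustration, and the fuzzy G1-method and the improved TOPSIS step are not reproduced here.

```python
import math

def entropy_weights(matrix):
    """Textbook entropy weight method.

    matrix: rows are candidate services, columns are QoS attributes,
    all values positive. Returns one objective weight per attribute.
    """
    m = len(matrix)
    n = len(matrix[0])
    diversities = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        # Normalized Shannon entropy of the column, in [0, 1].
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        # Clamp tiny negative values caused by floating-point rounding.
        diversities.append(max(0.0, 1.0 - entropy))
    total_diversity = sum(diversities)
    # Assumes at least one attribute actually varies between services.
    return [d / total_diversity for d in diversities]

def combine_weights(subjective, objective):
    """Hypothetical revision rule: normalized element-wise product."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]

# Hypothetical QoS matrix: availability, response time, cost.
qos = [[0.9, 100, 5], [0.9, 200, 5], [0.9, 400, 5]]
weights = entropy_weights(qos)
```

An attribute that takes the same value for every service carries no information for ranking, so its entropy is maximal and its objective weight is close to zero; in the sample matrix only the response-time column varies.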
45. Incorporating multi-kernel function and Internet verification for Chinese person name disambiguation
- Author
-
Shuai Wang, Lin Gui, Jian Xu, Qin Lu, and Ruifeng Xu
- Subjects
Information retrieval ,General Computer Science ,Computer science ,business.industry ,010102 general mathematics ,01 natural sciences ,Theoretical Computer Science ,Feature (linguistics) ,Entity linking ,Knowledge base ,String kernel ,Kernel (statistics) ,The Internet ,0101 mathematics ,Precision and recall ,business ,Word order - Abstract
The study on person name disambiguation aims to identify different entities with the same person name through document linking to different entities. The traditional disambiguation approach makes use of words in documents as features to distinguish different entities. Due to the lack of use of word order as a feature and the limited use of external knowledge, the traditional approach has performance limitations. This paper presents an approach for named entity disambiguation through entity linking based on a multikernel function and Internet verification to improve Chinese person name disambiguation. The proposed approach extends a linear kernel that uses in-document word features by adding a string kernel to construct a multi-kernel function. This multi-kernel can then calculate the similarities between an input document and the entity descriptions in a named person knowledge base to form a ranked list of candidates to different entities. Furthermore, Internet search results based on keywords extracted from the input document and entity descriptions in the knowledge base are used to train classifiers for verification. The evaluations on CIPS-SIGHAN 2012 person name disambiguation bakeoff dataset show that the use of word orders and Internet knowledge through a multi-kernel function can improve both precision and recall and our system has achieved state-of-the-art performance.
- Published
- 2016
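The idea of extending a word-feature linear kernel with an order-sensitive kernel, as described in the abstract above, can be sketched as a weighted sum of two normalized kernels. This is only an illustration under stated assumptions: the word-level n-gram spectrum kernel stands in for the paper's string kernel, and the function names, the mixing weight `alpha`, and the toy documents are hypothetical.

```python
from collections import Counter
from math import sqrt

def _dot(a, b):
    # Counter returns 0 for missing keys, so this is a sparse dot product.
    return sum(count * b[key] for key, count in a.items())

def _normalized(a, b):
    denom = sqrt(_dot(a, a) * _dot(b, b))
    return _dot(a, b) / denom if denom else 0.0

def linear_kernel(doc_a, doc_b):
    """Cosine-normalized bag-of-words kernel; ignores word order."""
    return _normalized(Counter(doc_a), Counter(doc_b))

def spectrum_kernel(doc_a, doc_b, n=2):
    """Cosine-normalized kernel over word n-grams; sensitive to word order."""
    def grams(doc):
        return Counter(tuple(doc[i:i + n]) for i in range(len(doc) - n + 1))
    return _normalized(grams(doc_a), grams(doc_b))

def multi_kernel(doc_a, doc_b, alpha=0.5, n=2):
    """Convex combination of the two kernels (alpha is an assumed weight)."""
    return (alpha * linear_kernel(doc_a, doc_b)
            + (1 - alpha) * spectrum_kernel(doc_a, doc_b, n))

# Same words, reversed order: identical bags of words, no shared bigrams.
doc_a = ["li", "wei", "teaches", "physics"]
doc_b = ["physics", "teaches", "wei", "li"]
similarity = multi_kernel(doc_a, doc_b)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined function can be used wherever a single kernel is expected. The toy pair shows why word order matters: the linear kernel alone would rate the two documents as identical.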
46. Method Development and Validation for Simultaneous Determination of 44Ca, 34S, 28Si and 18 Other Trace Elements in Pharmaceutical Packaging Materials’ Extractable Solutions by Inductively Coupled Plasma-Mass Spectrometry (ICP-MS)
- Author
-
Qin Lu, Dan Xie, and WeiChun Yang
- Subjects
Matrix (chemical analysis) ,Accuracy and precision ,business.industry ,Computer science ,Standard solution ,Operational costs ,Process engineering ,business ,Method development ,Pharmaceutical packaging ,Inductively coupled plasma mass spectrometry ,TRACE (psycholinguistics) - Abstract
ICH (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use) has issued and implemented the Q3D guideline on elemental impurities in final drug products. Consequently, it will be essential to monitor trace elements from packaging material to ensure the compliance of the final drug product. This study successfully developed a new method for simultaneous identification and quantification of 44Ca, 34S, 28Si and 18 trace elements (27Al, 51V, 52Cr, 55Mn, 56Fe, 58Ni, 59Co, 63Cu, 66Zn, 75As, 78Se, 95Mo, 111Cd, 118Sn, 121Sb, 137Ba, 201Hg and 208Pb) in pharmaceutical packaging materials’ extractable solutions by using ICP-MS in one single method without auxiliary. The method development focused on elemental mass selection, optimization of ICP-MS operational parameters, and the preparation of sample and standard solutions. Furthermore, the newly developed analytical method was successfully validated (accuracy and precision, standard and sample linearity, matrix specificity, and robustness) by following US and European compendia criteria. The success of the method development and validation illustrates that trace element analysis in the pharmaceutical industry is feasible with a single ICP-MS method. Analysis of trace elements via this newly developed ICP-MS method can provide valuable information for risk assessment of packaging systems and final drug products at relatively low operational cost.
- Published
- 2020
47. Dual memory network model for sentiment analysis of review text
- Author
-
Yunfei Long, Chu-Ren Huang, Ge Xu, Mingyu Derek Ma, Qin Lu, Jiaxing Shen, Rong Xiang, and Elvira Perez Vallejos
- Subjects
Information Systems and Management ,User profile ,business.industry ,Computer science ,Sentiment analysis ,02 engineering and technology ,Machine learning ,computer.software_genre ,Management Information Systems ,Dual (category theory) ,Artificial Intelligence ,Salient ,020204 information systems ,Product (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Software ,Network model - Abstract
In sentiment analysis of product reviews, both user and product information are proven to be useful. Current works handle user profile and product information in a unified model which may not be able to learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model to learn user profiles and product information for review classification using separate memory networks. Then, the two representations are used jointly for sentiment analysis. The use of separate models aims to capture user profiles and product information more effectively. Compared with state-of-the-art unified prediction models, evaluations on three benchmark datasets (IMDB, Yelp13, and Yelp14) show that our dual learning model gives performance gains of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also deemed very significant as measured by p-values.
- Published
- 2020
48. Dual Memory Network Model for Biased Product Review Classification
- Author
-
Qin Lu, Yunfei Long, Rong Xiang, Chu-Ren Huang, and Mingyu Ma
- Subjects
FOS: Computer and information sciences ,User profile ,Computer Science - Computation and Language ,business.industry ,Computer science ,Computer Science - Artificial Intelligence ,Work (physics) ,Sentiment analysis ,02 engineering and technology ,Machine learning ,computer.software_genre ,Dual (category theory) ,03 medical and health sciences ,Artificial Intelligence (cs.AI) ,0302 clinical medicine ,Product (mathematics) ,030221 ophthalmology & optometry ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,computer ,Network model - Abstract
In sentiment analysis (SA) of product reviews, both user and product information are proven to be useful. Current tasks handle user profile and product information in a unified model which may not be able to learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model to learn user profiles and product reviews using separate memory networks. Then, the two representations are used jointly for sentiment prediction. The use of separate models aims to capture user profiles and product information more effectively. Compared to state-of-the-art unified prediction models, the evaluations on three benchmark datasets, IMDB, Yelp13, and Yelp14, show that our dual learning model gives performance gain of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also deemed very significant measured by p-values. To appear in 2018 EMNLP 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.
- Published
- 2018
49. EL_LSTM: Prediction of DNA-Binding Residue from Protein Sequence by Combining Long Short-Term Memory and Ensemble Learning
- Author
-
Lin Gui, Hongpeng Wang, Jiyun Zhou, Qin Lu, and Ruifeng Xu
- Subjects
Computer science ,Feature vector ,0206 medical engineering ,Feature extraction ,02 engineering and technology ,Machine Learning ,Protein sequencing ,Genetics ,Databases, Protein ,Binding Sites ,Artificial neural network ,business.industry ,Applied Mathematics ,Computational Biology ,Pattern recognition ,DNA ,Ensemble learning ,Support vector machine ,DNA-Binding Proteins ,Pairwise comparison ,Artificial intelligence ,business ,Classifier (UML) ,020602 bioinformatics ,Algorithms ,Biotechnology - Abstract
Most past works for DNA-binding residue prediction did not consider the relationships between residues. In this paper, we propose a novel approach for DNA-binding residue prediction, referred to as EL_LSTM, which includes two main components. The first component is the Long Short-Term Memory (LSTM), which learns pairwise relationships between residues through a bi-gram model and then learns feature vectors for all residues. The second component is an ensemble learning based classifier introduced to tackle the data imbalance problem in binding residue predictions. We use a variant of the bagging strategy in ensemble learning to achieve balanced samples. Evaluations on PDNA-224 and DBP-123 show that adding feature relationships performs better than classifiers without feature relationships by at least 0.028 on MCC, 1.18 percent on ST and 0.012 on AUC. This indicates the usefulness of feature relationships for DNA-binding residue predictions. Evaluation on using ensemble learning indicates that the improvement can reach at least 0.021 on MCC, 1.32 percent on ST, and 0.018 on AUC compared to the use of a single LSTM classifier. Comparisons with the state-of-the-art predictors show that our proposed EL_LSTM outperforms them significantly. Further feature analysis validates the effectiveness of LSTM for the prediction of DNA-binding residues.
- Published
- 2018
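The abstract above mentions a bagging variant that produces balanced samples for the ensemble. One common way to do this, shown here purely as an assumption since the abstract does not specify the variant, is to give every bag all minority (binding) samples plus an equally sized random draw from the majority (non-binding) samples, and to combine the per-bag classifiers by majority vote. The function names and the label convention (1 for binding, 0 for non-binding) are hypothetical.

```python
import random

def balanced_bags(samples, labels, n_bags, seed=0):
    """Build class-balanced training bags by undersampling the majority class.

    Assumes label 1 marks the minority class and label 0 the majority class,
    and that the majority class is at least as large as the minority class.
    """
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == 1]
    majority = [s for s, y in zip(samples, labels) if y == 0]
    bags = []
    for _ in range(n_bags):
        # Each bag sees a different random slice of the majority class.
        drawn = rng.sample(majority, len(minority))
        bag = [(s, 1) for s in minority] + [(s, 0) for s in drawn]
        rng.shuffle(bag)
        bags.append(bag)
    return bags

def majority_vote(predictions):
    """Combine 0/1 predictions from the per-bag classifiers."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0

# Toy data: 4 binding residues among 20 (indices stand in for feature vectors).
residues = list(range(20))
is_binding = [1] * 4 + [0] * 16
bags = balanced_bags(residues, is_binding, n_bags=5)
```

One classifier would be trained per bag (an LSTM in the paper; any model works for this sketch), so that no single classifier is dominated by the majority class while the ensemble as a whole still sees most of the majority data.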
50. Analysis and Comparison of Potential Traffic Risks Based on Different Field Data
- Author
-
Qian Li, Dong-Fan Xie, Xiao-Mei Zhao, Rong-Qin Lu, and Rui Jiang
- Subjects
050210 logistics & transportation ,Computer science ,Field data ,0502 economics and business ,05 social sciences ,0501 psychology and cognitive sciences ,Data mining ,computer.software_genre ,computer ,050107 human factors - Published
- 2018
Discovery Service for Jio Institute Digital Library