Author: "Jianyong Duan" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Jianyong Duan"' showing total 27 results

1. Visual Clue Guidance and Consistency Matching Framework for Multimodal Named Entity Recognition

Author: Li He, Qingxiang Wang, Jie Liu, Jianyong Duan, and Hao Wang
Subjects: multimodal named entity recognition, contrastive learning, feature pyramid, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: The goal of multimodal named entity recognition (MNER) is to detect entity spans in given image–text pairs and classify them into corresponding entity types. Despite the success of existing works that leverage cross-modal attention mechanisms to integrate textual and visual representations, we observe three key issues. Firstly, models are prone to misguidance when fusing unrelated text and images. Secondly, most existing visual features are not enhanced or filtered. Finally, due to the independent encoding strategies employed for text and images, a noticeable semantic gap exists between them. To address these challenges, we propose a framework called visual clue guidance and consistency matching (GMF). To tackle the first issue, we introduce a visual clue guidance (VCG) module designed to hierarchically extract visual information from multiple scales. This information is utilized as an injectable visual clue guidance sequence to steer text representations for error-insensitive prediction decisions. Furthermore, by incorporating a cross-scale attention (CSA) module, we successfully mitigate interference across scales, enhancing the image’s capability to capture details. To address the third issue of semantic disparity between text and images, we employ a consistency matching (CM) module based on the idea of multimodal contrastive learning, facilitating the collaborative learning of multimodal data. To validate the effectiveness of our proposed framework, we conducted comprehensive experimental studies, including extensive comparative experiments, ablation studies, and case studies, on two widely used benchmark datasets, demonstrating the efficacy of the framework.
Published: 2024

2. Self-Distillation and Pinyin Character Prediction for Chinese Spelling Correction Based on Multimodality

Author: Li He, Feng Liu, Jie Liu, Jianyong Duan, and Hao Wang
Subjects: Chinese spelling correction, multimodality, pinyin prediction, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Chinese spelling correction (CSC) constitutes a pivotal and enduring goal in natural language processing, serving as a foundational element for various language-related tasks by detecting and rectifying spelling errors in textual content. Numerous methods for Chinese spelling correction leverage multimodal information, including character, character sound, and character shape, to establish connections between incorrect and correct characters. Research indicates that a majority of spelling errors stem from pinyin similarity, with character similarity accounting for half of the errors. Consequently, effectively modeling character pinyin and character relationships emerges as a key challenge in the CSC task. In this study, we propose enhancing the CSC task by introducing the pinyin character prediction task. We employ an adaptive weighting method in the pinyin character prediction task to address predictions in a more granular manner, achieving a balance between the two prediction tasks. The proposed model, SPMSpell, utilizes ChineseBERT as an encoder to capture multimodal feature information simultaneously. It incorporates three parallel decoders for character prediction, pinyin prediction, and self-distillation modules. To mitigate potential overfitting concerning pinyin, a self-distillation method is introduced to prioritize character information in predictions. Extensive experiments conducted on three SIGHAN benchmark tests showcase that the model introduced in this paper attains a superior level of performance. This substantiates the correctness and superiority of the adaptive weighted pinyin character prediction task and underscores the effectiveness of the self-distillation module.
Published: 2024

3. DaGATN: A Type of Machine Reading Comprehension Based on Discourse-Apperceptive Graph Attention Networks

Author: Mingli Wu, Tianyu Sun, Zhuangzhuang Wang, and Jianyong Duan
Subjects: machine reading comprehension, graph attention network, logical reasoning, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: In recent years, with the advancement of natural language processing techniques and the release of models like ChatGPT, how language models understand questions has become a hot topic. When pre-trained models handle complex logical reasoning, their performance still has room for improvement. Inspired by DAGN, we propose an improved DaGATN (Discourse-apperceptive Graph Attention Networks) model. By constructing a discourse information graph to learn logical clues in the text, we decompose the context, question, and answer into elementary discourse units (EDUs) and connect them with discourse relations to construct a relation graph. The text features are learned through a discourse graph attention network and applied to downstream multiple-choice tasks. Our method was evaluated on the ReClor dataset and achieved an accuracy of 74.3%, surpassing the best-known methods that use deberta-xlarge-level pre-trained models, and also performed better than ChatGPT (Zero-Shot).
Published: 2023

4. An Open-Domain Event Extraction Method Incorporating Semantic and Dependent Syntactic Information

Author: Li He, Qian Zhang, Jianyong Duan, and Hao Wang
Subjects: semantic dependency syntax, graph convolution networks, open-domain event extraction, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Open-domain event extraction is a fundamental task that aims to extract non-predefined types of events from news clusters. Some researchers have noticed that its performance can be enhanced by improving dependency relationships. Recently, graph convolutional networks (GCNs) have been widely used to integrate dependency syntactic information into neural networks. However, they usually introduce noise and degrade generalization. To tackle this issue, we propose using Bi-LSTM to obtain semantic representations of BERT intermediate layer features and infuse the dependent syntactic information. Compared to current methods, Bi-LSTM is more robust and has less dependency on word vectors and artificial features. Experiments on public datasets show that our approach is effective for open-domain event extraction tasks.
Published: 2023

5. Event Detection Using a Self-Constructed Dependency and Graph Convolution Network

Author: Li He, Qingxin Meng, Qing Zhang, Jianyong Duan, and Hao Wang
Subjects: event detection, dependency parsing, graph convolution network, attention mechanism, natural language processing, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: The extant event detection models, which rely on dependency parsing, have exhibited commendable efficacy. However, for some long sentences with more words, the results of dependency parsing are more complex, because each word corresponds to a directed edge with a dependency parsing label. These edges do not all provide guidance for the event detection model, and the accuracy of dependency parsing tools decreases with the increase in sentence length, resulting in error propagation. To solve these problems, we developed an event detection model that uses a self-constructed dependency and graph convolution network. First, we statistically analyzed the ACE2005 corpus to prune the dependency parsing tree, and combined the named entity features in the sentence to generate an undirected graph. Second, we implemented an enhanced graph convolution network using the multi-head attention mechanism to understand the representation of nodes in the graph. Finally, a gating mechanism combined the semantic and structural dependency information of the sentence, enabling us to accomplish the event detection task. A series of experiments conducted on the ACE2005 corpus demonstrates that the proposed method enhances the performance of the event detection model.
Published: 2023

6. Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training

Author: Hao Wang, Lekai Zhou, Jianyong Duan, and Li He
Subjects: named entity recognition, cross-lingual, adversarial training, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Named entity recognition aims to extract entities with specific meaning from unstructured text. Currently, deep learning methods have been widely used for this task and have achieved remarkable results, but it is often difficult to achieve better results with less labeled data. To address this problem, this paper proposes a method for cross-lingual entity recognition based on an attention mechanism and adversarial training, using resource-rich language annotation data to migrate to low-resource languages for named entity recognition tasks and outputting changing semantic vectors through the attention mechanism to effectively solve the long-sequence semantic dilution problem. To verify the effectiveness of the proposed method, the method in this paper is applied to the English–Chinese cross-lingual named entity recognition task based on the WeiboNER data set and the People-Daily2004 data set. The obtained F1 value of the optimal model is 53.22% (a 6.29% improvement compared to the baseline). The experimental results show that the cross-lingual adversarial named entity recognition method proposed in this paper can significantly improve the results of named entity recognition in low resource languages.
Published: 2023

7. Document-Level Event Role Filler Extraction Using Key-Value Memory Network

Author: Hao Wang, Miao Li, Jianyong Duan, Li He, and Qing Zhang
Subjects: event extraction, document-level, key-value memory network, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Previous work has demonstrated that end-to-end neural sequence models work well for document-level event role filler extraction. However, the end-to-end neural network model suffers from the problem of not being able to utilize global information, resulting in incomplete extraction of document-level event arguments. This is because the inputs to BiLSTM are all single-word vectors with no input of contextual information. This phenomenon is particularly pronounced at the document level. To address this problem, we propose key-value memory networks to enhance document-level contextual information, and the overall model is represented at two levels: the sentence-level and document-level. At the sentence-level, we use BiLSTM to obtain key sentence information. At the document-level, we use a key-value memory network to enhance document-level representations by recording information about those words in articles that are sensitive to contextual similarity. We fuse two levels of contextual information by means of a fusion formula. We perform various experimental validations on the MUC-4 dataset, and the results show that the model using key-value memory networks works better than the other models.
Published: 2023

8. An Open-Domain Event Extraction Method Incorporating Semantic and Dependent Syntactic Information

Author: Li He, Qian Zhang, Jianyong Duan, and Hao Wang
Subjects: semantic dependency syntax, graph convolution networks, open-domain event extraction
Abstract: Open-domain event extraction is a fundamental task that aims to extract non-predefined types of events from news clusters. Some researchers have noticed that its performance can be enhanced by improving dependency relationships. Recently, graph convolutional networks (GCNs) have been widely used to integrate dependency syntactic information into neural networks. However, they usually introduce noise and degrade generalization. To tackle this issue, we propose using Bi-LSTM to obtain semantic representations of BERT intermediate layer features and infuse the dependent syntactic information. Compared to current methods, Bi-LSTM is more robust and has less dependency on word vectors and artificial features. Experiments on public datasets show that our approach is effective for open-domain event extraction tasks.
Published: 2023

9. Hierarchical Preference Hash Network for News Recommendation

Author: Jianyong Duan, Liangcai Li, Mei Zhang, and Hao Wang
Subjects: Artificial Intelligence, Hardware and Architecture, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Software
Published: 2022

10. New Word Detection Using BiLSTM+CRF Model with Features

Author: Zheng Tan, Mei Zhang, Jianyong Duan, and Hao Wang
Subjects: Artificial Intelligence, Hardware and Architecture, business.industry, Computer science, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, business, computer.software_genre, computer, Software, Word (computer architecture), Natural language processing
Published: 2020

11. Measuring Semantic Similarity between Words Based on Multiple Relational Information

Author: Yuwei Wu, Jianyong Duan, Hao Wang, and Mingli Wu
Subjects: Semantic similarity, Artificial Intelligence, Hardware and Architecture, Computer science, business.industry, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, computer.software_genre, business, computer, Software, Natural language processing
Published: 2020

12. Single Failure Recovery Method for Erasure Coded Storage System with Heterogeneous Devices

Author: Li Ma, Jianyong Duan, Junyi Guo, and Yingxun Fu
Subjects: Recovery method, Artificial Intelligence, Hardware and Architecture, Computer science, business.industry, Embedded system, Computer data storage, Erasure, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, business, Software
Published: 2019

13. Strip-Switched Deployment Method to Optimize Single Failure Recovery for Erasure Coded Storage Systems

Author: Shilin Wen, Li Ma, Jianyong Duan, and Yingxun Fu
Subjects: 020203 distributed computing, Computer science, business.industry, 02 engineering and technology, 020202 computer hardware & architecture, Artificial Intelligence, Hardware and Architecture, Software deployment, 0202 electrical engineering, electronic engineering, information engineering, Erasure, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, business, Software, Computer hardware
Published: 2018

14. Error Correction for Search Engine by Mining Bad Case

Author: Tianxiao Ji, Hao Wang, and Jianyong Duan
Subjects: Search engine, Artificial Intelligence, Hardware and Architecture, Computer science, Computer Vision and Pattern Recognition, Data mining, Electrical and Electronic Engineering, Error detection and correction, computer.software_genre, computer, Software
Published: 2018

15. Detecting Transportation Modes Using Deep Neural Network

Author: Lei Zhang, Gaojun Liu, Jianyong Duan, and Hao Wang
Subjects: 050210 logistics & transportation, Artificial neural network, Computer science, 05 social sciences, 02 engineering and technology, computer.software_genre, Artificial Intelligence, Hardware and Architecture, 0502 economics and business, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Data mining, Electrical and Electronic Engineering, computer, Software
Published: 2017

16. Chinese Spelling Error Detection Using a Fusion Lattice LSTM

Author: Bing Wang, Jiajun Zhang, Jianyong Duan, and Hao Wang
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, General Computer Science, Computer science, Speech recognition, Pinyin, 02 engineering and technology, Spelling, 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), Preprocessor, 020201 artificial intelligence & image processing, Input method, Pinyin input method, Chinese characters, Error detection and correction, Computation and Language (cs.CL)
Abstract: Spelling error detection serves as a crucial preprocessing step in many natural language processing applications. Due to the characteristics of the Chinese language, Chinese spelling error detection is more challenging than error detection in English. Existing methods are mainly under a pipeline framework, which artificially divides the error detection process into two steps. Thus, these methods bring error propagation and cannot always work well due to the complexity of the language environment. Besides, existing methods only adopt character or word information, and ignore the positive effect of fusing character, word, and pinyin information together. We propose an LF-LSTM-CRF model, which is an extension of the LSTM-CRF with word lattices and character-pinyin-fusion inputs. Our model takes advantage of the end-to-end framework to detect errors as a whole process, and dynamically integrates character, word, and pinyin information. Experiments on the SIGHAN data show that our LF-LSTM-CRF consistently outperforms existing methods with similar external resources, and confirm the feasibility of adopting the end-to-end framework and the value of integrating character, word, and pinyin information.
Published: 2019

17. The Collaborative Filtering Method Based on Social Information Fusion

Author: Hao Wang, Jianyong Duan, Peng Mi, and Yadi Song
Subjects: 0209 industrial biotechnology, Information retrieval, Article Subject, Social network, business.industry, Computer science, General Mathematics, lcsh:Mathematics, General Engineering, 02 engineering and technology, Recommender system, lcsh:QA1-939, 020901 industrial engineering & automation, lcsh:TA1-2040, 0202 electrical engineering, electronic engineering, information engineering, Social relationship, Collaborative filtering, 020201 artificial intelligence & image processing, The Internet, business, Social information, lcsh:Engineering (General). Civil engineering (General)
Abstract: In social networks, similar users are assumed to prefer similar items, so finding users similar to a target user plays an important role in most collaborative filtering methods. Existing collaborative filtering methods use user ratings of items to search for similar users. Nowadays, abundant social information is produced on the Internet, such as user profiles, social relationships, behaviors, interests, and so on. Using only user ratings of items is not sufficient to recommend wanted items and search for similar users. In this paper, we propose a new collaborative filtering method using social information fusion. Our method first uses social information fusion to search for similar users and then updates the user ratings of items for recommendation using similar users. Experiments show that our method outperforms the existing methods based on user ratings of items, and that using social information fusion to search for similar users is a viable approach for collaborative filtering in recommender systems.
Published: 2019

18. An Indirected Recommendation Model for Chinese Microblog

Author: Mei Zhang, Zheng Dong, and Jianyong Duan
Subjects: Topic model, General Computer Science, Microblogging, Computer science, media_common.quotation_subject, Population, 02 engineering and technology, 01 natural sciences, Latent Dirichlet allocation, 010305 fluids & plasmas, World Wide Web, Recommendation model, symbols.namesake, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Social media, education, Function (engineering), media_common, education.field_of_study, Information retrieval, business.industry, Information sharing, Information technology, symbols, 020201 artificial intelligence & image processing, business
Abstract: Microblogs are browser-based platforms for web users' information sharing and communication. With the rapid growth of the microblog population, a recommendation function becomes necessary. This paper proposes recommendation based on the Latent Dirichlet Allocation topic model, which incorporates user interests into the model to meet their needs. We also conduct a comparative analysis between indirect and direct recommendation algorithms. The experimental results show that indirect recommendation is more effective for microblog recommendation.
Published: 2016

19. Geometric analysis of concept vectors based on similarity values

Author: Jianyong Duan and Hui Liu
Subjects: Theoretical computer science, Similarity (geometry), Euclidean space, Attribute computing, Topology, Similitude, lcsh:PL1001-3208, Set (abstract data type), Semantic similarity, lcsh:Chinese language and literature, Position (vector), Isometry, Distance geometry, Vector space, Mathematics
Abstract: In this paper, we offer a geometric framework for computing a concept's conceptual vector based on its similarity position with other concepts in a vector space called concept space, which is a set of concept vectors together with a distance function derived from a similarity model. We show that there exists an isometry to map a concept space to a Euclidean space. So, the concept vector can be mapped to a coordinate in a Euclidean space and vice versa. Therefore, given only the similarity position of a concept, we can locate its coordinate and subsequently its concept vector, using distance geometry methods. We prove that such mapping functions do exist under some conditions. We also discuss how to map non-numerical attributes. Finally, we show some preliminary experimental results and thoughts on the implementation of an attribute mining task. This work will benefit attribute retrieval tasks.
Published: 2017

20. A hybrid framework to extract bilingual multiword expression from free text

Author: Wang Jing-zhong, Yushi Xu, Jianyong Duan, and Mei Zhang
Subjects: Phrase, business.industry, Process (engineering), Active learning (machine learning), Computer science, Speech recognition, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, General Engineering, computer.software_genre, Syntax, Computer Science Applications, Multiword expression, Text mining, Artificial Intelligence, Filter (video), ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Artificial intelligence, business, computer, Natural language processing
Abstract: Bilingual multiword expression extraction is a significant problem in extracting meaning from free text. It involves analyzing large amounts of textual information. In this paper, we propose a text mining approach to extract bilingual multiword expressions. Both statistical and rule-based methods are employed in the system. There are two phases in the extraction process. In the first phase, many candidates are extracted from the corpus by statistical methods. The multiple sequence alignment algorithm is sensitive to flexible multiword expressions. In the second phase, error-driven rules and patterns are extracted from the corpus. To acquire high-quality instances, manual work with active learning is also performed in sample selection. These trained rules are used to filter the candidates. Bilingual comparisons are used in a parallel corpus. Parts of the bilingual syntactic patterns are obtained from the bilingual phrase dictionary. Because the system has many parameters, related experiments are designed to achieve the best performance. Experimental results show that our approach performs well.
Published: 2011

21. Spoken language understanding using weakly supervised learning

Author: Hui Liu, Ruzhan Lu, Wei-Lin Wu, Feng Gao, Jianyong Duan, and Yuquan Chen
Subjects: Computer science, business.industry, Speech recognition, Supervised learning, computer.software_genre, Speech processing, Theoretical Computer Science, Human-Computer Interaction, Comprehension, Bootstrapping (electronics), Robustness (computer science), Artificial intelligence, Computational linguistics, business, computer, Software, Natural language processing, Utterance, Spoken language
Abstract: In this paper, we present a weakly supervised learning approach for spoken language understanding in domain-specific dialogue systems. We model the task of spoken language understanding as a two-stage classification problem. Firstly, the topic classifier is used to identify the topic of an input utterance. Secondly, with the restriction of the recognized target topic, the slot classifiers are trained to extract the corresponding slot-value pairs. The approach is mainly data-driven and requires only a minimally annotated corpus for training whilst retaining the robustness and depth of understanding for spoken language. More importantly, it allows weakly supervised strategies to be employed for training the two kinds of classifiers, which could significantly reduce the number of labeled sentences. We investigated active learning and naive self-training for the two kinds of classifiers. Also, we propose a practical method for bootstrapping topic-dependent slot classifiers from a small amount of labeled sentences. Experiments have been conducted in the context of the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The experimental results show the effectiveness of our proposed SLU framework and demonstrate the possibility of reducing human labeling efforts significantly.
Published: 2010

22. A bio-inspired application of natural language processing: A case study in extracting multiword expression

Author: Jianyong Duan, Yi Hu, and Ru Li
Subjects: Sequence, Multiple sequence alignment, Computer science, business.industry, General Engineering, computer.software_genre, Computer Science Applications, Multiword expression, Text mining, Ranking, Artificial Intelligence, Redundancy (engineering), Artificial intelligence, business, computer, Natural language processing
Abstract: For multiword expression (MWE) extraction, multiple sequence alignment (MSA) is proposed, motivated by gene recognition, because textual sequences are similar to gene sequences in pattern analysis. The MSA technique is combined with error-driven rules, improving efficiency over traditional methods. It provides a guarantee for MWE recall. It uses dynamic programming to prevent candidates from combinatorial explosion, and provides a global solution for pattern extraction instead of sub-pattern redundancy. Consequently, it has accurate measures for flexible patterns. In the experiments, some advanced statistical measures are used for ranking candidates. In the comparison experiment, the MSA approach achieved better results.
Published: 2009

23. Multi-Engine Collaborative Bootstrapping for Word Sense Disambiguation

Author: Ruzhan Lu, Xuening Li, and Jianyong Duan
Subjects: Word-sense disambiguation, Artificial Intelligence, Computer science, business.industry, Bootstrapping (linguistics), Artificial intelligence, business, computer.software_genre, Machine learning, computer, Natural language processing
Abstract: In this paper, we propose a new word sense disambiguation method called Multi-engine Collaborative Bootstrapping (MCB) that combines different types of corpora and also uses two languages for bootstrapping. MCB uses bilingual bootstrapping as its core algorithm, leading to incremental knowledge acquisition. The EM model is applied to train parameters in a base learner. The feature translation model is improved by semantic correlation estimation. In addition, we use multi-engine selection to produce qualified starting seeds from parallel corpora and monolingual corpora. Seeds generated through unsupervised machine learning approaches can also ensure bootstrapping effectiveness compared with manually selected seeds, despite their different selection mechanisms. Experimental results demonstrate the effectiveness of MCB. Factors including the feature space and the number of starting seeds are considered in our experiments because the EM algorithm is sensitive to starting values. Limitation of resources is also a concern.
Published: 2007

24. Error Checking for Chinese Query by Mining Web Log

Author: Peng Mi, Jianyong Duan, and Hui Liu
Subjects: Article Subject, Computer science, General Mathematics, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Query language, computer.software_genre, Query optimization, Search engine, Query expansion, Web query classification, Computer Science::Databases, computer.programming_language, Information retrieval, Web search query, Computer Science::Information Retrieval, lcsh:Mathematics, General Engineering, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), lcsh:QA1-939, lcsh:TA1-2040, Web log analysis software, Sargable, Data mining, lcsh:Engineering (General). Civil engineering (General), computer, Boolean conjunctive query, RDF query language
Abstract: For search engines, erroneous query input is a common phenomenon. This paper uses the web log as the training set for query error checking. Queries are analyzed and checked through an n-gram language model trained on the web log. Some features, including query words and their number, are introduced into the model. At the same time, a data smoothing algorithm is used to solve the data sparseness problem, which improves the overall accuracy of the n-gram model. The experimental results show that it is effective.
Published: 2015

25. A Language Modeling Approach to Sentiment Analysis

Author: Jianyong Duan, Yuquan Chen, Xuening Li, Yi Hu, and Ruzhan Lu
Subjects: Language identification, business.industry, Character (computing), Computer science, Sentiment analysis, Machine learning, computer.software_genre, Support vector machine, Text processing, Artificial intelligence, Language model, business, Divergence (statistics), computer, Classifier (UML), Natural language processing
Abstract: This paper presents a language modeling approach to the sentiment detection problem. It captures the subtle information in text processing to characterize the semantic orientation of documents as "thumb up" (positive) or "thumb down" (negative). To handle this problem, we propose estimating both the positive and negative language models from training collections. Tests are done by computing the Kullback-Leibler divergence between the language model estimated from a test document and these two trained sentiment models. We determine the polarity of a test document by observing whether its language model is closer to the trained "thumb up" model or the "thumb down" model. When compared with a strong classifier, i.e., SVMs, on a movie review corpus, the language modeling approach showed better performance.
Published: 2007
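
The decision rule in the abstract for result 25 (estimate a positive and a negative language model, then assign a test document to whichever model is closer in Kullback-Leibler divergence) can be sketched as follows. This assumes unigram models with add-one smoothing over a shared vocabulary, which the abstract does not specify; the helper names and the toy review sentences are hypothetical.

```python
import math
from collections import Counter


def unigram_lm(texts, vocab, alpha=1.0):
    """Smoothed unigram distribution over a fixed vocabulary (smoothing is an assumption)."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def classify(doc, pos_texts, neg_texts):
    """Label doc by whichever sentiment model has the smaller divergence from it."""
    vocab = {tok for t in pos_texts + neg_texts + [doc] for tok in t.lower().split()}
    pos_lm = unigram_lm(pos_texts, vocab)
    neg_lm = unigram_lm(neg_texts, vocab)
    doc_lm = unigram_lm([doc], vocab)
    if kl_divergence(doc_lm, pos_lm) < kl_divergence(doc_lm, neg_lm):
        return "positive"
    return "negative"


# Toy training collections, invented for illustration.
pos = ["great movie loved it", "wonderful acting great plot"]
neg = ["terrible movie hated it", "boring plot awful acting"]
label = classify("great wonderful movie", pos, neg)
```

Smoothing all three distributions over one shared vocabulary keeps every probability strictly positive, so the divergence is always finite in this sketch.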

26. A weakly supervised learning approach for spoken language understanding

Author: Feng Gao, Yuquan Chen, Wei-Lin Wu, Hui Liu, Jianyong Duan, and Ruzhan Lu
Subjects: Computer science, business.industry, Supervised learning, computer.software_genre, Machine learning, ComputingMethodologies_PATTERNRECOGNITION, Robustness (computer science), Classifier (linguistics), Active learning, Artificial intelligence, business, computer, Natural language processing, Utterance, Spoken language
Abstract: In this paper, we present a weakly supervised learning approach for spoken language understanding in domain-specific dialogue systems. We model the task of spoken language understanding as a successive classification problem. The first classifier (topic classifier) is used to identify the topic of an input utterance. With the restriction of the recognized target topic, the second classifier (semantic classifier) is trained to extract the corresponding slot-value pairs. It is mainly data-driven and requires only minimally annotated corpus for training whilst retaining the understanding robustness and deepness for spoken language. Most importantly, it allows the employment of weakly supervised strategies for training the two classifiers. We first apply the training strategy of combining active learning and self-training (Tur et al., 2005) for topic classifier. Also, we propose a practical method for bootstrapping the topic-dependent semantic classifiers from a small amount of labeled sentences. Experiments have been conducted in the context of Chinese public transportation information inquiry domain. The experimental results demonstrate the effectiveness of our proposed SLU framework and show the possibility to reduce human labeling efforts significantly.
Published: 2006

27. A bio-inspired approach for multi-word expression extraction

Author: Wei-Lin Wu, Yan Tian, Yi Hu, Ruzhan Lu, and Jianyong Duan
Subjects: Longest common subsequence problem, Sequence, ComputingMethodologies_PATTERNRECOGNITION, Computer science, Extraction (chemistry), Affine transformation, Data mining, computer.software_genre, ComputingMethodologies_ARTIFICIALINTELLIGENCE, computer, Expression (mathematics), Multi word expression
Abstract: This paper proposes a new approach for Multi-word Expression (MWE) extraction on the motivation of gene sequence alignment because textual sequence is similar to gene sequence in pattern analysis. Theory of Longest Common Subsequence (LCS) originates from computer science and has been established as affine gap model in Bioinformatics. We perform this developed LCS technique combined with linguistic criteria in MWE extraction. In comparison with traditional n-gram method, which is the major technique for MWE extraction, LCS approach is applied with great efficiency and performance guarantee. Experimental results show that LCS-based approach achieves better results than n-gram.
Published: 2006
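
The abstract for result 27 builds on the Longest Common Subsequence. As a rough illustration, the sketch below computes the plain LCS of two token sequences with the textbook dynamic program. It deliberately omits the affine gap model and the linguistic criteria the abstract mentions, so it should be read as the underlying building block under those simplifying assumptions, not the paper's method; the function name `lcs` and the example sentences are hypothetical.

```python
def lcs(a, b):
    """Return one longest common subsequence of token lists a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Walk back from the bottom-right corner to recover the tokens.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


s1 = "take the cost into account".split()
s2 = "take every risk into account".split()
candidate = lcs(s1, s2)
```

Unlike a fixed-window n-gram, the subsequence may skip intervening words, which is why aligning the two sentences above can surface a discontinuous pattern such as "take ... into account".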


