Author: "Qin, Zengchang" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Qin, Zengchang" returned 375 results.


1. SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task

Author: Zhong, Zijie, Zhong, Linqing, Sun, Zhaoze, Jin, Qingyun, Qin, Zengchang, and Zhang, Xiaofan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Integrating Large Language Models (LLMs) with existing Knowledge Graph (KG) databases presents a promising avenue for enhancing LLMs' efficacy and mitigating their "hallucinations". Given that most KGs reside in graph databases accessible solely through specialized query languages (e.g., Cypher), there exists a critical need to bridge the divide between LLMs and KG databases by automating the translation of natural language into Cypher queries (commonly termed the "Text2Cypher" task). Prior efforts tried to bolster LLMs' proficiency in Cypher generation through Supervised Fine-Tuning. However, these explorations are hindered by the lack of annotated datasets of Query-Cypher pairs, resulting from the labor-intensive and domain-specific nature of annotating such datasets. In this study, we propose SyntheT2C, a methodology for constructing a synthetic Query-Cypher pair dataset, comprising two distinct pipelines: (1) LLM-based prompting and (2) template-filling. SyntheT2C facilitates the generation of extensive Query-Cypher pairs with values sampled from an underlying Neo4j graph database. Subsequently, SyntheT2C is applied to two medical databases, culminating in the creation of a synthetic dataset, MedT2C. Comprehensive experiments demonstrate that the MedT2C dataset effectively enhances the performance of backbone LLMs on the Text2Cypher task. Both the SyntheT2C codebase and the MedT2C dataset will be released soon., Comment: 19 pages, 15 figures, 8 tables
Published: 2024
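
The template-filling pipeline named in the SyntheT2C abstract above can be pictured with a minimal sketch. Everything concrete here is an assumption made for illustration: the template text, the `Drug`/`Disease` schema, and the in-memory value list (the abstract says values are sampled from an underlying Neo4j database, which this sketch does not connect to). It is not the authors' implementation.

```python
import random

# Hypothetical template: a question pattern paired with a Cypher pattern.
# Doubled braces render as literal braces after str.format.
TEMPLATE = {
    "question": "Which drugs treat {disease}?",
    "cypher": "MATCH (d:Drug)-[:TREATS]->(x:Disease {{name: '{disease}'}}) RETURN d.name",
}


def fill_template(template, values, n, seed=0):
    """Build up to n Query-Cypher pairs by filling one slot with sampled values."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    sampled = rng.sample(values, k=min(n, len(values)))
    return [
        {
            "question": template["question"].format(disease=value),
            "cypher": template["cypher"].format(disease=value),
        }
        for value in sampled
    ]


# Stand-in for values that would be sampled from the graph database (assumed).
DISEASES = ["asthma", "influenza", "diabetes"]
pairs = fill_template(TEMPLATE, DISEASES, n=2)
for pair in pairs:
    print(pair["question"], "->", pair["cypher"])
```

The same value fills both halves of the pair, which is what keeps the natural-language question and the Cypher query aligned.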

2. Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation

Author: Zhong, Zijie, Liu, Hanwen, Cui, Xiaoya, Zhang, Xiaofan, and Qin, Zengchang
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Integrating information from different reference data sources is a major challenge for Retrieval-Augmented Generation (RAG) systems because each knowledge source adopts a unique data structure and follows different conventions. Retrieving from multiple knowledge sources with one fixed strategy usually leads to under-exploitation of information. To mitigate this drawback, inspired by Mix-of-Expert, we introduce Mix-of-Granularity (MoG), a method that dynamically determines the optimal granularity of a knowledge database based on input queries using a router. The router is efficiently trained with a newly proposed loss function employing soft labels. We further extend MoG to Mix-of-Granularity-Graph (MoGG), where reference documents are pre-processed into graphs, enabling the retrieval of relevant information from distantly situated chunks. Extensive experiments demonstrate that both MoG and MoGG effectively predict optimal granularity levels, significantly enhancing the performance of the RAG system in downstream tasks. The code of both MoG and MoGG will be made public., Comment: 17 pages, 6 figures and 8 tables
Published: 2024

3. From Image to Video, what do we need in multimodal LLMs?

Author: Huang, Suyuan, Zhang, Haoxin, Gao, Yan, Hu, Yao, and Qin, Zengchang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in understanding multimodal information, ranging from Image LLMs to the more complex Video LLMs. Numerous studies have illustrated their exceptional cross-modal comprehension. Recently, integrating video foundation models with large language models to build a comprehensive video understanding system has been proposed to overcome the limitations of specific pre-defined vision tasks. However, the current advancements in Video LLMs tend to overlook the foundational contributions of Image LLMs, often opting for more complicated structures and a wide variety of multimodal data for pre-training. This approach significantly increases the costs associated with these methods. In response to these challenges, this work introduces an efficient method that strategically leverages the priors of Image LLMs, facilitating a resource-efficient transition from Image to Video LLMs. We propose RED-VILLM, a Resource-Efficient Development pipeline for Video LLMs from Image LLMs, which utilizes a temporal adaptation plug-and-play structure within the image fusion module of Image LLMs. This adaptation extends their understanding capabilities to include temporal information, enabling the development of Video LLMs that not only surpass baseline performances but also do so with minimal instructional data and training resources. Our approach highlights the potential for a more cost-effective and scalable advancement in multimodal models, effectively building upon the foundational work of Image LLMs.
Published: 2024

4. LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge

Author: Liu, Yuhe, Kang, Mengxue, Qin, Zengchang, and Chu, Xiangxiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Large text-to-image models have achieved astonishing performance in synthesizing diverse and high-quality images guided by texts. With detail-oriented conditioning control, even finer-grained spatial control can be achieved. However, some generated images still appear unreasonable, even with plentiful object features and a harmonious style. In this paper, we delve into the underlying causes and find that deep-level logical information, serving as common-sense knowledge, plays a significant role in understanding and processing images. Nonetheless, almost all models have neglected the importance of logical relations in images, resulting in poor performance in this aspect. Following this observation, we propose LogicalDefender, which combines images with the logical knowledge already summarized by humans in text. This encourages models to learn logical knowledge faster and better, and concurrently, extracts the widely applicable logical knowledge from both images and human knowledge. Experiments show that our model has achieved better logical performance, and the extracted logical knowledge can be effectively applied to other scenarios.
Published: 2024

5. CADReN: Contextual Anchor-Driven Relational Network for Controllable Cross-Graphs Node Importance Estimation

Author: Zhong, Zijie, Zhang, Yunhui, Chang, Ziyi, and Qin, Zengchang
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Information Retrieval, 68T07
Abstract: Node Importance Estimation (NIE) is crucial for integrating external information into Large Language Models through Retriever-Augmented Generation. Traditional methods, focusing on static, single-graph characteristics, lack adaptability to new graphs and user-specific requirements. CADReN, our proposed method, addresses these limitations by introducing a Contextual Anchor (CA) mechanism. This approach enables the network to assess node importance relative to the CA, considering both structural and semantic features within Knowledge Graphs (KGs). Extensive experiments show that CADReN achieves better performance in cross-graph NIE task, with zero-shot prediction ability. CADReN is also proven to match the performance of previous models on single-graph NIE task. Additionally, we introduce and opensource two new datasets, RIC200 and WK1K, specifically designed for cross-graph NIE research, providing a valuable resource for future developments in this domain., Comment: 8 pages, 6 figures
Published: 2024

6. Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings

Author: Liu, Yuhe, Liu, Chuanjian, Han, Kai, Tang, Quan, and Qin, Zengchang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic segmentation is a computer vision task that associates a label with each pixel in an image. Modern approaches tend to introduce class embeddings into semantic segmentation for deeply utilizing category semantics, and regard supervised class masks as final predictions. In this paper, we explore the mechanism of class embeddings and have an insight that more explicit and meaningful class embeddings can be generated based on class masks purposely. Following this observation, we propose ECENet, a new segmentation paradigm, in which class embeddings are obtained and enhanced explicitly during interacting with multi-stage image features. Based on this, we revisit the traditional decoding process and explore inverted information flow between segmentation masks and class embeddings. Furthermore, to ensure the discriminability and informativity of features from backbone, we propose a Feature Reconstruction module, which combines intrinsic and diverse branches together to ensure the concurrence of diversity and redundancy in features. Experiments show that our ECENet outperforms its counterparts on the ADE20K dataset with much less computational cost and achieves new state-of-the-art results on PASCAL-Context dataset. The code will be released at https://gitee.com/mindspore/models and https://github.com/Carol-lyh/ECENet.
Published: 2023

7. Sparse Double Descent: Where Network Pruning Aggravates Overfitting

Author: He, Zheng, Xie, Zeke, Zhu, Quanzhi, and Qin, Zengchang
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: People usually believe that network pruning not only reduces the computational cost of deep networks, but also prevents overfitting by decreasing model capacity. However, our work surprisingly discovers that network pruning sometimes even aggravates overfitting. We report an unexpected sparse double descent phenomenon that, as we increase model sparsity via network pruning, test performance first gets worse (due to overfitting), then gets better (due to relieved overfitting), and gets worse at last (due to forgetting useful information). While recent studies focused on the deep double descent with respect to model overparameterization, they failed to recognize that sparsity may also cause double descent. In this paper, we have three main contributions. First, we report the novel sparse double descent phenomenon through extensive experiments. Second, for this phenomenon, we propose a novel learning distance interpretation that the curve of $\ell_{2}$ learning distance of sparse models (from initialized parameters to final parameters) may correlate with the sparse double descent curve well and reflect generalization better than minima flatness. Third, in the context of sparse double descent, a winning ticket in the lottery ticket hypothesis surprisingly may not always win., Comment: ICML 2022
Published: 2022
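
The $\ell_{2}$ learning distance mentioned in the abstract above is described there as the distance from initialized parameters to final parameters. A minimal sketch of that quantity follows, assuming the parameters are flattened into plain lists of floats; how pruned (zeroed) weights are treated is not stated in the abstract and is left out here.

```python
import math


def l2_learning_distance(initial_params, final_params):
    """Euclidean distance between initial and final parameter vectors."""
    if len(initial_params) != len(final_params):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(
        sum((final - init) ** 2 for init, final in zip(initial_params, final_params))
    )


# Toy example with two parameters (values are illustrative only).
distance = l2_learning_distance([0.0, 0.0], [3.0, 4.0])
print(distance)
```

Tracking this value across sparsity levels would give the curve that the abstract compares against the sparse double descent curve.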

8. Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog

Author: Zhang, Shunyu, Jiang, Xiaoze, Yang, Zequn, Wan, Tao, and Qin, Zengchang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Visual Dialog requires an agent to engage in a conversation with humans grounded in an image. Many studies on Visual Dialog focus on the understanding of the dialog history or the content of an image, while a considerable amount of commonsense-required questions are ignored. Handling these scenarios depends on logical reasoning that requires commonsense priors. How to capture relevant commonsense knowledge complementary to the history and the image remains a key challenge. In this paper, we propose a novel model by Reasoning with Multi-structure Commonsense Knowledge (RMK). In our model, the external knowledge is represented with sentence-level facts and graph-level facts, to properly suit the scenario of the composite of dialog history and image. On top of these multi-structure representations, our model can capture relevant knowledge and incorporate them into the vision and semantic features, via graph-based interaction and transformer-based fusion. Experimental results and analysis on VisDial v1.0 and VisDialCK datasets show that our proposed model effectively outperforms comparative methods., Comment: MULA Workshop, CVPR 2022
Published: 2022

9. KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue

Author: Jiang, Xiaoze, Du, Siyi, Qin, Zengchang, Sun, Yajing, and Yu, Jing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual dialogue is a challenging task that needs to extract implicit information from both visual (image) and textual (dialogue history) contexts. Classical approaches pay more attention to the integration of the current question, vision knowledge and text knowledge, disregarding the heterogeneous semantic gaps between the cross-modal information. In the meantime, the concatenation operation has become the de facto standard for cross-modal information fusion, which has limited ability in information retrieval. In this paper, we propose a novel Knowledge-Bridge Graph Network (KBGN) model that uses a graph to bridge the cross-modal semantic relations between vision and text knowledge in fine granularity, and retrieves required knowledge via an adaptive information selection mode. Moreover, the reasoning clues for visual dialogue can be clearly drawn from intra-modal entities and inter-modal bridges. Experimental results on VisDial v1.0 and VisDial-Q datasets demonstrate that our model outperforms existing models with state-of-the-art results., Comment: Accepted by the 28th ACM International Conference on Multimedia (ACM MM 2020), Oral
Published: 2020

10. DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

Author: Jiang, Xiaoze, Yu, Jing, Sun, Yajing, Qin, Zengchang, Zhu, Zihao, Hu, Yue, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Visual Dialogue task requires an agent to be engaged in a conversation with human about an image. The ability of generating detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards the design that advocates both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. Therefore, the responses contain more detailed and non-repetitive descriptions while maintaining the semantic accuracy. Furthermore, DAM is flexible to cooperate with existing visual dialogue encoders and adaptive to the encoder structures by constraining the information selection mode in DAM. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM., Comment: Accepted by IJCAI 2020. SOLE copyright holder is IJCAI (International Joint Conferences on Artificial Intelligence)
Published: 2020

11. Multi-Level Network for High-Speed Multi-Person Pose Estimation

Author: Huang, Ying, Zhuang, Jiankai, and Qin, Zengchang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In multi-person pose estimation, discriminating between left and right joint types is a hard problem because of their similar appearance. Traditionally, this problem is solved by stacking multiple refinement modules to increase the network's receptive fields and capture more global context, which also adds a great amount of computation. In this paper, we propose a Multi-level Network (MLN) that learns to aggregate features from lower-level (left/right information), upper-level (localization information), joint-limb level (complementary information) and global-level (context) information for discrimination of joint type. Through feature reuse and its intra-relation, MLN can attain comparable performance to other conventional methods while retaining a runtime speed of 42.2 FPS., Comment: 5 pages, published at ICIP 2019
Published: 2019

12. FollowMeUp Sports: New Benchmark for 2D Human Keypoint Recognition

Author: Huang, Ying, Sun, Bin, Kan, Haipeng, Zhuang, Jiankai, and Qin, Zengchang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Human pose estimation has made significant advancement in recent years. However, the existing datasets are limited in their coverage of pose variety. In this paper, we introduce a novel benchmark FollowMeUp Sports that makes an important advance in terms of specific postures, self-occlusion and class balance, a contribution that we feel is required for future development in human body models. This comprehensive dataset was collected using an established taxonomy of over 200 standard workout activities with three different shot angles. The collected videos cover a wider variety of specific workout activities than previous datasets including push-up, squat and body moving near the ground with severe self-occlusion or occluded by some sport equipment and outfits. Given these rich images, we perform a detailed analysis of the leading human pose estimation approaches gaining insights for the success and failures of these methods., Comment: 12 pages, accepted at PRCV 2019
Published: 2019

13. DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue

Author: Jiang, Xiaoze, Yu, Jing, Qin, Zengchang, Zhuang, Yingying, Zhang, Xingxing, Hu, Yue, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Unlike the Visual Question Answering task, which requires answering only one question about an image, Visual Dialogue involves multiple questions which cover a broad range of visual content that could be related to any objects, relationships or semantics. The key challenge in the Visual Dialogue task is thus to learn a more comprehensive and semantic-rich image representation which may have adaptive attentions on the image for variant questions. In this research, we propose a novel model to depict an image from both visual and semantic perspectives. Specifically, the visual view helps capture the appearance-level information, including objects and their relationships, while the semantic view enables the agent to understand high-level visual semantics from the whole image to the local regions. Furthermore, on top of such multi-view image features, we propose a feature selection framework which is able to adaptively capture question-relevant information hierarchically at a fine-grained level. The proposed method achieved state-of-the-art results on benchmark Visual Dialogue datasets. More importantly, we can tell which modality (visual or semantic) contributes more to answering the current question by visualizing the gate values. This gives us insight into human cognition in Visual Dialogue., Comment: Accepted by the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020)
Published: 2019

14. Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering

Author: Yang, Zhuoqian, Qin, Zengchang, Yu, Jing, and Hu, Yue
Subjects: Computer Science - Multimedia, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: One of the key issues of Visual Question Answering (VQA) is to reason with semantic clues in the visual content under the guidance of the question; how to model relational semantics remains a great challenge. To fully capture visual semantics, we propose to reason over a structured visual representation - scene graph, with embedded objects and inter-object relationships. This shows great benefit over vanilla vector representations and implicit visual relationship learning. Based on existing visual relationship models, we propose a visual relationship encoder that projects visual relationships into a learned deep semantic space constrained by visual context and language priors. Upon the constructed graph, we propose a Scene Graph Convolutional Network (SceneGCN) to jointly reason the object properties and relational semantics for the correct answer. We demonstrate the model's effectiveness and interpretability on the challenging GQA dataset and the classical VQA 2.0 dataset, remarkably achieving state-of-the-art 54.56% accuracy on GQA compared to the existing best model., Comment: 14 pages, 9 figures
Published: 2018

15. Topic Modeling of Political Dynamics with Shifted Cosine Similarity

Author: Luo, Yifan, Wan, Tao, and Qin, Zengchang
Editor: Honda, Katsuhiro, Entani, Tomoe, Ubukata, Seiki, Huynh, Van-Nam, and Inuiguchi, Masahiro
Published: 2022

16. A sequential guiding network with attention for image captioning

Author: Sow, Daouda, Qin, Zengchang, Niasse, Mouhamed, and Wan, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us a new way of understanding semantics, by which we can deal with more challenging tasks such as automatic description generation from natural images. In this challenge, the encoder-decoder framework has achieved promising performance when a convolutional neural network (CNN) is used as image encoder and a recurrent neural network (RNN) as decoder. In this paper, we introduce a sequential guiding network that guides the decoder during word generation. The new model is an extension of the encoder-decoder framework with attention that has an additional guiding long short-term memory (LSTM) and can be trained in an end-to-end manner by using image/descriptions pairs. We validate our approach by conducting extensive experiments on a benchmark dataset, i.e., MS COCO Captions. The proposed model achieves significant improvement comparing to the other state-of-the-art deep learning models., Comment: 5 pages, 2 figures, 1 table, IEEE ICASSP 2019
Published: 2018

17. Semantic Modeling of Textual Relationships in Cross-Modal Retrieval

Author: Yu, Jing, Yang, Chenghao, Qin, Zengchang, Yang, Zhuoqian, Hu, Yue, and Zhang, Weifeng
Subjects: Computer Science - Multimedia
Abstract: Feature modeling of different modalities is a basic problem in current research of cross-modal information retrieval. Existing models typically project texts and images into one embedding space, in which semantically similar information will have a shorter distance. Semantic modeling of textual relationships is notoriously difficult. In this paper, we propose an approach to model texts using a featured graph by integrating multi-view textual relationships including semantic relations, statistical co-occurrence, and prior relations in the knowledge base. A dual-path neural network is adopted to learn multi-modal representations of information and cross-modal similarity measure jointly. We use a Graph Convolutional Network (GCN) for generating relation-aware text representations, and use a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal similarity measure is learned by distance metric learning. Experimental results show that, by leveraging the rich relational semantics in texts, our model can outperform the state-of-the-art models by 3.4% and 6.3% in accuracy on two benchmark datasets., Comment: To appear in KSEM 2019
Published: 2018

18. Pixel Level Data Augmentation for Semantic Image Segmentation using Generative Adversarial Networks

Author: Liu, Shuangting, Zhang, Jiaqi, Chen, Yuxin, Liu, Yifan, Qin, Zengchang, and Wan, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic segmentation is one of the basic topics in computer vision; it aims to assign semantic labels to every pixel of an image. Unbalanced semantic label distribution could have a negative influence on segmentation accuracy. In this paper, we investigate using a data augmentation approach to balance the semantic label distribution in order to improve segmentation performance. We propose using generative adversarial networks (GANs) to generate realistic images for improving the performance of semantic segmentation networks. Experimental results show that the proposed method can not only improve segmentation performance on those classes with low accuracy, but also obtain a 1.3% to 2.1% increase in average segmentation accuracy. This shows that the augmentation method can boost accuracy and be easily applied to other segmentation models., Comment: 5 pages
Published: 2018

19. Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval

Author: Yu, Jing, Lu, Yuhang, Qin, Zengchang, Liu, Yanbing, Tan, Jianlong, Guo, Li, and Zhang, Weifeng
Subjects: Computer Science - Information Retrieval
Abstract: Cross-modal information retrieval aims to find heterogeneous data of various modalities from a given query of one modality. The main challenge is to map different modalities into a common semantic space, in which distance between concepts in different modalities can be well modeled. For cross-modal information retrieval between images and texts, existing work mostly uses off-the-shelf Convolutional Neural Network (CNN) for image feature extraction. For texts, word-level features such as bag-of-words or word2vec are employed to build deep learning models to represent texts. Besides word-level semantics, the semantic relations between words are also informative but less explored. In this paper, we model texts by graphs using similarity measure based on word2vec. A dual-path neural network model is proposed for couple feature learning in cross-modal information retrieval. One path utilizes Graph Convolutional Network (GCN) for text modeling based on graph representations. The other path uses a neural network with layers of nonlinearities for image modeling based on off-the-shelf features. The model is trained by a pairwise similarity loss function to maximize the similarity of relevant text-image pairs and minimize the similarity of irrelevant pairs. Experimental results show that the proposed model outperforms the state-of-the-art methods significantly, with 17% improvement on accuracy for the best case., Comment: 7 pages, 11 figures
Published: 2018
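
The abstract above describes a pairwise similarity loss that maximizes the similarity of relevant text-image pairs and minimizes it for irrelevant pairs, without giving the formula. The sketch below is one possible reading, not the authors' loss: it assumes cosine similarity between the two embeddings and a hypothetical `margin` parameter for irrelevant pairs.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarity_loss(text_vec, image_vec, relevant, margin=0.0):
    """Assumed form: push similarity up for relevant pairs, down for irrelevant ones."""
    sim = cosine_similarity(text_vec, image_vec)
    if relevant:
        return 1.0 - sim            # zero when the embeddings are aligned
    return max(0.0, sim - margin)   # zero once similarity is at or below the margin


# Toy embeddings (illustrative values only).
print(pairwise_similarity_loss([1.0, 0.0], [1.0, 0.0], relevant=True))
print(pairwise_similarity_loss([1.0, 0.0], [0.0, 1.0], relevant=False))
```

In a dual-path model, `text_vec` would come from the text path and `image_vec` from the image path, and the loss would be averaged over a batch of labeled pairs.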

20. A deep learning method for automatic evaluation of diagnostic information from multi-stained histopathological images

Author: Ji, Junyu, Wan, Tao, Chen, Dong, Wang, Hao, Zheng, Menghan, and Qin, Zengchang
Published: 2022

21. Text Generation Based on Generative Adversarial Nets with Latent Variable

Author: Wang, Heng, Qin, Zengchang, and Wan, Tao
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we propose a model using generative adversarial net (GAN) to generate realistic text. Instead of using standard GAN, we combine variational autoencoder (VAE) with generative adversarial net. The use of high-level latent random variables is helpful to learn the data distribution and solve the problem that generative adversarial net always emits the similar data. We propose the VGAN model where the generative model is composed of recurrent neural network and VAE. The discriminative model is a convolutional neural network. We train the model via policy gradient. We apply the proposed model to the task of text generation and compare it to other recent neural network based models, such as recurrent neural network language model and SeqGAN. We evaluate the performance of the model by calculating negative log-likelihood and the BLEU score. We conduct experiments on three benchmark datasets, and results show that our model outperforms other previous models.
Published: 2017

22. Data Augmentation in Emotion Classification Using Generative Adversarial Networks

Author: Zhu, Xinyue, Liu, Yifan, Qin, Zengchang, and Li, Jiahong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: It is a difficult task to classify images with multiple class labels using only a small number of labeled examples, especially when the label (class) distribution is imbalanced. Emotion classification is such an example of imbalanced label distribution, because some classes of emotions like "disgusted" are relatively rare compared to other labels like "happy" or "sad". In this paper, we propose a data augmentation method using generative adversarial networks (GAN). It can complement and complete the data manifold and find better margins between neighboring classes. Specifically, we design a framework with a CNN model as the classifier and a cycle-consistent adversarial network (CycleGAN) as the generator. In order to avoid the gradient vanishing problem, we employ the least-squared loss as adversarial loss. We also propose several evaluation methods on three benchmark datasets to validate GAN's performance. Empirical results show that we can obtain a 5% to 10% increase in classification accuracy after employing the GAN-based data augmentation techniques.
Published: 2017
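
The abstract above says a least-squared loss is used as the adversarial loss to avoid vanishing gradients, but gives no formula. The sketch below follows the commonly used least-squares GAN formulation with target 1 for real samples and 0 for generated ones; those targets and the function names are assumptions, and the paper's exact variant may differ.

```python
def discriminator_ls_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: real scores toward 1, fake scores toward 0."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def generator_ls_loss(fake_scores):
    """Least-squares generator loss: wants the discriminator to score fakes as 1."""
    return 0.5 * sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


# Toy discriminator outputs (illustrative values only).
print(discriminator_ls_loss([0.9, 1.1], [0.1, -0.1]))
print(generator_ls_loss([0.2, 0.4]))
```

Because the penalty grows quadratically with the distance from the target, samples the discriminator already classifies confidently still contribute gradient, which is the usual argument for this loss over the sigmoid cross-entropy form.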

23. Motif Iteration Model for Network Representation

Author: Lv, Lintao, Qin, Zengchang, and Wan, Tao
Subjects: Computer Science - Social and Information Networks
Abstract: Social media mining has become one of the most popular research areas in Big Data with the explosion of social networking information from Facebook, Twitter, LinkedIn, Weibo and so on. Understanding and representing the structure of a social network is key to social media mining. In this paper, we propose the Motif Iteration Model (MIM) to represent the structure of a social network. As the name suggests, the new model is based on iteration of basic network motifs. In order to better show the properties of the model, a heuristic and greedy algorithm called Vertex Reordering and Arranging (VRA) is proposed by studying the adjacency matrix of the three-vertex undirected network motifs. The algorithm maps the adjacency matrix of a network to a binary image, which offers a new perspective on network structure visualization. In summary, this model provides a useful approach towards building a link between images and networks and offers a new way of representing the structure of a social network., Comment: 10 pages, 3 figures; an extended version of our conference paper in ICONIP 2017
Published: 2017

24. Logical Parsing from Natural Language Based on a Neural Translation Model

Author: Li, Liang, Li, Pengyu, Liu, Yifan, Wan, Tao, and Qin, Zengchang
Subjects: Computer Science - Computation and Language
Abstract: Semantic parsing has emerged as a significant and powerful paradigm for natural language interface and question answering systems. Traditional methods of building a semantic parser rely on high-quality lexicons, hand-crafted grammars and linguistic features which are limited by applied domain or representation. In this paper, we propose a general approach to learn from denotations based on Seq2Seq model augmented with attention mechanism. We encode input sequence into vectors and use dynamic programming to infer candidate logical forms. We utilize the fact that similar utterances should have similar logical forms to help reduce the searching space. Under our learning policy, the Seq2Seq model can learn mappings gradually with noises. Curriculum learning is adopted to make the learning smoother. We test our method on the arithmetic domain which shows our model can successfully infer the correct logical forms and learn the word meanings, compositionality and operation orders simultaneously.
Published: 2017

25. Generative Cooperative Net for Image Generation and Data Augmentation

Author: Xu, Qiangeng, Qin, Zengchang, and Wan, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: How to build a good model for image generation given an abstract concept is a fundamental problem in computer vision. In this paper, we explore a generative model for the task of generating unseen images with desired features. We propose the Generative Cooperative Net (GCN) for image generation. The idea is similar to generative adversarial networks except that the generators and discriminators are trained to work accordingly. Our experiments on hand-written digit generation and facial expression generation show that GCN's two cooperative counterparts (the generator and the classifier) can work together nicely and achieve promising results. We also discovered a use of such a generative model as a data-augmentation tool. Our experiment applying this method to a recognition task shows that it is very effective compared to other existing methods. It is easy to set up and could help generate a very large synthesized dataset., Comment: 12 pages, 8 figures
Published: 2017

26. Stock Volatility Prediction Using Recurrent Neural Networks with Sentiment Analysis

Author: Liu, Yifan, Qin, Zengchang, Li, Pengyu, and Wan, Tao
Subjects: Computer Science - Social and Information Networks, G.3
Abstract: In this paper, we propose a model to analyze the sentiment of online stock forums and use the information to predict stock volatility in the Chinese market. We have labeled the sentiment of the online financial posts and made the dataset publicly available for research. By generating a sentimental dictionary based on financial terms, we develop a model to compute the sentimental score of each online post related to a particular stock. Such sentimental information is represented by two sentiment indicators, which are fused with market data for stock volatility prediction using Recurrent Neural Networks (RNNs). An empirical study shows that, compared to using an RNN only, the model performs significantly better with sentimental indicators., Comment: 10 pages, 5 figures; an extended version of our conference paper in IEA/AIE 2017
Published: 2017

27. Auto-painter: Cartoon Image Generation from Sketch by Using Conditional Generative Adversarial Networks

Author: Liu, Yifan, Qin, Zengchang, Luo, Zhenbo, and Wang, Hua
Subjects: Computer Science - Computer Vision and Pattern Recognition, I.4.9, I.4.8, I.3.3
Abstract: Recently, realistic image generation using deep neural networks has become a hot topic in machine learning and computer vision. Images can be generated at the pixel level by learning from a large collection of images. Learning to generate colorful cartoon images from black-and-white sketches is not only an interesting research problem, but also a potential application in digital entertainment. In this paper, we investigate the sketch-to-image synthesis problem by using conditional generative adversarial networks (cGAN). We propose the auto-painter model, which can automatically generate compatible colors for a sketch. The new model is not only capable of painting hand-drawn sketches with proper colors, but also allows users to indicate preferred colors. Experimental results on two sketch datasets show that the auto-painter performs better than existing image-to-image methods., Comment: 12 pages, 7 figures
Published: 2017

28. Random Neural Graph Generation with Structure Evolution

Author: Zhou, Yuguang, He, Zheng, Wan, Tao, Qin, Zengchang, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mantoro, Teddy, editor, Lee, Minho, editor, Ayu, Media Anugerah, editor, Wong, Kok Wai, editor, and Hidayanto, Achmad Nizar, editor
Published: 2021
Full Text: View/download PDF

29. Robust Lightweight Depth Estimation Model via Data-Free Distillation

Author: Gao, Zihan, primary, Gao, Peng, additional, Yin, Wei, additional, Liu, Yifan, additional, and Qin, Zengchang, additional
Published: 2024
Full Text: View/download PDF

30. A Deep Learning Model for Early Prediction of Sepsis from Intensive Care Unit Records

Author: Zhao, Rui, Wan, Tao, Li, Deyu, Zhang, Zhengbo, Qin, Zengchang, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Yang, Haiqin, editor, Pasupa, Kitsuchart, editor, Leung, Andrew Chi-Sing, editor, Kwok, James T., editor, Chan, Jonathan H., editor, and King, Irwin, editor
Published: 2020
Full Text: View/download PDF

31. Many-to-One Stable Matching for Prediction in Social Networks

Author: Dong, Ke, Qin, Zengchang, Wan, Tao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Fujita, Hamido, editor, Fournier-Viger, Philippe, editor, Ali, Moonis, editor, and Sasaki, Jun, editor
Published: 2020
Full Text: View/download PDF

32. Automatic vessel segmentation in X-ray angiogram using spatio-temporal fully-convolutional neural network

Author: Wan, Tao, Chen, Jianhui, Zhang, Zhonghua, Li, Deyu, and Qin, Zengchang
Published: 2021
Full Text: View/download PDF

33. Stable Matching with Incomplete Information in Structured Networks

Author: Ling, Ying, Wan, Tao, and Qin, Zengchang
Subjects: Computer Science - Computer Science and Game Theory, Computer Science - Social and Information Networks, Physics - Physics and Society
Abstract: In this paper, we investigate stable matching in structured networks. Consider the case of matching in social networks where candidates are not fully connected. Which candidates on the heterogeneous side a candidate on one side of the market becomes acquainted with depends on the structured network. We explore four commonly used network structures and define the social circle by the distance between candidates. When matching within a social circle, the equilibria differ from one another because each social network's topology differs. The equilibrium changes with the topology of each network, and it always converges to the same stable outcome as the complete-information algorithm if nothing blocks an agent from reaching anyone in its social circle., Comment: 13 pages; 6 figures
Published: 2015

34. Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Author: Yu, Jing, Yang, Chenghao, Qin, Zengchang, Yang, Zhuoqian, Hu, Yue, Shi, Zhiguo, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Douligeris, Christos, editor, Karagiannis, Dimitris, editor, and Apostolou, Dimitris, editor
Published: 2019
Full Text: View/download PDF

35. Generative Cooperative Net for Image Generation and Data Augmentation

Author: Xu, Qiangeng, Qin, Zengchang, Wan, Tao, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Seki, Hirosato, editor, Nguyen, Canh Hao, editor, Huynh, Van-Nam, editor, and Inuiguchi, Masahiro, editor
Published: 2019
Full Text: View/download PDF

36. Robust nuclei segmentation in histopathology using ASPPU-Net and boundary refinement

Author: Wan, Tao, Zhao, Lei, Feng, Hongxiang, Li, Deyu, Tong, Chao, and Qin, Zengchang
Published: 2020
Full Text: View/download PDF

37. Cross-modal learning with prior visual relation knowledge

Author: Yu, Jing, Zhang, Weifeng, Yang, Zhuoqian, Qin, Zengchang, and Hu, Yue
Published: 2020
Full Text: View/download PDF

38. Multimodal feature fusion by relational reasoning and attention for visual question answering

Author: Zhang, Weifeng, Yu, Jing, Hu, Hua, Hu, Haiyang, and Qin, Zengchang
Published: 2020
Full Text: View/download PDF

39. Logical Parsing from Natural Language Based on a Neural Translation Model

Author: Li, Liang, Liu, Yifan, Qin, Zengchang, Li, Pengyu, Wan, Tao, Barbosa, Simone Diniz Junqueira, Series Editor, Chen, Phoebe, Series Editor, Filipe, Joaquim, Series Editor, Kotenko, Igor, Series Editor, Sivalingam, Krishna M., Series Editor, Washio, Takashi, Series Editor, Yuan, Junsong, Series Editor, Zhou, Lizhu, Series Editor, Hasida, Kôiti, editor, and Pa, Win Pa, editor
Published: 2018
Full Text: View/download PDF

40. Emotion Classification with Data Augmentation Using Generative Adversarial Networks

Author: Zhu, Xinyue, Liu, Yifan, Li, Jiahong, Wan, Tao, Qin, Zengchang, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Phung, Dinh, editor, Tseng, Vincent S., editor, Webb, Geoffrey I., editor, Ho, Bao, editor, Ganji, Mohadeseh, editor, and Rashidi, Lida, editor
Published: 2018
Full Text: View/download PDF

41. Text Generation Based on Generative Adversarial Nets with Latent Variables

Author: Wang, Heng, Qin, Zengchang, Wan, Tao, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Phung, Dinh, editor, Tseng, Vincent S., editor, Webb, Geoffrey I., editor, Ho, Bao, editor, Ganji, Mohadeseh, editor, and Rashidi, Lida, editor
Published: 2018
Full Text: View/download PDF

42. Improved Nuclear Segmentation on Histopathology Images Using a Combination of Deep Learning and Active Contour Model

Author: Zhao, Lei, Wan, Tao, Feng, Hongxiang, Qin, Zengchang, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Cheng, Long, editor, Leung, Andrew Chi Sing, editor, and Ozawa, Seiichi, editor
Published: 2018
Full Text: View/download PDF

43. Accurate segmentation of overlapping cells in cervical cytology with deep convolutional neural networks

Author: Wan, Tao, Xu, Shusong, Sang, Chen, Jin, Yulan, and Qin, Zengchang
Published: 2019
Full Text: View/download PDF

44. Random Neural Graph Generation with Structure Evolution

Author: Zhou, Yuguang, primary, He, Zheng, additional, Wan, Tao, additional, and Qin, Zengchang, additional
Published: 2021
Full Text: View/download PDF

45. Automated identification and grading of coronary artery stenoses with X-ray angiography

Author: Wan, Tao, Feng, Hongxiang, Tong, Chao, Li, Deyu, and Qin, Zengchang
Published: 2018
Full Text: View/download PDF

46. Many-to-One Stable Matching for Prediction in Social Networks

Author: Dong, Ke, primary, Qin, Zengchang, additional, and Wan, Tao, additional
Published: 2020
Full Text: View/download PDF

47. A Radiomics Approach for Automated Identification of Aggressive Tumors on Combined PET and Multi-parametric MRI

Author: Wan, Tao, Cui, Bixiao, Wang, Yaping, Qin, Zengchang, Lu, Jie, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Liu, Derong, editor, Xie, Shengli, editor, Li, Yuanqing, editor, Zhao, Dongbin, editor, and El-Alfy, El-Sayed M., editor
Published: 2017
Full Text: View/download PDF

48. A Bayesian Model of Game Decomposition

Author: Zhao, Hanqing, Qin, Zengchang, Liu, Weijia, Wan, Tao, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Benferhat, Salem, editor, Tabia, Karim, editor, and Ali, Moonis, editor
Published: 2017
Full Text: View/download PDF

49. Automated coronary artery tree segmentation in X-ray angiography using improved Hessian based enhancement and statistical region merging

Author: Wan, Tao, Shang, Xiaoqing, Yang, Weilin, Chen, Jianhui, Li, Deyu, and Qin, Zengchang
Published: 2018
Full Text: View/download PDF

50. Bayesian Methods Based on Label Semantics

Author: Qin, Zengchang and Tang, Yongchuan
Published: 2014
Full Text: View/download PDF
