105 results on '"Yueting Zhuang"'
Search Results
2. Slimmable Domain Adaptation
- Author
-
Rang Meng, Weijie Chen, Shicai Yang, Jie Song, Luojun Lin, Di Xie, Shiliang Pu, Xinchao Wang, Mingli Song, and Yueting Zhuang
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition
- Abstract
Vanilla unsupervised domain adaptation methods tend to optimize the model with a fixed neural architecture, which is not very practical in real-world scenarios since the target data is usually processed by different resource-limited devices. It is therefore necessary to facilitate architecture adaptation across various devices. In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs. The main challenge in this framework lies in simultaneously boosting the adaptation performance of numerous models in the model bank. To tackle this problem, we develop a Stochastic EnsEmble Distillation method to fully exploit the complementary knowledge in the model bank for inter-model interaction. Nevertheless, considering the optimization conflict between inter-model interaction and intra-model adaptation, we augment the existing bi-classifier domain confusion architecture into an Optimization-Separated Tri-Classifier counterpart. After optimizing the model bank, architecture adaptation is leveraged via our proposed Unsupervised Performance Evaluation Metric. Under various resource constraints, our framework surpasses other competing approaches by a very large margin on multiple benchmarks. It is also worth emphasizing that our framework can preserve the performance improvement against the source-only model even when the computing complexity is reduced to $1/64$. Code will be available at https://github.com/hikvision-research/SlimDA. To appear in CVPR 2022.
- Published
- 2022
3. Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
- Author
-
Juncheng Li, Junlin Xie, Long Qian, Linchao Zhu, Siliang Tang, Fei Wu, Yi Yang, Yueting Zhuang, and Xin Eric Wang
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition
- Abstract
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence. Thanks to the semantic diversity of natural language descriptions, temporal grounding allows activity grounding beyond pre-defined classes and has received increasing attention in recent years. The semantic diversity is rooted in the principle of compositionality in linguistics, where novel semantics can be systematically described by combining known words in novel ways (compositional generalization). However, current temporal grounding datasets do not specifically test for compositional generalizability. To systematically measure the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG. Evaluating the state-of-the-art methods on our new dataset splits, we empirically find that they fail to generalize to queries with novel combinations of seen words. To tackle this challenge, we propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies and learns fine-grained semantic correspondence among them. Experiments illustrate the superior compositional generalizability of our approach. The repository of this work is at https://github.com/YYJMJC/Compositional-Temporal-Grounding. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022.
- Published
- 2022
4. Simulation-and-Mining: Towards Accurate Source-Free Unsupervised Domain Adaptive Object Detection
- Author
-
Peng Yuan, Weijie Chen, Shicai Yang, Yunyi Xuan, Di Xie, Yueting Zhuang, and Shiliang Pu
- Published
- 2022
5. Learning to Learn by Jointly Optimizing Neural Architecture and Weights
- Author
-
Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, and Qi Tian
- Abstract
Meta-learning enables models to adapt to new environments rapidly with a few training examples. Current gradient-based meta-learning methods concentrate on finding good model-agnostic initialization (meta-weights) for learners. In this paper, we aim to obtain better meta-learners by co-optimizing the architecture and meta-weights simultaneously. Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture. However, this two-stage strategy would break the mutual impact of the architecture and meta-weights since they are optimized separately. Differently, we propose progressive connection consolidation, fixing the architecture layer by layer, in which the layer with the largest weight value would be fixed first. In this way, we can jointly search architectures and train the meta-weights on fixed layers. Besides, to improve the generalization performance of the searched meta-learner on all tasks, we propose a more effective rule for co-optimization, namely Connection-Adaptive Meta-learning (CAML). By searching only once, we can obtain both adaptive architecture and meta-weights for meta-learning. Extensive experiments show that our method achieves state-of-the-art performance with 3x less computational cost, revealing our method's effectiveness and efficiency.
- Published
- 2022
6. Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
- Author
-
Juncheng Li, Siliang Tang, Linchao Zhu, Haochen Shi, Xuanwen Huang, Fei Wu, Yi Yang, and Yueting Zhuang
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video-and-Language Inference is a recently proposed task for joint video-and-language understanding. This new task requires a model to infer whether a natural language statement entails or contradicts a given video clip. In this paper, we study how to address three critical challenges for this task: judging the global correctness of a statement involving multiple semantic meanings, joint reasoning over video and subtitles, and modeling long-range relationships and complex social interactions. First, we propose an adaptive hierarchical graph network that achieves in-depth understanding of the video over complex interactions. Specifically, it performs joint reasoning over video and subtitles in three hierarchies, where the graph structure is adaptively adjusted according to the semantic structures of the statement. Second, we introduce semantic coherence learning to explicitly encourage the semantic coherence of the adaptive hierarchical graph network from three hierarchies. Semantic coherence learning can further improve the alignment between vision and linguistics, and the coherence across a sequence of video segments. Experimental results show that our method outperforms the baseline by a large margin.
- Published
- 2021
7. Semi-supervised Active Learning for Semi-supervised Models: Exploit Adversarial Examples with Graph-based Virtual Labels
- Author
-
Jiannan Guo, Haochen Shi, Yangyang Kang, Kun Kuang, Siliang Tang, Zhuoren Jiang, Changlong Sun, Fei Wu, and Yueting Zhuang
- Published
- 2021
8. Discriminate and Reconstruct: Learning from Language Model to Answer Keyword Questions
- Author
-
Boyuan Pan, Yueting Zhuang, Yazheng Yang, and Deng Cai
- Subjects
Computer science ,business.industry ,media_common.quotation_subject ,computer.software_genre ,Task (project management) ,Question answering ,Reinforcement learning ,Quality (business) ,Language model ,Artificial intelligence ,business ,computer ,Natural language ,Sentence ,Natural language processing ,Simple (philosophy) ,media_common
- Abstract
We consider a new problem of question answering when the questions are in the form of keywords, rather than natural language. While searching on machines or interacting with robots, people usually prefer to raise a query with several keywords rather than a complete sentence. The new task of Keyword Question Answering (KQA) is challenging and significant because small variations to a question may completely change its semantic information and thus yield different answers. In this paper, we propose a simple but strong system for KQA composed of (1) a Keyword Question Discriminator to recognize the keyword questions, (2) a Question Reconstructor that learns from a language model to reconstruct the keyword questions and (3) a question answering model to produce answers. We further finetune the reconstructor via reinforcement learning, using the quality of the answers as the signal, to help generate answerable questions. Moreover, we develop a semi-supervised learning method to build the keyword question datasets. Empirical results demonstrate the effectiveness of our method in comparison with various baselines, and we also find that the high layers of the language model are helpful in handling grammatical blunders and semantic fuzziness.
- Published
- 2019
9. Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction
- Author
-
Jian Shao, Dejing Xu, Yueting Zhuang, Di Xie, Jun Xiao, and Zhou Zhao
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,Convolutional neural network ,Task (project management) ,k-nearest neighbors algorithm ,Categorization ,Dynamics (music) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,Representation (mathematics) ,business ,Feature learning ,0105 earth and related environmental sciences
- Abstract
We propose a self-supervised spatiotemporal learning technique which leverages the chronological order of videos. Our method can learn the spatiotemporal representation of the video by predicting the order of shuffled clips from the video. The category of the video is not required, which gives our technique the potential to take advantage of infinite unannotated videos. Related works use frames; compared to frames, clips are more consistent with the video dynamics. Clips can help to reduce the uncertainty of orders and are more appropriate for learning a video representation. 3D convolutional neural networks are used to extract features for clips, and these features are processed to predict the actual order. The learned representations are evaluated via nearest neighbor retrieval experiments. We also use the learned networks as pre-trained models and finetune them on the action recognition task. Three types of 3D convolutional neural networks are tested in experiments, and we gain large improvements compared to existing self-supervised methods.
- Published
- 2019
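Illustration for result 9: the abstract describes predicting the order of shuffled clips, which can be framed as classification over permutations. The sketch below shows only how such a training sample and its label might be built. The function names and the choice of a permutation index as the label are assumptions made for illustration, not details taken from the paper, and no 3D CNN is involved.

```python
import itertools
import random


def make_order_sample(clips, rng):
    """Shuffle clips and return (shuffled_clips, label).

    The label is the index of the applied permutation in the
    lexicographically ordered list of all permutations (an assumed encoding).
    """
    perms = list(itertools.permutations(range(len(clips))))
    label = rng.randrange(len(perms))
    order = perms[label]
    # Position `pos` of the shuffled list holds the clip at original index order[pos].
    shuffled = [clips[i] for i in order]
    return shuffled, label


def recover_order(shuffled, label):
    """Invert the shuffle given the (predicted) permutation label."""
    perms = list(itertools.permutations(range(len(shuffled))))
    order = perms[label]
    original = [None] * len(shuffled)
    for pos, src in enumerate(order):
        original[src] = shuffled[pos]
    return original
```

With three clips there are 3! = 6 possible orders, so a classifier trained on such samples would have six output classes. The number of classes grows factorially with the clip count.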
10. HTMVS: Visualizing hierarchical topics and their evolution
- Author
-
Si Li, Fei Wu, Dong Haoling, Siliang Tang, and Yueting Zhuang
- Subjects
Topic model ,Information retrieval ,Hierarchy (mathematics) ,business.industry ,Computer science ,Unstructured data ,Semantics ,computer.software_genre ,Visualization ,Information visualization ,Data visualization ,Transformation (function) ,Data mining ,business ,computer
- Abstract
Topic modeling has been an active research area for many years; it can be used for discovering latent semantics and finding hidden knowledge in unstructured data corpora. In this paper, we investigate the problems in visualizing hierarchical topics and their evolution. The contribution of this paper is threefold. First, we explore the static visualization of hierarchical topics using the ‘nested circle’ layout. Second, in order to present topic evolution over time, we extend a hierarchical topic model and employ topic transformation visualizations to track the arising, splitting and disappearing of certain topics under the dynamic topical hierarchy. Finally, a Hierarchical Topic Model Visualization System (HTMVS) is designed to take advantage of both static and dynamic hierarchical topic visualization.
- Published
- 2015
11. Attribute prediction with long-range interactions via path coding
- Author
-
Zhuhao Wang, Yueting Zhuang, Yahong Han, Fei Wu, Jiebo Luo, and Qi Tian
- Subjects
Attribute domain ,Data mining ,Flow network ,Directed acyclic graph ,computer.software_genre ,computer ,Mathematics ,Coding (social sciences)
- Abstract
Due to the describable or human-nameable nature of visual attributes, the appropriate utilization of attributes has received much attention in recent years in many applications. Motivated by the assumption that long-range interactions between attributes can boost image understanding and classification, this paper uses path coding to model the long-range interactions between attributes for attribute prediction; we call the approach attribute prediction via a path coding penalty (abbreviated as AP2CP). AP2CP not only introduces structured sparsity penalties over paths on a directed acyclic graph, but also captures the intrinsic long-range dependent interactions between attributes. The proposed AP2CP can be efficiently solved by leveraging network flow optimization. The experiments show that the proposed AP2CP achieves better performance in attribute prediction.
- Published
- 2014
12. Geo-informative discriminative image representation by semi-supervised hierarchical topic modeling
- Author
-
Siliang Tang, Yueting Zhuang, Weiming Lu, Jian Shao, and Li Zijian
- Subjects
Topic model ,Hierarchy ,Information retrieval ,business.industry ,Computer science ,Semantics ,Machine learning ,computer.software_genre ,Data modeling ,Tree (data structure) ,Categorization ,Discriminative model ,Feature (computer vision) ,Artificial intelligence ,business ,computer
- Abstract
Nowadays, the prevalence of sharing tourist photos in online communities has created an increasing demand for mining discriminative architecture aspects from historic landmarks. Previous research has demonstrated that topic models can discover discriminative features represented by meaningful visual-topics. However, it has seldom exploited the indicative function of geo-tags and the hierarchy in architecture characteristics. In order to utilize this information, we propose a semi-supervised hierarchical topic modeling approach (namely, shTM). In our approach, every image can be represented by a probability distribution over selected geo-related visual-topics from a partly randomized topic tree. We evaluated our approach on a real-world dataset with over 26 thousand geo-informative photos from Flickr. Experiments show that shTM topics can reveal more discriminative aspects of a specific architecture than other well-known image features, such as HOG and SIFT, on the tasks of automatic photo categorization and geographical information retrieval.
- Published
- 2014
13. Digital Library Engine: Adapting Digital Library for Cloud Computing
- Author
-
Liangju Zheng, Yueting Zhuang, Baogang Wei, Jian Shao, and Weiming Lu
- Subjects
World Wide Web ,Multitenancy ,Service (systems architecture) ,business.industry ,Computer science ,Quality of service ,Reliability (computer networking) ,Scalability ,Cloud computing ,Digital library ,Resource management (computing) ,business
- Abstract
With the rapid growth of digital libraries, more data and smart services are involved. People have come to recognize the importance of digital libraries and the convenience they might bring to society. However, the cost of owning a digital library is quite high, and many institutions do not have the ability to run and maintain a digital library by themselves, especially for massive data and complex services which require large amounts of storage and computing resources. In this paper, we propose the Digital Library Engine, which aims to provide a new Platform as a Service (PaaS) for rapidly developing and deploying digital libraries in the cloud. To the best of our knowledge, it is the first work to create a PaaS system for digital libraries. With the help of the Digital Library Engine, institutions only need to develop some service bundles, which can be deployed in the engine, and then their own digital libraries can run with the features of scalability, reliability, security, extensibility, availability and manageability. The practice in CADAL and the experiments demonstrate the feasibility and efficiency of our engine.
- Published
- 2013
14. Nonnegative Matrix Factorization for Multimodality Data from Multi-source Domain
- Author
-
Jian Shao, Fei Wu, Shuai Ma, Yueting Zhuang, and Xu Tan
- Subjects
Information retrieval ,Group method of data handling ,Computer science ,business.industry ,Machine learning ,computer.software_genre ,Non-negative matrix factorization ,Multimodality ,Domain (software engineering) ,Automatic image annotation ,Artificial intelligence ,business ,Image retrieval ,computer ,Subspace topology ,Multi-source
- Abstract
With the growing popularity of social tagging, more and more images are annotated by users on web sites (e.g., Flickr, Blogspace and YouTube). Since the tags annotated by users are often noisy, ambiguous, and subjective, it is beneficial to fully utilize the multiview information and borrow strength from multiple data sources to boost the performance of image annotation and tag-based image retrieval. Therefore, the appropriate integration and utilization of complementary cues from multiple modalities in multiple data sources is an important research topic. Inspired by recent advances in multiview learning and shared subspace learning, this paper proposes an approach, namely Multimodality Multi-source Nonnegative Matrix Factorization (M2NMF), to learn shared and corresponding individual structures from multimodality data via nonnegative matrix factorization. The experimental results demonstrate the feasibility and effectiveness of the proposed approach.
- Published
- 2012
15. Graph-guided sparse reconstruction for region tagging
- Author
-
Yahong Han, Yueting Zhuang, Qi Tian, Fei Wu, and Jian Shao
- Subjects
Correlation ,Computer science ,business.industry ,Graph (abstract data type) ,Pattern recognition ,Graph theory ,Artificial intelligence ,Iterative reconstruction ,Image segmentation ,business ,Visualization
- Abstract
Many contextual correlations co-exist within the segmented regions among images, such as the visual context and the semantic context. The appropriate integration and utilization of such contexts are very important to boost the performance of region tagging. Inspired by recent advances in sparse reconstruction methods, this paper proposes an approach called Graph-Guided Sparse Reconstruction for Region Tagging (G2SRRT). G2SRRT consists of two steps: sparse reconstruction for testing regions and tag propagation from training regions to testing regions. In G2SRRT, a graph is constructed to flexibly model the contextual correlations among regions. To integrate the graph structure learned from training regions into the sparse reconstruction, we define a Graph-Guided Fusion (G2F) penalty over the graph to encourage the sparsity of differences between two reconstruction coefficients that correspond to linked regions in the graph. Guided by this G2F penalty, the highly correlated regions tend to be jointly selected for the reconstruction, which results in better region tagging performance. Experiments on three open benchmark image datasets demonstrate the effectiveness of the proposed algorithm.
- Published
- 2012
16. Tag Clustering and Refinement on Semantic Unity Graph
- Author
-
Fei Wu, Yueting Zhuang, Yang Liu, Yin Zhang, and Jian Shao
- Subjects
Information retrieval ,User experience design ,Computer science ,business.industry ,Graph (abstract data type) ,Graph theory ,Polysemy ,Cluster analysis ,business ,Image retrieval
- Abstract
Recently, there has been extensive research on the user-provided tags on photo sharing websites, which can greatly facilitate image retrieval and management. However, due to the arbitrariness of tagging activities, these tags are often imprecise and incomplete. As a result, quite a few technologies have been proposed to improve the user experience on these photo sharing systems, including tag clustering and refinement. In this work, we propose a novel framework to model the relationships among tags and images which can be applied to many tag-based applications. Different from previous approaches which model images and tags as heterogeneous objects, images and their tags are uniformly viewed as compositions of Semantic Unities in our framework. A Semantic Unity Graph (SUG) is then introduced to represent the complex and high-order relationships among these Semantic Unities. Based on the Semantic Unity Graph representation, the relevance of images and tags can be naturally measured in terms of the similarity of their Semantic Unities. Tag clustering and refinement can then be performed on the SUG, and the polysemy of images and tags is explicitly considered in this framework. Experimental results on the NUS-WIDE and MIR-Flickr datasets demonstrate the effectiveness and efficiency of the proposed approach.
- Published
- 2011
17. Inverse-degree Sampling for Spectral Clustering
- Author
-
Yueting Zhuang, Haidong Gao, Fei Wu, and Jian Shao
- Subjects
Fuzzy clustering ,business.industry ,Correlation clustering ,Constrained clustering ,Sampling (statistics) ,Machine learning ,computer.software_genre ,Data stream clustering ,CURE data clustering algorithm ,Canopy clustering algorithm ,Artificial intelligence ,business ,Cluster analysis ,computer ,Algorithm ,Mathematics
- Abstract
Among classical clustering algorithms, spectral clustering performs much better than K-means in most cases. However, because of its cubic time complexity, spectral clustering is rarely used for clustering large-scale data sets. Therefore, sampling-based methods such as the Nyström method and column sampling have been proposed as potential approaches to tackle this challenge. Current sampling-based methods often use uniform or other random sampling policies to select representative data and tend to disregard the data in small clusters. This paper proposes an unbiased sampling framework, derives a new sampling method called inverse-degree sampling, and then introduces an entropy criterion to give a simple theoretical justification for it. With representative data selected by inverse-degree sampling, the time complexity of spectral clustering becomes quadratic. Experiments on both toy data and real-world data demonstrate both good sampling performance and comparable clustering quality.
- Published
- 2011
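Illustration for result 17: the abstract names inverse-degree sampling but does not define it. The sketch below assumes the simplest reading: each point is drawn with probability proportional to the inverse of its degree in the affinity graph, so points in small, sparsely connected clusters are less likely to be overlooked. The function names, the dense-list affinity format, and the without-replacement scheme are all hypothetical choices, not the authors' formulation.

```python
import random


def inverse_degree_probs(affinity):
    """Return sampling probabilities proportional to 1 / degree.

    `affinity` is a symmetric list-of-lists with positive diagonal entries,
    so every degree (row sum) is strictly positive.
    """
    degrees = [sum(row) for row in affinity]
    inverse = [1.0 / d for d in degrees]
    total = sum(inverse)
    return [w / total for w in inverse]


def sample_landmarks(affinity, k, rng):
    """Draw k distinct point indices using inverse-degree weights."""
    probs = inverse_degree_probs(affinity)
    remaining = list(range(len(affinity)))
    chosen = []
    for _ in range(k):
        weights = [probs[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen
```

On a toy graph with one fully connected four-point cluster and one isolated point, the isolated point receives a larger sampling probability than any member of the large cluster, which is the behaviour the abstract contrasts with uniform sampling.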
18. Automatic annotation of geo-information in panoramic street view by image retrieval
- Author
-
Fei Wu, Yueting Zhuang, and Ming Chen
- Subjects
Information retrieval ,Panorama ,Pixel ,Computer science ,business.industry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Visualization ,Motion estimation ,Computer vision ,Artificial intelligence ,Cluster analysis ,business ,Image retrieval
- Abstract
Panoramic street view is becoming a popular service in digital maps because it conveniently supports virtual walk-throughs. Recently, some geo-tagged photos have been added to street view as additional illustration images for the same scenes. However, these images come directly from photo sharing websites where users manually tag the locations of their uploaded images; the quality of the annotated locations is very poor. In this paper, we propose a system that annotates the location tags of users' images in panoramic street view using only visual features. By searching for the exemplar region that best matches the query image, the system can annotate the query image with the geo-information provided by the matched exemplar image. To boost matching performance, various techniques such as motion estimation, visual clustering and feature matching are implemented in our system. The experiments show satisfactory performance and promising results.
- Published
- 2010
19. Sparse representation using nonnegative curds and whey
- Author
-
Zhihua Zhang, Yanan Liu, Shuicheng Yan, Yueting Zhuang, and Fei Wu
- Subjects
Caltech 101 ,Standard test image ,business.industry ,Pattern recognition ,Artificial intelligence ,Sparse approximation ,business ,Linear combination ,Linear discriminant analysis ,Mathematics ,Non-negative matrix factorization ,Matrix decomposition ,Sparse matrix
- Abstract
Finding sparse and/or nonnegative representations has been of great interest in the computer vision literature. In this paper we propose a novel method for this purpose and refer to it as nonnegative curds and whey (NNCW). The NNCW procedure consists of two stages. In the first stage we consider a set of sparse and nonnegative representations of a test image, each of which is a linear combination of the images within a certain class, by solving a set of regression-type nonnegative matrix factorization problems. In the second stage we incorporate these representations into a new sparse and nonnegative representation by using the group nonnegative garrote. This procedure is particularly appropriate for discriminant analysis owing to its supervised and nonnegative nature in pursuing sparsity. Experiments on several benchmark face databases and the Caltech 101 image dataset demonstrate the efficiency and effectiveness of our nonnegative curds and whey method.
- Published
- 2010
20. Web image interpretation: semi-supervised mining annotated words
- Author
-
Hanwang Zhang, Fei Wu, Dingyi Xia, Yueting Zhuang, and Wenhao Liu
- Subjects
Information retrieval ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,WordNet ,computer.software_genre ,Semantics ,Visualization ,Annotation ,Search engine ,Text mining ,Automatic image annotation ,Ranking ,Web page ,Artificial intelligence ,business ,computer ,Image retrieval ,Natural language processing ,Natural language
- Abstract
An image is worth a thousand words. Automatic web image annotation is a practical and effective way to support both web image retrieval and image understanding. However, it is very difficult for current annotation techniques to produce natural language interpretations of images such as “pandas eat bamboo”. In this paper, we propose an approach to interpret image semantics through semi-supervised mining of annotated words. The approach consists of three parts: first, the visibility of the annotated words of the target image is calculated by a semi-supervised learning approach from the landmark words in WordNet; then the annotated words are used as queries to retrieve matched web pages; finally, the meaningful sentences in the matched web pages are ranked as the interpretation of the target image by a semi-supervised learning approach. Experiments conducted on real-world web images demonstrate the effectiveness of the proposed approach.
- Published
- 2009
21. Face inpainting by feature guidance
- Author
-
Timothy K. Shih, Joseph C. Tsai, Yueting Zhuang, Nick C. Tang, and Yushun Wang
- Subjects
Pixel ,business.industry ,Computer science ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Inpainting ,Pattern recognition ,Iterative reconstruction ,Facial recognition system ,Feature (computer vision) ,Face (geometry) ,Computer vision ,Artificial intelligence ,business ,Image restoration
- Abstract
Face images that are partially occluded or damaged can be repaired automatically. We propose a new inpainting algorithm, based on patch guidance deduced from an existing face database, to recover the damaged portions. This newly proposed guided inpainting method produces seamless faces with hardly any visible defects. Examples of our results can be retrieved from http://member.mine.tku.edu.tw/www/ISCAS09/.
- Published
- 2009
22. Knowledge Approximation and Rule Acquisition Based on VPRS in Ordered Information Systems
- Author
-
Shen-Ming Gu, Xiang-Qing Zhao, and Yueting Zhuang
- Subjects
Algebra ,Approximation theory ,Dominance relation ,Information system ,Data mining ,Rough set ,computer.software_genre ,Decision table ,computer ,Knowledge acquisition ,Variable precision ,Mathematics
- Abstract
This paper deals with the knowledge approximation and rule acquisition based on variable precision rough sets in ordered information systems. The concepts of upward $\beta$-lower approximation, upward $\beta$-upper approximation, downward $\beta$-lower approximation and downward $\beta$-upper approximation based on dominance relation are introduced, and an approach to rule acquisition is also proposed in ordered decision tables with an illustrative example.
- Published
- 2009
23. Clustering by evidence accumulation on affinity propagation
- Author
-
Xuqing Zhang, Fei Wu, and Yueting Zhuang
- Subjects
Theoretical computer science ,Fuzzy clustering ,business.industry ,Correlation clustering ,Single-linkage clustering ,Constrained clustering ,CURE data clustering algorithm ,Canopy clustering algorithm ,Affinity propagation ,Artificial intelligence ,business ,Cluster analysis ,Algorithm ,Mathematics
- Abstract
Affinity propagation (AP) is a clustering algorithm which performs much better than traditional clustering approaches such as the k-means algorithm. In this paper, we present an algorithm called voting partition affinity propagation (voting-PAP), which is a method for clustering using evidence accumulation based on AP. The clusters produced by voting-PAP are not constrained to be hyper-spherically shaped. Voting-PAP consists of three parts: partition affinity propagation (PAP), relaxed multi-root minimum spanning tree (MST) and majority voting. PAP is a method which can produce different exemplar sets based on AP. Relaxed multi-root MST is a data point assignment algorithm which performs better than the nearest-assignment rule. Majority voting is a scheme used to find a consistent clustering result from different partitions based on the idea of evidence accumulation. We also discuss how to find an appropriate threshold corresponding to an approximately ideal consistent partition.
- Published
- 2008
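Illustration for result 23: the majority-voting step described in the abstract follows the general idea of evidence accumulation, where several partitions of the same data vote on whether two points belong together. The sketch below implements only that generic idea. It is not voting-PAP: it includes neither affinity propagation nor the relaxed multi-root MST, and the function names and the default 0.5 threshold are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_clusters(partitions, threshold=0.5):
    """Link points whose co-association exceeds the threshold (majority vote)
    and return the connected components as consensus cluster labels."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

Because components are formed by chaining pairwise links, the resulting clusters need not be compact or spherical, which matches the abstract's remark about cluster shape. Raising the threshold demands stronger agreement and tends to split clusters.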
24. Expressive 3D face synthesis by multi-space modeling
- Author
-
Jun Xiao, Yushun Wang, Yujie Wang, and Yueting Zhuang
- Subjects
Facial expression ,Face hallucination ,business.industry ,Facial motion capture ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Facial recognition system ,Motion capture ,Face (geometry) ,Computer vision ,Artificial intelligence ,business ,Computer facial animation ,Computer animation ,ComputingMethodologies_COMPUTERGRAPHICS
- Abstract
Parameterization of facial expressions is essential in generating vivid virtual avatars. This paper proposes a novel 3D parameterization method of facial images for expressive face synthesis by multi-space modeling. Given a face photograph, we first build a 3D face model by synthesizing in the facial shape space. The selected key expressions from the facial expression space are then transferred to the newly synthesized facial shape. Finally, to produce accurate timing of 3D facial animation, the blending coefficients of each frame are estimated in the individual blend coefficients' space with regard to motion capture data. By utilizing the advantages of multiple spaces, i.e. facial shape space, facial expression space and individual blend coefficients' space, our algorithm provides an effective parameterization solution for facial images. The experiments show that our method can produce promising expressive 3D faces even for people whose face has appeared only once in cyberspace.
- Published
- 2008
25. Adaptive and compact shape descriptor by progressive feature combination and selection with boosting
- Author
-
Fei Wu, Cheng Chen, Jun Xiao, and Yueting Zhuang
- Subjects
Boosting (machine learning) ,business.industry ,Computation ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Statistical classification ,Effective method ,Artificial intelligence ,AdaBoost ,business ,Linear combination ,Shape analysis (digital geometry) ,Mathematics - Abstract
Many types of shape descriptors have been proposed for 2D shape analysis, but most of them consist of component features that are not adapted to specific problems. This has two drawbacks. First, computation is wasted on the irrelevant components; second, the accuracy is impaired. This paper proposes an effective method that generates compact descriptors adapted to the specific problem at hand, where each component of the new descriptor is a linear combination of the components in some classic descriptors. A progressive strategy is used to construct and select the most suitable linear combinations in successive rounds, where a variant of AdaBoost is employed to ensure the optimality of the selected combinations in each round. Experiments show that our method effectively generates adaptive and compact descriptors for typical applications such as shape classification and retrieval.
- Published
- 2008
26. Speeding Up Similarity Queries over Large Chinese Calligraphic Character Databases Using Data Grid
- Author
-
Yi Zhuang, Qing Li, Fei Wu, and Yueting Zhuang
- Subjects
iDistance ,Query expansion ,Web search query ,Database ,Web query classification ,Computer science ,Node (computer science) ,Character encoding ,Sargable ,Data mining ,computer.software_genre ,Query optimization ,computer - Abstract
This paper proposes a novel data-grid-based k nearest neighbor query over large Chinese calligraphic character databases, which can significantly speed up retrieval. The query proceeds in three steps. First, when a user submits a query request to a query node, a process of character set reduction is performed using the iDistance index in different data nodes, followed by sending the candidate characters to the executing nodes through a package-based transfer technique. Second, a refinement process of the candidate characters is conducted in the executing nodes in parallel to get the answer set. Finally, the answer set is transferred to the query node. The proposed method incorporates a uniform-start-distance-based character data allocation policy and a character reduction algorithm. The analysis and experimental results show that the algorithm is effective in minimizing the response time by decreasing network transfer cost and increasing the parallelism of I/O and CPU.
- Published
- 2007
27. Video Motion Capture by Silhouette Analysis and Pose Optimization
- Author
-
Cheng Chen, Yueting Zhuang, Shicong Zhao, and Yin Cheng
- Subjects
Motion analysis ,business.industry ,Computer science ,3D reconstruction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Iterative reconstruction ,Motion capture ,Silhouette ,Camera auto-calibration ,Computer graphics (images) ,Computer vision ,Artificial intelligence ,Smart camera ,User interface ,business ,Camera resectioning - Abstract
Video based 3D reconstruction of human motion plays an important role in many applications. We implement a system that robustly reconstructs 3D human motion from markerless videos taken by a single camera. Our system only requires a desktop PC and a mainstream camera, and doesn't involve complex camera calibration, making it easy to implement and widely accessible in daily uses such as human computer interaction or entertainment.
- Published
- 2007
28. Adaptive Weight Selection for Incremental Eigen-Background Modeling
- Author
-
Yueting Zhuang and Jian Zhang
- Subjects
Background subtraction ,business.industry ,Computer science ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Motion detection ,Iterative reconstruction ,Object detection ,symbols.namesake ,Salient ,Robustness (computer science) ,Motion estimation ,symbols ,Computer vision ,Artificial intelligence ,business ,Gaussian process - Abstract
Background modeling is an important approach for motion detection. The background model should adapt to dynamic changes of the environment in time and generate a background image with no moving foreground. Accordingly, we propose to incorporate an adaptive weight selection mechanism for roughly detected motion regions into the incremental eigen-background method. Compared with existing work, we provide a novel way to reasonably design and adaptively compute the weight for each frame. Experiments show that the proposed adaptive incremental eigen-background method not only models the dynamic background scene well but also generates a better background image with no ghost effect when salient motion occurs.
- Published
- 2007
29. Efficient Silhouette Extraction with Dynamic Viewpoint
- Author
-
Yueting Zhuang and Cheng Chen
- Subjects
Background subtraction ,Computer science ,business.industry ,Feature extraction ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Nonlinear dimensionality reduction ,Manifold ,Silhouette ,Range (mathematics) ,Computer Science::Computer Vision and Pattern Recognition ,Computer vision ,Artificial intelligence ,business ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
A novel approach is proposed that extends the classical background subtraction method to extract silhouettes from videos in real time with dynamic viewpoint variation caused by camera movement. First, manifold learning is used to model the background under viewpoint variations. Then, for each new frame, the background image corresponding to the same viewpoint is synthesized on the fly by examining the local neighborhood on the manifold, and the silhouette is extracted via background subtraction. An extension is also presented to generate stabilized silhouettes at any fixed viewpoint within the training range. Experiments show that our approach can efficiently extract accurate silhouettes in complex situations while maintaining a low noise level.
- Published
- 2007
30. A Novel Scalable Texture Video Coding Scheme with GPCA
- Author
-
Yueting Zhuang, Jian Liu, Lei Yao, and Fei Wu
- Subjects
Motion compensation ,Computer science ,business.industry ,Quantization (signal processing) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Image texture ,Computer Science::Computer Vision and Pattern Recognition ,Principal component analysis ,Discrete cosine transform ,Computer vision ,Artificial intelligence ,business ,Quantization (image processing) ,Coding (social sciences) ,Data compression - Abstract
This paper proposes a novel SNR scalable coding method with the support of generalized principal component analysis (GPCA). This method encodes the low-pass and high-pass pictures generated by the MCTF decomposition with a hybrid linear model instead of the traditional block-based DCT. GPCA is a powerful tool to identify the hybrid linear model in the textures: it segments the texture into heterogeneous regions and then encodes each region with the PCA method. By keeping various proportions of PCA coefficients, and altering the quantization step sizes for different layers, a better scalable coding result can be achieved.
- Published
- 2007
31. Data-driven Generation of Decision Tree based on Ensemble Multiple-instance Learning for Motion Retrieval
- Author
-
Yueting Zhuang, Fei Wu, and Jian Xiang
- Subjects
Computer science ,business.industry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Decision tree ,Pattern recognition ,Image segmentation ,Motion (physics) ,Temporal database ,Data-driven ,Data retrieval ,Artificial intelligence ,business ,Image retrieval - Abstract
In this paper, a motion retrieval system is investigated from a multiple-instance learning view. In order to retrieve similar motion data, each human joint's motion clip is regarded as a bag, while each of its segments is regarded as an instance. First 3D temporal-spatial features and their keyspaces of each human joint are extracted. Then data driven decision trees based on ensemble multiple-instance are automatically constructed to reflect the influence of each point during the comparison of motion similarity. At last the method of multiple-instance retrieval is used to complete motion retrieval. Experimental results show that our approaches are effective for motion data retrieval.
- Published
- 2006
32. Filling Holes in Meshes and Recovering Sharp Edges
- Author
-
Tong-qiang Guo, Yueting Zhuang, Jijun Li, and Jian-guang Weng
- Subjects
General Relativity and Quantum Cosmology ,Astrophysics::High Energy Astrophysical Phenomena ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Triangulation (social science) ,Polygon mesh ,Geometry ,Dihedral angle ,Computational geometry ,MathematicsofComputing_DISCRETEMATHEMATICS ,ComputingMethodologies_COMPUTERGRAPHICS ,Mathematics - Abstract
An efficient approach to recover implicit sharp edges while filling holes in meshes is presented in this paper. First, the edges near the hole are classified as smooth or sharp edges by the dihedral angle between their adjacent triangles. The hole-nodes are classified as smooth or sharp nodes according to the types of their adjacent edges. Second, different methods are adopted to shrink the holes for different types of hole-nodes. A triangulation method is used for smooth hole-nodes, while an extension method is used for sharp hole-nodes. Third, the filled patch is refined for a smooth and uniform mesh preserving sharp edges logically.
- Published
- 2006
33. Learning Semantic Correlations for Cross-Media Retrieval
- Author
-
Fei Wu, Hong Zhang, and Yueting Zhuang
- Subjects
Information retrieval ,Computer science ,Relevance feedback ,Object (computer science) ,Image retrieval ,Subspace topology - Abstract
This paper proposes a novel cross-media retrieval approach. First, an isomorphic subspace is constructed based on canonical correlation analysis (CCA) to learn multi-modal correlations of media objects; second, polar coordinates are used to judge the general distance of media objects with different modalities in the subspace. Since the integrity of semantic correlations is not likely learned from limited training samples, users' relevance feedback is used to accurately refine cross-media similarities. We also propose methods to map new media objects into the learned subspace, and any new media object would be taken as query example. Experiment results show that our approaches are effective for cross-media retrieval, and meanwhile achieve a significant improvement over content-based image retrieval and content-based audio retrieval.
- Published
- 2006
34. An Approach for Cross-media Retrieval with Cross-reference Graph and PageRank
- Author
-
Fei Wu, Hanhuai Shan, and Yueting Zhuang
- Subjects
Information retrieval ,PageRank ,law ,Computer science ,Human–computer information retrieval ,Multimodal data ,Semantic relationship ,Graph (abstract data type) ,Relevance feedback ,Cross media ,Cross-reference ,law.invention - Abstract
In this paper, we propose a novel cross-media retrieval method. The most important feature of it is to integrate the multi-modal data seamlessly via a cross-reference graph, and then based on the graph, it is able to use improved personalized PageRank to calculate how close the media object associates with the query on semantic and content level. It is also able to adjust the cross-reference graph according to user's relevance feedback, which refines the semantic relationship between the media objects, so as to improve the retrieval accuracy progressively. As demonstrated by the experiments, our method achieves satisfactory retrieval efficiency on multi-modal datasets.
- Published
- 2006
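The abstract above ranks media objects with an improved personalized PageRank over a cross-reference graph. The paper's improvements are not described in this record, so the sketch below shows only standard personalized PageRank by power iteration. The function name, the damping factor of 0.85, and the toy graph with its node names are all illustrative assumptions.

```python
def personalized_pagerank(graph, query_nodes, damping=0.85, iterations=50):
    """Standard personalized PageRank (not the paper's improved variant).

    `graph` maps every node to a list of nodes it references.
    Random-walk restarts land only on `query_nodes`.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(query_nodes) if n in query_nodes else 0.0)
               for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: hand its mass back to the query nodes.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return rank


# Hypothetical cross-reference graph linking objects of different modalities.
graph = {
    "query_image": ["audio_a", "text_b"],
    "audio_a": ["query_image"],
    "text_b": ["query_image", "video_c"],
    "video_c": ["text_b"],
}
scores = personalized_pagerank(graph, {"query_image"})
```

Relevance feedback, as the abstract describes it, would correspond to editing `graph` (adding or removing cross-references) and recomputing the scores.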
35. A Grid-based Framework for Pervasive Cross-Media Retrieval
- Author
-
Hong Zhang, Yueting Zhuang, and Fei Wu
- Subjects
Distributed Computing Environment ,Ubiquitous computing ,Multimedia ,Grid computing ,Process (engineering) ,Computer science ,Information sharing ,Search engine indexing ,Digital library ,computer.software_genre ,Grid ,computer - Abstract
Cross-media retrieval is an emerging and interesting research problem, which aims to break through the restriction of modality during the retrieval process. One basic challenge for cross-media retrieval is the efficient management and pervasive computing of multimedia content in heterogeneous, dynamic and distributed environments. In this paper, Grid technology is introduced for its strength in wide-area information sharing and cooperation. Multimedia content is organized in a semantically coherent space. Efficient organization, indexing, and location of large, dispersed multimedia content are discussed. To address the problem that multi-discipline digital libraries are isolated and dispersed without sufficient interconnection, a Grid-based system framework and simulation platform are designed as a case study.
- Published
- 2006
36. Web based Chinese Calligraphy Learning with 3-D Visualization Method
- Author
-
Ying-fei Wu, Yunhe Pan, Jiangqin Wu, and Yueting Zhuang
- Subjects
Multimedia ,business.industry ,Computer science ,Writing process ,computer.software_genre ,Visualization ,Writing style ,Data visualization ,Calligraphy ,Handwriting recognition ,Web application ,business ,computer ,Natural language - Abstract
Chinese calligraphy is pictographic and each calligraphist has his own writing style. People often find it difficult to write in a desired, beautiful calligraphy style. To help people enjoy the art of calligraphy and learn how it is written step by step, we present a new approach that animates the writing process with a 3-D visualization method. In this paper, some novel algorithms used in the approach are presented to solve the following problems: 1) estimating the varying thickness of strokes, and 2) extracting the stroke order from an offline Chinese calligraphic writing. We implement a system based on this approach, and an experimental result is given to demonstrate the application.
- Published
- 2006
37. Secure Byzantine Fault Tolerant LDAP System
- Author
-
Honglun Hou, Xiuqun Wang, and Yueting Zhuang
- Subjects
Authentication ,Quantum Byzantine agreement ,Computer science ,business.industry ,Server ,Distributed computing ,Liveness ,Message authentication code ,Fault tolerance ,Cryptography ,business ,Byzantine fault tolerance ,Computer network - Abstract
LDAP is a set of protocols for accessing information directories which provides data integrity and authentication. It takes attacks on clients, the Internet and benign attacks on servers into account, but malicious attacks on servers and software errors are rarely addressed. In this paper, a security-aware Byzantine fault tolerant LDAP system is proposed, which can tolerate malicious faults occurring in the servers. By using a new Byzantine-fault-tolerant algorithm, the proposed LDAP system guarantees safety and liveness properties assuming no more than f replicas are faulty while it consists of 3f + 1 tightly coupled servers. Owing to a series of optimizations, the system not only provides a much higher degree of security and reliability but is also practical.
- Published
- 2006
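The replica bound quoted in the abstract above (at most f faulty replicas out of 3f + 1 servers) can be made concrete with a little arithmetic. The helper names below are hypothetical, and the quorum size of 2f + 1 is the one used by PBFT-style protocols in general; the record does not say which quorum rule the proposed algorithm uses.

```python
def replicas_needed(f):
    """Servers required to tolerate f Byzantine replicas (the 3f + 1 bound)."""
    return 3 * f + 1


def max_faulty(replicas):
    """Largest f such that 3f + 1 <= replicas."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return (replicas - 1) // 3


def quorum_size(f):
    """Quorum size in PBFT-style protocols (an assumption, see lead-in)."""
    return 2 * f + 1


def quorum_overlap(f):
    """Minimum number of replicas shared by any two quorums among 3f + 1."""
    return 2 * quorum_size(f) - replicas_needed(f)
```

With these definitions any two quorums share at least f + 1 replicas, so at least one replica in the intersection is correct. For example, four servers tolerate one faulty replica, and adding a fifth or sixth server does not raise that number.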
38. Sketch-based retrieval on Flash movies via primary scene
- Author
-
Yu Yang, Qing Li, Minhao Yu, and Yueting Zhuang
- Subjects
Cinematography ,Query expansion ,Flash (photography) ,Information retrieval ,Computer science ,Test set ,Human–computer information retrieval ,Visual Word ,Image retrieval ,Sketch - Abstract
As a multimedia format, Flash is becoming more and more popular over the Web. The typical structure of Flash can benefit from both image retrieval and video retrieval methods. In this paper, we present an approach of sketch-based retrieval on flash movies with analysis on directional and motional relations. Via the selection of primary scenes, query result can be displayed to users in an ideal way. Experiment of the proposed approach is evaluated on a test set with different genres of flash movies, and it shows the usefulness of the approach.
- Published
- 2006
39. Extracting Multimedia Semantics Based on Independent Modality Discovering and Fusion
- Author
-
Fei Wu, Yueting Zhuang, and Ruo-gui Xiao
- Subjects
Set (abstract data type) ,Support vector machine ,Modality (human–computer interaction) ,business.industry ,Computer science ,Artificial intelligence ,business ,Semantics ,Isomap ,Machine learning ,computer.software_genre ,computer ,Multimedia semantics - Abstract
Learning semantics from low-level features of multimedia learning resources enables high-level access to multimedia content. A considerable amount of research has focused on multi-modal analysis to detect multimedia semantics. However, two fundamental issues have not been adequately addressed. First, given a set of raw features extracted from multimedia sources, what are the best independent modalities? Second, once a set of modalities has been identified, how are they optimally fused to map to the high-level semantics? In this paper, we apply statistical and machine learning techniques to answer the two questions. ISOMAP combined with support vector clustering is used to discover independent modalities from raw features. Then the Maximum Entropy method is applied to optimally fuse the individual modalities. Experiments show that the proposed method can learn multimedia semantics more efficiently than traditional methods.
- Published
- 2006
40. Byzantine Fault Tolerance in MDS of Grid System
- Author
-
Honglun Hou, Xiuqun Wang, and Yueting Zhuang
- Subjects
Computer science ,business.industry ,Distributed computing ,Liveness ,Fault tolerance ,Fault (power engineering) ,computer.software_genre ,Grid ,Software ,Grid computing ,Software fault tolerance ,Server ,business ,Byzantine fault tolerance ,computer - Abstract
Fault tolerance is a challenging problem in reliable distributed systems. In Grid, fault detection and correction techniques are used for fault tolerance of the MDS system. These techniques are limited to dealing with benign faults on servers and the Internet; they do not work when malicious faults on servers or software errors occur. In this paper, a security-aware MDS system, which can tolerate malicious faults occurring on servers, is proposed. By using a new Byzantine-fault-tolerant algorithm, the proposed MDS system guarantees safety and liveness properties under the condition that no more than f replicas are faulty if it consists of 3f + 1 tightly coupled servers, and it maintains the seamless interfaces to application programs as the usual formal MDS system does.
- Published
- 2006
41. Data-Driven Automatic Generation of Decision Tree for Motion Retrieval with Temporal-Spatial Features
- Author
-
Yueting Zhuang, Jian Xiang, and Fei Wu
- Subjects
Incremental decision tree ,Computer science ,business.industry ,Feature extraction ,Decision tree ,Pattern recognition ,Motion capture ,Motion (physics) ,Data-driven ,Temporal database ,Data retrieval ,Artificial intelligence ,business ,Image retrieval - Abstract
Along with the development of motion capture techniques, more and more 3D motion libraries are becoming available. In this paper, a novel approach is presented for motion retrieval based on a data-driven decision tree with 3D temporal-spatial features. First, 3D temporal-spatial features of each human joint are extracted with the help of keyspace. Since the extracted features of each joint are independent, a data-driven decision tree is automatically constructed to reflect the influence of each point during the comparison of motion similarity. Experimental results show that the approach is effective for motion data retrieval.
- Published
- 2006
42. Filling Holes in Complex Surfaces using Oriented Voxel Diffusion
- Author
-
Jian-guang Weng, Tong-qiang Guo, Jijun Li, and Yueting Zhuang
- Subjects
Surface (mathematics) ,Materials science ,Field (physics) ,business.industry ,Geometry ,Iterative reconstruction ,computer.software_genre ,Voxel ,Orientation (geometry) ,Computer vision ,Artificial intelligence ,Diffusion (business) ,business ,Distance transform ,computer ,Surface reconstruction - Abstract
Range scanning devices often yield imperfect surface sampling for real-world models with complex features. These holes in the surface are commonly filled with smooth patches conforming to the boundaries. We introduce an oriented voxel diffusion method to fill holes in complex surfaces. First, an initial field of oriented distance is measured according to the existing surface. The implicit surface of the oriented distance field coincides with the existing surface. Second, the oriented distance field diffuses inward the hole until the implicit surface converges. Particularly, the orientation information in the distance field is used to control the diffusion direction accurately. Therefore this method is able to restore the sharp features.
- Published
- 2006
43. Segmenting Layers in Automated Visual Surveillance
- Author
-
Lijuan Qin, Yueting Zhuang, Yunhe Pan, and Fei Wu
- Subjects
Background subtraction ,Pixel ,Computer science ,business.industry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Codebook ,Pattern recognition ,Image segmentation ,Object detection ,Segmentation ,Computer vision ,Artificial intelligence ,business - Abstract
Detecting objects of interest from a video sequence is a fundamental and critical task in automated visual surveillance. Those objects can be either moving or stationary. However, most current approaches focus only on discriminating moving objects by background subtraction. In this work, we propose layer segmentation to detect both moving and stationary target objects from surveillance video. We first construct a codebook with a set of codewords for each pixel and then extend the matrix entropy statistical model to segment layers with codeword features. Our experimental results are presented in terms of the successful layer segmentation rate.
- Published
- 2005
44. Search for flash movies on the web
- Author
-
Liu Wenyin, Yueting Zhuang, Qing Li, and Jun Yang
- Subjects
Information retrieval ,Multimedia ,Computer science ,Search engine indexing ,computer.software_genre ,World Wide Web ,Flash (photography) ,Resource (project management) ,Human–computer information retrieval ,Web page ,Music information retrieval ,Visual Word ,Host (network) ,computer - Abstract
Flash™ is experiencing a breathtaking growth and has become one of the prevailing media formats on the Web. Unfortunately, no research effort has been dedicated to automatic retrieval of Flash movies by content, which is critical to the utilization of the enormous Flash resource. A close examination reveals that the intrinsic complexity of a Flash movie, including its heterogeneous components, its dynamic nature, and user interactivity, makes Flash retrieval a host of research issues. As the first endeavor in this area, we propose a generic framework termed as FLAME (FLash Access and Management Environment) embodying a 3-tier architecture that addresses the representation, indexing, and retrieval of Flash movies by mining and understanding of movie content. In particular, FLAME features a unique multi-level indexing and retrieval approach that supports characterization and retrieval of Flash at (1) object level, which describes the heterogeneous components embedded in a movie, (2) event level, which depicts the movie's dynamic effects constituted by the spatio-temporal features of objects, and (3) interaction level, which models the relationships between user behaviors and the consequential events. An experimental prototype for Web-based Flash retrieval is implemented to verify the feasibility and effectiveness of FLAME.
- Published
- 2005
45. Multi-Modal Information Retrieval with a Semantic View Mechanism
- Author
-
Yueting Zhuang, Jun Yang, and Qing Li
- Subjects
Context model ,Information retrieval ,business.industry ,Computer science ,Context (language use) ,Construct (python library) ,business ,Semantics ,Semantic Web ,Content management ,Data modeling ,Semantic gap - Abstract
The explosive growth of multimedia information on the Web in recent years calls for an elegant means to model and manage multimedia content to facilitate semantic-level access and sharing across diversified applications. From the perspective of retrieval, the semantics of multimedia data features context-dependency and media-independency; both are inadequately supported by the state-of-the-art data modeling technology. In this paper, we address this problem by advocating MediaView as an extended object-oriented view mechanism to bridge the "semantic gap" between conventional databases and semantics-intensive multimedia applications. This mechanism captures the dynamic semantics of multimedia using a modeling construct named media view (MV), which formulates a customized context where heterogeneous media objects with similar/related semantics are characterized by additional properties and user-defined semantic relationships. View operators are proposed for the manipulation and derivation of individual MVs, which can be fit into the desired real-life scenarios automatically. The usefulness and elegancy of MediaView are demonstrated by its applications in various (subjective) activities supporting multi-modal retrieval.
- Published
- 2005
46. Self-adaptive MPEG video watermarking based on block perceptual features
- Author
-
Yueting Zhuang, Guo-Min Wu, Fei Wu, and Yunhe Pan
- Subjects
Masking (art) ,business.industry ,Computer science ,Data_MISCELLANEOUS ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,ComputingMilieux_LEGALASPECTSOFCOMPUTING ,Watermark ,Signal ,Synchronization ,Discrete cosine transform ,Key (cryptography) ,Computer vision ,Artificial intelligence ,business ,Digital watermarking ,Block (data storage) - Abstract
This paper presents a novel MPEG video-watermarking algorithm based on DCT-domain perceptual model. Its main idea is to generate a self-adaptive video watermarking in accordance with human perceptual model. The algorithm first divides the DCT-domains and computes the masking coefficients by block features. Second, the coefficient is corrected by self-adaptive modification. Finally, the copyright protection is embedded into the blocks according to the modified coefficients respectively. The watermarking method presented here does not require the use of the original signal for watermark detection. The watermark signal is generated by using a chaotic map and a key. Furthermore, the watermark is embedded in three-layer model, which ensures a more exciting result of "synchronization" between the watermark and the video scene and guarantees a more robust watermark. Experimental data show that watermarked video is resistant to several video degradations and distortions.
- Published
- 2005
47. An integrated framework for shot boundary detection with multi-level features similarity
- Author
-
Fei Wu, Lijuan Qin, Yunhe Pan, and Yueting Zhuang
- Subjects
Boundary detection ,business.industry ,Feature (computer vision) ,Robustness (computer science) ,Computer science ,Feature extraction ,Computer vision ,Pattern recognition ,Artificial intelligence ,Video processing ,business ,Precision and recall ,Data compression - Abstract
This work describes an integrated framework for the detection of shot boundaries with multi-level features. Three kinds of features are extracted: low-level features of frame-to-frame difference, temporal-level features and high-level features with video-context sensitivity. Major advantages of the proposed framework are its accuracy and robustness. The information from shot transitions tagging also offers great help to the content-based video processing. The experimental results demonstrate that the new method improves the detection performance of hard cuts and gradual transitions in terms of precision and recall values.
- Published
- 2005
48. Application of set pair analysis in urban planning project comprehensive evaluation
- Author
-
Zu-Xin Li, Yunliang Jiang, Hong-Ping Cao, and Yueting Zhuang
- Subjects
Set (abstract data type) ,Operations research ,Computer science ,Urban planning ,media_common.quotation_subject ,Fuzzy set ,Measurement uncertainty ,Sensitivity analysis ,Certainty ,Degree (music) ,Uncertainty analysis ,media_common - Abstract
For all kinds of uncertainties, such as fuzzy uncertainty, random uncertainty, indeterminate-known uncertainty, unknown and unexpected incident uncertainty, and uncertainty resulting from imperfect information, set pair analysis (SPA) studies the relationship between the certainty and the uncertainty of a thing from the three aspects of identity-discrepancy-contrary (IDC), and uniformly processes the uncertainties of a system that result from the above sources using the connection degree ($\mu$). Because the theory treats uncertainty as a whole and does not need to distinguish definitely which part is fuzzy uncertainty, random uncertainty or another kind of uncertainty, it is relatively practical. SPA has been successfully applied to many fields although it was proposed only a little over ten years ago. In this paper, SPA is applied to the comprehensive evaluation of urban planning projects. In the course of the evaluation, indexes such as cost of construction, traffic convenience degree, virescence coverage rate, etc. are taken into account.
- Published
- 2005
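For readers unfamiliar with the connection degree mentioned in the abstract above, the form commonly given in the set pair analysis literature is shown below. The record itself does not state the formula, so the symbols $N$, $S$, $F$, $P$, $a$, $b$, $c$ and the numbers in the example are assumptions for illustration only.

$$\mu = \frac{S}{N} + \frac{F}{N}\,i + \frac{P}{N}\,j = a + b\,i + c\,j, \qquad a + b + c = 1$$

Here $N$ is the total number of characteristics compared, $S$ the number on which the pair is identical, $P$ the number on which it is contrary, and $F = N - S - P$ the number that is neither (the discrepancy). The coefficient $i$ takes a value in $[-1, 1]$ and $j = -1$. As a made-up example, a plan judged on $N = 10$ indexes with $S = 6$, $F = 3$ and $P = 1$ would have $\mu = 0.6 + 0.3\,i + 0.1\,j$.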
49. A new method for shot gradual transiton detection using support vector machine
- Author
-
Yueting Zhuang, Yi-Qun Lian, and Jian Ling
- Subjects
Support vector machine ,Computer science ,Feature vector ,Shot (filmmaking) ,Feature extraction ,Variance (accounting) ,Resolution (logic) ,Algorithm - Abstract
The detection of gradual transitions is much more difficult than that of abrupt transitions. In this paper, a new method for gradual transition detection that applies a support vector machine is proposed. First, an improved variance projection function is introduced, and its practicality for the detection of gradual transitions is analyzed as well. Then, by using this variance projection function, the distance between video frames is defined, and a method to calculate the feature vector of changes of the distance is proposed. Finally, a statistical learning method based on the support vector machine is devised to determine whether the changes of the distance are caused by a gradual transition or not. The experimental results show that this method has better detection resolution and lower time complexity, and thus satisfactorily meets the requirements of real-time video-shot detection.
- Published
- 2005
50. Fuzzy hierarchical clustering algorithm facing large databases
- Author
-
Yueting Zhuang and Yihong Dong
- Subjects
Fuzzy clustering ,Fuzzy classification ,Database ,Correlation clustering ,Single-linkage clustering ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,CURE data clustering algorithm ,Canopy clustering algorithm ,Fuzzy number ,Fuzzy set operations ,Data mining ,Algorithm ,computer ,Mathematics - Abstract
Applying fuzzy theory to the hierarchical clustering method, we present a fuzzy hierarchical clustering algorithm. After datasets are divided into several sub-clusters using a partitioning method, a fuzzy graph of sub-clusters is constructed by analyzing the linked fuzzy degree among the sub-clusters. By taking the $\lambda$-cut graph of the fuzzy graph, we obtain the connected components of the fuzzy graph, which form the desired clustering result. The algorithm can be applied to high-dimensional data sets to find clusters of arbitrary shape. Furthermore, this algorithm can handle data with categorical attributes as well as numeric attributes. The results of our experimental study on data sets with arbitrary shape and size are very encouraging. We have also conducted an experimental study with Web log files that could help us to discover user access patterns effectively. Our study shows that this algorithm generates better quality clusters than traditional algorithms, and scales well for large databases.
- Published
- 2004
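The final step of the abstract above (cutting a fuzzy graph of sub-clusters at a level $\lambda$ and reading off connected components) can be sketched as follows. How the paper computes the linked fuzzy degree is not given in this record, so the degrees, the sub-cluster names, and the function name `lambda_cut_components` are hypothetical.

```python
def lambda_cut_components(nodes, fuzzy_edges, lam):
    """Connected components of the lambda-cut of a fuzzy graph.

    `fuzzy_edges` maps an undirected pair (u, v) to a membership degree
    in [0, 1]; edges with degree >= lam survive the cut.
    """
    parent = {n: n for n in nodes}  # union-find forest over sub-clusters

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (u, v), degree in fuzzy_edges.items():
        if degree >= lam:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical linked fuzzy degrees between four sub-clusters.
nodes = ["s1", "s2", "s3", "s4"]
fuzzy_edges = {("s1", "s2"): 0.9, ("s2", "s3"): 0.4, ("s3", "s4"): 0.7}

coarse = lambda_cut_components(nodes, fuzzy_edges, 0.3)
fine = lambda_cut_components(nodes, fuzzy_edges, 0.6)
```

Sweeping `lam` from low to high splits the components progressively, which is what gives the method its hierarchical character.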