10 results for "CHANGSHENG XU"
Search Results
2. A discriminative graph inferring framework towards weakly supervised image parsing
- Author
-
Bing-Kun Bao, Lei Yu, and Changsheng Xu
- Subjects
Computer Networks and Communications, Computer Science, Image Processing and Computer Vision, Cryptography, Machine Learning, Computer Graphics, Discriminative Model, Image Parsing, Media Technology, Computer Communication Networks, Pattern Recognition, Active Appearance Model, Automatic Image Annotation, Hardware and Architecture, Graph (abstract data type), Artificial Intelligence, Software, Information Systems - Abstract
In this paper, we focus on the task of assigning labels to over-segmented image patches in a weakly supervised manner, in which the training images carry labels but not the labels' locations within the images. We propose a unified discriminative graph inferring framework that simultaneously infers patch labels and learns patch appearance models. On one hand, graph inferring reasons about patch labels through a graph propagation procedure: the graph is constructed by connecting nearest neighbors that share the same image label, and multiple correlations among patches and image labels are imposed as constraints on the inference. On the other hand, for each label, the patches that do not contain the target label are adopted as negative samples to learn the appearance model. In this way, the predicted labels become more accurate during propagation. Graph inferring and the learned patch appearance models are finally embedded to complement each other in one unified formulation. Experiments on three public datasets demonstrate the effectiveness of our method in comparison with other baselines.
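The graph propagation step described in this abstract can be illustrated with a toy sketch. This is a hypothetical, minimal version of graph-based label propagation (patches seeded with their parent image's label set, scores diffused over a similarity graph); it omits the paper's coupled appearance-model learning, and all function names and parameters here are illustrative.

```python
import numpy as np

def propagate_labels(features, image_labels, alpha=0.8, iters=100):
    """Hedged sketch of label propagation for weakly supervised parsing
    (not the paper's full coupled formulation). Each patch starts with
    its parent image's label set; scores then diffuse over a similarity
    graph until patches agree with their visually nearest neighbors."""
    # Pairwise RBF similarities between patch features.
    diff = features[:, None, :] - features[None, :, :]
    W = np.exp(-np.sum(diff ** 2, axis=2))
    np.fill_diagonal(W, 0.0)
    S = W / W.sum(axis=1, keepdims=True)     # row-stochastic transition matrix
    Y = image_labels.astype(float)
    F = Y.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y  # propagate, anchored to seeds
    return F.argmax(axis=1)                  # hard label per patch
```

Patches whose image-level labels are ambiguous (several labels per image) get resolved by their visually similar neighbors.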
- Published
- 2015
3. An incremental probabilistic model for temporal theme analysis of landmarks
- Author
-
Weiqing Min, Bing-Kun Bao, and Changsheng Xu
- Subjects
Information Retrieval, Landmark, Computer Networks and Communications, Computer Science, Image Processing and Computer Vision, Software Engineering, Timeline, Statistical Model, Visualization, Computer Graphics, Hardware and Architecture, Media Technology, Social Media, Theme (computing), Temporal Information, Software, Information Systems - Abstract
Social media sites (e.g., Flickr) generate a huge number of landmark photos with temporal information in the real world, such as photos describing events happening near landmarks and photos showing different seasonal sceneries. Analyzing the temporal information of landmarks can benefit various applications, such as landmark timeline construction and tour recommendation. In this paper, we propose a novel Incremental Spatio-Temporal Theme Model (ISTTM), which can incrementally mine the temporal themes that characterize the temporal information of landmarks by differentiating them from three other kinds of themes: general themes shared by most landmarks, local themes related to certain landmarks, and the background theme containing non-informative content. ISTTM works in an online manner and is capable of selectively processing updates of the distributions over different types of themes. Based on the proposed ISTTM, we present a framework named Temporal Theme Analysis for Landmarks (TTAL), which enables both periodic theme detection from the discovered temporal themes and temporal theme visualization by selecting relevant photos. We have conducted experiments on a large-scale landmark dataset from Flickr. Qualitative and quantitative evaluation results demonstrate the effectiveness of both ISTTM and the TTAL framework.
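The periodic theme detection that TTAL enables can be illustrated with a toy autocorrelation check on a theme's photo-count time series. This is a hedged, minimal stand-in for the paper's method, not its actual model; the function name and the use of simple autocorrelation are assumptions for illustration.

```python
import numpy as np

def dominant_period(counts):
    """Toy periodicity check (illustrative, not the paper's detector):
    return the lag >= 1 with the highest autocorrelation of the
    mean-removed count series."""
    x = np.asarray(counts, dtype=float)
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")  # lags -(n-1) .. (n-1)
    positive_lags = full[len(x):]           # keep lags 1 .. n-1 only
    return int(np.argmax(positive_lags)) + 1
```

For a theme whose photo counts spike every fourth month, the dominant lag comes out as 4.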
- Published
- 2014
4. A new discriminative coding method for image classification
- Author
-
Xiaoshan Yang, Tianzhu Zhang, and Changsheng Xu
- Subjects
Contextual Image Classification, Computer Networks and Communications, Computer Science, Locality, Image Processing and Computer Vision, Scale-Invariant Feature Transform, Pattern Recognition, Computer Graphics, Discriminative Model, Hardware and Architecture, Bag-of-Words Model, Media Technology, Artificial Intelligence, Quantization (image processing), Software, Information Systems, Coding - Abstract
Bag-of-words (BOW) based methods are widely used in image classification. However, a huge amount of visual information is inevitably discarded in the quantization step of BOW. Recently, NBNN and improved variants such as Local NBNN were proposed to address this problem; nevertheless, these methods do not perform better than the state-of-the-art BOW-based methods. In this paper, building on the advantages of BOW and Local NBNN, we introduce a novel locality discriminative coding (LDC) method. We convert each low-level local feature, such as SIFT, into a code vector using the local feature-to-class distance rather than k-means quantization. After coding, sum-pooling combined with SPM is used to construct a single feature representation vector for each image. Extensive experimental results on several challenging benchmark datasets show that our LDC method outperforms six state-of-the-art image classification methods.
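The coding step described here can be sketched as follows. This is a hypothetical, simplified illustration of feature-to-class distance coding in the Local NBNN spirit: one code dimension per class, weighted by the descriptor's distance to that class's pool of local features, followed by plain sum-pooling (the paper's SPM spatial pyramid is omitted). The exponential soft-weighting is an assumption, not the paper's exact formula.

```python
import numpy as np

def ldc_code(descriptor, class_features):
    """Code one local descriptor by its distance to each class's feature
    pool (Local-NBNN-flavored sketch; the exponential weighting is an
    illustrative assumption). class_features: list of (m_c, d) arrays."""
    dists = np.array([
        np.min(np.linalg.norm(feats - descriptor, axis=1))
        for feats in class_features
    ])
    code = np.exp(-dists)          # closer classes get larger weight
    return code / code.sum()

def image_representation(descriptors, class_features):
    """Sum-pool per-descriptor codes into one vector per image
    (the spatial pyramid is omitted for brevity)."""
    return np.sum([ldc_code(d, class_features) for d in descriptors], axis=0)
```

An image whose descriptors all lie near class 0's feature pool produces a representation dominated by dimension 0.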
- Published
- 2014
5. Multi-object tracking via MHT with multiple information fusion in surveillance video
- Author
-
Long Ying, Changsheng Xu, and Tianzhu Zhang
- Subjects
Computer Networks and Communications, Computer Science, Optical Flow, Video Content Analysis, Object (computer science), Active Appearance Model, Hardware and Architecture, Feature (computer vision), Video Tracking, Histogram, Media Technology, Identity (object-oriented programming), Computer Vision, Artificial Intelligence, Software, Information Systems - Abstract
Tracking multiple objects is critical to automatic video content analysis and virtual reality. The major problem is how to solve the data association problem when ambiguous measurements are caused by objects in close proximity. To tackle this problem, we propose a multiple-information-fusion-based multiple hypotheses tracking algorithm that integrates an appearance feature, a local motion pattern feature, and a repulsion-inertia model for multi-object tracking. An appearance model based on HSV local binary pattern histograms and a local motion pattern based on optical flow are adopted to describe objects. A likelihood calculation framework is proposed to incorporate the similarities of appearance, the dynamic process, and the local motion pattern. To account for changes in appearance and motion pattern over time, we employ an effective template-updating strategy for each object. In addition, a repulsion-inertia model is adopted to extract more useful information from ambiguous detections. Experimental results show that the proposed approach generates better trajectories with fewer missed objects and identity switches.
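The likelihood-based data association at the core of this abstract can be sketched with a toy greedy matcher that fuses appearance and motion similarity. This is a deliberately simplified stand-in for full multiple hypotheses tracking (which maintains hypothesis trees across frames); the Bhattacharyya appearance term, exponential motion term, and all parameter names are illustrative choices, not the paper's.

```python
import numpy as np

def hist_sim(h1, h2):
    # Bhattacharyya coefficient between normalized histograms (1.0 = identical).
    return float(np.sum(np.sqrt(h1 * h2)))

def associate(tracks, detections, w_app=0.5, w_mot=0.5, gate=0.3):
    """Greedy frame-to-frame association fusing appearance and motion
    (a toy stand-in for MHT). Each track/detection is a dict with
    'pos' (2-D position) and 'hist' (normalized appearance histogram)."""
    scores = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            motion = np.exp(-np.linalg.norm(t["pos"] - d["pos"]))
            scores[i, j] = w_app * hist_sim(t["hist"], d["hist"]) + w_mot * motion
    pairs, used_t, used_d = [], set(), set()
    # Highest-scoring pairs first; each track and detection used at most once.
    for i, j in sorted(np.ndindex(*scores.shape), key=lambda ij: -scores[ij]):
        if i not in used_t and j not in used_d and scores[i, j] > gate:
            pairs.append((i, j))
            used_t.add(i)
            used_d.add(j)
    return pairs
```

Two nearby objects with distinct appearance histograms are kept apart by the appearance term even when their positions are ambiguous.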
- Published
- 2014
6. @ICT: attention-based virtual content insertion
- Author
-
Changsheng Xu, Shuqiang Jiang, Qingming Huang, and Huiying Liu
- Subjects
Intrusiveness, Multimedia, Computer Networks and Communications, Computer Science, Video Content Analysis, Cryptography, Computer Graphics, Consistency (database systems), Insertion Time, Hardware and Architecture, Information and Communications Technology, Media Technology, Software, Information Systems, Meaning (linguistics) - Abstract
In this paper, we propose an attention-based virtual content insertion solution called @ICT. Virtual content insertion (VCI) is an emerging application of video analysis and has been used in video augmentation and advertisement insertion. An ideal VCI solution should ensure that the inserted virtual content is noticed by audiences while not interfering with the audiences' viewing experience of the original content. To balance these two conflicting goals, namely high attention and low intrusiveness, we choose highly attentive shots as insertion times, while determining the insertion place and content interdependently by considering lower attention together with visual consistency. We also propose a measurement of intrusiveness from the viewpoint of visual attention. Furthermore, @ICT includes an in-scene insertion module, which embeds the virtual content into videos with higher vividness and lower intrusiveness. @ICT is able to obtain an optimal balance between audiences noticing the virtual content and disruption of the viewing experience of the original content. It needs little prior knowledge and can be applied to general videos. Extensive quantitative and qualitative evaluations of the VCI results have verified the effectiveness of the solution.
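The attention/intrusiveness trade-off can be sketched in a few lines: pick the most attended shot as the insertion time, then the least salient region within it as the insertion place. This is a toy reading of the idea, not @ICT's actual algorithm, and the inputs (per-shot attention scores, per-region saliency scores) are assumed to be precomputed elsewhere.

```python
def choose_insertion(shot_attention, region_saliency):
    """Toy @ICT-style trade-off (illustrative only): insert during the
    most attended shot, so the virtual content is noticed, but at that
    shot's least salient region, so it does not intrude.
    shot_attention: attention score per shot.
    region_saliency: per-shot list of saliency scores per region."""
    shot = max(range(len(shot_attention)), key=shot_attention.__getitem__)
    region = min(range(len(region_saliency[shot])),
                 key=region_saliency[shot].__getitem__)
    return shot, region
```

With three shots and their candidate regions, the function returns the index pair (attended shot, quiet region).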
- Published
- 2011
7. Automatic composition of broadcast sports video
- Author
-
Qi Tian, Eng Siong Chng, Hanqing Lu, Changsheng Xu, and Jinjun Wang
- Subjects
Video Production, Multimedia, Computer Networks and Communications, Computer Science, Video Capture, Image Processing and Computer Vision, Video Content Analysis, Video Processing, Video Compression Picture Types, Video Editing, Hardware and Architecture, Video Tracking, Media Technology, Multiview Video Coding, Software, Information Systems - Abstract
This study examines an automatic broadcast soccer video composition system. The research is important because the ability to automatically compose broadcast sports video will not only improve broadcast video generation efficiency but also make it possible to customize sports video broadcasting. We present a novel approach to the two major issues in the system's implementation, specifically the camera view selection/switching module and the automatic replay generation module. In our implementation, we use a multi-modal framework to perform video content analysis and event and event-boundary detection from the raw, unedited main/sub-camera captures. This framework explores the available cues using mid-level representations to bridge the gap between low-level features and high-level semantics. The video content analysis results are used for camera view selection/switching in the generated composition, and the event detection results and mid-level representations are used to generate replays, which are automatically inserted into the broadcast soccer video. Our experimental results are promising and comparable to those generated by broadcast professionals.
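The composition logic, view selection plus automatic replay insertion driven by event detection, can be caricatured as follows. This toy sketch assumes event labels have already been detected per segment; the event types and view names are illustrative, not the paper's.

```python
def compose(segments, replay_events=("goal", "foul")):
    """Toy composition pass (illustrative, not the paper's system):
    cut to the sub-camera for replay-worthy events, stay on the main
    camera otherwise, and append a replay after each such event.
    segments: per-segment detected event label, or None."""
    timeline = []
    for event in segments:
        if event in replay_events:
            timeline.append("sub")               # close-up view of the event
            timeline.append("replay:" + event)   # auto-inserted replay
        else:
            timeline.append("main")              # default wide camera
    return timeline
```

A three-segment match with one detected goal yields a four-clip timeline with the replay spliced in.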
- Published
- 2008
8. Robust and efficient content-based digital audio watermarking
- Author
-
Changsheng Xu and David Dagan Feng
- Subjects
Computer Networks and Communications, Computer Science, Speech Recognition, Speech Coding, Legal Aspects of Computing, Cryptography, Tracing, Copy Protection, Hardware and Architecture, Robustness (computer science), Media Technology, Digital Signal, Audio Signal Processing, Digital Watermarking, Software, Information Systems - Abstract
This paper proposes a set of digital watermarking schemes for WAV audio, wavetable-synthesis audio, and compressed audio. The watermark embedding scheme is closely tied to the audio content and based on the human auditory system. Experimental results on listening quality and robustness illustrate that the proposed watermarking schemes achieve an optimal balance between the audibility and robustness of the watermarked audio. The proposed methods are also very useful and effective for copyright protection, tracing illegal distribution, and other applications.
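A generic spread-spectrum watermark illustrates the embed/extract idea at a high level. Note this is NOT the paper's content-adaptive, human-auditory-system-based scheme; it is a textbook-style sketch where each bit modulates a seeded pseudo-random carrier added to one segment of the signal and is recovered by correlation with the same carriers.

```python
import numpy as np

def embed(signal, bits, strength=0.01, seed=0):
    """Embed bits by adding a seeded pseudo-random carrier per segment
    (generic spread spectrum; the paper's scheme is content-adaptive)."""
    rng = np.random.default_rng(seed)
    seg = len(signal) // len(bits)
    out = np.asarray(signal, dtype=float).copy()
    for i, b in enumerate(bits):
        carrier = rng.standard_normal(seg)
        out[i * seg:(i + 1) * seg] += strength * (1.0 if b else -1.0) * carrier
    return out

def extract(signal, n_bits, seed=0):
    """Recover bits by correlating each segment with the same carriers."""
    rng = np.random.default_rng(seed)
    seg = len(signal) // n_bits
    bits = []
    for i in range(n_bits):
        carrier = rng.standard_normal(seg)
        bits.append(int(np.dot(signal[i * seg:(i + 1) * seg], carrier) > 0))
    return bits
```

Because the carriers are regenerated from the shared seed, extraction needs only the watermarked signal and the key, not the original host.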
- Published
- 2002
9. Guest editorial: selected papers from ICIMCS 2011
- Author
-
Changsheng Xu, Abdulmotaleb El Saddik, Xiao Wu, and Chong-Wah Ngo
- Subjects
Computer Networks and Communications, Computer Science, Search Engine Indexing, Information and Computer Science, Cryptography, Metadata, World Wide Web, Hardware and Architecture, Scalability, Media Technology, The Internet, Cluster Analysis, Classifier (UML), Software, Information Systems - Abstract
The International Conference on Internet Multimedia Computing and Services (ICIMCS) is an annual conference sponsored by the ACM SIGMM China Chapter. The conference is especially interested in the latest technologies and applications that deal with the web-scale processing and management of heterogeneous Internet data for multimedia computing and services. ICIMCS 2011 was held in Chengdu, China, the hometown of the giant panda. The conference attracted around 80 participants, including researchers from academia and industry across ten countries/regions, who shared their recent work on topics ranging from visual information analysis and mining to query processing and search, and multimedia privacy and security. This special issue comprises the extended versions of five papers: two best papers and three papers from the regular and special sessions of ICIMCS 2011. These papers cover key issues in multimedia computing, including leveraging Internet resources for multi-modality fusion and visual classifier learning, preprocessing and indexing of million-scale Internet data, and sharing and streaming of Internet videos. Some of these techniques also demonstrate applications for emerging Internet services, such as video recommendation and product search systems, by processing and modeling the heterogeneous forms of resources associated with multimedia data. The first two papers are about the web-scale processing of Internet data. The paper entitled "Video Recommendation over Multiple Information Sources", presented by Meng Wang and his colleagues from the National University of Singapore, proposes a unified framework that explores heterogeneous information sources for video recommendation. The framework, based on multi-task SVM learning, aggregates multiple ranked lists generated from personal data, social networks, and video metadata into an optimized list for recommendation.
The framework is evaluated on a large video dataset composed of one month of social activities by 76 users on the Facebook and YouTube websites. The second paper, entitled "Multi-label Multi-instance Learning with Missing Object Tags", presented by Yi Shen and his colleagues from the University of North Carolina at Charlotte, proposes web-scale learning of object classifiers from a collection of as many as 10 million user-tagged Flickr images. In particular, the paper addresses three important issues toward fully automatic learning: scalable filtering of spam tags by distributed image clustering; joint modeling of loose tags and missing tags by multiple-instance learning that is capable of performing tag prediction; and structural learning that takes object relationships into account to train discriminant classifiers. The next two papers address the search and mining of visual instances. The paper entitled "Combining Global and Local Matching of Multiple Features for Precise Item Image Retrieval" is co-authored by Haojie Li and his colleagues from the Dalian University of Technology.
- Published
- 2012
10. Automatic composition of broadcast sports video.
- Author
-
Changsheng Xu, Engsiong Chng, Hanqing Lu, and Qi Tian
- Subjects
VIDEO production & direction, SOCCER on television, AUTOMATION, CAMCORDERS, VIDEO editing, BROADCASTING industry, SEMANTICS - Abstract
This study examines an automatic broadcast soccer video composition system. The research is important because the ability to automatically compose broadcast sports video will not only improve broadcast video generation efficiency but also make it possible to customize sports video broadcasting. We present a novel approach to the two major issues in the system's implementation, specifically the camera view selection/switching module and the automatic replay generation module. In our implementation, we use a multi-modal framework to perform video content analysis and event and event-boundary detection from the raw, unedited main/sub-camera captures. This framework explores the available cues using mid-level representations to bridge the gap between low-level features and high-level semantics. The video content analysis results are used for camera view selection/switching in the generated composition, and the event detection results and mid-level representations are used to generate replays, which are automatically inserted into the broadcast soccer video. Our experimental results are promising and comparable to those generated by broadcast professionals. [ABSTRACT FROM AUTHOR]
- Published
- 2008
Discovery Service for Jio Institute Digital Library