125 results for "Huang, Thomas S."
Search Results
2. Under Vehicle Inspection with 3D Imaging.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Sukumar, S. R., Page, D. L., Koschan, A. F., and Abidi, M. A.
- Abstract
This research is motivated by the deployment of intelligent robots for under-vehicle inspection at checkpoints, gate-entry terminals and parking lots. Using multi-modality measurements of temperature, range, color and radioactivity, with future potential for chemical and biological sensors, our approach is based on a modular robotic "sensor brick" architecture that integrates multi-sensor data into scene intelligence in 3D virtual reality environments. The remote 3D scene visualization capability reduces the risk to close-range inspection personnel, transforming the inspection task into an unmanned robotic mission. Our goal in this chapter is to focus on the 3D range "sensor brick" as a vital component in this multi-sensor robotics framework and to demonstrate the potential of automatic threat detection using the geometric information from the 3D sensors. With the 3D data alone, we propose two different approaches for the detection of anomalous objects as potential threats. The first approach performs scene verification with a 3D registration algorithm that quickly and efficiently finds potential changes to the undercarriage by comparison against previously archived scans of the same vehicle. The second approach, based on 3D shape analysis, assumes the availability of CAD models of the undercarriage that can be matched with the scanned real data using a novel perceptual curvature variation measure (CVM). The CVM, which can be understood as the entropy of surface curvature, describes the under-vehicle scene as a graph network of smooth surface patches that lends itself to matching with the graph description of the a priori CAD data. By presenting results of real-time acquisition, visualization, scene verification and description, we emphasize the advantages of 3D imaging over present-day inspection systems using mirrors and 2D cameras. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
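The curvature variation measure (CVM) in the abstract above is described as the entropy of surface curvature. A minimal sketch of that idea, assuming per-vertex curvature values are already available as an array (the function name and binning are illustrative, not the chapter's implementation):

```python
import numpy as np

def curvature_variation_measure(curvatures, bins=32):
    """Entropy of a surface patch's curvature distribution (toy sketch)."""
    # Histogram the per-vertex curvature values, normalize to a
    # probability distribution, and take the Shannon entropy.
    hist, _ = np.histogram(curvatures, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())

# A smooth (constant-curvature) patch scores 0; an irregular patch scores higher.
smooth = curvature_variation_measure(np.zeros(1000))
rough = curvature_variation_measure(np.random.default_rng(0).normal(size=1000))
```

Smooth patches concentrate their curvature in few histogram bins and score near zero, which is what lets such a measure describe the scene as a graph of smooth surface patches.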
3. 3D Site Modelling and Verification.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Sequeira, V., Boström, G., and Gonçalves, J. G. M.
- Abstract
Copyright of 3D Imaging for Safety & Security is the property of Springer eBooks and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2007
- Full Text
- View/download PDF
4. 3D Modeling of Indoor Environments.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Biber, P., Fleck, S., Duckett, T., and Wand, M.
- Abstract
Autonomous mobile robots will play a major role in future security and surveillance tasks for large-scale environments such as shopping malls, airports, hospitals and museums. Robotic security guards will autonomously survey such environments, unless a remote human operator takes over control. In this context a 3D model can convey much more useful information than the typical 2D maps used in many robotic applications today, both for visualization of information and as a human-machine interface for remote control. This paper addresses the challenge of building such a model of a large environment (50 m × 60 m) using data from the robot's own sensors: a 2D laser scanner and a panoramic camera. The data are processed in a pipeline that comprises automatic, semi-automatic and manual stages. The user can interact with the reconstruction process where necessary to ensure the robustness and completeness of the model. A hybrid representation, tailored to the application, has been chosen: floors and walls are represented efficiently by textured planes, while non-planar structures like stairs and tables, represented by point clouds, can be added if desired. Our methods to extract these structures include simultaneous localization and mapping in 2D and wall extraction based on laser scanner range data, texture building from multiple omnidirectional images using multiresolution blending, and calculation of 3D geometry by a graph-cut stereo technique. Various renderings illustrate the usability of the model for visualizing the security guard's position and environment. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
5. Dynamic Pushbroom Stereo Vision.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Zhu, Z., Wolberg, G., and Layne, J.R.
- Abstract
We present a dynamic pushbroom stereo geometry model for both 3D reconstruction and moving target extraction in applications such as aerial surveillance and cargo inspection. In a dynamic pushbroom camera model, a "line scan camera" scans across the scene. Both the scanning sensor and the objects in the scene are moving, and thus the image generated is a "moving picture" with one axis being space and the other being time. We study the geometry under a linear motion model for both the sensor and the object, and we investigate the advantages of using two such scanning systems to construct a dynamic pushbroom stereo vision system for 3D reconstruction and moving target extraction. Two real examples are given using the proposed models. In the first application, a fast and practical calibration procedure and an interactive 3D estimation method are provided for 3D cargo inspection with dual gamma-ray (or X-ray) scanning systems. In the second application, dynamic pushbroom stereo mosaics are generated by using a single camera mounted on an airplane, and a unified segmentation-based stereo matching algorithm is proposed to extract both 3D structures and moving targets from urban scenes. Experimental results are given. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
6. Synthetic Aperture Focusing Using Dense Camera Arrays.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Vaish, V., Garg, G., Talvala, E.-V., Antunez, E., Wilburn, B., Horowitz, M., and Levoy, M.
- Abstract
Copyright of 3D Imaging for Safety & Security is the property of Springer eBooks and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2007
- Full Text
- View/download PDF
7. Human Ear Detection From 3D Side Face Range Images.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Chen, H., and Bhanu, B.
- Abstract
The ear is a new class of relatively stable biometric that is not affected by facial expressions, cosmetics or eyeglasses. To use ear biometrics for human identification, ear detection is the first stage of an ear recognition system. In this chapter we propose two approaches for locating human ears in side face range images: (a) template-matching-based ear detection and (b) ear-shape-model-based detection. In the first approach, the model template is represented by an averaged histogram of shape index, which can be computed from principal curvatures. Ear detection is then a four-step process: step edge detection and thresholding, image dilation, connected-component labeling and template matching. In the second approach, the ear shape model is represented by a set of discrete 3D vertices corresponding to the ear helix and anti-helix parts. Given a side face range image, step edges are extracted and the edge segments are then dilated, thinned and grouped into clusters that are potential regions containing an ear. For each cluster, we register the ear shape model with the edges. The region with the minimum mean registration error is declared the detected ear region; during this process the ear helix and anti-helix parts are identified. Experiments are performed with a large number of real side face range images to demonstrate the effectiveness of the proposed approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
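The first approach in the abstract above builds its template from a histogram of shape index values computed from principal curvatures. One common convention for the shape index (values in [0, 1], saddles at 0.5) can be sketched as follows; the exact convention and binning used in the chapter may differ:

```python
import numpy as np

def shape_index(k_max, k_min):
    # Shape index from principal curvatures (k_max >= k_min) in one common
    # [0, 1] convention: caps and cups at the extremes, saddles at 0.5.
    # arctan2 keeps the umbilic case (k_max == k_min) finite.
    return 0.5 - (1.0 / np.pi) * np.arctan2(k_max + k_min, k_max - k_min)

def shape_index_histogram(k_max, k_min, bins=16):
    # Averaged histogram of shape index over a region: the kind of model
    # template the first (template matching) approach compares against.
    hist, _ = np.histogram(shape_index(k_max, k_min), bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()
```

A candidate region's histogram can then be compared to the template with any histogram distance (e.g. chi-squared) during the template-matching step.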
8. Story of Cinderella.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Bronstein, Alexander M., Bronstein, Michael M., and Kimmel, Ron
- Abstract
In this chapter, we address the question of what are the facial measures one could use in order to distinguish between people. Our starting point is the fact that the expressions of our face can, in most cases, be modeled as isometries, which we validate empirically. Then, based on this observation, we introduce a technique that enables us to distinguish between people based on the intrinsic geometry of their faces. We provide empirical evidence that the proposed geometric measures are invariant to facial expressions and relate our findings to the broad context of biometric methods, ranging from modern face recognition technologies to fairy tales and biblical stories. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
9. A Genetic Algorithm Based Approach for 3D Face Recognition.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Sun, Y., and Yin, L.
- Abstract
The ability to distinguish different people using 3D facial information is an active research problem in the face recognition community. In this paper, we propose to use a generic model to label 3D facial features. This approach relies on our realistic face modeling technique, by which an individual face model is created from a generic model and two views of a face. In the individualized model, we label face features by their principal curvatures. Among the labeled features, "good features" are selected using a Genetic Algorithm based approach. The feature space is then formed from these new 3D shape descriptors, and each individual face is classified according to its feature space correlation. We used 105 individual models in the experiment. The experimental results show that the shape information obtained from the 3D individualized model can be used to classify and identify individual facial surfaces; the rank-4 recognition rate is 92%. The 3D individualized model provides consistent and sufficient detail to represent individual faces while using a much more simplified representation than range data models. To verify the accuracy and robustness of the selected feature spaces, a similar procedure is applied to the range data obtained from a 3D scanner. Using a subset of the optimal feature space derived from the Genetic Algorithm, we achieved an 87% rank-4 recognition rate. This shows that our approach provides a possible way to reduce the complexity of 3D data processing and is feasible for applications using different sources of 3D data. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
10. Automatic 3D Face Registration Without Initialization.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Koschan, A., Ayyagari, V. R., Boughorbel, F., and Abidi, M. A.
- Abstract
Recently, 3D face reconstruction and recognition have gained an important role in computer vision and biometrics research. The depth information of a 3D face can help resolve the uncertainties in illumination and pose variation associated with face recognition. The registration of data that is usually acquired from different views is a fundamental element of any reconstruction process. This chapter focuses on the problem of automatic registration of 3D face point sets through a criterion based on Gaussian fields. The method defines a straightforward energy function, which is always differentiable and convex in a large neighborhood of the alignment parameters, allowing the use of powerful standard optimization techniques. The introduced technique removes the need for close initialization, which is a requirement when applying the Iterative Closest Point algorithm. Moreover, the use of the Fast Gauss Transform reduces the computational complexity of the registration algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
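The Gaussian-fields criterion described in the abstract above can be illustrated with a naive O(NM) energy evaluation; the Fast Gauss Transform mentioned there exists precisely to accelerate this kind of double sum. The function name is illustrative, and point sets are assumed to be N×3 arrays:

```python
import numpy as np

def gaussian_field_energy(model, scene, sigma):
    # E = sum_i sum_j exp(-||p_i - q_j||^2 / sigma^2): a smooth,
    # everywhere-differentiable alignment score that is maximal when the
    # two point sets overlap. Its wide basin of convergence (which grows
    # with sigma) is what removes ICP's need for close initialization.
    d2 = ((model[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
    return float(np.exp(-d2 / sigma**2).sum())

rng = np.random.default_rng(1)
pts = rng.normal(size=(100, 3))
aligned = gaussian_field_energy(pts, pts, sigma=1.0)
shifted = gaussian_field_energy(pts + 5.0, pts, sigma=1.0)  # misaligned copy
```

Because the energy is smooth in the alignment parameters, a standard gradient-based optimizer can maximize it directly, which is the "powerful standard optimization techniques" point made in the abstract.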
11. A Survey on 3D Modeling of Human Faces for Face Recognition.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Huq, S., Abidi, B., Kong, S. G., and Abidi, M.
- Abstract
In its quest for more reliability and higher recognition rates the face recognition community has been focusing more and more on 3D based recognition. Depth information adds another dimension to facial features and provides ways to minimize the effects of pose and illumination variations for achieving greater recognition accuracy. This chapter reviews, therefore, the major techniques for 3D face modeling, the first step in any 3D assisted face recognition system. The reviewed techniques are laser range scans, 3D from structured light projection, stereo vision, morphing, shape from motion, shape from space carving, and shape from shading. Concepts, accuracy, feasibility, and limitations of these techniques and their effectiveness for 3D face recognition are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
12. 3D Assisted Face Recognition: A Survey.
- Author
-
Viergever, Max, Borgefors, Gunilla, Deriche, Rachid, Huang, Thomas S., Ikeuchi, Katsushi, Jiang, Tianzi, Klette, Reinhard, Leonardis, Ales, Peitgen, Heinz-Otto, Tsotsos, John K., Koschan, Andreas, Pollefeys, Marc, Abidi, Mongi, Hamouz, M., Tena, J. R., Kittler, J., Hilton, A., and Illingworth, J.
- Abstract
3D face recognition has lately been attracting ever increasing attention. In this chapter we review the full spectrum of 3D face processing technology, from sensing to recognition. The review covers 3D face modelling, 3D to 3D and 3D to 2D registration, 3D based recognition and 3D assisted 2D based recognition. The fusion of 2D and 3D modalities is also addressed. The chapter complements other reviews in the face biometrics area by focusing on the sensor technology, and by detailing the efforts in 3D face modelling and 3D assisted 2D face matching. A detailed evaluation of a typical state-of-the-art 3D face registration algorithm is discussed and conclusions drawn. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
13. Foundations of Human Computing: Facial Expression and Emotion.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Cohn, Jeffrey F.
- Abstract
Many people believe that emotions and subjective feelings are one and the same and that a goal of human-centered computing is emotion recognition. The first belief is outdated; the second mistaken. For human-centered computing to succeed, a different way of thinking is needed. Emotions are species-typical patterns that evolved because of their value in addressing fundamental life tasks. Emotions consist of multiple components, of which subjective feelings may be one. They are not directly observable, but inferred from expressive behavior, self-report, physiological indicators, and context. I focus on expressive facial behavior because of its coherence with other indicators and research. Among the topics included are measurement, timing, individual differences, dyadic interaction, and inference. I propose that design and implementation of perceptual user interfaces may be better informed by considering the complexity of emotion, its various indicators, measurement, individual differences, dyadic interaction, and problems of inference. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
14. Affect Detection and an Automated Improvisational AI Actor in E-Drama.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, Zhang, Li, Gillies, Marco, Barnden, John A., Hendley, Robert J., Lee, Mark G., and Wallington, Alan M.
- Abstract
Enabling machines to understand the emotions and feelings expressed in human users' natural-language textual input during interaction is a challenging issue in Human Computing. The work presented here is our contribution toward such machine automation. We report on adding affect detection to an existing e-drama program, a text-based software system for dramatic improvisation in simple virtual scenarios, used primarily in learning contexts. The system allows a human director to monitor improvisations and make interventions, for instance in reaction to excessive, insufficient or inappropriate emotions in the characters' speeches. As part of an endeavour to partially automate the director's functions, and to allow for automated affective bit-part characters, we have developed an affect-detection module. It is aimed at detecting affective aspects (emotions, moods, value judgments, etc.) of human-controlled characters' textual "speeches". The work also accompanies basic research into how affect is conveyed linguistically; a distinctive feature of the project is its focus on the metaphorical ways in which affect is conveyed. We also describe how the detected affective states drive the animation engine to produce gestures for human-controlled characters. The description of our approach in this paper draws in part on our previous publications [1, 2], with new contributions mainly on metaphorical language processing (practical and theoretical), 3D emotional animation generation and user-testing evaluation. Finally, our work on affect detection in open-ended improvisational text contributes to the development of automatic understanding of human language and emotion. The generation of believable emotional animations from detected affective states, and the production of appropriate responses for the automated affective bit-part character, contribute greatly to an easy-to-use and innovative interface for e-drama, leading to high levels of user engagement and enjoyment. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
15. SmartWeb Handheld — Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, Sonntag, Daniel, Engel, Ralf, Herzog, Gerd, Pfalzgraf, Alexander, Pfleger, Norbert, Romanelli, Massimo, and Reithinger, Norbert
- Abstract
SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as the central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user is then able to employ multiple modalities, like speech and gestures, to interact with the presented multimedia material in a multimodal way. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
16. Challenges for Virtual Humans in Human Computing.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Pantic, Maja, Pentland, Alex, Reidsma, Dennis, Ruttkay, Zsófia, and Nijholt, Anton
- Abstract
The vision of Ambient Intelligence (AmI) presumes a plethora of embedded services and devices that all endeavor to support humans in their daily activities as unobtrusively as possible. Hardware gets distributed throughout the environment, occupying even the fabric of our clothing. The environment is equipped with a diversity of sensors, whose information can be accessed from all over the AmI network. Individual services are distributed over hardware, share sensors with other services and are generally detached from the traditional single-access-point computer (see also the paper of Pantic et al. in this volume [51]). [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
17. A Learning-Based High-Level Human Computer Interface for Face Modeling and Animation.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Blanz, Volker
- Abstract
This paper describes a system for animation and modeling of faces in images or in 3D. It provides high-level control of facial appearance to users, due to a learning-based approach that extracts class-specific information from a database of 3D scans. The modification tools include changes of facial attributes, such as body weight, masculine or feminine look, or overall head shape. Facial expressions are learned from examples and can be applied to new individuals. The system is intrinsically based on 3D face shapes and surface colors, but it can be applied to existing images as well, using a 3D shape reconstruction algorithm that operates on single images. After reconstruction, faces can be modified and drawn back into the original image, so the users can manipulate, animate and exchange faces in images at any given pose and illumination. The system can be used to create face models or images from a vague description or mental image, for example based on the recollection of eyewitnesses in forensic applications. For this specific problem, we present a software tool and a user study with a forensic artist. Our model-based approach may be considered a prototype implementation of a high-level user interface to control meaningful attributes in human faces. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
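The learning-based control described in the abstract above (class-specific information extracted from a database of 3D scans, with high-level attribute handles such as body weight) is in the spirit of a linear face space learned by PCA. A toy sketch with random stand-in data and a hypothetical attribute rating, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.normal(size=(50, 300))   # 50 scans, each a flattened (x, y, z) vertex list
mean = faces.mean(axis=0)

# Learn a linear face-space basis from the example database (PCA via SVD).
U, S, Vt = np.linalg.svd(faces - mean, full_matrices=False)
coeffs = (faces - mean) @ Vt.T       # each face as coefficients in that basis

# A high-level attribute (e.g. perceived body weight) can be modeled as a
# direction in coefficient space, fit by regression against labeled examples.
labels = rng.normal(size=50)         # hypothetical per-scan attribute ratings
direction, *_ = np.linalg.lstsq(coeffs, labels, rcond=None)

# Pushing a face along that direction re-synthesizes a modified 3D shape.
modified = mean + (coeffs[0] + 2.0 * direction) @ Vt
```

The same coefficient space supports the paper's other operations: expressions learned from examples become offsets that can be applied to new individuals.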
18. Evaluating the Future of HCI: Challenges for the Evaluation of Emerging Applications.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, Poppe, Ronald, Rienks, Rutger, and van Dijk, Betsy
- Abstract
Current evaluation methods are inappropriate for emerging HCI applications. In this paper, we give three examples of these applications and show that traditional evaluation methods fail. We identify trends in HCI development and discuss the issues that arise with evaluation. We aim at achieving increased awareness that evaluation too has to evolve in order to support the emerging trends in HCI systems. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
19. Gaze-X: Adaptive, Affective, Multimodal Interface for Single-User Office Scenarios.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pentland, Alex, Maat, Ludo, and Pantic, Maja
- Abstract
This paper describes an intelligent system that we developed to support affective multimodal human-computer interaction (AMM-HCI) where the user's actions and emotions are modeled and then used to adapt the interaction and support the user in his or her activity. The proposed system, which we named Gaze-X, is based on sensing and interpretation of the human part of the computer's context, known as W5+ (who, where, what, when, why, how). It integrates a number of natural human communicative modalities including speech, eye gaze direction, face and facial expression, and a number of standard HCI modalities like keystrokes, mouse movements, and active software identification, which, in turn, are fed into processes that provide decision making and adapt the HCI to support the user in his or her activity according to his or her preferences. A usability study conducted in an office scenario with a number of users indicates that Gaze-X is perceived as effective, easy to use, useful, and affectively qualitative. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
20. Feedback Loops in Communication and Human Computing.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, op den Akker, Rieks, and Heylen, Dirk
- Abstract
Building systems that are able to analyse communicative behaviours or take part in conversations requires a sound methodology in which the complex organisation of conversations is understood and tested on real-life samples. The data-driven approaches to human computing not only have a value for the engineering of systems, but can also provide feedback to the study of conversations between humans and between human and machines. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
21. Social Intelligence Design and Human Computing.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Nishida, Toyoaki
- Abstract
The central concern of Social Intelligence Design is the understanding and augmentation of social intelligence, which may be attributed to both an individual and a group: it results from the bilateral interaction of the intelligence an individual uses to coordinate his or her behavior with others in a society and the intelligence a collection of individuals uses to achieve goals as a whole and learn from experience. Social intelligence can be addressed from multiple perspectives. In this chapter, I focus on three aspects. First, I highlight interaction from the social-discourse perspective, in which social intelligence manifests in rapid interaction within a small group. Second, I look at community media and social interaction in the large, where slow and massive interaction takes place across a large collection of people. Third, I survey work on social artifacts that embody social intelligence. Finally, I attempt to provide a structured view of the field. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
22. Modeling Influence Between Experts.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Dong, Wen
- Abstract
A common problem of ubiquitous sensor-network computing is combining evidence between multiple agents or experts. We demonstrate that the latent structure influence model, our novel formulation for combining evidence from multiple dynamic classification processes ("experts"), can achieve greater accuracy, efficiency, and robustness to data corruption than standard methods such as HMMs. It accomplishes this by simultaneously modeling the structure of interaction and the latent states. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
23. Trajectory-Based Representation of Human Actions.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pentland, Alex, Oikonomopoulos, Antonios, Patras, Ioannis, Pantic, Maja, and Paragios, Nikos
- Abstract
This work addresses the problem of human action recognition by introducing a representation of a human action as a collection of short trajectories extracted in areas of the scene with a significant amount of visual activity. The trajectories are extracted by an auxiliary particle-filtering tracking scheme that is initialized at points considered salient in both space and time. The spatiotemporal salient points are detected by measuring the variation in the information content of pixel neighborhoods in space and time. We implement an online background estimation algorithm to deal with inadequate localization of the salient points on the moving parts of the scene, and to improve the overall performance of the particle filter tracking scheme. We use a variant of the Longest Common Subsequence (LCSS) algorithm to compare different sets of trajectories corresponding to different actions, and Relevance Vector Machines (RVM) to address the classification problem. We propose new kernels for use by the RVM, specifically tailored to the proposed representation of short trajectories; the basis of these kernels is the modified LCSS distance of the previous step. We present results on real image sequences from a small database depicting people performing 12 aerobic exercises. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
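The trajectory comparison in the abstract above rests on an LCSS variant. A minimal 1-D dynamic-programming version (with a matching threshold `eps`, and a normalized distance from which a kernel can be built) looks roughly like this; the chapter's variant and its kernels differ in detail:

```python
def lcss_length(a, b, eps):
    # Classic LCSS dynamic program, with "equality" relaxed to
    # |a_i - b_j| <= eps so noisy real-valued trajectory samples can match.
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

def lcss_distance(a, b, eps):
    # Normalized dissimilarity in [0, 1]; 0 means one trajectory is an
    # eps-matchable subsequence of the other.
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))
```

Trajectories of unequal length compare naturally here, since LCSS tolerates gaps and missing samples, which is why it suits trajectories broken by tracking failures.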
24. Modelling the Communication Atmosphere: A Human Centered Multimedia Approach to Evaluate Communicative Situations.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, Rutkowski, Tomasz M., and Mandic, Danilo P.
- Abstract
This chapter addresses the problem of multimodal analysis of human face-to-face communication. This is important because, in the near future, smart environments equipped with multiple sensory systems will be able to sense the presence of humans and to assess and recognize their behaviours, actions, and emotional states. The main goal of the presented study is to develop models of communicative/interactive events in multimedia (audio and video) suitable for analysis and subsequent incorporation within virtual reality environments. Interactive, environmental, and emotional characteristics of the communicators are estimated in order to define the communication event as one entity. This is achieved by bringing together results from the social sciences and from multimedia signal processing under one umbrella: communication atmosphere analysis. Experiments based on real-life recordings support the approach. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
25. Emotion and Reinforcement: Affective Facial Expressions Facilitate Robot Learning.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Broekens, Joost
- Abstract
Computer models can be used to investigate the role of emotion in learning. Here we present EARL, our framework for the systematic study of the relation between emotion, adaptation and reinforcement learning (RL). EARL enables the study of, among other things, communicated affect as reinforcement to the robot, which is the focus of this chapter. In humans, emotions are crucial to learning. For example, a parent observing a child uses emotional expression to encourage or discourage specific behaviors; emotional expression can therefore be a reinforcement signal to a child. We hypothesize that affective facial expressions facilitate robot learning, and compare a social setting with a non-social one to test this. The non-social setting consists of a simulated robot that learns to solve a typical RL task in a continuous grid-world environment. The social setting additionally includes a human (parent) observing the simulated robot (child). The human's emotional expressions are analyzed in real time and converted to an additional reinforcement signal used by the robot: positive expressions result in reward, negative expressions in punishment. We quantitatively show that the "social robot" indeed learns to solve its task significantly faster than its "non-social sibling". We conclude that this presents strong evidence for the potential benefit of affective communication with humans in the reinforcement learning loop. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
26. Modeling Naturalistic Affective States Via Facial, Vocal, and Bodily Expressions Recognition.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, Karpouzis, Kostas, Caridakis, George, Kessous, Loic, Amir, Noam, Raouzaiou, Amaryllis, Malatesta, Lori, and Kollias, Stefanos
- Abstract
Affective and human-centered computing have attracted a lot of attention in recent years, mainly due to the abundance of devices and environments able to exploit multimodal input on the part of the users and adapt their functionality to their preferences or individual habits. In the quest to receive feedback from users in an unobtrusive manner, the combination of facial and hand gestures with prosody information allows us to infer the users' emotional state, relying on the best-performing modality in cases where one modality suffers from noise or bad sensing conditions. In this paper, we describe a multi-cue, dynamic approach to detecting emotion in naturalistic video sequences. In contrast to the strictly controlled recording conditions of most audiovisual material, the proposed approach focuses on sequences taken from nearly real-world situations. Recognition is performed via a 'Simple Recurrent Network', which lends itself well to modeling dynamic events in both the user's facial expressions and speech. Moreover, this approach differs from existing work in that it models user expressivity using a dimensional representation of activation and valence, instead of detecting discrete 'universal emotions', which are scarce in everyday human-machine interaction. The algorithm is deployed on an audiovisual database which was recorded simulating human-human discourse and therefore contains less extreme expressivity and subtle variations of a number of emotion labels. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
27. Audio-Visual Spontaneous Emotion Recognition.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Nijholt, Anton, Pantic, Maja, Pentland, Alex, Zhihong Zeng, Yuxiao Hu, Roisman, Glenn I., Zhen Wen, Yun Fu, and Huang, Thomas S.
- Abstract
Automatic multimodal recognition of spontaneous emotional expressions is a largely unexplored and challenging problem. In this paper, we explore audio-visual emotion recognition in a realistic human conversation setting, the Adult Attachment Interview (AAI). Based on the assumption that facial and vocal expressions reflect the same coarse affective states, positive and negative emotion sequences are labeled according to the Facial Action Coding System. Facial texture in the visual channel and prosody in the audio channel are integrated in the framework of an Adaboost multi-stream hidden Markov model (AdaMHMM), in which the Adaboost learning scheme is used to build the component HMM fusion. Our approach is evaluated in AAI spontaneous emotion recognition experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
28. Human Computing and Machine Understanding of Human Behavior: A Survey.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Pantic, Maja, Pentland, Alex, Nijholt, Anton, and Huang, Thomas S.
- Abstract
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing should be about anticipatory user interfaces that are human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions, including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses how far we are from enabling computers to understand human behavior. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
29. Instinctive Computing.
- Author
-
Carbonell, Jaime G., Siekmann, Jörg, Huang, Thomas S., Nijholt, Anton, Pantic, Maja, Pentland, Alex, and Yang Cai
- Abstract
Instinctive computing is a computational simulation of biological and cognitive instincts. It is a meta-program of life, just like universal gravity in nature, and it profoundly influences how we look, feel, think, and act. If we want a computer to be genuinely intelligent and to interact naturally with us, we must give computers the ability to recognize, understand, and even possess primitive instincts. In this paper, we review recent work in this area, the building blocks for an instinctive operating system, and potential applications. The paper proposes a 'bottom-up' approach focused on basic human instincts: foraging, vigilance, reproduction, intuition and learning. These are the machine codes of the human operating system, where high-level programs, such as social functions, can override the low-level instincts; instinctive computing, however, has always been the default operation. Instinctive computing is the foundation of Ambient Intelligence as well as Empathic Computing, and an essential part of Human Computing. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
30. 3-D Camera Modeling and Its Applications in Sports Broadcast Video Analysis.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Jungong Han
- Abstract
This paper concentrates on a unified 3-D camera modeling technique that can be applied to the analysis of several sports types. To this end, we base our modeling on collecting feature points from two perpendicular planes, the ground plane and the net plane, as they exist in most court-net sports. A two-step algorithm is employed to extract and distinguish the feature lines and points of these two planes for determining the camera calibration parameters. The proposed modeling enables a well-defined mapping from real 3-D scene coordinates to image coordinates, and benefits many emerging applications, such as moving-player segmentation and 3-D scene adaptation. We evaluate the promising performance of the proposed modeling on a variety of court-net sports videos containing badminton, tennis and volleyball, and also demonstrate its capability in case studies of player segmentation and 3-D scene adaptation. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
31. Face Recognition by Matching 2D and 3D Geodesic Distances.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Berretti, S.
- Abstract
Face recognition has been addressed both in 2D, using still images or video sequences, and in 3D, using three-dimensional face models. In this paper, we propose an original framework which provides a description capable of supporting 3D-3D face recognition as well as directly comparing 2D face images against 3D face models. This representation is extracted by measuring geodesic distances in 3D and 2D. In 3D, the geodesic distance between two points on a surface is computed as the length of the shortest path connecting the two points on the model. In 2D, the geodesic distance between two pixels is computed from the differences of gray-level intensities along the segment connecting the two pixels in the image. Experimental results are reported for 3D-3D and 2D-3D face recognition to demonstrate the viability of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
32. On the Robustness of Parametric Watermarking of Speech.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Gurijala, Aparna
- Abstract
Parameter-embedded watermarking is effected through slight perturbations of parametric models of some deeply integrated dynamics of a signal. This paper is concerned with a particular model form, linear prediction (LP), which is naturally suited to the application of interest, speech watermarking. The focus of this paper is the robustness performance of LP-embedded speech watermarking. It is shown that the technique is quite robust to a wide array of attacks, including noise addition, cropping, compression, filtering, and others. In the LP formulation, a set-theoretic adjunct to the parameter embedding can be used to identify a watermark that is optimally robust against certain attacks, within a quantified fidelity constraint. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
33. Moving Object Tracking in H.264/AVC Bitstream.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Wonsang You
- Abstract
Data broadcasting services are required to provide user interactivity by connecting additional contents, such as object information, to audio-visual content. H.264/AVC-based metadata authoring tools include functions which identify and track the position and motion of objects. In this work, we propose a method for tracking a target object using partially decoded texture data and motion vectors extracted directly from the H.264/AVC bitstream. The method achieves low computational complexity and high performance through a dissimilarity energy minimization algorithm which tracks feature points adaptively according to these characteristics. Experiments show that the proposed method achieves high performance with fast processing times. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
34. SVM-Based Audio Classification for Content- Based Multimedia Retrieval.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Yingying Zhu
- Abstract
Audio classification is very important in multimedia retrieval tasks such as audio indexing, analysis and content-based video retrieval. In this paper, we propose a clip-based support vector machine (SVM) approach to classify audio signals into six classes: pure speech, music, silence, environmental sound, speech with music, and speech with environmental sound. The classification results are then used to partition a video into homogeneous audio segments, which are used to analyze and retrieve its higher-level content. The experimental results show that the proposed system not only improves classification accuracy, but also performs better than classification systems based on decision trees (DT), K-Nearest Neighbors (K-NN) and Neural Networks (NN). [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
35. A Prediction Error Compression Method with Tensor-PCA in Video Coding.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Huang, Thomas S., Jian Liu, and Fei Wu
- Abstract
The Discrete Cosine Transform (DCT), employed by block-based hybrid video coding to encode motion prediction errors, has dominated practical video coding standards for several decades. However, DCT is only a good approximation to Principal Component Analysis (PCA, also called the KLT), which is optimal among all unitary transformations; PCA has been rejected by coding standards due to its complexity. This paper uses a matrix form of PCA (which we call tensor-PCA) to encode prediction errors in video coding. This method retains the performance of traditional PCA, but can be computed with much less time and space complexity. Comparisons of tensor-PCA with DCT and GPCA in motion prediction error coding show that it is a good trade-off between compression efficiency and computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
36. Players and Ball Detection in Soccer Videos Based on Color Segmentation and Shape Analysis.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Yu Huang
- Abstract
This paper proposes a scheme to detect and locate the players and the ball on the grass playfield in soccer videos. We put forward a shape-analysis-based approach to identify the players and the ball in the roughly extracted foreground, which is obtained by a trained, color-histogram-based playfield detector and connected component analysis. We employ the Euclidean distance transform to extract skeletons for every foreground blob, and then perform shape analysis to remove false alarms (non-player and non-ball blobs) and cut off artifacts (mostly due to playfield lines) based on skeleton pruning and the reverse Euclidean distance transform. Results demonstrate that the proposed algorithm works well on soccer video clips. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
37. Efficient Image Retrieval Using Conceptualization of Annotated Images.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Miyoung Cho
- Abstract
As the amount of visual information rapidly increases, users want to find semantic information easily. Most retrieval systems based on low-level features (such as color and texture) cannot satisfy users' demands. To interpret the semantics of images, many researchers use keywords as textual annotations. However, retrieval by simple text matching returns images without ranking, based only on a keyword's presence or absence. In this paper, we propose conceptualization by a similarity measure using relations among keywords for efficient image retrieval. We evaluate annotated image retrieval by lowering the weights of unrelated keywords and raising those of important ones. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
38. Fast Mode Decision by Exploiting Spatio-temporal Correlation in H.264.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Sung-Hoon Jeon
- Abstract
The H.264 video coding standard provides considerably higher coding efficiency than previous standards, but its complexity is significantly increased. In this paper, we propose an efficient fast mode decision method that exploits spatio-temporal correlation in H.264. First, we select skip mode or inter mode by considering temporal correlation; second, we select the variable block size for inter mode by considering spatial correlation. Simulations show that the proposed method reduces the encoding time by 71% on average without any significant PSNR loss. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
39. A Three-Level Scheme for Real-Time Ball Tracking.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Xiaofeng Tong
- Abstract
A three-level method is proposed to achieve robust, real-time ball tracking in soccer videos, comprising object-, intra-trajectory-, and inter-trajectory-level processing. Due to heavy noise and frequent occlusion, it is difficult to identify the ball uniquely in a single frame. Thus, at the object level, multiple objects rather than a single one are detected and taken as ball candidates based on shape and color features. At the intra-trajectory level, each ball candidate is tracked by a Kalman filter over successive frames, producing many initial trajectories in a video segment. These trajectories are then scored and filtered according to their length and their relationships in a time-line model. From these trajectories we construct a distance graph, in which a node represents a trajectory and an edge the distance between two trajectories; at the inter-trajectory level, we use Dijkstra's algorithm to find the optimal path in the graph. To smooth the trajectory, we finally apply cubic spline interpolation to bridge the gaps between adjacent trajectories. The algorithm was tested on broadcast soccer games from FIFA 2006, achieving an F-score of 80.26%. The overall speed far exceeds real time, at 35.6 fps on MPEG-2 data. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
40. A Blind Watermarking Scheme Based on Visual Model for Copyright Security.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Cong Jin
- Abstract
A novel blind watermarking scheme based on the discrete wavelet transform (DWT) is developed in this paper. To achieve imperceptibility and robustness, the watermark is embedded in the averages of wavelet blocks using a visual model based on the human visual system (HVS). The n least significant bits (LSBs) of the low-pass wavelet coefficients are adjusted in concert with the averages. Simulation results show that the proposed scheme is imperceptible and robust against many attacks, such as JPEG compression, noise addition, rescaling, cropping, rotation, and filtering. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
41. Color-Texture Image Segmentation by Combining Region and Photometric Invariant Edge Information.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Shengyang Yu
- Abstract
An improved version of the JSEG algorithm is proposed for unsupervised color-texture image segmentation, combining region information with photometric invariant edge information. A novel measure of color-texture homogeneity is defined by weighting the textural homogeneity measure with a photometric invariant edge measure. Based on the map whose pixel values are values of this new measure, the region growing-merging algorithm used in JSEG is then employed to segment the image. Experiments on a variety of real color images demonstrate the performance improvement due to the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
42. Object Re-detection Using SIFT and MPEG-7 Color Descriptors.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Schügerl, Philipp
- Abstract
Information about the occurrence of objects in videos and their interactions conveys an important part of the semantics of audiovisual content, and can be used to narrow the semantic gap in video analysis, retrieval and summarization. Object re-detection, which aims at finding occurrences of specific objects in a single video or a collection of still images and videos, is an object identification problem and can thus be solved more satisfactorily than a general object recognition problem. As structural and color information are often complementary, we propose a combined object re-detection approach using SIFT and MPEG-7 color descriptors extracted around the same interest points. We evaluate the approach on two different data sets and show that the MPEG-7 ColorLayout descriptor performs best among the tested color descriptors, and that the joint approach yields better results than using SIFT or color descriptors alone. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
43. QoS Adaptive Data Organizing and Delivery Framework for P2P Media Streaming.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Longshe Huo
- Abstract
This paper presents a novel content-aware data organizing and delivery framework for P2P media streaming. In this framework, the media data is partitioned GOP by GOP and reorganized according to the frame priorities within each GOP. All GOPs cached in the media buffer of a peer node can be scheduled concurrently, while the frames contained in each GOP are transmitted strictly in order of their priorities. Analysis and preliminary experimental results show that the proposed techniques provide not only self-adaptive QoS for heterogeneous network conditions, but also quick channel switching with a near-fixed delay time. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
44. Adaptive Interpolation for Error Concealment in H.264 Using Directional Histograms.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Nguyen, Toan
- Abstract
Error concealment methods for intra frames in H.264 reconstruct a missing macroblock by computing a weighted average of the boundary pixels of the neighboring blocks. However, simple averaging of pixel values leads to blurring and severely degrades picture quality. To solve this problem, an adaptive interpolation that considers the directions of neighboring blocks is proposed. Directional interpolation is chosen adaptively by a threshold computed from the distribution of local directions in the areas around the lost block. Experiments show picture quality improvements of about 0.5-2.0 dB compared to existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
45. SIEVE—Search Images Effectively Through Visual Elimination.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Ying Liu
- Abstract
Existing Web image search engines index images by textual descriptions, including the filename, image caption, surrounding text, etc. However, the textual description available on the Web can be ambiguous or inaccurate in describing the actual image content, so text-based search engines also return images irrelevant to the user's query. In this paper, we propose to integrate existing text-based image search engines with visual features in order to improve the performance of pure text-based Web image search. The proposed algorithm is named SIEVE, and practical fusion methods are proposed to integrate it with contemporary text-based search engines. In our approach, text-based image search results for a given query are obtained first; SIEVE is then used to filter out those images which are semantically irrelevant to the query. Experimental results show that image retrieval performance using SIEVE improves significantly over Google image search. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
46. A Closed-Form Solution of Reconstruction from Nonparallel Stereo Geometry Used in Image Guided System for Surgery.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yueting Zhuang, Huang, Thomas S., Jianhua Wang, and Yuncai Liu
- Abstract
In this paper, a closed-form solution for 3D reconstruction from nonparallel stereo geometry is derived theoretically. From a pair of conjugate image points in an arbitrarily configured, calibrated stereo system, one can directly reconstruct the 3D scene point using the closed-form solution, bypassing image rectification and iterative optimization. Experimental results on both simulated data and real images validate the closed-form solution. A practical application to an image-guided system for surgery shows that the closed-form solution improves the efficiency and accuracy of 3D reconstruction from a nonparallel stereo system compared with the conventional method, which employs an algorithm for standard parallel-axis stereo geometry. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
47. Virtual Community Based Secure Service Discovery and Access for 3D Video Streaming Applications.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Shudong Chen
- Abstract
The Freeband I-Share project aims to define mechanisms for trust, willingness, resource discovery and sharing in virtual communities. To improve the security and performance of a 3D video streaming application, which is a research vehicle of the I-Share project, we propose a virtual community based access control approach for secure service discovery and access (VICSDA), which groups services into virtual communities and allows only authenticated community members to discover and access these community services. There are two main contributions associated with this approach. First, unlike most other access control approaches, it adopts a dual access control mechanism which allows community services to define their local access control policies in addition to following the community membership policy. Second, the behavior of these community services is monitored in order to guarantee better QoS provision. Using this approach, the 3D video streaming application is guaranteed authentication and message confidentiality through the dual secure service discovery and access mechanism; better application performance can also be achieved through the community member behavior audit. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
48. Managing and Searching Distributed Multidimensional Annotations with Large Scale Image Data.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Tian Xia
- Abstract
Advanced imaging applications are producing large-scale image data for medical diagnosis, part visualization and inspection, content retrieval, analysis, e-reporting, and so on. In many cases, service and non-textual annotations are made on top of the raw images to provide essential information on regions of interest, diseases, defects, evaluations, comments, etc. These applications pose several challenges: i) large data volumes and the latency of data transfer over the Internet lead to a performance bottleneck; ii) the applications need advanced query support, such as similarity queries on the multidimensional annotation data; and iii) local applications need to synchronize data with a remote central database. In our work, we develop a general distributed multimedia data management system that addresses these challenges by providing: i) an intelligent multimedia content caching system to support smooth local applications; ii) a loosely coupled, extensible multi-indexing server to support different types of multimedia queries, including similarity queries; iii) unified multimedia data access interfaces for universal data access; and iv) an integrated architecture that brings these technologies together into a robust system. The system is now successfully used to support image-based inspection and diagnosis applications such as global part inspection and knowledge-database-guided medical diagnosis. [ABSTRACT FROM AUTHOR]
- Published
- 2007
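The content caching component in the abstract above addresses the first challenge (data volume and transfer latency). A standard way to build such a cache is least-recently-used eviction; the sketch below is a generic illustration of that idea, not the paper's actual caching system, and all names are hypothetical.

```python
# Generic LRU content cache sketch (illustrative; not the paper's system).
from collections import OrderedDict

class ImageCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()  # key -> image bytes, oldest first

    def get(self, key):
        if key not in self._store:
            return None  # miss: caller fetches from the remote central database
        self._store.move_to_end(key)  # mark as recently used
        return self._store[key]

    def put(self, key, data):
        self._store[key] = data
        self._store.move_to_end(key)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used

cache = ImageCache(capacity=2)
cache.put("scan-001", b"...")
cache.put("scan-002", b"...")
cache.get("scan-001")          # touch scan-001 so it is most recent
cache.put("scan-003", b"...")  # capacity exceeded: evicts scan-002
print(cache.get("scan-002"))   # None (evicted; would be re-fetched remotely)
```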
49. Senary Huffman Compression - A Reversible Data Hiding Scheme for Binary Images.
- Author
- Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Chung-Chuan Wang
- Abstract
Over the past few years many studies have proposed reversible data hiding schemes, but few have been applied to binary images. Some studies have utilized spread spectrum, compression, and binary operation methods to achieve data hiding, but most suffer from poor visual quality, low capacity, or an inability to extract the hidden data during recovery. This paper therefore proposes a reversible data hiding scheme for binary images: SHC (senary Huffman compression). SHC operates on the half-white, half-black 4×1 or 2×2 pixel blocks, of which there are six types, to preserve visual quality. Moreover, SHC encodes in senary rather than binary, pairing symbols into a double-senary compression unit to increase the compression rate and the secret hiding capacity. Experimental results show that recovered images are well within human visual perception, with PSNR values greater than 33 dB, a high secret hiding capacity of 1.1 secret bits per 2 bits, and an effective compression rate of over 17% on average. All results demonstrate that the scheme has advantages for reversible data hiding in binary images. [ABSTRACT FROM AUTHOR]
- Published
- 2007
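The "six types" and the senary (base-6) encoding in the abstract above follow from simple counting: a block of 4 binary pixels that is exactly half black and half white can occur in C(4,2) = 6 configurations, so each such block can carry one base-6 symbol. The short sketch below only enumerates those configurations; it is an illustration of the counting argument, not an implementation of SHC.

```python
# Why "senary": a 4-pixel binary block (4x1 or 2x2) that is exactly half
# black (1) and half white (0) has C(4,2) = 6 possible configurations,
# so each half-and-half block can encode one base-6 digit.
from itertools import combinations

def half_and_half_blocks():
    """Enumerate all 4-pixel blocks with exactly two black pixels."""
    blocks = []
    for black in combinations(range(4), 2):  # choose 2 of 4 positions to be black
        block = tuple(1 if i in black else 0 for i in range(4))  # row-major order
        blocks.append(block)
    return blocks

patterns = half_and_half_blocks()
print(len(patterns))  # 6 -> one senary digit per half-and-half block
```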
50. A New Type of Proxy Ring Signature Scheme with Revocable Anonymity and No Info Leaked.
- Author
- Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Sebe, Nicu, Yuncai Liu, Yueting Zhuang, Huang, Thomas S., and Chengyu Hu
- Abstract
In some real situations, a proxy signature and a ring signature must be applied concurrently. In this paper, we present a new type of proxy ring signature scheme with revocable anonymity, which allows the original signer to know exactly who the actual signer is; unlike other schemes, the original signer does not need to publish any extra information beyond the original public key. The scheme can play an important role in several real applications. [ABSTRACT FROM AUTHOR]
- Published
- 2007